Setting Up a QEMU + librbd Debugging Environment




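This environment was built on a cloud host (itself a virtual machine). If you build it on a physical machine, you may need to modify the VM XML configuration file.

Prerequisites

A Ceph cluster is already deployed and can create rbd volumes.

Install qemu and libvirt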

 

apt-get update
apt-get install qemu qemu-block-extra libvirt-daemon-system libvirt-daemon libvirt-clients

 

The qemu-block-extra package provides qemu with the rbd storage backend.

Download the VM image

wget https://download.cirros-cloud.net/0.4.0/cirros-0.4.0-x86_64-disk.img  ## the most minimal VM image, only 14M

 

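Create the rbd volume used by the VM (the VM system disk)

qemu-img convert -f qcow2 -O raw cirros-0.4.0-x86_64-disk.img rbd:rbd/vol1
# Note: this command converts the image format and, at the same time, creates the rbd volume vol1 in the ceph rbd pool, so vol1 must not already exist
# If needed, the rbd volume can be resized; after growing the volume the filesystem must be grown as well, so it is easier to attach a separate rbd volume to the VM for testing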
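
The resize and the separate test volume mentioned in the comments above could be done with the rbd CLI, roughly as sketched below. The helper names (run, create_test_volume, grow_volume) and the sizes are illustrative assumptions, not part of the original setup. The script defaults to a dry run that only prints the commands.

```shell
#!/bin/sh
# Hypothetical helpers. Dry-run by default: commands are printed, not executed.
# Set DRY_RUN=0 to actually run them against the cluster.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Create a separate data volume for tests (size in MB, an assumed value).
# usage: create_test_volume POOL VOLUME SIZE_MB
create_test_volume() {
    run rbd create --size "$3" "$1/$2"
}

# Grow an existing volume (size in MB). The filesystem inside the guest
# still has to be grown separately afterwards.
# usage: grow_volume POOL VOLUME NEW_SIZE_MB
grow_volume() {
    run rbd resize --size "$3" "$1/$2"
}

create_test_volume rbd vol2 1024
grow_volume rbd vol1 2048
```

With DRY_RUN left at its default, this only prints the two rbd commands, which makes it easy to review them before running anything against the pool.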

 

Prepare the VM XML configuration file

Save it as xxx.xml, for example libvirt.xml:

<domain type="qemu">  <!-- note the type: change it to kvm when starting the VM on a physical machine  -->
  <uuid>5d1289be-50e1-47b7-86de-1de0ff16a9d4</uuid>  <!-- VM UUID  -->
  <name>ceph</name>    <!-- VM name  -->
  <memory>524288</memory>   <!-- VM memory in KiB; 512 MiB here  -->
  <vcpu>1</vcpu>   <!-- number of vCPUs  -->
  <os>
    <type>hvm</type>
    <boot dev="hd"/>
  </os>
  <features>
    <acpi/>
    <apic/>
  </features>
  <clock offset="utc">
    <timer name="pit" tickpolicy="delay"/>
    <timer name="rtc" tickpolicy="catchup"/>
  </clock>
  <cpu mode="host-model" match="exact"/>
  <devices>       <!-- VM disk configuration; vda is normally the system disk  -->
    <disk type="network" device="disk">
      <driver type="raw" cache="none"/>
      <source protocol="rbd" name="rbd/vol1">      <!-- usually needs changing: name is $pool/$volume  -->
        <host name="192.168.0.2" port="6789"/>        <!-- mon address  -->
      </source>
      <target bus="virtio" dev="vda"/>         <!-- device name inside the VM  -->
    </disk>
    <serial type="file">
      <source path="/var/log/libvirt/qemu/ceph-console.log"/>        <!-- write the VM console output to a file; optional  -->
    </serial>
    <serial type="pty"/>
    <input type="tablet" bus="usb"/>
    <graphics type="vnc" autoport="yes" keymap="en-us" listen="0.0.0.0"/>      <!-- VNC listen address of the VM; usually no change needed  -->
  </devices>
</domain>
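
The <memory> element takes its value in KiB by default, which is why 512 MiB is written as 524288. A tiny sketch of the conversion, in case you want a different size (the function name mib_to_kib is a hypothetical helper):

```shell
# Hypothetical helper: convert MiB to the KiB value expected by <memory>.
mib_to_kib() {
    echo $(( $1 * 1024 ))
}

mib_to_kib 512    # value used in the XML above
```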

 

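Then manage the VM with virsh:

virsh define libvirt.xml     # define the VM and persist its configuration in libvirt
virsh list --all             # list VMs in every state, including shut off
virsh start ceph             # start the VM
virsh destroy ceph           # force the VM off
virsh shutdown ceph          # shut the VM down gracefully
virsh undefine ceph          # remove the VM definition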

 

Attach a volume to the VM

Prepare the XML for the volume to attach. It is essentially the disk section lifted out of the VM configuration:

<disk type="network" device="disk">
  <driver type="raw" cache="none"/>
  <source protocol="rbd" name="rbd/vol2">   <!-- change the volume name -->
    <host name="192.168.0.2" port="6789"/>
  </source>
  <target bus="virtio" dev="vdb"/>   <!-- the device name must change too; it must not already be used in the VM XML configuration -->
</disk>

 

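Then run virsh attach-device ceph vdb.xml; inside the VM, sudo fdisk -l shows the new disk. Note that a hot-attached volume disappears once the VM goes through virsh destroy and virsh start. To make it persistent, add the --config option to attach-device (--config alone only changes the persistent definition; combine it with --live to also attach to the running VM), or put this XML directly into libvirt.xml (inside the <devices></devices> section) before starting the VM.

To detach the disk, run virsh detach-device ceph vdb.xml.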
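
Since the disk XML only differs in the volume name and the target device, it can be generated rather than edited by hand. The following is a minimal sketch, assuming a single monitor and the same driver settings as above. The function name rbd_disk_xml is a hypothetical helper.

```shell
#!/bin/sh
# Hypothetical helper: print a libvirt <disk> element for an RBD volume.
# usage: rbd_disk_xml POOL VOLUME TARGET_DEV MON_HOST [MON_PORT]
rbd_disk_xml() {
    pool=$1 vol=$2 dev=$3 mon=$4 port=${5:-6789}
    cat <<EOF
<disk type="network" device="disk">
  <driver type="raw" cache="none"/>
  <source protocol="rbd" name="${pool}/${vol}">
    <host name="${mon}" port="${port}"/>
  </source>
  <target bus="virtio" dev="${dev}"/>
</disk>
EOF
}

# Print the XML for the second volume; redirect to vdb.xml to use it with
# virsh attach-device / detach-device.
rbd_disk_xml rbd vol2 vdb 192.168.0.2
```

Redirecting the output to vdb.xml gives the same file used with the attach-device and detach-device commands above.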

 

ceph-client socket

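If admin_socket is configured in ceph.conf as follows, and the relevant directories are writable by the qemu process behind the VM (it runs as the libvirt-qemu user and group):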

[client]
admin_socket = /var/run/ceph/$cluster-$type.$id.$pid.$cctid.asok
log_file = /var/log/ceph/qemu/qemu-guest-$pid.log

then the corresponding ceph-client socket is created:

$ ll /var/run/ceph/
total 0
srwxrwxr-x 1 libvirt-qemu libvirt-qemu 0 Aug 21 11:30 ceph-client.admin.70781.94809655383520.asok
srwxrwxrwx 1 ceph         ceph         0 Aug 17 13:51 ceph-mgr.ceph1.asok
srwxrwxrwx 1 ceph         ceph         0 Aug 17 16:06 ceph-mon.ceph1.asok
srwxrwxrwx 1 ceph         ceph         0 Aug 17 13:51 ceph-osd.0.asok
srwxrwxrwx 1 ceph         ceph         0 Aug 17 13:51 ceph-osd.1.asok
srwxrwxrwx 1 ceph         ceph         0 Aug 17 13:51 ceph-osd.2.asok
$ ceph --admin-daemon /var/run/ceph/ceph-client.admin.70781.94809655383520.asok perf dump | grep librbd-fad56b8b4567-rbd-vol1 -A20                   
    "librbd-fad56b8b4567-rbd-vol1": {
        "rd": 1227,
        "rd_bytes": 25809408,
        "rd_latency": {
            "avgcount": 1227,
            "sum": 3.044197946,
            "avgtime": 0.002481008
        },
        "wr": 65,
        "wr_bytes": 159744,
        "wr_latency": {
            "avgcount": 65,
            "sum": 40.068453646,
            "avgtime": 0.616437748
        },
        "discard": 0,
        "discard_bytes": 0,
        "discard_latency": {
            "avgcount": 0,
            "sum": 0.000000000,
            "avgtime": 0.000000000
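
In the latency counters above, avgtime is sum divided by avgcount. A small sketch that recomputes it from the two raw values, which helps when comparing two dumps taken at different times. The function name avg_latency is a hypothetical helper, and the output is rounded to six decimals.

```shell
# Hypothetical helper: average seconds per operation from a perf counter.
# usage: avg_latency SUM AVGCOUNT
avg_latency() {
    awk -v sum="$1" -v n="$2" 'BEGIN {
        if (n > 0) printf "%.6f\n", sum / n
        else       print "0.000000"     # no operations recorded yet
    }'
}

avg_latency 3.044197946 1227    # rd_latency values from the dump above
avg_latency 40.068453646 65     # wr_latency values from the dump above
```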

 

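Debugging librbd through qemu

Note: you must first build and install a debug version of librbd by hand. qemu here was installed with apt-get as described above rather than built from source, so gdb cannot show qemu's own symbols or source locations. If you need them, build qemu yourself or install the debug symbol packages.

Once the VM is running, you can debug the qemu process with gdb. The qemu process performs rbd volume I/O by calling into librbd.so:

## First find the pid of the qemu process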
ps -ef | grep qemu | grep mon_host
libvirt+   70781       1  6 11:30 ?        00:15:18 /usr/bin/qemu-system-x86_64 -name guest=ceph,debug-threads=on -S -object secret,id=masterKey0,format=raw,file=/var/lib/libvirt/qemu/domain-6-ceph/master-key.aes -machine pc-i440fx-2.8,accel=tcg,usb=off,dump-guest-core=off -cpu Broadwell,+vme,+ss,+osxsave,+f16c,+rdrand,+hypervisor,+arat,+tsc_adjust,+xsaveopt,+pdpe1gb,+abm -m 512 -realtime mlock=off -smp 1,sockets=1,cores=1,threads=1 -uuid 5d1289be-50e1-47b7-86de-1de0ff16a9d4 -no-user-config -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/domain-6-ceph/monitor.sock,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc,driftfix=slew -global kvm-pit.lost_tick_policy=delay -no-shutdown -boot strict=on -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -drive file=rbd:rbd/vol1:auth_supported=none:mon_host=192.168.0.2\:6789,format=raw,if=none,id=drive-virtio-disk0,cache=none -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x3,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 -add-fd set=0,fd=27 -chardev file,id=charserial0,path=/dev/fdset/0,append=on -device isa-serial,chardev=charserial0,id=serial0 -chardev pty,id=charserial1 -device isa-serial,chardev=charserial1,id=serial1 -device usb-tablet,id=input0,bus=usb.0,port=1 -vnc 0.0.0.0:0 -k en-us -device cirrus-vga,id=video0,bus=pci.0,addr=0x2 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x4 -msg timestamp=on
 
## Then attach gdb to that pid
$ gdb -p 70781
GNU gdb (Debian 7.12-6) 7.12.0.20161007-git
......
Attaching to process 70781
0x00007f1cb4f51741 in __GI_ppoll (fds=0x563a98ec8690, nfds=8, timeout=<optimized out>, sigmask=0x0) at ../sysdeps/unix/sysv/linux/ppoll.c:39
39      ../sysdeps/unix/sysv/linux/ppoll.c: No such file or directory.
(gdb) b librbd::io::ImageRequest<librbd::ImageCtx>::create_write_request     ### set a breakpoint in librbd
(gdb) c
Continuing.       #### now perform I/O inside the VM
[Switching to Thread 0x7f1c84ff9700 (LWP 70800)]
Thread 20 "CPU 0/TCG" hit Breakpoint 3, librbd::io::ImageRequest<librbd::ImageCtx>::create_write_request(librbd::ImageCtx&, librbd::io::AioCompletion*, std::vector<std::pair<unsigned long, unsigned long>, std::allocator<std::pair<unsigned long, unsigned long> > >&&, ceph::buffer::list&&, int, ZTracer::Trace const&) (
    image_ctx=..., aio_comp=aio_comp@entry=0x7f1c417c9400,
    image_extents=image_extents@entry=<unknown type in /usr/local/lib/librbd.so.1, CU 0x1c9367c, DIE 0x1d71f9f>,
    bl=bl@entry=<unknown type in /usr/local/lib/librbd.so.1, CU 0x1c9367c, DIE 0x1d71fb7>, op_flags=op_flags@entry=0, parent_trace=...)
    at /mnt/ceph/src/librbd/io/ImageRequest.cc:96
96      ImageRequest<I>* ImageRequest<I>::create_write_request(
(gdb) bt
#0  librbd::io::ImageRequest<librbd::ImageCtx>::create_write_request(librbd::ImageCtx&, librbd::io::AioCompletion*, std::vector<std::pair<unsigned long, unsigned long>, std::allocator<std::pair<unsigned long, unsigned long> > >&&, ceph::buffer::list&&, int, ZTracer::Trace const&) (image_ctx=...,
    aio_comp=aio_comp@entry=0x7f1c417c9400, image_extents=image_extents@entry=<unknown type in /usr/local/lib/librbd.so.1, CU 0x1c9367c, DIE 0x1d71f9f>,
    bl=bl@entry=<unknown type in /usr/local/lib/librbd.so.1, CU 0x1c9367c, DIE 0x1d71fb7>, op_flags=op_flags@entry=0, parent_trace=...)
    at /mnt/ceph/src/librbd/io/ImageRequest.cc:96
#1  0x00007f1ca1318f59 in librbd::io::ImageRequestWQ<librbd::ImageCtx>::aio_write(librbd::io::AioCompletion*, unsigned long, unsigned long, ceph::buffer::list&&, int, bool) (this=0x563a97ee5090, c=0x7f1c417c9400, off=off@entry=43303936, len=len@entry=1024,
    bl=bl@entry=<unknown type in /usr/local/lib/librbd.so.1, CU 0x1dbf9f7, DIE 0x1e87755>, op_flags=op_flags@entry=0, native_async=true)
    at /mnt/ceph/src/librbd/io/ImageRequestWQ.cc:264
#2  0x00007f1ca1228310 in rbd_aio_write (image=<optimized out>, off=43303936, len=1024, buf=<optimized out>, c=<optimized out>)
    at /mnt/ceph/src/librbd/librbd.cc:3536
#3  0x00007f1ca172a33a in ?? () from /usr/lib/x86_64-linux-gnu/qemu/block-rbd.so
#4  0x00007f1ca172a426 in ?? () from /usr/lib/x86_64-linux-gnu/qemu/block-rbd.so
#5  0x0000563a965aac3c in ?? ()
#6  0x0000563a965abed0 in ?? ()
#7  0x0000563a965acba7 in bdrv_co_pwritev ()
#8  0x0000563a9656e469 in ?? ()
#9  0x0000563a965aab21 in ?? ()
#10 0x0000563a965abed0 in ?? ()
#11 0x0000563a965acba7 in bdrv_co_pwritev ()
#12 0x0000563a9659e90d in blk_co_pwritev ()
#13 0x0000563a9659ea2b in ?? ()
#14 0x0000563a9661752a in ?? ()
#15 0x00007f1cb4eb6000 in ?? () from /lib/x86_64-linux-gnu/libc.so.6
#16 0x00007ffda22a0ff0 in ?? ()
#17 0x0000000000000000 in ?? ()
(gdb) l
91                                       std::move(read_result), op_flags,
92                                       parent_trace);
93      }
94
95      template <typename I>
96      ImageRequest<I>* ImageRequest<I>::create_write_request(
97          I &image_ctx, AioCompletion *aio_comp, Extents &&image_extents,
98          bufferlist &&bl, int op_flags, const ZTracer::Trace &parent_trace) {
99        return new ImageWriteRequest<I>(image_ctx, aio_comp, std::move(image_extents),
100                                       std::move(bl), op_flags, parent_trace);
(gdb) l
101     }
102
103     template <typename I>
104     ImageRequest<I>* ImageRequest<I>::create_discard_request(
105         I &image_ctx, AioCompletion *aio_comp, uint64_t off, uint64_t len,
106         bool skip_partial_discard, const ZTracer::Trace &parent_trace) {
107       return new ImageDiscardRequest<I>(image_ctx, aio_comp, off, len,
108                                         skip_partial_discard, parent_trace);
109     }
110
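
The pid lookup and attach steps above can be scripted. The sketch below extracts the pid from ps -ef output. The function name qemu_rbd_pids is a hypothetical helper, and the sample line is an abbreviated version of the process listing shown earlier.

```shell
# Hypothetical helper: read `ps -ef` output on stdin and print the PID
# (2nd column) of qemu processes whose command line mentions mon_host.
# Matching the command in column 8 keeps the filter itself out of the result.
qemu_rbd_pids() {
    awk '$8 ~ /qemu-system/ && /mon_host/ { print $2 }'
}

# Demo with an abbreviated sample line instead of a live process list.
printf '%s\n' \
  'libvirt+ 70781 1 6 11:30 ? 00:15:18 /usr/bin/qemu-system-x86_64 -drive file=rbd:rbd/vol1:mon_host=192.168.0.2' \
  | qemu_rbd_pids

# On a real host (not run here):
#   pid=$(ps -ef | qemu_rbd_pids | head -n 1)
#   gdb -p "$pid" -batch \
#       -ex 'break librbd::io::ImageRequest<librbd::ImageCtx>::create_write_request' \
#       -ex 'continue' -ex 'bt'
```

In batch mode gdb runs the -ex commands in order and exits afterwards, so the commented invocation waits for the first write, prints the backtrace, and detaches.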

 

Install the qemu debug symbols

 

### Add the debug symbols repository
cat <<EOF | sudo tee /etc/apt/sources.list.d/dbgsym.list
> deb http://debug.mirrors.debian.org/debian-debug/ stretch-debug main
> EOF
deb http://debug.mirrors.debian.org/debian-debug/ stretch-debug main
### Update the package index
$ apt update -y
$ apt install qemu-system-x86-dbgsym qemu-block-extra-dbgsym qemu-system-common-dbgsym qemu-utils-dbgsym -y

After that, gdb shows the full call stack, including the qemu frames and their source locations:

Thread 37 "CPU 0/TCG" hit Breakpoint 1, librbd::io::ImageRequest<librbd::ImageCtx>::create_write_request(librbd::ImageCtx&, librbd::io::AioCompletion*, std::vector<std::pair<unsigned long, unsigned long>, std::allocator<std::pair<unsigned long, unsigned long> > >&&, ceph::buffer::list&&, int, ZTracer::Trace const&) (
    image_ctx=..., aio_comp=aio_comp@entry=0x7facc4278200,
    image_extents=image_extents@entry=<unknown type in /usr/local/lib/librbd.so.1, CU 0x1c9367c, DIE 0x1d71f9f>,
    bl=bl@entry=<unknown type in /usr/local/lib/librbd.so.1, CU 0x1c9367c, DIE 0x1d71fb7>, op_flags=op_flags@entry=0, parent_trace=...)
    at /mnt/ceph/src/librbd/io/ImageRequest.cc:96
96      ImageRequest<I>* ImageRequest<I>::create_write_request(
(gdb) bt
#0  librbd::io::ImageRequest<librbd::ImageCtx>::create_write_request(librbd::ImageCtx&, librbd::io::AioCompletion*, std::vector<std::pair<unsigned long, unsigned long>, std::allocator<std::pair<unsigned long, unsigned long> > >&&, ceph::buffer::list&&, int, ZTracer::Trace const&) (image_ctx=...,
    aio_comp=aio_comp@entry=0x7facc4278200, image_extents=image_extents@entry=<unknown type in /usr/local/lib/librbd.so.1, CU 0x1c9367c, DIE 0x1d71f9f>,
    bl=bl@entry=<unknown type in /usr/local/lib/librbd.so.1, CU 0x1c9367c, DIE 0x1d71fb7>, op_flags=op_flags@entry=0, parent_trace=...)
    at /mnt/ceph/src/librbd/io/ImageRequest.cc:96
#1  0x00007fad47504f59 in librbd::io::ImageRequestWQ<librbd::ImageCtx>::aio_write(librbd::io::AioCompletion*, unsigned long, unsigned long, ceph::buffer::list&&, int, bool) (this=0x55cbc3898890, c=0x7facc4278200, off=off@entry=17248256, len=len@entry=1024,
    bl=bl@entry=<unknown type in /usr/local/lib/librbd.so.1, CU 0x1dbf9f7, DIE 0x1e87755>, op_flags=op_flags@entry=0, native_async=true)
    at /mnt/ceph/src/librbd/io/ImageRequestWQ.cc:264
#2  0x00007fad47414310 in rbd_aio_write (image=<optimized out>, off=off@entry=17248256, len=len@entry=1024,
    buf=buf@entry=0x7facc4279000 " Opts: (null)\nAug 22 04:20:32 cirros kern.info kernel: [    4.859555] EXT4-fs (sda1): re-mounted. Opts: data=ordered\nAug 22 04:20:32 cirros kern.notice kernel: [    5.703121] random: dd urandom read w"..., c=<optimized out>) at /mnt/ceph/src/librbd/librbd.cc:3536
#3  0x00007fad4791633a in rbd_start_aio (bs=<optimized out>, off=17248256, qiov=<optimized out>, size=1024, cb=<optimized out>, opaque=<optimized out>,
    cmd=RBD_AIO_WRITE) at ./block/rbd.c:697
#4  0x00007fad47916426 in qemu_rbd_aio_writev (bs=<optimized out>, sector_num=<optimized out>, qiov=<optimized out>, nb_sectors=<optimized out>, cb=<optimized out>,
    opaque=<optimized out>) at ./block/rbd.c:746
#5  0x000055cbc23b7c3c in bdrv_driver_pwritev (bs=bs@entry=0x55cbc36c9890, offset=offset@entry=17248256, bytes=bytes@entry=1024, qiov=qiov@entry=0x7facc4277c60,
    flags=flags@entry=0) at ./block/io.c:901
#6  0x000055cbc23b8ed0 in bdrv_aligned_pwritev (bs=bs@entry=0x55cbc36c9890, req=req@entry=0x7facf82fbbc0, offset=offset@entry=17248256, bytes=bytes@entry=1024,
    align=align@entry=512, qiov=qiov@entry=0x7facc4277c60, flags=0) at ./block/io.c:1360
#7  0x000055cbc23b9ba7 in bdrv_co_pwritev (child=<optimized out>, offset=<optimized out>, offset@entry=17248256, bytes=bytes@entry=1024,
    qiov=qiov@entry=0x7facc4277c60, flags=flags@entry=0) at ./block/io.c:1610
#8  0x000055cbc237b469 in raw_co_pwritev (bs=0x55cbc36c35e0, offset=17248256, bytes=1024, qiov=<optimized out>, flags=<optimized out>) at ./block/raw_bsd.c:243
#9  0x000055cbc23b7b21 in bdrv_driver_pwritev (bs=bs@entry=0x55cbc36c35e0, offset=offset@entry=17248256, bytes=bytes@entry=1024, qiov=qiov@entry=0x7facc4277c60,
    flags=flags@entry=0) at ./block/io.c:875
#10 0x000055cbc23b8ed0 in bdrv_aligned_pwritev (bs=bs@entry=0x55cbc36c35e0, req=req@entry=0x7facf82fbe90, offset=offset@entry=17248256, bytes=bytes@entry=1024,
    align=align@entry=1, qiov=qiov@entry=0x7facc4277c60, flags=0) at ./block/io.c:1360
#11 0x000055cbc23b9ba7 in bdrv_co_pwritev (child=<optimized out>, offset=<optimized out>, offset@entry=17248256, bytes=bytes@entry=1024,
    qiov=qiov@entry=0x7facc4277c60, flags=0) at ./block/io.c:1610
#12 0x000055cbc23ab90d in blk_co_pwritev (blk=0x55cbc36bd690, offset=17248256, bytes=1024, qiov=0x7facc4277c60, flags=<optimized out>) at ./block/block-backend.c:848
#13 0x000055cbc23aba2b in blk_aio_write_entry (opaque=0x7facc574eb50) at ./block/block-backend.c:1036
#14 0x000055cbc242452a in coroutine_trampoline (i0=<optimized out>, i1=<optimized out>) at ./util/coroutine-ucontext.c:79
#15 0x00007fad5b0a2000 in ?? () from /lib/x86_64-linux-gnu/libc.so.6
#16 0x00007facf9ff98c0 in ?? ()
#17 0x0000000000000000 in ?? ()

 

Starting the qemu process from gdb

If you need gdb to launch the qemu process itself, start the VM from the qemu command line:

$ gdb qemu-system-x86_64
GNU gdb (Debian 7.12-6) 7.12.0.20161007-git
Copyright (C) 2016 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from qemu-system-x86_64...(no debugging symbols found)...done.
(gdb) set args -m 512 -smp 1 -drive format=raw,file=rbd:rbd/vol1:auth_supported=none:mon_host=192.168.0.2\\:6789 -vnc :0
(gdb) b main
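
The file= value passed to -drive follows the pattern rbd:$pool/$volume:option=value..., and the colon between the monitor address and port has to be escaped with a backslash, as seen in the process listing earlier. A minimal sketch that builds this string (the function name rbd_drive_file is a hypothetical helper, and auth_supported=none matches the unauthenticated cluster used here):

```shell
# Hypothetical helper: build the file= value for a qemu -drive on RBD.
# usage: rbd_drive_file POOL VOLUME MON_IP [MON_PORT]
rbd_drive_file() {
    # '\\:' in the printf format emits a literal backslash followed by ':'
    printf 'rbd:%s/%s:auth_supported=none:mon_host=%s\\:%s\n' \
        "$1" "$2" "$3" "${4:-6789}"
}

rbd_drive_file rbd vol1 192.168.0.2
```

In the gdb session above the backslash is doubled, presumably because gdb passes the arguments through a shell when it starts the program.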

 

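Notes:

The VM configuration here does not set up networking (that would require configuring a bridge first). To get inside the VM, connect with a VNC client; download: https://www.realvnc.com/en/connect/download/viewer/

To find the VNC display, run virsh vncdisplay ceph. A display of :0 means port 5900. VNC clients map display 0 to port 5900 automatically, so entering 192.168.0.2, 192.168.0.2:0, or 192.168.0.2:5900 in the client has the same effect.

You can also check with ps -ef|grep qemu | grep vnc; for example, -vnc 0.0.0.0:0 means the VNC server listens on port 5900.

On a public cloud host, security groups restrict access, so the corresponding VNC server port must be opened before you can connect.

Alternatively, use the port forwarding feature of the xshell/secureCRT client to map the port (secureCRT shown in the figure below), then enter 127.0.0.1:5900 in the VNC client to reach the VM.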
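
The display-to-port mapping and the port forwarding described above can be sketched on the command line as well. This assumes plain OpenSSH as the forwarding client instead of xshell/secureCRT. The function names and the user@192.168.0.2 login are hypothetical, and the tunnel command is only printed, not executed.

```shell
# Hypothetical helper: convert a VNC display such as ":0" or "0.0.0.0:0"
# to its TCP port (display N listens on 5900 + N).
vnc_port() {
    display=${1##*:}              # keep the part after the last colon
    echo $(( 5900 + display ))
}

# Hypothetical helper: print an ssh local port-forward command.
# usage: ssh_tunnel_cmd USER@HOST DISPLAY
ssh_tunnel_cmd() {
    port=$(vnc_port "$2")
    echo "ssh -N -L ${port}:127.0.0.1:${port} $1"
}

vnc_port :0
ssh_tunnel_cmd user@192.168.0.2 :0
```

While the printed ssh command is running, pointing the VNC client at 127.0.0.1:5900 reaches the VM without opening the port in the security group.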