Description
VPP crashes with a SIGSEGV when the VPP ML2 agent tries to delete a tapv2 interface during cleanup.
Here are the VPP logs:
Aug 01 15:27:24 overcloud-controller-0.opnfvlf.org vnet[437543]: acl_plugin: ACL_FA_CLEANER_DELETE_BY_SW_IF_INDEX bitmap: 0, clear_all: 0
Aug 01 15:27:24 overcloud-controller-0.opnfvlf.org vnet[437543]: acl_plugin: ACL_FA_CLEANER: thread 0, pending clear bitmap: 0
Aug 01 15:27:24 overcloud-controller-0.opnfvlf.org vnet[437543]: acl_plugin: ACL_FA_CLEANER: thread 4294967295, pending clear bitmap: 0
Aug 01 15:27:24 overcloud-controller-0.opnfvlf.org vnet[437543]: acl_plugin: CLEANER mains len: 2 per-worker len: 2
Aug 01 15:27:24 overcloud-controller-0.opnfvlf.org vnet[437543]: acl_plugin: ACL_FA_NODE_CLEAN: cleaning done
Aug 01 15:27:26 overcloud-controller-0.opnfvlf.org vnet[437543]: acl_plugin: ACL_FA_CLEANER_DELETE_BY_SW_IF_INDEX bitmap: 0, clear_all: 0
Aug 01 15:27:26 overcloud-controller-0.opnfvlf.org vnet[437543]: acl_plugin: ACL_FA_CLEANER: thread 0, pending clear bitmap: 0
Aug 01 15:27:26 overcloud-controller-0.opnfvlf.org vnet[437543]: acl_plugin: ACL_FA_CLEANER: thread 4294967295, pending clear bitmap: 0
Aug 01 15:27:26 overcloud-controller-0.opnfvlf.org vnet[437543]: acl_plugin: CLEANER mains len: 2 per-worker len: 2
Aug 01 15:27:26 overcloud-controller-0.opnfvlf.org vnet[437543]: acl_plugin: ACL_FA_NODE_CLEAN: cleaning done
Aug 01 15:27:27 overcloud-controller-0.opnfvlf.org vnet[437543]: received signal SIGSEGV, PC 0x7f4370ce9c9a, faulting address 0xcc
Aug 01 15:27:27 overcloud-controller-0.opnfvlf.org vnet[437543]: #0 0x00007f43716186a5 0x7f43716186a5
Aug 01 15:27:27 overcloud-controller-0.opnfvlf.org vnet[437543]: #1 0x00007f436f9ca6d0 0x7f436f9ca6d0
Aug 01 15:27:27 overcloud-controller-0.opnfvlf.org vnet[437543]: #2 0x00007f4370ce9c9a 0x7f4370ce9c9a
Aug 01 15:27:27 overcloud-controller-0.opnfvlf.org vnet[437543]: #3 0x00007f432dc1ef7c dpdk_buffer_free_avx2 + 0xbdc
Aug 01 15:27:27 overcloud-controller-0.opnfvlf.org vnet[437543]: #4 0x00007f43710afb92 virtio_free_used_desc + 0x92
Aug 01 15:27:27 overcloud-controller-0.opnfvlf.org vnet[437543]: #5 0x00007f43710d3eab virtio_vring_free + 0x33b
Aug 01 15:27:27 overcloud-controller-0.opnfvlf.org vnet[437543]: #6 0x00007f43710d7b19 tap_delete_if + 0x119
Aug 01 15:27:27 overcloud-controller-0.opnfvlf.org vnet[437543]: #7 0x00007f43710d85e6 0x7f43710d85e6
Aug 01 15:27:27 overcloud-controller-0.opnfvlf.org systemd[1]: vpp.service: main process exited, code=killed, status=6/ABRT
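As a triage aid, the crash line in the log above can be picked apart programmatically. The following is only a minimal sketch: the function name `parse_crash_line`, the regular expression, and the 4 KiB page-size threshold are assumptions made for illustration and are not part of VPP. It extracts the signal, PC, and faulting address, and flags a faulting address inside the first page, which usually points to a field being read through a null or nearly null pointer.

```python
import re

# Pattern for the crash line VPP wrote to syslog in the log above.
CRASH_RE = re.compile(
    r"received signal (?P<signal>\w+), PC (?P<pc>0x[0-9a-fA-F]+), "
    r"faulting address (?P<addr>0x[0-9a-fA-F]+)"
)

PAGE_SIZE = 4096  # assumption: 4 KiB pages, typical on x86-64 Linux


def parse_crash_line(line):
    """Return signal, PC and faulting address from a crash log line, or None."""
    m = CRASH_RE.search(line)
    if m is None:
        return None
    addr = int(m.group("addr"), 16)
    return {
        "signal": m.group("signal"),
        "pc": int(m.group("pc"), 16),
        "faulting_address": addr,
        # An address inside the first page usually means a struct field
        # was accessed through a null (or nearly null) pointer.
        "near_null": addr < PAGE_SIZE,
    }


if __name__ == "__main__":
    line = (
        "Aug 01 15:27:27 overcloud-controller-0.opnfvlf.org vnet[437543]: "
        "received signal SIGSEGV, PC 0x7f4370ce9c9a, faulting address 0xcc"
    )
    print(parse_crash_line(line))
```

For the log above, the faulting address 0xcc falls well inside the first page, so the sketch marks the crash as a likely near-null dereference.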
Here's the gdb backtrace:
Program received signal SIGSEGV, Segmentation fault.
0x00007ffff7061c9a in replication_recycle_callback (vm=0x7ffff7bacf80 <vlib_global_main>, fl=0x7fffb569b700) at /usr/src/debug/vpp-18.07/src/vnet/replication.c:181
181 feature_node_index = ctx->recycle_node_index;
(gdb) bt
#0 0x00007ffff7061c9a in replication_recycle_callback (vm=0x7ffff7bacf80 <vlib_global_main>, fl=0x7fffb569b700) at /usr/src/debug/vpp-18.07/src/vnet/replication.c:181
#1 0x00007fffb3f96f7c in vlib_buffer_free_inline (follow_buffer_next=1, n_buffers=<optimized out>, buffers=<optimized out>, vm=<optimized out>)
at /w/workspace/vpp-merge-1807-centos7/build-root/rpmbuild/vpp-18.07/build-data/../src/plugins/dpdk/buffer.c:388
#2 dpdk_buffer_free_avx2 (vm=<optimized out>, buffers=<optimized out>, n_buffers=<optimized out>)
at /w/workspace/vpp-merge-1807-centos7/build-root/rpmbuild/vpp-18.07/build-data/../src/plugins/dpdk/buffer.c:398
#3 0x00007ffff7427b92 in vlib_buffer_free (n_buffers=1, buffers=<optimized out>, vm=0x7ffff7bacf80 <vlib_global_main>) at /usr/src/debug/vpp-18.07/src/vlib/buffer_funcs.h:544
#4 virtio_free_used_desc (vm=vm@entry=0x7ffff7bacf80 <vlib_global_main>, vring=vring@entry=0x7fffb7336fc0) at /usr/src/debug/vpp-18.07/src/vnet/devices/virtio/device.c:115
#5 0x00007ffff744beab in virtio_vring_free (vm=vm@entry=0x7ffff7bacf80 <vlib_global_main>, vif=vif@entry=0x7fffb7337a80, idx=<optimized out>)
at /usr/src/debug/vpp-18.07/src/vnet/devices/virtio/virtio.c:165
#6 0x00007ffff744fb19 in tap_delete_if (vm=0x7ffff7bacf80 <vlib_global_main>, sw_if_index=sw_if_index@entry=4) at /usr/src/debug/vpp-18.07/src/vnet/devices/tap/tap.c:447
#7 0x00007ffff74505e6 in vl_api_tap_delete_v2_t_handler (mp=0x30172084) at /usr/src/debug/vpp-18.07/src/vnet/devices/tap/tapv2_api.c:166
#8 0x00007ffff7bb61d3 in vl_msg_api_handler_with_vm_node (am=am@entry=0x7ffff7dda000 <api_main>, the_msg=0x30172084, vm=vm@entry=0x7ffff7bacf80 <vlib_global_main>,
node=node@entry=0x7fffb5943000) at /usr/src/debug/vpp-18.07/src/vlibapi/api_shared.c:508
#9 0x00007ffff7bbdcb5 in void_mem_api_handle_msg_i (am=<optimized out>, q=<optimized out>, node=0x7fffb5943000, vm=0x7ffff7bacf80 <vlib_global_main>)
at /usr/src/debug/vpp-18.07/src/vlibmemory/memory_api.c:687
#10 vl_mem_api_handle_msg_main (vm=vm@entry=0x7ffff7bacf80 <vlib_global_main>, node=node@entry=0x7fffb5943000) at /usr/src/debug/vpp-18.07/src/vlibmemory/memory_api.c:697
#11 0x00007ffff7bcce1c in vl_api_clnt_process (vm=<optimized out>, node=0x7fffb5943000, f=<optimized out>) at /usr/src/debug/vpp-18.07/src/vlibmemory/vlib_api.c:349
#12 0x00007ffff7956836 in vlib_process_bootstrap (_a=<optimized out>) at /usr/src/debug/vpp-18.07/src/vlib/main.c:1231
#13 0x00007ffff6483068 in clib_calljmp () at /usr/src/debug/vpp-18.07/src/vppinfra/longjmp.S:110
#14 0x00007fffb5b4ce30 in ?? ()
#15 0x00007ffff7957b29 in vlib_process_startup (f=0x0, p=0x7fffb5943000, vm=0x7ffff7bacf80 <vlib_global_main>) at /usr/src/debug/vpp-18.07/src/vlib/main.c:1253
#16 dispatch_process (vm=0x7ffff7bacf80 <vlib_global_main>, p=0x7fffb5943000, last_time_stamp=9708908937542267, f=0x0) at /usr/src/debug/vpp-18.07/src/vlib/main.c:1298
#17 0x0000000000000000 in ?? ()
(gdb)
Assignee
Mohsin Kazmi
Reporter
Onong Tayeng
Comments
- sykazmi (Wed, 20 Feb 2019 12:54:08 +0000):
API custom dump trace:
- ot (Tue, 19 Feb 2019 15:11:12 +0000): Yes, I still see the issue with 19.01.
- jhahn (Sun, 17 Feb 2019 23:08:31 +0000): Onong Tayeng Is this still an issue in 19.01?
- ot (Thu, 16 Aug 2018 17:14:06 +0000): VPP version = 18.07 RC1
- ot (Thu, 16 Aug 2018 17:13:33 +0000): The API trace which reproduces the issue.
Original issue: https://jira.fd.io/browse/VPP-1398