[ewg] IPoIB-CM - ib0: dev_queue_xmit failed to requeue packet

Andrew McKinney am at sativa.org.uk
Mon Aug 5 02:42:32 PDT 2013


Hi list.

We're running a TCP middleware over IPoIB-CM (OFED-3.5-2) on Red Hat 6.4.
We intend to eventually run a multicast RDMA middleware on the stack.

The hardware stack is Dell R720s (some Westmere, mostly Sandy Bridge) with
bonded Mellanox MT26428 ConnectX-2 HCAs on two QLogic 12300 managed switches.
We're running the latest firmware on the HCAs and the switches.

We have been seeing the following messages in the kernel ring buffer, which
seem to coincide with page allocation failures:

ib0: dev_queue_xmit failed to requeue packet
ib0: dev_queue_xmit failed to requeue packet
ib0: dev_queue_xmit failed to requeue packet
ib0: dev_queue_xmit failed to requeue packet
ib0: dev_queue_xmit failed to requeue packet
ib0: dev_queue_xmit failed to requeue packet
java: page allocation failure. order:1, mode:0x20
Pid: 24410, comm: java Tainted: P           ---------------    2.6.32-279.el6.x86_64 #1
Call Trace:
 <IRQ>  [<ffffffff8112759f>] ? __alloc_pages_nodemask+0x77f/0x940
 [<ffffffff81489c00>] ? tcp_rcv_established+0x290/0x800
 [<ffffffff81161d62>] ? kmem_getpages+0x62/0x170
 [<ffffffff8116297a>] ? fallback_alloc+0x1ba/0x270
 [<ffffffff811623cf>] ? cache_grow+0x2cf/0x320
 [<ffffffff811626f9>] ? ____cache_alloc_node+0x99/0x160
 [<ffffffff8143014d>] ? __alloc_skb+0x6d/0x190
 [<ffffffff811635bf>] ? kmem_cache_alloc_node_notrace+0x6f/0x130
 [<ffffffff811637fb>] ? __kmalloc_node+0x7b/0x100
 [<ffffffff8143014d>] ? __alloc_skb+0x6d/0x190
 [<ffffffff8143028d>] ? dev_alloc_skb+0x1d/0x40
 [<ffffffffa0673f90>] ? ipoib_cm_alloc_rx_skb+0x30/0x430 [ib_ipoib]
 [<ffffffffa067523f>] ? ipoib_cm_handle_rx_wc+0x29f/0x770 [ib_ipoib]
 [<ffffffffa018c828>] ? mlx4_ib_poll_cq+0xa8/0x890 [mlx4_ib]
 [<ffffffffa066c01d>] ? ipoib_ib_completion+0x2d/0x30 [ib_ipoib]
 [<ffffffffa066d80b>] ? ipoib_poll+0xdb/0x190 [ib_ipoib]
 [<ffffffff810600bc>] ? try_to_wake_up+0x24c/0x3e0
 [<ffffffff8143f193>] ? net_rx_action+0x103/0x2f0
 [<ffffffff81073ec1>] ? __do_softirq+0xc1/0x1e0
 [<ffffffff810db800>] ? handle_IRQ_event+0x60/0x170
 [<ffffffff8100c24c>] ? call_softirq+0x1c/0x30
 [<ffffffff8100de85>] ? do_softirq+0x65/0xa0
 [<ffffffff81073ca5>] ? irq_exit+0x85/0x90
 [<ffffffff81505af5>] ? do_IRQ+0x75/0xf0
 [<ffffffff8100ba53>] ? ret_from_intr+0x0/0x11
 <EOI>

These appear to be genuine drops, as we are seeing gaps in our middleware,
which then has to recap (re-request the missed data).

We've just made a change to increase the page cache from ~90M to 128M, but
what is the list's feeling on the dev_queue_xmit errors? Could they be caused
by the same underlying issue, i.e. pages not being allocatable in a timely
manner? (If I'm reading the trace right, mode:0x20 is GFP_ATOMIC, and the
allocation is happening in softirq context, so the allocator can neither
sleep nor reclaim when it comes up empty.)
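
One thing it might be worth checking is whether memory fragmentation is what
is starving those order-1 atomic allocations, by watching /proc/buddyinfo on
the affected hosts. A minimal sketch of a helper for that (hypothetical,
knocked up for this mail, not part of OFED or our stack):

/*
 * buddy_watch.c - print the free order-1 (8 KiB contiguous) block count
 * per memory zone from /proc/buddyinfo. The "order:1, mode:0x20" failure
 * above means a GFP_ATOMIC caller found no free order-1 block, so counts
 * that sit near zero under load would point at fragmentation.
 */
#include <stdio.h>

int main(void)
{
	FILE *f = fopen("/proc/buddyinfo", "r");
	char line[512];

	if (!f) {
		perror("/proc/buddyinfo");
		return 1;
	}
	while (fgets(line, sizeof(line), f)) {
		char node[16], zone[32];
		long counts[16];
		int n = 0, off = 0, used;

		/* Each line reads: "Node 0, zone   Normal  <order-0> <order-1> ..." */
		if (sscanf(line, "Node %15[^,], zone %31s%n", node, zone, &off) != 2)
			continue;
		while (n < 16 && sscanf(line + off, "%ld%n", &counts[n], &used) == 1) {
			off += used;
			n++;
		}
		if (n > 1)
			printf("node %s zone %-8s order-1 free: %ld\n",
			       node, zone, counts[1]);
	}
	fclose(f);
	return 0;
}

Sampling that once a second during the problem window should show whether the
Normal zone's order-1 pool is bouncing off zero; if so, raising
vm.min_free_kbytes may be a more direct knob than the page-cache change above.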

We're not running at anywhere near high message rates (under 1,000 messages
per second, at roughly 450 bytes each).

I can see a thread started in 2012 where someone had triggered these
dev_queue_xmit warnings using netperf, and Roland had suggested that at worst
one packet was being dropped. The thread then went silent.
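
For reference, as far as I can tell from the 2.6.32-era driver source (treat
this as a sketch from memory, not gospel), the warning comes from the
connected-mode REP handler in drivers/infiniband/ulp/ipoib/ipoib_cm.c: skbs
that were queued against the neighbour while the RC connection was still
being set up are flushed through dev_queue_xmit() once the connection comes
up, and each nonzero return drops exactly that one skb, which would square
with Roland's "at worst one packet" comment:

/* Excerpt (from memory) of ipoib_cm_rep_handler() in ipoib_cm.c: once
 * the RC connection is established, packets queued during setup are
 * pushed back through the normal transmit path; if the device TX queue
 * happens to be stopped at that moment, dev_queue_xmit() returns
 * nonzero and that skb is dropped with the warning we're seeing. */
while ((skb = __skb_dequeue(&skqueue))) {
	skb->dev = p->dev;
	if (dev_queue_xmit(skb))
		ipoib_warn(priv, "dev_queue_xmit failed "
			   "to requeue packet\n");
}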

Has anyone seen this behavior, or got any pointers to chase this down?

Cheers,
-Andrew

ibv_devinfo

hca_id:    mlx4_1
    transport:            InfiniBand (0)
    fw_ver:                2.9.1000
    node_guid:            0002:c903:0057:2250
    sys_image_guid:            0002:c903:0057:2253
    vendor_id:            0x02c9
    vendor_part_id:            26428
    hw_ver:                0xB0
    board_id:            MT_0D90110009
    phys_port_cnt:            1
    max_mr_size:            0xffffffffffffffff
    page_size_cap:            0xfffffe00
    max_qp:                163776
    max_qp_wr:            16351
    device_cap_flags:        0x007c9c76
    max_sge:            32
    max_sge_rd:            0
    max_cq:                65408
    max_cqe:            4194303
    max_mr:                524272
    max_pd:                32764
    max_qp_rd_atom:            16
    max_ee_rd_atom:            0
    max_res_rd_atom:        2620416
    max_qp_init_rd_atom:        128
    max_ee_init_rd_atom:        0
    atomic_cap:            ATOMIC_HCA (1)
    max_ee:                0
    max_rdd:            0
    max_mw:                0
    max_raw_ipv6_qp:        0
    max_raw_ethy_qp:        0
    max_mcast_grp:            8192
    max_mcast_qp_attach:        248
    max_total_mcast_qp_attach:    2031616
    max_ah:                0
    max_fmr:            0
    max_srq:            65472
    max_srq_wr:            16383
    max_srq_sge:            31
    max_pkeys:            128
    local_ca_ack_delay:        15
        port:    1
            state:            PORT_ACTIVE (4)
            max_mtu:        4096 (5)
            active_mtu:        2048 (4)
            sm_lid:            1
            port_lid:        9
            port_lmc:        0x00
            link_layer:        InfiniBand
            max_msg_sz:        0x40000000
            port_cap_flags:        0x02510868
            max_vl_num:        4 (3)
            bad_pkey_cntr:        0x0
            qkey_viol_cntr:        0x0
            sm_sl:            0
            pkey_tbl_len:        128
            gid_tbl_len:        128
            subnet_timeout:        17
            init_type_reply:    0
            active_width:        4X (2)
            active_speed:        10.0 Gbps (4)
            phys_state:        LINK_UP (5)
            GID[  0]:        fe80:0000:0000:0000:0002:c903:0057:2251

hca_id:    mlx4_0
    transport:            InfiniBand (0)
    fw_ver:                2.9.1000
    node_guid:            0002:c903:0057:2764
    sys_image_guid:            0002:c903:0057:2767
    vendor_id:            0x02c9
    vendor_part_id:            26428
    hw_ver:                0xB0
    board_id:            MT_0D90110009
    phys_port_cnt:            1
    max_mr_size:            0xffffffffffffffff
    page_size_cap:            0xfffffe00
    max_qp:                163776
    max_qp_wr:            16351
    device_cap_flags:        0x007c9c76
    max_sge:            32
    max_sge_rd:            0
    max_cq:                65408
    max_cqe:            4194303
    max_mr:                524272
    max_pd:                32764
    max_qp_rd_atom:            16
    max_ee_rd_atom:            0
    max_res_rd_atom:        2620416
    max_qp_init_rd_atom:        128
    max_ee_init_rd_atom:        0
    atomic_cap:            ATOMIC_HCA (1)
    max_ee:                0
    max_rdd:            0
    max_mw:                0
    max_raw_ipv6_qp:        0
    max_raw_ethy_qp:        0
    max_mcast_grp:            8192
    max_mcast_qp_attach:        248
    max_total_mcast_qp_attach:    2031616
    max_ah:                0
    max_fmr:            0
    max_srq:            65472
    max_srq_wr:            16383
    max_srq_sge:            31
    max_pkeys:            128
    local_ca_ack_delay:        15
        port:    1
            state:            PORT_ACTIVE (4)
            max_mtu:        4096 (5)
            active_mtu:        2048 (4)
            sm_lid:            1
            port_lid:        10
            port_lmc:        0x00
            link_layer:        InfiniBand
            max_msg_sz:        0x40000000
            port_cap_flags:        0x02510868
            max_vl_num:        4 (3)
            bad_pkey_cntr:        0x0
            qkey_viol_cntr:        0x0
            sm_sl:            0
            pkey_tbl_len:        128
            gid_tbl_len:        128
            subnet_timeout:        17
            init_type_reply:    0
            active_width:        4X (2)
            active_speed:        10.0 Gbps (4)
            phys_state:        LINK_UP (5)
            GID[  0]:        fe80:0000:0000:0000:0002:c903:0057:2765


slabtop

 Active / Total Objects (% used)    : 3436408 / 5925284 (58.0%)
 Active / Total Slabs (% used)      : 178659 / 178867 (99.9%)
 Active / Total Caches (% used)     : 117 / 193 (60.6%)
 Active / Total Size (% used)       : 422516.74K / 692339.54K (61.0%)
 Minimum / Average / Maximum Object : 0.02K / 0.12K / 4096.00K

  OBJS ACTIVE  USE OBJ SIZE  SLABS OBJ/SLAB CACHE SIZE NAME
4461349 2084881  46%    0.10K 120577       37    482308K buffer_head
548064 547979  99%    0.02K   3806      144     15224K avtab_node
370496 368197  99%    0.03K   3308      112     13232K size-32
135534 105374  77%    0.55K  19362        7     77448K radix_tree_node
 67946  51531  75%    0.07K   1282       53      5128K selinux_inode_security
 57938  35717  61%    0.06K    982       59      3928K size-64
 42620  42303  99%    0.19K   2131       20      8524K dentry
 25132  25129  99%    1.00K   6283        4     25132K ext4_inode_cache
 23600  23436  99%    0.19K   1180       20      4720K size-192
 18225  18189  99%    0.14K    675       27      2700K sysfs_dir_cache
 17062  15025  88%    0.20K    898       19      3592K vm_area_struct
 16555   9899  59%    0.05K    215       77       860K anon_vma_chain
 15456  15143  97%    0.62K   2576        6     10304K proc_inode_cache
 14340   8881  61%    0.19K    717       20      2868K filp
 12090   7545  62%    0.12K    403       30      1612K size-128
 10770   8748  81%    0.25K    718       15      2872K skbuff_head_cache
 10568   8365  79%    1.00K   2642        4     10568K size-1024
  8924   5464  61%    0.04K     97       92       388K anon_vma
  7038   6943  98%    0.58K   1173        6      4692K inode_cache
  5192   4956  95%    2.00K   2596        2     10384K size-2048
  3600   3427  95%    0.50K    450        8      1800K size-512
  3498   3105  88%    0.07K     66       53       264K eventpoll_pwq
  3390   3105  91%    0.12K    113       30       452K eventpoll_epi
  3335   3239  97%    0.69K    667        5      2668K sock_inode_cache
  2636   2612  99%    1.62K    659        4      5272K TCP
  2380   1962  82%    0.11K     70       34       280K task_delay_info
  2310   1951  84%    0.12K     77       30       308K pid
  2136   2053  96%    0.44K    267        8      1068K ib_mad
  1992   1947  97%    2.59K    664        3      5312K task_struct
  1888   1506  79%    0.06K     32       59       128K tcp_bind_bucket
  1785   1685  94%    0.25K    119       15       476K size-256
  1743    695  39%    0.50K    249        7       996K skbuff_fclone_cache
  1652    532  32%    0.06K     28       59       112K avc_node
  1640   1175  71%    0.19K     82       20       328K cred_jar
  1456   1264  86%    0.50K    182        8       728K task_xstate
  1378    781  56%    0.07K     26       53       104K Acpi-Operand
  1156    459  39%    0.11K     34       34       136K jbd2_journal_head
  1050    983  93%    0.78K    210        5       840K shmem_inode_cache
  1021    879  86%    4.00K   1021        1      4084K size-4096
  1020    537  52%    0.19K     51       20       204K bio-0
  1008    501  49%    0.02K      7      144        28K dm_target_io
   920    463  50%    0.04K     10       92        40K dm_io
   876    791  90%    1.00K    219        4       876K signal_cache
   840    792  94%    2.06K    280        3      2240K sighand_cache
   740    439  59%    0.10K     20       37        80K ext4_prealloc_space
   736    658  89%    0.04K      8       92        32K Acpi-Namespace
   720    283  39%    0.08K     15       48        60K blkdev_ioc
   720    294  40%    0.02K      5      144        20K jbd2_journal_handle
   708    131  18%    0.06K     12       59        48K fs_cache
   630    429  68%    0.38K     63       10       252K ip_dst_cache
   627    625  99%    8.00K    627        1      5016K size-8192
   616    297  48%    0.13K     22       28        88K cfq_io_context
   480    249  51%    0.23K     30       16       120K cfq_queue
   370    330  89%    0.75K     74        5       296K UNIX
   368     31   8%    0.04K      4       92        16K khugepaged_mm_slot
   357    325  91%    0.53K     51        7       204K idr_layer_cache
   341    128  37%    0.69K     31       11       248K files_cache
   270    159  58%    0.12K      9       30        36K scsi_sense_cache
   246    244  99%    1.81K    123        2       492K TCPv6
   231    131  56%    0.34K     21       11        84K blkdev_requests
   210    102  48%    1.38K     42        5       336K mm_struct
   210    116  55%    0.25K     14       15        56K sgpool-8
   202     14   6%    0.02K      1      202         4K jbd2_revoke_table
   192    192 100%   32.12K    192        1     12288K kmem_cache
   180    121  67%    0.25K     12       15        48K scsi_cmd_cache
   170    113  66%    0.11K      5       34        20K inotify_inode_mark_entry
   144    121  84%    0.16K      6       24        24K sigqueue
   134      4   2%    0.05K      2       67         8K ext4_free_block_extents
   118     26  22%    0.06K      2       59         8K fib6_nodes
   112      2   1%    0.03K      1      112         4K ip_fib_alias
   112      1   0%    0.03K      1      112         4K dnotify_struct
   112      2   1%    0.03K      1      112         4K sd_ext_cdb