Hi Amir,

I'm adding Eli Cohen to this thread. He has a patch for this which may still apply.

-- Hal

On Tue, Sep 8, 2015 at 7:09 PM, Amir Shehata <amir.shehata.whamcloud@gmail.com> wrote:

Hello all,

We are running Lustre with MLX5.

We were trying to increase o2iblnd's peer_credits to 32 on MLX5. Here is the problematic code:

    init_qp_attr->event_handler = kiblnd_qp_event;
    init_qp_attr->qp_context = conn;
    init_qp_attr->cap.max_send_wr = IBLND_SEND_WRS(version);
    init_qp_attr->cap.max_recv_wr = IBLND_RECV_WRS(version);
    init_qp_attr->cap.max_send_sge = 1;
    init_qp_attr->cap.max_recv_sge = 1;
    init_qp_attr->sq_sig_type = IB_SIGNAL_REQ_WR;
    init_qp_attr->qp_type = IB_QPT_RC;
    init_qp_attr->send_cq = cq;
    init_qp_attr->recv_cq = cq;

    rc = rdma_create_qp(cmid, conn->ibc_hdev->ibh_pd, init_qp_attr);

The relevant macros are:

    /* # of send work requests per connection */
    #define IBLND_SEND_WRS(v)    ((IBLND_RDMA_FRAGS(v) + 1) * IBLND_CONCURRENT_SENDS(v))

    #define IBLND_RDMA_FRAGS(v)  ((v) == IBLND_MSG_VERSION_1 ? \
                                  IBLND_MAX_RDMA_FRAGS : IBLND_CFG_RDMA_FRAGS)

    /* max # of fragments configured by user */
    #define IBLND_CFG_RDMA_FRAGS (*kiblnd_tunables.kib_map_on_demand != 0 ? \
                                  *kiblnd_tunables.kib_map_on_demand :      \
                                  IBLND_MAX_RDMA_FRAGS)

    /* max # of fragments supported */
    #define IBLND_MAX_RDMA_FRAGS LNET_MAX_IOV

    /** limit on the number of fragments in discontiguous MDs */
    #define LNET_MAX_IOV 256

Basically, with peer_credits set to 32 we end up with

    init_qp_attr->cap.max_send_wr = 8224

while the device reports

    [root@wt-2-00 ~]# ibv_devinfo -v | grep max_qp_wr
            max_qp_wr:                      16384

and rdma_create_qp() fails with -12 (out of memory).

With peer_credits set to 16 (max_send_wr = 4112) QP creation works.

We're running on MOFED 3.0.

Is there any limitation that we're hitting on the MLX5 side? As far as I know, MLX4 works with peer_credits set to 32.
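To make the sizing concrete, here is a small standalone sketch (not the actual o2iblnd code). It assumes map_on_demand is 0, so IBLND_RDMA_FRAGS falls back to LNET_MAX_IOV, and that the concurrent-send count equals peer_credits, which matches the 8224/4112 numbers above; the helper name send_wrs() is purely illustrative:

    /* Illustrative sketch of the send-WR sizing described above.
     * Assumes map_on_demand == 0 (fragment count falls back to
     * LNET_MAX_IOV) and concurrent sends == peer_credits. */
    #include <stdio.h>

    #define LNET_MAX_IOV 256                     /* max fragments per RDMA */

    static int send_wrs(int peer_credits)        /* illustrative helper */
    {
        int rdma_frags = LNET_MAX_IOV;           /* map_on_demand == 0 case */
        int concurrent_sends = peer_credits;     /* simplifying assumption */
        return (rdma_frags + 1) * concurrent_sends;
    }

    int main(void)
    {
        const int max_qp_wr = 16384;             /* from ibv_devinfo on this mlx5 HCA */
        const int credits[] = { 16, 32 };

        for (int i = 0; i < 2; i++) {
            int wrs = send_wrs(credits[i]);
            printf("peer_credits=%2d -> max_send_wr=%4d (%s advertised max_qp_wr=%d)\n",
                   credits[i], wrs,
                   wrs <= max_qp_wr ? "below" : "above", max_qp_wr);
        }
        return 0;
    }

Both values come out well below the advertised max_qp_wr of 16384, which is what makes the -12 from rdma_create_qp() surprising.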
Full device info:

    [wt2user1@wildcat2 ~]$ ibv_devinfo -v
    hca_id: mlx5_0
            transport:                      InfiniBand (0)
            fw_ver:                         12.100.6440
            node_guid:                      e41d:2d03:0060:7652
            sys_image_guid:                 e41d:2d03:0060:7652
            vendor_id:                      0x02c9
            vendor_part_id:                 4115
            hw_ver:                         0x0
            board_id:                       MT_2180110032
            phys_port_cnt:                  1
            max_mr_size:                    0xffffffffffffffff
            page_size_cap:                  0xfffff000
            max_qp:                         262144
            max_qp_wr:                      16384
            device_cap_flags:               0x40509c36
                                            BAD_PKEY_CNTR
                                            BAD_QKEY_CNTR
                                            AUTO_PATH_MIG
                                            CHANGE_PHY_PORT
                                            PORT_ACTIVE_EVENT
                                            SYS_IMAGE_GUID
                                            RC_RNR_NAK_GEN
                                            XRC
                                            Unknown flags: 0x40408000
            device_cap_exp_flags:           0x5020007100000000
                                            EXP_DC_TRANSPORT
                                            EXP_MEM_MGT_EXTENSIONS
                                            EXP_CROSS_CHANNEL
                                            EXP_MR_ALLOCATE
                                            EXT_ATOMICS
                                            EXT_SEND NOP
                                            EXP_UMR
            max_sge:                        30
            max_sge_rd:                     0
            max_cq:                         16777216
            max_cqe:                        4194303
            max_mr:                         16777216
            max_pd:                         16777216
            max_qp_rd_atom:                 16
            max_ee_rd_atom:                 0
            max_res_rd_atom:                4194304
            max_qp_init_rd_atom:            16
            max_ee_init_rd_atom:            0
            atomic_cap:                     ATOMIC_HCA_REPLY_BE (64)
            log atomic arg sizes (mask)     3c
            max fetch and add bit boundary  64
            log max atomic inline           5
            max_ee:                         0
            max_rdd:                        0
            max_mw:                         0
            max_raw_ipv6_qp:                0
            max_raw_ethy_qp:                0
            max_mcast_grp:                  2097152
            max_mcast_qp_attach:            48
            max_total_mcast_qp_attach:      100663296
            max_ah:                         2147483647
            max_fmr:                        0
            max_srq:                        8388608
            max_srq_wr:                     16383
            max_srq_sge:                    31
            max_pkeys:                      128
            local_ca_ack_delay:             16
            hca_core_clock:                 0
            max_klm_list_size:              65536
            max_send_wqe_inline_klms:       20
            max_umr_recursion_depth:        4
            max_umr_stride_dimension:       1
            general_odp_caps:
            rc_odp_caps:
                                            NO SUPPORT
            uc_odp_caps:
                                            NO SUPPORT
            ud_odp_caps:
                                            NO SUPPORT
            dc_odp_caps:
                                            NO SUPPORT
            xrc_odp_caps:
                                            NO SUPPORT
            raw_eth_odp_caps:
                                            NO SUPPORT
            max_dct:                        262144
            port:   1
                    state:                  PORT_ACTIVE (4)
                    max_mtu:                4096 (5)
                    active_mtu:             4096 (5)
                    sm_lid:                 19
                    port_lid:               1
                    port_lmc:               0x00
                    link_layer:             InfiniBand
                    max_msg_sz:             0x40000000
                    port_cap_flags:         0x2651e848
                    max_vl_num:             4 (3)
                    bad_pkey_cntr:          0x0
                    qkey_viol_cntr:         0x0
                    sm_sl:                  0
                    pkey_tbl_len:           128
                    gid_tbl_len:            8
                    subnet_timeout:         18
                    init_type_reply:        0
                    active_width:           4X (2)
                    active_speed:           25.0 Gbps (32)
                    phys_state:             LINK_UP (5)
                    GID[  0]:               fe80:0000:0000:0000:e41d:2d03:0060:7652

thanks,
amir
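For reference, here is a minimal userspace sketch (libibverbs, linked with -libverbs; this is not the kernel o2iblnd path) that reads the same max_qp_wr limit shown in the ibv_devinfo output above:

    /* Query the first RDMA device and print max_qp_wr, the same limit
     * reported by ibv_devinfo above. Userspace illustration only. */
    #include <stdio.h>
    #include <infiniband/verbs.h>

    int main(void)
    {
        int num;
        struct ibv_device **list = ibv_get_device_list(&num);
        if (!list || num == 0) {
            fprintf(stderr, "no RDMA devices found\n");
            return 1;
        }

        struct ibv_context *ctx = ibv_open_device(list[0]);
        if (!ctx) {
            fprintf(stderr, "failed to open %s\n", ibv_get_device_name(list[0]));
            ibv_free_device_list(list);
            return 1;
        }

        struct ibv_device_attr attr;
        if (ibv_query_device(ctx, &attr) == 0)
            printf("%s: max_qp_wr = %d\n", ibv_get_device_name(list[0]), attr.max_qp_wr);

        ibv_close_device(ctx);
        ibv_free_device_list(list);
        return 0;
    }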