On Wed, Sep 13, 2017 at 2:06 PM, Simon Guilbault <simon.guilbault@calculquebec.ca> wrote:
> Hi,
>
> SR-IOV can be used to pass a subset of an IB card directly to a VM. The VM
> will have RDMA access to the IB subnet without being able to start a subnet
> manager or use some of the InfiniBand management features. I haven't tested
> the features in depth, but a Lustre client will work with RDMA without any
> modification compared to a bare-metal node. The VM will also be able to set
> any address for IPoIB; there does not seem to be a filter like there is on a
> virtualized Ethernet bridge.
>
> Check whether SR-IOV is enabled on the host and on the card; this might
> require flashing the card's firmware to activate the virtual functions. You
> might also need to add a kernel parameter and enable a feature in the BIOS.
>
> Once SR-IOV is working, multiple virtual functions will be listed for the card:
>
> $ lspci | grep -i mellanox
> 01:00.0 Network controller: Mellanox Technologies MT27500 Family [ConnectX-3]
> 01:00.1 Network controller: Mellanox Technologies MT27500 Family [ConnectX-3 Virtual Function]
> 01:00.2 Network controller: Mellanox Technologies MT27500 Family [ConnectX-3 Virtual Function]
> 01:00.3 Network controller: Mellanox Technologies MT27500 Family [ConnectX-3 Virtual Function]
> [...]
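
For a ConnectX-3 behind the mlx4 driver, those enablement steps look roughly
like this (a sketch; the MST device path and the VF counts are illustrative,
not from the original post):

# Host kernel: make sure the IOMMU is on (Intel example); add to the
# kernel command line and reboot:
#   intel_iommu=on

# Enable SR-IOV in the ConnectX-3 firmware with the Mellanox MFT tools
# (use the /dev/mst path that `mst status` reports for your card):
mst start
mlxconfig -d /dev/mst/mt4099_pci_cr0 set SRIOV_EN=1 NUM_OF_VFS=8

# Have the mlx4 driver instantiate the VFs at module load time:
echo "options mlx4_core num_vfs=4 probe_vf=0" > /etc/modprobe.d/mlx4_sriov.conf
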
> If using virsh and KVM, modify the VM's XML with something like this:
>
>   <hostdev mode='subsystem' type='pci' managed='yes'>
>     <source>
>       <address domain='0x0000' bus='0x01' slot='0x00' function='0x1'/>
>     </source>
>     <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x0'/>
>   </hostdev>
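
The same stanza can also be attached without hand-editing the domain XML; a
sketch, assuming the guest is named "ib-vm" (a placeholder) and the VF is
01:00.1 as in the lspci listing above:

# Save the <hostdev> stanza as hostdev.xml, then attach it persistently:
virsh attach-device ib-vm hostdev.xml --config

# Check which driver now owns the VF on the host:
lspci -k -s 01:00.1
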
> You can make it work with OpenStack by adding these lines to nova.conf:
>
> pci_alias = { "vendor_id":"15b3", "product_id":"1004", "name": "ib" }
> pci_passthrough_whitelist = [{ "vendor_id":"15b3", "product_id":"1004" }]
>
> The compute flavor needs a bit of metadata to bind an IB card to the VM.
> Nova will select a virtual function on the hypervisor; this completely
> bypasses any control done by Neutron.
>
> pci_passthrough:alias = ib:1
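
Setting that flavor property with the client tools would look something like
this (the flavor name and sizes are placeholders):

# The property requests one VF matching the "ib" alias from nova.conf:
openstack flavor create --ram 4096 --vcpus 4 --disk 20 ib.small
openstack flavor set ib.small --property "pci_passthrough:alias"="ib:1"
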
> The VM will see a standard InfiniBand interface:
>
> # ip addr
> [...]
> 3: ib0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 2044 qdisc pfifo_fast state UP qlen 256
>     link/infiniband 80:[...] brd 00:[...]
>     inet 10.x.x.x/16 brd 10.225.255.255 scope global ib0
>
> # lspci | grep -i mellanox
> 00:06.0 Network controller: Mellanox Technologies MT27500/MT27520 Family [ConnectX-3/ConnectX-3 Pro Virtual Function]
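
To confirm RDMA works end-to-end from the VM, and that a Lustre client really
is unchanged, something like this should do (the peer hostname, MGS NID and
fsname are placeholders):

# Inside the VM: the VF should show up as a verbs device:
ibv_devinfo

# RDMA bandwidth test against a bare-metal peer
# (run plain "ib_write_bw" on the peer first):
ib_write_bw storage-node01

# A Lustre mount over o2ib, same as on bare metal:
mount -t lustre 10.225.0.10@o2ib:/lustrefs /mnt/lustre
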
> Some management commands will not work inside a VM, but will work on the
> hypervisor:
>
> [root@vm ~]# perfquery -H
> ibwarn: [865] mad_rpc: _do_madrpc failed; dport (Lid 1504)
> perfquery: iberror: failed: classportinfo query

I think perfquery with the -G option may work in the VM, as it adds the needed GRH.
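
Untested, but something like the following, where the port GUID is a made-up
placeholder:

# Address the counters by port GUID (-G) so the MAD carries a GRH;
# here querying port 1 of the target:
perfquery -G 0x0002c903002a1234 1
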
> [root@hypervisor ~]# perfquery -H
> # Port counters: Lid 1504 port 1 (CapMask: 0x1600)
> PortSelect:......................1
> [...]
>
> On Wed, Sep 13, 2017 at 1:39 PM, Kevin Abbey <kevina@oarc.rutgers.edu> wrote:
>> Hi,
>>
>> I'm new to IB pass-through and SR-IOV. We're looking to connect to storage
>> on a physical IB subnet via RDMA from a virtual IB port in a VM. I'm
>> guessing that an IB router may be possible, or that RoCE is needed?
>>
>> Is this possible?
>>
>> I haven't found a description of how to do this, if it is possible.
>>
>> Thanks for any references or comments.
>> Kevin
>>
>> --
>> Kevin Abbey
>> Systems Administrator
>>
>> Office of Advanced Research Computing (OARC)
>> Rutgers, The State University of New Jersey
>> http://oarc.rutgers.edu/
>>
>> Telephone: (848) 445-5263
>> Email: kevina@oarc.rutgers.edu

_______________________________________________
Users mailing list
Users@lists.openfabrics.org
http://lists.openfabrics.org/mailman/listinfo/users