[ewg] bug 1918 - openmpi broken due to rdma-cm changes

Jason Gunthorpe jgunthorpe at obsidianresearch.com
Fri Feb 5 10:56:17 PST 2010


On Fri, Feb 05, 2010 at 12:32:51PM -0600, Steve Wise wrote:

> I think we should remove the feature of allowing binds to 127.0.0.1  
> altogether based on Jeff's arguments and my assertion that 127.0.0.1 is  
> a sw-loopback mechanism anyway...

I don't agree. The kernel should be free to provide a loopback
service any way it likes, and if that means using one of the HW
adaptors to accelerate the work, then fine. Consider: if we see the
RDMAoE (soft RDMA) patches, then it would be reasonable for all
kernels to support RDMA on the loopback.

At a minimum, RDMA CM is an IP service, so whatever logic you use to
determine addresses for TCP must also be applied after determining a
list of valid RDMA IPs. Trying an RDMA CM bind just gives you the list
of candidate addresses, no different from what netlink gives you for TCP.

One of those steps must at least be filtering 127.0.0.0/8. The user
should also be able to have some input into the IP filter - software
RDMAoE, for instance, really makes this important.
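
To make the filtering step concrete, here is a rough sketch (Python,
purely for brevity). It assumes the candidate list was already built by
trying RDMA CM binds; filter_candidates and user_excludes are
hypothetical names for illustration, not anything in librdmacm or
Open MPI.

```python
import ipaddress

# The loopback range that must always be filtered out.
LOOPBACK = ipaddress.ip_network("127.0.0.0/8")


def filter_candidates(candidates, user_excludes=()):
    """Drop loopback and user-excluded addresses from a candidate list.

    candidates    -- IP address strings (assumed: addresses where an
                     RDMA CM bind succeeded)
    user_excludes -- CIDR strings supplied by the user (hypothetical knob)
    """
    excluded = [LOOPBACK] + [ipaddress.ip_network(n) for n in user_excludes]
    kept = []
    for text in candidates:
        addr = ipaddress.ip_address(text)
        # Only compare an address against networks of the same IP version.
        if any(addr.version == net.version and addr in net for net in excluded):
            continue
        kept.append(text)
    return kept


if __name__ == "__main__":
    # Made-up addresses, for illustration only.
    print(filter_candidates(["127.0.0.1", "192.168.1.10", "10.0.0.5"],
                            user_excludes=["10.0.0.0/24"]))
```

The same function would sit after the TCP address discovery as well, so
both transports end up applying one filter.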

Jason


