[Openib-windows] RE: [openib-general] Microsoft virtual machine and Infiniband
Tzachi Dar
tzachid at mellanox.co.il
Wed Mar 1 12:52:15 PST 2006
See below.
Thanks
Tzachi
> -----Original Message-----
> From: ftillier.sst at gmail.com [mailto:ftillier.sst at gmail.com]
> On Behalf Of Fabian Tillier
> Sent: Tuesday, February 28, 2006 12:49 AM
> To: Tzachi Dar
> Cc: openib-general at openib.org; openib-windows at openib.org
> Subject: Re: [openib-general] Microsoft virtual machine and Infiniband
>
> Hi Tzachi,
>
> On 2/27/06, Tzachi Dar <tzachid at mellanox.co.il> wrote:
> > Hi Fab,
> >
> > When trying to run Windows 2003 Server on a Microsoft virtual machine,
> > we found one problem that prevents IPoIB from running.
> >
> > The problem, as you can guess, is related to the way MAC addresses are
> > handled. On such a machine, a fake MAC address is created and later
> > used for communication (one MAC per guest OS). However, although these
> > packets return to the correct computer, IPoIB doesn't restore their
> > correct destination MAC, and therefore pinging a remote host is
> > impossible.
>
> Is IPoIB running on the guest OS, or on the host OS? I'm
> assuming host, and the guest sends packets using its guest
> MAC. So a packet gets passed to IPoIB using the guest MAC as
> source. The recipient of such a packet tries to reconstruct
> the Ethernet header, and ends up with the sender's host MAC,
> rather than the sender's guest MAC. Am I following this properly?
>
Yes you are.
> > In order to solve this problem, we need a mechanism that allows the
> > IPoIB driver to correct the MAC addresses of packets based on their
> > IP addresses.
>
> So, the recipient should do an IP lookup on every received IP
> packet and restore the MAC based on the IP, rather than just
> based on the LID/GID of the source. This requires adding a
> mechanism to look up by IP, which currently doesn't exist (do
> we need to support duplicate IPs?)
>
We don't have to support multiple IPs, but I don't see an issue here.
> Currently the receive flow does something like this:
>
> resolve endpoints
> discard loopback
> switch packet type
> {
> case IP:
>     handle IP packet; break;
>
> case ARP:
>     handle ARP packet; break;
>
> default:
>     handle generic packet; break;
> }
>
> This would have to change to something like this:
>
> resolve source by LID/GID and discard loopback
> switch packet type
> {
> case IP:
>     resolve endpoints by IP;
>     handle IP packet; break;
>
> case ARP:
>     process ARP, creating IP mappings; break;
>
> default:
>     resolve destination from WC;
>     handle generic packet; break;
> }
>
I was thinking of something a little different; see below. I believe it
makes it easier to see where the virtual machine support goes:

switch packet type
{
case IP:
    handle IP packet;
    fix the MAC based on the IP addresses;
    break;

case ARP:
    handle ARP packet;
    if ARP request, update the IP-to-MAC table;
    if ARP reply, fix its destination MAC as for IP;
    break;

default:
    handle generic packet; break;
}
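
A minimal sketch of this fix-up, written in Python purely for
illustration (it is not the driver code). It assumes the frame has
already been rebuilt as an untagged Ethernet II frame carrying IPv4 or
ARP, and that the table is learned from the sender fields of ARP
requests; which path those requests are snooped on is an assumption.
The names ip_to_mac, set_dest_mac_by_ip and fix_received_frame are
hypothetical.

```python
import struct
from typing import Dict

ETHERTYPE_IP = 0x0800
ETHERTYPE_ARP = 0x0806
ARP_REQUEST = 1
ARP_REPLY = 2

# Hypothetical table: IPv4 address (4 bytes) -> guest MAC (6 bytes).
ip_to_mac: Dict[bytes, bytes] = {}


def set_dest_mac_by_ip(frame: bytearray, dest_ip: bytes) -> None:
    """Overwrite the Ethernet destination MAC if dest_ip is known."""
    mac = ip_to_mac.get(dest_ip)
    if mac is not None:
        frame[0:6] = mac


def fix_received_frame(frame: bytearray) -> None:
    """Apply the IP-based MAC fix-up to one reconstructed frame."""
    ethertype = struct.unpack("!H", frame[12:14])[0]
    if ethertype == ETHERTYPE_IP:
        # IPv4 destination address: bytes 16..19 of the IP header,
        # which starts right after the 14-byte Ethernet header.
        set_dest_mac_by_ip(frame, bytes(frame[30:34]))
    elif ethertype == ETHERTYPE_ARP:
        oper = struct.unpack("!H", frame[20:22])[0]
        if oper == ARP_REQUEST:
            # Learn sender IP -> sender MAC from the ARP payload.
            ip_to_mac[bytes(frame[28:32])] = bytes(frame[22:28])
        elif oper == ARP_REPLY:
            # Fix the destination using the ARP target IP.
            set_dest_mac_by_ip(frame, bytes(frame[38:42]))
    # Other packet types are left untouched.
```

Frames whose destination IP is not in the table pass through
unchanged, so hosts without guest OSes should see no difference.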
> > It seems that the best way to do this is to have a "static" table of
> > IPs and MAC addresses and to check every IP packet as well as every
> > ARP reply. We ran such an experiment and it did seem to work.
>
> Why have a static table? Why not just extend the endpoint
> lookup mechanisms to support lookup by IP?
>
> > We are still looking for a way to configure the table of guest OSes
> > and their IPs and MACs. One way is simply to have a static table
> > loaded from a file. Although this is the simplest way, it has an
> > obvious disadvantage (the need to manually configure the machine).
> > A different way is to find some configuration APIs that the remote
> > machine has, while the last possibility is to learn the information
> > by sniffing packets (the way an Ethernet switch does).
>
> We have to sniff the packets, both outbound and inbound, to
> do IPoIB encapsulation since we pretend to be a standard
> 802.3 NIC. Additional snooping shouldn't be a big deal. If
> it is, we can add a configuration parameter to turn the IP
> based MAC resolution on/off.
>
> > One bug that I have already found is that if a broadcast packet is
> > sent, for example an ARP request, we send the packet as a multicast,
> > and we also receive the packet ourselves, and later we send this
> > packet to NDIS. This is not the correct behavior (assuming we are
> > emulating Ethernet behavior), and we should drop these packets.
>
> Yes, I have a fix for this in my sandbox already. Any packet
> we receive where we are the sender needs to be discarded.
> The existing check in the code for loopback packets uses the
> unformatted ethernet header, which clearly doesn't work.
> Thanks for pointing it out, though!
>
It would be nice if you can check in the fix to winib.
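
For reference, the rule "discard any packet where we are the sender"
can be sketched as below. This is only an illustration in Python, not
the fix from Fab's sandbox. It assumes the sender of a received packet
can be identified by its port GID and QP number, and the names Sender,
is_own_packet and drop_own are hypothetical.

```python
from typing import List, NamedTuple, Tuple


class Sender(NamedTuple):
    gid: bytes  # 16-byte port GID (assumed available for the sender)
    qpn: int    # queue pair number of the sender


def is_own_packet(src: Sender, local: Sender) -> bool:
    """True if a received packet was sent by the local port itself."""
    return src.gid == local.gid and src.qpn == local.qpn


def drop_own(packets: List[Tuple[Sender, bytes]], local: Sender) -> List[bytes]:
    """Keep only payloads that did not originate from the local port."""
    return [payload for src, payload in packets
            if not is_own_packet(src, local)]
```

The comparison uses the sender's InfiniBand identity rather than the
Ethernet header, since the check on the unformatted Ethernet header is
the one reported not to work.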
> > In the next week I'll try to create a patch that will allow the
> > virtual machine to work; I just wanted to know your opinion on this
> > issue.
>
> Cool, thanks! Hopefully my understanding above is correct.
> Please let me know if I've missed something.
>
> Thanks,
>
> - Fab
>
>
>