[ofa-general] ipath oops

Bernd Schubert bs at q-leap.de
Mon Apr 2 02:57:54 PDT 2007


On Friday 30 March 2007 19:18:07 Robert Walsh wrote:
> > Stack traceback for pid 3191
> > 0xffff81007755c100     3191       19  1    3   R  0xffff81007755c3c0
> > *ib_cm/3 rsp                rip                Function (args)
> > 0xffff81007c0839d8 0xffffffff803513d2 __iowrite32_copy+0x2
> > 0xffff81007c083a08 0xffffffff88066161 [ib_ipath]ipath_verbs_send+0x10b
> > 0xffff81007c083a68 0xffffffff88061205 [ib_ipath]ipath_do_ruc_send+0x707
> > 0xffff81007c083af8 0xffffffff88061619 [ib_ipath]ipath_post_ruc_send+0x1fd
> > 0xffff81007c083b58 0xffffffff88065c39 [ib_ipath]ipath_post_send+0x70
> > 0xffff81007c083b88 0xffffffff88284685 [ko2iblnd]kiblnd_check_sends+0x5c0
>
> This looks a lot like an OOPs we saw recently in SDP.  Are you using
> dma_map_single or related functions?  If so, is the memory you're
> mapping going through the ib_dma_* interface?  On Mellanox hardware,
> these are all just pass-throughs to the real dma_map_* functions, but on
> ipath hardware we intercept the calls to set up mapping tables.  Without
> this, we won't work.
>
> Look in rdma/ib_verbs.h to see the list of functions that are
> intercepted.  Search for ib_dma and ib_sg.
>
> Let me know what you see.

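Here is a list of the calls in the Lustre code that ipath would need to
intercept. All of them go straight to the dma_* functions rather than
through the ib_dma_* interface.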

o2iblnd.c:
                rx->rx_msgaddr = dma_map_single(cmid->device->dma_device,
                                                rx->rx_msg,
                                                IBLND_MSG_SIZE,
                                                DMA_FROM_DEVICE);

o2iblnd.c:
                tx->tx_msgaddr = dma_map_single(
                        kiblnd_data.kib_cmid->device->dma_device,
                        tx->tx_msg, IBLND_MSG_SIZE, DMA_TO_DEVICE);


o2iblnd.c:
                        dma_unmap_single(conn->ibc_cmid->device->dma_device,
                                         pci_unmap_addr(rx, rx_msgunmap),
                                         IBLND_MSG_SIZE, DMA_FROM_DEVICE);

o2iblnd.c:
                dma_unmap_single(kiblnd_data.kib_cmid->device->dma_device,
                                 pci_unmap_addr(tx, tx_msgunmap),
                                 IBLND_MSG_SIZE, DMA_TO_DEVICE);


o2iblnd_cb.c:
        rd->rd_nfrags = dma_map_sg(kiblnd_data.kib_cmid->device->dma_device,
                                   tx->tx_frags, tx->tx_nfrags, tx->tx_dmadir);


o2iblnd_cb.c:
                dma_unmap_sg(kiblnd_data.kib_cmid->device->dma_device,
                             tx->tx_frags, tx->tx_nfrags, tx->tx_dmadir);

o2iblnd_cb.c:
                rd->rd_frags[i].rf_addr = sg_dma_address(&tx->tx_frags[i]);

o2iblnd_cb.c:
                rd->rd_frags[i].rf_nob  = sg_dma_len(&tx->tx_frags[i]);
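
To make the difference concrete, here is a small userspace model of the
dispatch Robert describes: a wrapper that passes through to the generic
mapping function unless the device registers its own hook. This is only a
sketch. The struct layout is a simplification of what rdma/ib_verbs.h
declares, and the model_* names, MODEL_BUS_OFFSET, and the assumption that
the ipath hook returns the CPU address unchanged are all hypothetical
stand-ins rather than the real driver code.

```c
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-ins; the real definitions live in the kernel headers. */
enum dma_data_direction { DMA_TO_DEVICE = 1, DMA_FROM_DEVICE = 2 };

struct ib_device;

/* Per-device DMA hooks (simplified to a single operation). */
struct ib_dma_mapping_ops {
        uint64_t (*map_single)(struct ib_device *dev, void *cpu_addr,
                               size_t size, enum dma_data_direction dir);
};

struct ib_device {
        struct ib_dma_mapping_ops *dma_ops;  /* NULL means pass-through */
};

/* Hypothetical offset used to fake a bus address in this model. */
#define MODEL_BUS_OFFSET 0x1000u

/* Stand-in for the generic dma_map_single(): returns a fake bus address. */
static uint64_t model_dma_map_single(void *cpu_addr, size_t size,
                                     enum dma_data_direction dir)
{
        (void)size;
        (void)dir;
        return (uint64_t)(uintptr_t)cpu_addr + MODEL_BUS_OFFSET;
}

/* Hypothetical driver hook: assumed to hand back the CPU address as-is. */
static uint64_t model_ipath_map_single(struct ib_device *dev, void *cpu_addr,
                                       size_t size,
                                       enum dma_data_direction dir)
{
        (void)dev;
        (void)size;
        (void)dir;
        return (uint64_t)(uintptr_t)cpu_addr;
}

static struct ib_dma_mapping_ops model_ipath_ops = {
        .map_single = model_ipath_map_single,
};

/* Wrapper: use the driver hook if one is set, otherwise pass through. */
static uint64_t ib_dma_map_single(struct ib_device *dev, void *cpu_addr,
                                  size_t size, enum dma_data_direction dir)
{
        if (dev->dma_ops)
                return dev->dma_ops->map_single(dev, cpu_addr, size, dir);
        return model_dma_map_single(cpu_addr, size, dir);
}
```

In the model, a caller that bypasses the wrapper always gets the generic
result, which a device with its own hook would not expect. Assuming the
wrapper takes the ib_device itself rather than its dma_device, the first
call in the list above would presumably become:

                rx->rx_msgaddr = ib_dma_map_single(cmid->device,
                                                   rx->rx_msg,
                                                   IBLND_MSG_SIZE,
                                                   DMA_FROM_DEVICE);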



So, how to proceed now?


Thanks,
Bernd


-- 
Bernd Schubert
Q-Leap Networks GmbH


