[ofa-general] synchronize commands issued to MTHCA
Yicheng Jia
YJia at tmriusa.com
Wed Jan 2 13:51:26 PST 2008
> What is the call chain that calls SW2HW_MPT in this case?
SW2HW_MPT is issued from the mthca_mr_alloc function. That function
first calls "mthca_alloc" to get an MR key, then "mthca_table_get" to get an
MR ICM entry, then "mthca_alloc_mailbox" to allocate a mailbox for
the command. While this is going on, the MAD completion handler
"ib_mad_recv_done_handler" is also running; it processes the MAD_IFC
command and sends the response, and both complete without reporting an error.
Also, for your information, I'm running the driver on two Dual Core Xeon
CPUs.
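
The order of steps described above can be sketched as a small user-space model. This is only an illustration of the sequence as I described it, not the driver code: the names sketch_mr_alloc, sketch_step_name, mr_trace and the fail_at parameter are hypothetical, and the real mthca functions take different arguments and talk to the hardware.

```c
#include <stddef.h>

/* Steps in the order described above. SW2HW_MPT is the firmware
 * command issued last, after the three allocation steps. */
enum mr_step {
    MR_STEP_ALLOC_KEY,      /* "mthca_alloc": get an MR key */
    MR_STEP_TABLE_GET,      /* "mthca_table_get": get an MR ICM entry */
    MR_STEP_ALLOC_MAILBOX,  /* "mthca_alloc_mailbox": mailbox for the command */
    MR_STEP_SW2HW_MPT,      /* the command that fails in my setup */
    MR_STEP_COUNT
};

/* Hypothetical trace of the steps that completed. */
struct mr_trace {
    enum mr_step steps[MR_STEP_COUNT];
    size_t count;
};

/* Human-readable name for a step, for logging. */
const char *sketch_step_name(enum mr_step s)
{
    static const char *const names[MR_STEP_COUNT] = {
        "mthca_alloc", "mthca_table_get", "mthca_alloc_mailbox", "SW2HW_MPT"
    };
    return names[s];
}

/* Run the steps in order and record each one that completes.
 * fail_at is an assumption of this sketch: the index of a step that
 * should fail, or -1 for none. Returns 0 on success, -1 on failure. */
int sketch_mr_alloc(int fail_at, struct mr_trace *trace)
{
    trace->count = 0;
    for (int s = 0; s < MR_STEP_COUNT; s++) {
        if (s == fail_at)
            return -1;
        trace->steps[trace->count++] = (enum mr_step)s;
    }
    return 0;
}
```

Calling sketch_mr_alloc(MR_STEP_SW2HW_MPT, &trace) models my case: the three allocation steps are recorded as completed and only the final command fails.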
> Also are you going through the mthca_cmd_post_dbell() or
> mthca_cmd_post_hcr() code to write the command params to the HCA?
Yes. I found a small difference between these two functions: there are
two "wmb()" calls in mthca_cmd_post_dbell() but only one "wmb()" in
mthca_cmd_post_hcr(). Is there any particular reason for this?
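
For context, the general pattern a write barrier serves when posting a command is: write the parameters, issue the barrier, then write the word that carries the "go" bit, so the parameters cannot become visible after the go bit. Below is a minimal user-space sketch of that ordering. It is an analogy only: a C11 release fence stands in for the kernel's wmb(), the sketch_hcr layout and SKETCH_GO_BIT position are hypothetical, and it does not show why the two driver functions differ in the number of barriers.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical position of the "go" bit in the control word. */
#define SKETCH_GO_BIT (1u << 31)

/* Hypothetical command register block: parameters plus a control
 * word that the consumer polls for the go bit. */
struct sketch_hcr {
    uint32_t param[6];
    _Atomic uint32_t ctrl;
};

/* Write the parameters first, then a release fence (standing in for
 * wmb() in this sketch), then the control word with the go bit set.
 * A consumer that sees the go bit with acquire ordering therefore
 * also sees all six parameters. */
void sketch_post_cmd(struct sketch_hcr *hcr, const uint32_t param[6],
                     uint32_t opcode)
{
    for (int i = 0; i < 6; i++)
        hcr->param[i] = param[i];

    atomic_thread_fence(memory_order_release);

    atomic_store_explicit(&hcr->ctrl, opcode | SKETCH_GO_BIT,
                          memory_order_relaxed);
}
```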
> I think the best way to debug this would be to work directly with
> Mellanox to get a debug build of the HCA firmware and get definite info on
> why the SW2HW_MPT command is failing.
Do you know whom I should contact?
Thanks!
Yicheng
Roland Dreier <rdreier at cisco.com>
01/02/2008 02:55 PM
To: Yicheng Jia <YJia at tmriusa.com>
cc: Jack Morgenstein <jackm at dev.mellanox.co.il>, general at lists.openfabrics.org
Subject: Re: [ofa-general] synchronize commands issued to MTHCA
> The SW2HW_MPT command is issued while the UDAV table is being created.
> While the driver is waiting for the command to complete, it does many
> other things: creating a send MAD packet, posting a send MAD
> request to the SQ and posting another receive MAD request to the RQ.
> No error is reported for any of these actions. However, afterwards the
> HCA reports a command parameter error for the SW2HW_MPT.
I doubt the problem is creating the UD address vector -- that is just
shuffling some things around in the CPU's memory. It seems more
likely that posting a send or receive request is messing things up
somehow. What is the call chain that calls SW2HW_MPT in this case?
Also are you going through the mthca_cmd_post_dbell() or
mthca_cmd_post_hcr() code to write the command params to the HCA?
I think the best way to debug this would be to work directly with
Mellanox to get a debug build of the HCA firmware and get definite
info on why the SW2HW_MPT command is failing.
- R.