[openib-general] Race in mthca_cmd_post()
Roland Dreier
rdreier at cisco.com
Fri Oct 13 12:40:15 PDT 2006
John> I put a PCI-X analyzer on the bus along with the HCA and
John> found that we saw a Memory Read to register 698 but no
John> evidence of the SYS_EN command making it down to the
John> card. We were trying to read the gobit before the DOORBELL
John> had completed. I think this could only happen on multi CPU
John> machines with fast CPU's and an architecture which does not
John> do PIO ordering very well.
Forgive me for being dense, but could you describe the race a little
more precisely? Like list out what's happening and point to where two
things happen out of order? What confuses me is that you seem to be
saying that a read of PCI MMIO space is racing with a write -- and I
would have thought that a read has to flush all posted writes.
And I really don't understand how readl() could panic the kernel
anyway -- the MMIO region should be set up properly long before, so
what is sn_dma_flush() crashing on?
Also a couple of comments on the patch itself:
> - if (event && dev->cmd.flags & MTHCA_CMD_POST_DOORBELLS && fw_cmd_doorbell)
> + if (event && dev->cmd.flags & MTHCA_CMD_POST_DOORBELLS && fw_cmd_doorbell) {
> + mmiowb();
> mthca_cmd_post_dbell(dev, in_param, out_param, in_modifier,
> op_modifier, op, token);
> - else
> + } else {
> + mmiowb();
> err = mthca_cmd_post_hcr(dev, in_param, out_param, in_modifier,
> op_modifier, op, token, event);
> + }
First and most trivial, you indented with spaces instead of tabs.
It's always obvious that a patch that looks like
old_code();
+ new_code();
has whitespace damage, because things don't line up.
Also, why did you do
if (something)
mmiowb()
...
else
mmiowb()
...
instead of just
mmiowb()
if (something)
...
else
...
Finally, if you're going to add a mmiowb() (and I don't doubt it is
necessary, I'd just like to understand why exactly) then you need to
add a big comment explaining what it protects -- mmiowb() is always
confusing to people so it's important to have an explanation so that
future changes don't break the driver again.
- R.
More information about the general
mailing list