[openib-general] Race in mthca_cmd_post()
John Partridge
johnip at sgi.com
Fri Oct 13 14:45:55 PDT 2006
Roland,
Thanks for looking at this, I appologize for not giving enough detail,
I'm never too sure how much to post.
Roland Dreier wrote:
> John> I put a PCI-X analyzer on the bus along with the HCA and
> John> found that we saw a Memory Read to register 698 but no
> John> evidence of the SYS_EN command making it down to the
> John> card. We were trying to read the gobit before the DOORBELL
> John> had completed. I think this could only happen on multi CPU
> John> machines with fast CPU's and an architecture which does not
> John> do PIO ordering very well.
>
> Forgive me for being dense, but could you describe the race a little
> more precisely? Like list out what's happening and point to where two
> things happen out of order? What confuses me is that you seem to be
> saying that a read of PCI MMIO space is racing with a write -- and I
> would have thought that a read has to flush all posted writes.
I was confused by this, but I believe that the readl to reg 698 is not
completing because the DDR memory is not yet available because SYS_EN never
got down to the card before the readl, or did not complete before readl.
This could be wrong I don't know. I was expecting to see a PIO write to reg 680
(which I can't find in the analyzer trace). What I did see is a lot of Config
Write's to the PCI config space and then a Memeory read (to 698).
Here is a snapshot of the trace :-
23438: * Config Write REG = 01 TYPE = 1 Bus1 Device0 Func0
23439: BE = 0000 Requester = Bus0 Device0 Func0 Tag = 1
23440: <Decoding> (no DEVSEL#, no IRDY#)
23441: <Response>
23442: GAP of 2 line(s)
23443: <TERM> -RETRY
23444: <TERM>
23445: GAP of 24 line(s)
23446: * Config Write REG = 01 TYPE = 1 Bus1 Device0 Func0
23447: BE = 0000 Requester = Bus0 Device0 Func0 Tag = 1
23448: <Decoding> (no DEVSEL#, no IRDY#)
23449: <Response>
23450: GAP of 2 line(s)
23451: Data = 02300107 -SPLIT RESPONSE
23452: <Recover>
23453: GAP of 3 line(s)
23454: * Memory Rd DW Addr = 00280698
23455: BE = 0000 Requester = Bus0 Device0 Func0 Tag = 0
23456: <Decoding> (no DEVSEL#, no IRDY#)
23457: <Response>
23458: GAP of 2 line(s)
23459: Data = ffffffff -SPLIT RESPONSE
23460: <Recover>
23461: GAP of 7 line(s)
23462: * Split compl. (Requester = Bus0 Device0 Func0) Tag = 0
23463: Completer = Bus0 Device2 Func0 (Error completion)
23464: <Response>
23465: Data = 10000004
23466: <Recover>
>
> And I really don't understand how readl() could panic the kernel
> anyway -- the MMIO region should be set up properly long before, so
> what is sn_dma_flush() crashing on?
The panic occurs because of the Memory read error completion, the read probably
fails because the card is not ready enabled ??
I don't know about other architectures, but on ours a PIO read failing will cause
a panic (maybe a more correct description would be MCA)
>
> Also a couple of comments on the patch itself:
>
> > - if (event && dev->cmd.flags & MTHCA_CMD_POST_DOORBELLS && fw_cmd_doorbell)
> > + if (event && dev->cmd.flags & MTHCA_CMD_POST_DOORBELLS && fw_cmd_doorbell) {
> > + mmiowb();
> > mthca_cmd_post_dbell(dev, in_param, out_param, in_modifier,
> > op_modifier, op, token);
> > - else
> > + } else {
> > + mmiowb();
> > err = mthca_cmd_post_hcr(dev, in_param, out_param, in_modifier,
> > op_modifier, op, token, event);
> > + }
>
> First and most trivial, you indented with spaces instead of tabs.
> It's always obvious that a patch that looks like
>
> old_code();
> + new_code();
>
> has whitespace damage, because things don't line up.
Sorry I just cut and pasted it, just to get a discussion under way. It was in no way
meant to be a final patch. I would attach a real file when it comes to it. Sorry I should
have made that clear.
>
> Also, why did you do
>
> if (something)
> mmiowb()
> ...
> else
> mmiowb()
> ...
>
> instead of just
>
> mmiowb()
> if (something)
> ...
> else
> ...
Actually looking at the code again, I'm not sure that is the correct
place for the mmiowb() I think maybe it would be better done in the
mthca_cmd_post_hcr() routine before we do any work in there. But,
I need to make sure that mthca_cmd_post_hcr() is always protected
by a lock before we do that. What do you think ?
>
> Finally, if you're going to add a mmiowb() (and I don't doubt it is
> necessary, I'd just like to understand why exactly) then you need to
> add a big comment explaining what it protects -- mmiowb() is always
> confusing to people so it's important to have an explanation so that
> future changes don't break the driver again.
OK thanks excellent point.
John
--
John Partridge
Silicon Graphics Inc
Tel: 651-683-3428
Vnet: 233-3428
E-Mail: johnip at sgi.com
More information about the general
mailing list