[openib-general] Re: testing amso1100

Steve Wise swise at opengridcomputing.com
Mon Mar 20 12:24:22 PST 2006


> > I've never seen this.  I'm wondering how old these amso cards are?  Are
> > they running the latest fpga image from 1.2u1?  IE:  did you use the
> > Ammasso 1.2u1 package and ccflash2 to bring the device up to the latest
> > HW image?  I'm not talking about firmware.  1.2u1 released a new FPGA
> > image that needs to be applied with the 1.2u1 ccflash2.  
> 
> I used ccflash2 to update to this hardware image:
> C2L_H23_B58_F61_080507.bit, from ogc_amso_kit_20060308.tgz.  Not the
> one in the 1.2u1 package, from Amso1100-1.2u1-ga.tgz, as it appeared
> by name to be older: C2L_H22_B58_F61_040814.bit.  Let me know if I
> should try the H22 one instead.
> 

Ok.  H23 is fine.  I didn't think I'd packaged up the FPGA image in
the openib kit.

> (Before loading the iw_c2 module, every time, I install the
> boot_image from ogc_amso_kit_20060308, too.)
> 
> To work around the one minute delay, I hacked the timeout in
> vq_wait_for_reply down to 5 sec.  No idea why the NIC isn't
> responding or if it is a bad thing.  I've not seen a minute-long
> hang in any other circumstances yet.  A traceback during the 60
> second hang shows this (de-uglified from x86-64 sysrq-T):
> 
>     <ffffffff802cc9ea>{schedule_timeout+154}
>     <ffffffff8013a9d0>{process_timeout+0}
>     <ffffffff880fc3fa>{:iw_c2:vq_wait_for_reply+106}
>     <ffffffff8012b430>{default_wake_function+0}
>     <ffffffff880fa942>{:iw_c2:c2_rnic_close+146}
>     <ffffffff880faa1d>{:iw_c2:c2_rnic_term+13}
>     <ffffffff880ed921>{:ib_core:ib_unregister_device+193}
>     <ffffffff880f70e0>{:iw_c2:c2_remove+112}
>     <ffffffff802caa90>{klist_release+0}
>     <ffffffff801dcf6c>{pci_device_remove+44}
>     <ffffffff80231905>{__device_release_driver+133}
>     <ffffffff80231ce8>{driver_detach+184}
>     <ffffffff802312ea>{bus_remove_driver+122}
> 
> It is in the context of "modprobe -r iw_c2".  Not a big deal, but
> let me know if you'd like me to test something.


Hmm.  Somehow the firmware isn't responding to the close command.  This
is usually fatal.  I.e., if you try to load iw_c2 again, it'll probably
not work.  We'll look into this...
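
For anyone following the timeout workaround quoted above, here is a
minimal userspace sketch of a bounded wait for a reply.  It is not the
iw_c2 code: the traceback shows the driver sleeping in schedule_timeout,
whereas this sketch simply polls a flag so that it stays self-contained.
The names reply_slot, reply_slot_complete and wait_for_reply_sketch are
made up for illustration, and the 1 ms poll interval is an assumption.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical request slot; NOT the iw_c2 driver's real structure. */
struct reply_slot {
    atomic_bool reply_received;   /* set once the adapter's reply arrives */
};

/* Milliseconds on a monotonic clock, so wall-clock changes cannot stretch the wait. */
static long long now_ms(void)
{
    struct timespec ts;

    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (long long)ts.tv_sec * 1000 + ts.tv_nsec / 1000000;
}

/* Mark the request as answered (would be called from the reply handler). */
static void reply_slot_complete(struct reply_slot *slot)
{
    atomic_store(&slot->reply_received, true);
}

/*
 * Wait up to timeout_ms for a reply.
 * Returns 0 if the reply arrived, -ETIMEDOUT if the deadline passed first.
 */
static int wait_for_reply_sketch(struct reply_slot *slot, long timeout_ms)
{
    const struct timespec poll_interval = { 0, 1000000L };  /* 1 ms, assumed */
    long long deadline = now_ms() + timeout_ms;

    while (!atomic_load(&slot->reply_received)) {
        if (now_ms() >= deadline)
            return -ETIMEDOUT;
        nanosleep(&poll_interval, NULL);
    }
    return 0;
}
```

Dropping the timeout from 60 seconds to 5 corresponds to passing 5000
instead of 60000 as timeout_ms here.  It only shortens the hang; a
-ETIMEDOUT return still means the reply never came.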
