[openib-general] Re: opensm and SIGINT

Viswanath Krishnamurthy viswa.krish at gmail.com
Thu Sep 22 12:55:59 PDT 2005


Hal,

Here is the log of the osmtest failure. It was seen 150 times out of 2500
iterations. The opensm SUBNET UP failure is tough to reproduce; I saw it once
in 2500 iterations. Unfortunately, I did not collect the log for that error.
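
Since the log for the rare failure was not captured, here is a minimal sketch
of a loop that keeps the output of every failing iteration. It is not the
harness used for the 2500 iterations above. The names run_iterations and
is_failure are hypothetical, the command to run is passed in by the caller,
and treating a non-zero exit status as a failure is an assumption. The
"OSMTEST: TEST ... FAIL" check mirrors the last line of the log in this
message.

```python
import subprocess
from pathlib import Path


def is_failure(returncode, output):
    """Treat a run as failed on a non-zero exit (assumption) or a FAIL summary line."""
    if returncode != 0:
        return True
    for line in output.splitlines():
        line = line.strip()
        # Summary line format taken from the log in this message.
        if line.startswith("OSMTEST: TEST") and line.endswith("FAIL"):
            return True
    return False


def run_iterations(cmd, iterations, log_dir):
    """Run cmd repeatedly and save the combined output of each failing run.

    Returns the list of 1-based iteration numbers that failed.
    """
    log_dir = Path(log_dir)
    log_dir.mkdir(parents=True, exist_ok=True)
    failures = []
    for i in range(1, iterations + 1):
        result = subprocess.run(
            cmd,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,  # keep stdout and stderr interleaved
            text=True,
        )
        if is_failure(result.returncode, result.stdout):
            (log_dir / f"fail-{i:04d}.log").write_text(result.stdout)
            failures.append(i)
    return failures
```

Calling run_iterations(cmd, 2500, "logs") with the test command as an argument
list would leave one file per failing iteration in the logs directory, so a
one-in-2500 failure is not lost.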

The patch worked as expected, and I did not see any issues with Ctrl-C. When I
tried to apply the patch with the patch command, it failed, so I added those
2 lines manually.

Command Line Arguments
Done with args
Flow = All Validations
Sep 21 17:50:56 684254 [B7F026C0] -> osm_vendor_get_all_port_attr: assign CA
mthca0 port 1 guid (0x2c90200400cfd) as the default port.
using default guid 0x2c90200400cfd
Sep 21 17:50:56 686301 [B7F026C0] -> osm_vendor_get_all_port_attr: assign CA
mthca0 port 1 guid (0x2c90200400cfd) as the default port.
Sep 21 17:50:56 686347 [B7F026C0] -> osm_vendor_bind: Binding to port
0x2c90200400cfd.
Sep 21 17:50:56 689963 [B7F026C0] -> osm_vendor_get_all_port_attr: assign CA
mthca0 port 1 guid (0x2c90200400cfd) as the default port.
Sep 21 17:50:56 691969 [B7F026C0] -> osm_vendor_get_all_port_attr: assign CA
mthca0 port 1 guid (0x2c90200400cfd) as the default port.
Sep 21 17:50:56 693187 [B7F026C0] -> osmtest_validate_sa_class_port_info:
-----------------------------
SA Class Port Info:
base_ver:1
class_ver:2
cap_mask:0x202
resp_time_val:0x64
-----------------------------
Sep 21 17:50:56 775383 [B7F026C0] -> osmtest_wrong_sm_key_ignored: Try
PortRecord for port with LID 0x0 Num:0x1.
Sep 21 17:51:00 775320 [B76FFBB0] -> umad_receiver: ERR 5409: send completed
with error (method=1 attr=12 trans_id=0x34) -- dropping.
Sep 21 17:51:00 775389 [B76FFBB0] -> umad_receiver: ERR 5410: class 0x3 LID
0x0
Sep 21 17:51:00 775418 [B76FFBB0] -> osmtest_query_res_cb: ERR 0003: Error
on query (IB_TIMEOUT).
Sep 21 17:51:00 775465 [B7F026C0] -> osmtest_wrong_sm_key_ignored: ERR 0011:
Did not get a timeout but got (IB_SUCCESS).
Sep 21 17:51:00 775581 [B7F026C0] -> osmt_register_service: Registering
Service: name:osmt.srvc.1804289383.7793 id:0x6b8b26f6.
Sep 21 17:51:00 777143 [B7F026C0] -> osmt_register_service: Registering
Service: name:osmt.srvc.846930885.7793 id:0x327b0554.
Sep 21 17:51:04 779578 [B76FFBB0] -> umad_receiver: ERR 5409: send completed
with error (method=2 attr=31 trans_id=0x36) --dropping.
Sep 21 17:51:04 779604 [B76FFBB0] -> umad_receiver: ERR 5410: class 0x3 LID
0x0
Sep 21 17:51:04 779631 [B76FFBB0] -> osmtest_query_res_cb: ERR 0003: Error
on query (IB_TIMEOUT).
Sep 21 17:51:04 779674 [B7F026C0] -> osmt_register_service: ERR 0364:
ib_query failed (IB_TIMEOUT).
Sep 21 17:51:04 779740 [B7F026C0] -> osmtest_run: ERR 00148: Service Flow
failed (IB_TIMEOUT)
OSMTEST: TEST "All Validations" FAIL
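
To compare failing runs, it can help to count which ERR codes each log
contains. Below is a minimal sketch, assuming the "-> function: ERR code:"
layout seen in the log above. The name tally_errors is hypothetical, and the
sample text is copied from four lines of this log.

```python
import re
from collections import Counter

# Matches e.g. "-> umad_receiver: ERR 5409:" and captures (function, code).
ERR_RE = re.compile(r"-> (\w+): ERR (\w+):")


def tally_errors(log_text):
    """Count (function, error code) pairs found in an osmtest/opensm log."""
    return Counter(ERR_RE.findall(log_text))


SAMPLE = """\
Sep 21 17:51:00 775320 [B76FFBB0] -> umad_receiver: ERR 5409: send completed
Sep 21 17:51:00 775418 [B76FFBB0] -> osmtest_query_res_cb: ERR 0003: Error
Sep 21 17:51:04 779578 [B76FFBB0] -> umad_receiver: ERR 5409: send completed
Sep 21 17:51:04 779740 [B7F026C0] -> osmtest_run: ERR 00148: Service Flow
"""

if __name__ == "__main__":
    for (func, code), count in tally_errors(SAMPLE).most_common():
        print(f"{func} ERR {code}: {count}")
```

The pattern only looks at the first physical line of each entry, so the line
wrapping added by the mailer does not affect the counts.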


-Viswa



On 22 Sep 2005 15:08:02 -0400, Hal Rosenstock <halr at voltaire.com> wrote:
>
> On Thu, 2005-09-22 at 15:06, Viswanath Krishnamurthy wrote:
> > I do not think this would help. The system is never rebooted. Just
> > opensm is started and stopped. On the next opensm start/stop the
> > subnet came up. I think it is more of an opensm issue than any kernel
> > module issue.
>
> Can you run opensm in -V mode and send the log? It might be related to
> the SM Set PortInfo armed->active issue which has been documented but
> not resolved.
>
> -- Hal
>
>