[ofa-general] [Fwd: [Error] Asynchronous Thread]
Dotan Barak
dotanb at dev.mellanox.co.il
Sun Jun 24 06:27:20 PDT 2007
Yann K. wrote:
>
>
> ------------------------------------------------------------------------
>
> Subject: [Error] Asynchronous Thread
> From: "Yann K." <yann.kalemkarian at bull.net>
> Date: Thu, 21 Jun 2007 16:50:59 +0200
> To: ewg-bounces at lists.openfabrics.org
>
>
> Hello everybody,
>
> I am having trouble diagnosing the following errors, which occur at
> the same time:
>
> At the MPI level:
>
> case IBV_EVENT_SRQ_ERR:
>     ibv_error_abort(GEN_EXIT_ERR,
>                     "MPI Gen2 Async Special Event thread : Got FATAL event %d\n",
>                     event.event_type);
>
> At the kernel level:
>
> Jun 21 11:17:55 s_kernel at platine866 kernel: ib_mthca 0000:07:00.0: CQ overrun on CQN c2009c
It seems that you got a CQ overrun, which means that more completions
were generated than the CQ can hold.
You can solve this by creating a bigger CQ or by using more than one CQ.
(I don't really understand why you sent the MPI code that handles the
SRQ error.)
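
As a rough illustration of "a bigger CQ or more than one CQ", here is a
minimal sketch of how the required depth could be estimated. It assumes
that every work request stays outstanding until its completion is polled,
so the CQ needs at least as many entries as the sum of the maximum
outstanding work requests of all queues (send queues, receive queues,
SRQs) that complete into it. The names wq_desc, min_cq_depth and
cqs_needed are hypothetical helpers, not part of libibverbs; the result
would be passed as the cqe argument of ibv_create_cq(), and max_cqe would
come from ibv_query_device().

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one work queue that completes into a CQ. */
struct wq_desc {
    uint32_t max_wr; /* maximum outstanding work requests on this queue */
};

/*
 * Lower bound for the CQ depth: the sum of max_wr over all queues that
 * report into the CQ. The result saturates at INT_MAX because the cqe
 * argument of ibv_create_cq() is an int.
 */
int min_cq_depth(const struct wq_desc *queues, size_t n)
{
    uint64_t total = 0;

    for (size_t i = 0; i < n; i++) {
        total += queues[i].max_wr;
        if (total > (uint64_t)INT_MAX)
            return INT_MAX;
    }
    return (int)total;
}

/*
 * Number of CQs needed when the device limit (max_cqe, as reported by
 * ibv_query_device()) is smaller than the required depth.
 * This is a plain ceiling division.
 */
int cqs_needed(int depth, int max_cqe)
{
    assert(depth >= 0);
    assert(max_cqe > 0);
    return depth / max_cqe + (depth % max_cqe != 0);
}
```

For example, two queues of 128 work requests and one SRQ of 512 give a
minimum depth of 768 entries; on a device whose max_cqe were 256, that
load would have to be spread over three CQs.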
Thanks
Dotan