[ofa-general] OFED 1.2.5 SRP driver did not send DID_NO_CONNECT on target failure

David Dillow dillowda at ornl.gov
Tue Feb 19 16:12:45 PST 2008


On Thu, 2008-01-31 at 14:53 -0700, Shi, Harris wrote:
> While working on a failover solution for an Engenio storage array with
> an IB host connection, I noticed that no DID_NO_CONNECT notification is
> sent to the upper-level driver when the link to the target fails. Our
> failover driver relies heavily on this notification from the OFED 1.2
> SRP driver to issue the failover command when the link_down_timeout
> period expires. Without it, the IO command eventually times out and
> failover occurs much later than we expect. I am wondering if anyone
> familiar with the SRP driver might have a workaround for this issue.

I expect to be poking around in this area in the near future, and I
noticed that Vu Pham recently posted a patch that will cause the SRP
initiator in 1.3 to return DID_NO_CONNECT from the SCSI queue function.

However, a quick search back through the ofed_kernel stack doesn't seem
to indicate that OFED 1.2 ever returned DID_NO_CONNECT from the SRP
initiator. It is likely that I missed it -- can you confirm OFED 1.2 was
the version you were working with? What was the base OS -- RHEL4? SLES?

I've been trying to determine how the SCSI stack's handling of
DID_NO_CONNECT differs from its handling of DID_BAD_TARGET, but the two
seem to be treated mostly the same. I've not gone looking to see whether
these codes are defined in some standard, though.
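
For reference, here is a rough user-space sketch of how these host codes
travel in a SCSI command result: the host byte sits in bits 16-23 of the
32-bit result. The DID_* values and the host_byte() shape follow Linux's
include/scsi/scsi.h; make_result() and should_fail_over() are
hypothetical names made up for illustration, and the policy shown is only
an assumption about what a failover driver might do, not code from OFED.

```c
#include <assert.h>

/* Host-byte codes, with the values used in Linux's include/scsi/scsi.h. */
#define DID_OK          0x00
#define DID_NO_CONNECT  0x01
#define DID_BAD_TARGET  0x04

/* Extract the host byte (bits 16-23), like the kernel's host_byte() macro. */
static unsigned int host_byte(unsigned int result)
{
    return (result >> 16) & 0xff;
}

/*
 * Hypothetical helper: build a result carrying only a host code, the way
 * a queue function would with "cmd->result = DID_NO_CONNECT << 16".
 */
static unsigned int make_result(unsigned int host_code)
{
    assert(host_code <= 0xff);
    return host_code << 16;
}

/*
 * Hypothetical upper-layer policy (an assumption, not OFED code): start
 * failover at once only when the initiator reports DID_NO_CONNECT.
 */
static int should_fail_over(unsigned int result)
{
    return host_byte(result) == DID_NO_CONNECT;
}
```

With this policy, a command completed with make_result(DID_NO_CONNECT)
would trigger failover immediately, while one completed with
make_result(DID_BAD_TARGET) would not, even if the mid-layer otherwise
handles the two codes alike.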

Thanks!

-- 
Dave Dillow
National Center for Computational Science
Oak Ridge National Laboratory
(865) 241-6602 office




