[openfabrics-ewg] RE: [openib-general] OFED-1.0-rc4 is available

Scott Weitzenkamp (sweitzen) sweitzen at cisco.com
Fri May 5 13:50:33 PDT 2006


I see the failure too, and opened bug #57 for it.

http://openib.org/bugzilla/show_bug.cgi?id=57

Scott Weitzenkamp
SQA and Release Manager
Server Virtualization Business Unit
Cisco Systems
 

> -----Original Message-----
> From: openib-general-bounces at openib.org 
> [mailto:openib-general-bounces at openib.org] On Behalf Of 
> Woodruff, Robert J
> Sent: Thursday, May 04, 2006 3:57 PM
> To: Hefty, Sean; Davis, Arlin R
> Cc: openfabrics-ewg at openib.org; openib-general
> Subject: RE: [openib-general] OFED-1.0-rc4 is available
> 
> Tziporet wrote,
> 
> >Hi All,
> 
> >We have prepared OFED 1.0 RC4.
> 
> I took the OFED RC4 kernel code from
> gen2/branches/1.0/ofed/tags/rc4/linux-kernel,
> applied my latest backport patch (for svn6829), which applied cleanly,
> and built a kernel RPM for testing.
> 
> Then I took the 1.0 userspace code and built it.
> 
> I found that the cma version of uDAPL did not work
> and caused a core dump. Applying the patch below, which brings in
> the newer userspace cma.c code, fixed the problem.
> 
> Not sure if anyone cares about having the rdma_cm in OFED, but 
> if they do, I think it needs this fix. 
> 
> woody
> 
> --- cma.c	2006-04-07 10:15:20.000000000 -0700
> +++ /home/woody/gen2/trunk/src/userspace/librdmacm/src/cma.c	2006-05-04 16:24:00.701184088 -0700
> @@ -109,6 +109,7 @@ struct cma_id_private {
>  	struct rdma_cm_id id;
>  	struct cma_device *cma_dev;
>  	int		  events_completed;
> +	int		  connect_error;
>  	pthread_cond_t	  cond;
>  	pthread_mutex_t	  mut;
>  	uint32_t	  handle;
> @@ -150,10 +151,8 @@ static int check_abi_version(void)
>  		return -ENODEV;
>  	}
>  
> -	strncat(path, "/class/misc/rdma_cm/abi_version", sizeof path);
> -	if (sysfs_read_attribute_value(path, val, sizeof val))
> -		abi_ver = 1; /* ABI version wasn't available until version 2 */
> -	else
> +	strncat(path, "/class/infiniband_ucma/abi_version", sizeof path);
> +	if (!sysfs_read_attribute_value(path, val, sizeof val))
>  		abi_ver = strtol(val, NULL, 10);
>  
>  	if (abi_ver < RDMA_USER_CM_MIN_ABI_VERSION ||
> @@ -435,11 +434,9 @@ int rdma_bind_addr(struct rdma_cm_id *id
>  	if (ret != size)
>  		return (ret > 0) ? -ENODATA : ret;
>  
> -	if (abi_ver > 1) {
> -		ret = ucma_query_route(id);
> -		if (ret)
> -			return ret;
> -	}
> +	ret = ucma_query_route(id);
> +	if (ret)
> +		return ret;
>  
>  	memcpy(&id->route.addr.src_addr, addr, addrlen);
>  	return 0;
> @@ -689,7 +686,7 @@ int rdma_listen(struct rdma_cm_id *id, i
>  	if (ret != size)
>  		return (ret > 0) ? -ENODATA : ret;
>  
> -	return 0;
> +	return ucma_query_route(id);
>  }
>  
>  int rdma_accept(struct rdma_cm_id *id, struct rdma_conn_param *conn_param)
> @@ -924,17 +921,27 @@ retry:
>  		evt->status = ucma_process_conn_resp(id_priv);
>  		if (!evt->status)
>  			evt->event = RDMA_CM_EVENT_ESTABLISHED;
> -		else
> +		else {
>  			evt->event = RDMA_CM_EVENT_CONNECT_ERROR;
> +			id_priv->connect_error = 1;
> +		}
>  		break;
>  	case RDMA_CM_EVENT_ESTABLISHED:
>  		evt->status = ucma_process_establish(&id_priv->id);
> -		if (evt->status)
> +		if (evt->status) {
>  			evt->event = RDMA_CM_EVENT_CONNECT_ERROR;
> +			id_priv->connect_error = 1;
> +		}
>  		break;
>  	case RDMA_CM_EVENT_REJECTED:
> +		if (id_priv->connect_error)
> +			goto retry;
>  		ucma_modify_qp_err(evt->id);
>  		break;
> +	case RDMA_CM_EVENT_DISCONNECTED:
> +		if (id_priv->connect_error)
> +			goto retry;
> +		break;
>  	default:
>  		break;
>  	}
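For anyone reading the check_abi_version() hunk without the tree at hand, the changed behaviour can be sketched roughly as follows. This is only an illustration under assumptions: it uses plain stdio instead of libsysfs, the names read_abi_version, check_abi_version_sketch and the SKETCH_* bounds are hypothetical, and the -ENOSYS return for an out-of-range version is a guess, since the hunk does not show that line. The point it tries to show is that a missing attribute no longer forces abi_ver to 1; the previous value is simply kept and then range-checked.

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical bounds; the real RDMA_USER_CM_MIN_ABI_VERSION and its
 * maximum counterpart are defined in librdmacm and are not shown in
 * the patch. */
#define SKETCH_MIN_ABI_VERSION 1
#define SKETCH_MAX_ABI_VERSION 2

/* Read a decimal ABI version from an attribute file.
 * Returns 0 and stores the value on success, -1 if the file cannot be
 * read; *abi_ver is left untouched on failure. */
int read_abi_version(const char *path, int *abi_ver)
{
	char val[16];
	FILE *f = fopen(path, "r");

	if (!f)
		return -1;
	if (!fgets(val, sizeof val, f)) {
		fclose(f);
		return -1;
	}
	fclose(f);
	*abi_ver = (int) strtol(val, NULL, 10);
	return 0;
}

/* Mirror the patched logic: keep the starting value when the attribute
 * is missing, then reject anything outside the supported range.
 * The -ENOSYS error code is an assumption. */
int check_abi_version_sketch(const char *path, int starting_ver)
{
	int abi_ver = starting_ver;

	/* A read failure leaves the starting value in place. */
	read_abi_version(path, &abi_ver);

	if (abi_ver < SKETCH_MIN_ABI_VERSION ||
	    abi_ver > SKETCH_MAX_ABI_VERSION)
		return -ENOSYS;
	return 0;
}
```

On a real system the path argument would be the sysfs mount point followed by /class/infiniband_ucma/abi_version, as in the patch.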

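The last hunk of the patch is the least obvious part, so here is a small stand-alone model of what it appears to do. Once the library has turned a failed connect response or establish step into a CONNECT_ERROR event, it remembers that in connect_error and skips any later REJECTED or DISCONNECTED event for the same id (the goto retry in the patch). Everything named sketch_* or SKETCH_* below is hypothetical, and the CONNECT_RESPONSE label is an assumption because the case label for the first branch is not visible in the hunk.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the RDMA_CM_EVENT_* codes in the patch. */
enum sketch_event {
	SKETCH_CONNECT_RESPONSE,   /* assumed label, not shown in the hunk */
	SKETCH_ESTABLISHED,
	SKETCH_CONNECT_ERROR,
	SKETCH_REJECTED,
	SKETCH_DISCONNECTED
};

struct sketch_id {
	int connect_error;   /* set once a connect error has been reported */
};

/*
 * Decide what to report for one raw event.
 * 'status' is the result of the local processing step (0 = success).
 * Returns true and stores the event to report in *out, or returns
 * false when the event should be skipped and the next one fetched.
 */
bool sketch_filter_event(struct sketch_id *id, enum sketch_event raw,
			 int status, enum sketch_event *out)
{
	switch (raw) {
	case SKETCH_CONNECT_RESPONSE:
	case SKETCH_ESTABLISHED:
		if (status) {
			/* Local processing failed: report an error once
			 * and remember that it was reported. */
			id->connect_error = 1;
			*out = SKETCH_CONNECT_ERROR;
		} else {
			*out = SKETCH_ESTABLISHED;
		}
		return true;
	case SKETCH_REJECTED:
	case SKETCH_DISCONNECTED:
		/* The user already saw CONNECT_ERROR for this id, so a
		 * follow-up reject or disconnect is not delivered. */
		if (id->connect_error)
			return false;
		*out = raw;
		return true;
	default:
		*out = raw;
		return true;
	}
}
```

A caller would loop, fetching the next raw event whenever the function returns false. That loop stands in for the retry label in the real code.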

