[openib-general] Re: [PATCH] mutex-backport: add mutex_trylock support

Michael S. Tsirkin mst at mellanox.co.il
Sun Feb 19 05:39:15 PST 2006


Quoting r. Or Gerlitz <ogerlitz at voltaire.com>:
> Subject: Re: [PATCH] mutex-backport: add mutex_trylock support
> 
> Michael S. Tsirkin wrote:
> >Lets be explicit then, and give some actual information:
> >/* NB: mutex_trylock returns 1 on success, down_trylock - 0 on success. */
> >Make sense?
> 
> it does make sense, anyway I just brought a quote from the code of 
> 2.6.16/kernel/mutex.c so i dont see any reason to change it.

Please copy verbatim then:

 * NOTE: this function follows the spin_trylock() convention, so
 * it is negated to the down_trylock() return values! Be careful
 * about this when converting semaphore users to mutexes.

Both NOTE and the last line are missing.


-- 
Michael S. Tsirkin
Staff Engineer, Mellanox Technologies

Return-Path: <Thomas.Talpey at netapp.com>
Delivered-To: openib-general at openib.org
Received: from mx2.netapp.com (mx2.netapp.com [216.240.18.37])
	by openib.ca.sandia.gov (Postfix) with ESMTP id 886582283D8
	for <openib-general at openib.org>; Sun, 19 Feb 2006 07:03:37 -0800 (PST)
Received: from smtp1.corp.netapp.com ([10.57.156.124])
	by mx2.netapp.com with ESMTP; 19 Feb 2006 06:56:32 -0800
X-IronPort-AV: i="4.02,127,1139212800"; 
	d="scan'208"; a="359977071:sNHT25492488"
Received: from svlexc03.hq.netapp.com (svlexc03.corp.netapp.com
	[10.57.156.149])
	by smtp1.corp.netapp.com (8.13.1/8.13.1/NTAP-1.6) with ESMTP id
	k1JEuUMq015428; Sun, 19 Feb 2006 06:56:30 -0800 (PST)
Received: from lavender.hq.netapp.com ([10.56.11.75]) by
	svlexc03.hq.netapp.com with Microsoft SMTPSVC(6.0.3790.0); 
	Sun, 19 Feb 2006 06:56:30 -0800
Received: from exnane01.hq.netapp.com ([10.97.0.61]) by lavender.hq.netapp.com
	with Microsoft SMTPSVC(5.0.2195.6713); 
	Sun, 19 Feb 2006 06:56:30 -0800
Received: from tmt.netapp.com ([10.30.32.67]) by exnane01.hq.netapp.com with
	Microsoft SMTPSVC(6.0.3790.0); Sun, 19 Feb 2006 09:56:28 -0500
Message-Id: <7.0.1.0.2.20060219093506.040ee028 at netapp.com>
X-Mailer: QUALCOMM Windows Eudora Version 7.0.1.0
Date: Sun, 19 Feb 2006 09:56:18 -0500
To: Christoph Hellwig <hch at lst.de>
From: "Talpey, Thomas" <Thomas.Talpey at netapp.com>
Subject: Re: [openib-general] NFS/RDMA client release for Linux 2.6.15
In-Reply-To: <20060219120155.GA10268 at lst.de>
References: <7.0.1.0.2.20060207211754.0409e8a0 at netapp.com>
	<20060219120155.GA10268 at lst.de>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
X-OriginalArrivalTime: 19 Feb 2006 14:56:28.0782 (UTC)
	FILETIME=[A9C3C8E0:01C63564]
Cc: nfsv4 at linux-nfs.org, openib-general at openib.org
X-BeenThere: openib-general at openib.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: OpenIB General Mailing List <openib-general.openib.org>
List-Unsubscribe: <http://openib.org/mailman/listinfo/openib-general>,
	<mailto:openib-general-request at openib.org?subject=unsubscribe>
List-Archive: <http://openib.org/pipermail/openib-general>
List-Post: <mailto:openib-general at openib.org>
List-Help: <mailto:openib-general-request at openib.org?subject=help>
List-Subscribe: <http://openib.org/mailman/listinfo/openib-general>,
	<mailto:openib-general-request at openib.org?subject=subscribe>
X-List-Received-Date: Sun, 19 Feb 2006 15:03:40 -0000

Thanks for the detailed review! Some replies below. I left the IETF
list out of this reply since it's basically porting, not protocol.

At 07:01 AM 2/19/2006, Christoph Hellwig wrote:
>On Wed, Feb 08, 2006 at 03:58:56PM -0500, Talpey, Thomas wrote:
>> We have released an updated NFS/RDMA client for Linux at
>> the project's Sourceforge site:
>
>Thanks, this looks much better than the previous patch.
>
>Comments:
>
>  - please don't build the rdma transport unconditional, but make it
>    a user-visible config option

It's an option, but it's located in fs/Kconfig not net/. This is the way
SUNRPC is selected, so we simply followed that. BTW, Chuck's transport
switch doesn't support dynamically loading modules yet so there is a
dependency to work out until that's in place.

>  - please use the kernel u*/s* types instead of (u)int*_t

We use uint*_t for the user-visible protocol definitions (on the wire) and
u32 etc for kernel stuff. I'll recheck if we got something wrong.

>  - please include your local headers after the <linux/*.h> headers,

There are a couple of issues with header include ordering that seem to
change pretty often. In a couple of cases we had to rearrange things
to avoid forward declarations, I'll recheck this.

>    and keep all the includes at the beginning of the files, just after
>    the licence comment block
>  - chunktype shouldn't be a typedef but a pure enum, and the
>    names look a bit too generic, please add an rdma_ prefix

Ok on both.

>  - please kill the XDR_TARGET and pos0 macros, maybe RPC_SEND_SEG0
>    and RPC_SEND_LEN0, too
>  - RPC_SEND_VECS should become an inline functions and be spelled
>    lowercase
>  - RPC_SEND_COPY is probably too large to be inlined and should be
>    spelled lowercase
>  - RPC_RECV_VECS should be an inline and spelled lowercase
>  - RPC_RECV_SEG0 and PC_RECV_LEN0 should probably go away.
>  - RPC_RECV_COPY is probably too large to be inlined and should be
>    spelled lowercase
>  - RPC_RECV_COPY same comment about highmem and kmap as in
>    RPC_SEND_COPY

These are killable. They were there to support code sharing for 2.4 kernels
and are easy to eliminate now.

>  - the CONFIG_HIGHMEM ifdef block in RPC_SEND_COPY is wrong.  Please
>    always use kmap, it does the right thing for non-highmem aswell.
>    The PageHighMem check and using kmap_high directly is always
>    wrong, they are internal implementation details.  I'd also suggest
>    evaluating kmap_atomic because it scales much better on SMP systems.

Yes, there are some issues here which we're still working out. In fact, we
can't use kunmap() in the context you mention because in 2.6.14 (or is it
.15) it started to check for being invoked in interrupt context. There is
one configuration in which we do call it in bh context. The call won't block
but the kernel BUG_ON's. This is something on our list to address.

>  - please try to avoid file-scope forward-prototypes but try to order the
>    code in the natural flow where they aren't required

Good point. Will recheck for these.

>  - structures like rpcrdma_msg that are on the wire should use __be*
>    for endianess annotations, and the cpu_to_be*/be*_to_cpu accessor
>    functions instead of hton?/ntoh?.  Please verify that these annotations
>    are correct using sparse -D__CHECK_ENDIAN__=1

Hmm, okay but existing RPC and NFS code don't do this. I'm reluctant to
differ from the style of the governing subsystem. I'll check w/Trond.

>  - rdma_convert_physiov/rdma_convert_phys are completely broken.
>    page_to_phys can't be used by driver/fs code.  RDMA only deals with bus
>    addresses, not physical addresses.  You must use the dma mapping API
>    instead. Also coalescing decisions are made by the dma layer, because
>    they are platform dependent and much more complex then what the code
>    in this patch does.

Now that we are moving to OpenIB api's this is needed. There is some
thought necessary w.r.t. our max-performance mode of preregistering
memory in DMA mode. That's on our list of course.

>  - transport.c is missing a GPL license statement

Oops.

>  - in transport.c please don't use CamelCase variable names.

This is just for module parameters? These are going away but we don't have
the new NFS mount API yet. There is a comment to that effect but maybe
it doesn't mention the module stuff.

>  - MODULE_PARM shouldn't be used in new code, but module_param instead.

Ditto.

>  - please don't use the (void) function() style, it just obsfucates the
>    code without benefit.

Ok.

>  - try_module_get(THIS_MODULE) is always wrong.  Reference counting
>    should happen from the calling module.

This is the same convention used by the other RPC transports. I will pass
the comment along.

>  - please initialize global or file-scope spinlocks with
>    DEFINE_SPINLOCK().

Ok.

>  - the traditional name for the second argument to spin_lock_irqsave is
>    just flags, not lock_flags.  This doesn't really matter, but
>    following such conventions makes it easier to understand the code
>    for kernel hackers that just occasionally drop into your code.

Ok.

>  - no need to case the return value from kmalloc/kzalloc/etc.  They
>    return void * which can be directly assigned to any pointer type.

Ok. Did *I* write that code? Very unlike me. :-)

>  - please avoid typedes for structure types, like struct rdma_ia,
>    struct rdma_ep, etc..

I thought we got rid of those. Will recheck.

Tom.




More information about the general mailing list