[openib-general] IBM eHCA Device Driver for gen2 IB stack
IBMEHCA DD
IBMEHCAD at de.ibm.com
Tue Sep 20 05:48:14 PDT 2005
we released a
https://sourceforge.net/projects/ibmehcad/ ehca2_0025
today which adresses most of these comments,
and which survives most of pallas
Roland Dreier <rolandd at cisco.com> wrote on 25.07.2005 22:42:16:
> IBMEHCA> Hi, we've completed the first alpha code drop of the
> IBMEHCA> Power5 IBM eHCA Device Driver for the for the gen2
> IBMEHCA> openib.org stack. We're running IPoIB and ibv userspace
> IBMEHCA> programs successfully with this code in our lab setup.
>
> IBMEHCA> The source files can be downloaded from
> IBMEHCA> https://sourceforge.net/projects/ibmehcad/ ehca2_0011e
>
> Thanks for posting this. A few comments from a first read through.
> These are in no particular order, with minor nitpicks mixed in with
> more series problems. Some more review is still needed before this
> code should go upstream.
> ehca (kernel driver):
> - Please use /* */ instead of // for comments
done
> - Can the debugging stuff be less ugly than EDEB_AV_EN() etc?
> Also the formatting of your goto labels is not very nice. Instead
> of something like
>
> EHCA_REG_SMR_EXIT0:
> if (retcode == 0)
>
> I'd rather see
> out:
> if (retcode == 0)
we changed the goto statements to lower case, but didn't remove the
numbering yet.
The numbering actually is sort of a coding pattern to detect cleanup bugs.
The idea behind it is that resources must be freed in the inverse order
they where created,
so a
void * pd=malloc(4096)
if (0 == pd) {
goto error_out_dealloc_pd
}
...
void * cq=malloc(4096)
if (0 == cq) {
goto error_out_dealloc_cq
}
....
if (register_cq_fails) {
goto error_out_dealloc_cq
}
...
void * qp=malloc(4096)
if (0 == qp) {
goto error_out_dealloc_qp
}
goto out;
error_out_dealloc_qp
free(qp);
error_out_dealloc_cq
free(cq);
error_out_dealloc_pd
free(pd);
out:
return
would be more error prone than using numering. The basic idea is as soon
as you allocate/register a resource (memory, HCA resources, kobject...)
increase the goto statement number by one
free in opposite order as allocated,
never skip a number during the initial checks
...and if you see sth like
if (...) goto out_0
if (...) goto out_2
if (...) goto out_1
it's very obviously wrong.
>
> - In ehca_classes.c, it would be better to avoid using vmalloc() to
> allocate your structures.
done
>
> - In ehca_common.h, all the stuff copied from from hvcall.h should be
> deleted. Also, there are some functions that seem to use ugly
> return codes like H_Success for no good reason.
done (mostly, some of these defines are IB specific)
>
> - The definitions of p_to_u64() etc. seem unnecesary and needlessly
> obfuscated. You can always do
>
> (unsigned long) ptr
> to cast a void * to u64, and
> (void *) (unsigned long) val
done
> - Similarly in ehca_qp.c,
> #define QP_ATTR_IS_SET(mask,attr) (((mask)&(attr))!=0)
> is just needless obfuscation: just do the & directly where you need it.
done
> - In ehca_qp.c, is there a way we can take modqp_statetrans_table[]
> into common code and share with mthca_qp.c?
>
well, maybe, modify QP is very complicated to get straight, but if
somebody from mthca_qp.c thinks these are the right set of transitions
(this state machine base algorithm over the last 2 years)
> - It seems many source files include ehca_classes_pSeries.h
> directly. They should probably include ehca_classes.h. Also
> ehca_classes_zSeries.h seems to be missing.
done
we currently don't have a ehca_classes_zSeries.h
>
> Instead of defining P_SERIES on your command line, I think you
> should just include <linux/config.h> and test CONFIG_PPC_PSERIES
> and CONFIG_ARCH_S390 instead.
>
> - In ehca_main.c, ehca_hca_resources_show() violates the "one value
> per file" rule for sysfs. Either put this in debugfs or split it
> up into separate attributes in a subdirectory.
done
>
> - What is ehca_register_pci() doing? Why are you changing the main
> kernel's pci_dma_ops?? Shouldn't you be creating a device on your
> own virtual bus with it's own dma ops?
>
done,
see ebus
we're currently trying to get that integrated into base ppc64 kernel
> - It seems ehca_nopage() accesses ehca_idr without taking the
> ehca_idr_sem, is this OK?
>
good, point!
done
> libehca:
> - It would probably be a good idea to use GNU autotools to build
> libehca. In particular you want to make the destination directory
> configurable.
done (partly)
this is the initial version of autotools build,
still learning how to use autotools in a better way...
>
> - I don't think you should hard code -m64 in your CFLAGS or put
> /usr/local/lib64 anywhere -- I believe nearly everyone running
> ppc64 is using 32-bit userspace. Have you tested 32-bit userspace
> on a 64-bit kernel?
32 bit userspace will be supported in one of the next versions.
we're running mvapich as 64 bit library
>
> - I don't think the install should unconditionally delete
> /dev/infiniband and then create the device nodes -- this should
> really be handled by either udev or the distribution.
>
removed that from standard build, still trying to figure out how to
configure udev to automatically add the infiniband device nodes
> - In ehca_reqs_core.c, get rid of all the
> #ifndef __KERNEL__
> #define ib_recv_wr ibv_recv_wr
> #define ib_send_wr ibv_send_wr
> //#define ib_ah ibv_ah
> #define ehca_av ehcau_av
> #define unlikely(x) x
> // ib_wr_opcode
> #define IB_WR_SEND IBV_WR_SEND
> #define IB_WR_SEND_WITH_IMM IBV_WR_SEND_WITH_IMM
> #define IB_WR_RDMA_WRITE IBV_WR_RDMA_WRITE
> #define IB_WR_RDMA_WRITE_WITH_IMM IBV_WR_RDMA_WRITE_WITH_IMM
>
> and so on. It's not a good idea to try and use the same source for
> code that runs in the kernel and code that runs in userspace.
technically the kernel version looks like it's a kernel only module...
> Splitting your source also gets rid of stuff like
>
> #ifdef EHCA_USERDRIVER
> u8 *data = (u8 *) sge->addr;
> #else
> u8 *data = (u8 *) abs_to_virt(sge->addr);
> #endif
EHCA_USERDRIVER is the rest of a userspace sanbox to run the kernel code
in userspace, as done for parts of the scsi stack for example.
we'll try to remove that flag on the next release
> - In ehca_uinit.c, is the LIBEHCA_FILE environment variable handling
> safe? It seems that a user could trick a setuid process into doing
> something bad here.
>
good point!
now picking up the place where to trace in a config file in /etc
> - In ehca_umain.c there seems to be some weird indentation:
> my_cq->ehca_cq_core.galpas.kernel.fw_handle = (u64)
> mmap(NULL, 4096, PROT_WRITE, MAP_SHARED, context->cmd_fd,
> ((u64) (my_cq->token) << 32) | 0x11000000);
hmm, changed from bad to not completely good.
Will fix in the next version. Hope we didn't add some more of those.
> - Does it make sense to encapsulate to use of ppc64-specific assembly
> stuff like:
>
> __asm__ __volatile("dcbz 0,%0"::"r"(adr));
> __asm__ __volatile("dclst 0,%0"::"r"(adr));
> __asm__ __volatile__("sync":::"memory"); // serialize GAL register
access
done
>
> - In ehca_utools.h, ehca_swapbl32 and ehca_swapbl16 seem kind of
useless.
> Why not just use ntohl/ntohs directly?
done
>
> Thanks,
> Roland
Thanks for your input, I'm pretty sure this only was the start to get that
code kernel ready.
Christoph
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.openfabrics.org/pipermail/general/attachments/20050920/41e9c0ab/attachment.html>
More information about the general
mailing list