From kliteyn at dev.mellanox.co.il Tue Jan 1 01:01:10 2008
From: kliteyn at dev.mellanox.co.il (Yevgeny Kliteynik)
Date: Tue, 01 Jan 2008 11:01:10 +0200
Subject: [ofa-general] Re: [PATCH RFC] opensm: mcast mgr improvements
In-Reply-To: <20071231154815.GD11591@sashak.voltaire.com>
References: <4770CDCE.8040200@dev.mellanox.co.il>
 <20071229182718.GA19160@sashak.voltaire.com>
 <1199032710.23289.340.camel@hrosenstock-ws.xsigo.com>
 <20071230181610.GC10650@sashak.voltaire.com>
 <1199115083.23289.359.camel@hrosenstock-ws.xsigo.com>
 <47790C4D.7080405@dev.mellanox.co.il>
 <20071231154815.GD11591@sashak.voltaire.com>
Message-ID: <477A0156.2090307@dev.mellanox.co.il>

Sasha Khapyorsky wrote:
> On 17:35 Mon 31 Dec , Yevgeny Kliteynik wrote:
>> Hal Rosenstock wrote:
>>> On Sun, 2007-12-30 at 18:16 +0000, Sasha Khapyorsky wrote:
>>>> On 08:38 Sun 30 Dec , Hal Rosenstock wrote:
>>>>> On Sat, 2007-12-29 at 18:27 +0000, Sasha Khapyorsky wrote:
>>>>>> This improves handling of mcast join/leave requests storming. Now mcast
>>>>>> routing will be recalculated for all mcast groups where changes occurred
>>>>>> and not one by one. For this it queues mcast groups instead of mcast
>>>>>> rerouting requests, this also makes state_mgr idle queue obsolete.
>>>>> Looks like a nice improvement.
>>>>>
>>>>> What testing has been done with this change ? Can you comment on any
>>>>> results ?
>>>> osmtest, basic ipoib, SA db and MFTs dump diffs. Didn't find any
>>>> problem.
>>> What size topologies ? real and/or simulated ?
>>>>> For which branches is this change being proposed ?
>>>> I think it should go to OFED 1.3.
>>> Perhaps if there is sufficient soak time on real life topologies and
>>> other torture tests for this.
>> I will include this patch in the nightly simulation today,

Looks like there was some problem sending the simulation report tonight,
but test logs show that everything is ok.

-- Yevgeny

> Thanks!
>
> Sasha
>
>> but currently I don't have access to any real cluster.
>>
>> -- Yevgeny
>>
>>
>>> -- Hal
>>>> Sasha
>>>>
>>>>> -- Hal
>>>>>
>>>>>> Signed-off-by: Sasha Khapyorsky
>>>>>> ---
>>>>>>
>>>>>> Hi Yevgeny,
>>>>>>
>>>>>> For me it looks that it should solve the original problem (mcast group
>>>>>> list is purged in osm_mcast_mgr_process()). Could you review and ideally
>>>>>> test it? Thanks.
>>>>>>
>>>>>> Sasha
>>>>>>
>>>>>> ---
>>>>>> opensm/include/opensm/osm_mcast_mgr.h | 14 +--
>>>>>> opensm/include/opensm/osm_multicast.h | 2 +
>>>>>> opensm/include/opensm/osm_sm.h | 2 +
>>>>>> opensm/include/opensm/osm_state_mgr.h | 95 -----------------
>>>>>> opensm/opensm/osm_mcast_mgr.c | 187
>>>>>> +++++++++++++++------------------
>>>>>> opensm/opensm/osm_sm.c | 70 ++++++-------
>>>>>> opensm/opensm/osm_state_mgr.c | 138 +------------------------
>>>>>> 7 files changed, 130 insertions(+), 378 deletions(-)
>>>>>>
>>>>>> diff --git a/opensm/include/opensm/osm_mcast_mgr.h
>>>>>> b/opensm/include/opensm/osm_mcast_mgr.h
>>>>>> index 3e0b761..47b67ed 100644
>>>>>> --- a/opensm/include/opensm/osm_mcast_mgr.h
>>>>>> +++ b/opensm/include/opensm/osm_mcast_mgr.h
>>>>>> @@ -100,7 +100,6 @@ typedef struct _osm_mcast_mgr {
>>>>>> osm_req_t *p_req;
>>>>>> osm_log_t *p_log;
>>>>>> cl_plock_t *p_lock;
>>>>>> -
>>>>>> } osm_mcast_mgr_t;
>>>>>> /*
>>>>>> * FIELDS
>>>>>> @@ -253,25 +252,22 @@ osm_signal_t osm_mcast_mgr_process(IN
>>>>>> osm_mcast_mgr_t * const p_mgr);
>>>>>> * Multicast Manager, Node Info Response Controller
>>>>>> *********/
>>>>>> -/****f* OpenSM: Multicast Manager/osm_mcast_mgr_process_mgrp_cb
>>>>>> +/****f* OpenSM: Multicast Manager/osm_mcast_mgr_process_mgroups
>>>>>> * NAME
>>>>>> -* osm_mcast_mgr_process_mgrp_cb
>>>>>> +* osm_mcast_mgr_process_mgroups
>>>>>> *
>>>>>> * DESCRIPTION
>>>>>> -* Callback entry point for the osm_mcast_mgr_process_mgrp function.
>>>>>> +* Process only requested mcast groups.
>>>>>> *
>>>>>> * SYNOPSIS
>>>>>> */
>>>>>> osm_signal_t
>>>>>> -osm_mcast_mgr_process_mgrp_cb(IN void *const Context1, IN void *const
>>>>>> Context2);
>>>>>> +osm_mcast_mgr_process_mgroups(IN osm_mcast_mgr_t *p_mgr);
>>>>>> /*
>>>>>> * PARAMETERS
>>>>>> -* (Context1) p_mgr
>>>>>> +* p_mgr
>>>>>> * [in] Pointer to an osm_mcast_mgr_t object.
>>>>>> *
>>>>>> -* (Context2) p_mgrp
>>>>>> -* [in] Pointer to the multicast group to process.
>>>>>> -*
>>>>>> * RETURN VALUES
>>>>>> * IB_SUCCESS
>>>>>> *
>>>>>> diff --git a/opensm/include/opensm/osm_multicast.h
>>>>>> b/opensm/include/opensm/osm_multicast.h
>>>>>> index 729a2ea..f442a45 100644
>>>>>> --- a/opensm/include/opensm/osm_multicast.h
>>>>>> +++ b/opensm/include/opensm/osm_multicast.h
>>>>>> @@ -50,6 +50,7 @@
>>>>>> #include
>>>>>> #include
>>>>>> +#include
>>>>>> #include
>>>>>> #include
>>>>>> #include
>>>>>> @@ -121,6 +122,7 @@ const char *osm_get_mcast_req_type_str(IN
>>>>>> osm_mcast_req_type_t req_type);
>>>>>> * SYNOPSIS
>>>>>> */
>>>>>> typedef struct osm_mcast_mgr_ctxt {
>>>>>> + cl_list_item_t list_item;
>>>>>> ib_net16_t mlid;
>>>>>> osm_mcast_req_type_t req_type;
>>>>>> ib_net64_t port_guid;
>>>>>> diff --git a/opensm/include/opensm/osm_sm.h
>>>>>> b/opensm/include/opensm/osm_sm.h
>>>>>> index 4c6ce27..a676cd6 100644
>>>>>> --- a/opensm/include/opensm/osm_sm.h
>>>>>> +++ b/opensm/include/opensm/osm_sm.h
>>>>>> @@ -140,6 +140,8 @@ typedef struct osm_sm {
>>>>>> cl_dispatcher_t *p_disp;
>>>>>> cl_plock_t *p_lock;
>>>>>> atomic32_t sm_trans_id;
>>>>>> + cl_spinlock_t mgrp_lock;
>>>>>> + cl_qlist_t mgrp_list;
>>>>>> osm_req_t req;
>>>>>> osm_resp_t resp;
>>>>>> osm_ni_rcv_t ni_rcv;
>>>>>> diff --git a/opensm/include/opensm/osm_state_mgr.h
>>>>>> b/opensm/include/opensm/osm_state_mgr.h
>>>>>> index dada097..f51593a 100644
>>>>>> --- a/opensm/include/opensm/osm_state_mgr.h
>>>>>> +++ b/opensm/include/opensm/osm_state_mgr.h
>>>>>> @@ -109,8 +109,6 @@ typedef struct _osm_state_mgr {
>>>>>> osm_stats_t *p_stats;
>>>>>> struct _osm_sm_state_mgr *p_sm_state_mgr;
>>>>>> const osm_sm_mad_ctrl_t *p_mad_ctrl;
>>>>>> - cl_spinlock_t idle_lock;
>>>>>> - cl_qlist_t idle_time_list;
>>>>>> cl_plock_t *p_lock;
>>>>>> cl_event_t *p_subnet_up_event;
>>>>>> osm_sm_state_t state;
>>>>>> @@ -172,99 +170,6 @@ typedef struct _osm_state_mgr {
>>>>>> * State Manager object
>>>>>> *********/
>>>>>> -/****s* OpenSM: State Manager/_osm_idle_item
>>>>>> -* NAME
>>>>>> -* _osm_idle_item
>>>>>> -*
>>>>>> -* DESCRIPTION
>>>>>> -* Idle item.
>>>>>> -*
>>>>>> -* SYNOPSIS
>>>>>> -*/
>>>>>> -
>>>>>> -typedef osm_signal_t(*osm_pfn_start_t) (IN void *context1, IN void
>>>>>> *context2);
>>>>>> -
>>>>>> -typedef void
>>>>>> - (*osm_pfn_done_t) (IN void *context1, IN void *context2);
>>>>>> -
>>>>>> -typedef struct _osm_idle_item {
>>>>>> - cl_list_item_t list_item;
>>>>>> - void *context1;
>>>>>> - void *context2;
>>>>>> - osm_pfn_start_t pfn_start;
>>>>>> - osm_pfn_done_t pfn_done;
>>>>>> -} osm_idle_item_t;
>>>>>> -
>>>>>> -/*
>>>>>> -* FIELDS
>>>>>> -* list_item
>>>>>> -* list item.
>>>>>> -*
>>>>>> -* context1
>>>>>> -* Context pointer
>>>>>> -*
>>>>>> -* context2
>>>>>> -* Context pointer
>>>>>> -*
>>>>>> -* pfn_start
>>>>>> -* Pointer to the start function.
>>>>>> -*
>>>>>> -* pfn_done
>>>>>> -* Pointer to the dine function.
>>>>>> -* SEE ALSO
>>>>>> -* State Manager object
>>>>>> -*********/
>>>>>> -
>>>>>> -/****f* OpenSM: State Manager/osm_state_mgr_process_idle
>>>>>> -* NAME
>>>>>> -* osm_state_mgr_process_idle
>>>>>> -*
>>>>>> -* DESCRIPTION
>>>>>> -* Formulates the osm_idle_item and inserts it into the queue and
>>>>>> -* signals the state manager.
>>>>>> -*
>>>>>> -* SYNOPSIS
>>>>>> -*/
>>>>>> -
>>>>>> -ib_api_status_t
>>>>>> -osm_state_mgr_process_idle(IN osm_state_mgr_t * const p_mgr,
>>>>>> - IN osm_pfn_start_t pfn_start,
>>>>>> - IN osm_pfn_done_t pfn_done,
>>>>>> - void *context1, void *context2);
>>>>>> -
>>>>>> -/*
>>>>>> -* PARAMETERS
>>>>>> -* p_mgr
>>>>>> -* [in] Pointer to a State Manager object to construct.
>>>>>> -*
>>>>>> -* pfn_start
>>>>>> -* [in] Pointer the start function which will be called at
>>>>>> -* idle time.
>>>>>> -*
>>>>>> -* pfn_done
>>>>>> -* [in] pointer the done function which will be called
>>>>>> -* when outstanding smps is zero
>>>>>> -*
>>>>>> -* context1
>>>>>> -* [in] Pointer to void
>>>>>> -*
>>>>>> -* context2
>>>>>> -* [in] Pointer to void
>>>>>> -*
>>>>>> -* RETURN VALUE
>>>>>> -* IB_SUCCESS or IB_ERROR
>>>>>> -*
>>>>>> -* NOTES
>>>>>> -* Allows osm_state_mgr_destroy
>>>>>> -*
>>>>>> -* Calling osm_state_mgr_construct is a prerequisite to calling any
>>>>>> other
>>>>>> -* method except osm_state_mgr_init.
>>>>>> -*
>>>>>> -* SEE ALSO
>>>>>> -* State Manager object, osm_state_mgr_init,
>>>>>> -* osm_state_mgr_destroy
>>>>>> -*********/
>>>>>> -
>>>>>> /****f* OpenSM: State Manager/osm_state_mgr_construct
>>>>>> * NAME
>>>>>> * osm_state_mgr_construct
>>>>>> diff --git a/opensm/opensm/osm_mcast_mgr.c
>>>>>> b/opensm/opensm/osm_mcast_mgr.c
>>>>>> index 50b95fd..f51a45a 100644
>>>>>> --- a/opensm/opensm/osm_mcast_mgr.c
>>>>>> +++ b/opensm/opensm/osm_mcast_mgr.c
>>>>>> @@ -815,7 +815,7 @@ static osm_mtree_node_t
>>>>>> *__osm_mcast_mgr_branch(osm_mcast_mgr_t * const p_mgr,
>>>>>> }
>>>>>> free(list_array);
>>>>>> - Exit:
>>>>>> +Exit:
>>>>>> OSM_LOG_EXIT(p_mgr->p_log);
>>>>>> return (p_mtn);
>>>>>> }
>>>>>> @@ -932,7 +932,7 @@ __osm_mcast_mgr_build_spanning_tree(osm_mcast_mgr_t
>>>>>> * const p_mgr,
>>>>>> "Configured MLID 0x%X for %u ports, max tree depth = %u\n",
>>>>>> cl_ntoh16(osm_mgrp_get_mlid(p_mgrp)), count, max_depth);
>>>>>> - Exit:
>>>>>> +Exit:
>>>>>> OSM_LOG_EXIT(p_mgr->p_log);
>>>>>> return (status);
>>>>>> }
>>>>>> @@ -1171,7 +1171,7 @@ osm_mcast_mgr_process_single(IN osm_mcast_mgr_t *
>>>>>> const p_mgr,
>>>>>> }
>>>>>> }
>>>>>> - Exit:
>>>>>> +Exit:
>>>>>> OSM_LOG_EXIT(p_mgr->p_log);
>>>>>> return (status);
>>>>>> }
>>>>>> @@ -1254,63 +1254,55 @@ osm_mcast_mgr_process_tree(IN osm_mcast_mgr_t *
>>>>>> const p_mgr,
>>>>>> port_guid);
>>>>>> }
>>>>>> - Exit:
>>>>>> +Exit:
>>>>>> OSM_LOG_EXIT(p_mgr->p_log);
>>>>>> return (status);
>>>>>> }
>>>>>>
>>>>>> /**********************************************************************
>>>>>> Process the entire group.
>>>>>> -
>>>>>> NOTE : The lock should be held externally!
>>>>>>
>>>>>> **********************************************************************/
>>>>>> -static osm_signal_t
>>>>>> -osm_mcast_mgr_process_mgrp(IN osm_mcast_mgr_t * const p_mgr,
>>>>>> - IN osm_mgrp_t * const p_mgrp,
>>>>>> - IN osm_mcast_req_type_t req_type,
>>>>>> - IN ib_net64_t port_guid)
>>>>>> +static ib_api_status_t
>>>>>> +mcast_mgr_process_mgrp(IN osm_mcast_mgr_t * const p_mgr,
>>>>>> + IN osm_mgrp_t * const p_mgrp,
>>>>>> + IN osm_mcast_req_type_t req_type,
>>>>>> + IN ib_net64_t port_guid)
>>>>>> {
>>>>>> - osm_signal_t signal = OSM_SIGNAL_DONE;
>>>>>> ib_api_status_t status;
>>>>>> - osm_switch_t *p_sw;
>>>>>> - cl_qmap_t *p_sw_tbl;
>>>>>> - boolean_t pending_transactions = FALSE;
>>>>>> OSM_LOG_ENTER(p_mgr->p_log, osm_mcast_mgr_process_mgrp);
>>>>>> - p_sw_tbl = &p_mgr->p_subn->sw_guid_tbl;
>>>>>> -
>>>>>> status = osm_mcast_mgr_process_tree(p_mgr, p_mgrp, req_type,
>>>>>> port_guid);
>>>>>> if (status != IB_SUCCESS) {
>>>>>> osm_log(p_mgr->p_log, OSM_LOG_ERROR,
>>>>>> - "osm_mcast_mgr_process_mgrp: ERR 0A19: "
>>>>>> + "mcast_mgr_process_mgrp: ERR 0A19: "
>>>>>> "Unable to create spanning tree (%s)\n",
>>>>>> ib_get_err_str(status));
>>>>>> -
>>>>>> goto Exit;
>>>>>> }
>>>>>> + p_mgrp->last_tree_id = p_mgrp->last_change_id;
>>>>>> - /*
>>>>>> - Walk the switches and download the tables for each.
>>>>>> + /* Remove MGRP only if osm_mcm_port_t count is 0 and
>>>>>> + * Not a well known group
>>>>>> */
>>>>>> - p_sw = (osm_switch_t *) cl_qmap_head(p_sw_tbl);
>>>>>> - while (p_sw != (osm_switch_t *) cl_qmap_end(p_sw_tbl)) {
>>>>>> - signal = __osm_mcast_mgr_set_tbl(p_mgr, p_sw);
>>>>>> - if (signal == OSM_SIGNAL_DONE_PENDING)
>>>>>> - pending_transactions = TRUE;
>>>>>> -
>>>>>> - p_sw = (osm_switch_t *) cl_qmap_next(&p_sw->map_item);
>>>>>> + if (cl_qmap_count(&p_mgrp->mcm_port_tbl) == 0 && !p_mgrp->well_known)
>>>>>> {
>>>>>> + osm_log(p_mgr->p_log, OSM_LOG_DEBUG,
>>>>>> + "mcast_mgr_process_mgrp: "
>>>>>> + "Destroying mgrp with lid:0x%X\n",
>>>>>> + cl_ntoh16(p_mgrp->mlid));
>>>>>> + /* Send a Report to any InformInfo registered for
>>>>>> + Trap 67 : MCGroup delete */
>>>>>> + osm_mgrp_send_delete_notice(p_mgr->p_subn, p_mgr->p_log,
>>>>>> + p_mgrp);
>>>>>> + cl_qmap_remove_item(&p_mgr->p_subn->mgrp_mlid_tbl,
>>>>>> + (cl_map_item_t *) p_mgrp);
>>>>>> + osm_mgrp_delete(p_mgrp);
>>>>>> }
>>>>>> - osm_dump_mcast_routes(p_mgr->p_subn->p_osm);
>>>>>> -
>>>>>> - Exit:
>>>>>> +Exit:
>>>>>> OSM_LOG_EXIT(p_mgr->p_log);
>>>>>> -
>>>>>> - if (pending_transactions == TRUE)
>>>>>> - return (OSM_SIGNAL_DONE_PENDING);
>>>>>> - else
>>>>>> - return (OSM_SIGNAL_DONE);
>>>>>> + return status;
>>>>>> }
>>>>>>
>>>>>> /**********************************************************************
>>>>>> @@ -1321,14 +1313,13 @@ osm_signal_t osm_mcast_mgr_process(IN
>>>>>> osm_mcast_mgr_t * const p_mgr)
>>>>>> osm_switch_t *p_sw;
>>>>>> cl_qmap_t *p_sw_tbl;
>>>>>> cl_qmap_t *p_mcast_tbl;
>>>>>> + cl_qlist_t *p_list = &p_mgr->p_subn->p_osm->sm.mgrp_list;
>>>>>> osm_mgrp_t *p_mgrp;
>>>>>> - ib_api_status_t status;
>>>>>> boolean_t pending_transactions = FALSE;
>>>>>> OSM_LOG_ENTER(p_mgr->p_log, osm_mcast_mgr_process);
>>>>>> p_sw_tbl = &p_mgr->p_subn->sw_guid_tbl;
>>>>>> -
>>>>>> p_mcast_tbl = &p_mgr->p_subn->mgrp_mlid_tbl;
>>>>>> /*
>>>>>> While holding the lock, iterate over all the established
>>>>>> @@ -1343,16 +1334,8 @@ osm_signal_t osm_mcast_mgr_process(IN
>>>>>> osm_mcast_mgr_t * const p_mgr)
>>>>>> /* We reached here due to some change that caused a heavy sweep
>>>>>> of the subnet. Not due to a specific multicast request.
>>>>>> So the request type is subnet_change and the port guid is 0. */
>>>>>> - status = osm_mcast_mgr_process_tree(p_mgr, p_mgrp,
>>>>>> - OSM_MCAST_REQ_TYPE_SUBNET_CHANGE,
>>>>>> - 0);
>>>>>> - if (status != IB_SUCCESS) {
>>>>>> - osm_log(p_mgr->p_log, OSM_LOG_ERROR,
>>>>>> - "osm_mcast_mgr_process: ERR 0A20: "
>>>>>> - "Unable to create spanning tree (%s)\n",
>>>>>> - ib_get_err_str(status));
>>>>>> - }
>>>>>> -
>>>>>> + mcast_mgr_process_mgrp(p_mgr, p_mgrp,
>>>>>> + OSM_MCAST_REQ_TYPE_SUBNET_CHANGE, 0);
>>>>>> p_mgrp = (osm_mgrp_t *) cl_qmap_next(&p_mgrp->map_item);
>>>>>> }
>>>>>> @@ -1364,10 +1347,14 @@ osm_signal_t osm_mcast_mgr_process(IN
>>>>>> osm_mcast_mgr_t * const p_mgr)
>>>>>> signal = __osm_mcast_mgr_set_tbl(p_mgr, p_sw);
>>>>>> if (signal == OSM_SIGNAL_DONE_PENDING)
>>>>>> pending_transactions = TRUE;
>>>>>> -
>>>>>> p_sw = (osm_switch_t *) cl_qmap_next(&p_sw->map_item);
>>>>>> }
>>>>>> + while (!cl_is_qlist_empty(p_list)) {
>>>>>> + cl_list_item_t *p = cl_qlist_remove_head(p_list);
>>>>>> + free(p);
>>>>>> + }
>>>>>> +
>>>>>> CL_PLOCK_RELEASE(p_mgr->p_lock);
>>>>>> OSM_LOG_EXIT(p_mgr->p_log);
>>>>>> @@ -1395,79 +1382,79 @@ osm_mgrp_t *__get_mgrp_by_mlid(IN
>>>>>> osm_mcast_mgr_t * const p_mgr,
>>>>>>
>>>>>> /**********************************************************************
>>>>>> This is the function that is invoked during idle time to handle the
>>>>>> - process request. Context1 is simply the osm_mcast_mgr_t*, Context2
>>>>>> - hold the mlid, port guid and action (join/leave/delete) required.
>>>>>> + process request for mcast groups where join/leave/delete was
>>>>>> required.
>>>>>>
>>>>>> **********************************************************************/
>>>>>> -osm_signal_t
>>>>>> -osm_mcast_mgr_process_mgrp_cb(IN void *const Context1, IN void *const
>>>>>> Context2)
>>>>>> +osm_signal_t osm_mcast_mgr_process_mgroups(osm_mcast_mgr_t * p_mgr)
>>>>>> {
>>>>>> - osm_mcast_mgr_t *p_mgr = (osm_mcast_mgr_t *) Context1;
>>>>>> + cl_qlist_t *p_list = &p_mgr->p_subn->p_osm->sm.mgrp_list;
>>>>>> + osm_switch_t *p_sw;
>>>>>> + cl_qmap_t *p_sw_tbl;
>>>>>> osm_mgrp_t *p_mgrp;
>>>>>> ib_net16_t mlid;
>>>>>> - osm_signal_t signal = OSM_SIGNAL_DONE;
>>>>>> - osm_mcast_mgr_ctxt_t *p_ctxt = (osm_mcast_mgr_ctxt_t *) Context2;
>>>>>> - osm_mcast_req_type_t req_type = p_ctxt->req_type;
>>>>>> - ib_net64_t port_guid = p_ctxt->port_guid;
>>>>>> -
>>>>>> - OSM_LOG_ENTER(p_mgr->p_log, osm_mcast_mgr_process_mgrp_cb);
>>>>>> -
>>>>>> - /* nice copy no warning on size diff */
>>>>>> - memcpy(&mlid, &p_ctxt->mlid, sizeof(mlid));
>>>>>> + osm_signal_t ret, signal = OSM_SIGNAL_DONE;
>>>>>> + osm_mcast_mgr_ctxt_t *ctx;
>>>>>> + osm_mcast_req_type_t req_type;
>>>>>> + ib_net64_t port_guid;
>>>>>> - /* we can destroy the context now */
>>>>>> - free(p_ctxt);
>>>>>> + OSM_LOG_ENTER(p_mgr->p_log, osm_mcast_mgr_process_mgroups);
>>>>>> /* we need a lock to make sure the p_mgrp is not change other ways */
>>>>>> CL_PLOCK_EXCL_ACQUIRE(p_mgr->p_lock);
>>>>>> - p_mgrp = __get_mgrp_by_mlid(p_mgr, mlid);
>>>>>> - /* since we delayed the execution we prefer to pass the
>>>>>> - mlid as the mgrp identifier and then find it or abort */
>>>>>> + if (cl_is_qlist_empty(p_list)) {
>>>>>> + CL_PLOCK_RELEASE(p_mgr->p_lock);
>>>>>> + return OSM_SIGNAL_NONE;
>>>>>> + }
>>>>>> +
>>>>>> + while (!cl_is_qlist_empty(p_list)) {
>>>>>> + ctx = (osm_mcast_mgr_ctxt_t *) cl_qlist_remove_head(p_list);
>>>>>> + req_type = ctx->req_type;
>>>>>> + port_guid = ctx->port_guid;
>>>>>> +
>>>>>> + /* nice copy no warning on size diff */
>>>>>> + memcpy(&mlid, &ctx->mlid, sizeof(mlid));
>>>>>> - if (p_mgrp) {
>>>>>> + /* we can destroy the context now */
>>>>>> + free(ctx);
>>>>>> +
>>>>>> + /* since we delayed the execution we prefer to pass the
>>>>>> + mlid as the mgrp identifier and then find it or abort */
>>>>>> + p_mgrp = __get_mgrp_by_mlid(p_mgr, mlid);
>>>>>> + if (!p_mgrp)
>>>>>> + continue;
>>>>>> - /* if there was no change from the last time we processed the group
>>>>>> - we can skip doing anything
>>>>>> + /* if there was no change from the last time
>>>>>> + * we processed the group we can skip doing anything
>>>>>> */
>>>>>> if (p_mgrp->last_change_id == p_mgrp->last_tree_id) {
>>>>>> osm_log(p_mgr->p_log, OSM_LOG_DEBUG,
>>>>>> - "osm_mcast_mgr_process_mgrp_cb: "
>>>>>> + "osm_mcast_mgr_process_mgroups: "
>>>>>> "Skip processing mgrp with lid:0x%X change id:%u\n",
>>>>>> cl_ntoh16(mlid), p_mgrp->last_change_id);
>>>>>> - } else {
>>>>>> - osm_log(p_mgr->p_log, OSM_LOG_DEBUG,
>>>>>> - "osm_mcast_mgr_process_mgrp_cb: "
>>>>>> - "Processing mgrp with lid:0x%X change id:%u\n",
>>>>>> - cl_ntoh16(mlid), p_mgrp->last_change_id);
>>>>>> -
>>>>>> - signal =
>>>>>> - osm_mcast_mgr_process_mgrp(p_mgr, p_mgrp, req_type,
>>>>>> - port_guid);
>>>>>> - p_mgrp->last_tree_id = p_mgrp->last_change_id;
>>>>>> + continue;
>>>>>> }
>>>>>> - /* Remove MGRP only if osm_mcm_port_t count is 0 and
>>>>>> - * Not a well known group
>>>>>> - */
>>>>>> - if ((0x0 == cl_qmap_count(&p_mgrp->mcm_port_tbl)) &&
>>>>>> - (p_mgrp->well_known == FALSE)) {
>>>>>> - osm_log(p_mgr->p_log, OSM_LOG_DEBUG,
>>>>>> - "osm_mcast_mgr_process_mgrp_cb: "
>>>>>> - "Destroying mgrp with lid:0x%X\n",
>>>>>> - cl_ntoh16(mlid));
>>>>>> -
>>>>>> - /* Send a Report to any InformInfo registered for
>>>>>> - Trap 67 : MCGroup delete */
>>>>>> - osm_mgrp_send_delete_notice(p_mgr->p_subn, p_mgr->p_log,
>>>>>> - p_mgrp);
>>>>>> -
>>>>>> - cl_qmap_remove_item(&p_mgr->p_subn->mgrp_mlid_tbl,
>>>>>> - (cl_map_item_t *) p_mgrp);
>>>>>> + osm_log(p_mgr->p_log, OSM_LOG_DEBUG,
>>>>>> + "osm_mcast_mgr_process_mgroups: "
>>>>>> + "Processing mgrp with lid:0x%X change id:%u\n",
>>>>>> + cl_ntoh16(mlid), p_mgrp->last_change_id);
>>>>>> + mcast_mgr_process_mgrp(p_mgr, p_mgrp, req_type, port_guid);
>>>>>> + }
>>>>>> - osm_mgrp_delete(p_mgrp);
>>>>>> - }
>>>>>> + /*
>>>>>> + Walk the switches and download the tables for each.
>>>>>> + */
>>>>>> + p_sw_tbl = &p_mgr->p_subn->sw_guid_tbl;
>>>>>> + p_sw = (osm_switch_t *) cl_qmap_head(p_sw_tbl);
>>>>>> + while (p_sw != (osm_switch_t *) cl_qmap_end(p_sw_tbl)) {
>>>>>> + ret = __osm_mcast_mgr_set_tbl(p_mgr, p_sw);
>>>>>> + if (ret == OSM_SIGNAL_DONE_PENDING)
>>>>>> + signal = ret;
>>>>>> + p_sw = (osm_switch_t *) cl_qmap_next(&p_sw->map_item);
>>>>>> }
>>>>>> + osm_dump_mcast_routes(p_mgr->p_subn->p_osm);
>>>>>> +
>>>>>> CL_PLOCK_RELEASE(p_mgr->p_lock);
>>>>>> OSM_LOG_EXIT(p_mgr->p_log);
>>>>>> return signal;
>>>>>> diff --git a/opensm/opensm/osm_sm.c b/opensm/opensm/osm_sm.c
>>>>>> index 88e6d4a..b295a77 100644
>>>>>> --- a/opensm/opensm/osm_sm.c
>>>>>> +++ b/opensm/opensm/osm_sm.c
>>>>>> @@ -144,6 +144,7 @@ void osm_sm_construct(IN osm_sm_t * const p_sm)
>>>>>> cl_event_construct(&p_sm->signal_event);
>>>>>> cl_event_construct(&p_sm->subnet_up_event);
>>>>>> cl_thread_construct(&p_sm->sweeper);
>>>>>> + cl_spinlock_construct(&p_sm->mgrp_lock);
>>>>>> osm_req_construct(&p_sm->req);
>>>>>> osm_resp_construct(&p_sm->resp);
>>>>>> osm_ni_rcv_construct(&p_sm->ni_rcv);
>>>>>> @@ -245,6 +246,7 @@ void osm_sm_destroy(IN osm_sm_t * const p_sm)
>>>>>> cl_event_destroy(&p_sm->signal_event);
>>>>>> cl_event_destroy(&p_sm->subnet_up_event);
>>>>>> cl_spinlock_destroy(&p_sm->signal_lock);
>>>>>> + cl_spinlock_destroy(&p_sm->mgrp_lock);
>>>>>> osm_log(p_sm->p_log, OSM_LOG_SYS, "Exiting SM\n"); /* Format Waived
>>>>>> */
>>>>>> OSM_LOG_EXIT(p_sm->p_log);
>>>>>> @@ -292,6 +294,12 @@ osm_sm_init(IN osm_sm_t * const p_sm,
>>>>>> if (status != CL_SUCCESS)
>>>>>> goto Exit;
>>>>>> + cl_qlist_init(&p_sm->mgrp_list);
>>>>>> +
>>>>>> + status = cl_spinlock_init(&p_sm->mgrp_lock);
>>>>>> + if (status != CL_SUCCESS)
>>>>>> + goto Exit;
>>>>>> +
>>>>>> status = osm_sm_mad_ctrl_init(&p_sm->mad_ctrl,
>>>>>> p_sm->p_subn,
>>>>>> p_sm->p_mad_pool,
>>>>>> @@ -551,32 +559,43 @@ osm_sm_bind(IN osm_sm_t * const p_sm, IN const
>>>>>> ib_net64_t port_guid)
>>>>>> /**********************************************************************
>>>>>>
>>>>>> **********************************************************************/
>>>>>> static ib_api_status_t
>>>>>> -__osm_sm_mgrp_connect(IN osm_sm_t * const p_sm,
>>>>>> +__osm_sm_mgrp_process(IN osm_sm_t * const p_sm,
>>>>>> IN osm_mgrp_t * const p_mgrp,
>>>>>> IN const ib_net64_t port_guid,
>>>>>> IN osm_mcast_req_type_t req_type)
>>>>>> {
>>>>>> - ib_api_status_t status;
>>>>>> osm_mcast_mgr_ctxt_t *ctx2;
>>>>>> - OSM_LOG_ENTER(p_sm->p_log, __osm_sm_mgrp_connect);
>>>>>> -
>>>>>> /*
>>>>>> * 'Schedule' all the QP0 traffic for when the state manager
>>>>>> * isn't busy trying to do something else.
>>>>>> */
>>>>>> ctx2 = (osm_mcast_mgr_ctxt_t *) malloc(sizeof(osm_mcast_mgr_ctxt_t));
>>>>>> + if (!ctx2)
>>>>>> + return IB_ERROR;
>>>>>> + memset(ctx2, 0, sizeof(*ctx2));
>>>>>> memcpy(&ctx2->mlid, &p_mgrp->mlid, sizeof(p_mgrp->mlid));
>>>>>> ctx2->req_type = req_type;
>>>>>> ctx2->port_guid = port_guid;
>>>>>> - status = osm_state_mgr_process_idle(&p_sm->state_mgr,
>>>>>> - osm_mcast_mgr_process_mgrp_cb,
>>>>>> - NULL, &p_sm->mcast_mgr,
>>>>>> - (void *)ctx2);
>>>>>> + cl_spinlock_acquire(&p_sm->mgrp_lock);
>>>>>> + cl_qlist_insert_tail(&p_sm->mgrp_list, &ctx2->list_item);
>>>>>> + cl_spinlock_release(&p_sm->mgrp_lock);
>>>>>> - OSM_LOG_EXIT(p_sm->p_log);
>>>>>> - return (status);
>>>>>> + osm_sm_signal(p_sm, OSM_SIGNAL_IDLE_TIME_PROCESS_REQUEST);
>>>>>> +
>>>>>> + return IB_SUCCESS;
>>>>>> +}
>>>>>> +
>>>>>> +/**********************************************************************
>>>>>> +
>>>>>> **********************************************************************/
>>>>>> +static ib_api_status_t
>>>>>> +__osm_sm_mgrp_connect(IN osm_sm_t * const p_sm,
>>>>>> + IN osm_mgrp_t * const p_mgrp,
>>>>>> + IN const ib_net64_t port_guid,
>>>>>> + IN osm_mcast_req_type_t req_type)
>>>>>> +{
>>>>>> + return __osm_sm_mgrp_process(p_sm, p_mgrp, port_guid, req_type);
>>>>>> }
>>>>>>
>>>>>> /**********************************************************************
>>>>>> @@ -586,31 +605,7 @@ __osm_sm_mgrp_disconnect(IN osm_sm_t * const p_sm,
>>>>>> IN osm_mgrp_t * const p_mgrp,
>>>>>> IN const ib_net64_t port_guid)
>>>>>> {
>>>>>> - ib_api_status_t status;
>>>>>> - osm_mcast_mgr_ctxt_t *ctx2;
>>>>>> -
>>>>>> - OSM_LOG_ENTER(p_sm->p_log, __osm_sm_mgrp_disconnect);
>>>>>> -
>>>>>> - /*
>>>>>> - * 'Schedule' all the QP0 traffic for when the state manager
>>>>>> - * isn't busy trying to do something else.
>>>>>> - */
>>>>>> - ctx2 = (osm_mcast_mgr_ctxt_t *) malloc(sizeof(osm_mcast_mgr_ctxt_t));
>>>>>> - memcpy(&ctx2->mlid, &p_mgrp->mlid, sizeof(p_mgrp->mlid));
>>>>>> - ctx2->req_type = OSM_MCAST_REQ_TYPE_LEAVE;
>>>>>> - ctx2->port_guid = port_guid;
>>>>>> -
>>>>>> - status = osm_state_mgr_process_idle(&p_sm->state_mgr,
>>>>>> - osm_mcast_mgr_process_mgrp_cb,
>>>>>> - NULL, &p_sm->mcast_mgr, ctx2);
>>>>>> - if (status != IB_SUCCESS) {
>>>>>> - osm_log(p_sm->p_log, OSM_LOG_ERROR,
>>>>>> - "__osm_sm_mgrp_disconnect: ERR 2E11: "
>>>>>> - "Failure processing multicast group (%s)\n",
>>>>>> - ib_get_err_str(status));
>>>>>> - }
>>>>>> -
>>>>>> - OSM_LOG_EXIT(p_sm->p_log);
>>>>>> + __osm_sm_mgrp_process(p_sm, p_mgrp, port_guid,
>>>>>> OSM_MCAST_REQ_TYPE_LEAVE);
>>>>>> }
>>>>>>
>>>>>> /**********************************************************************
>>>>>> @@ -719,8 +714,8 @@ osm_sm_mcgrp_join(IN osm_sm_t * const p_sm,
>>>>>> goto Exit;
>>>>>> }
>>>>>> - CL_PLOCK_RELEASE(p_sm->p_lock);
>>>>>> status = __osm_sm_mgrp_connect(p_sm, p_mgrp, port_guid, req_type);
>>>>>> + CL_PLOCK_RELEASE(p_sm->p_lock);
>>>>>> Exit:
>>>>>> OSM_LOG_EXIT(p_sm->p_log);
>>>>>> @@ -782,9 +777,8 @@ osm_sm_mcgrp_leave(IN osm_sm_t * const p_sm,
>>>>>> osm_port_remove_mgrp(p_port, mlid);
>>>>>> - CL_PLOCK_RELEASE(p_sm->p_lock);
>>>>>> -
>>>>>> __osm_sm_mgrp_disconnect(p_sm, p_mgrp, port_guid);
>>>>>> + CL_PLOCK_RELEASE(p_sm->p_lock);
>>>>>> Exit:
>>>>>> OSM_LOG_EXIT(p_sm->p_log);
>>>>>> diff --git a/opensm/opensm/osm_state_mgr.c
>>>>>> b/opensm/opensm/osm_state_mgr.c
>>>>>> index 5c39f11..d4dd782 100644
>>>>>> --- a/opensm/opensm/osm_state_mgr.c
>>>>>> +++ b/opensm/opensm/osm_state_mgr.c
>>>>>> @@ -76,7 +76,6 @@ osm_signal_t osm_qos_setup(IN osm_opensm_t * p_osm);
>>>>>> void osm_state_mgr_construct(IN osm_state_mgr_t * const p_mgr)
>>>>>> {
>>>>>> memset(p_mgr, 0, sizeof(*p_mgr));
>>>>>> - cl_spinlock_construct(&p_mgr->idle_lock);
>>>>>> p_mgr->state = OSM_SM_STATE_INIT;
>>>>>> }
>>>>>> @@ -88,9 +87,6 @@ void osm_state_mgr_destroy(IN osm_state_mgr_t * const
>>>>>> p_mgr)
>>>>>> OSM_LOG_ENTER(p_mgr->p_log, osm_state_mgr_destroy);
>>>>>> - /* destroy the locks */
>>>>>> - cl_spinlock_destroy(&p_mgr->idle_lock);
>>>>>> -
>>>>>> OSM_LOG_EXIT(p_mgr->p_log);
>>>>>> }
>>>>>> @@ -112,8 +108,6 @@ osm_state_mgr_init(IN osm_state_mgr_t * const
>>>>>> p_mgr,
>>>>>> IN cl_event_t * const p_subnet_up_event,
>>>>>> IN osm_log_t * const p_log)
>>>>>> {
>>>>>> - cl_status_t status;
>>>>>> -
>>>>>> OSM_LOG_ENTER(p_log, osm_state_mgr_init);
>>>>>> CL_ASSERT(p_subn);
>>>>>> @@ -145,17 +139,8 @@ osm_state_mgr_init(IN osm_state_mgr_t * const
>>>>>> p_mgr,
>>>>>> p_mgr->p_lock = p_lock;
>>>>>> p_mgr->p_subnet_up_event = p_subnet_up_event;
>>>>>> - cl_qlist_init(&p_mgr->idle_time_list);
>>>>>> -
>>>>>> - status = cl_spinlock_init(&p_mgr->idle_lock);
>>>>>> - if (status != CL_SUCCESS) {
>>>>>> - osm_log(p_mgr->p_log, OSM_LOG_ERROR,
>>>>>> - "osm_state_mgr_init: ERR 3302: "
>>>>>> - "Spinlock init failed (%s)\n", CL_STATUS_MSG(status));
>>>>>> - }
>>>>>> -
>>>>>> OSM_LOG_EXIT(p_mgr->p_log);
>>>>>> - return (status);
>>>>>> + return IB_SUCCESS;
>>>>>> }
>>>>>>
>>>>>> /**********************************************************************
>>>>>> @@ -989,79 +974,6 @@ static ib_api_status_t
>>>>>> __osm_state_mgr_light_sweep_start(IN osm_state_mgr_t *
>>>>>> }
>>>>>>
>>>>>> /**********************************************************************
>>>>>> -
>>>>>> **********************************************************************/
>>>>>> -static void __process_idle_time_queue_done(IN osm_state_mgr_t * const
>>>>>> p_mgr)
>>>>>> -{
>>>>>> - cl_qlist_t *p_list = &p_mgr->idle_time_list;
>>>>>> - cl_list_item_t *p_list_item;
>>>>>> - osm_idle_item_t *p_process_item;
>>>>>> -
>>>>>> - OSM_LOG_ENTER(p_mgr->p_log, __process_idle_time_queue_done);
>>>>>> -
>>>>>> - cl_spinlock_acquire(&p_mgr->idle_lock);
>>>>>> - p_list_item = cl_qlist_remove_head(p_list);
>>>>>> -
>>>>>> - if (p_list_item == cl_qlist_end(p_list)) {
>>>>>> - cl_spinlock_release(&p_mgr->idle_lock);
>>>>>> - osm_log(p_mgr->p_log, OSM_LOG_ERROR,
>>>>>> - "__process_idle_time_queue_done: ERR 3314: "
>>>>>> - "Idle time queue is empty\n");
>>>>>> - return;
>>>>>> - }
>>>>>> - cl_spinlock_release(&p_mgr->idle_lock);
>>>>>> -
>>>>>> - p_process_item = (osm_idle_item_t *) p_list_item;
>>>>>> -
>>>>>> - if (p_process_item->pfn_done) {
>>>>>> -
>>>>>> - p_process_item->pfn_done(p_process_item->context1,
>>>>>> - p_process_item->context2);
>>>>>> - }
>>>>>> -
>>>>>> - free(p_process_item);
>>>>>> -
>>>>>> - OSM_LOG_EXIT(p_mgr->p_log);
>>>>>> - return;
>>>>>> -}
>>>>>> -
>>>>>> -/**********************************************************************
>>>>>> -
>>>>>> **********************************************************************/
>>>>>> -static osm_signal_t __process_idle_time_queue_start(IN osm_state_mgr_t
>>>>>> *
>>>>>> - const p_mgr)
>>>>>> -{
>>>>>> - cl_qlist_t *p_list = &p_mgr->idle_time_list;
>>>>>> - cl_list_item_t *p_list_item;
>>>>>> - osm_idle_item_t *p_process_item;
>>>>>> - osm_signal_t signal;
>>>>>> -
>>>>>> - OSM_LOG_ENTER(p_mgr->p_log, __process_idle_time_queue_start);
>>>>>> -
>>>>>> - cl_spinlock_acquire(&p_mgr->idle_lock);
>>>>>> -
>>>>>> - p_list_item = cl_qlist_head(p_list);
>>>>>> - if (p_list_item == cl_qlist_end(p_list)) {
>>>>>> - cl_spinlock_release(&p_mgr->idle_lock);
>>>>>> - OSM_LOG_EXIT(p_mgr->p_log);
>>>>>> - return OSM_SIGNAL_NONE;
>>>>>> - }
>>>>>> -
>>>>>> - cl_spinlock_release(&p_mgr->idle_lock);
>>>>>> -
>>>>>> - p_process_item = (osm_idle_item_t *) p_list_item;
>>>>>> -
>>>>>> - CL_ASSERT(p_process_item->pfn_start);
>>>>>> -
>>>>>> - signal =
>>>>>> - p_process_item->pfn_start(p_process_item->context1,
>>>>>> - p_process_item->context2);
>>>>>> -
>>>>>> - CL_ASSERT(signal != OSM_SIGNAL_NONE);
>>>>>> -
>>>>>> - OSM_LOG_EXIT(p_mgr->p_log);
>>>>>> - return signal;
>>>>>> -}
>>>>>> -
>>>>>> -/**********************************************************************
>>>>>> * Go over all the remote SMs (as updated in the sm_guid_tbl).
>>>>>> * Find if there is a remote sm that is a master SM.
>>>>>> * If there is a remote master SM - return a pointer to it,
>>>>>> @@ -1558,7 +1470,7 @@ void osm_state_mgr_process(IN osm_state_mgr_t *
>>>>>> const p_mgr,
>>>>>> case OSM_SM_STATE_PROCESS_REQUEST:
>>>>>> switch (signal) {
>>>>>> case OSM_SIGNAL_IDLE_TIME_PROCESS:
>>>>>> - signal = __process_idle_time_queue_start(p_mgr);
>>>>>> + signal = osm_mcast_mgr_process_mgroups(p_mgr->p_mcast_mgr);
>>>>>> switch (signal) {
>>>>>> case OSM_SIGNAL_NONE:
>>>>>> p_mgr->state = OSM_SM_STATE_IDLE;
>>>>>> @@ -1604,14 +1516,6 @@ void osm_state_mgr_process(IN osm_state_mgr_t *
>>>>>> const p_mgr,
>>>>>> switch (signal) {
>>>>>> case OSM_SIGNAL_NO_PENDING_TRANSACTIONS:
>>>>>> case OSM_SIGNAL_DONE:
>>>>>> - /* CALL the done function */
>>>>>> - __process_idle_time_queue_done(p_mgr);
>>>>>> -
>>>>>> - /*
>>>>>> - * Set the signal to OSM_SIGNAL_IDLE_TIME_PROCESS
>>>>>> - * so that the next element in the queue gets processed
>>>>>> - */
>>>>>> -
>>>>>> signal = OSM_SIGNAL_IDLE_TIME_PROCESS;
>>>>>> p_mgr->state = OSM_SM_STATE_PROCESS_REQUEST;
>>>>>> break;
>>>>>> @@ -2424,41 +2328,3 @@ void osm_state_mgr_process(IN osm_state_mgr_t *
>>>>>> const p_mgr,
>>>>>> OSM_LOG_EXIT(p_mgr->p_log);
>>>>>> }
>>>>>> -
>>>>>> -/**********************************************************************
>>>>>> -
>>>>>> **********************************************************************/
>>>>>> -ib_api_status_t
>>>>>> -osm_state_mgr_process_idle(IN osm_state_mgr_t * const p_mgr,
>>>>>> - IN osm_pfn_start_t pfn_start,
>>>>>> - IN osm_pfn_done_t pfn_done, void *context1,
>>>>>> - void *context2)
>>>>>> -{
>>>>>> - osm_idle_item_t *p_idle_item;
>>>>>> -
>>>>>> - OSM_LOG_ENTER(p_mgr->p_log, osm_state_mgr_process_idle);
>>>>>> -
>>>>>> - p_idle_item = malloc(sizeof(osm_idle_item_t));
>>>>>> - if (p_idle_item == NULL) {
>>>>>> - osm_log(p_mgr->p_log, OSM_LOG_ERROR,
>>>>>> - "osm_state_mgr_process_idle: ERR 3321: "
>>>>>> - "insufficient memory\n");
>>>>>> - return IB_ERROR;
>>>>>> - }
>>>>>> -
>>>>>> - memset(p_idle_item, 0, sizeof(osm_idle_item_t));
>>>>>> - p_idle_item->pfn_start = pfn_start;
>>>>>> - p_idle_item->pfn_done = pfn_done;
>>>>>> - p_idle_item->context1 = context1;
>>>>>> - p_idle_item->context2 = context2;
>>>>>> -
>>>>>> - cl_spinlock_acquire(&p_mgr->idle_lock);
>>>>>> - cl_qlist_insert_tail(&p_mgr->idle_time_list, &p_idle_item->list_item);
>>>>>> - cl_spinlock_release(&p_mgr->idle_lock);
>>>>>> -
>>>>>> - osm_sm_signal(&p_mgr->p_subn->p_osm->sm,
>>>>>> - OSM_SIGNAL_IDLE_TIME_PROCESS_REQUEST);
>>>>>> -
>>>>>> - OSM_LOG_EXIT(p_mgr->p_log);
>>>>>> -
>>>>>> - return IB_SUCCESS;
>>>>>> -}
>>>> _______________________________________________
>>>> general mailing list
>>>> general at lists.openfabrics.org
>>>> http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general
>>>>
>>>> To unsubscribe, please visit
>>>> http://openib.org/mailman/listinfo/openib-general
>>> _______________________________________________
>>> general mailing list
>>> general at lists.openfabrics.org
>>> http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general
>>> To unsubscribe, please visit
>>> http://openib.org/mailman/listinfo/openib-general
>

From vlad at lists.openfabrics.org Tue Jan 1 03:07:26 2008
From: vlad at lists.openfabrics.org (Vladimir Sokolovsky Mellanox)
Date: Tue, 1 Jan 2008 03:07:26 -0800 (PST)
Subject: [ofa-general] ofa_1_3_kernel 20080101-0200 daily build status
Message-ID: <20080101110726.863F8E60090@openfabrics.org>

This email was generated automatically, please do not reply

git_url: git://git.openfabrics.org/ofed_1_3/linux-2.6.git
git_branch: ofed_kernel

Common build parameters: --with-ipoib-mod --with-sdp-mod --with-srp-mod --with-user_mad-mod --with-user_access-mod --with-mthca-mod --with-mlx4-mod --with-core-mod --with-addr_trans-mod --with-rds-mod
--with-cxgb3-mod --with-nes-mod

Passed:
Passed on i686 with 2.6.15-23-server
Passed on i686 with linux-2.6.22
Passed on i686 with linux-2.6.21.1
Passed on i686 with linux-2.6.18
Passed on i686 with linux-2.6.17
Passed on i686 with linux-2.6.19
Passed on i686 with linux-2.6.16
Passed on i686 with linux-2.6.13
Passed on i686 with linux-2.6.14
Passed on i686 with linux-2.6.12
Passed on i686 with linux-2.6.15
Passed on x86_64 with linux-2.6.18
Passed on ia64 with linux-2.6.18
Passed on x86_64 with linux-2.6.17
Passed on ppc64 with linux-2.6.16
Passed on x86_64 with linux-2.6.20
Passed on x86_64 with linux-2.6.19
Passed on ppc64 with linux-2.6.15
Passed on ia64 with linux-2.6.17
Passed on ppc64 with linux-2.6.12
Passed on powerpc with linux-2.6.12
Passed on powerpc with linux-2.6.13
Passed on ppc64 with linux-2.6.17
Passed on ppc64 with linux-2.6.19
Passed on ia64 with linux-2.6.12
Passed on ia64 with linux-2.6.19
Passed on ppc64 with linux-2.6.14
Passed on powerpc with linux-2.6.15
Passed on x86_64 with linux-2.6.12
Passed on x86_64 with linux-2.6.18-1.2798.fc6
Passed on x86_64 with linux-2.6.16
Passed on ia64 with linux-2.6.13
Passed on ia64 with linux-2.6.14
Passed on x86_64 with linux-2.6.15
Passed on ppc64 with linux-2.6.18
Passed on ia64 with linux-2.6.16
Passed on ia64 with linux-2.6.15
Passed on powerpc with linux-2.6.14
Passed on ia64 with linux-2.6.22
Passed on x86_64 with linux-2.6.16.43-0.3-smp
Passed on x86_64 with linux-2.6.18-8.el5
Passed on x86_64 with linux-2.6.13
Passed on x86_64 with linux-2.6.22
Passed on x86_64 with linux-2.6.22.5-31-default
Passed on x86_64 with linux-2.6.21.1
Passed on x86_64 with linux-2.6.9-42.ELsmp
Passed on x86_64 with linux-2.6.14
Passed on ia64 with linux-2.6.21.1
Passed on ppc64 with linux-2.6.18-8.el5
Passed on ppc64 with linux-2.6.13
Passed on x86_64 with linux-2.6.9-55.ELsmp
Passed on ia64 with linux-2.6.23
Passed on x86_64 with linux-2.6.16.21-0.8-smp
Passed on ia64 with linux-2.6.16.21-0.8-default
Passed on x86_64 with linux-2.6.18-53.el5

Failed:

From sashak at voltaire.com Tue Jan 1 05:11:32 2008
From: sashak at voltaire.com (Sasha Khapyorsky)
Date: Tue, 1 Jan 2008 13:11:32 +0000
Subject: [ofa-general] Re: [PATCH RFC] opensm: mcast mgr improvements
In-Reply-To: <477A0156.2090307@dev.mellanox.co.il>
References: <4770CDCE.8040200@dev.mellanox.co.il>
	<20071229182718.GA19160@sashak.voltaire.com>
	<1199032710.23289.340.camel@hrosenstock-ws.xsigo.com>
	<20071230181610.GC10650@sashak.voltaire.com>
	<1199115083.23289.359.camel@hrosenstock-ws.xsigo.com>
	<47790C4D.7080405@dev.mellanox.co.il>
	<20071231154815.GD11591@sashak.voltaire.com>
	<477A0156.2090307@dev.mellanox.co.il>
Message-ID: <20080101131131.GA13518@sashak.voltaire.com>

On 11:01 Tue 01 Jan , Yevgeny Kliteynik wrote:
>
> Looks like there was some problem sending the simulation
> report tonight, but test logs show that everything is ok.

Good. I'm applying the patches.

Happy New Year!

Sasha

From sashak at voltaire.com Tue Jan 1 05:12:38 2008
From: sashak at voltaire.com (Sasha Khapyorsky)
Date: Tue, 1 Jan 2008 13:12:38 +0000
Subject: [ofa-general] Re: [PATCH v2] opensm: osm_state_mgr.c - stop idle queue processing if heavy sweep requested
In-Reply-To: <47667D5B.8000802@dev.mellanox.co.il>
References: <47667D5B.8000802@dev.mellanox.co.il>
Message-ID: <20080101131238.GB13518@sashak.voltaire.com>

On 15:44 Mon 17 Dec , Yevgeny Kliteynik wrote:
> If a heavy sweep requested during idle queue processing, OSM continues
> to process it till the end and only then notices the heavy sweep request.
> In some cases this might leave a topology change unhandled for several
> minutes.
>
> Signed-off-by: Yevgeny Kliteynik

Rebased and applied. Thanks.

Sasha

From bunk at kernel.org Tue Jan 1 05:47:10 2008
From: bunk at kernel.org (Adrian Bunk)
Date: Tue, 1 Jan 2008 15:47:10 +0200
Subject: [ofa-general] [2.6 patch] mthca: the scheduled MSI support removal
Message-ID: <20080101134710.GH2360@does.not.exist>

This patch contains the scheduled removal of the MSI support in the
mthca driver.

Signed-off-by: Adrian Bunk

---

 Documentation/feature-removal-schedule.txt | 10 -----
 drivers/infiniband/hw/mthca/mthca_dev.h | 1
 drivers/infiniband/hw/mthca/mthca_eq.c | 6 +--
 drivers/infiniband/hw/mthca/mthca_main.c | 38 ++-------------------
 4 files changed, 7 insertions(+), 48 deletions(-)

c548d98cb3d4b9001be2e7246d1ab08eee58d5e1
diff --git a/Documentation/feature-removal-schedule.txt b/Documentation/feature-removal-schedule.txt
index 93aac19..a9b00f9 100644
--- a/Documentation/feature-removal-schedule.txt
+++ b/Documentation/feature-removal-schedule.txt
@@ -287,16 +287,6 @@ Who: linuxppc-dev at ozlabs.org
 
 ---------------------------
 
-What: mthca driver's MSI support
-When: January 2008
-Files: drivers/infiniband/hw/mthca/*.[ch]
-Why: All mthca hardware also supports MSI-X, which provides
-	strictly more functionality than MSI. So there is no point in
-	having both MSI-X and MSI support in the driver.
-Who: Roland Dreier
-
----------------------------
-
 What: sk98lin network driver
 When: Feburary 2008
 Why: In kernel tree version of driver is unmaintained. Sk98lin driver
diff --git a/drivers/infiniband/hw/mthca/mthca_dev.h b/drivers/infiniband/hw/mthca/mthca_dev.h
index 15aa32e..53159c0 100644
--- a/drivers/infiniband/hw/mthca/mthca_dev.h
+++ b/drivers/infiniband/hw/mthca/mthca_dev.h
@@ -60,7 +60,6 @@
 enum {
 	MTHCA_FLAG_DDR_HIDDEN = 1 << 1,
 	MTHCA_FLAG_SRQ = 1 << 2,
-	MTHCA_FLAG_MSI = 1 << 3,
 	MTHCA_FLAG_MSI_X = 1 << 4,
 	MTHCA_FLAG_NO_LAM = 1 << 5,
 	MTHCA_FLAG_FMR = 1 << 6,
diff --git a/drivers/infiniband/hw/mthca/mthca_eq.c b/drivers/infiniband/hw/mthca/mthca_eq.c
index b29de51..b60eb5d 100644
--- a/drivers/infiniband/hw/mthca/mthca_eq.c
+++ b/drivers/infiniband/hw/mthca/mthca_eq.c
@@ -827,8 +827,7 @@ int mthca_init_eq_table(struct mthca_dev *dev)
 	if (err)
 		goto err_out_free;
 
-	if (dev->mthca_flags & MTHCA_FLAG_MSI ||
-	    dev->mthca_flags & MTHCA_FLAG_MSI_X) {
+	if (dev->mthca_flags & MTHCA_FLAG_MSI_X) {
 		dev->eq_table.clr_mask = 0;
 	} else {
 		dev->eq_table.clr_mask =
@@ -839,8 +838,7 @@ int mthca_init_eq_table(struct mthca_dev *dev)
 
 	dev->eq_table.arm_mask = 0;
 
-	intr = (dev->mthca_flags & MTHCA_FLAG_MSI) ?
-		128 : dev->eq_table.inta_pin;
+	intr = dev->eq_table.inta_pin;
 
 	err = mthca_create_eq(dev, dev->limits.num_cqs + MTHCA_NUM_SPARE_EQE,
 			      (dev->mthca_flags & MTHCA_FLAG_MSI_X) ? 128 : intr,
diff --git a/drivers/infiniband/hw/mthca/mthca_main.c b/drivers/infiniband/hw/mthca/mthca_main.c
index 60de6f9..2fc36ca 100644
--- a/drivers/infiniband/hw/mthca/mthca_main.c
+++ b/drivers/infiniband/hw/mthca/mthca_main.c
@@ -65,14 +65,9 @@ static int msi_x = 1;
 module_param(msi_x, int, 0444);
 MODULE_PARM_DESC(msi_x, "attempt to use MSI-X if nonzero");
 
-static int msi = 0;
-module_param(msi, int, 0444);
-MODULE_PARM_DESC(msi, "attempt to use MSI if nonzero (deprecated, use MSI-X instead)");
-
 #else /* CONFIG_PCI_MSI */
 
 #define msi_x (0)
-#define msi (0)
 
 #endif /* CONFIG_PCI_MSI */
 
@@ -816,13 +811,11 @@ static int mthca_setup_hca(struct mthca_dev *dev)
 
 	err = mthca_NOP(dev, &status);
 	if (err || status) {
-		if (dev->mthca_flags & (MTHCA_FLAG_MSI | MTHCA_FLAG_MSI_X)) {
+		if (dev->mthca_flags & MTHCA_FLAG_MSI_X) {
 			mthca_warn(dev, "NOP command failed to generate interrupt "
 				   "(IRQ %d).\n",
-				   dev->mthca_flags & MTHCA_FLAG_MSI_X ?
-				   dev->eq_table.eq[MTHCA_EQ_CMD].msi_x_vector :
-				   dev->pdev->irq);
-			mthca_warn(dev, "Trying again with MSI/MSI-X disabled.\n");
+				   dev->eq_table.eq[MTHCA_EQ_CMD].msi_x_vector);
+			mthca_warn(dev, "Trying again with MSI-X disabled.\n");
 		} else {
 			mthca_err(dev, "NOP command failed to generate interrupt "
 				  "(IRQ %d), aborting.\n",
@@ -1128,29 +1121,12 @@ static int __mthca_init_one(struct pci_dev *pdev, int hca_type)
 
 	if (msi_x && !mthca_enable_msi_x(mdev))
 		mdev->mthca_flags |= MTHCA_FLAG_MSI_X;
-	else if (msi) {
-		static int warned;
-
-		if (!warned) {
-			printk(KERN_WARNING PFX "WARNING: MSI support will be "
-			       "removed from the ib_mthca driver in January 2008.\n");
-			printk(KERN_WARNING " If you are using MSI and cannot "
-			       "switch to MSI-X, please tell "
-			       ".\n");
-			++warned;
-		}
-
-		if (!pci_enable_msi(pdev))
-			mdev->mthca_flags |= MTHCA_FLAG_MSI;
-	}
 
 	err = mthca_setup_hca(mdev);
-	if (err == -EBUSY && (mdev->mthca_flags & (MTHCA_FLAG_MSI | MTHCA_FLAG_MSI_X))) {
+	if (err == -EBUSY && (mdev->mthca_flags & MTHCA_FLAG_MSI_X)) {
 		if (mdev->mthca_flags & MTHCA_FLAG_MSI_X)
 			pci_disable_msix(pdev);
-		if (mdev->mthca_flags & MTHCA_FLAG_MSI)
-			pci_disable_msi(pdev);
-		mdev->mthca_flags &= ~(MTHCA_FLAG_MSI_X | MTHCA_FLAG_MSI);
+		mdev->mthca_flags &= ~MTHCA_FLAG_MSI_X;
 		err = mthca_setup_hca(mdev);
 	}
 
@@ -1192,8 +1168,6 @@ err_cleanup:
 err_close:
 	if (mdev->mthca_flags & MTHCA_FLAG_MSI_X)
 		pci_disable_msix(pdev);
-	if (mdev->mthca_flags & MTHCA_FLAG_MSI)
-		pci_disable_msi(pdev);
 
 	mthca_close_hca(mdev);
 
@@ -1246,8 +1220,6 @@ static void __mthca_remove_one(struct pci_dev *pdev)
 
 		if (mdev->mthca_flags & MTHCA_FLAG_MSI_X)
 			pci_disable_msix(pdev);
-		if (mdev->mthca_flags & MTHCA_FLAG_MSI)
-			pci_disable_msi(pdev);
 
 		ib_dealloc_device(&mdev->ib_dev);
 		mthca_release_regions(pdev, mdev->mthca_flags &

From kliteyn at dev.mellanox.co.il Tue Jan 1 05:50:10 2008
From: kliteyn at dev.mellanox.co.il (Yevgeny Kliteynik)
Date: Tue, 01 Jan 2008 15:50:10 +0200
Subject: [ofa-general] Re: [PATCH RFC] opensm: mcast mgr improvements
In-Reply-To: <20071229182718.GA19160@sashak.voltaire.com>
References: <4770CDCE.8040200@dev.mellanox.co.il>
	<20071229182718.GA19160@sashak.voltaire.com>
Message-ID: <477A4512.1000207@dev.mellanox.co.il>

Sasha Khapyorsky wrote:
> This improves handling of mcast join/leave requests storming. Now mcast
> routing will be recalculated for all mcast groups where changes occurred
> and not one by one. For this it queues mcast groups instead of mcast
> rerouting requests, this also makes state_mgr idle queue obsolete.
>
> Signed-off-by: Sasha Khapyorsky
> ---
>
> Hi Yevgeny,
>
> For me it looks that it should solve the original problem (mcast group
> list is purged in osm_mcast_mgr_process()). Could you review and ideally
> test it? Thanks.
> > Sasha > > --- > opensm/include/opensm/osm_mcast_mgr.h | 14 +-- > opensm/include/opensm/osm_multicast.h | 2 + > opensm/include/opensm/osm_sm.h | 2 + > opensm/include/opensm/osm_state_mgr.h | 95 ----------------- > opensm/opensm/osm_mcast_mgr.c | 187 +++++++++++++++------------------ > opensm/opensm/osm_sm.c | 70 ++++++------- > opensm/opensm/osm_state_mgr.c | 138 +------------------------ > 7 files changed, 130 insertions(+), 378 deletions(-) > > diff --git a/opensm/include/opensm/osm_mcast_mgr.h b/opensm/include/opensm/osm_mcast_mgr.h > index 3e0b761..47b67ed 100644 > --- a/opensm/include/opensm/osm_mcast_mgr.h > +++ b/opensm/include/opensm/osm_mcast_mgr.h > @@ -100,7 +100,6 @@ typedef struct _osm_mcast_mgr { > osm_req_t *p_req; > osm_log_t *p_log; > cl_plock_t *p_lock; > - > } osm_mcast_mgr_t; > /* > * FIELDS > @@ -253,25 +252,22 @@ osm_signal_t osm_mcast_mgr_process(IN osm_mcast_mgr_t * const p_mgr); > * Multicast Manager, Node Info Response Controller > *********/ > > -/****f* OpenSM: Multicast Manager/osm_mcast_mgr_process_mgrp_cb > +/****f* OpenSM: Multicast Manager/osm_mcast_mgr_process_mgroups > * NAME > -* osm_mcast_mgr_process_mgrp_cb > +* osm_mcast_mgr_process_mgroups > * > * DESCRIPTION > -* Callback entry point for the osm_mcast_mgr_process_mgrp function. > +* Process only requested mcast groups. > * > * SYNOPSIS > */ > osm_signal_t > -osm_mcast_mgr_process_mgrp_cb(IN void *const Context1, IN void *const Context2); > +osm_mcast_mgr_process_mgroups(IN osm_mcast_mgr_t *p_mgr); > /* > * PARAMETERS > -* (Context1) p_mgr > +* p_mgr > * [in] Pointer to an osm_mcast_mgr_t object. > * > -* (Context2) p_mgrp > -* [in] Pointer to the multicast group to process. 
> -* > * RETURN VALUES > * IB_SUCCESS > * > diff --git a/opensm/include/opensm/osm_multicast.h b/opensm/include/opensm/osm_multicast.h > index 729a2ea..f442a45 100644 > --- a/opensm/include/opensm/osm_multicast.h > +++ b/opensm/include/opensm/osm_multicast.h > @@ -50,6 +50,7 @@ > > #include > #include > +#include > #include > #include > #include > @@ -121,6 +122,7 @@ const char *osm_get_mcast_req_type_str(IN osm_mcast_req_type_t req_type); > * SYNOPSIS > */ > typedef struct osm_mcast_mgr_ctxt { > + cl_list_item_t list_item; > ib_net16_t mlid; > osm_mcast_req_type_t req_type; > ib_net64_t port_guid; > diff --git a/opensm/include/opensm/osm_sm.h b/opensm/include/opensm/osm_sm.h > index 4c6ce27..a676cd6 100644 > --- a/opensm/include/opensm/osm_sm.h > +++ b/opensm/include/opensm/osm_sm.h > @@ -140,6 +140,8 @@ typedef struct osm_sm { > cl_dispatcher_t *p_disp; > cl_plock_t *p_lock; > atomic32_t sm_trans_id; > + cl_spinlock_t mgrp_lock; > + cl_qlist_t mgrp_list; > osm_req_t req; > osm_resp_t resp; > osm_ni_rcv_t ni_rcv; > diff --git a/opensm/include/opensm/osm_state_mgr.h b/opensm/include/opensm/osm_state_mgr.h > index dada097..f51593a 100644 > --- a/opensm/include/opensm/osm_state_mgr.h > +++ b/opensm/include/opensm/osm_state_mgr.h > @@ -109,8 +109,6 @@ typedef struct _osm_state_mgr { > osm_stats_t *p_stats; > struct _osm_sm_state_mgr *p_sm_state_mgr; > const osm_sm_mad_ctrl_t *p_mad_ctrl; > - cl_spinlock_t idle_lock; > - cl_qlist_t idle_time_list; > cl_plock_t *p_lock; > cl_event_t *p_subnet_up_event; > osm_sm_state_t state; > @@ -172,99 +170,6 @@ typedef struct _osm_state_mgr { > * State Manager object > *********/ > > -/****s* OpenSM: State Manager/_osm_idle_item > -* NAME > -* _osm_idle_item > -* > -* DESCRIPTION > -* Idle item. 
> -* > -* SYNOPSIS > -*/ > - > -typedef osm_signal_t(*osm_pfn_start_t) (IN void *context1, IN void *context2); > - > -typedef void > - (*osm_pfn_done_t) (IN void *context1, IN void *context2); > - > -typedef struct _osm_idle_item { > - cl_list_item_t list_item; > - void *context1; > - void *context2; > - osm_pfn_start_t pfn_start; > - osm_pfn_done_t pfn_done; > -} osm_idle_item_t; > - > -/* > -* FIELDS > -* list_item > -* list item. > -* > -* context1 > -* Context pointer > -* > -* context2 > -* Context pointer > -* > -* pfn_start > -* Pointer to the start function. > -* > -* pfn_done > -* Pointer to the dine function. > -* SEE ALSO > -* State Manager object > -*********/ > - > -/****f* OpenSM: State Manager/osm_state_mgr_process_idle > -* NAME > -* osm_state_mgr_process_idle > -* > -* DESCRIPTION > -* Formulates the osm_idle_item and inserts it into the queue and > -* signals the state manager. > -* > -* SYNOPSIS > -*/ > - > -ib_api_status_t > -osm_state_mgr_process_idle(IN osm_state_mgr_t * const p_mgr, > - IN osm_pfn_start_t pfn_start, > - IN osm_pfn_done_t pfn_done, > - void *context1, void *context2); > - > -/* > -* PARAMETERS > -* p_mgr > -* [in] Pointer to a State Manager object to construct. > -* > -* pfn_start > -* [in] Pointer the start function which will be called at > -* idle time. > -* > -* pfn_done > -* [in] pointer the done function which will be called > -* when outstanding smps is zero > -* > -* context1 > -* [in] Pointer to void > -* > -* context2 > -* [in] Pointer to void > -* > -* RETURN VALUE > -* IB_SUCCESS or IB_ERROR > -* > -* NOTES > -* Allows osm_state_mgr_destroy > -* > -* Calling osm_state_mgr_construct is a prerequisite to calling any other > -* method except osm_state_mgr_init. 
> -* > -* SEE ALSO > -* State Manager object, osm_state_mgr_init, > -* osm_state_mgr_destroy > -*********/ > - > /****f* OpenSM: State Manager/osm_state_mgr_construct > * NAME > * osm_state_mgr_construct > diff --git a/opensm/opensm/osm_mcast_mgr.c b/opensm/opensm/osm_mcast_mgr.c > index 50b95fd..f51a45a 100644 > --- a/opensm/opensm/osm_mcast_mgr.c > +++ b/opensm/opensm/osm_mcast_mgr.c > @@ -815,7 +815,7 @@ static osm_mtree_node_t *__osm_mcast_mgr_branch(osm_mcast_mgr_t * const p_mgr, > } > > free(list_array); > - Exit: > +Exit: > OSM_LOG_EXIT(p_mgr->p_log); > return (p_mtn); > } > @@ -932,7 +932,7 @@ __osm_mcast_mgr_build_spanning_tree(osm_mcast_mgr_t * const p_mgr, > "Configured MLID 0x%X for %u ports, max tree depth = %u\n", > cl_ntoh16(osm_mgrp_get_mlid(p_mgrp)), count, max_depth); > > - Exit: > +Exit: > OSM_LOG_EXIT(p_mgr->p_log); > return (status); > } > @@ -1171,7 +1171,7 @@ osm_mcast_mgr_process_single(IN osm_mcast_mgr_t * const p_mgr, > } > } > > - Exit: > +Exit: > OSM_LOG_EXIT(p_mgr->p_log); > return (status); > } > @@ -1254,63 +1254,55 @@ osm_mcast_mgr_process_tree(IN osm_mcast_mgr_t * const p_mgr, > port_guid); > } > > - Exit: > +Exit: > OSM_LOG_EXIT(p_mgr->p_log); > return (status); > } > > /********************************************************************** > Process the entire group. > - > NOTE : The lock should be held externally! 
> **********************************************************************/ > -static osm_signal_t > -osm_mcast_mgr_process_mgrp(IN osm_mcast_mgr_t * const p_mgr, > - IN osm_mgrp_t * const p_mgrp, > - IN osm_mcast_req_type_t req_type, > - IN ib_net64_t port_guid) > +static ib_api_status_t > +mcast_mgr_process_mgrp(IN osm_mcast_mgr_t * const p_mgr, > + IN osm_mgrp_t * const p_mgrp, > + IN osm_mcast_req_type_t req_type, > + IN ib_net64_t port_guid) > { > - osm_signal_t signal = OSM_SIGNAL_DONE; > ib_api_status_t status; > - osm_switch_t *p_sw; > - cl_qmap_t *p_sw_tbl; > - boolean_t pending_transactions = FALSE; > > OSM_LOG_ENTER(p_mgr->p_log, osm_mcast_mgr_process_mgrp); > > - p_sw_tbl = &p_mgr->p_subn->sw_guid_tbl; > - > status = osm_mcast_mgr_process_tree(p_mgr, p_mgrp, req_type, port_guid); > if (status != IB_SUCCESS) { > osm_log(p_mgr->p_log, OSM_LOG_ERROR, > - "osm_mcast_mgr_process_mgrp: ERR 0A19: " > + "mcast_mgr_process_mgrp: ERR 0A19: " > "Unable to create spanning tree (%s)\n", > ib_get_err_str(status)); > - > goto Exit; > } > + p_mgrp->last_tree_id = p_mgrp->last_change_id; > > - /* > - Walk the switches and download the tables for each. 
> + /* Remove MGRP only if osm_mcm_port_t count is 0 and > + * Not a well known group > */ > - p_sw = (osm_switch_t *) cl_qmap_head(p_sw_tbl); > - while (p_sw != (osm_switch_t *) cl_qmap_end(p_sw_tbl)) { > - signal = __osm_mcast_mgr_set_tbl(p_mgr, p_sw); > - if (signal == OSM_SIGNAL_DONE_PENDING) > - pending_transactions = TRUE; > - > - p_sw = (osm_switch_t *) cl_qmap_next(&p_sw->map_item); > + if (cl_qmap_count(&p_mgrp->mcm_port_tbl) == 0 && !p_mgrp->well_known) { > + osm_log(p_mgr->p_log, OSM_LOG_DEBUG, > + "mcast_mgr_process_mgrp: " > + "Destroying mgrp with lid:0x%X\n", > + cl_ntoh16(p_mgrp->mlid)); > + /* Send a Report to any InformInfo registered for > + Trap 67 : MCGroup delete */ > + osm_mgrp_send_delete_notice(p_mgr->p_subn, p_mgr->p_log, > + p_mgrp); > + cl_qmap_remove_item(&p_mgr->p_subn->mgrp_mlid_tbl, > + (cl_map_item_t *) p_mgrp); > + osm_mgrp_delete(p_mgrp); > } > > - osm_dump_mcast_routes(p_mgr->p_subn->p_osm); > - > - Exit: > +Exit: > OSM_LOG_EXIT(p_mgr->p_log); > - > - if (pending_transactions == TRUE) > - return (OSM_SIGNAL_DONE_PENDING); > - else > - return (OSM_SIGNAL_DONE); > + return status; > } > > /********************************************************************** > @@ -1321,14 +1313,13 @@ osm_signal_t osm_mcast_mgr_process(IN osm_mcast_mgr_t * const p_mgr) > osm_switch_t *p_sw; > cl_qmap_t *p_sw_tbl; > cl_qmap_t *p_mcast_tbl; > + cl_qlist_t *p_list = &p_mgr->p_subn->p_osm->sm.mgrp_list; > osm_mgrp_t *p_mgrp; > - ib_api_status_t status; > boolean_t pending_transactions = FALSE; > > OSM_LOG_ENTER(p_mgr->p_log, osm_mcast_mgr_process); > > p_sw_tbl = &p_mgr->p_subn->sw_guid_tbl; > - > p_mcast_tbl = &p_mgr->p_subn->mgrp_mlid_tbl; > /* > While holding the lock, iterate over all the established > @@ -1343,16 +1334,8 @@ osm_signal_t osm_mcast_mgr_process(IN osm_mcast_mgr_t * const p_mgr) > /* We reached here due to some change that caused a heavy sweep > of the subnet. Not due to a specific multicast request. 
> So the request type is subnet_change and the port guid is 0. */ > - status = osm_mcast_mgr_process_tree(p_mgr, p_mgrp, > - OSM_MCAST_REQ_TYPE_SUBNET_CHANGE, > - 0); > - if (status != IB_SUCCESS) { > - osm_log(p_mgr->p_log, OSM_LOG_ERROR, > - "osm_mcast_mgr_process: ERR 0A20: " > - "Unable to create spanning tree (%s)\n", > - ib_get_err_str(status)); > - } > - > + mcast_mgr_process_mgrp(p_mgr, p_mgrp, > + OSM_MCAST_REQ_TYPE_SUBNET_CHANGE, 0); > p_mgrp = (osm_mgrp_t *) cl_qmap_next(&p_mgrp->map_item); > } > > @@ -1364,10 +1347,14 @@ osm_signal_t osm_mcast_mgr_process(IN osm_mcast_mgr_t * const p_mgr) > signal = __osm_mcast_mgr_set_tbl(p_mgr, p_sw); > if (signal == OSM_SIGNAL_DONE_PENDING) > pending_transactions = TRUE; > - > p_sw = (osm_switch_t *) cl_qmap_next(&p_sw->map_item); > } > > + while (!cl_is_qlist_empty(p_list)) { > + cl_list_item_t *p = cl_qlist_remove_head(p_list); > + free(p); > + } > + > CL_PLOCK_RELEASE(p_mgr->p_lock); > > OSM_LOG_EXIT(p_mgr->p_log); > @@ -1395,79 +1382,79 @@ osm_mgrp_t *__get_mgrp_by_mlid(IN osm_mcast_mgr_t * const p_mgr, > > /********************************************************************** > This is the function that is invoked during idle time to handle the > - process request. Context1 is simply the osm_mcast_mgr_t*, Context2 > - hold the mlid, port guid and action (join/leave/delete) required. > + process request for mcast groups where join/leave/delete was required. 
> **********************************************************************/ > -osm_signal_t > -osm_mcast_mgr_process_mgrp_cb(IN void *const Context1, IN void *const Context2) > +osm_signal_t osm_mcast_mgr_process_mgroups(osm_mcast_mgr_t * p_mgr) > { > - osm_mcast_mgr_t *p_mgr = (osm_mcast_mgr_t *) Context1; > + cl_qlist_t *p_list = &p_mgr->p_subn->p_osm->sm.mgrp_list; > + osm_switch_t *p_sw; > + cl_qmap_t *p_sw_tbl; > osm_mgrp_t *p_mgrp; > ib_net16_t mlid; > - osm_signal_t signal = OSM_SIGNAL_DONE; > - osm_mcast_mgr_ctxt_t *p_ctxt = (osm_mcast_mgr_ctxt_t *) Context2; > - osm_mcast_req_type_t req_type = p_ctxt->req_type; > - ib_net64_t port_guid = p_ctxt->port_guid; > - > - OSM_LOG_ENTER(p_mgr->p_log, osm_mcast_mgr_process_mgrp_cb); > - > - /* nice copy no warning on size diff */ > - memcpy(&mlid, &p_ctxt->mlid, sizeof(mlid)); > + osm_signal_t ret, signal = OSM_SIGNAL_DONE; > + osm_mcast_mgr_ctxt_t *ctx; > + osm_mcast_req_type_t req_type; > + ib_net64_t port_guid; > > - /* we can destroy the context now */ > - free(p_ctxt); > + OSM_LOG_ENTER(p_mgr->p_log, osm_mcast_mgr_process_mgroups); > > /* we need a lock to make sure the p_mgrp is not change other ways */ > CL_PLOCK_EXCL_ACQUIRE(p_mgr->p_lock); > - p_mgrp = __get_mgrp_by_mlid(p_mgr, mlid); > > - /* since we delayed the execution we prefer to pass the > - mlid as the mgrp identifier and then find it or abort */ > + if (cl_is_qlist_empty(p_list)) { > + CL_PLOCK_RELEASE(p_mgr->p_lock); > + return OSM_SIGNAL_NONE; > + } > + > + while (!cl_is_qlist_empty(p_list)) { > + ctx = (osm_mcast_mgr_ctxt_t *) cl_qlist_remove_head(p_list); > + req_type = ctx->req_type; > + port_guid = ctx->port_guid; > + > + /* nice copy no warning on size diff */ > + memcpy(&mlid, &ctx->mlid, sizeof(mlid)); > > - if (p_mgrp) { > + /* we can destroy the context now */ > + free(ctx); > + > + /* since we delayed the execution we prefer to pass the > + mlid as the mgrp identifier and then find it or abort */ > + p_mgrp = __get_mgrp_by_mlid(p_mgr, 
mlid); > + if (!p_mgrp) > + continue; > > - /* if there was no change from the last time we processed the group > - we can skip doing anything > + /* if there was no change from the last time > + * we processed the group we can skip doing anything > */ > if (p_mgrp->last_change_id == p_mgrp->last_tree_id) { > osm_log(p_mgr->p_log, OSM_LOG_DEBUG, > - "osm_mcast_mgr_process_mgrp_cb: " > + "osm_mcast_mgr_process_mgroups: " > "Skip processing mgrp with lid:0x%X change id:%u\n", > cl_ntoh16(mlid), p_mgrp->last_change_id); > - } else { > - osm_log(p_mgr->p_log, OSM_LOG_DEBUG, > - "osm_mcast_mgr_process_mgrp_cb: " > - "Processing mgrp with lid:0x%X change id:%u\n", > - cl_ntoh16(mlid), p_mgrp->last_change_id); > - > - signal = > - osm_mcast_mgr_process_mgrp(p_mgr, p_mgrp, req_type, > - port_guid); > - p_mgrp->last_tree_id = p_mgrp->last_change_id; > + continue; > } > > - /* Remove MGRP only if osm_mcm_port_t count is 0 and > - * Not a well known group > - */ > - if ((0x0 == cl_qmap_count(&p_mgrp->mcm_port_tbl)) && > - (p_mgrp->well_known == FALSE)) { > - osm_log(p_mgr->p_log, OSM_LOG_DEBUG, > - "osm_mcast_mgr_process_mgrp_cb: " > - "Destroying mgrp with lid:0x%X\n", > - cl_ntoh16(mlid)); > - > - /* Send a Report to any InformInfo registered for > - Trap 67 : MCGroup delete */ > - osm_mgrp_send_delete_notice(p_mgr->p_subn, p_mgr->p_log, > - p_mgrp); > - > - cl_qmap_remove_item(&p_mgr->p_subn->mgrp_mlid_tbl, > - (cl_map_item_t *) p_mgrp); > + osm_log(p_mgr->p_log, OSM_LOG_DEBUG, > + "osm_mcast_mgr_process_mgroups: " > + "Processing mgrp with lid:0x%X change id:%u\n", > + cl_ntoh16(mlid), p_mgrp->last_change_id); > + mcast_mgr_process_mgrp(p_mgr, p_mgrp, req_type, port_guid); > + } > > - osm_mgrp_delete(p_mgrp); > - } > + /* > + Walk the switches and download the tables for each. 
> +	 */
> +	p_sw_tbl = &p_mgr->p_subn->sw_guid_tbl;
> +	p_sw = (osm_switch_t *) cl_qmap_head(p_sw_tbl);
> +	while (p_sw != (osm_switch_t *) cl_qmap_end(p_sw_tbl)) {
> +		ret = __osm_mcast_mgr_set_tbl(p_mgr, p_sw);
> +		if (ret == OSM_SIGNAL_DONE_PENDING)
> +			signal = ret;
> +		p_sw = (osm_switch_t *) cl_qmap_next(&p_sw->map_item);
>  	}

So basically what you do here is process all the idle queue requests
one by one, but actually update the mcast tables on the switches only
once, at the end of idle queue processing. This should speed up mcast
handling significantly, which is great.

However, let's get back to the original problem that triggered this
change. During a join/leave "burst" in a big fabric (say, a couple of
thousand CAs) with hundreds of mcast groups, this can still be a
problem. If there are hundreds of requests in the idle queue,
osm_mcast_mgr_process_mgroups() will return only after it has processed
the whole idle queue. So if there is a topology change in the fabric
during that time and an immediate heavy sweep is requested, the OSM
state mgr will notice it only after the idle queue has been processed,
right?

Any idea how fast the processing is right now? In the original problem
I saw osm busy with mcast for more than 10 minutes. How long will it
take now? Even if it's 10 times faster, 1 minute of unattended topology
change is still too long.
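
A minimal sketch of the concern above, not code from the patch or from
OpenSM: a batch drain that re-checks a "heavy sweep requested" flag
before each item and returns early, leaving the remaining requests
queued. Every name here (req_t, queue_t, drain_queue, demo_process and
the flag itself) is hypothetical, and the per-request work is reduced
to a callback.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one queued mcast request. */
typedef struct req {
	struct req *next;
	unsigned mlid;
} req_t;

/* Hypothetical FIFO of pending requests. */
typedef struct queue {
	req_t *head;
	req_t *tail;
} queue_t;

static void queue_push(queue_t *q, req_t *r)
{
	assert(q && r);
	r->next = NULL;
	if (q->tail)
		q->tail->next = r;
	else
		q->head = r;
	q->tail = r;
}

static req_t *queue_pop(queue_t *q)
{
	req_t *r = q->head;

	if (!r)
		return NULL;
	q->head = r->next;
	if (!q->head)
		q->tail = NULL;
	return r;
}

/*
 * Drain the queue, but stop as soon as *heavy_sweep_requested is set.
 * Returns the number of requests processed; anything not yet popped
 * stays queued for the next idle-time pass.
 */
static size_t drain_queue(queue_t *q,
			  const volatile bool *heavy_sweep_requested,
			  void (*process)(req_t *r, void *ctx), void *ctx)
{
	size_t done = 0;
	req_t *r;

	while (!*heavy_sweep_requested && (r = queue_pop(q)) != NULL) {
		process(r, ctx);
		done++;
	}
	return done;
}

/* Demo callback state: raises the flag after a given number of items. */
typedef struct demo_ctx {
	size_t seen;
	size_t raise_after;
	volatile bool *flag;
} demo_ctx_t;

/* Demo callback: counts requests and simulates a sweep request arriving. */
static void demo_process(req_t *r, void *ctx)
{
	demo_ctx_t *d = ctx;

	(void)r;
	if (++d->seen == d->raise_after)
		*d->flag = true;
}
```

The only point of the sketch is that the check happens per request
rather than once per batch, so the delay before a sweep is noticed is
bounded by one request instead of by the queue length.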
-- Yevgeny > + osm_dump_mcast_routes(p_mgr->p_subn->p_osm); > + > CL_PLOCK_RELEASE(p_mgr->p_lock); > OSM_LOG_EXIT(p_mgr->p_log); > return signal; > diff --git a/opensm/opensm/osm_sm.c b/opensm/opensm/osm_sm.c > index 88e6d4a..b295a77 100644 > --- a/opensm/opensm/osm_sm.c > +++ b/opensm/opensm/osm_sm.c > @@ -144,6 +144,7 @@ void osm_sm_construct(IN osm_sm_t * const p_sm) > cl_event_construct(&p_sm->signal_event); > cl_event_construct(&p_sm->subnet_up_event); > cl_thread_construct(&p_sm->sweeper); > + cl_spinlock_construct(&p_sm->mgrp_lock); > osm_req_construct(&p_sm->req); > osm_resp_construct(&p_sm->resp); > osm_ni_rcv_construct(&p_sm->ni_rcv); > @@ -245,6 +246,7 @@ void osm_sm_destroy(IN osm_sm_t * const p_sm) > cl_event_destroy(&p_sm->signal_event); > cl_event_destroy(&p_sm->subnet_up_event); > cl_spinlock_destroy(&p_sm->signal_lock); > + cl_spinlock_destroy(&p_sm->mgrp_lock); > > osm_log(p_sm->p_log, OSM_LOG_SYS, "Exiting SM\n"); /* Format Waived */ > OSM_LOG_EXIT(p_sm->p_log); > @@ -292,6 +294,12 @@ osm_sm_init(IN osm_sm_t * const p_sm, > if (status != CL_SUCCESS) > goto Exit; > > + cl_qlist_init(&p_sm->mgrp_list); > + > + status = cl_spinlock_init(&p_sm->mgrp_lock); > + if (status != CL_SUCCESS) > + goto Exit; > + > status = osm_sm_mad_ctrl_init(&p_sm->mad_ctrl, > p_sm->p_subn, > p_sm->p_mad_pool, > @@ -551,32 +559,43 @@ osm_sm_bind(IN osm_sm_t * const p_sm, IN const ib_net64_t port_guid) > /********************************************************************** > **********************************************************************/ > static ib_api_status_t > -__osm_sm_mgrp_connect(IN osm_sm_t * const p_sm, > +__osm_sm_mgrp_process(IN osm_sm_t * const p_sm, > IN osm_mgrp_t * const p_mgrp, > IN const ib_net64_t port_guid, > IN osm_mcast_req_type_t req_type) > { > - ib_api_status_t status; > osm_mcast_mgr_ctxt_t *ctx2; > > - OSM_LOG_ENTER(p_sm->p_log, __osm_sm_mgrp_connect); > - > /* > * 'Schedule' all the QP0 traffic for when the state manager > * isn't busy 
trying to do something else. > */ > ctx2 = (osm_mcast_mgr_ctxt_t *) malloc(sizeof(osm_mcast_mgr_ctxt_t)); > + if (!ctx2) > + return IB_ERROR; > + memset(ctx2, 0, sizeof(*ctx2)); > memcpy(&ctx2->mlid, &p_mgrp->mlid, sizeof(p_mgrp->mlid)); > ctx2->req_type = req_type; > ctx2->port_guid = port_guid; > > - status = osm_state_mgr_process_idle(&p_sm->state_mgr, > - osm_mcast_mgr_process_mgrp_cb, > - NULL, &p_sm->mcast_mgr, > - (void *)ctx2); > + cl_spinlock_acquire(&p_sm->mgrp_lock); > + cl_qlist_insert_tail(&p_sm->mgrp_list, &ctx2->list_item); > + cl_spinlock_release(&p_sm->mgrp_lock); > > - OSM_LOG_EXIT(p_sm->p_log); > - return (status); > + osm_sm_signal(p_sm, OSM_SIGNAL_IDLE_TIME_PROCESS_REQUEST); > + > + return IB_SUCCESS; > +} > + > +/********************************************************************** > + **********************************************************************/ > +static ib_api_status_t > +__osm_sm_mgrp_connect(IN osm_sm_t * const p_sm, > + IN osm_mgrp_t * const p_mgrp, > + IN const ib_net64_t port_guid, > + IN osm_mcast_req_type_t req_type) > +{ > + return __osm_sm_mgrp_process(p_sm, p_mgrp, port_guid, req_type); > } > > /********************************************************************** > @@ -586,31 +605,7 @@ __osm_sm_mgrp_disconnect(IN osm_sm_t * const p_sm, > IN osm_mgrp_t * const p_mgrp, > IN const ib_net64_t port_guid) > { > - ib_api_status_t status; > - osm_mcast_mgr_ctxt_t *ctx2; > - > - OSM_LOG_ENTER(p_sm->p_log, __osm_sm_mgrp_disconnect); > - > - /* > - * 'Schedule' all the QP0 traffic for when the state manager > - * isn't busy trying to do something else. 
> - */ > - ctx2 = (osm_mcast_mgr_ctxt_t *) malloc(sizeof(osm_mcast_mgr_ctxt_t)); > - memcpy(&ctx2->mlid, &p_mgrp->mlid, sizeof(p_mgrp->mlid)); > - ctx2->req_type = OSM_MCAST_REQ_TYPE_LEAVE; > - ctx2->port_guid = port_guid; > - > - status = osm_state_mgr_process_idle(&p_sm->state_mgr, > - osm_mcast_mgr_process_mgrp_cb, > - NULL, &p_sm->mcast_mgr, ctx2); > - if (status != IB_SUCCESS) { > - osm_log(p_sm->p_log, OSM_LOG_ERROR, > - "__osm_sm_mgrp_disconnect: ERR 2E11: " > - "Failure processing multicast group (%s)\n", > - ib_get_err_str(status)); > - } > - > - OSM_LOG_EXIT(p_sm->p_log); > + __osm_sm_mgrp_process(p_sm, p_mgrp, port_guid, OSM_MCAST_REQ_TYPE_LEAVE); > } > > /********************************************************************** > @@ -719,8 +714,8 @@ osm_sm_mcgrp_join(IN osm_sm_t * const p_sm, > goto Exit; > } > > - CL_PLOCK_RELEASE(p_sm->p_lock); > status = __osm_sm_mgrp_connect(p_sm, p_mgrp, port_guid, req_type); > + CL_PLOCK_RELEASE(p_sm->p_lock); > > Exit: > OSM_LOG_EXIT(p_sm->p_log); > @@ -782,9 +777,8 @@ osm_sm_mcgrp_leave(IN osm_sm_t * const p_sm, > > osm_port_remove_mgrp(p_port, mlid); > > - CL_PLOCK_RELEASE(p_sm->p_lock); > - > __osm_sm_mgrp_disconnect(p_sm, p_mgrp, port_guid); > + CL_PLOCK_RELEASE(p_sm->p_lock); > > Exit: > OSM_LOG_EXIT(p_sm->p_log); > diff --git a/opensm/opensm/osm_state_mgr.c b/opensm/opensm/osm_state_mgr.c > index 5c39f11..d4dd782 100644 > --- a/opensm/opensm/osm_state_mgr.c > +++ b/opensm/opensm/osm_state_mgr.c > @@ -76,7 +76,6 @@ osm_signal_t osm_qos_setup(IN osm_opensm_t * p_osm); > void osm_state_mgr_construct(IN osm_state_mgr_t * const p_mgr) > { > memset(p_mgr, 0, sizeof(*p_mgr)); > - cl_spinlock_construct(&p_mgr->idle_lock); > p_mgr->state = OSM_SM_STATE_INIT; > } > > @@ -88,9 +87,6 @@ void osm_state_mgr_destroy(IN osm_state_mgr_t * const p_mgr) > > OSM_LOG_ENTER(p_mgr->p_log, osm_state_mgr_destroy); > > - /* destroy the locks */ > - cl_spinlock_destroy(&p_mgr->idle_lock); > - > OSM_LOG_EXIT(p_mgr->p_log); > } > > @@ 
-112,8 +108,6 @@ osm_state_mgr_init(IN osm_state_mgr_t * const p_mgr, > IN cl_event_t * const p_subnet_up_event, > IN osm_log_t * const p_log) > { > - cl_status_t status; > - > OSM_LOG_ENTER(p_log, osm_state_mgr_init); > > CL_ASSERT(p_subn); > @@ -145,17 +139,8 @@ osm_state_mgr_init(IN osm_state_mgr_t * const p_mgr, > p_mgr->p_lock = p_lock; > p_mgr->p_subnet_up_event = p_subnet_up_event; > > - cl_qlist_init(&p_mgr->idle_time_list); > - > - status = cl_spinlock_init(&p_mgr->idle_lock); > - if (status != CL_SUCCESS) { > - osm_log(p_mgr->p_log, OSM_LOG_ERROR, > - "osm_state_mgr_init: ERR 3302: " > - "Spinlock init failed (%s)\n", CL_STATUS_MSG(status)); > - } > - > OSM_LOG_EXIT(p_mgr->p_log); > - return (status); > + return IB_SUCCESS; > } > > /********************************************************************** > @@ -989,79 +974,6 @@ static ib_api_status_t __osm_state_mgr_light_sweep_start(IN osm_state_mgr_t * > } > > /********************************************************************** > - **********************************************************************/ > -static void __process_idle_time_queue_done(IN osm_state_mgr_t * const p_mgr) > -{ > - cl_qlist_t *p_list = &p_mgr->idle_time_list; > - cl_list_item_t *p_list_item; > - osm_idle_item_t *p_process_item; > - > - OSM_LOG_ENTER(p_mgr->p_log, __process_idle_time_queue_done); > - > - cl_spinlock_acquire(&p_mgr->idle_lock); > - p_list_item = cl_qlist_remove_head(p_list); > - > - if (p_list_item == cl_qlist_end(p_list)) { > - cl_spinlock_release(&p_mgr->idle_lock); > - osm_log(p_mgr->p_log, OSM_LOG_ERROR, > - "__process_idle_time_queue_done: ERR 3314: " > - "Idle time queue is empty\n"); > - return; > - } > - cl_spinlock_release(&p_mgr->idle_lock); > - > - p_process_item = (osm_idle_item_t *) p_list_item; > - > - if (p_process_item->pfn_done) { > - > - p_process_item->pfn_done(p_process_item->context1, > - p_process_item->context2); > - } > - > - free(p_process_item); > - > - OSM_LOG_EXIT(p_mgr->p_log); > - 
return; > -} > - > -/********************************************************************** > - **********************************************************************/ > -static osm_signal_t __process_idle_time_queue_start(IN osm_state_mgr_t * > - const p_mgr) > -{ > - cl_qlist_t *p_list = &p_mgr->idle_time_list; > - cl_list_item_t *p_list_item; > - osm_idle_item_t *p_process_item; > - osm_signal_t signal; > - > - OSM_LOG_ENTER(p_mgr->p_log, __process_idle_time_queue_start); > - > - cl_spinlock_acquire(&p_mgr->idle_lock); > - > - p_list_item = cl_qlist_head(p_list); > - if (p_list_item == cl_qlist_end(p_list)) { > - cl_spinlock_release(&p_mgr->idle_lock); > - OSM_LOG_EXIT(p_mgr->p_log); > - return OSM_SIGNAL_NONE; > - } > - > - cl_spinlock_release(&p_mgr->idle_lock); > - > - p_process_item = (osm_idle_item_t *) p_list_item; > - > - CL_ASSERT(p_process_item->pfn_start); > - > - signal = > - p_process_item->pfn_start(p_process_item->context1, > - p_process_item->context2); > - > - CL_ASSERT(signal != OSM_SIGNAL_NONE); > - > - OSM_LOG_EXIT(p_mgr->p_log); > - return signal; > -} > - > -/********************************************************************** > * Go over all the remote SMs (as updated in the sm_guid_tbl). > * Find if there is a remote sm that is a master SM. 
> * If there is a remote master SM - return a pointer to it, > @@ -1558,7 +1470,7 @@ void osm_state_mgr_process(IN osm_state_mgr_t * const p_mgr, > case OSM_SM_STATE_PROCESS_REQUEST: > switch (signal) { > case OSM_SIGNAL_IDLE_TIME_PROCESS: > - signal = __process_idle_time_queue_start(p_mgr); > + signal = osm_mcast_mgr_process_mgroups(p_mgr->p_mcast_mgr); > switch (signal) { > case OSM_SIGNAL_NONE: > p_mgr->state = OSM_SM_STATE_IDLE; > @@ -1604,14 +1516,6 @@ void osm_state_mgr_process(IN osm_state_mgr_t * const p_mgr, > switch (signal) { > case OSM_SIGNAL_NO_PENDING_TRANSACTIONS: > case OSM_SIGNAL_DONE: > - /* CALL the done function */ > - __process_idle_time_queue_done(p_mgr); > - > - /* > - * Set the signal to OSM_SIGNAL_IDLE_TIME_PROCESS > - * so that the next element in the queue gets processed > - */ > - > signal = OSM_SIGNAL_IDLE_TIME_PROCESS; > p_mgr->state = OSM_SM_STATE_PROCESS_REQUEST; > break; > @@ -2424,41 +2328,3 @@ void osm_state_mgr_process(IN osm_state_mgr_t * const p_mgr, > > OSM_LOG_EXIT(p_mgr->p_log); > } > - > -/********************************************************************** > - **********************************************************************/ > -ib_api_status_t > -osm_state_mgr_process_idle(IN osm_state_mgr_t * const p_mgr, > - IN osm_pfn_start_t pfn_start, > - IN osm_pfn_done_t pfn_done, void *context1, > - void *context2) > -{ > - osm_idle_item_t *p_idle_item; > - > - OSM_LOG_ENTER(p_mgr->p_log, osm_state_mgr_process_idle); > - > - p_idle_item = malloc(sizeof(osm_idle_item_t)); > - if (p_idle_item == NULL) { > - osm_log(p_mgr->p_log, OSM_LOG_ERROR, > - "osm_state_mgr_process_idle: ERR 3321: " > - "insufficient memory\n"); > - return IB_ERROR; > - } > - > - memset(p_idle_item, 0, sizeof(osm_idle_item_t)); > - p_idle_item->pfn_start = pfn_start; > - p_idle_item->pfn_done = pfn_done; > - p_idle_item->context1 = context1; > - p_idle_item->context2 = context2; > - > - cl_spinlock_acquire(&p_mgr->idle_lock); > - 
cl_qlist_insert_tail(&p_mgr->idle_time_list, &p_idle_item->list_item); > - cl_spinlock_release(&p_mgr->idle_lock); > - > - osm_sm_signal(&p_mgr->p_subn->p_osm->sm, > - OSM_SIGNAL_IDLE_TIME_PROCESS_REQUEST); > - > - OSM_LOG_EXIT(p_mgr->p_log); > - > - return IB_SUCCESS; > -} From sashak at voltaire.com Tue Jan 1 07:20:37 2008 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Tue, 1 Jan 2008 15:20:37 +0000 Subject: [ofa-general] Re: [PATCH RFC] opensm: mcast mgr improvements In-Reply-To: <477A4512.1000207@dev.mellanox.co.il> References: <4770CDCE.8040200@dev.mellanox.co.il> <20071229182718.GA19160@sashak.voltaire.com> <477A4512.1000207@dev.mellanox.co.il> Message-ID: <20080101152037.GE13518@sashak.voltaire.com> On 15:50 Tue 01 Jan , Yevgeny Kliteynik wrote: > > @@ -1395,79 +1382,79 @@ osm_mgrp_t *__get_mgrp_by_mlid(IN osm_mcast_mgr_t * > > const p_mgr, > > /********************************************************************** > > This is the function that is invoked during idle time to handle the > > - process request. Context1 is simply the osm_mcast_mgr_t*, Context2 > > - hold the mlid, port guid and action (join/leave/delete) required. > > + process request for mcast groups where join/leave/delete was required. 
> > **********************************************************************/ > > -osm_signal_t > > -osm_mcast_mgr_process_mgrp_cb(IN void *const Context1, IN void *const > > Context2) > > +osm_signal_t osm_mcast_mgr_process_mgroups(osm_mcast_mgr_t * p_mgr) > > { > > - osm_mcast_mgr_t *p_mgr = (osm_mcast_mgr_t *) Context1; > > + cl_qlist_t *p_list = &p_mgr->p_subn->p_osm->sm.mgrp_list; > > + osm_switch_t *p_sw; > > + cl_qmap_t *p_sw_tbl; > > osm_mgrp_t *p_mgrp; > > ib_net16_t mlid; > > - osm_signal_t signal = OSM_SIGNAL_DONE; > > - osm_mcast_mgr_ctxt_t *p_ctxt = (osm_mcast_mgr_ctxt_t *) Context2; > > - osm_mcast_req_type_t req_type = p_ctxt->req_type; > > - ib_net64_t port_guid = p_ctxt->port_guid; > > - > > - OSM_LOG_ENTER(p_mgr->p_log, osm_mcast_mgr_process_mgrp_cb); > > - > > - /* nice copy no warning on size diff */ > > - memcpy(&mlid, &p_ctxt->mlid, sizeof(mlid)); > > + osm_signal_t ret, signal = OSM_SIGNAL_DONE; > > + osm_mcast_mgr_ctxt_t *ctx; > > + osm_mcast_req_type_t req_type; > > + ib_net64_t port_guid; > > - /* we can destroy the context now */ > > - free(p_ctxt); > > + OSM_LOG_ENTER(p_mgr->p_log, osm_mcast_mgr_process_mgroups); > > /* we need a lock to make sure the p_mgrp is not change other ways */ > > CL_PLOCK_EXCL_ACQUIRE(p_mgr->p_lock); > > - p_mgrp = __get_mgrp_by_mlid(p_mgr, mlid); > > - /* since we delayed the execution we prefer to pass the > > - mlid as the mgrp identifier and then find it or abort */ > > + if (cl_is_qlist_empty(p_list)) { > > + CL_PLOCK_RELEASE(p_mgr->p_lock); > > + return OSM_SIGNAL_NONE; > > + } > > + > > + while (!cl_is_qlist_empty(p_list)) { > > + ctx = (osm_mcast_mgr_ctxt_t *) cl_qlist_remove_head(p_list); > > + req_type = ctx->req_type; > > + port_guid = ctx->port_guid; > > + > > + /* nice copy no warning on size diff */ > > + memcpy(&mlid, &ctx->mlid, sizeof(mlid)); > > - if (p_mgrp) { > > + /* we can destroy the context now */ > > + free(ctx); > > + > > + /* since we delayed the execution we prefer to pass the > > + 
mlid as the mgrp identifier and then find it or abort */ > > + p_mgrp = __get_mgrp_by_mlid(p_mgr, mlid); > > + if (!p_mgrp) > > + continue; > > - /* if there was no change from the last time we processed the group > > - we can skip doing anything > > + /* if there was no change from the last time > > + * we processed the group we can skip doing anything > > */ > > if (p_mgrp->last_change_id == p_mgrp->last_tree_id) { > > osm_log(p_mgr->p_log, OSM_LOG_DEBUG, > > - "osm_mcast_mgr_process_mgrp_cb: " > > + "osm_mcast_mgr_process_mgroups: " > > "Skip processing mgrp with lid:0x%X change id:%u\n", > > cl_ntoh16(mlid), p_mgrp->last_change_id); > > - } else { > > - osm_log(p_mgr->p_log, OSM_LOG_DEBUG, > > - "osm_mcast_mgr_process_mgrp_cb: " > > - "Processing mgrp with lid:0x%X change id:%u\n", > > - cl_ntoh16(mlid), p_mgrp->last_change_id); > > - > > - signal = > > - osm_mcast_mgr_process_mgrp(p_mgr, p_mgrp, req_type, > > - port_guid); > > - p_mgrp->last_tree_id = p_mgrp->last_change_id; > > + continue; > > } > > - /* Remove MGRP only if osm_mcm_port_t count is 0 and > > - * Not a well known group > > - */ > > - if ((0x0 == cl_qmap_count(&p_mgrp->mcm_port_tbl)) && > > - (p_mgrp->well_known == FALSE)) { > > - osm_log(p_mgr->p_log, OSM_LOG_DEBUG, > > - "osm_mcast_mgr_process_mgrp_cb: " > > - "Destroying mgrp with lid:0x%X\n", > > - cl_ntoh16(mlid)); > > - > > - /* Send a Report to any InformInfo registered for > > - Trap 67 : MCGroup delete */ > > - osm_mgrp_send_delete_notice(p_mgr->p_subn, p_mgr->p_log, > > - p_mgrp); > > - > > - cl_qmap_remove_item(&p_mgr->p_subn->mgrp_mlid_tbl, > > - (cl_map_item_t *) p_mgrp); > > + osm_log(p_mgr->p_log, OSM_LOG_DEBUG, > > + "osm_mcast_mgr_process_mgroups: " > > + "Processing mgrp with lid:0x%X change id:%u\n", > > + cl_ntoh16(mlid), p_mgrp->last_change_id); > > + mcast_mgr_process_mgrp(p_mgr, p_mgrp, req_type, port_guid); > > + } > > - osm_mgrp_delete(p_mgrp); > > - } > > + /* > > + Walk the switches and download the tables for each. 
> > + */ > > + p_sw_tbl = &p_mgr->p_subn->sw_guid_tbl; > > + p_sw = (osm_switch_t *) cl_qmap_head(p_sw_tbl); > > + while (p_sw != (osm_switch_t *) cl_qmap_end(p_sw_tbl)) { > > + ret = __osm_mcast_mgr_set_tbl(p_mgr, p_sw); > > + if (ret == OSM_SIGNAL_DONE_PENDING) > > + signal = ret; > > + p_sw = (osm_switch_t *) cl_qmap_next(&p_sw->map_item); > > } > > So basically what you do here is processing all the idle queue requests > one by one, but you actually update the mcast tables on switches only > once at the end of idle queue processing. This should speed up mcast > handling significantly, which is great. > > However, let's get back to the original problem that triggered this change. > During a join/leave "burst" in a big fabric (let's say couple of K's of CAs) > with hundreds of mcast groups this can still be a problem. > > If there are hundreds of requests in the idle queue, function > osm_mcast_mgr_process_mgroups() will return only when it will finish the > whole > idle queue processing, Right, but the difference is that re-routing for all mcast groups will be done in one pass. So assuming that the most time consuming operation here is switch's MFTs update (which by itself requires improvement, but this is different story) the time which will be needed for this should be comparable with single mcast group request processing before the patch. > so if during that time there will be a topology > change > in the fabric, and immediate heavy sweep will be requested, the OSM state > mgr > will notice it only after finishing processing the idle queue, right? Right, but note the second patch too (v2 of your state_mgr patch). It will process the only list of mcast groups which were requested *before* heavy sweep request (the rest will be processed during heavy sweep itself and rest of the list will be purged there). > Any idea how fast the processing is right now? 
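As an aside on the batching discussed above (handle every queued request, but update the switch tables only once per pass), the idea can be shown with a small self-contained model. This is only a sketch under stated assumptions: the names `model`, `enqueue_request` and `drain_requests` are hypothetical and are not taken from the OpenSM sources, and "pushing tables" is reduced to a counter. It merely illustrates why draining a long queue should cost about as much as handling a single request did before the change.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical model, not OpenSM code: one queued join/leave request. */
struct mcast_req {
	unsigned mlid;              /* group the request refers to */
	struct mcast_req *next;
};

struct req_queue {
	struct mcast_req *head, *tail;
};

/* Toy per-group state: change id vs. id of the last computed tree. */
#define MAX_GROUPS 16
struct group {
	unsigned last_change_id;
	unsigned last_tree_id;
};

struct model {
	struct req_queue q;
	struct group groups[MAX_GROUPS];
	unsigned trees_built;       /* spanning trees recomputed */
	unsigned table_pushes;      /* passes that download tables to switches */
};

/* Queue a request and mark the group as changed. Returns 0 on success. */
static int enqueue_request(struct model *m, unsigned mlid)
{
	struct mcast_req *r;

	assert(m != NULL);
	if (mlid >= MAX_GROUPS)
		return -1;
	r = calloc(1, sizeof(*r));
	if (!r)
		return -1;
	r->mlid = mlid;
	if (m->q.tail)
		m->q.tail->next = r;
	else
		m->q.head = r;
	m->q.tail = r;
	m->groups[mlid].last_change_id++;
	return 0;
}

/*
 * Drain the whole queue: rebuild the tree of each changed group at most
 * once, then push tables a single time. Returns 0 if the queue was empty,
 * 1 if a pass was made.
 */
static int drain_requests(struct model *m)
{
	struct mcast_req *r;

	assert(m != NULL);
	if (!m->q.head)
		return 0;

	while ((r = m->q.head) != NULL) {
		struct group *g = &m->groups[r->mlid];

		m->q.head = r->next;
		free(r);
		if (g->last_change_id == g->last_tree_id)
			continue;   /* already up to date, skip */
		g->last_tree_id = g->last_change_id;
		m->trees_built++;
	}
	m->q.tail = NULL;
	m->table_pushes++;          /* one pass over all switches */
	return 1;
}
```

In this model a burst of many requests against a handful of groups results in one tree rebuild per changed group and a single table push, regardless of how long the queue was.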
It is hard to give exact numbers: I have seen significant differences between simulated and real cluster times in the past. But in general I think the queue processing time should be comparable to the processing time of a single request. > In the original problem I saw osm busy with mcast for more than 10 minutes. > How much will it take now? Even if it's 10 times faster, 1 minute of > unattended > topology change is still too long. I think the real-life scenario here is as follows: some nodes are rebooted, OpenSM starts to get mcast join/leave requests and quickly starts to process the first few requests, including updating switch MFTs. During this time (while it waits for the NO_PENDING_TRANSACTIONS signal) it gets a re-sweep request and will continue with a heavy sweep. Sasha From apploid at yahoo.com Tue Jan 1 15:28:51 2008 From: apploid at yahoo.com (Old Apple) Date: Tue, 1 Jan 2008 15:28:51 -0800 (PST) Subject: [ofa-general] ***SPAM*** IBGD/OFED configuration/performance question Message-ID: <680271.43537.qm@web45111.mail.sp1.yahoo.com> We have a CISCO/Topspin 7000D DDR switch with a built-in subnet manager. In a minimal configuration we need to connect a DDN 9550 disk system (one controller with two SDR ports) and just one server node with a CISCO Tavor DDR HCA. I.e. we have two DDN outlets and one HCA, all connected to the same 7000D switch. Inside the DDN, we have configured 6 two-tier LUNs; the first three of them are zoned to one SDR outlet, and the remaining three to the other SDR outlet. Thus we have 6 logical drives and must be able to see them all inside our only host. With OFED 1.2.5.4, everything connects without any problem, and all 6 disks are seen inside the server. With max_sect set to 16384 and srp_sg_tablesize set to 256, however, we obtain only 110 MB/sec per LUN, which is too low for our DDN system. (To make sure we are not simply piping everything into one SDR port of the DDN, we tried to use only 3 LUNs together, and still never went over 110 MB/sec per LUN). 
Exactly the same configuration with IBGD 1.8.3 loaded on the server with max_xfer_sectors_per_io set to 8192 leads us to 220+ MB/sec per LUN, but we only see three luns out of 6. The IBGD simply ignores one of the two DDN ports. If we put all 6 LUNS on the only visible port of DDN, the disks are all present, but then we end up eating all the bandwidth of the SDR port, and hence may leave nothing to other servers. To conclude, we are now trying to find out what's wrong with OFED 1.2.5.4 performance, and/or what could be done with IBGD 1.8.3 to ensure that our host sees both DDN ports. If by chance somebody on this list had similar issues in the past, please comment. Thanks ahead! ____________________________________________________________________________________ Never miss a thing. Make Yahoo your home page. http://www.yahoo.com/r/hs From kliteyn at mellanox.co.il Tue Jan 1 17:14:58 2008 From: kliteyn at mellanox.co.il (kliteyn at mellanox.co.il) Date: 2 Jan 2008 03:14:58 +0200 Subject: [ofa-general] nightly osm_sim report 2008-01-02:normal completion Message-ID: OSM Simulation Regression Summary [Generated mail - please do NOT reply] OpenSM binary date = 2008-01-01 OpenSM git rev = Sat_Dec_29_21:02:49_2007 [f7b47c635d291a4aef38c609028791b4ad1f1259] ibutils git rev = Mon_Dec_24_10:42:01_2007 [675bec82306d6920555dd0b5e2f664983e27e60f] Total=400 Pass=400 Fail=0 Pass: 30 Stability IS1-16.topo 30 Pkey IS1-16.topo 30 OsmTest IS1-16.topo 30 OsmStress IS1-16.topo 30 Multicast IS1-16.topo 30 LidMgr IS1-16.topo 10 Stability IS3-loop.topo 10 Stability IS3-128.topo 10 Pkey IS3-128.topo 10 OsmTest IS3-loop.topo 10 OsmTest IS3-128.topo 10 OsmStress IS3-128.topo 10 Multicast IS3-loop.topo 10 Multicast IS3-128.topo 10 LidMgr IS3-128.topo 10 FatTree merge-roots-4-ary-2-tree.topo 10 FatTree merge-root-4-ary-3-tree.topo 10 FatTree gnu-stallion-64.topo 10 FatTree blend-4-ary-2-tree.topo 10 FatTree RhinoDDR.topo 10 FatTree FullGnu.topo 10 FatTree 4-ary-2-tree.topo 10 FatTree 
2-ary-4-tree.topo 10 FatTree 12-node-spaced.topo 10 FTreeFail 4-ary-2-tree-missing-sw-link.topo 10 FTreeFail 4-ary-2-tree-links-at-same-rank-2.topo 10 FTreeFail 4-ary-2-tree-links-at-same-rank-1.topo 10 FTreeFail 4-ary-2-tree-diff-num-pgroups.topo Failures: From jackieict at gmail.com Tue Jan 1 17:58:15 2008 From: jackieict at gmail.com (zhang Jackie) Date: Wed, 2 Jan 2008 09:58:15 +0800 Subject: [ofa-general] Do RDMA write can use async event in userspace? Message-ID: <13432ab00801011758ga8a0942o1934584e999b7ab3@mail.gmail.com> I found that in the perftest library, ib_write_lat and ib_write_bw don't have an option "-e" to let the program use async events. Does this mean that RDMA write can't be used together with async events? Why? Thanks, Best Regards & Happy New Year! From dave at thedillows.org Tue Jan 1 19:26:16 2008 From: dave at thedillows.org (Dave Dillow) Date: Tue, 01 Jan 2008 22:26:16 -0500 Subject: [ofa-general] ***SPAM*** IBGD/OFED configuration/performance question In-Reply-To: <680271.43537.qm@web45111.mail.sp1.yahoo.com> References: <680271.43537.qm@web45111.mail.sp1.yahoo.com> Message-ID: <1199244377.4023.11.camel@obelisk.thedillows.org> On Tue, 2008-01-01 at 15:28 -0800, Old Apple wrote: > With OFED 1.2.5.4, everything connects without any > problem, all 6 disks are seen inside the server. By > setting max_sect to 16384 and srp_sg_tablesize to 256 > we however obtain only 110 MB/sec per LUN which is too > low for our DDN system. (To make sure we are not > simply piping everything into one SDR port of DDN, we > tried to use only 3 LUNS together, and still never > went over 110 MB/sec per LUN). As a quick hack/workaround, add max_cmds_per_lun=5 to the string you are echoing to add-target under OFED. That should get you better performance, as the SRP implementation will otherwise send requests to the array even when it has no credits available to do so. 
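The credit behaviour mentioned above can be pictured with a small model. In SRP the target grants the initiator a request limit and adjusts it through a delta carried in each response; an initiator that respects the limit holds commands back once it reaches zero. The sketch below is a toy illustration with made-up names (`srp_model`, `try_send`, `on_response`); it is not the ib_srp driver code, and the per-LUN cap only mimics the effect of a small commands-per-LUN setting.

```c
#include <assert.h>

#define NUM_LUNS 6

/* Hypothetical initiator-side state; names are illustrative only. */
struct srp_model {
	int req_lim;                 /* credits currently granted by the target */
	int per_lun_cap;             /* max commands in flight per LUN */
	int in_flight[NUM_LUNS];
};

/* Try to send one command to a LUN; returns 1 if sent, 0 if held back. */
static int try_send(struct srp_model *s, int lun)
{
	assert(lun >= 0 && lun < NUM_LUNS);
	if (s->req_lim <= 0)
		return 0;            /* no credits: keep the command queued */
	if (s->in_flight[lun] >= s->per_lun_cap)
		return 0;            /* this LUN already has its share */
	s->req_lim--;
	s->in_flight[lun]++;
	return 1;
}

/* A response completes one command and returns req_lim_delta credits. */
static void on_response(struct srp_model *s, int lun, int req_lim_delta)
{
	assert(lun >= 0 && lun < NUM_LUNS);
	assert(s->in_flight[lun] > 0);
	s->in_flight[lun]--;
	s->req_lim += req_lim_delta;
}
```

With six LUNs and a cap of 5, at most 30 commands are ever outstanding in this model, so a target that grants slightly more credits than that is never sent a request it has no credit for. The actual number of credits a given array grants is an assumption here and is not stated in the thread.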
For a better fix that will let you queue commands to the DDN for a single LUN when the others are not busy, I sent some patches a few weeks ago to respect the credit handling, and to force a maximum queue length. Search for the subjects:
IB/srp: respect target credit limit
IB/srp: use scatter gather chaining
IB/srp: allow user to control host queue length
The one for scatter gather chaining may not work for pre-2.6.24-rc1 kernels; I haven't checked. It isn't absolutely critical. With the third one, "...queue length", you can replace the max_cmds_per_lun=5 with "queue_len=31" to prevent some performance degradation at full tilt -- I'm still tracking it down, but it wasn't as bad as seeing 110MB/s, IIRC. You may want to keep the max_cmds_per_lun, though, if your workload is such that all 6 LUNs will be equally busy. With these, I can max out a port on the DDN with a single LUN (~750MB/s). From dotanb at dev.mellanox.co.il Tue Jan 1 22:16:24 2008 From: dotanb at dev.mellanox.co.il (Dotan Barak) Date: Wed, 02 Jan 2008 08:16:24 +0200 Subject: [ofa-general] Do RDMA write can use async event in userspace? In-Reply-To: <13432ab00801011758ga8a0942o1934584e999b7ab3@mail.gmail.com> References: <13432ab00801011758ga8a0942o1934584e999b7ab3@mail.gmail.com> Message-ID: <477B2C38.8080705@dev.mellanox.co.il> zhang Jackie wrote: > I found that in the perftest library, ib_write_lat and ib_write_bw > dont have a option "-e" to let the program use async event. Do it mean > RDMA write cant use together with async event? why? > > Thanks, Best Regards & Happy New Year! Here is the description of the "-e" parameter from ib_send_lat:
-e, --events sleep on CQ events (default poll)
This means that completion events are being used (and not async events). There isn't any limitation: completion events can be used with RDMA Read/Write (on the side that gets the completions, which is the requestor). 
Dotan 
From vlad at lists.openfabrics.org Wed Jan 2 03:07:56 2008 From: vlad at lists.openfabrics.org (Vladimir Sokolovsky Mellanox) Date: Wed, 2 Jan 2008 03:07:56 -0800 (PST) Subject: [ofa-general] ofa_1_3_kernel 20080102-0200 daily build status Message-ID: <20080102110756.5BBA0E600A0@openfabrics.org> This email was generated automatically, please do not reply git_url: git://git.openfabrics.org/ofed_1_3/linux-2.6.git git_branch: ofed_kernel Common build parameters: --with-ipoib-mod --with-sdp-mod --with-srp-mod --with-user_mad-mod --with-user_access-mod --with-mthca-mod --with-mlx4-mod --with-core-mod --with-addr_trans-mod --with-rds-mod --with-cxgb3-mod --with-nes-mod Passed: Passed on i686 with 2.6.15-23-server Passed on i686 with linux-2.6.22 Passed on i686 with linux-2.6.21.1 Passed on i686 with linux-2.6.18 Passed on i686 with linux-2.6.17 Passed on i686 with linux-2.6.19 Passed on i686 with linux-2.6.13 Passed on i686 with linux-2.6.16 Passed on i686 with linux-2.6.14 Passed on i686 with linux-2.6.15 Passed on i686 with linux-2.6.12 Passed on ia64 with linux-2.6.23 Passed on x86_64 with linux-2.6.18 Passed on powerpc with linux-2.6.13 Passed on ppc64 with linux-2.6.16 Passed on ia64 with linux-2.6.18 Passed on ppc64 with linux-2.6.15 Passed on ppc64 with linux-2.6.17 Passed on x86_64 with linux-2.6.19 Passed on x86_64 with linux-2.6.20 Passed on ppc64 with linux-2.6.14 Passed on ppc64 with linux-2.6.18 Passed on ppc64 with linux-2.6.19 Passed on ia64 with linux-2.6.17 Passed on ia64 with linux-2.6.19 Passed on x86_64 with linux-2.6.16 Passed on ia64 with linux-2.6.12 Passed on ia64 with linux-2.6.14 Passed on powerpc with linux-2.6.12 Passed on ppc64 with linux-2.6.12 Passed on x86_64 with linux-2.6.17 Passed on x86_64 with linux-2.6.13 Passed on ia64 with linux-2.6.15 Passed on ia64 with linux-2.6.13 Passed on powerpc with linux-2.6.14 Passed on powerpc with linux-2.6.15 Passed on x86_64 with linux-2.6.12 Passed on x86_64 with linux-2.6.14 Passed on ppc64 with 
linux-2.6.13 Passed on x86_64 with linux-2.6.9-42.ELsmp Passed on ia64 with linux-2.6.16 Passed on x86_64 with linux-2.6.21.1 Passed on x86_64 with linux-2.6.16.21-0.8-smp Passed on ia64 with linux-2.6.22 Passed on x86_64 with linux-2.6.16.43-0.3-smp Passed on x86_64 with linux-2.6.22 Passed on ia64 with linux-2.6.21.1 Passed on x86_64 with linux-2.6.15 Passed on x86_64 with linux-2.6.22.5-31-default Passed on x86_64 with linux-2.6.9-55.ELsmp Passed on x86_64 with linux-2.6.18-1.2798.fc6 Passed on ia64 with linux-2.6.16.21-0.8-default Passed on x86_64 with linux-2.6.18-53.el5 Passed on ppc64 with linux-2.6.18-8.el5 Passed on x86_64 with linux-2.6.18-8.el5 Failed: 
From apploid at yahoo.com Wed Jan 2 06:52:13 2008 From: apploid at yahoo.com (Old Apple) Date: Wed, 2 Jan 2008 06:52:13 -0800 (PST) Subject: ***SPAM*** Re: [ofa-general] ***SPAM*** IBGD/OFED configuration/performance question In-Reply-To: <1199244377.4023.11.camel@obelisk.thedillows.org> Message-ID: <102922.16786.qm@web45111.mail.sp1.yahoo.com> --- Dave Dillow wrote: > As a quick hack/workaround, add max_cmds_per_lun=5 > to the string you are echoing to add-target under > OFED. That should get you better performance, as > the SRP implementation will send requests to the > array when it has no credits available to do so. Hello Dave, many thanks - this really helped! The actual parameter name was max_cmd_per_lun; setting it to a small number allowed us to use all the available bandwidth of the IB port on DDN. Andy. 
From changquing.tang at hp.com Wed Jan 2 07:26:55 2008 From: changquing.tang at hp.com (Tang, Changqing) Date: Wed, 2 Jan 2008 15:26:55 +0000 Subject: [ofa-general] [RFC] XRC -- make receiving XRC QP independent of any one user process In-Reply-To: <200712311339.41166.jackm@dev.mellanox.co.il> References: <200712311339.41166.jackm@dev.mellanox.co.il> Message-ID: This interface is OK for me. Now, every rank on a node who wants to receive message from the same remote rank must know the same receiving QP number, and register for receiving using this QP number. If rank B does not register (receiving QP has been created by another rank A on the node), and sender know B's SRQ number, if sender sends a message to B, can B still receive this message ? (I hope, no register, no receive) I hope to know the opinion from other MPI team, or other XRC user. --CQ > -----Original Message----- > From: Jack Morgenstein [mailto:jackm at dev.mellanox.co.il] > Sent: Monday, December 31, 2007 5:40 AM > To: pasha at mellanox.co.il > Cc: ishai at mellanox.co.il; Gleb Natapov; Roland Dreier; Tang, > Changqing; general at lists.openfabrics.org > Subject: Re: [ofa-general] [RFC] XRC -- make receiving XRC QP > independent of any one user process > > > Tang, Changqing wrote: > > > If I have a MPI server processes on a node, many > other MPI > > > client processes will dynamically connect/disconnect with the > > > server. The server use same XRC domain. > > > > > > Will this cause accumulating the "kernel" QP for such > > > application ? we want the server to run 365 days a year. > > > > > > I have some question about the scenario above. Did you > call for the > > > mpi disconnect on the both ends (server/client) before the client > > > exit (did we must to do it?) > > > > Yes, both ends will call disconnect. But for us, > MPI_Comm_disconnect() > > call is not a collective call, it is just a local operation. 
> > > > --CQ > > > Possible solution (internal review as yet): > > Each user process registers with the XRC QP: > a. each process registers ONCE. If it registers multiple > times, there is no reference increment -- > rather the registration succeeds, but only one PID > entry is kept per QP. > b. Can have cleanup in the event of a process dying suddenly. > c. QP cannot be destroyed while there are any user > processes still registered with it. > > libibverbs API is as follows: > > ============================================================== > ======================== > /** > * ibv_xrc_rcv_qp_alloc - creates an XRC QP for serving as a > receive-side only QP, > * and moves the created qp through the RESET->INIT and > INIT->RTR transitions. > * (The RTR->RTS transition is not needed, since this QP > does no sending). > * The sending XRC QP uses this QP as destination, while > specifying an XRC SRQ > * for actually receiving the transmissions and > generating all completions on the > * receiving side. > * > * This QP is created in kernel space, and persists > until the last process registered > * for the QP calls ibv_xrc_rcv_qp_unregister() (at > which time the QP is destroyed). > * > * @pd: protection domain to use. At lower layer, this > provides access to userspace obj > * @xrc_domain: xrc domain to use for the QP. > * @attr: modify-qp attributes needed to bring the QP to RTR. > * @attr_mask: bitmap indicating which attributes are > provided in the attr struct. > * used for validity checking. > * @xrc_rcv_qpn: qp_num of created QP (if success). To be > passed to the remote node (sender). > * The remote node will use xrc_rcv_qpn in > ibv_post_send when sending to > * XRC SRQ's on this host in the same xrc domain. > * > * RETURNS: success (0), or a (negative) error value. > * > * NOTE: this verb also registers the calling user-process > with the QP at its creation time > * (implicit call to ibv_xrc_rcv_qp_register), to avoid > race conditions. 
> * The creating process will need to call > ibv_xrc_qp_unregister() for the QP to release it from > * this process. > */ > > int ibv_xrc_rcv_qp_alloc(struct ibv_pd *pd, > struct ibv_xrc_domain *xrc_domain, > struct ibv_qp_attr *attr, > enum ibv_qp_attr_mask attr_mask, > uint32_t *xrc_rcv_qpn); > > ===================================================================== > > /** > * ibv_xrc_rcv_qp_register: registers a user process with an > XRC QP which serves as > * a receive-side only QP. > * > * @xrc_domain: xrc domain the QP belongs to (for verification). > * @xrc_qp_num: The (24 bit) number of the XRC QP. > * > * RETURNS: success (0), > * or error (-EINVAL), if: > * 1. There is no such QP_num allocated. > * 2. The QP is allocated, but is not an receive XRC QP > * 3. The XRC QP does not belong to the given domain. > */ > int ibv_xrc_rcv_qp_register(struct ibv_xrc_domain > *xrc_domain, uint32_t xrc_qp_num); > > ===================================================================== > /** > * ibv_xrc_rcv_qp_unregister: detaches a user process from an > XRC QP serving as > * a receive-side only QP. If as a result, there are > no remaining userspace processes > * registered for this XRC QP, it is destroyed. > * > * @xrc_domain: xrc domain the QP belongs to (for verification). > * @xrc_qp_num: The (24 bit) number of the XRC QP. > * > * RETURNS: success (0), > * or error (-EINVAL), if: > * 1. There is no such QP_num allocated. > * 2. The QP is allocated, but is not an XRC QP > * 3. The XRC QP does not belong to the given domain. > * NOTE: I don't see any reason to return a special code if > the QP is destroyed -- the unregister simply > * succeeds. > */ > int ibv_xrc_rcv_qp_unregister(struct ibv_xrc_domain > *xrc_domain, uint32_t xrc_qp_num); > ============================================================== > =============================== > > Usage: > > 1. Sender creates an XRC QP (sending QP) 2. 
Sender sends some > receiving process on a remote node (say R1) a request to > provide an XRC QP and XRC SRQ for > receiving messages (the request includes the sending QP number). > 3. R1 calls ibv_xrc_rcv_qp_alloc() to create a receiving XRC > QP in kernel space, and move > that QP up to RTR state. This function also registers > process R1 with the XRC QP. > 4. R1 calls ibv_create_xrc_srq() to create an SRQ for receive > messages via the just created XRC QP. > 5. R1 responds to request, providing the XRC qp number, and > XRC SRQ number to be used in communication. > 6. Sender then may wish to communicate with another receiving > process on the remote host (say R2). > it sends a request to R2 containing the remote XRC QP > number (obtained from R1) > which it will use to send messages. > 7. R2 creates an XRC SRQ (if one does not already exist for > the domain), and also > calls ibv_xrc_rcv_qp_register() to register the process R2 > with the XRC QP created by R1. > 8. If R1 no longer needs to communicate with the sender, it > calls ibv_xrc_rcv_qp_unregister() for the QP. > The QP will not yet be destroyed, since R2 is still > registered with it. > 9. If R2 no longer needs to communicate with the sender, it > calls ibv_xrc_rcv_qp_unregister() for the QP. > At this point, the QP is destroyed, since no processes > remain registered with it. > > NOTES: > 1. The problem of the QP being destroyed and quickly > re-allocated does not exist -- the upper bits of the > QP number are incremented at each allocation (except for > the MSB which is always 1 for XRC QPs). Thus, > even if the same QP is re-allocated, its QP number (stored > in the QP object) will be different than > expected (unless it is re-destroyed/re-allocated several > hundred times). > > 2. With this model, we do not need a heartbeat: if a > receiving process dies, all XRC QPs it has registered for will > be unregistered as part of process cleanup in kernel space. 
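
The registration rules in the proposal above (one PID entry per process per QP, implicit registration of the creator, destruction when the last registered process leaves) can be modeled in plain userspace C. This is only a sketch of the bookkeeping, not libibverbs or kernel code: `struct xrc_rcv_qp`, `rcv_qp_alloc()`, `rcv_qp_register()`, `rcv_qp_unregister()` and `MAX_REGS` are hypothetical names, and the behavior for a full table or for unregistering a PID that never registered is an assumption the proposal does not spell out.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/types.h>

#define MAX_REGS 16  /* hypothetical cap on registered processes per QP */

/* Hypothetical model of one receive-side XRC QP and its registered PIDs. */
struct xrc_rcv_qp {
    bool   alive;           /* false once the QP has been destroyed */
    size_t nregs;           /* number of registered processes */
    pid_t  pids[MAX_REGS];  /* at most one entry per process */
};

/* Return the slot holding pid, or -1 if pid is not registered. */
static int find_pid(const struct xrc_rcv_qp *qp, pid_t pid)
{
    for (size_t i = 0; i < qp->nregs; i++)
        if (qp->pids[i] == pid)
            return (int)i;
    return -1;
}

/* Register pid with the QP. A repeated registration succeeds but adds
 * no second entry, so there is no per-process reference count. */
static int rcv_qp_register(struct xrc_rcv_qp *qp, pid_t pid)
{
    if (!qp->alive)
        return -EINVAL;            /* no such QP */
    if (find_pid(qp, pid) >= 0)
        return 0;                  /* already registered */
    if (qp->nregs == MAX_REGS)
        return -ENOMEM;            /* assumption: table full is an error */
    qp->pids[qp->nregs++] = pid;
    return 0;
}

/* Creating the QP implicitly registers the creating process. */
static int rcv_qp_alloc(struct xrc_rcv_qp *qp, pid_t creator)
{
    qp->alive = true;
    qp->nregs = 0;
    return rcv_qp_register(qp, creator);
}

/* Unregister pid; the QP is destroyed when the last process leaves. */
static int rcv_qp_unregister(struct xrc_rcv_qp *qp, pid_t pid)
{
    int slot;

    if (!qp->alive)
        return -EINVAL;            /* no such QP */
    slot = find_pid(qp, pid);
    if (slot < 0)
        return -EINVAL;            /* assumption: unknown PID is an error */
    qp->pids[slot] = qp->pids[--qp->nregs];  /* fill the hole with the last entry */
    if (qp->nregs == 0)
        qp->alive = false;         /* last registration gone: destroy */
    return 0;
}
```

In terms of the usage steps, R1 corresponds to `rcv_qp_alloc()` (step 3), R2 to `rcv_qp_register()` (step 7), and steps 8 and 9 to two `rcv_qp_unregister()` calls, of which only the second clears `alive`.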
>
> - Jack
>
>

From a.capriotti at cineca.it Wed Jan 2 08:01:26 2008
From: a.capriotti at cineca.it (Andrea Capriotti)
Date: Wed, 2 Jan 2008 17:01:26 +0100 (MET)
Subject: [ofa-general] Issues with compilation of OFED 1.2.5.4 and RHEL 4 U6 kernel 2.6.9-67.0.1.ELsmp
Message-ID: <1199289816.10472.61.camel@debcap.cineca.it>

Hi all,

when compiling the latest OFED version (1.2.5.4) on a RHEL 4 U6 (kernel 2.6.9-67.0.1.ELsmp) I get the following error:

make[1]: Entering directory `/usr/src/kernels/2.6.9-67.0.1.EL-smp-x86_64'
mkdir -p /var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2.5.4/.tmp_versions
make -f scripts/Makefile.build obj=/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2.5.4
make -f scripts/Makefile.build obj=/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2.5.4/drivers/infiniband
make -f scripts/Makefile.build obj=/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2.5.4/drivers/infiniband/core
gcc -Wp,-MD,/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2.5.4/drivers/infiniband/core/.addr.o.d -nostdinc -iwithprefix include -D__KERNEL__ -I/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2.5.4/include -I/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2.5.4/drivers/infiniband/include -Iinclude -include include/linux/autoconf.h -include /var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2.5.4/include/linux/autoconf.h -Wall -Wstrict-prototypes -Wno-trigraphs -fno-strict-aliasing -fno-common -Os -fomit-frame-pointer -g -Wdeclaration-after-statement -mno-red-zone -mcmodel=kernel -pipe -fno-reorder-blocks -Wno-sign-compare -funit-at-a-time -I/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2.5.4/include -I/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2.5.4/drivers/infiniband/include -I/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2.5.4/drivers/infiniband/ulp/ipoib -I/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2.5.4/drivers/infiniband/debug -I/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2.5.4/drivers/infiniband/hw/cxgb3/core -I/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2.5.4/drivers/net/cxgb3 -I/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2.5.4/net/rds -I/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2.5.4/drivers/net/mlx4 -I/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2.5.4/drivers/infiniband/hw/mlx4 -DMODULE -DKBUILD_BASENAME=addr -DKBUILD_MODNAME=ib_addr -c -o /var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2.5.4/drivers/infiniband/core/.tmp_addr.o /var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2.5.4/drivers/infiniband/core/addr.c
In file included from /var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2.5.4/drivers/infiniband/core/addr.c:32:
include/linux/inetdevice.h:50: error: field `mr_gq_timer' has incomplete type
include/linux/inetdevice.h:51: error: field `mr_ifc_timer' has incomplete type
include/linux/inetdevice.h:95: error: `IFNAMSIZ' undeclared here (not in a function)
include/linux/inetdevice.h: In function `__in_dev_get_rcu':
include/linux/inetdevice.h:142: error: dereferencing pointer to incomplete type
include/linux/inetdevice.h: In function `in_dev_get':
include/linux/inetdevice.h:154: error: dereferencing pointer to incomplete type
include/linux/inetdevice.h: In function `__in_dev_get':
include/linux/inetdevice.h:164: error: dereferencing pointer to incomplete type
/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2.5.4/drivers/infiniband/core/addr.c: At top level:
/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2.5.4/drivers/infiniband/core/addr.c:62: warning: type defaults to `int' in declaration of `DECLARE_DELAYED_WORK'
/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2.5.4/drivers/infiniband/core/addr.c:62: warning: parameter names (without types) in function declaration
/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2.5.4/drivers/infiniband/core/addr.c: In function `set_timeout':
/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2.5.4/drivers/infiniband/core/addr.c:127: error: `work' undeclared (first use in this function)
/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2.5.4/drivers/infiniband/core/addr.c:127: error: (Each undeclared identifier is reported only once
/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2.5.4/drivers/infiniband/core/addr.c:127: error: for each function it appears in.)
/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2.5.4/drivers/infiniband/core/addr.c: At top level: /var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2.5.4/drivers/infiniband/core/addr.c:218: warning: 'process_req' defined but not used /var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2.5.4/drivers/infiniband/core/addr.c:62: warning: 'DECLARE_DELAYED_WORK' declared `static' but never defined make[4]: *** [/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2.5.4/drivers/infiniband/core/addr.o] Error 1 make[3]: *** [/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2.5.4/drivers/infiniband/core] Error 2 make[2]: *** [/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2.5.4/drivers/infiniband] Error 2 make[1]: *** [_module_/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2.5.4] Error 2 make[1]: Leaving directory `/usr/src/kernels/2.6.9-67.0.1.EL-smp-x86_64' make: *** [kernel] Error 2 error: Bad exit status from /var/tmp/rpm-tmp.92720 (%install) No problem with RHEL 4 U5 (kernel 2.6.9-55.0.12.ELsmp). Any idea? Best Regards -- Andrea Capriotti System Management Group - Cineca - www.cineca.it a.capriotti at cineca.it - Tel +39 051 6171890 From apploid at yahoo.com Wed Jan 2 08:12:03 2008 From: apploid at yahoo.com (Old Apple) Date: Wed, 2 Jan 2008 08:12:03 -0800 (PST) Subject: [ofa-general] ***SPAM*** Thanks: IBGD/OFED configuration/performance question Message-ID: <882017.41594.qm@web45101.mail.sp1.yahoo.com> --- Dave Dillow wrote: > > As a quick hack/workaround, add max_cmds_per_lun=5 > to the string you are echoing to add-target under > OFED. That should get you better performance, as > the SRP implementation will send requests to the > array when it has no credits available to do so. Hello Dave, many thanks - this really helped! The actual parameter name was max_cmd_per_lun; setting it to a small number allowed us to use all the available bandwidth of the IB port on DDN. Andy. ____________________________________________________________________________________ Never miss a thing. Make Yahoo your home page. 
http://www.yahoo.com/r/hs From rdreier at cisco.com Wed Jan 2 08:43:39 2008 From: rdreier at cisco.com (Roland Dreier) Date: Wed, 02 Jan 2008 08:43:39 -0800 Subject: [ofa-general] [PATCH] IB/iser: update url of iSER docs References: <4767D0B6.1030708@voltaire.com> <4778F413.6080400@Voltaire.COM> Message-ID: sorry, I missed the original posting. anyway, applied... From rdreier at cisco.com Wed Jan 2 09:51:38 2008 From: rdreier at cisco.com (Roland Dreier) Date: Wed, 02 Jan 2008 09:51:38 -0800 Subject: [ofa-general] Re: list corruption on ib_srp load in v2.6.24-rc5 In-Reply-To: <1198689251.25003.2.camel@lap75545.ornl.gov> (David Dillow's message of "Wed, 26 Dec 2007 12:14:11 -0500") References: <1198273973.9979.34.camel@lap75545.ornl.gov> <1198275532.9979.43.camel@lap75545.ornl.gov> <20071223014407L.tomof@acm.org> <1198689251.25003.2.camel@lap75545.ornl.gov> Message-ID: > > Can you try this? > > That patched oopsed in scsi_remove_host(), but reversing the order has > survived over 500 insert/probe/remove cycles.
 >
 > Tested-by: David Dillow
 > ---
 > diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c
 > index 950228f..77e8b90 100644
 > --- a/drivers/infiniband/ulp/srp/ib_srp.c
 > +++ b/drivers/infiniband/ulp/srp/ib_srp.c
 > @@ -2054,6 +2054,7 @@ static void srp_remove_one(struct ib_device *device)
 >  		list_for_each_entry_safe(target, tmp_target,
 >  					 &host->target_list, list) {
 >  			scsi_remove_host(target->scsi_host);
 > +			srp_remove_host(target->scsi_host);
 >  			srp_disconnect_target(target);

Where do we stand on this? What is the right place to put the
srp_remove_host? Is there a bug somewhere else?

I'd like to get this fixed before 2.6.24 final comes out...

 - R.

From YJia at tmriusa.com Wed Jan 2 09:59:30 2008
From: YJia at tmriusa.com (Yicheng Jia)
Date: Wed, 2 Jan 2008 11:59:30 -0600
Subject: [ofa-general] synchronize commands issued to MTHCA
In-Reply-To: <200801010903.38891.jackm@dev.mellanox.co.il>
Message-ID:

Hi Jack,

Thanks for your reply. The HCA I'm using is memory-free: the chip is an
MT25204 and the HCA type is arbel, so it doesn't go through the
"if (ah->type == MTHCA_AH_ON_HCA)" part of the code.

By checking the debug output, I got more details about this problem:
the SW2HW_MPT command is issued while the UDAV table is being created.
While the driver is waiting for the command to complete, it does several
other things: it creates a send MAD packet, posts the send MAD request to
the SQ, and posts another receive MAD request to the RQ. None of these
actions reports an error. Afterwards, however, the HCA reports a command
parameter error for SW2HW_MPT.

I've copied a snippet of the debug trace output from when this error
happens; hopefully it will help spot the reason.

139903841835 HCR CMD: op_code: LE: d
139903861104 TRACE: mad.c:639/ib_mad_recv_done_handler
139903890876 HCR CMD: in_param_h: LE: 0
139903942869 TRACE: mad.c:644/ib_mad_recv_done_handler
139903993296 HCR CMD: in_param_l: LE: cf616000
139904038413 TRACE: verbs.c:182/ib_create_ah_from_wc
139904094753 HCR CMD: input_modifier: LE: 1e
139904139150 TRACE: mthca_provider.c:447/mthca_ah_create
MTHCA DBG: Created UDAV at 8075220/00000000:
139904197065 HCR CMD: out_pram_h: LE: 0
139904333343 [ 0] 01000005
139904384499 HCR CMD: out_pram_l: LE: 0
139904428086 [ 4] 0000ffff
139904478675 HCR CMD: token: LE: ffff0000
139904520156 [ 8] 00003000
139904572059 HCR CMD: op_code_modifier: LE: 0
139904612802 [ c] 00000000
139904667693 HCR CMD: event: LE: 0
139904708526 [10] 00000000
139904758422 HCR CMD 0x18h: LE=80000d, BE=d008000
139904799210 [14] 00000000
139904904204 [18] 00000000
139904946792MTHCA DBG: HCR_STATUS 40100698= d008000 ? 8000
[1c] 00000002
139905076860 TRACE: mthca_av.c:235/mthca_create_ah
139905112329 TRACE: mthca_av.c:243/mthca_create_ah
139905147672 TRACE: mthca_provider.c:460/mthca_ah_create
636959 DEBUG: Start mthca_arbel_post_send. qp 0 wr 8d984b8
139905324432 TRACE: mthca_qp.c:1911/mthca_arbel_post_send
139905359505 TRACE: mthca_qp.c:1939/mthca_arbel_post_send
139905418932 TRACE: mthca_qp.c:1949/mthca_arbel_post_send
636959 DEBUG: qp is not direct access and wqe: 0x8d84400
139905541467 TRACE: mthca_qp.c:1954/mthca_arbel_post_send
139905577647 TRACE: mthca_qp.c:1964/mthca_arbel_post_send
139905614565 TRACE: mthca_qp.c:2057/mthca_arbel_post_send
139905669411 TRACE: mthca_qp.c:2076/mthca_arbel_post_send
139905705726 TRACE: mthca_qp.c:2078/mthca_arbel_post_send
636959 DEBUG: wr sg length 0x18, lkey 0x80001900, local addr 0xce2393b8
139905831060 TRACE: mthca_qp.c:2078/mthca_arbel_post_send
636959 DEBUG: wr sg length 0xe8, lkey 0x80001900, local addr 0xce2393d0
139905956322 TRACE: mthca_qp.c:2092/mthca_arbel_post_send
636959 DEBUG: wr id 148473016
139906069875 TRACE: mthca_qp.c:2120/mthca_arbel_post_send
139906106379 TRACE: mthca_qp.c:2128/mthca_arbel_post_send
139906142892 TRACE: mthca_qp.c:2131/mthca_arbel_post_send
139906178640 TRACE: mthca_qp.c:2135/mthca_arbel_post_send
139906214703 TRACE: mthca_qp.c:2158/mthca_arbel_post_send
139906250568 TRACE: mthca_qp.c:2160/mthca_arbel_post_send
636959 DEBUG: End mthca_arbel_post_send. err 0
139906369953 TRACE: mad.c:650/ib_mad_recv_done_handler
139906406295 TRACE: mad.c:669/ib_mad_recv_done_handler
139906441539 TRACE: mad.c:672/ib_mad_recv_done_handler
636959 QNX DBG: mad_priv->header.mad_list.mad_queue->list.prev 88b0a2c
139906578384 TRACE: mthca_qp.c:2177/mthca_arbel_post_receive
139906614168 TRACE: mthca_qp.c:2194/mthca_arbel_post_receive
139906649295 TRACE: mthca_qp.c:2196/mthca_arbel_post_receive
139906689129 TRACE: mad.c:674/ib_mad_recv_done_handler
139906723068 TRACE: mad.c:676/ib_mad_recv_done_handler
636959 QNX DBG: kmem_cache 5 free object=88b0724
139906793007 HCR CMD: Status Return: : 3

Again, thanks for your help!
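
This thread turns on how commands to the HCA are flow-controlled: Roland's earlier reply, quoted further down in this message, describes a counting semaphore that bounds the number of outstanding commands and a mutex that serializes the actual write to the command register. A rough userspace model of that two-level scheme might look like the sketch below. It is not the mthca code: C11 atomics stand in for the kernel semaphore and mutex, `struct cmd_ctx`, `cmd_slot_get()`, `cmd_slot_put()` and `cmd_post()` are hypothetical names, the seven-word register layout is only inferred from the fields printed in the trace above, and a real driver would sleep when no slot is free rather than fail.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define HCR_WORDS 7  /* assumption: seven 32-bit words, last one at offset 0x18 */

/* Hypothetical command context: a slot counter plus a lock around the HCR. */
struct cmd_ctx {
    atomic_int  free_slots;      /* stand-in for the counting semaphore */
    atomic_flag hcr_lock;        /* stand-in for the HCR mutex */
    uint32_t    hcr[HCR_WORDS];  /* stand-in for the command register */
};

static void cmd_init(struct cmd_ctx *c, int max_cmds)
{
    atomic_init(&c->free_slots, max_cmds);
    atomic_flag_clear(&c->hcr_lock);
    memset(c->hcr, 0, sizeof(c->hcr));
}

/* Try to take one command slot; fails instead of sleeping when none is free. */
static bool cmd_slot_get(struct cmd_ctx *c)
{
    int n = atomic_load(&c->free_slots);

    while (n > 0)
        if (atomic_compare_exchange_weak(&c->free_slots, &n, n - 1))
            return true;  /* on failure n is reloaded and the loop retries */
    return false;
}

/* Give a slot back; called when a command's completion has been seen. */
static void cmd_slot_put(struct cmd_ctx *c)
{
    atomic_fetch_add(&c->free_slots, 1);
}

/* Post one command: reserve a slot first, then write all words of the
 * command under the lock so two posters can never interleave them. */
static bool cmd_post(struct cmd_ctx *c, const uint32_t words[HCR_WORDS])
{
    if (!cmd_slot_get(c))
        return false;
    while (atomic_flag_test_and_set(&c->hcr_lock))
        ;  /* spin until the register is ours */
    memcpy(c->hcr, words, sizeof(c->hcr));
    atomic_flag_clear(&c->hcr_lock);
    return true;
}
```

The point of the split is that the slot count protects the firmware from more commands than it can queue, while the lock only has to be held for the short register write, so a slow command does not block the posting of the next one.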
Best,
Yicheng

Jack Morgenstein
01/01/2008 01:03 AM

To: general at lists.openfabrics.org
cc: Yicheng Jia , Roland Dreier
Subject: Re: [ofa-general] synchronize commands issued to MTHCA

On Tuesday 01 January 2008 03:02, Yicheng Jia wrote:

Does your HCA use on-board memory?
(Run "lspci" and look at the "Mellanox" lines. You have on-board memory if you see either:
    PCI bridge: Mellanox Technologies MT23108 InfiniHost HCA bridge (rev a1)
    InfiniBand: Mellanox Technologies MT23108 InfiniHost HCA (rev a1)
OR:
    InfiniBand: Mellanox Technologies MT25208 InfiniHost III Ex (Tavor compatibility mode)
)

In that case, when you create an AH in kernel space (file mthca_av.c,
procedure mthca_create_ah()), you will enter the following flow:

	if (ah->type == MTHCA_AH_ON_HCA) {
		memcpy_toio(dev->av_table.av_map + index * MTHCA_AV_SIZE,
			    av, MTHCA_AV_SIZE);
		kfree(av);
	}

Roland, do you think that the memcpy_toio() call might mess things up?
Maybe we need "wmb()" or "mmiowb()" here as well?

- Jack

> Hi Roland,
>
> Thanks for your reply!
>
> Actually I'm working on porting IB driver to QNX platform. I resume the
> work started by my former colleague, and I just found that the sync codes
> (dev->cmd.poll_sem and dev->cmd.hcr_mutex) were deleted for unknown
> reason. After adding back these sync codes, the driver runs much
> more smoothly.
>
> However I still get a command exec error which I believe is relevant to
> command synchronization. The problem is when "Created UDAV" is called
> during SW2HW_MPT command is being executed, the SW2HW_MPT command would
> return with bad parameter error.
Here are my debug trace output: > > 139903841835 HCR CMD: op_code: LE: d > 139903861104 TRACE: mad.c:639/ib_mad_recv_done_handler > 139903890876 HCR CMD: in_param_h: LE: 0 > 139903942869 TRACE: mad.c:644/ib_mad_recv_done_handler > 139903993296 HCR CMD: in_param_l: LE: cf616000 > 139904038413 TRACE: verbs.c:182/ib_create_ah_from_wc > 139904094753 HCR CMD: input_modifier: LE: 1e > 139904139150 TRACE: mthca_provider.c:447/mthca_ah_create > MTHCA DBG: Created UDAV at 8075220/00000000: > 139904197065 HCR CMD: out_pram_h: LE: 0 > 139904333343 [ 0] 01000005 > 139904384499 HCR CMD: out_pram_l: LE: 0 > 139904428086 [ 4] 0000ffff > 139904478675 HCR CMD: token: LE: ffff0000 > 139904520156 [ 8] 00003000 > 139904572059 HCR CMD: op_code_modifier: LE: 0 > 139904612802 [ c] 00000000 > 139904667693 HCR CMD: event: LE: 0 > 139904708526 [10] 00000000 > 139904758422 HCR CMD 0x18h: LE=80000d, BE=d008000 > 139904799210 [14] 00000000 > 139904904204 [18] 00000000 > 139904946792MTHCA DBG: HCR_STATUS 40100698= d008000 ? > 8000 > [1c] 00000002 > 139905076860 TRACE: mthca_av.c:235/mthca_create_ah > 139905112329 TRACE: mthca_av.c:243/mthca_create_ah > 139905147672 TRACE: mthca_provider.c:460/mthca_ah_create > .... > 139906793007 HCR CMD: Status Return: : 3 > > Do you have any idea? > > Thanks and have a good new year! > Yicheng > > > > > Roland Dreier > 12/28/2007 11:39 PM > > To > Yicheng Jia > cc > general at lists.openfabrics.org > Subject > Re: [ofa-general] synchronize commands issued to MTHCA > > > > > > > > I'm using OFED-1.0 and the problem I believe is related to command > > synchronization of HCA. The host issues a MAD_INF command at first and > > then a SW2HW_MTP command without waiting for the completion of the > first > > command. Both of commands return with bad parameters error. > > I guess you mean the MAD_IFC and SW2HW_MPT commands? 
I've never heard > of a problem like that -- more details about your hardware/software > config and the exact symptoms you see would be helpful in debugging. > > Anyway OFED 1.0 is ancient by now -- you are much better off just > using drivers from the standard kernel. If you must use OFED, then > OFED 1.2 or even a 1.3 prerelease would be better. > > > My question is why there's no synchronization mechanism for the command > > > execution on HCA, can I use "spin_lock" or "sem_wait" to synchronize > > between every command? > > The HCA firmware allows multiple commands to be queued. The > dev->cmd.event_sem semaphore is used to limit the number of > outstanding commands to the HCA's capabilities, and the > dev->cmd.hcr_mutex mutex is used to serialize the actual writing of > commands to the HCA. > > There was a mmiowb() added to mthca_cmd_post() fairly recently that > might fix your problems if you are running on a large SGI Altix system. > > - R. > > _____________________________________________________________________________ > Scanned by IBM Email Security Management Services powered by MessageLabs. > For more information please visit http://www.ers.ibm.com > _____________________________________________________________________________ > > From rdreier at cisco.com Wed Jan 2 10:12:03 2008 From: rdreier at cisco.com (Roland Dreier) Date: Wed, 02 Jan 2008 10:12:03 -0800 Subject: [ofa-general] synchronize commands issued to MTHCA In-Reply-To: (Yicheng Jia's message of "Mon, 31 Dec 2007 19:02:15 -0600") References: Message-ID: > Actually I'm working on porting IB driver to QNX platform. I see.
My opinion is that in the long term, you're better off writing a "native" QNX driver rather than trying to port a driver from another OS, although I understand that sometimes short-term issues make doing the right thing impossible. > However I still get a command exec error which I believe is relevant to > command synchronization. The problem is when "Created UDAV" is called > during SW2HW_MPT command is being executed, the SW2HW_MPT command would > return with bad parameter error. Here are my debug trace output: No idea really. Does the Linux mthca work on the same hardware? If so I guess you would have to figure out how the behavior of your driver is different. If you don't have Linux running on your platform then you just need to debug the driver/hardware ... perhaps hardware bus analysis would be helpful to understand what's happening. - R. From rdreier at cisco.com Wed Jan 2 10:14:54 2008 From: rdreier at cisco.com (Roland Dreier) Date: Wed, 02 Jan 2008 10:14:54 -0800 Subject: [ofa-general] synchronize commands issued to MTHCA In-Reply-To: <200801010903.38891.jackm@dev.mellanox.co.il> (Jack Morgenstein's message of "Tue, 1 Jan 2008 09:03:38 +0200") References: <200801010903.38891.jackm@dev.mellanox.co.il> Message-ID: > Roland, do you think that the memcpy_toio() call might mess things up? I wouldn't think so, although I don't have full details of how your hardware behaves to know for sure. I assume your PCI bus/memory controller is already smart enough to deal with HCR writes being interleaved with writes to a doorbell page from userspace, so it seems that writes to locally attached memory should be OK too, as long as the HCR writes are word-sized in the right order etc. > Maybe we need "wmb()" or "mmiowb()" here as well? I don't see any reason, although I often miss things. It seems that the only thing that cares about the writes of the address info being done would be posting a send WQE that uses it, and that should already have sufficient ordering. 
What would we be ordering things against? - R. From rdreier at cisco.com Wed Jan 2 10:16:50 2008 From: rdreier at cisco.com (Roland Dreier) Date: Wed, 02 Jan 2008 10:16:50 -0800 Subject: [ofa-general] Re: [PATCH] ib/ipoib: Reduce comparison size in data path In-Reply-To: <1199097884.21275.242.camel@mtls03> (Eli Cohen's message of "Mon, 31 Dec 2007 12:44:44 +0200") References: <1199097884.21275.242.camel@mtls03> Message-ID: > In the majority of cases, if the neighbour will change, it will > be reflected in the guid part of the GID (bytes 8-15). If the GID > prefix will change as well (bytes 0-7) it will be because the master > SM has changed, in which case we will get an SM change event resulting > in all paths flushed. Is it guaranteed that an active SM can't change a GID prefix? Especially if we're using a GID at an index != 0? In other words, is this change definitely 100 percent safe? Also I assume this change is coming from performance tuning. For patches like this it is always helpful to include hard data like "this gives a speedup of X on test Y on system Z." Thanks... From danderson at lnxi.com Wed Jan 2 10:37:15 2008 From: danderson at lnxi.com (David B. Anderson) Date: Wed, 02 Jan 2008 11:37:15 -0700 Subject: [ofa-general] [PATCH] LNXI Fixed ofed_scripts configure to selectsp4 patches for SLES9 kernel with minor versions equalor greater than 305 In-Reply-To: <47775FE3.400@dev.mellanox.co.il> References: <39C75744D164D948A170E9792AF8E7CA4D2CE1@exil.voltaire.com> <47670337.6080607@lnxi.com><476A15BD.1050505@dev.mellanox.co.il> <4773F272.8070309@lnxi.com> <47774AA3.90502@dev.mellanox.co.il> <39C75744D164D948A170E9792AF8E7CAC5AC9A@exil.voltaire.com> <47775FE3.400@dev.mellanox.co.il> Message-ID: <477BD9DB.7050201@lnxi.com> Hi Vladimir, I sent the patches to the list with git-send-email but I just checked the mail logs and I'm getting a timeout from lists.openfabrics.org for those messages. So here they are directly :). Sorry for the delay. 
David Vladimir Sokolovsky wrote: > git://git.openfabrics.org/ofed_1_2/linux-2.6.git > - its my git tree, > > David can't commit his patches to this tree (he does not have > permissions)... > So, probably he have a clone of my tree somewhere. > > Regards, > Vladimir > > Moshe Kazir wrote: >> He wrote -> >> >>> commit 3db835ee0edb792b120ba10c8066e3d4409de2d7 >>> >>> git://git.openfabrics.org/ofed_1_2/linux-2.6.git >> >> Moshe >> ____________________________________________________________ >> Moshe Katzir | +972-9971-8639 (o) | +972-52-860-6042 (m) >> >> Voltaire - The Grid Backbone >> >> www.voltaire.com >> >> >> -----Original Message----- >> From: general-bounces at lists.openfabrics.org >> [mailto:general-bounces at lists.openfabrics.org] On Behalf Of Vladimir >> Sokolovsky >> Sent: Sunday, December 30, 2007 9:37 AM >> To: David B. Anderson >> Cc: general at lists.openfabrics.org >> Subject: Re: [ofa-general] [PATCH] LNXI Fixed ofed_scripts configure to >> selectsp4 patches for SLES9 kernel with minor versions equalor greater >> than 305 >> >> Hi David, >> Where can I get your patches? >> >> Regards, >> Vladimir >> >> David B. Anderson wrote: >>> Hi Vladimir, >>> >>> The four patches named below are what I'm using to get the OFED >>> 1.2.5 kernel to build for SLES9 SP4. >>> >>> commit 3db835ee0edb792b120ba10c8066e3d4409de2d7 >>> >>> git://git.openfabrics.org/ofed_1_2/linux-2.6.git >>> >>> >>> The patches are: >>> >>> [PATCH 1/4] LNXI changed ofed_scripts configure to select sp4 patches >>> >>> [PATCH 2/4] LNXI created backport patch addr_8802_to_2_6_5-7_308 >>> >>> [PATCH 3/4] LNXI fixed backport/2.6.5_sles9_sp4/rds_to_2_6_9.patch >>> >>> [PATCH 4/4] LNXI fixed backport/2.6.5_sles9_sp4/cxg3_to_2_6_20.patch >>> >>> >>> I've tested these on my cluster. >>> >>> Note: I changed your patch to the ofed_scripts/configure script, so >>> that even if the SLES9 >>> >>> kernel is greater than 309 it will not revert to using SP3 patches. 
>>> >>> >>> David >>> >>> >>> Vladimir Sokolovsky wrote: >>>> David B. Anderson wrote: >>>>> I've all of these patches plus the following patch >>>>> >>>>> kernel_patches/backport/2.6.5_sles9_sp4/cxgb3_remove_eeh.patch >>>>> >>>>> My current git repo is >>>>> git://git.openfabrics.org/ofed_1_2/linux-2.6.git >>>>> commit: 6974c285e6fb06264f570f9cf919865bab66c9e6 >>>>> >>>>> My patch that I posted before fixes the kernel configure script so >>>>> that it applies 2.6.5_sles9_sp4 patches for the SP4 release kernel of >>>>> 2.6.5-7.308 and above. The configure patch from >>>>> FED_1.2.5_sles9_sp4_configure.diff has 2.6.5-7.305* as the only valid >>>>> SP4 kernel which is incorrect. I get the same compiler error as >> before. >>>>> >>>>> >>>>> Moshe Kazir wrote: >>>>>> See patches in the attached message. >>>>>> >>>>>> It was applied by Vlad. >>>>>> >>>>>> Moshe >>>>>> >>>>>> ____________________________________________________________ >>>>>> Moshe Katzir | +972-9971-8639 (o) | +972-52-860-6042 (m) >>>>>> >>>>>> Voltaire - The Grid Backbone >>>>>> >>>>>> www.voltaire.com >>>>>> >>>>>> >>>>>> >>>>>> -----Original Message----- >>>>>> From: general-bounces at lists.openfabrics.org >>>>>> [mailto:general-bounces at lists.openfabrics.org] On Behalf Of David >> B. >>>>>> Anderson >>>>>> Sent: Saturday, December 15, 2007 3:31 AM >>>>>> To: general at lists.openfabrics.org; vlad at mellanox.co.il >>>>>> Subject: [ofa-general] [PATCH] LNXI Fixed ofed_scripts configure to >> >>>>>> select sp4 patches for SLES9 kernel with minor versions equal or >>>>>> greater than 305 >>>>>> >>>>>> >>>>>> Hi, >>>>>> >>>>>> I've created the following patch for OFED 1.2.5.4 to have the >>>>>> kernel for >>>>>> >>>>>> SLES9 SP4 recognized (2.6.5-7.308). >>>>>> >>>>>> Even with the patch I then had two back port patches not apply >>>>>> cleanly (cxg3_to_2_6_20.patch, rds_to_2_6_9.patch). 
I hand >>>>>> patched them but now I'm getting the following compiler errors: >>>>>> >>>>>> In file included from >>>>>> /usr/src/linux-2.6.5-7.308/include/linux/module.h:10, >>>>>> from >>>>>> /home/danderson/rpmbuild/BUILD/ofa_kernel/obj/default/kernel_addons >>>>>> /back >>>>>> >>>>>> port/2.6.5_sles9_sp4/include/linux/module.h:4, >>>>>> from >>>>>> /usr/src/linux-2.6.5-7.308/include/linux/device.h:21, >>>>>> from >>>>>> /home/danderson/rpmbuild/BUILD/ofa_kernel/obj/default/kernel_addons >>>>>> /back >>>>>> >>>>>> port/2.6.5_sles9_sp4/include/linux/device.h:4, >>>>>> from >>>>>> /usr/src/linux-2.6.5-7.308/include/linux/netdevice.h:38, >>>>>> from >>>>>> /home/danderson/rpmbuild/BUILD/ofa_kernel/obj/default/kernel_addons >>>>>> /back >>>>>> >>>>>> port/2.6.5_sles9_sp4/include/linux/netdevice.h:4, >>>>>> from >>>>>> /home/danderson/rpmbuild/BUILD/ofa_kernel/obj/default/drivers/infin >>>>>> iband >>>>>> >>>>>> /core/addr.c:32: >>>>>> /home/danderson/rpmbuild/BUILD/ofa_kernel/obj/default/kernel_addons >>>>>> /back >>>>>> >>>>>> port/2.6.5_sles9_sp4/include/linux/sched.h:8: warning: static >>>>>> declaration for `wait_for_completion_timeout' follows non-static >>>>>> /home/danderson/rpmbuild/BUILD/ofa_kernel/obj/default/drivers/infin >>>>>> iband >>>>>> >>>>>> /core/addr.c:67: warning: initialization from incompatible >>>>>> pointer type >>>>>> /home/danderson/rpmbuild/BUILD/ofa_kernel/obj/default/drivers/infin >>>>>> iband >>>>>> >>>>>> /core/addr.c: In function `addr_resolve_remote': >>>>>> /home/danderson/rpmbuild/BUILD/ofa_kernel/obj/default/drivers/infin >>>>>> iband >>>>>> >>>>>> /core/addr.c:192: error: structure has no member named `idev' >>>>>> /home/danderson/rpmbuild/BUILD/ofa_kernel/obj/default/drivers/infin >>>>>> iband >>>>>> >>>>>> /core/addr.c:193: error: structure has no member named `idev' >>>>>> /home/danderson/rpmbuild/BUILD/ofa_kernel/obj/default/drivers/infin >>>>>> iband >>>>>> >>>>>> /core/addr.c:197: error: structure has no member named 
`idev' >>>>>> /home/danderson/rpmbuild/BUILD/ofa_kernel/obj/default/drivers/infin >>>>>> iband >>>>>> >>>>>> /core/addr.c: At top level: >>>>>> /home/danderson/rpmbuild/BUILD/ofa_kernel/obj/default/kernel_addons >>>>>> /back >>>>>> >>>>>> port/2.6.5_sles9_sp4/include/linux/device.h:48: warning: >>>>>> `class_create' defined but not used >>>>>> /home/danderson/rpmbuild/BUILD/ofa_kernel/obj/default/kernel_addons >>>>>> /back >>>>>> >>>>>> port/2.6.5_sles9_sp4/include/linux/device.h:82: warning: >>>>>> `class_destroy' defined but not used >>>>>> /home/danderson/rpmbuild/BUILD/ofa_kernel/obj/default/kernel_addons >>>>>> /back >>>>>> >>>>>> port/2.6.5_sles9_sp4/include/linux/device.h:108: warning: >>>>>> `class_device_create' defined but not used >>>>>> make[6]: *** >>>>>> [/home/danderson/rpmbuild/BUILD/ofa_kernel/obj/default/drivers/infi >>>>>> niban >>>>>> >>>>>> d/core/addr.o] Error 1 >>>>>> make[5]: *** >>>>>> [/home/danderson/rpmbuild/BUILD/ofa_kernel/obj/default/drivers/infi >>>>>> niban >>>>>> >>>>>> d/core] Error 2 >>>>>> make[4]: *** >>>>>> [/home/danderson/rpmbuild/BUILD/ofa_kernel/obj/default/drivers/infi >>>>>> niban >>>>>> >>>>>> d] Error 2 >>>>>> make[3]: *** >>>>>> [_module_/home/danderson/rpmbuild/BUILD/ofa_kernel/obj/default] >>>>>> Error 2 >>>>>> make[2]: *** [modules] Error 2 >>>>>> make[1]: *** [modules] Error 2 >>>>>> make[1]: Leaving directory >>>>>> `/usr/src/linux-2.6.5-7.308-obj/x86_64/default' >>>>>> make: *** [kernel] Error 2 >>>>>> >>>>>> Does anyone have OFED 1.2.5.4 building for SLES 9 SP4? 
>>>>>> >>>>>> Thanks >>>>>> >>>>>> >>>>>> ------------------------------------------------------------------- >>>>>> ----- >>>>>> >>>>>> >>>>>> Subject: >>>>>> [ofa-general] OFED-1.2.5 backport patches for SLES9 SP4 >>>>>> From: >>>>>> "Moshe Kazir" >>>>>> Date: >>>>>> Sun, 25 Nov 2007 09:59:26 +0200 >>>>>> To: >>>>>> "Vladimir Sokolovsky" , >>>>>> >>>>>> >>>>>> To: >>>>>> "Vladimir Sokolovsky" , >>>>>> >>>>>> >>>>>> >>>>>> The attached files do the work. >>>>>> >>>>>> OFED_1.2.5_sles9_sp4_configure.diff include the changes in the >>>>>> configure file. >>>>>> OFED_1.2.5_sles9_sp4_backport.diff include the canges requiered in >> >>>>>> the kernel_patche and kernel_addons directories. >>>>>> >>>>>> Moshe >>>>>> ____________________________________________________________ >>>>>> Moshe Katzir | +972-9971-8639 (o) | +972-52-860-6042 (m) >>>>>> >>>>>> Voltaire - The Grid Backbone >>>>>> >>>>>> www.voltaire.com >>>>>> >>>> Hi David, >>>> Please try the latest OFED-1.2.5.4-20071219-0824.tgz build on your >>>> SLES9SP4. >>>> >>>> http://www.openfabrics.org/builds/connectx/OFED-1.2.5.4-20071219-0824 >>>> .tgz >>>> >>>> >>>> Thanks, >>>> Vladimir >>>> >>> >> >> _______________________________________________ >> general mailing list >> general at lists.openfabrics.org >> http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general >> >> To unsubscribe, please visit >> http://openib.org/mailman/listinfo/openib-general >> > -- David B. Anderson Linux Networx Sr. Software Engineer Email: danderson at lnxi.com Phone: (801) 649-1311 -------------- next part -------------- A non-text attachment was scrubbed... Name: 0001-LNXI-changed-ofed_scripts-configure-to-select-sp4-patc.patch Type: text/x-patch Size: 1139 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: 0002-LNXI-created-patch-kernel_patches-backport-2.6.5_sle.patch Type: text/x-patch Size: 3378 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: 0003-LNXI-fixed-backport-2.6.5_sles9_sp4-rds_to_2_6_9.patch Type: text/x-patch Size: 2208 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: 0004-LNXI-fixed-backport-2.6.5_sles9_sp4-cxg3_to_2_6_20.patch Type: text/x-patch Size: 3185 bytes Desc: not available URL: From hrosenstock at xsigo.com Wed Jan 2 10:55:58 2008 From: hrosenstock at xsigo.com (Hal Rosenstock) Date: Wed, 02 Jan 2008 10:55:58 -0800 Subject: [ofa-general] Re: [PATCH] ib/ipoib: Reduce comparison size in data path In-Reply-To: References: <1199097884.21275.242.camel@mtls03> Message-ID: <1199300158.23289.502.camel@hrosenstock-ws.xsigo.com> On Wed, 2008-01-02 at 10:16 -0800, Roland Dreier wrote: > > In the majority of cases, if the neighbour will change, it will > > be reflected in the guid part of the GID (bytes 8-15). If the GID > > prefix will change as well (bytes 0-7) it will be because the master > > SM has changed, in which case we will get an SM change event resulting > > in all paths flushed. > > Is it guaranteed that an active SM can't change a GID prefix? I think that is left as an exercise left to the IBA spec reader as to changing GID prefix (in PortInfo). This would be part of Change Management which was punted by the IBTA. > Especially if we're using a GID at an index != 0? Do you mean GUID ? -- Hal > In other words, is this change definitely 100 percent safe? > > Also I assume this change is coming from performance tuning. For > patches like this it is always helpful to include hard data like "this > gives a speedup of X on test Y on system Z." > > Thanks... 
> _______________________________________________ > general mailing list > general at lists.openfabrics.org > http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general > > To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general From sean.hefty at intel.com Wed Jan 2 12:00:24 2008 From: sean.hefty at intel.com (Sean Hefty) Date: Wed, 2 Jan 2008 12:00:24 -0800 Subject: [ofa-general] [PATCH] [2.6.25] MAINTAINTERS: update email address Message-ID: <000c01c84d7a$1cc23e60$3c98070a@amr.corp.intel.com> Signed-off-by: Sean Hefty --- My Unix email account is being discontinued at end of Q1 '08. Please queue for 2.6.25 MAINTAINERS | 2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/MAINTAINERS b/MAINTAINERS index cad0882..8185bdd 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -1923,7 +1923,7 @@ INFINIBAND SUBSYSTEM P: Roland Dreier M: rolandd at cisco.com P: Sean Hefty -M: mshefty at ichips.intel.com +M: sean.hefty at intel.com P: Hal Rosenstock M: hal.rosenstock at gmail.com L: general at lists.openfabrics.org From YJia at tmriusa.com Wed Jan 2 12:07:30 2008 From: YJia at tmriusa.com (Yicheng Jia) Date: Wed, 2 Jan 2008 14:07:30 -0600 Subject: [ofa-general] synchronize commands issued to MTHCA In-Reply-To: Message-ID: Hi Roland, Could you tell me what's the difference between "wmb()" and "mmiowb()". I notice that ofa-1.3 has added "mmiowb()" at the end of mthca_cmd_post, since "wmb()" is already called at the end of cmd_post, is "mmiowb()" really necessary? Thanks! Yicheng Roland Dreier 01/02/2008 12:13 PM To Jack Morgenstein cc general at lists.openfabrics.org, Yicheng Jia Subject Re: [ofa-general] synchronize commands issued to MTHCA > Roland, do you think that the memcpy_toio() call might mess things up? I wouldn't think so, although I don't have full details of how your hardware behaves to know for sure. 
I assume your PCI bus/memory controller is already smart enough to deal with HCR writes being interleaved with writes to a doorbell page from userspace, so it seems that writes to locally attached memory should be OK too, as long as the HCR writes are word-sized in the right order etc. > Maybe we need "wmb()" or "mmiowb()" here as well? I don't see any reason, although I often miss things. It seems that the only thing that cares about the writes of the address info being done would be posting a send WQE that uses it, and that should already have sufficient ordering. What would we be ordering things against? - R. _____________________________________________________________________________ Scanned by IBM Email Security Management Services powered by MessageLabs. For more information please visit http://www.ers.ibm.com _____________________________________________________________________________ -------------- next part -------------- An HTML attachment was scrubbed... URL: From rdreier at cisco.com Wed Jan 2 12:54:09 2008 From: rdreier at cisco.com (Roland Dreier) Date: Wed, 02 Jan 2008 12:54:09 -0800 Subject: [ofa-general] synchronize commands issued to MTHCA In-Reply-To: (Yicheng Jia's message of "Wed, 2 Jan 2008 14:07:30 -0600") References: Message-ID: > Could you tell me what's the difference between "wmb()" and "mmiowb()". I > notice that ofa-1.3 has added "mmiowb()" at the end of mthca_cmd_post, > since "wmb()" is already called at the end of cmd_post, is "mmiowb()" > really necessary? wmb() orders writes from the same CPU -- it prevents highly out-of-order architectures from making writes visible in an order different from program order. mmiowb() orders MMIO writes between different CPUs, and prevents systems (such as SGI Altix) where the CPU fabric may reorder writes before they reach the IO bus. The mmiowb() is definitely necessary, because without it then commands were getting messed up on large Altix systems. - R. 
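[Illustrative note] The distinction drawn above between wmb() and mmiowb() can be sketched in a few lines of C. This is only a userspace approximation under stated assumptions, not the mthca driver code: fake_hcr, GO_BIT and post_cmd() are hypothetical names, and the two barrier macros are defined here as plain compiler barriers so that the file builds outside the kernel. In a real driver the barriers come from the architecture headers, the register writes go through the MMIO accessors, and the whole sequence would sit inside the command lock.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Stand-ins so the sketch compiles in userspace. These are NOT the
 * kernel definitions; they only stop the compiler from reordering. */
#define wmb()    atomic_signal_fence(memory_order_seq_cst)
#define mmiowb() atomic_signal_fence(memory_order_seq_cst)

#define GO_BIT (1u << 31)

/* Hypothetical command register block: parameter words first,
 * then one word whose top bit tells the device to start. */
static volatile uint32_t fake_hcr[4];

/* Post one command. Returns 0 on success. */
static int post_cmd(uint32_t in_param, uint32_t opcode)
{
    /* (kernel: take the command lock here) */

    fake_hcr[0] = in_param;
    fake_hcr[1] = opcode;

    /* Same-CPU ordering: the parameter words must become visible
     * before the go bit, even on a highly out-of-order CPU. */
    wmb();

    fake_hcr[3] = GO_BIT | opcode;

    /* Cross-CPU ordering: make sure these MMIO writes reach the IO
     * bus before another CPU, which takes the lock next, can get
     * its own writes there first. */
    mmiowb();

    /* (kernel: release the command lock here) */
    return 0;
}
```

Calling post_cmd(0x1234, 0x0d) leaves the parameter words in fake_hcr[0] and fake_hcr[1] and the go bit plus opcode in fake_hcr[3]. The point of the sketch is the placement: wmb() sits between the parameters and the go write, while mmiowb() sits after the last MMIO write and before the lock is dropped.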
From rdreier at cisco.com Wed Jan 2 12:57:30 2008 From: rdreier at cisco.com (Roland Dreier) Date: Wed, 02 Jan 2008 12:57:30 -0800 Subject: [ofa-general] synchronize commands issued to MTHCA In-Reply-To: (Yicheng Jia's message of "Wed, 2 Jan 2008 11:59:30 -0600") References: Message-ID: > The SW2HW_MPT command is issued while UDAV table is been creating. During > the time that the driver is waiting for the completion of the command, it > does many other things: creating send mad package, posting send mad > request to the SQ and posting another receive mad request to the RQ. > There's no error report for all of these actions. However after it, the > HCA report command parameter error for the SW2HW_MPT. I doubt the problem is creating the UD address vector -- that is just shuffling some things around in the CPU's memory. It seems more likely that posting a send or receive request is messing things up somehow. What is the call chain that calls SW2HW_MPT in this case? Also are you going through the mthca_cmd_post_dbell() or mthca_cmd_post_hcr() code to write the command params to the HCA? I think the best way to debug this would be to work directly with Mellanox to get a debug build of the HCA firmware and get definite info on why the SW2HW_MPT command is failing. - R. From YJia at tmriusa.com Wed Jan 2 13:09:49 2008 From: YJia at tmriusa.com (Yicheng Jia) Date: Wed, 2 Jan 2008 15:09:49 -0600 Subject: [ofa-general] synchronize commands issued to MTHCA In-Reply-To: Message-ID: > I wouldn't think so, although I don't have full details of how your > hardware behaves to know for sure. I assume your PCI bus/memory > controller is already smart enough to deal with HCR writes being > interleaved with writes to a doorbell page from userspace, so it seems > that writes to locally attached memory should be OK too, as long as > the HCR writes are word-sized in the right order etc. For the problem I've seen, most probably the HCR writes mess up with doorbell register rings. 
Is it possible? The FW version I'm using is 1.1.0, without a debug trace function. This problem is really hard to debug since it's real-time and does not occur very often, and it's also hard to hook up a PCIe bus analyzer, since by the time the error happens the PCIe transaction has already completed. All I get from the HCA is a bad parameter error. Is there any way to get more info from the HCA? Thanks! Yicheng Roland Dreier 01/02/2008 12:13 PM To Jack Morgenstein cc general at lists.openfabrics.org, Yicheng Jia Subject Re: [ofa-general] synchronize commands issued to MTHCA > Roland, do you think that the memcpy_toio() call might mess things up? I wouldn't think so, although I don't have full details of how your hardware behaves to know for sure. I assume your PCI bus/memory controller is already smart enough to deal with HCR writes being interleaved with writes to a doorbell page from userspace, so it seems that writes to locally attached memory should be OK too, as long as the HCR writes are word-sized in the right order etc. > Maybe we need "wmb()" or "mmiowb()" here as well? I don't see any reason, although I often miss things. It seems that the only thing that cares about the writes of the address info being done would be posting a send WQE that uses it, and that should already have sufficient ordering. What would we be ordering things against? - R. _____________________________________________________________________________ Scanned by IBM Email Security Management Services powered by MessageLabs. For more information please visit http://www.ers.ibm.com _____________________________________________________________________________ -------------- next part -------------- An HTML attachment was scrubbed...
URL: From YJia at tmriusa.com Wed Jan 2 13:51:26 2008 From: YJia at tmriusa.com (Yicheng Jia) Date: Wed, 2 Jan 2008 15:51:26 -0600 Subject: [ofa-general] synchronize commands issued to MTHCA In-Reply-To: Message-ID: > What is the call chain that calls SW2HW_MPT in this case? SW2HW_MPT is called by the mthca_mr_alloc function. This function first calls "mthca_alloc" to get an MR key, then "mthca_table_get" to get an MR ICM entry, then "mthca_alloc_mailbox" to allocate a mailbox for the command. During this procedure, the MAD completion handler "ib_mad_recv_done_handler" is also running; it processes the MAD_IFC command and sends a response, and these all complete without any error being reported. Also, for your information, I'm using two Dual Core Xeon CPUs to run the driver. > Also are you going through the mthca_cmd_post_dbell() or mthca_cmd_post_hcr() code to write the command params to the HCA? Yes. I found there's a small difference between these two functions. There are two "wmb()" calls in mthca_cmd_post_dbell() but only one "wmb()" in mthca_cmd_post_hcr(). Any particular reason for it? > I think the best way to debug this would be to work directly with Mellanox to get a debug build of the HCA firmware and get definite info on why the SW2HW_MPT command is failing. Do you know whom I should contact? Thanks! Yicheng Roland Dreier 01/02/2008 02:55 PM To Yicheng Jia cc Jack Morgenstein , general at lists.openfabrics.org Subject Re: [ofa-general] synchronize commands issued to MTHCA > The SW2HW_MPT command is issued while UDAV table is been creating. During > the time that the driver is waiting for the completion of the command, it > does many other things: creating send mad package, posting send mad > request to the SQ and posting another receive mad request to the RQ. > There's no error report for all of these actions. However after it, the > HCA report command parameter error for the SW2HW_MPT.
I doubt the problem is creating the UD address vector -- that is just shuffling some things around in the CPU's memory. It seems more likely that posting a send or receive request is messing things up somehow. What is the call chain that calls SW2HW_MPT in this case? Also are you going through the mthca_cmd_post_dbell() or mthca_cmd_post_hcr() code to write the command params to the HCA? I think the best way to debug this would be to work directly with Mellanox to get a debug build of the HCA firmware and get definite info on why the SW2HW_MPT command is failing. - R. _____________________________________________________________________________ Scanned by IBM Email Security Management Services powered by MessageLabs. For more information please visit http://www.ers.ibm.com _____________________________________________________________________________ -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralph.campbell at qlogic.com Wed Jan 2 14:18:53 2008 From: ralph.campbell at qlogic.com (Ralph Campbell) Date: Wed, 02 Jan 2008 14:18:53 -0800 Subject: [ofa-general] [PATCH] IB/core - ib_wr_opcode change to add IB_WR_LSO breaks ib_ipath Message-ID: <1199312334.4280.33.camel@brick.pathscale.com> The ib_ipath driver depends on /usr/include/infiniband/verbs.h enum ibv_wr_opcode matching the kernel's ib_verbs.h enum ib_wr_opcode. The recent change to add IB_WR_LSO breaks this. Now, you may argue that the kernel should not depend on this equivalence but you would then need to define IBV_WR_RDMA_WRITE, etc. in some kernel header file and do a table look up to map from user to kernel opcode values. Since I don't see any other code which depends on the value of IB_WR_LSO, I think the following patch is the right fix. This should be applied to 2.6.24 and 2.6.25. 
Signed-off-by: Ralph Campbell diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h index 14a51b8..b42fafb 100644 --- a/include/rdma/ib_verbs.h +++ b/include/rdma/ib_verbs.h @@ -616,11 +616,11 @@ enum ib_wr_opcode { IB_WR_RDMA_WRITE, IB_WR_RDMA_WRITE_WITH_IMM, IB_WR_SEND, - IB_WR_LSO, IB_WR_SEND_WITH_IMM, IB_WR_RDMA_READ, IB_WR_ATOMIC_CMP_AND_SWP, - IB_WR_ATOMIC_FETCH_AND_ADD + IB_WR_ATOMIC_FETCH_AND_ADD, + IB_WR_LSO }; enum ib_send_flags { From rdreier at cisco.com Wed Jan 2 15:08:48 2008 From: rdreier at cisco.com (Roland Dreier) Date: Wed, 02 Jan 2008 15:08:48 -0800 Subject: [ofa-general] [PATCH] IB/core - ib_wr_opcode change to add IB_WR_LSO breaks ib_ipath In-Reply-To: <1199312334.4280.33.camel@brick.pathscale.com> (Ralph Campbell's message of "Wed, 02 Jan 2008 14:18:53 -0800") References: <1199312334.4280.33.camel@brick.pathscale.com> Message-ID: > This should be applied to 2.6.24 and 2.6.25. IB_WR_LSO is an OFED-only thing. But I agree that this patch is correct for OFED 1.3. - R. From pradeeps at linux.vnet.ibm.com Wed Jan 2 15:18:07 2008 From: pradeeps at linux.vnet.ibm.com (Pradeep Satyanarayana) Date: Wed, 02 Jan 2008 15:18:07 -0800 Subject: [ofa-general] Oops in mthca In-Reply-To: References: <476B04C6.1040803@linux.vnet.ibm.com> Message-ID: <477C1BAF.7050501@linux.vnet.ibm.com> I was out on vacation the last week of December and hence the delayed response. Roland Dreier wrote: > > I discovered the following Oops while developing a patch to enable SRQ on HCAs with fewer than > > 16 SG elements. > > So is this oops with some version of your patch for limited SRQ > scatter entries applied? It's hard to know exactly what is going > wrong but I suspect that if you get a device that allows more than 16 > SRQ scatter entries, your patch passes that value for num_sg without > changing the declaration of rx_sge[] to have enough entries, so when > posting the receive request, the low-level driver goes off the end of > the array. 
rx_sge[] could have been the culprit. I now limit the max_sge to IPOIB_CM_RX_SG, and so that is not an issue any more. The latest patch submitted : http://lists.openfabrics.org/pipermail/general/2007-December/044298.html has this fix. > > > The root of this issue appears to be that ib_query_device(priv->ca, &attr) > > reports an incorrect value for attr.max_srq_sge. The value that > > ib_query_device returns is 28 (instead of 16 that I expected). > > Why do you think the value 28 is incorrect? At one stage ib_query_device() was returning max_srq_sge of 28 and ib_query_srq() returned max_sge of 16. And because of the discrepancy I thought 28 was incorrect. However, now both return 28 and that is probably a moot point. Pradeep From sean.hefty at intel.com Wed Jan 2 16:10:12 2008 From: sean.hefty at intel.com (Sean Hefty) Date: Wed, 2 Jan 2008 16:10:12 -0800 Subject: [ofa-general] peer to peer connections support In-Reply-To: <476ADF5A.6080200@ichips.intel.com> References: <4767A2CD.8030209@voltaire.com><4768289F.6040907@ichips.intel.com> <4769019C.10602@voltaire.com><47696002.4030903@ichips.intel.com> <476A857D.3090608@voltaire.com> <476AB554.9080200@ichips.intel.com><15ddcffd0712201314x2b064f65m3c4cbb6f0fe02a42@mail.gmail.com> <476ADF5A.6080200@ichips.intel.com> Message-ID: <000d01c84d9d$025a0f80$3c98070a@amr.corp.intel.com> >> I understand that under TCP there's also a notion of peer to peer and >> client/server connections, I'll give it a look next week to see what's >> the foundations over there. According to 'TCP/IP Illustrated, Volume 1 the Protocols' this is called "simultaneous open" and is described as "possible, although improbable" to achieve. Maybe some of the newer implementations allow this to work better, but I doubt many apps, including MPI, use it. 
- Sean From sean.hefty at intel.com Wed Jan 2 16:13:40 2008 From: sean.hefty at intel.com (Sean Hefty) Date: Wed, 2 Jan 2008 16:13:40 -0800 Subject: [ofa-general] [ANNOUNCE] librdmacm release 1.0.5 Message-ID: <000e01c84d9d$7e41e0f0$3c98070a@amr.corp.intel.com> I've pushed out release 1.0.5 of the librdmacm. It adds some additional documentation to the man pages only. Please update OFED 1.3 to use this version. - Sean From kliteyn at mellanox.co.il Wed Jan 2 17:26:11 2008 From: kliteyn at mellanox.co.il (kliteyn at mellanox.co.il) Date: 3 Jan 2008 03:26:11 +0200 Subject: [ofa-general] nightly osm_sim report 2008-01-03:normal completion Message-ID: OSM Simulation Regression Summary [Generated mail - please do NOT reply] OpenSM binary date = 2008-01-02 OpenSM git rev = Sat_Dec_29_21:02:49_2007 [f7b47c635d291a4aef38c609028791b4ad1f1259] ibutils git rev = Mon_Dec_24_10:42:01_2007 [675bec82306d6920555dd0b5e2f664983e27e60f] Total=400 Pass=399 Fail=1 Pass: 30 Stability IS1-16.topo 30 Pkey IS1-16.topo 30 OsmTest IS1-16.topo 30 OsmStress IS1-16.topo 30 Multicast IS1-16.topo 30 LidMgr IS1-16.topo 10 Stability IS3-loop.topo 10 Stability IS3-128.topo 10 Pkey IS3-128.topo 10 OsmTest IS3-loop.topo 10 OsmTest IS3-128.topo 10 OsmStress IS3-128.topo 10 Multicast IS3-loop.topo 10 Multicast IS3-128.topo 10 FatTree merge-roots-4-ary-2-tree.topo 10 FatTree merge-root-4-ary-3-tree.topo 10 FatTree gnu-stallion-64.topo 10 FatTree blend-4-ary-2-tree.topo 10 FatTree RhinoDDR.topo 10 FatTree FullGnu.topo 10 FatTree 4-ary-2-tree.topo 10 FatTree 2-ary-4-tree.topo 10 FatTree 12-node-spaced.topo 10 FTreeFail 4-ary-2-tree-missing-sw-link.topo 10 FTreeFail 4-ary-2-tree-links-at-same-rank-2.topo 10 FTreeFail 4-ary-2-tree-links-at-same-rank-1.topo 10 FTreeFail 4-ary-2-tree-diff-num-pgroups.topo 9 LidMgr IS3-128.topo Failures: 1 LidMgr IS3-128.topo From eddiem at sgi.com Wed Jan 2 17:45:06 2008 From: eddiem at sgi.com (Edward Mascarenhas) Date: Wed, 02 Jan 2008 17:45:06 -0800 Subject: [ofa-general] 
Re: [ewg] OFED 1.3 timeline In-Reply-To: <4693BF47.8070700@mellanox.co.il> References: <4693BF47.8070700@mellanox.co.il> Message-ID: <477C3E22.7090006@sgi.com> Hi, What is the expected timeline for the remaining RCs and GA of OFED 1.3? Thanks, Edward From jackm at dev.mellanox.co.il Wed Jan 2 22:23:48 2008 From: jackm at dev.mellanox.co.il (Jack Morgenstein) Date: Thu, 3 Jan 2008 08:23:48 +0200 Subject: [ofa-general] [RFC] XRC -- make receiving XRC QP independent of =?iso-8859-1?q?any=09one_user?= process In-Reply-To: References: <200712311339.41166.jackm@dev.mellanox.co.il> Message-ID: <200801030823.49257.jackm@dev.mellanox.co.il> On Wednesday 02 January 2008 17:26, Tang, Changqing wrote: > If rank B does not register (receiving QP has been created by another rank A on the node), > and sender know B's SRQ number, if sender sends a message to B, can B still receive this > message ?   (I hope, no register, no receive) > Yes, he can still receive this message. The only criterion is that the receiving XRC QP and the receiving XRC SRQ belong to the same domain. The only issue solved by registration is premature qp destruction. - Jack From tziporet at dev.mellanox.co.il Wed Jan 2 23:13:38 2008 From: tziporet at dev.mellanox.co.il (Tziporet Koren) Date: Thu, 03 Jan 2008 09:13:38 +0200 Subject: [ofa-general] [PATCH] IB/core - ib_wr_opcode change to add IB_WR_LSO breaks ib_ipath In-Reply-To: References: <1199312334.4280.33.camel@brick.pathscale.com> Message-ID: <477C8B22.1060104@mellanox.co.il> Roland Dreier wrote: > > This should be applied to 2.6.24 and 2.6.25. > > IB_WR_LSO is an OFED-only thing. But I agree that this patch is > correct for OFED 1.3. 
> > > Can we add this to IB stack for 1.2.5 too Tziporet From eli at mellanox.co.il Wed Jan 2 23:49:58 2008 From: eli at mellanox.co.il (Eli Cohen) Date: Thu, 03 Jan 2008 09:49:58 +0200 Subject: [ofa-general] Re: [PATCH] ib/ipoib: Reduce comparison size in data path In-Reply-To: References: <1199097884.21275.242.camel@mtls03> Message-ID: <1199346598.21275.393.camel@mtls03> On Wed, 2008-01-02 at 10:16 -0800, Roland Dreier wrote: > > In the majority of cases, if the neighbour will change, it will > > be reflected in the guid part of the GID (bytes 8-15). If the GID > > prefix will change as well (bytes 0-7) it will be because the master > > SM has changed, in which case we will get an SM change event resulting > > in all paths flushed. > > Is it guaranteed that an active SM can't change a GID prefix? I know opensm has a fixed, hard coded subnet prefix. > Especially if we're using a GID at an index != 0? I think ipoib uses only the GID from index 0, isn't it? > In other words, is > this change definitely 100 percent safe? It looks safe to me but I wanted to hear other opinions. > > Also I assume this change is coming from performance tuning. For > patches like this it is always helpful to include hard data like "this > gives a speedup of X on test Y on system Z." > > Thanks... Indeed I am working on performance and on the branch I am working on, which is different than main branch, it does makes a slight difference. I am trying to improve the throughput of small (up to 128 bytes) UDP messages where I am CPU bound so everything counts. But I believe that if it is correct than we should use it even if the improvement is not outstanding.
From sashak at voltaire.com Thu Jan 3 00:23:55 2008 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Thu, 3 Jan 2008 08:23:55 +0000 Subject: [ofa-general] [PATCH] infiniband-diags/dump_lfts.sh: adopt DR Path parser Message-ID: <20080103082355.GC19494@sashak.voltaire.com> Adopt DR Path parser to the new ibnetdiscover format, where DR Path is shown as comma separated list. Signed-off-by: Sasha Khapyorsky --- infiniband-diags/scripts/dump_lfts.sh | 5 ++--- 1 files changed, 2 insertions(+), 3 deletions(-) diff --git a/infiniband-diags/scripts/dump_lfts.sh b/infiniband-diags/scripts/dump_lfts.sh index 67a307c..81984af 100755 --- a/infiniband-diags/scripts/dump_lfts.sh +++ b/infiniband-diags/scripts/dump_lfts.sh @@ -23,9 +23,8 @@ done dump_by_dr_path () { for sw_dr in `ibnetdiscover $ca_info -v \ - | sed -ne '/^DR path .* switch /s/^DR path \[\(.*\)\].*$/\1/p' \ - | sed -e 's/\]\[/,/g' \ - | sort -u` ; do + | sed -ne '/^DR path .* switch /s/^DR path \([,|0-9]\+\) ->.*$/\1/p' \ + | sort -u` ; do ibroute $ca_info -D ${sw_dr} done } -- 1.5.3.4.206.g58ba4 From moshek at voltaire.com Thu Jan 3 00:19:07 2008 From: moshek at voltaire.com (Moshe Kazir) Date: Thu, 3 Jan 2008 10:19:07 +0200 Subject: [ofa-general] Issues with compilation of OFED 1.2.5.4 and RHEL 4 U6kernel 2.6.9-67.0.1.ELsmp In-Reply-To: <1199289816.10472.61.camel@debcap.cineca.it> References: <1199289816.10472.61.camel@debcap.cineca.it> Message-ID:
<39C75744D164D948A170E9792AF8E7CAC5ACB2@exil.voltaire.com> That's what I send to Vlad, You have to use the patch files and the new install.pl Or try using OFED-1.2.5 last build from http://www.openfabrics.org/builds/connectx Moshe ____________________________________________________________ Moshe Katzir | +972-9971-8639 (o) | +972-52-860-6042 (m) Voltaire - The Grid Backbone www.voltaire.com -----Original Message----- From: general-bounces at lists.openfabrics.org [mailto:general-bounces at lists.openfabrics.org] On Behalf Of Andrea Capriotti Sent: Wednesday, January 02, 2008 6:01 PM To: general at lists.openfabrics.org Subject: [ofa-general] Issues with compilation of OFED 1.2.5.4 and RHEL 4 U6kernel 2.6.9-67.0.1.ELsmp Hi all, when compiling the latest OFED version (1.2.5.4) on a RHEL 4 U6 (kernel 2.6.9-67.0.1.ELsmp) I get the following error: make[1]: Entering directory `/usr/src/kernels/2.6.9-67.0.1.EL-smp-x86_64' mkdir -p /var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2.5.4/.tmp_versions make -f scripts/Makefile.build obj=/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2.5.4 make -f scripts/Makefile.build obj=/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2.5.4/drivers/infiniband make -f scripts/Makefile.build obj=/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2.5.4/drivers/infiniband/core gcc -Wp,-MD,/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2.5.4/drivers/infiniband/cor e/.addr.o.d -nostdinc -iwithprefix include -D__KERNEL__ -I/var/tmp/OFEDRPM/BUIL D/ofa_kernel-1.2.5.4/include -I/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2.5.4/drivers/infiniband/include -Iinclude -include include/linux/autoconf.h -includ e /var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2.5.4/include/linux/autoconf.h -Wall -Wstrict-prototypes -Wno-trigraphs -fno-strict-aliasing -fno-common -Os -fomit-f rame-pointer -g -Wdeclaration-after-statement -mno-red-zone -mcmodel=kernel -pipe -fno-reorder-blocks -Wno-sign-compare -funit-at-a-time -I/var/tmp/OFE DRPM/BUILD/ofa_kernel-1.2.5.4/include -I/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2.5.4/drivers/infiniband/include 
-I/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2.5.4/dri vers/infiniband/ulp/ipoib -I/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2.5.4/drivers/infiniband/debug -I/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2.5.4/drivers/infiniba nd/hw/cxgb3/core -I/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2.5.4/drivers/net/cxgb3 -I/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2.5.4/net/rds -I/var/tmp/OFEDRPM/BUIL D/ofa_kernel-1.2.5.4/drivers/net/mlx4 -I/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2.5.4/drivers/infiniband/hw/mlx4 -DMODULE -DKBUILD_BASENAME=addr -DKBUILD_MODN AME=ib_addr -c -o /var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2.5.4/drivers/infiniband/core/.tmp_a ddr.o /var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2.5.4/drivers/infiniband/ core/addr.c In file included from /var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2.5.4/drivers/infiniband/core/addr.c :32: include/linux/inetdevice.h:50: error: field `mr_gq_timer' has incomplete type include/linux/inetdevice.h:51: error: field `mr_ifc_timer' has incomplete type include/linux/inetdevice.h:95: error: `IFNAMSIZ' undeclared here (not in a function) include/linux/inetdevice.h: In function `__in_dev_get_rcu': include/linux/inetdevice.h:142: error: dereferencing pointer to incomplete type include/linux/inetdevice.h: In function `in_dev_get': include/linux/inetdevice.h:154: error: dereferencing pointer to incomplete type include/linux/inetdevice.h: In function `__in_dev_get': include/linux/inetdevice.h:164: error: dereferencing pointer to incomplete type /var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2.5.4/drivers/infiniband/core/addr.c : At top level: /var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2.5.4/drivers/infiniband/core/addr.c :62: warning: type defaults to `int' in declaration of `DECLARE_DELAYED_WORK' /var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2.5.4/drivers/infiniband/core/addr.c :62: warning: parameter names (without types) in function declaration /var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2.5.4/drivers/infiniband/core/addr.c : In function `set_timeout': /var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2.5.4/drivers/infiniband/core/addr.c 
:127: error: `work' undeclared (first use in this function) /var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2.5.4/drivers/infiniband/core/addr.c :127: error: (Each undeclared identifier is reported only once /var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2.5.4/drivers/infiniband/core/addr.c :127: error: for each function it appears in.) /var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2.5.4/drivers/infiniband/core/addr.c : At top level: /var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2.5.4/drivers/infiniband/core/addr.c :218: warning: 'process_req' defined but not used /var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2.5.4/drivers/infiniband/core/addr.c :62: warning: 'DECLARE_DELAYED_WORK' declared `static' but never defined make[4]: *** [/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2.5.4/drivers/infiniband/core/addr. o] Error 1 make[3]: *** [/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2.5.4/drivers/infiniband/core] Error 2 make[2]: *** [/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2.5.4/drivers/infiniband] Error 2 make[1]: *** [_module_/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2.5.4] Error 2 make[1]: Leaving directory `/usr/src/kernels/2.6.9-67.0.1.EL-smp-x86_64' make: *** [kernel] Error 2 error: Bad exit status from /var/tmp/rpm-tmp.92720 (%install) No problem with RHEL 4 U5 (kernel 2.6.9-55.0.12.ELsmp). Any idea? Best Regards -- Andrea Capriotti System Management Group - Cineca - www.cineca.it a.capriotti at cineca.it - Tel +39 051 6171890 _______________________________________________ general mailing list general at lists.openfabrics.org http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general -------------- next part -------------- An embedded message was scrubbed... From: "Moshe Kazir" Subject: ofed-1.2.5 - RH 4 U 6 bacport files Date: Mon, 17 Dec 2007 11:11:23 +0200 Size: 403949 URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: install.pl Type: application/octet-stream Size: 147811 bytes Desc: install.pl URL: From sashak at voltaire.com Thu Jan 3 00:30:39 2008 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Thu, 3 Jan 2008 08:30:39 +0000 Subject: [ofa-general] [PATCH] infiniband-diags/dump_lfts.sh: fix switch DR Path parser In-Reply-To: <20080103082355.GC19494@sashak.voltaire.com> References: <20080103082355.GC19494@sashak.voltaire.com> Message-ID: <20080103083039.GD19494@sashak.voltaire.com> It is highly possible that ibnetdiscover finds the same switches at different paths. In this case the current version of the script will dump LFTs multiple times for the same switches. This patch fixes this bug - all discovered paths are sorted and unified by switch GUIDs. Signed-off-by: Sasha Khapyorsky --- infiniband-diags/scripts/dump_lfts.sh | 5 +++-- 1 files changed, 3 insertions(+), 2 deletions(-) diff --git a/infiniband-diags/scripts/dump_lfts.sh b/infiniband-diags/scripts/dump_lfts.sh index 81984af..ebca705 100755 --- a/infiniband-diags/scripts/dump_lfts.sh +++ b/infiniband-diags/scripts/dump_lfts.sh @@ -23,8 +23,9 @@ done dump_by_dr_path () { for sw_dr in `ibnetdiscover $ca_info -v \ - | sed -ne '/^DR path .* switch /s/^DR path \([,|0-9]\+\) ->.*$/\1/p' \ - | sort -u` ; do + | sed -ne '/^DR path .* switch /s/^DR path \([,|0-9]\+\) ->.*{\([0-9|a-f]\+\)}.*$/\2 \1/p' \ + | sort -u \ + | awk 'BEGIN {guid=0;} {if ($1 != guid) { guid=$1; print $2; }}'` ; do ibroute $ca_info -D ${sw_dr} done } -- 1.5.3.4.206.g58ba4 From tomof at acm.org Thu Jan 3 00:30:20 2008 From: tomof at acm.org (FUJITA Tomonori) Date: Thu, 3 Jan 2008 17:30:20 +0900 Subject: [ofa-general] Re: list corruption on ib_srp load in v2.6.24-rc5 In-Reply-To: References: <20071223014407L.tomof@acm.org> <1198689251.25003.2.camel@lap75545.ornl.gov> Message-ID: <20080103173330T.tomof@acm.org> On Wed, 02 Jan 2008 09:51:38 -0800 Roland Dreier wrote: > > > Can you try this? 
> > > > That patched oopsed in scsi_remove_host(), but reversing the order has > > survived over 500 insert/probe/remove cycles. > > > > Tested-by: David Dillow > > --- > > diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c > > index 950228f..77e8b90 100644 > > --- a/drivers/infiniband/ulp/srp/ib_srp.c > > +++ b/drivers/infiniband/ulp/srp/ib_srp.c > > @@ -2054,6 +2054,7 @@ static void srp_remove_one(struct ib_device *device) > > list_for_each_entry_safe(target, tmp_target, > > &host->target_list, list) { > > scsi_remove_host(target->scsi_host); > > + srp_remove_host(target->scsi_host); > > srp_disconnect_target(target); > > Where do we stand on this? What is the right place to put the > srp_remove_host? Is there a bug somewhere else? {sas|fc}_remove_host is called before scsi_remove_host. And in srp_remove_work(), we call srp_remove_host and then scsi_remove_host. ibmvscsi also calls them in that order. I thought that I messed up something in srp_transport_class. But I can't figure out what's wrong. The above patch works and is unlikely to lead to critical problems so I'm fine with it for now. > I'd like to get this fixed before 2.6.24 final comes out... Yeah, it should be fixed. From ogerlitz at voltaire.com Thu Jan 3 00:45:36 2008 From: ogerlitz at voltaire.com (Or Gerlitz) Date: Thu, 03 Jan 2008 10:45:36 +0200 Subject: [ofa-general] [PATCH] IB/core - ib_wr_opcode change to add IB_WR_LSO breaks ib_ipath In-Reply-To: <1199312334.4280.33.camel@brick.pathscale.com> References: <1199312334.4280.33.camel@brick.pathscale.com> Message-ID: <477CA0B0.5070106@voltaire.com> Ralph Campbell wrote: > The ib_ipath driver depends on /usr/include/infiniband/verbs.h > enum ibv_wr_opcode matching the kernel's ib_verbs.h > enum ib_wr_opcode. The recent change to add IB_WR_LSO breaks this. > > Now, you may argue that the kernel should not depend on this equivalence > but you would then need to define IBV_WR_RDMA_WRITE, etc. 
in some > kernel header file and do a table look up to map from user to > kernel opcode values. Since I don't see any other code which depends > on the value of IB_WR_LSO, I think the following patch is the right > fix. > > This should be applied to 2.6.24 and 2.6.25. SO in a way, putting IB_WR_LSO where it was broke the ABI for libibpath. Eli, Can you please apply this change to the patch set which you consider as candidate for upstream merging? Or. From sashak at voltaire.com Thu Jan 3 01:00:59 2008 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Thu, 3 Jan 2008 09:00:59 +0000 Subject: [ofa-general] [ANNOUNCE] management tarballs release In-Reply-To: <20071119214435.GR5986@sashak.voltaire.com> References: <20071119214435.GR5986@sashak.voltaire.com> Message-ID: <20080103090059.GF19494@sashak.voltaire.com> Hi, There is a new release of the management (OpenSM and infiniband diagnostics) tarballs available in: http://www.openfabrics.org/downloads/management/ md5sum: df15dde16ee5b28c25affa2dedaa3d0f libibcommon-1.0.7.tar.gz 124f9f5c3e0c28afb86bb8bbe4a60ecd libibumad-1.1.6.tar.gz 047a6d9f834d7012185e5a28f0f6141f libibmad-1.1.5.tar.gz 53675686246f0cf92853358483f2d0f3 opensm-3.1.8.tar.gz 892dab82b783a4ff2063a438aa93418f infiniband-diags-1.3.5.tar.gz Complete list of changes since last release is below: Sasha Al Chu (6): osm cache file extra info patch OpenSM: Fix error return corner case OpenSM: Fix comment typo. OpenSM: osm routing engine type OpenSM: Fix incorrect reporting of routing engine used OpenSM: Fix incorrect identification of routing engine used Albert L. 
Chu (2): support minhop as a routing engine input change 'min-hop' to 'minhop' for consistency with routing engine input Erez Strauss (1): ibnetdiscover - ports report Hal Rosenstock (3): OpenSM/libvendor/osm_vendor_ibumad.c: Make error code in osm_log message unique libibmad/dump.c: Use bit mask approach to decoding LinkWidth/Speed Enabled/Supported opensm/libvendor/osm_vendor_ibumad_sa.c: In __osmv_sa_mad_rcv_cb, handle attribute offset of 0 Ira Weiny (5): opensm/include/opensm/osm_event_plugin.h: fix comment opensm/opensm/osm_event_plugin.c: clean up version check error message a bit opensm/opensm/osm_event_plugin.c: remove duplicate header include opensm/opensm/osm_perfmgr_db.c: fix clearing previous count when "out of band" opensm: Add "perfmgr print_counters node" to the console to print individual values Rolf Manderscheid (1): opensm: allow multiple scopes in a partition Sasha Khapyorsky (50): opensm: move vendor specific header files to include/vendor opensm: remove unused flag opensm: move IBA constants from osm_sa_mcmember_record.h to ib_types.h opensm: move OpenSM constants from osm_sa_mcmember-record.h to osm_base.h opensm: cosmetic fixes opensm: make osm_pkey_get_tables static opensm: remove testability_mode option libibumad: fix memory leak infiniband-diags/saquery: add get_any_records() function infinibad-diags/saquery: move lid resolving functions infiniband-diags/saquery: LinkRecord query support infiniband-diags/saquery: allow empty src and/or dst with --src-to-dst option infiniband-diags/man: add -x option to saquery man page opensm: fix lmc_mask bit order in osm_sa_link_record.c opensm: don't break name_map using when routing_engine was not found. 
opensm: remove unused osm_port_lid_category_t enum opensm: minor cleaning opensm/lash: fix wrong allocation size opensm/lash: cosmetic opensm/osm_ucast_updn.c: indentation fixes opensm/libvendor: indentation fixes infiniband-diags/ibcheckerrors: for CAs query only single ports opensm/config/osmvsel.m4: update LDADD variable, not LDFLAGS infiniband-diags/ibcheckerrors: fix port errors count infiniband-diags/scripts: fix perfquery usage opensm: don't zero base LID when invalid value is received opensm: remove old style code formatters opensm/Makefile: remove opensm_CXXFLAGS opensm: recover only for base LID values >= 0xc000 opensm/osm_port_info_rcv: node instead of port as parameter for osm_pi_rcv_process_set() opensm/osm_node_new: move p_node->print_desc setup libibmad: initialize sm portid in ib_resolve_smlid() libibcommon: fix overflow in debug/log prints opensm: rename __osm_epi_plugin_t to osm_event_plugin_t opensm/osm_event_plugin.h: add names to structures opensm: remove useless osm_node_get_remote_type() opensm: indentation fixes manangement: kill __WORDSIZE macro checks complib: make __cl_thread_wrapper() static opensm: make some functions static opensm/osm_helper: make some functions static opensm: make some functions statics opensm: some micro-optimizations opensm/updn: report fallback properly opensm/updn: rename __osm_subn_calc_up_down_min_hop_table() opensm: Revert "opensm/osm_pkey_mgr.c: setting only outbound partition enforcement on switch" opensm: mcast mgr improvements infiniband-diags/dump_lfts.sh: adopt DR Path parser infiniband-diags/dump_lfts.sh: fix switch DR Path parser management: bump versions Yevgeny Kliteynik (10): opensm: Remove unnecessary ntoh and hton conversions in LinkRecord processing opensm: adding missing comparison by to_lid/from_lid in LinkRecord processing opensm: Fixing broken logic in 'process world' part of LinkRecord processing opensm: printing to stderr note about error in QoS policy file opensm: trivial change of log 
message opensm: fixing coredump in QoS policy pkey validation opensm: QoS policy - fixing pkey range implementation opensm/osm_pkey_mgr.c: trivial fix in log message opensm/osm_pkey_mgr.c: setting only outbound partition enforcement on switch opensm: osm_state_mgr.c - stop idle queue processing if heavy sweep requested

From ishai at mellanox.co.il Thu Jan 3 00:59:11 2008
From: ishai at mellanox.co.il (Ishai Rabinovitz)
Date: Thu, 3 Jan 2008 10:59:11 +0200
Subject: [ofa-general] [RFC] XRC -- make receiving XRC QP independent of any one user process
In-Reply-To:
Message-ID: <6C2C79E72C305246B504CBA17B5500C903039083@mtlexch01.mtl.com>

Please see my comments (prefix [Ishai])

-----Original Message-----
From: Tang, Changqing [mailto:changquing.tang at hp.com]
Sent: Wednesday, January 02, 2008 17:27
To: Jack Morgenstein; Pavel Shamis
Cc: Ishai Rabinovitz; Gleb Natapov; Roland Dreier; general at lists.openfabrics.org
Subject: RE: [ofa-general] [RFC] XRC -- make receiving XRC QP independent of any one user process

This interface is OK for me.
Now, every rank on a node who wants to receive message from the same remote rank must know the same receiving QP number, and register for receiving using this QP number. If rank B does not register (receiving QP has been created by another rank A on the node), and sender know B's SRQ number, if sender sends a message to B, can B still receive this message ? (I hope, no register, no receive) [Ishai] I guess that from the MPI layer prospective, the sender can not know B's SRQ number until it ask B to give it to him. So B can register to this QP before sending the SRQ number. I hope to know the opinion from other MPI team, or other XRC user. [Ishai] We already discussed this issues with Open MPI IB group, and it looks fine to them. I'm sending this mail to Prof. Panda, so he can comment on it as well. --CQ > -----Original Message----- > From: Jack Morgenstein [mailto:jackm at dev.mellanox.co.il] > Sent: Monday, December 31, 2007 5:40 AM > To: pasha at mellanox.co.il > Cc: ishai at mellanox.co.il; Gleb Natapov; Roland Dreier; Tang, > Changqing; general at lists.openfabrics.org > Subject: Re: [ofa-general] [RFC] XRC -- make receiving XRC QP > independent of any one user process > > > Tang, Changqing wrote: > > > If I have a MPI server processes on a node, many > other MPI > > > client processes will dynamically connect/disconnect with the > > > server. The server use same XRC domain. > > > > > > Will this cause accumulating the "kernel" QP for such > > > application ? we want the server to run 365 days a year. > > > > > > I have some question about the scenario above. Did you > call for the > > > mpi disconnect on the both ends (server/client) before the client > > > exit (did we must to do it?) > > > > Yes, both ends will call disconnect. But for us, > MPI_Comm_disconnect() > > call is not a collective call, it is just a local operation. > > > > --CQ > > > Possible solution (internal review as yet): > > Each user process registers with the XRC QP: > a. 
each process registers ONCE. If it registers multiple times, > there is no reference increment -- > rather the registration succeeds, but only one PID entry is > kept per QP. > b. Can have cleanup in the event of a process dying suddenly. > c. QP cannot be destroyed while there are any user processes still > registered with it. > > libibverbs API is as follows: > > ============================================================== > ======================== > /** > * ibv_xrc_rcv_qp_alloc - creates an XRC QP for serving as a > receive-side only QP, > * and moves the created qp through the RESET->INIT and > INIT->RTR transitions. > * (The RTR->RTS transition is not needed, since this QP > does no sending). > * The sending XRC QP uses this QP as destination, while > specifying an XRC SRQ > * for actually receiving the transmissions and > generating all completions on the > * receiving side. > * > * This QP is created in kernel space, and persists > until the last process registered > * for the QP calls ibv_xrc_rcv_qp_unregister() (at > which time the QP is destroyed). > * > * @pd: protection domain to use. At lower layer, this provides > access to userspace obj > * @xrc_domain: xrc domain to use for the QP. > * @attr: modify-qp attributes needed to bring the QP to RTR. > * @attr_mask: bitmap indicating which attributes are provided in the > attr struct. > * used for validity checking. > * @xrc_rcv_qpn: qp_num of created QP (if success). To be passed to > the remote node (sender). > * The remote node will use xrc_rcv_qpn in > ibv_post_send when sending to > * XRC SRQ's on this host in the same xrc domain. > * > * RETURNS: success (0), or a (negative) error value. > * > * NOTE: this verb also registers the calling user-process with the QP > at its creation time > * (implicit call to ibv_xrc_rcv_qp_register), to avoid > race conditions. > * The creating process will need to call > ibv_xrc_qp_unregister() for the QP to release it from > * this process. 
> */ > > int ibv_xrc_rcv_qp_alloc(struct ibv_pd *pd, > struct ibv_xrc_domain *xrc_domain, > struct ibv_qp_attr *attr, > enum ibv_qp_attr_mask attr_mask, > uint32_t *xrc_rcv_qpn); > > ===================================================================== > > /** > * ibv_xrc_rcv_qp_register: registers a user process with an XRC QP > which serves as > * a receive-side only QP. > * > * @xrc_domain: xrc domain the QP belongs to (for verification). > * @xrc_qp_num: The (24 bit) number of the XRC QP. > * > * RETURNS: success (0), > * or error (-EINVAL), if: > * 1. There is no such QP_num allocated. > * 2. The QP is allocated, but is not an receive XRC QP > * 3. The XRC QP does not belong to the given domain. > */ > int ibv_xrc_rcv_qp_register(struct ibv_xrc_domain *xrc_domain, > uint32_t xrc_qp_num); > > ===================================================================== > /** > * ibv_xrc_rcv_qp_unregister: detaches a user process from an XRC QP > serving as > * a receive-side only QP. If as a result, there are > no remaining userspace processes > * registered for this XRC QP, it is destroyed. > * > * @xrc_domain: xrc domain the QP belongs to (for verification). > * @xrc_qp_num: The (24 bit) number of the XRC QP. > * > * RETURNS: success (0), > * or error (-EINVAL), if: > * 1. There is no such QP_num allocated. > * 2. The QP is allocated, but is not an XRC QP > * 3. The XRC QP does not belong to the given domain. > * NOTE: I don't see any reason to return a special code if the QP is > destroyed -- the unregister simply > * succeeds. > */ > int ibv_xrc_rcv_qp_unregister(struct ibv_xrc_domain *xrc_domain, > uint32_t xrc_qp_num); > ============================================================== > =============================== > > Usage: > > 1. Sender creates an XRC QP (sending QP) 2. Sender sends some > receiving process on a remote node (say R1) a request to provide an > XRC QP and XRC SRQ for > receiving messages (the request includes the sending QP number). > 3. 
R1 calls ibv_xrc_rcv_qp_alloc() to create a receiving XRC QP in > kernel space, and move > that QP up to RTR state. This function also registers process R1 > with the XRC QP. > 4. R1 calls ibv_create_xrc_srq() to create an SRQ for receive messages > via the just created XRC QP. > 5. R1 responds to request, providing the XRC qp number, and XRC SRQ > number to be used in communication. > 6. Sender then may wish to communicate with another receiving process > on the remote host (say R2). > it sends a request to R2 containing the remote XRC QP number > (obtained from R1) > which it will use to send messages. > 7. R2 creates an XRC SRQ (if one does not already exist for the > domain), and also > calls ibv_xrc_rcv_qp_register() to register the process R2 with the > XRC QP created by R1. > 8. If R1 no longer needs to communicate with the sender, it calls > ibv_xrc_rcv_qp_unregister() for the QP. > The QP will not yet be destroyed, since R2 is still registered with > it. > 9. If R2 no longer needs to communicate with the sender, it calls > ibv_xrc_rcv_qp_unregister() for the QP. > At this point, the QP is destroyed, since no processes remain > registered with it. > > NOTES: > 1. The problem of the QP being destroyed and quickly re-allocated does > not exist -- the upper bits of the > QP number are incremented at each allocation (except for the MSB > which is always 1 for XRC QPs). Thus, > even if the same QP is re-allocated, its QP number (stored in the > QP object) will be different than > expected (unless it is re-destroyed/re-allocated several hundred > times). > > 2. With this model, we do not need a heartbeat: if a receiving process > dies, all XRC QPs it has registered for will > be unregistered as part of process cleanup in kernel space. 
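The registration rules in the usage steps and notes above can be illustrated with a small user-space model. This is only a sketch of the semantics as described in this thread, not the proposed libibverbs or kernel code: every name here (xrc_model_*, struct model_qp, the table sizes, the plain counter used for QP numbers) is hypothetical, and returning -EINVAL when an unregistered process calls unregister is an assumption the proposal does not spell out.

```c
/* Toy model of "register once, destroy on last unregister".
 * Hypothetical names throughout; no libibverbs calls are made. */
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define MAX_QPS   4	/* arbitrary table sizes for the sketch */
#define MAX_PROCS 4

struct model_qp {
	int in_use;		/* 0 once the QP has been "destroyed" */
	uint32_t qpn;
	int pids[MAX_PROCS];	/* at most one entry per process */
	int npids;
};

static struct model_qp qps[MAX_QPS];
static uint32_t next_qpn = 1;	/* simplification: plain counter */

static struct model_qp *find_qp(uint32_t qpn)
{
	for (int i = 0; i < MAX_QPS; i++)
		if (qps[i].in_use && qps[i].qpn == qpn)
			return &qps[i];
	return 0;
}

/* Registering twice succeeds but keeps a single PID entry (no refcount). */
int xrc_model_register(uint32_t qpn, int pid)
{
	struct model_qp *qp = find_qp(qpn);

	if (!qp)
		return -EINVAL;
	for (int i = 0; i < qp->npids; i++)
		if (qp->pids[i] == pid)
			return 0;
	if (qp->npids == MAX_PROCS)
		return -ENOMEM;
	qp->pids[qp->npids++] = pid;
	return 0;
}

/* Allocation implicitly registers the creating process. */
int xrc_model_alloc(int pid, uint32_t *qpn)
{
	for (int i = 0; i < MAX_QPS; i++) {
		if (!qps[i].in_use) {
			qps[i].in_use = 1;
			qps[i].qpn = next_qpn++;
			qps[i].npids = 0;
			*qpn = qps[i].qpn;
			return xrc_model_register(*qpn, pid);
		}
	}
	return -ENOMEM;
}

/* The QP is destroyed when the last registered process leaves. */
int xrc_model_unregister(uint32_t qpn, int pid)
{
	struct model_qp *qp = find_qp(qpn);

	if (!qp)
		return -EINVAL;
	for (int i = 0; i < qp->npids; i++) {
		if (qp->pids[i] == pid) {
			qp->pids[i] = qp->pids[--qp->npids];
			if (qp->npids == 0)
				qp->in_use = 0;
			return 0;
		}
	}
	return -EINVAL;	/* assumption: unknown PID is an error */
}

int xrc_model_qp_exists(uint32_t qpn)
{
	return find_qp(qpn) != 0;
}

/* Walks through steps 3 and 7-9: R1 allocates, R2 joins, both leave. */
int xrc_model_demo(void)
{
	enum { R1 = 100, R2 = 200 };
	uint32_t qpn;

	if (xrc_model_alloc(R1, &qpn)) return 1;
	if (xrc_model_register(qpn, R2)) return 2;
	if (xrc_model_register(qpn, R2)) return 3;	/* repeat is a no-op */
	if (xrc_model_unregister(qpn, R1)) return 4;
	if (!xrc_model_qp_exists(qpn)) return 5;	/* R2 still holds it */
	if (xrc_model_unregister(qpn, R2)) return 6;
	if (xrc_model_qp_exists(qpn)) return 7;		/* last one out */
	if (xrc_model_register(qpn, R2) != -EINVAL) return 8;
	return 0;
}
```

Calling xrc_model_demo() from a main() returns 0 when the model behaves as the steps describe. Cleanup on sudden process death (note 2) would amount to calling xrc_model_unregister() for that PID on every QP it is registered with.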
> > - Jack > > From sashak at voltaire.com Thu Jan 3 01:41:52 2008 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Thu, 3 Jan 2008 09:41:52 +0000 Subject: [ofa-general] [PATCH 0/3] opensm: use malloc instead of cl_qlock_pool Message-ID: <11993533151272-git-send-email-sashak@voltaire.com> As was shown in the thread http://lists.openfabrics.org/pipermail/general/2007-December/043806.html standard malloc/free primitives have much better performance and simpler usage than cl_qlock_pool allocator used in OpenSM. Those patches are only for master and don't target current OFED-1.3 release. Sasha From sashak at voltaire.com Thu Jan 3 01:41:54 2008 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Thu, 3 Jan 2008 09:41:54 +0000 Subject: [ofa-general] [PATCH 2/3] opensm: use malloc instead of cl_qlock_pool in osm_mad_pool.c In-Reply-To: <11993533151272-git-send-email-sashak@voltaire.com> References: <11993533151272-git-send-email-sashak@voltaire.com> Message-ID: <1199353316726-git-send-email-sashak@voltaire.com> Use regular malloc/free instead of cl_qlock_pool allocator in osm_mad_pool.c. malloc() is more than twice faster than cl_qlock_pool analogs and using this doesn't require any locking. Signed-off-by: Sasha Khapyorsky --- opensm/include/opensm/osm_mad_pool.h | 7 +--- opensm/include/opensm/osm_madw.h | 80 +++------------------------------- opensm/opensm/osm_mad_pool.c | 56 ++++-------------------- opensm/opensm/osm_vl15intf.c | 6 +- 4 files changed, 20 insertions(+), 129 deletions(-) diff --git a/opensm/include/opensm/osm_mad_pool.h b/opensm/include/opensm/osm_mad_pool.h index 9ec0a7a..b8421b9 100644 --- a/opensm/include/opensm/osm_mad_pool.h +++ b/opensm/include/opensm/osm_mad_pool.h @@ -1,5 +1,5 @@ /* - * Copyright (c) 2004, 2005 Voltaire, Inc. All rights reserved. + * Copyright (c) 2004-2007 Voltaire, Inc. All rights reserved. * Copyright (c) 2002-2005 Mellanox Technologies LTD. All rights reserved. * Copyright (c) 1996-2003 Intel Corporation. 
All rights reserved. * @@ -50,7 +50,6 @@ #include #include -#include #include #include #include @@ -97,7 +96,6 @@ BEGIN_C_DECLS */ typedef struct _osm_mad_pool { osm_log_t *p_log; - cl_qlock_pool_t madw_pool; atomic32_t mads_out; } osm_mad_pool_t; /* @@ -105,9 +103,6 @@ typedef struct _osm_mad_pool { * p_log * Pointer to the log object. * -* lock -* Spinlock guarding the pool. -* * mads_out * Running total of the number of MADs outstanding. * diff --git a/opensm/include/opensm/osm_madw.h b/opensm/include/opensm/osm_madw.h index d4bcbc1..31707ad 100644 --- a/opensm/include/opensm/osm_madw.h +++ b/opensm/include/opensm/osm_madw.h @@ -409,7 +409,7 @@ typedef struct _osm_mad_addr { * SYNOPSIS */ typedef struct _osm_madw { - cl_pool_item_t pool_item; + cl_list_item_t list_item; osm_bind_handle_t h_bind; osm_vend_wrap_t vend_wrap; osm_mad_addr_t mad_addr; @@ -423,8 +423,8 @@ typedef struct _osm_madw { } osm_madw_t; /* * FIELDS -* pool_item -* List linkage for pools and lists. MUST BE FIRST MEMBER! +* list_item +* List linkage for lists. MUST BE FIRST MEMBER! * * h_bind * Bind handle for the port on which this MAD will be sent @@ -467,72 +467,6 @@ typedef struct _osm_madw { * SEE ALSO *********/ -/****f* OpenSM: MAD Wrapper/osm_madw_construct -* NAME -* osm_madw_construct -* -* DESCRIPTION -* This function constructs a MAD Wrapper object. -* -* SYNOPSIS -*/ -static inline void osm_madw_construct(IN osm_madw_t * const p_madw) -{ - /* - Don't touch the pool_item since that is an opaque object. - Clear all other objects in the mad wrapper. - */ - memset(((uint8_t *) p_madw) + sizeof(cl_pool_item_t), 0, - sizeof(*p_madw) - sizeof(cl_pool_item_t)); -} - -/* -* PARAMETERS -* p_madw -* [in] Pointer to a MAD Wrapper object to construct. -* -* RETURN VALUE -* This function does not return a value. -* -* NOTES -* Allows calling osm_madw_init, osm_madw_destroy -* -* Calling osm_madw_construct is a prerequisite to calling any other -* method except osm_madw_init. 
-* -* SEE ALSO -* MAD Wrapper object, osm_madw_init, osm_madw_destroy -*********/ - -/****f* OpenSM: MAD Wrapper/osm_madw_destroy -* NAME -* osm_madw_destroy -* -* DESCRIPTION -* The osm_madw_destroy function destroys a node, releasing -* all resources. -* -* SYNOPSIS -*/ -void osm_madw_destroy(IN osm_madw_t * const p_madw); -/* -* PARAMETERS -* p_madw -* [in] Pointer to a MAD Wrapper object to destroy. -* -* RETURN VALUE -* This function does not return a value. -* -* NOTES -* Performs any necessary cleanup of the specified MAD Wrapper object. -* Further operations should not be attempted on the destroyed object. -* This function should only be called after a call to osm_madw_construct or -* osm_madw_init. -* -* SEE ALSO -* MAD Wrapper object, osm_madw_construct, osm_madw_init -*********/ - /****f* OpenSM: MAD Wrapper/osm_madw_init * NAME * osm_madw_init @@ -548,7 +482,7 @@ osm_madw_init(IN osm_madw_t * const p_madw, IN const uint32_t mad_size, IN const osm_mad_addr_t * const p_mad_addr) { - osm_madw_construct(p_madw); + memset(p_madw, 0, sizeof(*p_madw)); p_madw->h_bind = h_bind; p_madw->fail_msg = CL_DISP_MSGID_NONE; p_madw->mad_size = mad_size; @@ -602,7 +536,7 @@ static inline ib_smp_t *osm_madw_get_smp_ptr(IN const osm_madw_t * const p_madw) * NOTES * * SEE ALSO -* MAD Wrapper object, osm_madw_construct, osm_madw_destroy +* MAD Wrapper object *********/ /****f* OpenSM: MAD Wrapper/osm_madw_get_sa_mad_ptr @@ -631,7 +565,7 @@ static inline ib_sa_mad_t *osm_madw_get_sa_mad_ptr(IN const osm_madw_t * * NOTES * * SEE ALSO -* MAD Wrapper object, osm_madw_construct, osm_madw_destroy +* MAD Wrapper object *********/ /****f* OpenSM: MAD Wrapper/osm_madw_get_perfmgt_mad_ptr @@ -657,7 +591,7 @@ static inline ib_perfmgt_mad_t *osm_madw_get_perfmgt_mad_ptr(IN const osm_madw_t * NOTES * * SEE ALSO -* MAD Wrapper object, osm_madw_construct, osm_madw_destroy +* MAD Wrapper object *********/ /****f* OpenSM: MAD Wrapper/osm_madw_get_ni_context_ptr diff --git 
a/opensm/opensm/osm_mad_pool.c b/opensm/opensm/osm_mad_pool.c index c3f3f2a..f9ef54c 100644 --- a/opensm/opensm/osm_mad_pool.c +++ b/opensm/opensm/osm_mad_pool.c @@ -1,5 +1,5 @@ /* - * Copyright (c) 2004-2006 Voltaire, Inc. All rights reserved. + * Copyright (c) 2004-2007 Voltaire, Inc. All rights reserved. * Copyright (c) 2002-2005 Mellanox Technologies LTD. All rights reserved. * Copyright (c) 1996-2003 Intel Corporation. All rights reserved. * @@ -56,24 +56,6 @@ #include #include -#define OSM_MAD_POOL_MIN_SIZE 256 -#define OSM_MAD_POOL_GROW_SIZE 256 - -/********************************************************************** - **********************************************************************/ -cl_status_t -__osm_mad_pool_ctor(IN void *const p_object, - IN void *context, OUT cl_pool_item_t ** const pp_pool_item) -{ - osm_madw_t *p_madw = p_object; - - UNUSED_PARAM(context); - osm_madw_construct(p_madw); - /* CHECK THIS. DOCS DON'T DESCRIBE THIS OUT PARAM. */ - *pp_pool_item = &p_madw->pool_item; - return (CL_SUCCESS); -} - /********************************************************************** **********************************************************************/ void osm_mad_pool_construct(IN osm_mad_pool_t * const p_pool) @@ -81,7 +63,6 @@ void osm_mad_pool_construct(IN osm_mad_pool_t * const p_pool) CL_ASSERT(p_pool); memset(p_pool, 0, sizeof(*p_pool)); - cl_qlock_pool_construct(&p_pool->madw_pool); } /********************************************************************** @@ -89,9 +70,6 @@ void osm_mad_pool_construct(IN osm_mad_pool_t * const p_pool) void osm_mad_pool_destroy(IN osm_mad_pool_t * const p_pool) { CL_ASSERT(p_pool); - - /* HACK: we still rarely see some mads leaking - so ignore this */ - /* cl_qlock_pool_destroy( &p_pool->madw_pool ); */ } /********************************************************************** @@ -99,29 +77,12 @@ void osm_mad_pool_destroy(IN osm_mad_pool_t * const p_pool) ib_api_status_t osm_mad_pool_init(IN osm_mad_pool_t * 
const p_pool, IN osm_log_t * const p_log) { - ib_api_status_t status; - OSM_LOG_ENTER(p_log, osm_mad_pool_init); p_pool->p_log = p_log; - status = cl_qlock_pool_init(&p_pool->madw_pool, - OSM_MAD_POOL_MIN_SIZE, - 0, - OSM_MAD_POOL_GROW_SIZE, - sizeof(osm_madw_t), - __osm_mad_pool_ctor, NULL, p_pool); - if (status != IB_SUCCESS) { - osm_log(p_log, OSM_LOG_ERROR, - "osm_mad_pool_init: ERR 0702: " - "Grow pool initialization failed (%s)\n", - ib_get_err_str(status)); - goto Exit; - } - - Exit: OSM_LOG_EXIT(p_log); - return (status); + return IB_SUCCESS; } /********************************************************************** @@ -142,7 +103,7 @@ osm_madw_t *osm_mad_pool_get(IN osm_mad_pool_t * const p_pool, /* First, acquire a mad wrapper from the mad wrapper pool. */ - p_madw = (osm_madw_t *) cl_qlock_pool_get(&p_pool->madw_pool); + p_madw = malloc(sizeof(*p_madw)); if (p_madw == NULL) { osm_log(p_pool->p_log, OSM_LOG_ERROR, "osm_mad_pool_get: ERR 0703: " @@ -162,8 +123,7 @@ osm_madw_t *osm_mad_pool_get(IN osm_mad_pool_t * const p_pool, "Unable to acquire wire MAD\n"); /* Don't leak wrappers! */ - cl_qlock_pool_put(&p_pool->madw_pool, - (cl_pool_item_t *) p_madw); + free(p_madw); p_madw = NULL; goto Exit; } @@ -202,7 +162,7 @@ osm_madw_t *osm_mad_pool_get_wrapper(IN osm_mad_pool_t * const p_pool, /* First, acquire a mad wrapper from the mad wrapper pool. 
*/ - p_madw = (osm_madw_t *) cl_qlock_pool_get(&p_pool->madw_pool); + p_madw = malloc(sizeof(*p_madw)); if (p_madw == NULL) { osm_log(p_pool->p_log, OSM_LOG_ERROR, "osm_mad_pool_get_wrapper: ERR 0705: " @@ -234,7 +194,9 @@ osm_madw_t *osm_mad_pool_get_wrapper_raw(IN osm_mad_pool_t * const p_pool) OSM_LOG_ENTER(p_pool->p_log, osm_mad_pool_get_wrapper_raw); - p_madw = (osm_madw_t *) cl_qlock_pool_get(&p_pool->madw_pool); + p_madw = malloc(sizeof(*p_madw)); + if (!p_madw) + return NULL; osm_log(p_pool->p_log, OSM_LOG_DEBUG, "osm_mad_pool_get_wrapper_raw: " @@ -270,7 +232,7 @@ osm_mad_pool_put(IN osm_mad_pool_t * const p_pool, IN osm_madw_t * const p_madw) /* Return the mad wrapper to the wrapper pool */ - cl_qlock_pool_put(&p_pool->madw_pool, (cl_pool_item_t *) p_madw); + free(p_madw); cl_atomic_dec(&p_pool->mads_out); OSM_LOG_EXIT(p_pool->p_log); diff --git a/opensm/opensm/osm_vl15intf.c b/opensm/opensm/osm_vl15intf.c index 74e749f..5d10ed6 100644 --- a/opensm/opensm/osm_vl15intf.c +++ b/opensm/opensm/osm_vl15intf.c @@ -1,5 +1,5 @@ /* - * Copyright (c) 2004-2006 Voltaire, Inc. All rights reserved. + * Copyright (c) 2004-2007 Voltaire, Inc. All rights reserved. * Copyright (c) 2002-2005 Mellanox Technologies LTD. All rights reserved. * Copyright (c) 1996-2003 Intel Corporation. All rights reserved. 
* @@ -340,10 +340,10 @@ void osm_vl15_post(IN osm_vl15_t * const p_vl, IN osm_madw_t * const p_madw) */ cl_spinlock_acquire(&p_vl->lock); if (p_madw->resp_expected == TRUE) { - cl_qlist_insert_tail(&p_vl->rfifo, (cl_list_item_t *) p_madw); + cl_qlist_insert_tail(&p_vl->rfifo, &p_madw->list_item); cl_atomic_inc(&p_vl->p_stats->qp0_mads_outstanding); } else - cl_qlist_insert_tail(&p_vl->ufifo, (cl_list_item_t *) p_madw); + cl_qlist_insert_tail(&p_vl->ufifo, &p_madw->list_item); cl_spinlock_release(&p_vl->lock); if (osm_log_is_active(p_vl->p_log, OSM_LOG_DEBUG)) -- 1.5.3.4.206.g58ba4 From sashak at voltaire.com Thu Jan 3 01:41:55 2008 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Thu, 3 Jan 2008 09:41:55 +0000 Subject: [ofa-general] [PATCH 3/3] complib: clean unused cl_qlockpool primitives In-Reply-To: <11993533151272-git-send-email-sashak@voltaire.com> References: <11993533151272-git-send-email-sashak@voltaire.com> Message-ID: <11993533161042-git-send-email-sashak@voltaire.com> Signed-off-by: Sasha Khapyorsky --- opensm/complib/Makefile.am | 1 - opensm/complib/libosmcomp.map | 5 - opensm/include/Makefile.am | 1 - opensm/include/complib/cl_qlockpool.h | 355 --------------------------------- 4 files changed, 0 insertions(+), 362 deletions(-) delete mode 100644 opensm/include/complib/cl_qlockpool.h diff --git a/opensm/complib/Makefile.am b/opensm/complib/Makefile.am index 3ef5357..063bb8b 100644 --- a/opensm/complib/Makefile.am +++ b/opensm/complib/Makefile.am @@ -58,7 +58,6 @@ libosmcompinclude_HEADERS = $(srcdir)/../include/complib/cl_atomic.h \ $(srcdir)/../include/complib/cl_ptr_vector.h \ $(srcdir)/../include/complib/cl_qcomppool.h \ $(srcdir)/../include/complib/cl_qlist.h \ - $(srcdir)/../include/complib/cl_qlockpool.h \ $(srcdir)/../include/complib/cl_qmap.h \ $(srcdir)/../include/complib/cl_qpool.h \ $(srcdir)/../include/complib/cl_spinlock.h \ diff --git a/opensm/complib/libosmcomp.map b/opensm/complib/libosmcomp.map index 435c2fe..d0e107e 100644 --- 
a/opensm/complib/libosmcomp.map +++ b/opensm/complib/libosmcomp.map @@ -98,11 +98,6 @@ OSMCOMP_2.3 { cl_ptr_vector_apply_func; cl_ptr_vector_find_from_start; cl_ptr_vector_find_from_end; - cl_qlock_pool_construct; - cl_qlock_pool_destroy; - cl_qlock_pool_init; - cl_qlock_pool_get; - cl_qlock_pool_put; cl_spinlock_construct; cl_spinlock_init; cl_spinlock_destroy; diff --git a/opensm/include/Makefile.am b/opensm/include/Makefile.am index d9ed2c3..a46669d 100644 --- a/opensm/include/Makefile.am +++ b/opensm/include/Makefile.am @@ -104,7 +104,6 @@ EXTRA_DIST = \ $(srcdir)/complib/cl_qlist.h \ $(srcdir)/complib/cl_vector.h \ $(srcdir)/complib/cl_byteswap_osd.h \ - $(srcdir)/complib/cl_qlockpool.h \ $(srcdir)/complib/cl_event_wheel.h \ $(srcdir)/complib/cl_thread.h \ $(srcdir)/complib/cl_packoff.h \ diff --git a/opensm/include/complib/cl_qlockpool.h b/opensm/include/complib/cl_qlockpool.h deleted file mode 100644 index a4d71a5..0000000 --- a/opensm/include/complib/cl_qlockpool.h +++ /dev/null @@ -1,355 +0,0 @@ -/* - * Copyright (c) 2004, 2005 Voltaire, Inc. All rights reserved. - * Copyright (c) 2002-2005 Mellanox Technologies LTD. All rights reserved. - * Copyright (c) 1996-2003 Intel Corporation. All rights reserved. - * - * This software is available to you under a choice of one of two - * licenses. You may choose to be licensed under the terms of the GNU - * General Public License (GPL) Version 2, available from the file - * COPYING in the main directory of this source tree, or the - * OpenIB.org BSD license below: - * - * Redistribution and use in source and binary forms, with or - * without modification, are permitted provided that the following - * conditions are met: - * - * - Redistributions of source code must retain the above - * copyright notice, this list of conditions and the following - * disclaimer. 
- * - * - Redistributions in binary form must reproduce the above - * copyright notice, this list of conditions and the following - * disclaimer in the documentation and/or other materials - * provided with the distribution. - * - * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, - * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF - * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND - * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS - * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN - * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN - * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE - * SOFTWARE. - * - */ - -/* - * Abstract: - * Declaration of cl_qlock_pool_t. - * This object represents a threadsafe quick-pool of objects. - * - * Environment: - * All - * - * $Revision: 1.3 $ - */ - -#ifndef _CL_QLOCKPOOL_H_ -#define _CL_QLOCKPOOL_H_ - -#include -#include - -#ifdef __cplusplus -# define BEGIN_C_DECLS extern "C" { -# define END_C_DECLS } -#else /* !__cplusplus */ -# define BEGIN_C_DECLS -# define END_C_DECLS -#endif /* __cplusplus */ - -BEGIN_C_DECLS -/****h* Component Library/Quick Locking Pool -* NAME -* Quick Locking Pool -* -* DESCRIPTION -* The Quick Locking Pool represents a thread-safe quick pool. -* -* This object should be treated as opaque and should be -* manipulated only through the provided functions. -* -* SEE ALSO -* Structures: -* cl_qlock_pool_t -* -* Initialization: -* cl_qlock_pool_construct, cl_qlock_pool_init, cl_qlock_pool_destroy -* -* Manipulation -* cl_qlock_pool_get, cl_qlock_pool_put -*********/ -/****s* Component Library: Quick Locking Pool/cl_qlock_pool_t -* NAME -* cl_qlock_pool_t -* -* DESCRIPTION -* Quick Locking Pool structure. -* -* This object should be treated as opaque and should -* be manipulated only through the provided functions. 
-* -* SYNOPSIS -*/ -typedef struct _cl_qlock_pool { - cl_spinlock_t lock; - cl_qpool_t pool; -} cl_qlock_pool_t; -/* -* FIELDS -* lock -* Spinlock guarding the pool. -* -* pool -* quick_pool of user objects. -* -* SEE ALSO -* Quick Locking Pool -*********/ - -/****f* Component Library: Quick Locking Pool/cl_qlock_pool_construct -* NAME -* cl_qlock_pool_construct -* -* DESCRIPTION -* This function constructs a Quick Locking Pool. -* -* SYNOPSIS -*/ -static inline void cl_qlock_pool_construct(IN cl_qlock_pool_t * const p_pool) -{ - cl_qpool_construct(&p_pool->pool); - cl_spinlock_construct(&p_pool->lock); -} - -/* -* PARAMETERS -* p_pool -* [in] Pointer to a Quick Locking Pool to construct. -* -* RETURN VALUE -* This function does not return a value. -* -* NOTES -* Allows calling cl_qlock_pool_init, cl_qlock_pool_destroy -* -* Calling cl_qlock_pool_construct is a prerequisite to calling any other -* method except cl_qlock_pool_init. -* -* SEE ALSO -* Quick Locking Pool, cl_qlock_pool_init, cl_qlock_pool_destroy -*********/ - -/****f* Component Library: Quick Locking Pool/cl_qlock_pool_destroy -* NAME -* cl_qlock_pool_destroy -* -* DESCRIPTION -* The cl_qlock_pool_destroy function destroys a node, releasing -* all resources. -* -* SYNOPSIS -*/ -static inline void cl_qlock_pool_destroy(IN cl_qlock_pool_t * const p_pool) -{ - /* - If the pool has already been put into use, grab the lock - to sync with other threads before we blow everything away. - */ - if (cl_is_qpool_inited(&p_pool->pool)) { - cl_spinlock_acquire(&p_pool->lock); - cl_qpool_destroy(&p_pool->pool); - cl_spinlock_release(&p_pool->lock); - } else - cl_qpool_destroy(&p_pool->pool); - - cl_spinlock_destroy(&p_pool->lock); -} - -/* -* PARAMETERS -* p_pool -* [in] Pointer to a Quick Locking Pool to destroy. -* -* RETURN VALUE -* This function does not return a value. -* -* NOTES -* Performs any necessary cleanup of the specified Quick Locking Pool. 
-* Further operations should not be attempted on the destroyed object. -* This function should only be called after a call to -* cl_qlock_pool_construct or cl_qlock_pool_init. -* -* SEE ALSO -* Quick Locking Pool, cl_qlock_pool_construct, cl_qlock_pool_init -*********/ - -/****f* Component Library: Quick Locking Pool/cl_qlock_pool_init -* NAME -* cl_qlock_pool_init -* -* DESCRIPTION -* The cl_qlock_pool_init function initializes a Quick Locking Pool for use. -* -* SYNOPSIS -*/ -static inline cl_status_t -cl_qlock_pool_init(IN cl_qlock_pool_t * const p_pool, - IN const size_t min_size, - IN const size_t max_size, - IN const size_t grow_size, - IN const size_t object_size, - IN cl_pfn_qpool_init_t pfn_initializer OPTIONAL, - IN cl_pfn_qpool_dtor_t pfn_destructor OPTIONAL, - IN const void *const context) -{ - cl_status_t status; - - cl_qlock_pool_construct(p_pool); - - status = cl_spinlock_init(&p_pool->lock); - if (status) - return (status); - - status = cl_qpool_init(&p_pool->pool, min_size, max_size, grow_size, - object_size, pfn_initializer, pfn_destructor, - context); - - return (status); -} - -/* -* PARAMETERS -* p_pool -* [in] Pointer to an cl_qlock_pool_t object to initialize. -* -* min_size -* [in] Minimum number of objects that the pool should support. All -* necessary allocations to allow storing the minimum number of items -* are performed at initialization time, and all necessary callbacks -* successfully invoked. -* -* max_size -* [in] Maximum number of objects to which the pool is allowed to grow. -* A value of zero specifies no maximum. -* -* grow_size -* [in] Number of objects to allocate when incrementally growing the pool. -* A value of zero disables automatic growth. -* -* object_size -* [in] Size, in bytes, of each object. -* -* pfn_initializer -* [in] Initialization callback to invoke for every new object when -* growing the pool. This parameter is optional and may be NULL. 
If NULL, -* the pool assumes the cl_pool_item_t structure describing objects is -* located at the head of each object. See the cl_pfn_qpool_init_t -* function type declaration for details about the callback function. -* -* pfn_destructor -* [in] Destructor callback to invoke for every object before memory for -* that object is freed. This parameter is optional and may be NULL. -* See the cl_pfn_qpool_dtor_t function type declaration for details -* about the callback function. -* -* context -* [in] Value to pass to the callback functions to provide context. -* -* RETURN VALUES -* CL_SUCCESS if the quick pool was initialized successfully. -* -* CL_INSUFFICIENT_MEMORY if there was not enough memory to initialize the -* quick pool. -* -* CL_INVALID_SETTING if a the maximum size is non-zero and less than the -* minimum size. -* -* Other cl_status_t value returned by optional initialization callback function -* specified by the pfn_initializer parameter. -* -* NOTES -* Allows calling other Quick Locking Pool methods. -* -* SEE ALSO -* Quick Locking Pool, cl_qlock_pool_construct, cl_qlock_pool_destroy -*********/ - -/****f* Component Library: Quick Locking Pool/cl_qlock_pool_get -* NAME -* cl_qlock_pool_get -* -* DESCRIPTION -* Gets an object wrapper and wire MAD from the pool. -* -* SYNOPSIS -*/ -static inline cl_pool_item_t *cl_qlock_pool_get(IN cl_qlock_pool_t * - const p_pool) -{ - cl_pool_item_t *p_item; - cl_spinlock_acquire(&p_pool->lock); - p_item = cl_qpool_get(&p_pool->pool); - cl_spinlock_release(&p_pool->lock); - return (p_item); -} - -/* -* PARAMETERS -* p_pool -* [in] Pointer to an cl_qlock_pool_t object. -* -* RETURN VALUES -* Returns a pointer to a cl_pool_item_t contained in the user object. -* -* NOTES -* The object must eventually be returned to the pool with a call to -* cl_qlock_pool_put. -* -* The cl_qlock_pool_construct or cl_qlock_pool_init must be called before -* using this function. 
-*
-* SEE ALSO
-* Quick Locking Pool, cl_qlock_pool_put
-*********/
-
-/****f* Component Library: Quick Locking Pool/cl_qlock_pool_put
-* NAME
-* cl_qlock_pool_put
-*
-* DESCRIPTION
-* Returns an object to the pool.
-*
-* SYNOPSIS
-*/
-static inline void
-cl_qlock_pool_put(IN cl_qlock_pool_t * const p_pool,
- IN cl_pool_item_t * const p_item)
-{
- cl_spinlock_acquire(&p_pool->lock);
- cl_qpool_put(&p_pool->pool, p_item);
- cl_spinlock_release(&p_pool->lock);
-}
-
-/*
-* PARAMETERS
-* p_pool
-* [in] Pointer to an cl_qlock_pool_t object.
-*
-* p_item
-* [in] Pointer to the cl_pool_item_t in an object that was previously
-* retrieved from the pool.
-*
-* RETURN VALUES
-* This function does not return a value.
-*
-* NOTES
-* The cl_qlock_pool_construct or cl_qlock_pool_init must be called before
-* using this function.
-*
-* SEE ALSO
-* Quick Locking Pool, cl_qlock_pool_get
-*********/
-
-END_C_DECLS
-#endif /* _CL_QLOCKPOOL_H_ */
-- 
1.5.3.4.206.g58ba4

From sashak at voltaire.com Thu Jan 3 01:41:53 2008
From: sashak at voltaire.com (Sasha Khapyorsky)
Date: Thu, 3 Jan 2008 09:41:53 +0000
Subject: [ofa-general] [PATCH 1/3] opensm: use malloc instead of cl_qlock_pool in SA processors
In-Reply-To: <11993533151272-git-send-email-sashak@voltaire.com>
References: <11993533151272-git-send-email-sashak@voltaire.com>
Message-ID: <11993533153074-git-send-email-sashak@voltaire.com>

Use regular malloc/free instead of cl_qlock_pool for record allocation
in SA processors. A simple benchmark shows that regular malloc/free is
more than twice as fast as the cl_qlock_pool allocator, and it requires
no additional locking (in fact, it is still faster than the non-locking
cl_qpool allocator).
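
As a rough illustration of the allocation pattern described above (a standalone sketch, not OpenSM code: `list_item_t`, `rec_list_t`, `rec_item_t` and all `rec_*` functions are simplified, hypothetical stand-ins for complib's `cl_list_item_t`, `cl_qlist_t` and the per-record `osm_*_item_t` structs), each response record is malloc'ed, zeroed, linked through its embedded list item, and freed as the list is drained:

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for complib's cl_list_item_t. */
typedef struct list_item {
	struct list_item *p_next;
} list_item_t;

/* Hypothetical stand-in for cl_qlist_t: a singly linked FIFO. */
typedef struct rec_list {
	list_item_t *p_head;
	list_item_t *p_tail;
	size_t count;
} rec_list_t;

/* Record wrapper: the list item is embedded as the first member. */
typedef struct rec_item {
	list_item_t list_item;
	unsigned lid;
	unsigned block_num;
} rec_item_t;

static void rec_list_init(rec_list_t *p_list)
{
	p_list->p_head = p_list->p_tail = NULL;
	p_list->count = 0;
}

static void rec_list_insert_tail(rec_list_t *p_list, list_item_t *p_item)
{
	p_item->p_next = NULL;
	if (p_list->p_tail)
		p_list->p_tail->p_next = p_item;
	else
		p_list->p_head = p_item;
	p_list->p_tail = p_item;
	p_list->count++;
}

/* Returns NULL when the list is empty. */
static list_item_t *rec_list_remove_head(rec_list_t *p_list)
{
	list_item_t *p_item = p_list->p_head;

	if (!p_item)
		return NULL;
	p_list->p_head = p_item->p_next;
	if (!p_list->p_head)
		p_list->p_tail = NULL;
	p_list->count--;
	return p_item;
}

/* Allocate one record with plain malloc and queue it; 0 on success. */
static int rec_new(rec_list_t *p_list, unsigned lid, unsigned block_num)
{
	rec_item_t *p_rec_item = malloc(sizeof(*p_rec_item));

	if (p_rec_item == NULL)
		return -1;
	/* Zero the whole item, list linkage included, before filling it. */
	memset(p_rec_item, 0, sizeof(*p_rec_item));
	p_rec_item->lid = lid;
	p_rec_item->block_num = block_num;
	rec_list_insert_tail(p_list, &p_rec_item->list_item);
	return 0;
}

/* Drain the list, freeing every record; returns how many were freed. */
static size_t rec_list_free_all(rec_list_t *p_list)
{
	list_item_t *p_item;
	size_t freed = 0;

	while ((p_item = rec_list_remove_head(p_list)) != NULL) {
		/* Valid because list_item is the first member of rec_item_t. */
		free((rec_item_t *)p_item);
		freed++;
	}
	return freed;
}
```

In this sketch there is no pool to construct, initialize or destroy, and no per-pool lock; the trade-off is that every record goes through the C library allocator.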

Signed-off-by: Sasha Khapyorsky
---
 opensm/include/opensm/osm_sa_class_port_info.h  |  4 +-
 opensm/include/opensm/osm_sa_guidinfo_record.h  |  8 +---
 opensm/include/opensm/osm_sa_informinfo.h       | 10 +---
 opensm/include/opensm/osm_sa_lft_record.h       |  8 +---
 opensm/include/opensm/osm_sa_link_record.h      |  7 +--
 opensm/include/opensm/osm_sa_mcmember_record.h  |  4 +-
 opensm/include/opensm/osm_sa_mft_record.h       |  8 +---
 opensm/include/opensm/osm_sa_multipath_record.h |  8 +---
 opensm/include/opensm/osm_sa_node_record.h      |  8 +---
 opensm/include/opensm/osm_sa_path_record.h      |  8 +---
 opensm/include/opensm/osm_sa_pkey_record.h      |  8 +---
 opensm/include/opensm/osm_sa_portinfo_record.h  |  8 +---
 opensm/include/opensm/osm_sa_service_record.h   |  8 +---
 opensm/include/opensm/osm_sa_slvl_record.h      |  8 +---
 opensm/include/opensm/osm_sa_sminfo_record.h    |  4 +-
 opensm/include/opensm/osm_sa_sw_info_record.h   |  3 +-
 opensm/include/opensm/osm_sa_vlarb_record.h     |  8 +---
 opensm/opensm/osm_sa_guidinfo_record.c          | 38 ++++----------
 opensm/opensm/osm_sa_informinfo.c               | 33 +++---------
 opensm/opensm/osm_sa_lft_record.c               | 39 ++++----------
 opensm/opensm/osm_sa_link_record.c              | 33 +++---------
 opensm/opensm/osm_sa_mcmember_record.c          | 38 +++-----------
 opensm/opensm/osm_sa_mft_record.c               | 36 ++++----------
 opensm/opensm/osm_sa_multipath_record.c         | 62 +++++++----------------
 opensm/opensm/osm_sa_node_record.c              | 40 ++++----------
 opensm/opensm/osm_sa_path_record.c              | 49 +++++-------------
 opensm/opensm/osm_sa_pkey_record.c              | 37 ++++----------
 opensm/opensm/osm_sa_portinfo_record.c          | 36 ++++----------
 opensm/opensm/osm_sa_service_record.c           | 49 ++++--------------
 opensm/opensm/osm_sa_slvl_record.c              | 34 +++---------
 opensm/opensm/osm_sa_sminfo_record.c            | 39 ++++----------
 opensm/opensm/osm_sa_sw_info_record.c           | 35 ++++---------
 opensm/opensm/osm_sa_vlarb_record.c             | 41 ++++-----------
 33 files changed, 193 insertions(+), 566 deletions(-)

diff --git a/opensm/include/opensm/osm_sa_class_port_info.h
b/opensm/include/opensm/osm_sa_class_port_info.h index b477cd5..6e4c069 100644 --- a/opensm/include/opensm/osm_sa_class_port_info.h +++ b/opensm/include/opensm/osm_sa_class_port_info.h @@ -1,5 +1,5 @@ /* - * Copyright (c) 2004, 2005 Voltaire, Inc. All rights reserved. + * Copyright (c) 2004-2007 Voltaire, Inc. All rights reserved. * Copyright (c) 2002-2005 Mellanox Technologies LTD. All rights reserved. * Copyright (c) 1996-2003 Intel Corporation. All rights reserved. * @@ -49,8 +49,6 @@ #define _OSM_CPI_H_ #include -#include -#include #include #include #include diff --git a/opensm/include/opensm/osm_sa_guidinfo_record.h b/opensm/include/opensm/osm_sa_guidinfo_record.h index b3035c7..c074b7b 100644 --- a/opensm/include/opensm/osm_sa_guidinfo_record.h +++ b/opensm/include/opensm/osm_sa_guidinfo_record.h @@ -1,5 +1,5 @@ /* - * Copyright (c) 2006 Voltaire, Inc. All rights reserved. + * Copyright (c) 2006-2007 Voltaire, Inc. All rights reserved. * Copyright (c) 2002-2005 Mellanox Technologies LTD. All rights reserved. * Copyright (c) 1996-2003 Intel Corporation. All rights reserved. * @@ -48,7 +48,6 @@ #define _OSM_GIR_RCV_H_ #include -#include #include #include #include @@ -100,7 +99,6 @@ typedef struct _osm_gir_rcv { osm_mad_pool_t *p_mad_pool; osm_log_t *p_log; cl_plock_t *p_lock; - cl_qlock_pool_t pool; } osm_gir_rcv_t; /* * FIELDS @@ -119,10 +117,6 @@ typedef struct _osm_gir_rcv { * p_lock * Pointer to the serializing lock. * -* pool -* Pool of linkable GUIDInfo Record objects used to generate -* the query response. -* * SEE ALSO * *********/ diff --git a/opensm/include/opensm/osm_sa_informinfo.h b/opensm/include/opensm/osm_sa_informinfo.h index 5d00dd6..2a4b4ba 100644 --- a/opensm/include/opensm/osm_sa_informinfo.h +++ b/opensm/include/opensm/osm_sa_informinfo.h @@ -1,5 +1,5 @@ /* - * Copyright (c) 2004-2006 Voltaire, Inc. All rights reserved. + * Copyright (c) 2004-2007 Voltaire, Inc. All rights reserved. * Copyright (c) 2002-2005 Mellanox Technologies LTD. 
All rights reserved. * Copyright (c) 1996-2003 Intel Corporation. All rights reserved. * @@ -50,9 +50,6 @@ #define _OSM_SA_INFR_H_ #include -#include -#include -#include #include #include #include @@ -104,7 +101,6 @@ typedef struct _osm_infr_rcv { osm_mad_pool_t *p_mad_pool; osm_log_t *p_log; cl_plock_t *p_lock; - cl_qlock_pool_t pool; } osm_infr_rcv_t; /* * FIELDS @@ -120,10 +116,6 @@ typedef struct _osm_infr_rcv { * p_lock * Pointer to the serializing lock. * -* pool -* Pool of linkable InformInfo Record objects used to -* generate the query response. -* * SEE ALSO * InformInfo Receiver object *********/ diff --git a/opensm/include/opensm/osm_sa_lft_record.h b/opensm/include/opensm/osm_sa_lft_record.h index 18a43f4..8470490 100644 --- a/opensm/include/opensm/osm_sa_lft_record.h +++ b/opensm/include/opensm/osm_sa_lft_record.h @@ -1,5 +1,5 @@ /* - * Copyright (c) 2004, 2005 Voltaire, Inc. All rights reserved. + * Copyright (c) 2004-2007 Voltaire, Inc. All rights reserved. * Copyright (c) 2002-2005 Mellanox Technologies LTD. All rights reserved. * Copyright (c) 1996-2003 Intel Corporation. All rights reserved. * @@ -50,7 +50,6 @@ #define _OSM_LFTR_H_ #include -#include #include #include #include @@ -103,7 +102,6 @@ typedef struct _osm_lft { osm_mad_pool_t *p_mad_pool; osm_log_t *p_log; cl_plock_t *p_lock; - cl_qlock_pool_t pool; } osm_lftr_rcv_t; /* * FIELDS @@ -125,10 +123,6 @@ typedef struct _osm_lft { * p_lock * Pointer to the serializing lock. * -* pool -* Pool of linkable Linear Forwarding Table Record objects used to -* generate the query response. -* * SEE ALSO * Linear Forwarding Table Receiver object *********/ diff --git a/opensm/include/opensm/osm_sa_link_record.h b/opensm/include/opensm/osm_sa_link_record.h index 3104704..d09eb69 100644 --- a/opensm/include/opensm/osm_sa_link_record.h +++ b/opensm/include/opensm/osm_sa_link_record.h @@ -1,5 +1,5 @@ /* - * Copyright (c) 2004, 2005 Voltaire, Inc. All rights reserved. 
+ * Copyright (c) 2004-2007 Voltaire, Inc. All rights reserved. * Copyright (c) 2002-2005 Mellanox Technologies LTD. All rights reserved. * Copyright (c) 1996-2003 Intel Corporation. All rights reserved. * @@ -49,7 +49,6 @@ #ifndef _OSM_LR_RCV_H_ #define _OSM_LR_RCV_H_ -#include #include #include #include @@ -102,7 +101,6 @@ typedef struct _osm_lr_rcv { osm_mad_pool_t *p_mad_pool; osm_log_t *p_log; cl_plock_t *p_lock; - cl_qlock_pool_t lr_pool; } osm_lr_rcv_t; /* * FIELDS @@ -121,9 +119,6 @@ typedef struct _osm_lr_rcv { * p_lock * Pointer to the serializing lock. * -* lr_pool -* Pool of link record objects used to generate the query response. -* * SEE ALSO *********/ diff --git a/opensm/include/opensm/osm_sa_mcmember_record.h b/opensm/include/opensm/osm_sa_mcmember_record.h index f13bc98..8540a89 100644 --- a/opensm/include/opensm/osm_sa_mcmember_record.h +++ b/opensm/include/opensm/osm_sa_mcmember_record.h @@ -1,5 +1,5 @@ /* - * Copyright (c) 2004-2006 Voltaire, Inc. All rights reserved. + * Copyright (c) 2004-2007 Voltaire, Inc. All rights reserved. * Copyright (c) 2002-2005 Mellanox Technologies LTD. All rights reserved. * Copyright (c) 1996-2003 Intel Corporation. All rights reserved. * @@ -50,7 +50,6 @@ #define _OSM_MCMR_H_ #include -#include #include #include #include @@ -105,7 +104,6 @@ typedef struct _osm_mcmr { osm_log_t *p_log; cl_plock_t *p_lock; uint16_t mlid_ho; - cl_qlock_pool_t pool; } osm_mcmr_recv_t; /* diff --git a/opensm/include/opensm/osm_sa_mft_record.h b/opensm/include/opensm/osm_sa_mft_record.h index dd14257..09b922d 100644 --- a/opensm/include/opensm/osm_sa_mft_record.h +++ b/opensm/include/opensm/osm_sa_mft_record.h @@ -1,5 +1,5 @@ /* - * Copyright (c) 2004-2006 Voltaire, Inc. All rights reserved. + * Copyright (c) 2004-2007 Voltaire, Inc. All rights reserved. * Copyright (c) 2002-2005 Mellanox Technologies LTD. All rights reserved. * Copyright (c) 1996-2003 Intel Corporation. All rights reserved. 
* @@ -49,7 +49,6 @@ #define _OSM_MFTR_H_ #include -#include #include #include #include @@ -102,7 +101,6 @@ typedef struct _osm_mft { osm_mad_pool_t *p_mad_pool; osm_log_t *p_log; cl_plock_t *p_lock; - cl_qlock_pool_t pool; } osm_mftr_rcv_t; /* * FIELDS @@ -124,10 +122,6 @@ typedef struct _osm_mft { * p_lock * Pointer to the serializing lock. * -* pool -* Pool of linkable Multicast Forwarding Table Record objects used to -* generate the query response. -* * SEE ALSO * Multicast Forwarding Table Receiver object *********/ diff --git a/opensm/include/opensm/osm_sa_multipath_record.h b/opensm/include/opensm/osm_sa_multipath_record.h index 8fa1046..afd407d 100644 --- a/opensm/include/opensm/osm_sa_multipath_record.h +++ b/opensm/include/opensm/osm_sa_multipath_record.h @@ -1,5 +1,5 @@ /* - * Copyright (c) 2006 Voltaire, Inc. All rights reserved. + * Copyright (c) 2006-2007 Voltaire, Inc. All rights reserved. * Copyright (c) 2002-2005 Mellanox Technologies LTD. All rights reserved. * Copyright (c) 1996-2003 Intel Corporation. All rights reserved. * @@ -49,8 +49,6 @@ #define _OSM_MPR_RCV_H_ #include -#include -#include #include #include #include @@ -103,7 +101,6 @@ typedef struct _osm_mpr_rcv { osm_mad_pool_t *p_mad_pool; osm_log_t *p_log; cl_plock_t *p_lock; - cl_qlock_pool_t pr_pool; } osm_mpr_rcv_t; /* * FIELDS @@ -119,9 +116,6 @@ typedef struct _osm_mpr_rcv { * p_lock * Pointer to the serializing lock. * -* pr_pool -* Pool of multipath record objects used to generate query responses. -* * SEE ALSO * MultiPath Record Receiver object *********/ diff --git a/opensm/include/opensm/osm_sa_node_record.h b/opensm/include/opensm/osm_sa_node_record.h index 36eea27..8f385f8 100644 --- a/opensm/include/opensm/osm_sa_node_record.h +++ b/opensm/include/opensm/osm_sa_node_record.h @@ -1,5 +1,5 @@ /* - * Copyright (c) 2004, 2005 Voltaire, Inc. All rights reserved. + * Copyright (c) 2004-2007 Voltaire, Inc. All rights reserved. * Copyright (c) 2002-2005 Mellanox Technologies LTD. 
All rights reserved. * Copyright (c) 1996-2003 Intel Corporation. All rights reserved. * @@ -49,7 +49,6 @@ #ifndef _OSM_NR_H_ #define _OSM_NR_H_ -#include #include #include #include @@ -101,7 +100,6 @@ typedef struct _osm_nr_recv { osm_mad_pool_t *p_mad_pool; osm_log_t *p_log; cl_plock_t *p_lock; - cl_qlock_pool_t pool; } osm_nr_rcv_t; /* * FIELDS @@ -120,10 +118,6 @@ typedef struct _osm_nr_recv { * p_lock * Pointer to the serializing lock. * -* pool -* Pool of linkable node record objects used to generate -* the query response. -* * SEE ALSO * *********/ diff --git a/opensm/include/opensm/osm_sa_path_record.h b/opensm/include/opensm/osm_sa_path_record.h index 88eb6c3..76d24fc 100644 --- a/opensm/include/opensm/osm_sa_path_record.h +++ b/opensm/include/opensm/osm_sa_path_record.h @@ -1,5 +1,5 @@ /* - * Copyright (c) 2004, 2005 Voltaire, Inc. All rights reserved. + * Copyright (c) 2004-2007 Voltaire, Inc. All rights reserved. * Copyright (c) 2002-2005 Mellanox Technologies LTD. All rights reserved. * Copyright (c) 1996-2003 Intel Corporation. All rights reserved. * @@ -50,8 +50,6 @@ #define _OSM_PR_H_ #include -#include -#include #include #include #include @@ -104,7 +102,6 @@ typedef struct _osm_pr_rcv { osm_mad_pool_t *p_mad_pool; osm_log_t *p_log; cl_plock_t *p_lock; - cl_qlock_pool_t pr_pool; } osm_pr_rcv_t; /* * FIELDS @@ -120,9 +117,6 @@ typedef struct _osm_pr_rcv { * p_lock * Pointer to the serializing lock. * -* pr_pool -* Pool of path record objects used to generate query responses. -* * SEE ALSO * Path Record Receiver object *********/ diff --git a/opensm/include/opensm/osm_sa_pkey_record.h b/opensm/include/opensm/osm_sa_pkey_record.h index 4242a2f..b2f43f0 100644 --- a/opensm/include/opensm/osm_sa_pkey_record.h +++ b/opensm/include/opensm/osm_sa_pkey_record.h @@ -1,5 +1,5 @@ /* - * Copyright (c) 2004, 2005 Voltaire, Inc. All rights reserved. + * Copyright (c) 2004-2007 Voltaire, Inc. All rights reserved. * Copyright (c) 2002-2005 Mellanox Technologies LTD. 
All rights reserved. * Copyright (c) 1996-2003 Intel Corporation. All rights reserved. * @@ -37,7 +37,6 @@ #define _OSM_PKEY_REC_RCV_H_ #include -#include #include #include #include @@ -89,7 +88,6 @@ typedef struct _osm_pkey_rec_rcv { osm_mad_pool_t *p_mad_pool; osm_log_t *p_log; cl_plock_t *p_lock; - cl_qlock_pool_t pool; } osm_pkey_rec_rcv_t; /* * FIELDS @@ -108,10 +106,6 @@ typedef struct _osm_pkey_rec_rcv { * p_lock * Pointer to the serializing lock. * -* pool -* Pool of linkable P_Key Record objects used to generate -* the query response. -* * SEE ALSO * *********/ diff --git a/opensm/include/opensm/osm_sa_portinfo_record.h b/opensm/include/opensm/osm_sa_portinfo_record.h index 38eabdb..a818f25 100644 --- a/opensm/include/opensm/osm_sa_portinfo_record.h +++ b/opensm/include/opensm/osm_sa_portinfo_record.h @@ -1,5 +1,5 @@ /* - * Copyright (c) 2004, 2005 Voltaire, Inc. All rights reserved. + * Copyright (c) 2004-2007 Voltaire, Inc. All rights reserved. * Copyright (c) 2002-2005 Mellanox Technologies LTD. All rights reserved. * Copyright (c) 1996-2003 Intel Corporation. All rights reserved. * @@ -50,7 +50,6 @@ #define _OSM_PIR_RCV_H_ #include -#include #include #include #include @@ -102,7 +101,6 @@ typedef struct _osm_pir_rcv { osm_mad_pool_t *p_mad_pool; osm_log_t *p_log; cl_plock_t *p_lock; - cl_qlock_pool_t pool; } osm_pir_rcv_t; /* * FIELDS @@ -121,10 +119,6 @@ typedef struct _osm_pir_rcv { * p_lock * Pointer to the serializing lock. * -* pool -* Pool of linkable PortInfo Record objects used to generate -* the query response. -* * SEE ALSO * *********/ diff --git a/opensm/include/opensm/osm_sa_service_record.h b/opensm/include/opensm/osm_sa_service_record.h index 8884944..43859e0 100644 --- a/opensm/include/opensm/osm_sa_service_record.h +++ b/opensm/include/opensm/osm_sa_service_record.h @@ -1,5 +1,5 @@ /* - * Copyright (c) 2004, 2005 Voltaire, Inc. All rights reserved. + * Copyright (c) 2004-2007 Voltaire, Inc. All rights reserved. 
* Copyright (c) 2002-2005 Mellanox Technologies LTD. All rights reserved. * Copyright (c) 1996-2003 Intel Corporation. All rights reserved. * @@ -50,9 +50,7 @@ #define _OSM_SR_H_ #include -#include #include -#include #include #include #include @@ -104,7 +102,6 @@ typedef struct _osm_sr_rcv { osm_mad_pool_t *p_mad_pool; osm_log_t *p_log; cl_plock_t *p_lock; - cl_qlock_pool_t sr_pool; cl_timer_t sr_timer; } osm_sr_rcv_t; /* @@ -121,9 +118,6 @@ typedef struct _osm_sr_rcv { * p_lock * Pointer to the serializing lock. * -* sr_pool -* Pool of Service Record objects used to generate query responses. -* * SEE ALSO * Service Record Receiver object *********/ diff --git a/opensm/include/opensm/osm_sa_slvl_record.h b/opensm/include/opensm/osm_sa_slvl_record.h index c72d5d4..518a0f1 100644 --- a/opensm/include/opensm/osm_sa_slvl_record.h +++ b/opensm/include/opensm/osm_sa_slvl_record.h @@ -1,5 +1,5 @@ /* - * Copyright (c) 2004, 2005 Voltaire, Inc. All rights reserved. + * Copyright (c) 2004-2007 Voltaire, Inc. All rights reserved. * Copyright (c) 2002-2005 Mellanox Technologies LTD. All rights reserved. * Copyright (c) 1996-2003 Intel Corporation. All rights reserved. * @@ -50,7 +50,6 @@ #define _OSM_SLVL_REC_RCV_H_ #include -#include #include #include #include @@ -102,7 +101,6 @@ typedef struct _osm_slvl_rec_rcv { osm_mad_pool_t *p_mad_pool; osm_log_t *p_log; cl_plock_t *p_lock; - cl_qlock_pool_t pool; } osm_slvl_rec_rcv_t; /* * FIELDS @@ -121,10 +119,6 @@ typedef struct _osm_slvl_rec_rcv { * p_lock * Pointer to the serializing lock. * -* pool -* Pool of linkable SLtoVL Mapping Record objects used to generate -* the query response. -* * SEE ALSO * *********/ diff --git a/opensm/include/opensm/osm_sa_sminfo_record.h b/opensm/include/opensm/osm_sa_sminfo_record.h index ce57925..f4fd1ff 100644 --- a/opensm/include/opensm/osm_sa_sminfo_record.h +++ b/opensm/include/opensm/osm_sa_sminfo_record.h @@ -1,5 +1,5 @@ /* - * Copyright (c) 2004-2006 Voltaire, Inc. All rights reserved. 
+ * Copyright (c) 2004-2007 Voltaire, Inc. All rights reserved. * Copyright (c) 2002-2005 Mellanox Technologies LTD. All rights reserved. * Copyright (c) 1996-2003 Intel Corporation. All rights reserved. * @@ -50,7 +50,6 @@ #define _OSM_SMIR_H_ #include -#include #include #include #include @@ -103,7 +102,6 @@ typedef struct _osm_smir { osm_mad_pool_t *p_mad_pool; osm_log_t *p_log; cl_plock_t *p_lock; - cl_qlock_pool_t pool; } osm_smir_rcv_t; /* * FIELDS diff --git a/opensm/include/opensm/osm_sa_sw_info_record.h b/opensm/include/opensm/osm_sa_sw_info_record.h index ad1f773..df6f842 100644 --- a/opensm/include/opensm/osm_sa_sw_info_record.h +++ b/opensm/include/opensm/osm_sa_sw_info_record.h @@ -1,5 +1,5 @@ /* - * Copyright (c) 2006 Voltaire, Inc. All rights reserved. + * Copyright (c) 2006-2007 Voltaire, Inc. All rights reserved. * Copyright (c) 2002-2005 Mellanox Technologies LTD. All rights reserved. * Copyright (c) 1996-2003 Intel Corporation. All rights reserved. * @@ -103,7 +103,6 @@ typedef struct _osm_sir_rcv { osm_req_t *p_req; osm_state_mgr_t *p_state_mgr; cl_plock_t *p_lock; - cl_qlock_pool_t pool; } osm_sir_rcv_t; /* * FIELDS diff --git a/opensm/include/opensm/osm_sa_vlarb_record.h b/opensm/include/opensm/osm_sa_vlarb_record.h index e823880..1ed8554 100644 --- a/opensm/include/opensm/osm_sa_vlarb_record.h +++ b/opensm/include/opensm/osm_sa_vlarb_record.h @@ -1,5 +1,5 @@ /* - * Copyright (c) 2004, 2005 Voltaire, Inc. All rights reserved. + * Copyright (c) 2004-2007 Voltaire, Inc. All rights reserved. * Copyright (c) 2002-2005 Mellanox Technologies LTD. All rights reserved. * Copyright (c) 1996-2003 Intel Corporation. All rights reserved. 
* @@ -50,7 +50,6 @@ #define _OSM_VLARB_REC_RCV_H_ #include -#include #include #include #include @@ -102,7 +101,6 @@ typedef struct _osm_vlarb_rec_rcv { osm_mad_pool_t *p_mad_pool; osm_log_t *p_log; cl_plock_t *p_lock; - cl_qlock_pool_t pool; } osm_vlarb_rec_rcv_t; /* * FIELDS @@ -121,10 +119,6 @@ typedef struct _osm_vlarb_rec_rcv { * p_lock * Pointer to the serializing lock. * -* pool -* Pool of linkable VLArbitration Record objects used to generate -* the query response. -* * SEE ALSO * *********/ diff --git a/opensm/opensm/osm_sa_guidinfo_record.c b/opensm/opensm/osm_sa_guidinfo_record.c index d955e93..a758888 100644 --- a/opensm/opensm/osm_sa_guidinfo_record.c +++ b/opensm/opensm/osm_sa_guidinfo_record.c @@ -1,5 +1,5 @@ /* - * Copyright (c) 2006 Voltaire, Inc. All rights reserved. + * Copyright (c) 2006-2007 Voltaire, Inc. All rights reserved. * Copyright (c) 2002-2005 Mellanox Technologies LTD. All rights reserved. * Copyright (c) 1996-2003 Intel Corporation. All rights reserved. * @@ -62,11 +62,8 @@ #include #include -#define OSM_GIR_RCV_POOL_MIN_SIZE 32 -#define OSM_GIR_RCV_POOL_GROW_SIZE 32 - typedef struct _osm_gir_item { - cl_pool_item_t pool_item; + cl_list_item_t list_item; ib_guidinfo_record_t rec; } osm_gir_item_t; @@ -83,7 +80,6 @@ typedef struct _osm_gir_search_ctxt { void osm_gir_rcv_construct(IN osm_gir_rcv_t * const p_rcv) { memset(p_rcv, 0, sizeof(*p_rcv)); - cl_qlock_pool_construct(&p_rcv->pool); } /********************************************************************** @@ -91,7 +87,6 @@ void osm_gir_rcv_construct(IN osm_gir_rcv_t * const p_rcv) void osm_gir_rcv_destroy(IN osm_gir_rcv_t * const p_rcv) { OSM_LOG_ENTER(p_rcv->p_log, osm_gir_rcv_destroy); - cl_qlock_pool_destroy(&p_rcv->pool); OSM_LOG_EXIT(p_rcv->p_log); } @@ -104,8 +99,6 @@ osm_gir_rcv_init(IN osm_gir_rcv_t * const p_rcv, IN osm_subn_t * const p_subn, IN osm_log_t * const p_log, IN cl_plock_t * const p_lock) { - ib_api_status_t status; - OSM_LOG_ENTER(p_log, osm_gir_rcv_init); 
osm_gir_rcv_construct(p_rcv); @@ -116,14 +109,8 @@ osm_gir_rcv_init(IN osm_gir_rcv_t * const p_rcv, p_rcv->p_resp = p_resp; p_rcv->p_mad_pool = p_mad_pool; - status = cl_qlock_pool_init(&p_rcv->pool, - OSM_GIR_RCV_POOL_MIN_SIZE, - 0, - OSM_GIR_RCV_POOL_GROW_SIZE, - sizeof(osm_gir_item_t), NULL, NULL, NULL); - OSM_LOG_EXIT(p_log); - return (status); + return IB_SUCCESS; } /********************************************************************** @@ -142,11 +129,11 @@ __osm_gir_rcv_new_gir(IN osm_gir_rcv_t * const p_rcv, OSM_LOG_ENTER(p_rcv->p_log, __osm_gir_rcv_new_gir); - p_rec_item = (osm_gir_item_t *) cl_qlock_pool_get(&p_rcv->pool); + p_rec_item = malloc(sizeof(*p_rec_item)); if (p_rec_item == NULL) { osm_log(p_rcv->p_log, OSM_LOG_ERROR, "__osm_gir_rcv_new_gir: ERR 5102: " - "cl_qlock_pool_get failed\n"); + "rec_item alloc failed\n"); status = IB_INSUFFICIENT_RESOURCES; goto Exit; } @@ -158,7 +145,7 @@ __osm_gir_rcv_new_gir(IN osm_gir_rcv_t * const p_rcv, cl_ntoh16(match_lid), block_num); } - memset(&p_rec_item->rec, 0, sizeof(p_rec_item->rec)); + memset(p_rec_item, 0, sizeof(*p_rec_item)); p_rec_item->rec.lid = match_lid; p_rec_item->rec.block_num = block_num; @@ -166,8 +153,7 @@ __osm_gir_rcv_new_gir(IN osm_gir_rcv_t * const p_rcv, p_rec_item->rec.guid_info.guid[0] = osm_physp_get_port_guid(p_req_physp); - cl_qlist_insert_tail(p_list, - (cl_list_item_t *) & p_rec_item->pool_item); + cl_qlist_insert_tail(p_list, &p_rec_item->list_item); Exit: OSM_LOG_EXIT(p_rcv->p_log); @@ -465,8 +451,7 @@ void osm_gir_rcv_process(IN void *ctx, IN void *data) (osm_gir_item_t *) cl_qlist_remove_head(&rec_list); while (p_rec_item != (osm_gir_item_t *) cl_qlist_end(&rec_list)) { - cl_qlock_pool_put(&p_rcv->pool, - &p_rec_item->pool_item); + free(p_rec_item); p_rec_item = (osm_gir_item_t *) cl_qlist_remove_head(&rec_list); } @@ -514,7 +499,7 @@ void osm_gir_rcv_process(IN void *ctx, IN void *data) for (i = 0; i < num_rec; i++) { p_rec_item = (osm_gir_item_t *) 
cl_qlist_remove_head(&rec_list); - cl_qlock_pool_put(&p_rcv->pool, &p_rec_item->pool_item); + free(p_rec_item); } osm_sa_send_error(p_rcv->p_resp, p_madw, @@ -559,10 +544,9 @@ void osm_gir_rcv_process(IN void *ctx, IN void *data) for (i = 0; i < pre_trim_num_rec; i++) { p_rec_item = (osm_gir_item_t *) cl_qlist_remove_head(&rec_list); /* copy only if not trimmed */ - if (i < num_rec) { + if (i < num_rec) *p_resp_rec = p_rec_item->rec; - } - cl_qlock_pool_put(&p_rcv->pool, &p_rec_item->pool_item); + free(p_rec_item); p_resp_rec++; } diff --git a/opensm/opensm/osm_sa_informinfo.c b/opensm/opensm/osm_sa_informinfo.c index 71332dc..db58bc0 100644 --- a/opensm/opensm/osm_sa_informinfo.c +++ b/opensm/opensm/osm_sa_informinfo.c @@ -1,5 +1,5 @@ /* - * Copyright (c) 2004-2006 Voltaire, Inc. All rights reserved. + * Copyright (c) 2004-2007 Voltaire, Inc. All rights reserved. * Copyright (c) 2002-2005 Mellanox Technologies LTD. All rights reserved. * Copyright (c) 1996-2003 Intel Corporation. All rights reserved. 
* @@ -66,11 +66,8 @@ #include #include -#define OSM_IIR_RCV_POOL_MIN_SIZE 32 -#define OSM_IIR_RCV_POOL_GROW_SIZE 32 - typedef struct _osm_iir_item { - cl_pool_item_t pool_item; + cl_list_item_t list_item; ib_inform_info_record_t rec; } osm_iir_item_t; @@ -89,7 +86,6 @@ typedef struct _osm_iir_search_ctxt { void osm_infr_rcv_construct(IN osm_infr_rcv_t * const p_rcv) { memset(p_rcv, 0, sizeof(*p_rcv)); - cl_qlock_pool_construct(&p_rcv->pool); } /********************************************************************** @@ -99,7 +95,6 @@ void osm_infr_rcv_destroy(IN osm_infr_rcv_t * const p_rcv) CL_ASSERT(p_rcv); OSM_LOG_ENTER(p_rcv->p_log, osm_infr_rcv_destroy); - cl_qlock_pool_destroy(&p_rcv->pool); OSM_LOG_EXIT(p_rcv->p_log); } @@ -112,8 +107,6 @@ osm_infr_rcv_init(IN osm_infr_rcv_t * const p_rcv, IN osm_subn_t * const p_subn, IN osm_log_t * const p_log, IN cl_plock_t * const p_lock) { - ib_api_status_t status = IB_ERROR; - OSM_LOG_ENTER(p_log, osm_infr_rcv_init); osm_infr_rcv_construct(p_rcv); @@ -124,14 +117,8 @@ osm_infr_rcv_init(IN osm_infr_rcv_t * const p_rcv, p_rcv->p_resp = p_resp; p_rcv->p_mad_pool = p_mad_pool; - status = cl_qlock_pool_init(&p_rcv->pool, - OSM_IIR_RCV_POOL_MIN_SIZE, - 0, - OSM_IIR_RCV_POOL_GROW_SIZE, - sizeof(osm_iir_item_t), NULL, NULL, NULL); - OSM_LOG_EXIT(p_rcv->p_log); - return (status); + return IB_SUCCESS; } /********************************************************************** @@ -392,18 +379,17 @@ __osm_sa_inform_info_rec_by_comp_mask(IN osm_infr_rcv_t * const p_rcv, goto Exit; } - p_rec_item = (osm_iir_item_t *) cl_qlock_pool_get(&p_rcv->pool); + p_rec_item = malloc(sizeof(*p_rec_item)); if (p_rec_item == NULL) { osm_log(p_rcv->p_log, OSM_LOG_ERROR, "__osm_sa_inform_info_rec_by_comp_mask: ERR 430E: " - "cl_qlock_pool_get failed\n"); + "rec_item alloc failed\n"); goto Exit; } memcpy((void *)&p_rec_item->rec, (void *)&p_infr->inform_record, sizeof(ib_inform_info_record_t)); - cl_qlist_insert_tail(p_ctxt->p_list, - (cl_list_item_t *) 
& p_rec_item->pool_item);
+	cl_qlist_insert_tail(p_ctxt->p_list, &p_rec_item->list_item);
 Exit:
 	OSM_LOG_EXIT(p_rcv->p_log);
@@ -519,8 +505,7 @@ osm_infr_rcv_process_get_method(IN osm_infr_rcv_t * const p_rcv,
 			(osm_iir_item_t *) cl_qlist_remove_head(&rec_list);
 		while (p_rec_item !=
 		       (osm_iir_item_t *) cl_qlist_end(&rec_list)) {
-			cl_qlock_pool_put(&p_rcv->pool,
-					  &p_rec_item->pool_item);
+			free(p_rec_item);
 			p_rec_item = (osm_iir_item_t *)
 			    cl_qlist_remove_head(&rec_list);
 		}
@@ -565,7 +550,7 @@ osm_infr_rcv_process_get_method(IN osm_infr_rcv_t * const p_rcv,
 		for (i = 0; i < num_rec; i++) {
 			p_rec_item = (osm_iir_item_t *)
 			    cl_qlist_remove_head(&rec_list);
-			cl_qlock_pool_put(&p_rcv->pool, &p_rec_item->pool_item);
+			free(p_rec_item);
 		}
 		osm_sa_send_error(p_rcv->p_resp, p_madw,
@@ -619,7 +604,7 @@ osm_infr_rcv_process_get_method(IN osm_infr_rcv_t * const p_rcv,
 			for (j = 0; j < 4; j++)
 				p_resp_rec->pad[j] = 0;
 		}
-		cl_qlock_pool_put(&p_rcv->pool, &p_rec_item->pool_item);
+		free(p_rec_item);
 		p_resp_rec++;
 	}
diff --git a/opensm/opensm/osm_sa_lft_record.c b/opensm/opensm/osm_sa_lft_record.c
index e645569..5f3f208 100644
--- a/opensm/opensm/osm_sa_lft_record.c
+++ b/opensm/opensm/osm_sa_lft_record.c
@@ -60,11 +60,8 @@
 #include
 #include
-#define OSM_LFTR_RCV_POOL_MIN_SIZE 32
-#define OSM_LFTR_RCV_POOL_GROW_SIZE 32
-
 typedef struct _osm_lftr_item {
-	cl_pool_item_t pool_item;
+	cl_list_item_t list_item;
 	ib_lft_record_t rec;
 } osm_lftr_item_t;
@@ -81,7 +78,6 @@ typedef struct _osm_lftr_search_ctxt {
 void osm_lftr_rcv_construct(IN osm_lftr_rcv_t * const p_rcv)
 {
 	memset(p_rcv, 0, sizeof(*p_rcv));
-	cl_qlock_pool_construct(&p_rcv->pool);
 }
 /**********************************************************************
@@ -89,7 +85,6 @@ void osm_lftr_rcv_construct(IN osm_lftr_rcv_t * const p_rcv)
 void osm_lftr_rcv_destroy(IN osm_lftr_rcv_t * const p_rcv)
 {
 	OSM_LOG_ENTER(p_rcv->p_log, osm_lftr_rcv_destroy);
-	cl_qlock_pool_destroy(&p_rcv->pool);
 	OSM_LOG_EXIT(p_rcv->p_log);
 }
@@ -102,8 +97,6 @@
osm_lftr_rcv_init(IN osm_lftr_rcv_t * const p_rcv,
 		  IN osm_subn_t * const p_subn,
 		  IN osm_log_t * const p_log, IN cl_plock_t * const p_lock)
 {
-	ib_api_status_t status;
-
 	OSM_LOG_ENTER(p_log, osm_lftr_rcv_init);
 	osm_lftr_rcv_construct(p_rcv);
@@ -114,14 +107,8 @@ osm_lftr_rcv_init(IN osm_lftr_rcv_t * const p_rcv,
 	p_rcv->p_resp = p_resp;
 	p_rcv->p_mad_pool = p_mad_pool;
-	status = cl_qlock_pool_init(&p_rcv->pool,
-				    OSM_LFTR_RCV_POOL_MIN_SIZE,
-				    0,
-				    OSM_LFTR_RCV_POOL_GROW_SIZE,
-				    sizeof(osm_lftr_item_t), NULL, NULL, NULL);
-
 	OSM_LOG_EXIT(p_log);
-	return (status);
+	return IB_SUCCESS;
 }
 /**********************************************************************
@@ -137,16 +124,16 @@ __osm_lftr_rcv_new_lftr(IN osm_lftr_rcv_t * const p_rcv,
 	OSM_LOG_ENTER(p_rcv->p_log, __osm_lftr_rcv_new_lftr);
-	p_rec_item = (osm_lftr_item_t *) cl_qlock_pool_get(&p_rcv->pool);
+	p_rec_item = malloc(sizeof(*p_rec_item));
 	if (p_rec_item == NULL) {
 		osm_log(p_rcv->p_log, OSM_LOG_ERROR,
 			"__osm_lftr_rcv_new_lftr: ERR 4402: "
-			"cl_qlock_pool_get failed\n");
+			"rec_item alloc failed\n");
 		status = IB_INSUFFICIENT_RESOURCES;
 		goto Exit;
 	}
-	if (osm_log_is_active(p_rcv->p_log, OSM_LOG_DEBUG)) {
+	if (osm_log_is_active(p_rcv->p_log, OSM_LOG_DEBUG))
 		osm_log(p_rcv->p_log, OSM_LOG_DEBUG,
 			"__osm_lftr_rcv_new_lftr: "
 			"New LinearForwardingTable: sw 0x%016" PRIx64
@@ -154,9 +141,8 @@ __osm_lftr_rcv_new_lftr(IN osm_lftr_rcv_t * const p_rcv,
 			cl_ntoh64(osm_node_get_node_guid(p_sw->p_node)),
 			cl_ntoh16(block), cl_ntoh16(lid)
 		    );
-	}
-	memset(&p_rec_item->rec, 0, sizeof(ib_lft_record_t));
+	memset(p_rec_item, 0, sizeof(*p_rec_item));
 	p_rec_item->rec.lid = lid;
 	p_rec_item->rec.block_num = block;
@@ -165,8 +151,7 @@ __osm_lftr_rcv_new_lftr(IN osm_lftr_rcv_t * const p_rcv,
 	osm_switch_get_fwd_tbl_block(p_sw, cl_ntoh16(block),
 				     p_rec_item->rec.lft);
-	cl_qlist_insert_tail(p_list,
-			     (cl_list_item_t *) & p_rec_item->pool_item);
+	cl_qlist_insert_tail(p_list, &p_rec_item->list_item);
 Exit:
 	OSM_LOG_EXIT(p_rcv->p_log);
@@
-369,8 +354,7 @@ void osm_lftr_rcv_process(IN void *ctx, IN void *data)
 			(osm_lftr_item_t *) cl_qlist_remove_head(&rec_list);
 		while (p_rec_item !=
 		       (osm_lftr_item_t *) cl_qlist_end(&rec_list)) {
-			cl_qlock_pool_put(&p_rcv->pool,
-					  &p_rec_item->pool_item);
+			free(p_rec_item);
 			p_rec_item = (osm_lftr_item_t *)
 			    cl_qlist_remove_head(&rec_list);
 		}
@@ -418,7 +402,7 @@ void osm_lftr_rcv_process(IN void *ctx, IN void *data)
 		for (i = 0; i < num_rec; i++) {
 			p_rec_item = (osm_lftr_item_t *)
 			    cl_qlist_remove_head(&rec_list);
-			cl_qlock_pool_put(&p_rcv->pool, &p_rec_item->pool_item);
+			free(p_rec_item);
 		}
 		osm_sa_send_error(p_rcv->p_resp, p_madw,
@@ -465,10 +449,9 @@ void osm_lftr_rcv_process(IN void *ctx, IN void *data)
 		p_rec_item = (osm_lftr_item_t *)
 		    cl_qlist_remove_head(&rec_list);
 		/* copy only if not trimmed */
-		if (i < num_rec) {
+		if (i < num_rec)
 			*p_resp_rec = p_rec_item->rec;
-		}
-		cl_qlock_pool_put(&p_rcv->pool, &p_rec_item->pool_item);
+		free(p_rec_item);
 		p_resp_rec++;
 	}
diff --git a/opensm/opensm/osm_sa_link_record.c b/opensm/opensm/osm_sa_link_record.c
index a6bdc8f..ba239be 100644
--- a/opensm/opensm/osm_sa_link_record.c
+++ b/opensm/opensm/osm_sa_link_record.c
@@ -1,5 +1,5 @@
 /*
- * Copyright (c) 2004-2006 Voltaire, Inc. All rights reserved.
+ * Copyright (c) 2004-2007 Voltaire, Inc. All rights reserved.
  * Copyright (c) 2002-2005 Mellanox Technologies LTD. All rights reserved.
  * Copyright (c) 1996-2003 Intel Corporation. All rights reserved.
 *
@@ -61,11 +61,8 @@
 #include
 #include
-#define OSM_LR_RCV_POOL_MIN_SIZE 64
-#define OSM_LR_RCV_POOL_GROW_SIZE 64
-
 typedef struct _osm_lr_item {
-	cl_pool_item_t pool_item;
+	cl_list_item_t list_item;
 	ib_link_record_t link_rec;
 } osm_lr_item_t;
@@ -74,7 +71,6 @@ typedef struct _osm_lr_item {
 void osm_lr_rcv_construct(IN osm_lr_rcv_t * const p_rcv)
 {
 	memset(p_rcv, 0, sizeof(*p_rcv));
-	cl_qlock_pool_construct(&p_rcv->lr_pool);
 }
 /**********************************************************************
@@ -82,7 +78,6 @@ void osm_lr_rcv_construct(IN osm_lr_rcv_t * const p_rcv)
 void osm_lr_rcv_destroy(IN osm_lr_rcv_t * const p_rcv)
 {
 	OSM_LOG_ENTER(p_rcv->p_log, osm_lr_rcv_destroy);
-	cl_qlock_pool_destroy(&p_rcv->lr_pool);
 	OSM_LOG_EXIT(p_rcv->p_log);
 }
@@ -95,8 +90,6 @@ osm_lr_rcv_init(IN osm_lr_rcv_t * const p_rcv,
 		IN osm_subn_t * const p_subn,
 		IN osm_log_t * const p_log, IN cl_plock_t * const p_lock)
 {
-	ib_api_status_t status = IB_SUCCESS;
-
 	OSM_LOG_ENTER(p_log, osm_lr_rcv_init);
 	osm_lr_rcv_construct(p_rcv);
@@ -107,14 +100,8 @@ osm_lr_rcv_init(IN osm_lr_rcv_t * const p_rcv,
 	p_rcv->p_resp = p_resp;
 	p_rcv->p_mad_pool = p_mad_pool;
-	status = cl_qlock_pool_init(&p_rcv->lr_pool,
-				    OSM_LR_RCV_POOL_MIN_SIZE,
-				    0,
-				    OSM_LR_RCV_POOL_GROW_SIZE,
-				    sizeof(osm_lr_item_t), NULL, NULL, NULL);
-
 	OSM_LOG_EXIT(p_rcv->p_log);
-	return (status);
+	return IB_SUCCESS;
 }
 /**********************************************************************
@@ -128,7 +115,7 @@ __osm_lr_rcv_build_physp_link(IN osm_lr_rcv_t * const p_rcv,
 {
 	osm_lr_item_t *p_lr_item;
-	p_lr_item = (osm_lr_item_t *) cl_qlock_pool_get(&p_rcv->lr_pool);
+	p_lr_item = malloc(sizeof(*p_lr_item));
 	if (p_lr_item == NULL) {
 		osm_log(p_rcv->p_log, OSM_LOG_ERROR,
 			"__osm_lr_rcv_build_physp_link: ERR 1801: "
@@ -141,13 +128,14 @@ __osm_lr_rcv_build_physp_link(IN osm_lr_rcv_t * const p_rcv,
 			cl_ntoh16(from_lid), cl_ntoh16(to_lid));
 		return;
 	}
+	memset(p_lr_item, 0, sizeof(*p_lr_item));
 	p_lr_item->link_rec.from_port_num = from_port;
 	p_lr_item->link_rec.to_port_num = to_port;
 	p_lr_item->link_rec.to_lid = to_lid;
 	p_lr_item->link_rec.from_lid = from_lid;
-	cl_qlist_insert_tail(p_list, (cl_list_item_t *) & p_lr_item->pool_item);
+	cl_qlist_insert_tail(p_list, &p_lr_item->list_item);
 }
 /**********************************************************************
@@ -560,8 +548,7 @@ __osm_lr_rcv_respond(IN osm_lr_rcv_t * const p_rcv,
 		/* need to set the mem free ... */
 		p_lr_item = (osm_lr_item_t *) cl_qlist_remove_head(p_list);
 		while (p_lr_item != (osm_lr_item_t *) cl_qlist_end(p_list)) {
-			cl_qlock_pool_put(&p_rcv->lr_pool,
-					  &p_lr_item->pool_item);
+			free(p_lr_item);
 			p_lr_item =
 			    (osm_lr_item_t *) cl_qlist_remove_head(p_list);
 		}
@@ -600,8 +587,7 @@ __osm_lr_rcv_respond(IN osm_lr_rcv_t * const p_rcv,
 		/* Release the quick pool items */
 		p_lr_item = (osm_lr_item_t *) cl_qlist_remove_head(p_list);
 		while (p_lr_item != (osm_lr_item_t *) cl_qlist_end(p_list)) {
-			cl_qlock_pool_put(&p_rcv->lr_pool,
-					  &p_lr_item->pool_item);
+			free(p_lr_item);
 			p_lr_item =
 			    (osm_lr_item_t *) cl_qlist_remove_head(p_list);
 		}
@@ -654,8 +640,7 @@ __osm_lr_rcv_respond(IN osm_lr_rcv_t * const p_rcv,
 			*p_resp_lr = p_lr_item->link_rec;
 			num_copied++;
 		}
-		cl_qlock_pool_put(&p_rcv->lr_pool,
-				  &p_lr_item->pool_item);
+		free(p_lr_item);
 		p_resp_lr++;
 		p_lr_item = (osm_lr_item_t *) cl_qlist_remove_head(p_list);
diff --git a/opensm/opensm/osm_sa_mcmember_record.c b/opensm/opensm/osm_sa_mcmember_record.c
index 5d5fb8d..ddb1ca5 100644
--- a/opensm/opensm/osm_sa_mcmember_record.c
+++ b/opensm/opensm/osm_sa_mcmember_record.c
@@ -70,11 +70,8 @@
 #include
 #include
-#define OSM_MCMR_RCV_POOL_MIN_SIZE 32
-#define OSM_MCMR_RCV_POOL_GROW_SIZE 32
-
 typedef struct _osm_mcmr_item {
-	cl_pool_item_t pool_item;
+	cl_list_item_t list_item;
 	ib_member_rec_t rec;
 } osm_mcmr_item_t;
@@ -93,7 +90,6 @@ typedef struct osm_sa_mcmr_search_ctxt {
 void osm_mcmr_rcv_construct(IN osm_mcmr_recv_t * const p_rcv)
 {
 	memset(p_rcv, 0, sizeof(*p_rcv));
-
cl_qlock_pool_construct(&p_rcv->pool);
 }
 /**********************************************************************
@@ -103,9 +99,6 @@ void osm_mcmr_rcv_destroy(IN osm_mcmr_recv_t * const p_rcv)
 	CL_ASSERT(p_rcv);
 	OSM_LOG_ENTER(p_rcv->p_log, osm_mcmr_rcv_destroy);
-
-	cl_qlock_pool_destroy(&p_rcv->pool);
-
 	OSM_LOG_EXIT(p_rcv->p_log);
 }
@@ -119,8 +112,6 @@ osm_mcmr_rcv_init(IN osm_sm_t * const p_sm,
 		  IN osm_subn_t * const p_subn,
 		  IN osm_log_t * const p_log, IN cl_plock_t * const p_lock)
 {
-	ib_api_status_t status = IB_SUCCESS;
-
 	OSM_LOG_ENTER(p_log, osm_mcmr_rcv_init);
 	osm_mcmr_rcv_construct(p_rcv);
@@ -133,18 +124,8 @@ osm_mcmr_rcv_init(IN osm_sm_t * const p_sm,
 	p_rcv->p_mad_pool = p_mad_pool;
 	p_rcv->mlid_ho = 0xC000;
-	status = cl_qlock_pool_init(&p_rcv->pool,
-				    OSM_MCMR_RCV_POOL_MIN_SIZE,
-				    0,
-				    OSM_MCMR_RCV_POOL_GROW_SIZE,
-				    sizeof(osm_mcmr_item_t), NULL, NULL, NULL);
-	if (status != CL_SUCCESS) {
-		osm_log(p_rcv->p_log, OSM_LOG_ERROR,
-			"osm_mcmr_rcv_init: ERR 1B02: "
-			"qlock pool init failed (%d)\n", status);
-	}
 	OSM_LOG_EXIT(p_rcv->p_log);
-	return (status);
+	return IB_SUCCESS;
 }
 /**********************************************************************
@@ -1727,22 +1708,21 @@ __osm_mcmr_rcv_new_mcmr(IN osm_mcmr_recv_t * const p_rcv,
 	OSM_LOG_ENTER(p_rcv->p_log, __osm_mcmr_rcv_new_mcmr);
-	p_rec_item = (osm_mcmr_item_t *) cl_qlock_pool_get(&p_rcv->pool);
+	p_rec_item = malloc(sizeof(*p_rec_item));
 	if (p_rec_item == NULL) {
 		osm_log(p_rcv->p_log, OSM_LOG_ERROR,
 			"__osm_mcmr_rcv_new_mcmr: ERR 1B15: "
-			"cl_qlock_pool_get failed\n");
+			"rec_item alloc failed\n");
 		status = IB_INSUFFICIENT_RESOURCES;
 		goto Exit;
 	}
-	memset(&p_rec_item->rec, 0, sizeof(p_rec_item->rec));
+	memset(p_rec_item, 0, sizeof(*p_rec_item));
 	/* HACK: Untrusted requesters should result with 0 Join
 	   State, Port Guid, and Proxy */
 	p_rec_item->rec = *p_rcvd_rec;
-	cl_qlist_insert_tail(p_list,
-			     (cl_list_item_t *) & p_rec_item->pool_item);
+	cl_qlist_insert_tail(p_list, &p_rec_item->list_item);
 Exit:
 	OSM_LOG_EXIT(p_rcv->p_log);
@@ -2023,7 +2003,7 @@ __osm_mcmr_query_mgrp(IN osm_mcmr_recv_t * const p_rcv,
 			(osm_mcmr_item_t *) cl_qlist_remove_head(&rec_list);
 		while (p_rec_item !=
 		       (osm_mcmr_item_t *) cl_qlist_end(&rec_list)) {
-			cl_qlock_pool_put(&p_rcv->pool, &p_rec_item->pool_item);
+			free(p_rec_item);
 			p_rec_item = (osm_mcmr_item_t *)
 			    cl_qlist_remove_head(&rec_list);
 		}
@@ -2070,7 +2050,7 @@ __osm_mcmr_query_mgrp(IN osm_mcmr_recv_t * const p_rcv,
 		for (i = 0; i < num_rec; i++) {
 			p_rec_item = (osm_mcmr_item_t *)
 			    cl_qlist_remove_head(&rec_list);
-			cl_qlock_pool_put(&p_rcv->pool, &p_rec_item->pool_item);
+			free(p_rec_item);
 		}
 		osm_sa_send_error(p_rcv->p_resp, p_madw,
@@ -2134,7 +2114,7 @@ __osm_mcmr_query_mgrp(IN osm_mcmr_recv_t * const p_rcv,
 				p_resp_rec->proxy_join = 0;
 			}
 		}
-		cl_qlock_pool_put(&p_rcv->pool, &p_rec_item->pool_item);
+		free(p_rec_item);
 		p_resp_rec++;
 	}
diff --git a/opensm/opensm/osm_sa_mft_record.c b/opensm/opensm/osm_sa_mft_record.c
index 3968304..f9ac527 100644
--- a/opensm/opensm/osm_sa_mft_record.c
+++ b/opensm/opensm/osm_sa_mft_record.c
@@ -59,11 +59,8 @@
 #include
 #include
-#define OSM_MFTR_RCV_POOL_MIN_SIZE 32
-#define OSM_MFTR_RCV_POOL_GROW_SIZE 32
-
 typedef struct _osm_mftr_item {
-	cl_pool_item_t pool_item;
+	cl_list_item_t list_item;
 	ib_mft_record_t rec;
 } osm_mftr_item_t;
@@ -80,7 +77,6 @@ typedef struct _osm_mftr_search_ctxt {
 void osm_mftr_rcv_construct(IN osm_mftr_rcv_t * const p_rcv)
 {
 	memset(p_rcv, 0, sizeof(*p_rcv));
-	cl_qlock_pool_construct(&p_rcv->pool);
 }
 /**********************************************************************
@@ -88,7 +84,6 @@ void osm_mftr_rcv_construct(IN osm_mftr_rcv_t * const p_rcv)
 void osm_mftr_rcv_destroy(IN osm_mftr_rcv_t * const p_rcv)
 {
 	OSM_LOG_ENTER(p_rcv->p_log, osm_mftr_rcv_destroy);
-	cl_qlock_pool_destroy(&p_rcv->pool);
 	OSM_LOG_EXIT(p_rcv->p_log);
 }
@@ -101,8 +96,6 @@ osm_mftr_rcv_init(IN osm_mftr_rcv_t * const p_rcv,
 		  IN osm_subn_t * const p_subn,
 		  IN osm_log_t * const p_log, IN cl_plock_t * const
p_lock)
 {
-	ib_api_status_t status;
-
 	OSM_LOG_ENTER(p_log, osm_mftr_rcv_init);
 	osm_mftr_rcv_construct(p_rcv);
@@ -113,14 +106,8 @@ osm_mftr_rcv_init(IN osm_mftr_rcv_t * const p_rcv,
 	p_rcv->p_resp = p_resp;
 	p_rcv->p_mad_pool = p_mad_pool;
-	status = cl_qlock_pool_init(&p_rcv->pool,
-				    OSM_MFTR_RCV_POOL_MIN_SIZE,
-				    0,
-				    OSM_MFTR_RCV_POOL_GROW_SIZE,
-				    sizeof(osm_mftr_item_t), NULL, NULL, NULL);
-
 	OSM_LOG_EXIT(p_log);
-	return (status);
+	return IB_SUCCESS;
 }
 /**********************************************************************
@@ -138,11 +125,11 @@ __osm_mftr_rcv_new_mftr(IN osm_mftr_rcv_t * const p_rcv,
 	OSM_LOG_ENTER(p_rcv->p_log, __osm_mftr_rcv_new_mftr);
-	p_rec_item = (osm_mftr_item_t *) cl_qlock_pool_get(&p_rcv->pool);
+	p_rec_item = malloc(sizeof(*p_rec_item));
 	if (p_rec_item == NULL) {
 		osm_log(p_rcv->p_log, OSM_LOG_ERROR,
 			"__osm_mftr_rcv_new_mftr: ERR 4A02: "
-			"cl_qlock_pool_get failed\n");
+			"rec_item alloc failed\n");
 		status = IB_INSUFFICIENT_RESOURCES;
 		goto Exit;
 	}
@@ -160,7 +147,7 @@ __osm_mftr_rcv_new_mftr(IN osm_mftr_rcv_t * const p_rcv,
 	position_block_num = ((uint16_t) position << 12) |
 	    (block & IB_MCAST_BLOCK_ID_MASK_HO);
-	memset(&p_rec_item->rec, 0, sizeof(ib_mft_record_t));
+	memset(p_rec_item, 0, sizeof(*p_rec_item));
 	p_rec_item->rec.lid = lid;
 	p_rec_item->rec.position_block_num = cl_hton16(position_block_num);
@@ -168,8 +155,7 @@ __osm_mftr_rcv_new_mftr(IN osm_mftr_rcv_t * const p_rcv,
 	/* copy the mft block */
 	osm_switch_get_mft_block(p_sw, block, position, p_rec_item->rec.mft);
-	cl_qlist_insert_tail(p_list,
-			     (cl_list_item_t *) & p_rec_item->pool_item);
+	cl_qlist_insert_tail(p_list, &p_rec_item->list_item);
 Exit:
 	OSM_LOG_EXIT(p_rcv->p_log);
@@ -399,8 +385,7 @@ void osm_mftr_rcv_process(IN void *ctx, IN void *data)
 			(osm_mftr_item_t *) cl_qlist_remove_head(&rec_list);
 		while (p_rec_item !=
 		       (osm_mftr_item_t *) cl_qlist_end(&rec_list)) {
-			cl_qlock_pool_put(&p_rcv->pool,
-					  &p_rec_item->pool_item);
+			free(p_rec_item);
 			p_rec_item = (osm_mftr_item_t
*) cl_qlist_remove_head(&rec_list);
 		}
@@ -448,7 +433,7 @@ void osm_mftr_rcv_process(IN void *ctx, IN void *data)
 		for (i = 0; i < num_rec; i++) {
 			p_rec_item = (osm_mftr_item_t *)
 			    cl_qlist_remove_head(&rec_list);
-			cl_qlock_pool_put(&p_rcv->pool, &p_rec_item->pool_item);
+			free(p_rec_item);
 		}
 		osm_sa_send_error(p_rcv->p_resp, p_madw,
@@ -495,10 +480,9 @@ void osm_mftr_rcv_process(IN void *ctx, IN void *data)
 		p_rec_item = (osm_mftr_item_t *)
 		    cl_qlist_remove_head(&rec_list);
 		/* copy only if not trimmed */
-		if (i < num_rec) {
+		if (i < num_rec)
 			*p_resp_rec = p_rec_item->rec;
-		}
-		cl_qlock_pool_put(&p_rcv->pool, &p_rec_item->pool_item);
+		free(p_rec_item);
 		p_resp_rec++;
 	}
diff --git a/opensm/opensm/osm_sa_multipath_record.c b/opensm/opensm/osm_sa_multipath_record.c
index a37b726..6851cce 100644
--- a/opensm/opensm/osm_sa_multipath_record.c
+++ b/opensm/opensm/osm_sa_multipath_record.c
@@ -67,13 +67,10 @@
 #include
 #include
-#define OSM_MPR_RCV_POOL_MIN_SIZE 64
-#define OSM_MPR_RCV_POOL_GROW_SIZE 64
-
 #define OSM_SA_MPR_MAX_NUM_PATH 127
 typedef struct _osm_mpr_item {
-	cl_pool_item_t pool_item;
+	cl_list_item_t list_item;
 	const osm_port_t *p_src_port;
 	const osm_port_t *p_dest_port;
 	int hops;
@@ -95,7 +92,6 @@ typedef struct _osm_path_parms {
 void osm_mpr_rcv_construct(IN osm_mpr_rcv_t * const p_rcv)
 {
 	memset(p_rcv, 0, sizeof(*p_rcv));
-	cl_qlock_pool_construct(&p_rcv->pr_pool);
 }
 /**********************************************************************
@@ -103,7 +99,6 @@ void osm_mpr_rcv_construct(IN osm_mpr_rcv_t * const p_rcv)
 void osm_mpr_rcv_destroy(IN osm_mpr_rcv_t * const p_rcv)
 {
 	OSM_LOG_ENTER(p_rcv->p_log, osm_mpr_rcv_destroy);
-	cl_qlock_pool_destroy(&p_rcv->pr_pool);
 	OSM_LOG_EXIT(p_rcv->p_log);
 }
@@ -116,8 +111,6 @@ osm_mpr_rcv_init(IN osm_mpr_rcv_t * const p_rcv,
 		 IN osm_subn_t * const p_subn,
 		 IN osm_log_t * const p_log, IN cl_plock_t * const p_lock)
 {
-	ib_api_status_t status;
-
 	OSM_LOG_ENTER(p_log, osm_mpr_rcv_init);
 	osm_mpr_rcv_construct(p_rcv);
@@ -128,14
+121,8 @@ osm_mpr_rcv_init(IN osm_mpr_rcv_t * const p_rcv,
 	p_rcv->p_resp = p_resp;
 	p_rcv->p_mad_pool = p_mad_pool;
-	status = cl_qlock_pool_init(&p_rcv->pr_pool,
-				    OSM_MPR_RCV_POOL_MIN_SIZE,
-				    0,
-				    OSM_MPR_RCV_POOL_GROW_SIZE,
-				    sizeof(osm_mpr_item_t), NULL, NULL, NULL);
-
 	OSM_LOG_EXIT(p_rcv->p_log);
-	return (status);
+	return IB_SUCCESS;
 }
 /**********************************************************************
@@ -905,20 +892,21 @@ __osm_mpr_rcv_get_lid_pair_path(IN osm_mpr_rcv_t * const p_rcv,
 			"Src LID 0x%X, Dest LID 0x%X\n",
 			src_lid_ho, dest_lid_ho);
-	p_pr_item = (osm_mpr_item_t *) cl_qlock_pool_get(&p_rcv->pr_pool);
+	p_pr_item = malloc(sizeof(*p_pr_item));
 	if (p_pr_item == NULL) {
 		osm_log(p_rcv->p_log, OSM_LOG_ERROR,
 			"__osm_mpr_rcv_get_lid_pair_path: ERR 4501: "
 			"Unable to allocate path record\n");
 		goto Exit;
 	}
+	memset(p_pr_item, 0, sizeof(*p_pr_item));
 	status = __osm_mpr_rcv_get_path_parms(p_rcv, p_mpr, p_src_port,
 					      p_dest_port, dest_lid_ho,
 					      comp_mask, &path_parms);
 	if (status != IB_SUCCESS) {
-		cl_qlock_pool_put(&p_rcv->pr_pool, &p_pr_item->pool_item);
+		free(p_pr_item);
 		p_pr_item = NULL;
 		goto Exit;
 	}
@@ -942,8 +930,7 @@ __osm_mpr_rcv_get_lid_pair_path(IN osm_mpr_rcv_t * const p_rcv,
 				"__osm_mpr_rcv_get_lid_pair_path: "
 				"Requested reversible path but failed to get one\n");
-			cl_qlock_pool_put(&p_rcv->pr_pool,
-					  &p_pr_item->pool_item);
+			free(p_pr_item);
 			p_pr_item = NULL;
 			goto Exit;
 		}
@@ -1084,9 +1071,7 @@ __osm_mpr_rcv_get_port_pair_paths(IN osm_mpr_rcv_t * const p_rcv,
 							    preference);
 		if (p_pr_item) {
-			cl_qlist_insert_tail(p_list,
-					     (cl_list_item_t *) & p_pr_item->
-					     pool_item);
+			cl_qlist_insert_tail(p_list, &p_pr_item->list_item);
 			++path_num;
 		}
@@ -1152,9 +1137,7 @@ __osm_mpr_rcv_get_port_pair_paths(IN osm_mpr_rcv_t * const p_rcv,
 							    preference);
 		if (p_pr_item) {
-			cl_qlist_insert_tail(p_list,
-					     (cl_list_item_t *) & p_pr_item->
-					     pool_item);
+			cl_qlist_insert_tail(p_list, &p_pr_item->list_item);
 			++path_num;
 		}
 	}
@@ -1471,14 +1454,10 @@
__osm_mpr_rcv_get_apm_paths(IN osm_mpr_rcv_t * const p_rcv,
 			matrix[0][0]->path_rec.dlid, matrix[0][0]->hops,
 			matrix[1][1]->path_rec.slid,
 			matrix[1][1]->path_rec.dlid, matrix[1][1]->hops);
-		cl_qlist_insert_tail(p_list,
-				     (cl_list_item_t *) & matrix[0][0]->
-				     pool_item);
-		cl_qlist_insert_tail(p_list,
-				     (cl_list_item_t *) & matrix[1][1]->
-				     pool_item);
-		cl_qlock_pool_put(&p_rcv->pr_pool, &matrix[0][1]->pool_item);
-		cl_qlock_pool_put(&p_rcv->pr_pool, &matrix[1][0]->pool_item);
+		cl_qlist_insert_tail(p_list, &matrix[0][0]->list_item);
+		cl_qlist_insert_tail(p_list, &matrix[1][1]->list_item);
+		free(matrix[0][1]);
+		free(matrix[1][0]);
 	} else {
 		/* Diag B */
 		osm_log(p_rcv->p_log, OSM_LOG_DEBUG,
@@ -1489,14 +1468,10 @@ __osm_mpr_rcv_get_apm_paths(IN osm_mpr_rcv_t * const p_rcv,
 			matrix[0][1]->path_rec.dlid, matrix[0][1]->hops,
 			matrix[1][0]->path_rec.slid,
 			matrix[1][0]->path_rec.dlid, matrix[1][0]->hops);
-		cl_qlist_insert_tail(p_list,
-				     (cl_list_item_t *) & matrix[0][1]->
-				     pool_item);
-		cl_qlist_insert_tail(p_list,
-				     (cl_list_item_t *) & matrix[1][0]->
-				     pool_item);
-		cl_qlock_pool_put(&p_rcv->pr_pool, &matrix[0][0]->pool_item);
-		cl_qlock_pool_put(&p_rcv->pr_pool, &matrix[1][1]->pool_item);
+		cl_qlist_insert_tail(p_list, &matrix[0][1]->list_item);
+		cl_qlist_insert_tail(p_list, &matrix[1][0]->list_item);
+		free(matrix[0][0]);
+		free(matrix[1][1]);
 	}
 	OSM_LOG_EXIT(p_rcv->p_log);
@@ -1598,8 +1573,7 @@ __osm_mpr_rcv_respond(IN osm_mpr_rcv_t * const p_rcv,
 		for (i = 0; i < num_rec; i++) {
 			p_mpr_item =
 			    (osm_mpr_item_t *) cl_qlist_remove_head(p_list);
-			cl_qlock_pool_put(&p_rcv->pr_pool,
-					  &p_mpr_item->pool_item);
+			free(p_mpr_item);
 		}
 		osm_sa_send_error(p_rcv->p_resp, p_madw,
@@ -1634,7 +1608,7 @@ __osm_mpr_rcv_respond(IN osm_mpr_rcv_t * const p_rcv,
 		/* Copy the Path Records from the list into the MAD */
 		*p_resp_pr = p_mpr_item->path_rec;
-		cl_qlock_pool_put(&p_rcv->pr_pool, &p_mpr_item->pool_item);
+		free(p_mpr_item);
 		p_resp_pr++;
 	}
diff --git
a/opensm/opensm/osm_sa_node_record.c b/opensm/opensm/osm_sa_node_record.c
index b94d005..e78e827 100644
--- a/opensm/opensm/osm_sa_node_record.c
+++ b/opensm/opensm/osm_sa_node_record.c
@@ -1,5 +1,5 @@
 /*
- * Copyright (c) 2004-2006 Voltaire, Inc. All rights reserved.
+ * Copyright (c) 2004-2007 Voltaire, Inc. All rights reserved.
  * Copyright (c) 2002-2005 Mellanox Technologies LTD. All rights reserved.
  * Copyright (c) 1996-2003 Intel Corporation. All rights reserved.
  *
@@ -60,11 +60,8 @@
 #include
 #include
-#define OSM_NR_RCV_POOL_MIN_SIZE 32
-#define OSM_NR_RCV_POOL_GROW_SIZE 32
-
 typedef struct _osm_nr_item {
-	cl_pool_item_t pool_item;
+	cl_list_item_t list_item;
 	ib_node_record_t rec;
 } osm_nr_item_t;
@@ -81,7 +78,6 @@ typedef struct _osm_nr_search_ctxt {
 void osm_nr_rcv_construct(IN osm_nr_rcv_t * const p_rcv)
 {
 	memset(p_rcv, 0, sizeof(*p_rcv));
-	cl_qlock_pool_construct(&p_rcv->pool);
 }
 /**********************************************************************
@@ -89,7 +85,6 @@ void osm_nr_rcv_construct(IN osm_nr_rcv_t * const p_rcv)
 void osm_nr_rcv_destroy(IN osm_nr_rcv_t * const p_rcv)
 {
 	OSM_LOG_ENTER(p_rcv->p_log, osm_nr_rcv_destroy);
-	cl_qlock_pool_destroy(&p_rcv->pool);
 	OSM_LOG_EXIT(p_rcv->p_log);
 }
@@ -102,8 +97,6 @@ osm_nr_rcv_init(IN osm_nr_rcv_t * const p_rcv,
 		IN osm_subn_t * const p_subn,
 		IN osm_log_t * const p_log, IN cl_plock_t * const p_lock)
 {
-	ib_api_status_t status;
-
 	OSM_LOG_ENTER(p_log, osm_nr_rcv_init);
 	osm_nr_rcv_construct(p_rcv);
@@ -114,14 +107,8 @@ osm_nr_rcv_init(IN osm_nr_rcv_t * const p_rcv,
 	p_rcv->p_resp = p_resp;
 	p_rcv->p_mad_pool = p_mad_pool;
-	status = cl_qlock_pool_init(&p_rcv->pool,
-				    OSM_NR_RCV_POOL_MIN_SIZE,
-				    0,
-				    OSM_NR_RCV_POOL_GROW_SIZE,
-				    sizeof(osm_nr_item_t), NULL, NULL, NULL);
-
 	OSM_LOG_EXIT(p_log);
-	return (status);
+	return IB_SUCCESS;
 }
 /**********************************************************************
@@ -137,16 +124,16 @@ __osm_nr_rcv_new_nr(IN osm_nr_rcv_t * const p_rcv,
 	OSM_LOG_ENTER(p_rcv->p_log,
__osm_nr_rcv_new_nr);
-	p_rec_item = (osm_nr_item_t *) cl_qlock_pool_get(&p_rcv->pool);
+	p_rec_item = malloc(sizeof(*p_rec_item));
 	if (p_rec_item == NULL) {
 		osm_log(p_rcv->p_log, OSM_LOG_ERROR,
 			"__osm_nr_rcv_new_nr: ERR 1D02: "
-			"cl_qlock_pool_get failed\n");
+			"rec_item alloc failed\n");
 		status = IB_INSUFFICIENT_RESOURCES;
 		goto Exit;
 	}
-	if (osm_log_is_active(p_rcv->p_log, OSM_LOG_DEBUG)) {
+	if (osm_log_is_active(p_rcv->p_log, OSM_LOG_DEBUG))
 		osm_log(p_rcv->p_log, OSM_LOG_DEBUG,
 			"__osm_nr_rcv_new_nr: "
 			"New NodeRecord: node 0x%016" PRIx64
@@ -154,9 +141,8 @@ __osm_nr_rcv_new_nr(IN osm_nr_rcv_t * const p_rcv,
 			cl_ntoh64(osm_node_get_node_guid(p_node)),
 			cl_ntoh64(port_guid), cl_ntoh16(lid)
 		    );
-	}
-	memset(&p_rec_item->rec, 0, sizeof(ib_node_record_t));
+	memset(p_rec_item, 0, sizeof(*p_rec_item));
 	p_rec_item->rec.lid = lid;
@@ -164,8 +150,7 @@ __osm_nr_rcv_new_nr(IN osm_nr_rcv_t * const p_rcv,
 	p_rec_item->rec.node_info.port_guid = port_guid;
 	memcpy(&(p_rec_item->rec.node_desc), &(p_node->node_desc),
 	       IB_NODE_DESCRIPTION_SIZE);
-	cl_qlist_insert_tail(p_list,
-			     (cl_list_item_t *) & p_rec_item->pool_item);
+	cl_qlist_insert_tail(p_list, &p_rec_item->list_item);
 Exit:
 	OSM_LOG_EXIT(p_rcv->p_log);
@@ -459,7 +444,7 @@ void osm_nr_rcv_process(IN void *ctx, IN void *data)
 		/* need to set the mem free ...
*/
 		p_rec_item = (osm_nr_item_t *) cl_qlist_remove_head(&rec_list);
 		while (p_rec_item != (osm_nr_item_t *) cl_qlist_end(&rec_list)) {
-			cl_qlock_pool_put(&p_rcv->pool, &p_rec_item->pool_item);
+			free(p_rec_item);
 			p_rec_item = (osm_nr_item_t *)
 			    cl_qlist_remove_head(&rec_list);
 		}
@@ -506,7 +491,7 @@ void osm_nr_rcv_process(IN void *ctx, IN void *data)
 		for (i = 0; i < num_rec; i++) {
 			p_rec_item = (osm_nr_item_t *)
 			    cl_qlist_remove_head(&rec_list);
-			cl_qlock_pool_put(&p_rcv->pool, &p_rec_item->pool_item);
+			free(p_rec_item);
 		}
 		osm_sa_send_error(p_rcv->p_resp, p_madw,
@@ -551,10 +536,9 @@ void osm_nr_rcv_process(IN void *ctx, IN void *data)
 	for (i = 0; i < pre_trim_num_rec; i++) {
 		p_rec_item = (osm_nr_item_t *) cl_qlist_remove_head(&rec_list);
 		/* copy only if not trimmed */
-		if (i < num_rec) {
+		if (i < num_rec)
 			*p_resp_rec = p_rec_item->rec;
-		}
-		cl_qlock_pool_put(&p_rcv->pool, &p_rec_item->pool_item);
+		free(p_rec_item);
 		p_resp_rec++;
 	}
diff --git a/opensm/opensm/osm_sa_path_record.c b/opensm/opensm/osm_sa_path_record.c
index 8d26da4..2ea6211 100644
--- a/opensm/opensm/osm_sa_path_record.c
+++ b/opensm/opensm/osm_sa_path_record.c
@@ -73,15 +73,12 @@
 #include
 #include
-#define OSM_PR_RCV_POOL_MIN_SIZE 64
-#define OSM_PR_RCV_POOL_GROW_SIZE 64
-
 extern uint8_t osm_get_lash_sl(osm_opensm_t * p_osm,
 			       const osm_port_t * p_src_port,
 			       const osm_port_t * p_dst_port);
 typedef struct _osm_pr_item {
-	cl_pool_item_t pool_item;
+	cl_list_item_t list_item;
 	ib_path_rec_t path_rec;
 } osm_pr_item_t;
@@ -111,7 +108,6 @@ static const ib_gid_t zero_gid = { {0x00, 0x00, 0x00, 0x00,
 void osm_pr_rcv_construct(IN osm_pr_rcv_t * const p_rcv)
 {
 	memset(p_rcv, 0, sizeof(*p_rcv));
-	cl_qlock_pool_construct(&p_rcv->pr_pool);
 }
 /**********************************************************************
@@ -119,7 +115,6 @@ void osm_pr_rcv_construct(IN osm_pr_rcv_t * const p_rcv)
 void osm_pr_rcv_destroy(IN osm_pr_rcv_t * const p_rcv)
 {
 	OSM_LOG_ENTER(p_rcv->p_log, osm_pr_rcv_destroy);
-
cl_qlock_pool_destroy(&p_rcv->pr_pool);
 	OSM_LOG_EXIT(p_rcv->p_log);
 }
@@ -132,8 +127,6 @@ osm_pr_rcv_init(IN osm_pr_rcv_t * const p_rcv,
 		IN osm_subn_t * const p_subn,
 		IN osm_log_t * const p_log, IN cl_plock_t * const p_lock)
 {
-	ib_api_status_t status;
-
 	OSM_LOG_ENTER(p_log, osm_pr_rcv_init);
 	osm_pr_rcv_construct(p_rcv);
@@ -144,14 +137,8 @@ osm_pr_rcv_init(IN osm_pr_rcv_t * const p_rcv,
 	p_rcv->p_resp = p_resp;
 	p_rcv->p_mad_pool = p_mad_pool;
-	status = cl_qlock_pool_init(&p_rcv->pr_pool,
-				    OSM_PR_RCV_POOL_MIN_SIZE,
-				    0,
-				    OSM_PR_RCV_POOL_GROW_SIZE,
-				    sizeof(osm_pr_item_t), NULL, NULL, NULL);
-
 	OSM_LOG_EXIT(p_rcv->p_log);
-	return (status);
+	return IB_SUCCESS;
 }
 /**********************************************************************
@@ -939,20 +926,21 @@ __osm_pr_rcv_get_lid_pair_path(IN osm_pr_rcv_t * const p_rcv,
 			"Src LID 0x%X, Dest LID 0x%X\n",
 			src_lid_ho, dest_lid_ho);
-	p_pr_item = (osm_pr_item_t *) cl_qlock_pool_get(&p_rcv->pr_pool);
+	p_pr_item = malloc(sizeof(*p_pr_item));
 	if (p_pr_item == NULL) {
 		osm_log(p_rcv->p_log, OSM_LOG_ERROR,
 			"__osm_pr_rcv_get_lid_pair_path: ERR 1F01: "
 			"Unable to allocate path record\n");
 		goto Exit;
 	}
+	memset(p_pr_item, 0, sizeof(*p_pr_item));
 	status = __osm_pr_rcv_get_path_parms(p_rcv, p_pr, p_src_port,
 					     p_dest_port, dest_lid_ho,
 					     comp_mask, &path_parms);
 	if (status != IB_SUCCESS) {
-		cl_qlock_pool_put(&p_rcv->pr_pool, &p_pr_item->pool_item);
+		free(p_pr_item);
 		p_pr_item = NULL;
 		goto Exit;
 	}
@@ -976,8 +964,7 @@ __osm_pr_rcv_get_lid_pair_path(IN osm_pr_rcv_t * const p_rcv,
 				"__osm_pr_rcv_get_lid_pair_path: "
 				"Requested reversible path but failed to get one\n");
-			cl_qlock_pool_put(&p_rcv->pr_pool,
-					  &p_pr_item->pool_item);
+			free(p_pr_item);
 			p_pr_item = NULL;
 			goto Exit;
 		}
@@ -1158,9 +1145,7 @@ __osm_pr_rcv_get_port_pair_paths(IN osm_pr_rcv_t * const p_rcv,
 							   preference);
 		if (p_pr_item) {
-			cl_qlist_insert_tail(p_list,
-					     (cl_list_item_t *) & p_pr_item->
-					     pool_item);
+			cl_qlist_insert_tail(p_list, &p_pr_item->list_item);
 			++path_num;
 		}
@@ -1226,9 +1211,7 @@ __osm_pr_rcv_get_port_pair_paths(IN osm_pr_rcv_t * const p_rcv,
 							   preference);
 		if (p_pr_item) {
-			cl_qlist_insert_tail(p_list,
-					     (cl_list_item_t *) & p_pr_item->
-					     pool_item);
+			cl_qlist_insert_tail(p_list, &p_pr_item->list_item);
 			++path_num;
 		}
 	}
@@ -1861,8 +1844,7 @@ __osm_pr_rcv_respond(IN osm_pr_rcv_t * const p_rcv,
 			(osm_pr_item_t *) cl_qlist_remove_head(p_list);
 		while (p_pr_item !=
 		       (osm_pr_item_t *) cl_qlist_end(p_list)) {
-			cl_qlock_pool_put(&p_rcv->pr_pool,
-					  &p_pr_item->pool_item);
+			free(p_pr_item);
 			p_pr_item =
 			    (osm_pr_item_t *) cl_qlist_remove_head(p_list);
 		}
@@ -1907,8 +1889,7 @@ __osm_pr_rcv_respond(IN osm_pr_rcv_t * const p_rcv,
 		for (i = 0; i < num_rec; i++) {
 			p_pr_item =
 			    (osm_pr_item_t *) cl_qlist_remove_head(p_list);
-			cl_qlock_pool_put(&p_rcv->pr_pool,
-					  &p_pr_item->pool_item);
+			free(p_pr_item);
 		}
 		osm_sa_send_error(p_rcv->p_resp, p_madw,
@@ -1949,7 +1930,7 @@ __osm_pr_rcv_respond(IN osm_pr_rcv_t * const p_rcv,
 		if (i < num_rec)
 			*p_resp_pr = p_pr_item->path_rec;
-		cl_qlock_pool_put(&p_rcv->pr_pool, &p_pr_item->pool_item);
+		free(p_pr_item);
 		p_resp_pr++;
 	}
@@ -2113,14 +2094,14 @@ void osm_pr_rcv_process(IN void *context, IN void *data)
 			goto Unlock;
 		}
-		p_pr_item =
-		    (osm_pr_item_t *) cl_qlock_pool_get(&p_rcv->pr_pool);
+		p_pr_item = malloc(sizeof(*p_pr_item));
 		if (p_pr_item == NULL) {
 			osm_log(p_rcv->p_log, OSM_LOG_ERROR,
 				"osm_pr_rcv_process: ERR 1F18: "
 				"Unable to allocate path record for MC group\n");
 			goto Unlock;
 		}
+		memset(p_pr_item, 0, sizeof(*p_pr_item));
 		/* Copy PathRecord request into response */
 		p_sa_mad = osm_madw_get_sa_mad_ptr(p_madw);
@@ -2157,9 +2138,7 @@ void osm_pr_rcv_process(IN void *context, IN void *data)
 		p_pr_item->path_rec.hop_flow_raw =
 		    cl_hton32(hop_limit) | (flow_label << 8);
-		cl_qlist_insert_tail(&pr_list, (cl_list_item_t *)
-				     & p_pr_item->pool_item);
-
+		cl_qlist_insert_tail(&pr_list, &p_pr_item->list_item);
 	}
 Unlock:
diff --git a/opensm/opensm/osm_sa_pkey_record.c b/opensm/opensm/osm_sa_pkey_record.c
index 4402b94..1e9f50f 100644
--- a/opensm/opensm/osm_sa_pkey_record.c
+++ b/opensm/opensm/osm_sa_pkey_record.c
@@ -51,11 +51,8 @@
 #include
 #include
-#define OSM_PKEY_REC_RCV_POOL_MIN_SIZE 32
-#define OSM_PKEY_REC_RCV_POOL_GROW_SIZE 32
-
 typedef struct _osm_pkey_item {
-	cl_pool_item_t pool_item;
+	cl_list_item_t list_item;
 	ib_pkey_table_record_t rec;
 } osm_pkey_item_t;
@@ -73,7 +70,6 @@ typedef struct _osm_pkey_search_ctxt {
 void osm_pkey_rec_rcv_construct(IN osm_pkey_rec_rcv_t * const p_rcv)
 {
 	memset(p_rcv, 0, sizeof(*p_rcv));
-	cl_qlock_pool_construct(&p_rcv->pool);
 }
 /**********************************************************************
@@ -81,7 +77,6 @@ void osm_pkey_rec_rcv_construct(IN osm_pkey_rec_rcv_t * const p_rcv)
 void osm_pkey_rec_rcv_destroy(IN osm_pkey_rec_rcv_t * const p_rcv)
 {
 	OSM_LOG_ENTER(p_rcv->p_log, osm_pkey_rec_rcv_destroy);
-	cl_qlock_pool_destroy(&p_rcv->pool);
 	OSM_LOG_EXIT(p_rcv->p_log);
 }
@@ -94,8 +89,6 @@ osm_pkey_rec_rcv_init(IN osm_pkey_rec_rcv_t * const p_rcv,
 		      IN osm_subn_t * const p_subn,
 		      IN osm_log_t * const p_log, IN cl_plock_t * const p_lock)
 {
-	ib_api_status_t status;
-
 	OSM_LOG_ENTER(p_log, osm_pkey_rec_rcv_init);
 	osm_pkey_rec_rcv_construct(p_rcv);
@@ -106,15 +99,8 @@ osm_pkey_rec_rcv_init(IN osm_pkey_rec_rcv_t * const p_rcv,
 	p_rcv->p_resp = p_resp;
 	p_rcv->p_mad_pool = p_mad_pool;
-	/* used for matching records collection */
-	status = cl_qlock_pool_init(&p_rcv->pool,
-				    OSM_PKEY_REC_RCV_POOL_MIN_SIZE,
-				    0,
-				    OSM_PKEY_REC_RCV_POOL_GROW_SIZE,
-				    sizeof(osm_pkey_item_t), NULL, NULL, NULL);
-
 	OSM_LOG_EXIT(p_log);
-	return (status);
+	return IB_SUCCESS;
 }
 /**********************************************************************
@@ -131,11 +117,11 @@ __osm_sa_pkey_create(IN osm_pkey_rec_rcv_t * const p_rcv,
 	OSM_LOG_ENTER(p_rcv->p_log, __osm_sa_pkey_create);
-	p_rec_item = (osm_pkey_item_t *) cl_qlock_pool_get(&p_rcv->pool);
+	p_rec_item = malloc(sizeof(*p_rec_item));
 	if (p_rec_item == NULL) {
 		osm_log(p_rcv->p_log, OSM_LOG_ERROR,
"__osm_sa_pkey_create: ERR 4602: " - "cl_qlock_pool_get failed\n"); + "rec_item alloc failed\n"); status = IB_INSUFFICIENT_RESOURCES; goto Exit; } @@ -154,7 +140,7 @@ __osm_sa_pkey_create(IN osm_pkey_rec_rcv_t * const p_rcv, cl_ntoh16(lid), osm_physp_get_port_num(p_physp), block); } - memset(&p_rec_item->rec, 0, sizeof(p_rec_item->rec)); + memset(p_rec_item, 0, sizeof(*p_rec_item)); p_rec_item->rec.lid = lid; p_rec_item->rec.block_num = block; @@ -162,8 +148,7 @@ __osm_sa_pkey_create(IN osm_pkey_rec_rcv_t * const p_rcv, p_rec_item->rec.pkey_tbl = *(osm_pkey_tbl_block_get(osm_physp_get_pkey_tbl(p_physp), block)); - cl_qlist_insert_tail(p_ctxt->p_list, - (cl_list_item_t *) & p_rec_item->pool_item); + cl_qlist_insert_tail(p_ctxt->p_list, &p_rec_item->list_item); Exit: OSM_LOG_EXIT(p_rcv->p_log); @@ -444,8 +429,7 @@ void osm_pkey_rec_rcv_process(IN void *ctx, IN void *data) (osm_pkey_item_t *) cl_qlist_remove_head(&rec_list); while (p_rec_item != (osm_pkey_item_t *) cl_qlist_end(&rec_list)) { - cl_qlock_pool_put(&p_rcv->pool, - &p_rec_item->pool_item); + free(p_rec_item); p_rec_item = (osm_pkey_item_t *) cl_qlist_remove_head(&rec_list); } @@ -494,7 +478,7 @@ void osm_pkey_rec_rcv_process(IN void *ctx, IN void *data) for (i = 0; i < num_rec; i++) { p_rec_item = (osm_pkey_item_t *) cl_qlist_remove_head(&rec_list); - cl_qlock_pool_put(&p_rcv->pool, &p_rec_item->pool_item); + free(p_rec_item); } osm_sa_send_error(p_rcv->p_resp, p_madw, @@ -541,10 +525,9 @@ void osm_pkey_rec_rcv_process(IN void *ctx, IN void *data) p_rec_item = (osm_pkey_item_t *) cl_qlist_remove_head(&rec_list); /* copy only if not trimmed */ - if (i < num_rec) { + if (i < num_rec) *p_resp_rec = p_rec_item->rec; - } - cl_qlock_pool_put(&p_rcv->pool, &p_rec_item->pool_item); + free(p_rec_item); p_resp_rec++; } diff --git a/opensm/opensm/osm_sa_portinfo_record.c b/opensm/opensm/osm_sa_portinfo_record.c index 22869e6..ed3684c 100644 --- a/opensm/opensm/osm_sa_portinfo_record.c +++ 
b/opensm/opensm/osm_sa_portinfo_record.c @@ -64,11 +64,8 @@ #include #include -#define OSM_PIR_RCV_POOL_MIN_SIZE 32 -#define OSM_PIR_RCV_POOL_GROW_SIZE 32 - typedef struct _osm_pir_item { - cl_pool_item_t pool_item; + cl_list_item_t list_item; ib_portinfo_record_t rec; } osm_pir_item_t; @@ -86,7 +83,6 @@ typedef struct _osm_pir_search_ctxt { void osm_pir_rcv_construct(IN osm_pir_rcv_t * const p_rcv) { memset(p_rcv, 0, sizeof(*p_rcv)); - cl_qlock_pool_construct(&p_rcv->pool); } /********************************************************************** @@ -94,7 +90,6 @@ void osm_pir_rcv_construct(IN osm_pir_rcv_t * const p_rcv) void osm_pir_rcv_destroy(IN osm_pir_rcv_t * const p_rcv) { OSM_LOG_ENTER(p_rcv->p_log, osm_pir_rcv_destroy); - cl_qlock_pool_destroy(&p_rcv->pool); OSM_LOG_EXIT(p_rcv->p_log); } @@ -107,8 +102,6 @@ osm_pir_rcv_init(IN osm_pir_rcv_t * const p_rcv, IN osm_subn_t * const p_subn, IN osm_log_t * const p_log, IN cl_plock_t * const p_lock) { - ib_api_status_t status; - OSM_LOG_ENTER(p_log, osm_pir_rcv_init); osm_pir_rcv_construct(p_rcv); @@ -119,14 +112,8 @@ osm_pir_rcv_init(IN osm_pir_rcv_t * const p_rcv, p_rcv->p_resp = p_resp; p_rcv->p_mad_pool = p_mad_pool; - status = cl_qlock_pool_init(&p_rcv->pool, - OSM_PIR_RCV_POOL_MIN_SIZE, - 0, - OSM_PIR_RCV_POOL_GROW_SIZE, - sizeof(osm_pir_item_t), NULL, NULL, NULL); - OSM_LOG_EXIT(p_log); - return (status); + return IB_SUCCESS; } /********************************************************************** @@ -141,32 +128,30 @@ __osm_pir_rcv_new_pir(IN osm_pir_rcv_t * const p_rcv, OSM_LOG_ENTER(p_rcv->p_log, __osm_pir_rcv_new_pir); - p_rec_item = (osm_pir_item_t *) cl_qlock_pool_get(&p_rcv->pool); + p_rec_item = malloc(sizeof(*p_rec_item)); if (p_rec_item == NULL) { osm_log(p_rcv->p_log, OSM_LOG_ERROR, "__osm_pir_rcv_new_pir: ERR 2102: " - "cl_qlock_pool_get failed\n"); + "rec_item alloc failed\n"); status = IB_INSUFFICIENT_RESOURCES; goto Exit; } - if (osm_log_is_active(p_rcv->p_log, OSM_LOG_DEBUG)) { + if 
(osm_log_is_active(p_rcv->p_log, OSM_LOG_DEBUG)) osm_log(p_rcv->p_log, OSM_LOG_DEBUG, "__osm_pir_rcv_new_pir: " "New PortInfoRecord: port 0x%016" PRIx64 ", lid 0x%X, port 0x%X\n", cl_ntoh64(osm_physp_get_port_guid(p_physp)), cl_ntoh16(lid), osm_physp_get_port_num(p_physp)); - } - memset(&p_rec_item->rec, 0, sizeof(p_rec_item->rec)); + memset(p_rec_item, 0, sizeof(*p_rec_item)); p_rec_item->rec.lid = lid; p_rec_item->rec.port_info = p_physp->port_info; p_rec_item->rec.port_num = osm_physp_get_port_num(p_physp); - cl_qlist_insert_tail(p_list, - (cl_list_item_t *) & p_rec_item->pool_item); + cl_qlist_insert_tail(p_list, &p_rec_item->list_item); Exit: OSM_LOG_EXIT(p_rcv->p_log); @@ -676,8 +661,7 @@ void osm_pir_rcv_process(IN void *ctx, IN void *data) (osm_pir_item_t *) cl_qlist_remove_head(&rec_list); while (p_rec_item != (osm_pir_item_t *) cl_qlist_end(&rec_list)) { - cl_qlock_pool_put(&p_rcv->pool, - &p_rec_item->pool_item); + free(p_rec_item); p_rec_item = (osm_pir_item_t *) cl_qlist_remove_head(&rec_list); } @@ -725,7 +709,7 @@ void osm_pir_rcv_process(IN void *ctx, IN void *data) for (i = 0; i < num_rec; i++) { p_rec_item = (osm_pir_item_t *) cl_qlist_remove_head(&rec_list); - cl_qlock_pool_put(&p_rcv->pool, &p_rec_item->pool_item); + free(p_rec_item); } osm_sa_send_error(p_rcv->p_resp, p_madw, @@ -786,7 +770,7 @@ void osm_pir_rcv_process(IN void *ctx, IN void *data) if (trusted_req == FALSE) p_resp_rec->port_info.m_key = 0; } - cl_qlock_pool_put(&p_rcv->pool, &p_rec_item->pool_item); + free(p_rec_item); p_resp_rec++; } diff --git a/opensm/opensm/osm_sa_service_record.c b/opensm/opensm/osm_sa_service_record.c index abad29f..fb0193e 100644 --- a/opensm/opensm/osm_sa_service_record.c +++ b/opensm/opensm/osm_sa_service_record.c @@ -66,11 +66,8 @@ #include #include -#define OSM_SR_RCV_POOL_MIN_SIZE 64 -#define OSM_SR_RCV_POOL_GROW_SIZE 64 - typedef struct _osm_sr_item { - cl_pool_item_t pool_item; + cl_list_item_t list_item; ib_service_record_t service_rec; } 
osm_sr_item_t; @@ -79,7 +76,6 @@ typedef struct osm_sr_match_item { ib_service_record_t *p_service_rec; ib_net64_t comp_mask; osm_sr_rcv_t *p_rcv; - } osm_sr_match_item_t; typedef struct _osm_sr_search_ctxt { @@ -92,7 +88,6 @@ typedef struct _osm_sr_search_ctxt { void osm_sr_rcv_construct(IN osm_sr_rcv_t * const p_rcv) { memset(p_rcv, 0, sizeof(*p_rcv)); - cl_qlock_pool_construct(&p_rcv->sr_pool); cl_timer_construct(&p_rcv->sr_timer); } @@ -101,7 +96,6 @@ void osm_sr_rcv_construct(IN osm_sr_rcv_t * const p_rcv) void osm_sr_rcv_destroy(IN osm_sr_rcv_t * const p_rcv) { OSM_LOG_ENTER(p_rcv->p_log, osm_sr_rcv_destroy); - cl_qlock_pool_destroy(&p_rcv->sr_pool); cl_timer_trim(&p_rcv->sr_timer, 1); cl_timer_destroy(&p_rcv->sr_timer); OSM_LOG_EXIT(p_rcv->p_log); @@ -116,8 +110,7 @@ osm_sr_rcv_init(IN osm_sr_rcv_t * const p_rcv, IN osm_subn_t * const p_subn, IN osm_log_t * const p_log, IN cl_plock_t * const p_lock) { - ib_api_status_t status = IB_ERROR; - cl_status_t cl_status; + ib_api_status_t status; OSM_LOG_ENTER(p_log, osm_sr_rcv_init); @@ -129,20 +122,8 @@ osm_sr_rcv_init(IN osm_sr_rcv_t * const p_rcv, p_rcv->p_resp = p_resp; p_rcv->p_mad_pool = p_mad_pool; - cl_status = cl_qlock_pool_init(&p_rcv->sr_pool, - OSM_SR_RCV_POOL_MIN_SIZE, - 0, - OSM_SR_RCV_POOL_GROW_SIZE, - sizeof(osm_sr_item_t), NULL, NULL, NULL); - if (cl_status != CL_SUCCESS) - goto Exit; - status = cl_timer_init(&p_rcv->sr_timer, osm_sr_rcv_lease_cb, p_rcv); - if (cl_status != CL_SUCCESS) - goto Exit; - status = IB_SUCCESS; - Exit: OSM_LOG_EXIT(p_rcv->p_log); return (status); } @@ -315,8 +296,7 @@ __osm_sr_rcv_respond(IN osm_sr_rcv_t * const p_rcv, /* need to set the mem free ... 
*/ p_sr_item = (osm_sr_item_t *) cl_qlist_remove_head(p_list); while (p_sr_item != (osm_sr_item_t *) cl_qlist_end(p_list)) { - cl_qlock_pool_put(&p_rcv->sr_pool, - &p_sr_item->pool_item); + free(p_sr_item); p_sr_item = (osm_sr_item_t *) cl_qlist_remove_head(p_list); } @@ -355,8 +335,7 @@ __osm_sr_rcv_respond(IN osm_sr_rcv_t * const p_rcv, /* Release the quick pool items */ p_sr_item = (osm_sr_item_t *) cl_qlist_remove_head(p_list); while (p_sr_item != (osm_sr_item_t *) cl_qlist_end(p_list)) { - cl_qlock_pool_put(&p_rcv->sr_pool, - &p_sr_item->pool_item); + free(p_sr_item); p_sr_item = (osm_sr_item_t *) cl_qlist_remove_head(p_list); } @@ -430,8 +409,7 @@ __osm_sr_rcv_respond(IN osm_sr_rcv_t * const p_rcv, num_copied++; } - cl_qlock_pool_put(&p_rcv->sr_pool, - &p_sr_item->pool_item); + free(p_sr_item); p_resp_sr++; p_sr_item = (osm_sr_item_t *) cl_qlist_remove_head(p_list); @@ -668,9 +646,7 @@ __get_matching_sr(IN cl_list_item_t * const p_list_item, IN void *context) } } - p_sr_pool_item = - (osm_sr_item_t *) cl_qlock_pool_get(&p_sr_item->p_rcv->sr_pool); - + p_sr_pool_item = malloc(sizeof(*p_sr_pool_item)); if (p_sr_pool_item == NULL) { osm_log(p_sr_item->p_rcv->p_log, OSM_LOG_ERROR, "__get_matching_sr: ERR 2408: " @@ -680,8 +656,7 @@ __get_matching_sr(IN cl_list_item_t * const p_list_item, IN void *context) p_sr_pool_item->service_rec = p_svcr->service_record; - cl_qlist_insert_tail(&p_sr_item->sr_list, - (cl_list_item_t *) & p_sr_pool_item->pool_item); + cl_qlist_insert_tail(&p_sr_item->sr_list, &p_sr_pool_item->list_item); Exit: return; @@ -848,7 +823,7 @@ osm_sr_rcv_process_set_method(IN osm_sr_rcv_t * const p_rcv, p_svcr->modified_time = cl_get_time_stamp_sec(); } - p_sr_item = (osm_sr_item_t *) cl_qlock_pool_get(&p_rcv->sr_pool); + p_sr_item = malloc(sizeof(*p_sr_item)); if (p_sr_item == NULL) { osm_log(p_rcv->p_log, OSM_LOG_ERROR, "osm_sr_rcv_process_set_method: ERR 2412: " @@ -866,8 +841,7 @@ osm_sr_rcv_process_set_method(IN osm_sr_rcv_t * const p_rcv, 
p_sr_item->service_rec = *p_recvd_service_rec; cl_qlist_init(&sr_list); - cl_qlist_insert_tail(&sr_list, - (cl_list_item_t *) & p_sr_item->pool_item); + cl_qlist_insert_tail(&sr_list, &p_sr_item->list_item); __osm_sr_rcv_respond(p_rcv, p_madw, &sr_list); @@ -925,7 +899,7 @@ osm_sr_rcv_process_delete_method(IN osm_sr_rcv_t * const p_rcv, cl_plock_release(p_rcv->p_lock); - p_sr_item = (osm_sr_item_t *) cl_qlock_pool_get(&p_rcv->sr_pool); + p_sr_item = malloc(sizeof(*p_sr_item)); if (p_sr_item == NULL) { osm_log(p_rcv->p_log, OSM_LOG_ERROR, "osm_sr_rcv_process_delete_method: ERR 2413: " @@ -939,8 +913,7 @@ osm_sr_rcv_process_delete_method(IN osm_sr_rcv_t * const p_rcv, p_sr_item->service_rec = p_svcr->service_record; cl_qlist_init(&sr_list); - cl_qlist_insert_tail(&sr_list, - (cl_list_item_t *) & p_sr_item->pool_item); + cl_qlist_insert_tail(&sr_list, &p_sr_item->list_item); if (p_svcr) osm_svcr_delete(p_svcr); diff --git a/opensm/opensm/osm_sa_slvl_record.c b/opensm/opensm/osm_sa_slvl_record.c index 8d8e4dc..fd48296 100644 --- a/opensm/opensm/osm_sa_slvl_record.c +++ b/opensm/opensm/osm_sa_slvl_record.c @@ -63,11 +63,8 @@ #include #include -#define OSM_SLVL_REC_RCV_POOL_MIN_SIZE 32 -#define OSM_SLVL_REC_RCV_POOL_GROW_SIZE 32 - typedef struct _osm_slvl_item { - cl_pool_item_t pool_item; + cl_list_item_t list_item; ib_slvl_table_record_t rec; } osm_slvl_item_t; @@ -85,7 +82,6 @@ typedef struct _osm_slvl_search_ctxt { void osm_slvl_rec_rcv_construct(IN osm_slvl_rec_rcv_t * const p_rcv) { memset(p_rcv, 0, sizeof(*p_rcv)); - cl_qlock_pool_construct(&p_rcv->pool); } /********************************************************************** @@ -93,7 +89,6 @@ void osm_slvl_rec_rcv_construct(IN osm_slvl_rec_rcv_t * const p_rcv) void osm_slvl_rec_rcv_destroy(IN osm_slvl_rec_rcv_t * const p_rcv) { OSM_LOG_ENTER(p_rcv->p_log, osm_slvl_rec_rcv_destroy); - cl_qlock_pool_destroy(&p_rcv->pool); OSM_LOG_EXIT(p_rcv->p_log); } @@ -106,8 +101,6 @@ osm_slvl_rec_rcv_init(IN 
osm_slvl_rec_rcv_t * const p_rcv, IN osm_subn_t * const p_subn, IN osm_log_t * const p_log, IN cl_plock_t * const p_lock) { - ib_api_status_t status; - OSM_LOG_ENTER(p_log, osm_slvl_rec_rcv_init); osm_slvl_rec_rcv_construct(p_rcv); @@ -118,15 +111,8 @@ osm_slvl_rec_rcv_init(IN osm_slvl_rec_rcv_t * const p_rcv, p_rcv->p_resp = p_resp; p_rcv->p_mad_pool = p_mad_pool; - /* used for matching records collection */ - status = cl_qlock_pool_init(&p_rcv->pool, - OSM_SLVL_REC_RCV_POOL_MIN_SIZE, - 0, - OSM_SLVL_REC_RCV_POOL_GROW_SIZE, - sizeof(osm_slvl_item_t), NULL, NULL, NULL); - OSM_LOG_EXIT(p_log); - return (status); + return IB_SUCCESS; } /********************************************************************** @@ -143,11 +129,11 @@ __osm_sa_slvl_create(IN osm_slvl_rec_rcv_t * const p_rcv, OSM_LOG_ENTER(p_rcv->p_log, __osm_sa_slvl_create); - p_rec_item = (osm_slvl_item_t *) cl_qlock_pool_get(&p_rcv->pool); + p_rec_item = malloc(sizeof(*p_rec_item)); if (p_rec_item == NULL) { osm_log(p_rcv->p_log, OSM_LOG_ERROR, "__osm_sa_slvl_create: ERR 2602: " - "cl_qlock_pool_get failed\n"); + "rec_item alloc failed\n"); status = IB_INSUFFICIENT_RESOURCES; goto Exit; } @@ -166,7 +152,7 @@ __osm_sa_slvl_create(IN osm_slvl_rec_rcv_t * const p_rcv, cl_ntoh16(lid), osm_physp_get_port_num(p_physp), in_port_idx); - memset(&p_rec_item->rec, 0, sizeof(p_rec_item->rec)); + memset(p_rec_item, 0, sizeof(*p_rec_item)); p_rec_item->rec.lid = lid; p_rec_item->rec.out_port_num = osm_physp_get_port_num(p_physp); @@ -174,8 +160,7 @@ __osm_sa_slvl_create(IN osm_slvl_rec_rcv_t * const p_rcv, p_rec_item->rec.slvl_tbl = *(osm_physp_get_slvl_tbl(p_physp, in_port_idx)); - cl_qlist_insert_tail(p_ctxt->p_list, - (cl_list_item_t *) & p_rec_item->pool_item); + cl_qlist_insert_tail(p_ctxt->p_list, &p_rec_item->list_item); Exit: OSM_LOG_EXIT(p_rcv->p_log); @@ -419,8 +404,7 @@ void osm_slvl_rec_rcv_process(IN void *ctx, IN void *data) (osm_slvl_item_t *) cl_qlist_remove_head(&rec_list); while (p_rec_item != 
(osm_slvl_item_t *) cl_qlist_end(&rec_list)) { - cl_qlock_pool_put(&p_rcv->pool, - &p_rec_item->pool_item); + free(p_rec_item); p_rec_item = (osm_slvl_item_t *) cl_qlist_remove_head(&rec_list); } @@ -469,7 +453,7 @@ void osm_slvl_rec_rcv_process(IN void *ctx, IN void *data) for (i = 0; i < num_rec; i++) { p_rec_item = (osm_slvl_item_t *) cl_qlist_remove_head(&rec_list); - cl_qlock_pool_put(&p_rcv->pool, &p_rec_item->pool_item); + free(p_rec_item); } osm_sa_send_error(p_rcv->p_resp, p_madw, @@ -518,7 +502,7 @@ void osm_slvl_rec_rcv_process(IN void *ctx, IN void *data) /* copy only if not trimmed */ if (i < num_rec) *p_resp_rec = p_rec_item->rec; - cl_qlock_pool_put(&p_rcv->pool, &p_rec_item->pool_item); + free(p_rec_item); p_resp_rec++; } diff --git a/opensm/opensm/osm_sa_sminfo_record.c b/opensm/opensm/osm_sa_sminfo_record.c index 2aa136e..6f84ac7 100644 --- a/opensm/opensm/osm_sa_sminfo_record.c +++ b/opensm/opensm/osm_sa_sminfo_record.c @@ -1,5 +1,5 @@ /* - * Copyright (c) 2004-2006 Voltaire, Inc. All rights reserved. + * Copyright (c) 2004-2007 Voltaire, Inc. All rights reserved. * Copyright (c) 2002-2005 Mellanox Technologies LTD. All rights reserved. * Copyright (c) 1996-2003 Intel Corporation. All rights reserved. 
* @@ -70,11 +70,8 @@ #include #include -#define OSM_SMIR_RCV_POOL_MIN_SIZE 32 -#define OSM_SMIR_RCV_POOL_GROW_SIZE 32 - typedef struct _osm_smir_item { - cl_pool_item_t pool_item; + cl_list_item_t list_item; ib_sminfo_record_t rec; } osm_smir_item_t; @@ -91,7 +88,6 @@ typedef struct _osm_smir_search_ctxt { void osm_smir_rcv_construct(IN osm_smir_rcv_t * const p_rcv) { memset(p_rcv, 0, sizeof(*p_rcv)); - cl_qlock_pool_construct(&p_rcv->pool); } /********************************************************************** @@ -99,9 +95,7 @@ void osm_smir_rcv_construct(IN osm_smir_rcv_t * const p_rcv) void osm_smir_rcv_destroy(IN osm_smir_rcv_t * const p_rcv) { CL_ASSERT(p_rcv); - OSM_LOG_ENTER(p_rcv->p_log, osm_smir_rcv_destroy); - cl_qlock_pool_destroy(&p_rcv->pool); OSM_LOG_EXIT(p_rcv->p_log); } @@ -115,8 +109,6 @@ osm_smir_rcv_init(IN osm_smir_rcv_t * const p_rcv, IN osm_stats_t * const p_stats, IN osm_log_t * const p_log, IN cl_plock_t * const p_lock) { - ib_api_status_t status = IB_SUCCESS; - OSM_LOG_ENTER(p_log, osm_smir_rcv_init); osm_smir_rcv_construct(p_rcv); @@ -128,14 +120,8 @@ osm_smir_rcv_init(IN osm_smir_rcv_t * const p_rcv, p_rcv->p_stats = p_stats; p_rcv->p_mad_pool = p_mad_pool; - status = cl_qlock_pool_init(&p_rcv->pool, - OSM_SMIR_RCV_POOL_MIN_SIZE, - 0, - OSM_SMIR_RCV_POOL_GROW_SIZE, - sizeof(osm_smir_item_t), NULL, NULL, NULL); - OSM_LOG_EXIT(p_rcv->p_log); - return (status); + return IB_SUCCESS; } static ib_api_status_t @@ -152,31 +138,29 @@ __osm_smir_rcv_new_smir(IN osm_smir_rcv_t * const p_rcv, OSM_LOG_ENTER(p_rcv->p_log, __osm_smir_rcv_new_smir); - p_rec_item = (osm_smir_item_t *) cl_qlock_pool_get(&p_rcv->pool); + p_rec_item = malloc(sizeof(*p_rec_item)); if (p_rec_item == NULL) { osm_log(p_rcv->p_log, OSM_LOG_ERROR, "__osm_smir_rcv_new_smir: ERR 2801: " - "cl_qlock_pool_get failed\n"); + "rec_item alloc failed\n"); status = IB_INSUFFICIENT_RESOURCES; goto Exit; } - if (osm_log_is_active(p_rcv->p_log, OSM_LOG_DEBUG)) { + if 
(osm_log_is_active(p_rcv->p_log, OSM_LOG_DEBUG)) osm_log(p_rcv->p_log, OSM_LOG_DEBUG, "__osm_smir_rcv_new_smir: " "New SMInfo: GUID 0x%016" PRIx64 "\n", cl_ntoh64(guid) ); - } - memset(&p_rec_item->rec, 0, sizeof(ib_sminfo_record_t)); + memset(p_rec_item, 0, sizeof(*p_rec_item)); p_rec_item->rec.lid = osm_port_get_base_lid(p_port); p_rec_item->rec.sm_info.guid = guid; p_rec_item->rec.sm_info.act_count = act_count; p_rec_item->rec.sm_info.pri_state = pri_state; - cl_qlist_insert_tail(p_list, - (cl_list_item_t *) & p_rec_item->pool_item); + cl_qlist_insert_tail(p_list, &p_rec_item->list_item); Exit: OSM_LOG_EXIT(p_rcv->p_log); @@ -445,8 +429,7 @@ void osm_smir_rcv_process(IN void *ctx, IN void *data) (osm_smir_item_t *) cl_qlist_remove_head(&rec_list); while (p_rec_item != (osm_smir_item_t *) cl_qlist_end(&rec_list)) { - cl_qlock_pool_put(&p_rcv->pool, - &p_rec_item->pool_item); + free(p_rec_item); p_rec_item = (osm_smir_item_t *) cl_qlist_remove_head(&rec_list); } @@ -493,7 +476,7 @@ void osm_smir_rcv_process(IN void *ctx, IN void *data) for (i = 0; i < num_rec; i++) { p_rec_item = (osm_smir_item_t *) cl_qlist_remove_head(&rec_list); - cl_qlock_pool_put(&p_rcv->pool, &p_rec_item->pool_item); + free(p_rec_item); } osm_sa_send_error(p_rcv->p_resp, p_madw, @@ -544,7 +527,7 @@ void osm_smir_rcv_process(IN void *ctx, IN void *data) *p_resp_rec = p_rec_item->rec; p_resp_rec->sm_info.sm_key = 0; } - cl_qlock_pool_put(&p_rcv->pool, &p_rec_item->pool_item); + free(p_rec_item); p_resp_rec++; } diff --git a/opensm/opensm/osm_sa_sw_info_record.c b/opensm/opensm/osm_sa_sw_info_record.c index edfe106..a9947e1 100644 --- a/opensm/opensm/osm_sa_sw_info_record.c +++ b/opensm/opensm/osm_sa_sw_info_record.c @@ -63,7 +63,7 @@ #define OSM_SIR_RCV_POOL_GROW_SIZE 32 typedef struct _osm_sir_item { - cl_pool_item_t pool_item; + cl_list_item_t list_item; ib_switch_info_record_t rec; } osm_sir_item_t; @@ -80,7 +80,6 @@ typedef struct _osm_sir_search_ctxt { void osm_sir_rcv_construct(IN 
osm_sir_rcv_t * const p_rcv) { memset(p_rcv, 0, sizeof(*p_rcv)); - cl_qlock_pool_construct(&p_rcv->pool); } /********************************************************************** @@ -88,7 +87,6 @@ void osm_sir_rcv_construct(IN osm_sir_rcv_t * const p_rcv) void osm_sir_rcv_destroy(IN osm_sir_rcv_t * const p_rcv) { OSM_LOG_ENTER(p_rcv->p_log, osm_sir_rcv_destroy); - cl_qlock_pool_destroy(&p_rcv->pool); OSM_LOG_EXIT(p_rcv->p_log); } @@ -101,8 +99,6 @@ osm_sir_rcv_init(IN osm_sir_rcv_t * const p_rcv, IN osm_subn_t * const p_subn, IN osm_log_t * const p_log, IN cl_plock_t * const p_lock) { - ib_api_status_t status; - OSM_LOG_ENTER(p_log, osm_sir_rcv_init); osm_sir_rcv_construct(p_rcv); @@ -113,14 +109,8 @@ osm_sir_rcv_init(IN osm_sir_rcv_t * const p_rcv, p_rcv->p_resp = p_resp; p_rcv->p_mad_pool = p_mad_pool; - status = cl_qlock_pool_init(&p_rcv->pool, - OSM_SIR_RCV_POOL_MIN_SIZE, - 0, - OSM_SIR_RCV_POOL_GROW_SIZE, - sizeof(osm_sir_item_t), NULL, NULL, NULL); - OSM_LOG_EXIT(p_log); - return (status); + return IB_SUCCESS; } /********************************************************************** @@ -135,29 +125,27 @@ __osm_sir_rcv_new_sir(IN osm_sir_rcv_t * const p_rcv, OSM_LOG_ENTER(p_rcv->p_log, __osm_sir_rcv_new_sir); - p_rec_item = (osm_sir_item_t *) cl_qlock_pool_get(&p_rcv->pool); + p_rec_item = malloc(sizeof(*p_rec_item)); if (p_rec_item == NULL) { osm_log(p_rcv->p_log, OSM_LOG_ERROR, "__osm_sir_rcv_new_sir: ERR 5308: " - "cl_qlock_pool_get failed\n"); + "rec_item alloc failed\n"); status = IB_INSUFFICIENT_RESOURCES; goto Exit; } - if (osm_log_is_active(p_rcv->p_log, OSM_LOG_DEBUG)) { + if (osm_log_is_active(p_rcv->p_log, OSM_LOG_DEBUG)) osm_log(p_rcv->p_log, OSM_LOG_DEBUG, "__osm_sir_rcv_new_sir: " "New SwitchInfoRecord: lid 0x%X\n", cl_ntoh16(lid) ); - } - memset(&p_rec_item->rec, 0, sizeof(ib_switch_info_record_t)); + memset(p_rec_item, 0, sizeof(*p_rec_item)); p_rec_item->rec.lid = lid; p_rec_item->rec.switch_info = p_sw->switch_info; - 
cl_qlist_insert_tail(p_list, - (cl_list_item_t *) & p_rec_item->pool_item); + cl_qlist_insert_tail(p_list, &p_rec_item->list_item); Exit: OSM_LOG_EXIT(p_rcv->p_log); @@ -393,7 +381,7 @@ void osm_sir_rcv_process(IN void *ctx, IN void *data) /* need to set the mem free ... */ p_rec_item = (osm_sir_item_t *) cl_qlist_remove_head(&rec_list); while (p_rec_item != (osm_sir_item_t *) cl_qlist_end(&rec_list)) { - cl_qlock_pool_put(&p_rcv->pool, &p_rec_item->pool_item); + free(p_rec_item); p_rec_item = (osm_sir_item_t *) cl_qlist_remove_head(&rec_list); } @@ -442,7 +430,7 @@ void osm_sir_rcv_process(IN void *ctx, IN void *data) for (i = 0; i < num_rec; i++) { p_rec_item = (osm_sir_item_t *) cl_qlist_remove_head(&rec_list); - cl_qlock_pool_put(&p_rcv->pool, &p_rec_item->pool_item); + free(p_rec_item); } osm_sa_send_error(p_rcv->p_resp, p_madw, @@ -487,10 +475,9 @@ void osm_sir_rcv_process(IN void *ctx, IN void *data) for (i = 0; i < pre_trim_num_rec; i++) { p_rec_item = (osm_sir_item_t *) cl_qlist_remove_head(&rec_list); /* copy only if not trimmed */ - if (i < num_rec) { + if (i < num_rec) *p_resp_rec = p_rec_item->rec; - } - cl_qlock_pool_put(&p_rcv->pool, &p_rec_item->pool_item); + free(p_rec_item); p_resp_rec++; } diff --git a/opensm/opensm/osm_sa_vlarb_record.c b/opensm/opensm/osm_sa_vlarb_record.c index 49d688a..a538a0b 100644 --- a/opensm/opensm/osm_sa_vlarb_record.c +++ b/opensm/opensm/osm_sa_vlarb_record.c @@ -63,11 +63,8 @@ #include #include -#define OSM_VLARB_REC_RCV_POOL_MIN_SIZE 32 -#define OSM_VLARB_REC_RCV_POOL_GROW_SIZE 32 - typedef struct _osm_vl_arb_item { - cl_pool_item_t pool_item; + cl_list_item_t list_item; ib_vl_arb_table_record_t rec; } osm_vl_arb_item_t; @@ -85,7 +82,6 @@ typedef struct _osm_vl_arb_search_ctxt { void osm_vlarb_rec_rcv_construct(IN osm_vlarb_rec_rcv_t * const p_rcv) { memset(p_rcv, 0, sizeof(*p_rcv)); - cl_qlock_pool_construct(&p_rcv->pool); } /********************************************************************** @@ -93,7 +89,6 @@ 
void osm_vlarb_rec_rcv_construct(IN osm_vlarb_rec_rcv_t * const p_rcv) void osm_vlarb_rec_rcv_destroy(IN osm_vlarb_rec_rcv_t * const p_rcv) { OSM_LOG_ENTER(p_rcv->p_log, osm_vlarb_rec_rcv_destroy); - cl_qlock_pool_destroy(&p_rcv->pool); OSM_LOG_EXIT(p_rcv->p_log); } @@ -106,8 +101,6 @@ osm_vlarb_rec_rcv_init(IN osm_vlarb_rec_rcv_t * const p_rcv, IN osm_subn_t * const p_subn, IN osm_log_t * const p_log, IN cl_plock_t * const p_lock) { - ib_api_status_t status; - OSM_LOG_ENTER(p_log, osm_vlarb_rec_rcv_init); osm_vlarb_rec_rcv_construct(p_rcv); @@ -118,16 +111,8 @@ osm_vlarb_rec_rcv_init(IN osm_vlarb_rec_rcv_t * const p_rcv, p_rcv->p_resp = p_resp; p_rcv->p_mad_pool = p_mad_pool; - /* used for matching records collection */ - status = cl_qlock_pool_init(&p_rcv->pool, - OSM_VLARB_REC_RCV_POOL_MIN_SIZE, - 0, - OSM_VLARB_REC_RCV_POOL_GROW_SIZE, - sizeof(osm_vl_arb_item_t), - NULL, NULL, NULL); - OSM_LOG_EXIT(p_log); - return (status); + return IB_SUCCESS; } /********************************************************************** @@ -144,11 +129,11 @@ __osm_sa_vl_arb_create(IN osm_vlarb_rec_rcv_t * const p_rcv, OSM_LOG_ENTER(p_rcv->p_log, __osm_sa_vl_arb_create); - p_rec_item = (osm_vl_arb_item_t *) cl_qlock_pool_get(&p_rcv->pool); + p_rec_item = malloc(sizeof(*p_rec_item)); if (p_rec_item == NULL) { osm_log(p_rcv->p_log, OSM_LOG_ERROR, "__osm_sa_vl_arb_create: ERR 2A02: " - "cl_qlock_pool_get failed\n"); + "rec_item alloc failed\n"); status = IB_INSUFFICIENT_RESOURCES; goto Exit; } @@ -158,24 +143,22 @@ __osm_sa_vl_arb_create(IN osm_vlarb_rec_rcv_t * const p_rcv, else lid = osm_node_get_base_lid(p_physp->p_node, 0); - if (osm_log_is_active(p_rcv->p_log, OSM_LOG_DEBUG)) { + if (osm_log_is_active(p_rcv->p_log, OSM_LOG_DEBUG)) osm_log(p_rcv->p_log, OSM_LOG_DEBUG, "__osm_sa_vl_arb_create: " "New VLArbitration for: port 0x%016" PRIx64 ", lid 0x%X, port 0x%X Block:%u\n", cl_ntoh64(osm_physp_get_port_guid(p_physp)), cl_ntoh16(lid), osm_physp_get_port_num(p_physp), block); - } - 
memset(&p_rec_item->rec, 0, sizeof(p_rec_item->rec)); + memset(p_rec_item, 0, sizeof(*p_rec_item)); p_rec_item->rec.lid = lid; p_rec_item->rec.port_num = osm_physp_get_port_num(p_physp); p_rec_item->rec.block_num = block; p_rec_item->rec.vl_arb_tbl = *(osm_physp_get_vla_tbl(p_physp, block)); - cl_qlist_insert_tail(p_ctxt->p_list, - (cl_list_item_t *) & p_rec_item->pool_item); + cl_qlist_insert_tail(p_ctxt->p_list, &p_rec_item->list_item); Exit: OSM_LOG_EXIT(p_rcv->p_log); @@ -436,8 +419,7 @@ void osm_vlarb_rec_rcv_process(IN void *ctx, IN void *data) cl_qlist_remove_head(&rec_list); while (p_rec_item != (osm_vl_arb_item_t *) cl_qlist_end(&rec_list)) { - cl_qlock_pool_put(&p_rcv->pool, - &p_rec_item->pool_item); + free(p_rec_item); p_rec_item = (osm_vl_arb_item_t *) cl_qlist_remove_head(&rec_list); } @@ -487,7 +469,7 @@ void osm_vlarb_rec_rcv_process(IN void *ctx, IN void *data) for (i = 0; i < num_rec; i++) { p_rec_item = (osm_vl_arb_item_t *) cl_qlist_remove_head(&rec_list); - cl_qlock_pool_put(&p_rcv->pool, &p_rec_item->pool_item); + free(p_rec_item); } osm_sa_send_error(p_rcv->p_resp, p_madw, @@ -534,10 +516,9 @@ void osm_vlarb_rec_rcv_process(IN void *ctx, IN void *data) p_rec_item = (osm_vl_arb_item_t *) cl_qlist_remove_head(&rec_list); /* copy only if not trimmed */ - if (i < num_rec) { + if (i < num_rec) *p_resp_rec = p_rec_item->rec; - } - cl_qlock_pool_put(&p_rcv->pool, &p_rec_item->pool_item); + free(p_rec_item); p_resp_rec++; } -- 1.5.3.4.206.g58ba4 From sashak at voltaire.com Thu Jan 3 02:01:11 2008 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Thu, 3 Jan 2008 10:01:11 +0000 Subject: [ofa-general] [PATCH 0/4] opensm: cleanup dummy _rcv_t objects Message-ID: <11993544753970-git-send-email-sashak@voltaire.com> Hi, The issue of data duplications over various dummy sub-objects in OpenSM was raised many times. This patch series starts some cleanups in this area and removes dummy *_rcv_t sub-objects in SM and SA mad receiver processors. 
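To illustrate the cleanup described above, here is a minimal sketch of the pattern, not the actual OpenSM code. The types `log_t`, `subn_t`, `lock_t`, `sm_t` and the handler names are hypothetical stand-ins; the real objects are `osm_log_t`, `osm_subn_t`, `cl_plock_t` and `osm_sm_t`. The idea is that each dispatcher callback receives the one shared manager object as its opaque context, instead of a per-receiver struct that only duplicates the same three pointers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the shared log, subnet and lock objects. */
typedef struct log { int messages; } log_t;
typedef struct subn { int blocks_seen; } subn_t;
typedef struct lock { int held; } lock_t;

/* Single owner of the shared pointers (plays the role of osm_sm_t here). */
typedef struct sm {
	log_t *p_log;
	subn_t *p_subn;
	lock_t *p_lock;
} sm_t;

/* Common body: take the lock, update subnet state, release the lock. */
static int account_block(sm_t *sm, const int *p_block)
{
	sm->p_lock->held = 1;
	sm->p_subn->blocks_seen += *p_block;
	sm->p_lock->held = 0;
	return sm->p_subn->blocks_seen;
}

/* Callback for one attribute: the context is the sm itself, no lft_rcv_t. */
int lft_rcv_process(void *context, void *data)
{
	sm_t *sm = context;

	assert(sm != NULL);
	assert(data != NULL);
	return account_block(sm, data);
}

/* A second callback borrows the same sm; nothing is copied per receiver. */
int mft_rcv_process(void *context, void *data)
{
	sm_t *sm = context;

	assert(sm != NULL);
	assert(data != NULL);
	return account_block(sm, data);
}
```

With this shape there is no per-receiver construct/init/destroy triple to maintain, and a pointer such as the log handle exists in exactly one place.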
This patch series is for the master branch only.

Sasha

From sashak at voltaire.com Thu Jan 3 02:01:12 2008
From: sashak at voltaire.com (Sasha Khapyorsky)
Date: Thu, 3 Jan 2008 10:01:12 +0000
Subject: [ofa-general] [PATCH 1/4] opensm: cleanup SM related _rcv_t objects
In-Reply-To: <11993544753970-git-send-email-sashak@voltaire.com>
References: <11993544753970-git-send-email-sashak@voltaire.com>
Message-ID: <11993544752530-git-send-email-sashak@voltaire.com>

This removes SM related dummy *_rcv_t objects and eliminates some data
duplications. Instead osm_sm_t is used.

Signed-off-by: Sasha Khapyorsky
---
 opensm/include/opensm/osm_inform.h |    1 -
 opensm/include/opensm/osm_sm.h     |   43 +-----
 opensm/opensm/osm_inform.c         |    2 -
 opensm/opensm/osm_lin_fwd_rcv.c    |   59 ++-------
 opensm/opensm/osm_mcast_fwd_rcv.c  |   63 ++-------
 opensm/opensm/osm_node_desc_rcv.c  |   70 ++-------
 opensm/opensm/osm_node_info_rcv.c  |  276 +++++++++++++++---------------------
 opensm/opensm/osm_pkey_rcv.c       |   75 ++--------
 opensm/opensm/osm_port_info_rcv.c  |  201 ++++++++++----------------
 opensm/opensm/osm_slvl_map_rcv.c   |   75 ++--------
 opensm/opensm/osm_sm.c             |  120 ++++------------
 opensm/opensm/osm_sminfo_rcv.c     |  255 +++++++++++++--------------------
 opensm/opensm/osm_sw_info_rcv.c    |  185 ++++++++++---------------
 opensm/opensm/osm_trap_rcv.c       |  184 +++++++++----------------
 opensm/opensm/osm_vl_arb_rcv.c     |   76 ++--------
 15 files changed, 546 insertions(+), 1139 deletions(-)

diff --git a/opensm/include/opensm/osm_inform.h b/opensm/include/opensm/osm_inform.h
index 91c0c64..0ec6a1b 100644
--- a/opensm/include/opensm/osm_inform.h
+++ b/opensm/include/opensm/osm_inform.h
@@ -58,7 +58,6 @@
 #include
 #include
 #include
-#include

 #ifdef __cplusplus
 # define BEGIN_C_DECLS extern "C" {
diff --git a/opensm/include/opensm/osm_sm.h b/opensm/include/opensm/osm_sm.h
index a676cd6..f68d59e 100644
--- a/opensm/include/opensm/osm_sm.h
+++ b/opensm/include/opensm/osm_sm.h
@@ -53,6 +53,7 @@
 #include
 #include
 #include
+#include
#include #include #include @@ -61,24 +62,13 @@ #include #include #include -#include -#include -#include -#include #include #include #include #include #include -#include -#include #include -#include -#include #include -#include -#include -#include #include #include #include @@ -130,6 +120,7 @@ typedef struct osm_sm { cl_event_t signal_event; cl_event_t subnet_up_event; cl_timer_t sweep_timer; + cl_event_wheel_t trap_aging_tracker; cl_thread_t sweeper; osm_subn_t *p_subn; osm_db_t *p_db; @@ -144,26 +135,15 @@ typedef struct osm_sm { cl_qlist_t mgrp_list; osm_req_t req; osm_resp_t resp; - osm_ni_rcv_t ni_rcv; - osm_pi_rcv_t pi_rcv; - osm_nd_rcv_t nd_rcv; osm_sm_mad_ctrl_t mad_ctrl; - osm_si_rcv_t si_rcv; osm_lid_mgr_t lid_mgr; osm_ucast_mgr_t ucast_mgr; osm_link_mgr_t link_mgr; osm_state_mgr_t state_mgr; osm_drop_mgr_t drop_mgr; - osm_lft_rcv_t lft_rcv; - osm_mft_rcv_t mft_rcv; osm_sweep_fail_ctrl_t sweep_fail_ctrl; - osm_sminfo_rcv_t sm_info_rcv; - osm_trap_rcv_t trap_rcv; osm_sm_state_mgr_t sm_state_mgr; osm_mcast_mgr_t mcast_mgr; - osm_slvl_rcv_t slvl_rcv; - osm_vla_rcv_t vla_rcv; - osm_pkey_rcv_t pkey_rcv; cl_disp_reg_handle_t ni_disp_h; cl_disp_reg_handle_t pi_disp_h; cl_disp_reg_handle_t nd_disp_h; @@ -181,8 +161,8 @@ typedef struct osm_sm { * p_subn * Pointer to the Subnet object for this subnet. * -* p_db -* Pointer to the database (persistency) object +* p_db +* Pointer to the database (persistency) object * * p_vendor * Pointer to the vendor specific interfaces object. @@ -202,21 +182,6 @@ typedef struct osm_sm { * resp * MAD attribute responder. * -* nd_rcv_ctrl -* Node Description Receive Controller. -* -* ni_rcv_ctrl -* Node Info Receive Controller. -* -* pi_rcv_ctrl -* Port Info Receive Controller. -* -* si_rcv_ctrl -* Switch Info Receive Controller. -* -* nd_rcv_ctrl -* Node Description Receive Controller. -* * mad_ctrl * MAD Controller. 
*
diff --git a/opensm/opensm/osm_inform.c b/opensm/opensm/osm_inform.c
index 69eaaef..e488e3b 100644
--- a/opensm/opensm/osm_inform.c
+++ b/opensm/opensm/osm_inform.c
@@ -50,11 +50,9 @@
 #include
 #include
 #include
-#include
 #include
 #include
 #include
-#include
 #include
 #include
 #include
diff --git a/opensm/opensm/osm_lin_fwd_rcv.c b/opensm/opensm/osm_lin_fwd_rcv.c
index efc0b1b..7d9d1af 100644
--- a/opensm/opensm/osm_lin_fwd_rcv.c
+++ b/opensm/opensm/osm_lin_fwd_rcv.c
@@ -51,53 +51,14 @@
 
 #include
 #include
-#include
 #include
-
-/**********************************************************************
- **********************************************************************/
-void osm_lft_rcv_construct(IN osm_lft_rcv_t * const p_rcv)
-{
-	memset(p_rcv, 0, sizeof(*p_rcv));
-}
-
-/**********************************************************************
- **********************************************************************/
-void osm_lft_rcv_destroy(IN osm_lft_rcv_t * const p_rcv)
-{
-	CL_ASSERT(p_rcv);
-
-	OSM_LOG_ENTER(p_rcv->p_log, osm_lft_rcv_destroy);
-
-	OSM_LOG_EXIT(p_rcv->p_log);
-}
-
-/**********************************************************************
- **********************************************************************/
-ib_api_status_t
-osm_lft_rcv_init(IN osm_lft_rcv_t * const p_rcv,
-		 IN osm_subn_t * const p_subn,
-		 IN osm_log_t * const p_log, IN cl_plock_t * const p_lock)
-{
-	ib_api_status_t status = IB_SUCCESS;
-
-	OSM_LOG_ENTER(p_log, osm_lft_rcv_init);
-
-	osm_lft_rcv_construct(p_rcv);
-
-	p_rcv->p_log = p_log;
-	p_rcv->p_subn = p_subn;
-	p_rcv->p_lock = p_lock;
-
-	OSM_LOG_EXIT(p_rcv->p_log);
-	return (status);
-}
+#include
 
 /**********************************************************************
  **********************************************************************/
 void osm_lft_rcv_process(IN void *context, IN void *data)
 {
-	osm_lft_rcv_t *p_rcv = context;
+	osm_sm_t *sm = context;
 	osm_madw_t *p_madw = data;
 	ib_smp_t *p_smp;
 	uint32_t block_num;
@@ -107,9 +68,9 @@ void osm_lft_rcv_process(IN void *context, IN void *data)
 	ib_net64_t node_guid;
 	ib_api_status_t status;
 
-	CL_ASSERT(p_rcv);
+	CL_ASSERT(sm);
 
-	OSM_LOG_ENTER(p_rcv->p_log, osm_lft_rcv_process);
+	OSM_LOG_ENTER(sm->p_log, osm_lft_rcv_process);
 
 	CL_ASSERT(p_madw);
 
@@ -123,18 +84,18 @@ void osm_lft_rcv_process(IN void *context, IN void *data)
 	p_lft_context = osm_madw_get_lft_context_ptr(p_madw);
 	node_guid = p_lft_context->node_guid;
 
-	CL_PLOCK_EXCL_ACQUIRE(p_rcv->p_lock);
-	p_sw = osm_get_switch_by_guid(p_rcv->p_subn, node_guid);
+	CL_PLOCK_EXCL_ACQUIRE(sm->p_lock);
+	p_sw = osm_get_switch_by_guid(sm->p_subn, node_guid);
 
 	if (!p_sw) {
-		osm_log(p_rcv->p_log, OSM_LOG_ERROR,
+		osm_log(sm->p_log, OSM_LOG_ERROR,
 			"osm_lft_rcv_process: ERR 0401: "
 			"LFT received for nonexistent node "
 			"0x%" PRIx64 "\n", cl_ntoh64(node_guid));
 	} else {
 		status = osm_switch_set_ft_block(p_sw, p_block, block_num);
 		if (status != IB_SUCCESS) {
-			osm_log(p_rcv->p_log, OSM_LOG_ERROR,
+			osm_log(sm->p_log, OSM_LOG_ERROR,
 				"osm_lft_rcv_process: ERR 0402: "
 				"Setting forwarding table block failed (%s)"
 				"\n\t\t\t\tSwitch 0x%" PRIx64 "\n",
@@ -142,6 +103,6 @@ void osm_lft_rcv_process(IN void *context, IN void *data)
 		}
 	}
 
-	CL_PLOCK_RELEASE(p_rcv->p_lock);
-	OSM_LOG_EXIT(p_rcv->p_log);
+	CL_PLOCK_RELEASE(sm->p_lock);
+	OSM_LOG_EXIT(sm->p_log);
 }
diff --git a/opensm/opensm/osm_mcast_fwd_rcv.c b/opensm/opensm/osm_mcast_fwd_rcv.c
index aa497b5..3233def 100644
--- a/opensm/opensm/osm_mcast_fwd_rcv.c
+++ b/opensm/opensm/osm_mcast_fwd_rcv.c
@@ -54,56 +54,17 @@
 #include
 #include
 #include
-#include
 #include
 #include
 #include
 #include
-
-/**********************************************************************
- **********************************************************************/
-void osm_mft_rcv_construct(IN osm_mft_rcv_t * const p_rcv)
-{
-	memset(p_rcv, 0, sizeof(*p_rcv));
-}
-
-/**********************************************************************
- **********************************************************************/
-void osm_mft_rcv_destroy(IN osm_mft_rcv_t * const p_rcv)
-{
-	CL_ASSERT(p_rcv);
-
-	OSM_LOG_ENTER(p_rcv->p_log, osm_mft_rcv_destroy);
-
-	OSM_LOG_EXIT(p_rcv->p_log);
-}
-
-/**********************************************************************
- **********************************************************************/
-ib_api_status_t
-osm_mft_rcv_init(IN osm_mft_rcv_t * const p_rcv,
-		 IN osm_subn_t * const p_subn,
-		 IN osm_log_t * const p_log, IN cl_plock_t * const p_lock)
-{
-	ib_api_status_t status = IB_SUCCESS;
-
-	OSM_LOG_ENTER(p_log, osm_mft_rcv_init);
-
-	osm_mft_rcv_construct(p_rcv);
-
-	p_rcv->p_log = p_log;
-	p_rcv->p_subn = p_subn;
-	p_rcv->p_lock = p_lock;
-
-	OSM_LOG_EXIT(p_rcv->p_log);
-	return (status);
-}
+#include
 
 /**********************************************************************
  **********************************************************************/
 void osm_mft_rcv_process(IN void *context, IN void *data)
 {
-	osm_mft_rcv_t *p_rcv = context;
+	osm_sm_t *sm = context;
 	osm_madw_t *p_madw = data;
 	ib_smp_t *p_smp;
 	uint32_t block_num;
@@ -114,9 +75,9 @@ void osm_mft_rcv_process(IN void *context, IN void *data)
 	ib_net64_t node_guid;
 	ib_api_status_t status;
 
-	CL_ASSERT(p_rcv);
+	CL_ASSERT(sm);
 
-	OSM_LOG_ENTER(p_rcv->p_log, osm_mft_rcv_process);
+	OSM_LOG_ENTER(sm->p_log, osm_mft_rcv_process);
 
 	CL_ASSERT(p_madw);
 
@@ -133,8 +94,8 @@ void osm_mft_rcv_process(IN void *context, IN void *data)
 	p_mft_context = osm_madw_get_mft_context_ptr(p_madw);
 	node_guid = p_mft_context->node_guid;
 
-	if (osm_log_is_active(p_rcv->p_log, OSM_LOG_DEBUG)) {
-		osm_log(p_rcv->p_log, OSM_LOG_DEBUG,
+	if (osm_log_is_active(sm->p_log, OSM_LOG_DEBUG)) {
+		osm_log(sm->p_log, OSM_LOG_DEBUG,
 			"osm_mft_rcv_process: "
 			"Setting MFT block %u, position %u, "
 			"Switch 0x%016" PRIx64 ", TID 0x%" PRIx64 "\n",
@@ -142,11 +103,11 @@ void osm_mft_rcv_process(IN void *context, IN void *data)
 			cl_ntoh64(p_smp->trans_id));
 	}
 
-	CL_PLOCK_EXCL_ACQUIRE(p_rcv->p_lock);
-	p_sw = osm_get_switch_by_guid(p_rcv->p_subn, node_guid);
+	CL_PLOCK_EXCL_ACQUIRE(sm->p_lock);
+	p_sw = osm_get_switch_by_guid(sm->p_subn, node_guid);
 
 	if (!p_sw) {
-		osm_log(p_rcv->p_log, OSM_LOG_ERROR,
+		osm_log(sm->p_log, OSM_LOG_ERROR,
 			"osm_mft_rcv_process: ERR 0801: "
 			"MFT received for nonexistent node "
 			"0x%016" PRIx64 "\n", cl_ntoh64(node_guid));
@@ -155,7 +116,7 @@ void osm_mft_rcv_process(IN void *context, IN void *data)
 						  (uint16_t) block_num,
 						  position);
 		if (status != IB_SUCCESS) {
-			osm_log(p_rcv->p_log, OSM_LOG_ERROR,
+			osm_log(sm->p_log, OSM_LOG_ERROR,
 				"osm_mft_rcv_process: ERR 0802: "
 				"Setting MFT block failed (%s)"
 				"\n\t\t\t\tSwitch 0x%016" PRIx64
@@ -165,6 +126,6 @@ void osm_mft_rcv_process(IN void *context, IN void *data)
 		}
 	}
 
-	CL_PLOCK_RELEASE(p_rcv->p_lock);
-	OSM_LOG_EXIT(p_rcv->p_log);
+	CL_PLOCK_RELEASE(sm->p_lock);
+	OSM_LOG_EXIT(sm->p_log);
 }
diff --git a/opensm/opensm/osm_node_desc_rcv.c b/opensm/opensm/osm_node_desc_rcv.c
index 788905b..6c9c8ea 100644
--- a/opensm/opensm/osm_node_desc_rcv.c
+++ b/opensm/opensm/osm_node_desc_rcv.c
@@ -54,7 +54,6 @@
 #include
 #include
 #include
-#include
 #include
 #include
 #include
@@ -64,21 +63,21 @@
 /**********************************************************************
  **********************************************************************/
 static void
-__osm_nd_rcv_process_nd(IN const osm_nd_rcv_t * const p_rcv,
+__osm_nd_rcv_process_nd(IN osm_sm_t * sm,
 			IN osm_node_t * const p_node,
 			IN const ib_node_desc_t * const p_nd)
 {
 	char *tmp_desc;
 	char print_desc[IB_NODE_DESCRIPTION_SIZE + 1];
 
-	OSM_LOG_ENTER(p_rcv->p_log, __osm_nd_rcv_process_nd);
+	OSM_LOG_ENTER(sm->p_log, __osm_nd_rcv_process_nd);
 
 	memcpy(&p_node->node_desc.description, p_nd, sizeof(*p_nd));
 
 	/* also set up a printable version */
 	memcpy(print_desc, p_nd, sizeof(*p_nd));
 	print_desc[IB_NODE_DESCRIPTION_SIZE] = '\0';
-	tmp_desc = remap_node_name(p_rcv->p_subn->p_osm->node_name_map,
+	tmp_desc = remap_node_name(sm->p_subn->p_osm->node_name_map,
 				   cl_ntoh64(osm_node_get_node_guid(p_node)),
 				   print_desc);
 
@@ -87,70 +86,31 @@ __osm_nd_rcv_process_nd(IN const osm_nd_rcv_t * const p_rcv,
 		free(p_node->print_desc);
 	p_node->print_desc = tmp_desc;
 
-	if (osm_log_is_active(p_rcv->p_log, OSM_LOG_VERBOSE)) {
-		osm_log(p_rcv->p_log, OSM_LOG_VERBOSE,
+	if (osm_log_is_active(sm->p_log, OSM_LOG_VERBOSE)) {
+		osm_log(sm->p_log, OSM_LOG_VERBOSE,
 			"__osm_nd_rcv_process_nd: "
 			"Node 0x%" PRIx64 "\n\t\t\t\tDescription = %s\n",
 			cl_ntoh64(osm_node_get_node_guid(p_node)),
 			p_node->print_desc);
 	}
 
-	OSM_LOG_EXIT(p_rcv->p_log);
-}
-
-/**********************************************************************
- **********************************************************************/
-void osm_nd_rcv_construct(IN osm_nd_rcv_t * const p_rcv)
-{
-	memset(p_rcv, 0, sizeof(*p_rcv));
-}
-
-/**********************************************************************
- **********************************************************************/
-void osm_nd_rcv_destroy(IN osm_nd_rcv_t * const p_rcv)
-{
-	CL_ASSERT(p_rcv);
-
-	OSM_LOG_ENTER(p_rcv->p_log, osm_nd_rcv_destroy);
-
-	OSM_LOG_EXIT(p_rcv->p_log);
-}
-
-/**********************************************************************
- **********************************************************************/
-ib_api_status_t
-osm_nd_rcv_init(IN osm_nd_rcv_t * const p_rcv,
-		IN osm_subn_t * const p_subn,
-		IN osm_log_t * const p_log, IN cl_plock_t * const p_lock)
-{
-	ib_api_status_t status = IB_SUCCESS;
-
-	OSM_LOG_ENTER(p_log, osm_nd_rcv_init);
-
-	osm_nd_rcv_construct(p_rcv);
-
-	p_rcv->p_log = p_log;
-	p_rcv->p_subn = p_subn;
-	p_rcv->p_lock = p_lock;
-
-	OSM_LOG_EXIT(p_rcv->p_log);
-	return (status);
+	OSM_LOG_EXIT(sm->p_log);
 }
 
 /**********************************************************************
  **********************************************************************/
 void osm_nd_rcv_process(IN void *context, IN void *data)
 {
-	osm_nd_rcv_t *p_rcv = context;
+	osm_sm_t *sm = context;
 	osm_madw_t *p_madw = data;
 	ib_node_desc_t *p_nd;
 	ib_smp_t *p_smp;
 	osm_node_t *p_node;
 	ib_net64_t node_guid;
 
-	CL_ASSERT(p_rcv);
+	CL_ASSERT(sm);
 
-	OSM_LOG_ENTER(p_rcv->p_log, osm_nd_rcv_process);
+	OSM_LOG_ENTER(sm->p_log, osm_nd_rcv_process);
 
 	CL_ASSERT(p_madw);
 
@@ -162,17 +122,17 @@ void osm_nd_rcv_process(IN void *context, IN void *data)
 	 */
 	node_guid = osm_madw_get_nd_context_ptr(p_madw)->node_guid;
 
-	CL_PLOCK_EXCL_ACQUIRE(p_rcv->p_lock);
-	p_node = osm_get_node_by_guid(p_rcv->p_subn, node_guid);
+	CL_PLOCK_EXCL_ACQUIRE(sm->p_lock);
+	p_node = osm_get_node_by_guid(sm->p_subn, node_guid);
 	if (!p_node) {
-		osm_log(p_rcv->p_log, OSM_LOG_ERROR,
+		osm_log(sm->p_log, OSM_LOG_ERROR,
 			"osm_nd_rcv_process: ERR 0B01: "
 			"NodeDescription received for nonexistent node "
 			"0x%" PRIx64 "\n", cl_ntoh64(node_guid));
 	} else {
-		__osm_nd_rcv_process_nd(p_rcv, p_node, p_nd);
+		__osm_nd_rcv_process_nd(sm, p_node, p_nd);
 	}
 
-	CL_PLOCK_RELEASE(p_rcv->p_lock);
-	OSM_LOG_EXIT(p_rcv->p_log);
+	CL_PLOCK_RELEASE(sm->p_lock);
+	OSM_LOG_EXIT(sm->p_log);
 }
diff --git a/opensm/opensm/osm_node_info_rcv.c b/opensm/opensm/osm_node_info_rcv.c
index 4571a0f..b84788a 100644
--- a/opensm/opensm/osm_node_info_rcv.c
+++ b/opensm/opensm/osm_node_info_rcv.c
@@ -55,7 +55,6 @@
 #include
 #include
 #include
-#include
 #include
 #include
 #include
@@ -68,7 +67,7 @@
 #include
 
 static void
-report_duplicated_guid(IN const osm_ni_rcv_t * const p_rcv,
+report_duplicated_guid(IN osm_sm_t * sm,
 		       osm_physp_t * p_physp,
 		       osm_node_t * p_neighbor_node, const uint8_t port_num)
 {
@@ -78,7 +77,7 @@ report_duplicated_guid(IN const osm_ni_rcv_t * const p_rcv,
 	p_old = p_physp->p_remote_physp;
 	p_new = osm_node_get_physp_ptr(p_neighbor_node, port_num);
 
-	osm_log(p_rcv->p_log, OSM_LOG_ERROR,
+	osm_log(sm->p_log, OSM_LOG_ERROR,
 		"report_duplicated_guid: ERR 0D01: "
 		"Found duplicated node.\n"
 		"Node 0x%" PRIx64 " port %u is reachable from remote node "
@@ -89,18 +88,18 @@ report_duplicated_guid(IN const osm_ni_rcv_t * const p_rcv,
 		cl_ntoh64(p_old->p_node->node_info.node_guid), p_old->port_num,
 		cl_ntoh64(p_new->p_node->node_info.node_guid), p_new->port_num);
 
-	osm_dump_dr_path(p_rcv->p_log, osm_physp_get_dr_path_ptr(p_physp),
+	osm_dump_dr_path(sm->p_log, osm_physp_get_dr_path_ptr(p_physp),
 			 OSM_LOG_ERROR);
 
 	path = *osm_physp_get_dr_path_ptr(p_new);
 	osm_dr_path_extend(&path, port_num);
-	osm_dump_dr_path(p_rcv->p_log, &path, OSM_LOG_ERROR);
+	osm_dump_dr_path(sm->p_log, &path, OSM_LOG_ERROR);
 
-	osm_log(p_rcv->p_log, OSM_LOG_SYS,
+	osm_log(sm->p_log, OSM_LOG_SYS,
 		"FATAL: duplicated guids or 12x lane reversal\n");
 }
 
-static void requery_dup_node_info(IN const osm_ni_rcv_t * const p_rcv,
+static void requery_dup_node_info(IN osm_sm_t * sm,
 				  osm_physp_t * p_physp, unsigned count)
 {
 	osm_madw_context_t context;
@@ -117,13 +116,13 @@ static void requery_dup_node_info(IN const osm_ni_rcv_t * const p_rcv,
 	context.ni_context.dup_port_num = p_physp->port_num;
 	context.ni_context.dup_count = count;
 
-	status = osm_req_get(p_rcv->p_gen_req,
+	status = osm_req_get(&sm->req,
 			     &path,
 			     IB_MAD_ATTR_NODE_INFO,
 			     0, CL_DISP_MSGID_NONE, &context);
 
 	if (status != IB_SUCCESS)
-		osm_log(p_rcv->p_log, OSM_LOG_ERROR,
+		osm_log(sm->p_log, OSM_LOG_ERROR,
 			"requery_dup_node_info: ERR 0D02: "
 			"Failure initiating NodeInfo request (%s)\n",
 			ib_get_err_str(status));
@@ -133,7 +132,7 @@ static void requery_dup_node_info(IN const osm_ni_rcv_t * const p_rcv,
  The plock must be held before calling this function.
 **********************************************************************/
 static void
-__osm_ni_rcv_set_links(IN const osm_ni_rcv_t * const p_rcv,
+__osm_ni_rcv_set_links(IN osm_sm_t * sm,
 		       osm_node_t * p_node,
 		       const uint8_t port_num,
 		       const osm_ni_context_t * const p_ni_context)
@@ -141,7 +140,7 @@ __osm_ni_rcv_set_links(IN const osm_ni_rcv_t * const p_rcv,
 	osm_node_t *p_neighbor_node;
 	osm_physp_t *p_physp;
 
-	OSM_LOG_ENTER(p_rcv->p_log, __osm_ni_rcv_set_links);
+	OSM_LOG_ENTER(sm->p_log, __osm_ni_rcv_set_links);
 
 	/*
 	   A special case exists in which the node we're trying to
@@ -149,17 +148,17 @@ __osm_ni_rcv_set_links(IN const osm_ni_rcv_t * const p_rcv,
 	   the ni_context will be zero.
 	 */
 	if (p_ni_context->node_guid == 0) {
-		osm_log(p_rcv->p_log, OSM_LOG_DEBUG,
+		osm_log(sm->p_log, OSM_LOG_DEBUG,
 			"__osm_ni_rcv_set_links: "
 			"Nothing to link for our own node 0x%" PRIx64 "\n",
 			cl_ntoh64(osm_node_get_node_guid(p_node)));
 		goto _exit;
 	}
 
-	p_neighbor_node = osm_get_node_by_guid(p_rcv->p_subn,
+	p_neighbor_node = osm_get_node_by_guid(sm->p_subn,
 					       p_ni_context->node_guid);
 	if (!p_neighbor_node) {
-		osm_log(p_rcv->p_log, OSM_LOG_ERROR,
+		osm_log(sm->p_log, OSM_LOG_ERROR,
 			"__osm_ni_rcv_set_links: ERR 0D10: "
 			"Unexpected removal of neighbor node "
 			"0x%" PRIx64 "\n", cl_ntoh64(p_ni_context->node_guid));
@@ -181,13 +180,13 @@ __osm_ni_rcv_set_links(IN const osm_ni_rcv_t * const p_rcv,
 
 	if (osm_node_link_exists(p_node, port_num,
 				 p_neighbor_node, p_ni_context->port_num)) {
-		osm_log(p_rcv->p_log, OSM_LOG_DEBUG,
+		osm_log(sm->p_log, OSM_LOG_DEBUG,
 			"__osm_ni_rcv_set_links: " "Link already exists\n");
 		goto _exit;
 	}
 
 	if (osm_node_has_any_link(p_node, port_num) &&
-	    p_rcv->p_subn->force_immediate_heavy_sweep == FALSE &&
+	    sm->p_subn->force_immediate_heavy_sweep == FALSE &&
 	    (!p_ni_context->dup_count ||
 	     (p_ni_context->dup_node_guid == osm_node_get_node_guid(p_node) &&
 	      p_ni_context->dup_port_num == port_num))) {
@@ -208,15 +207,15 @@ __osm_ni_rcv_set_links(IN const osm_ni_rcv_t * const p_rcv,
 		 */
 		p_physp = osm_node_get_physp_ptr(p_node, port_num);
 		if (p_ni_context->dup_count > 5) {
-			report_duplicated_guid(p_rcv, p_physp,
+			report_duplicated_guid(sm, p_physp,
 					       p_neighbor_node,
 					       p_ni_context->port_num);
-			p_rcv->p_subn->force_immediate_heavy_sweep = TRUE;
+			sm->p_subn->force_immediate_heavy_sweep = TRUE;
 		} else if (p_node->sw)
-			requery_dup_node_info(p_rcv, p_physp->p_remote_physp,
+			requery_dup_node_info(sm, p_physp->p_remote_physp,
 					      p_ni_context->dup_count + 1);
 		else
-			requery_dup_node_info(p_rcv, p_physp,
+			requery_dup_node_info(sm, p_physp,
 					      p_ni_context->dup_count + 1);
 	}
 
@@ -228,19 +227,19 @@ __osm_ni_rcv_set_links(IN const osm_ni_rcv_t * const p_rcv,
 	 */
 	if ((osm_node_get_node_guid(p_node) == p_ni_context->node_guid) &&
 	    (port_num == p_ni_context->port_num) &&
-	    port_num != 0 && cl_qmap_count(&p_rcv->p_subn->sw_guid_tbl) == 0) {
-		osm_log(p_rcv->p_log, OSM_LOG_VERBOSE,
+	    port_num != 0 && cl_qmap_count(&sm->p_subn->sw_guid_tbl) == 0) {
+		osm_log(sm->p_log, OSM_LOG_VERBOSE,
 			"__osm_ni_rcv_set_links: "
 			"Duplicate GUID found by link from a port to itself:"
 			"node 0x%" PRIx64 ", port number 0x%X\n",
 			cl_ntoh64(osm_node_get_node_guid(p_node)), port_num);
 		p_physp = osm_node_get_physp_ptr(p_node, port_num);
-		osm_dump_dr_path(p_rcv->p_log,
+		osm_dump_dr_path(sm->p_log,
 				 osm_physp_get_dr_path_ptr(p_physp),
 				 OSM_LOG_VERBOSE);
 
-		if (p_rcv->p_subn->opt.exit_on_fatal == TRUE) {
-			osm_log(p_rcv->p_log, OSM_LOG_SYS,
+		if (sm->p_subn->opt.exit_on_fatal == TRUE) {
+			osm_log(sm->p_log, OSM_LOG_SYS,
 				"Errors on subnet. Duplicate GUID found "
 				"by link from a port to itself. "
 				"See verbose opensm.log for more details\n");
@@ -248,8 +247,8 @@ __osm_ni_rcv_set_links(IN const osm_ni_rcv_t * const p_rcv,
 		}
 	}
 
-	if (osm_log_is_active(p_rcv->p_log, OSM_LOG_DEBUG))
-		osm_log(p_rcv->p_log, OSM_LOG_DEBUG,
+	if (osm_log_is_active(sm->p_log, OSM_LOG_DEBUG))
+		osm_log(sm->p_log, OSM_LOG_DEBUG,
 			"__osm_ni_rcv_set_links: "
 			"Creating new link between: "
 			"\n\t\t\t\tnode 0x%" PRIx64 ", "
@@ -265,14 +264,14 @@ __osm_ni_rcv_set_links(IN const osm_ni_rcv_t * const p_rcv,
 		      p_ni_context->port_num);
 
 _exit:
-	OSM_LOG_EXIT(p_rcv->p_log);
+	OSM_LOG_EXIT(sm->p_log);
 }
 
 /**********************************************************************
  The plock must be held before calling this function.
 **********************************************************************/
 static void
-__osm_ni_rcv_process_new_node(IN const osm_ni_rcv_t * const p_rcv,
+__osm_ni_rcv_process_new_node(IN osm_sm_t * sm,
 			      IN osm_node_t * const p_node,
 			      IN const osm_madw_t * const p_madw)
 {
@@ -283,7 +282,7 @@ __osm_ni_rcv_process_new_node(IN const osm_ni_rcv_t * const p_rcv,
 	ib_smp_t *p_smp;
 	uint8_t port_num;
 
-	OSM_LOG_ENTER(p_rcv->p_log, __osm_ni_rcv_process_new_node);
+	OSM_LOG_ENTER(sm->p_log, __osm_ni_rcv_process_new_node);
 
 	CL_ASSERT(p_node);
 	CL_ASSERT(p_madw);
@@ -314,24 +313,24 @@ __osm_ni_rcv_process_new_node(IN const osm_ni_rcv_t * const p_rcv,
 	context.pi_context.light_sweep = FALSE;
 	context.pi_context.active_transition = FALSE;
 
-	status = osm_req_get(p_rcv->p_gen_req,
+	status = osm_req_get(&sm->req,
 			     osm_physp_get_dr_path_ptr(p_physp),
 			     IB_MAD_ATTR_PORT_INFO,
 			     cl_hton32(port_num), CL_DISP_MSGID_NONE, &context);
 	if (status != IB_SUCCESS)
-		osm_log(p_rcv->p_log, OSM_LOG_ERROR,
+		osm_log(sm->p_log, OSM_LOG_ERROR,
 			"__osm_ni_rcv_process_new_node: ERR 0D02: "
 			"Failure initiating PortInfo request (%s)\n",
 			ib_get_err_str(status));
 
-	OSM_LOG_EXIT(p_rcv->p_log);
+	OSM_LOG_EXIT(sm->p_log);
 }
 
 /**********************************************************************
  The plock must be held before calling this function.
 **********************************************************************/
 static void
-__osm_ni_rcv_get_node_desc(IN const osm_ni_rcv_t * const p_rcv,
+__osm_ni_rcv_get_node_desc(IN osm_sm_t * sm,
 			   IN osm_node_t * const p_node,
 			   IN const osm_madw_t * const p_madw)
 {
@@ -342,7 +341,7 @@ __osm_ni_rcv_get_node_desc(IN const osm_ni_rcv_t * const p_rcv,
 	ib_smp_t *p_smp;
 	uint8_t port_num;
 
-	OSM_LOG_ENTER(p_rcv->p_log, __osm_ni_rcv_get_node_desc);
+	OSM_LOG_ENTER(sm->p_log, __osm_ni_rcv_get_node_desc);
 
 	CL_ASSERT(p_node);
 	CL_ASSERT(p_madw);
@@ -368,30 +367,30 @@ __osm_ni_rcv_get_node_desc(IN const osm_ni_rcv_t * const p_rcv,
 
 	context.nd_context.node_guid = osm_node_get_node_guid(p_node);
 
-	status = osm_req_get(p_rcv->p_gen_req,
+	status = osm_req_get(&sm->req,
 			     osm_physp_get_dr_path_ptr(p_physp),
 			     IB_MAD_ATTR_NODE_DESC,
 			     0, CL_DISP_MSGID_NONE, &context);
 	if (status != IB_SUCCESS)
-		osm_log(p_rcv->p_log, OSM_LOG_ERROR,
+		osm_log(sm->p_log, OSM_LOG_ERROR,
 			"__osm_ni_rcv_get_node_desc: ERR 0D03: "
 			"Failure initiating NodeDescription request (%s)\n",
 			ib_get_err_str(status));
 
-	OSM_LOG_EXIT(p_rcv->p_log);
+	OSM_LOG_EXIT(sm->p_log);
 }
 
 /**********************************************************************
  The plock must be held before calling this function.
 **********************************************************************/
 static void
-__osm_ni_rcv_process_new_ca_or_router(IN const osm_ni_rcv_t * const p_rcv,
+__osm_ni_rcv_process_new_ca_or_router(IN osm_sm_t * sm,
 				      IN osm_node_t * const p_node,
 				      IN const osm_madw_t * const p_madw)
 {
-	OSM_LOG_ENTER(p_rcv->p_log, __osm_ni_rcv_process_new_ca_or_router);
+	OSM_LOG_ENTER(sm->p_log, __osm_ni_rcv_process_new_ca_or_router);
 
-	__osm_ni_rcv_process_new_node(p_rcv, p_node, p_madw);
+	__osm_ni_rcv_process_new_node(sm, p_node, p_madw);
 
 	/*
 	   A node guid of 0 is the corner case that indicates
@@ -399,16 +398,16 @@ __osm_ni_rcv_process_new_ca_or_router(IN const osm_ni_rcv_t * const p_rcv,
 	   object with the SM's own port guid.
 	 */
 	if (osm_madw_get_ni_context_ptr(p_madw)->node_guid == 0)
-		p_rcv->p_subn->sm_port_guid = p_node->node_info.port_guid;
+		sm->p_subn->sm_port_guid = p_node->node_info.port_guid;
 
-	OSM_LOG_EXIT(p_rcv->p_log);
+	OSM_LOG_EXIT(sm->p_log);
 }
 
 /**********************************************************************
  The plock must be held before calling this function.
 **********************************************************************/
 static void
-__osm_ni_rcv_process_existing_ca_or_router(IN const osm_ni_rcv_t * const p_rcv,
+__osm_ni_rcv_process_existing_ca_or_router(IN osm_sm_t * sm,
 					   IN osm_node_t * const p_node,
 					   IN const osm_madw_t * const p_madw)
 {
@@ -423,7 +422,7 @@ __osm_ni_rcv_process_existing_ca_or_router(IN const osm_ni_rcv_t * const p_rcv,
 	osm_dr_path_t *p_dr_path;
 	osm_bind_handle_t h_bind;
 
-	OSM_LOG_ENTER(p_rcv->p_log, __osm_ni_rcv_process_existing_ca_or_router);
+	OSM_LOG_ENTER(sm->p_log, __osm_ni_rcv_process_existing_ca_or_router);
 
 	p_smp = osm_madw_get_smp_ptr(p_madw);
 	p_ni = (ib_node_info_t *) ib_smp_get_payload_ptr(p_smp);
@@ -435,9 +434,9 @@ __osm_ni_rcv_process_existing_ca_or_router(IN const osm_ni_rcv_t * const p_rcv,
 	   previously undiscovered port. If so, build the new
 	   port object.
 	 */
-	p_port = osm_get_port_by_guid(p_rcv->p_subn, p_ni->port_guid);
+	p_port = osm_get_port_by_guid(sm->p_subn, p_ni->port_guid);
 	if (!p_port) {
-		osm_log(p_rcv->p_log, OSM_LOG_VERBOSE,
+		osm_log(sm->p_log, OSM_LOG_VERBOSE,
 			"__osm_ni_rcv_process_existing_ca_or_router: "
 			"Creating new port object with GUID 0x%" PRIx64 "\n",
 			cl_ntoh64(p_ni->port_guid));
@@ -446,7 +445,7 @@ __osm_ni_rcv_process_existing_ca_or_router(IN const osm_ni_rcv_t * const p_rcv,
 
 		p_port = osm_port_new(p_ni, p_node);
 		if (p_port == NULL) {
-			osm_log(p_rcv->p_log, OSM_LOG_ERROR,
+			osm_log(sm->p_log, OSM_LOG_ERROR,
 				"__osm_ni_rcv_process_existing_ca_or_router: ERR 0D04: "
 				"Unable to create new port object\n");
 			goto Exit;
@@ -456,7 +455,7 @@ __osm_ni_rcv_process_existing_ca_or_router(IN const osm_ni_rcv_t * const p_rcv,
 		   Add the new port object to the database.
 		 */
 		p_port_check =
-		    (osm_port_t *) cl_qmap_insert(&p_rcv->p_subn->port_guid_tbl,
+		    (osm_port_t *) cl_qmap_insert(&sm->p_subn->port_guid_tbl,
 						  p_ni->port_guid,
 						  &p_port->map_item);
 		if (p_port_check != p_port) {
@@ -464,7 +463,7 @@ __osm_ni_rcv_process_existing_ca_or_router(IN const osm_ni_rcv_t * const p_rcv,
 			   We should never be here!
 			   Somehow, this port GUID already exists in the table.
 			 */
-			osm_log(p_rcv->p_log, OSM_LOG_ERROR,
+			osm_log(sm->p_log, OSM_LOG_ERROR,
 				"__osm_ni_rcv_process_existing_ca_or_router: ERR 0D12: "
 				"Port 0x%" PRIx64 " already in the database!\n",
 				cl_ntoh64(p_ni->port_guid));
@@ -480,7 +479,7 @@ __osm_ni_rcv_process_existing_ca_or_router(IN const osm_ni_rcv_t * const p_rcv,
 		   then these ports may be new to us, but are not new on the subnet.
 		   If we are master, then the subnet as we know it is the updated one,
 		   and any new ports we encounter should cause trap 64. C14-72.1.1 */
-		if (p_rcv->p_subn->sm_state == IB_SMINFO_STATE_MASTER)
+		if (sm->p_subn->sm_state == IB_SMINFO_STATE_MASTER)
 			p_port->is_new = 1;
 
 		p_physp = osm_node_get_physp_ptr(p_node, port_num);
@@ -488,7 +487,7 @@ __osm_ni_rcv_process_existing_ca_or_router(IN const osm_ni_rcv_t * const p_rcv,
 		p_physp = osm_node_get_physp_ptr(p_node, port_num);
 
 		if (!osm_physp_is_valid(p_physp)) {
-			osm_log(p_rcv->p_log, OSM_LOG_ERROR,
+			osm_log(sm->p_log, OSM_LOG_ERROR,
 				"__osm_ni_rcv_process_existing_ca_or_router: ERR 0D19: "
 				"Invalid physical port. Aborting discovery\n");
 			goto Exit;
@@ -510,25 +509,25 @@ __osm_ni_rcv_process_existing_ca_or_router(IN const osm_ni_rcv_t * const p_rcv,
 	context.pi_context.update_master_sm_base_lid = FALSE;
 	context.pi_context.light_sweep = FALSE;
 
-	status = osm_req_get(p_rcv->p_gen_req,
+	status = osm_req_get(&sm->req,
 			     osm_physp_get_dr_path_ptr(p_physp),
 			     IB_MAD_ATTR_PORT_INFO,
 			     cl_hton32(port_num), CL_DISP_MSGID_NONE, &context);
 
 	if (status != IB_SUCCESS)
-		osm_log(p_rcv->p_log, OSM_LOG_ERROR,
+		osm_log(sm->p_log, OSM_LOG_ERROR,
 			"__osm_ni_rcv_process_existing_ca_or_router: ERR 0D13: "
 			"Failure initiating PortInfo request (%s)\n",
 			ib_get_err_str(status));
 
 Exit:
-	OSM_LOG_EXIT(p_rcv->p_log);
+	OSM_LOG_EXIT(sm->p_log);
 }
 
 /**********************************************************************
  **********************************************************************/
 static void
-__osm_ni_rcv_process_switch(IN const osm_ni_rcv_t * const p_rcv,
+__osm_ni_rcv_process_switch(IN osm_sm_t * sm,
 			    IN osm_node_t * const p_node,
 			    IN const osm_madw_t * const p_madw)
 {
@@ -537,7 +536,7 @@ __osm_ni_rcv_process_switch(IN const osm_ni_rcv_t * const p_rcv,
 	osm_dr_path_t dr_path;
 	ib_smp_t *p_smp;
 
-	OSM_LOG_ENTER(p_rcv->p_log, __osm_ni_rcv_process_switch);
+	OSM_LOG_ENTER(sm->p_log, __osm_ni_rcv_process_switch);
 
 	CL_ASSERT(p_node);
 	CL_ASSERT(p_madw);
@@ -553,29 +552,29 @@ __osm_ni_rcv_process_switch(IN const osm_ni_rcv_t * const p_rcv,
 	context.si_context.light_sweep = FALSE;
 
 	/* Request a SwitchInfo attribute */
-	status = osm_req_get(p_rcv->p_gen_req,
+	status = osm_req_get(&sm->req,
 			     &dr_path, IB_MAD_ATTR_SWITCH_INFO,
 			     0, CL_DISP_MSGID_NONE, &context);
 	if (status != IB_SUCCESS)
 		/* continue despite error */
-		osm_log(p_rcv->p_log, OSM_LOG_ERROR,
+		osm_log(sm->p_log, OSM_LOG_ERROR,
 			"__osm_ni_rcv_process_switch: ERR 0D06: "
 			"Failure initiating SwitchInfo request (%s)\n",
 			ib_get_err_str(status));
 
-	OSM_LOG_EXIT(p_rcv->p_log);
+	OSM_LOG_EXIT(sm->p_log);
 }
 
 /**********************************************************************
  The plock must be held before calling this function.
 **********************************************************************/
 static void
-__osm_ni_rcv_process_existing_switch(IN const osm_ni_rcv_t * const p_rcv,
+__osm_ni_rcv_process_existing_switch(IN osm_sm_t * sm,
 				     IN osm_node_t * const p_node,
 				     IN const osm_madw_t * const p_madw)
 {
-	OSM_LOG_ENTER(p_rcv->p_log, __osm_ni_rcv_process_existing_switch);
+	OSM_LOG_ENTER(sm->p_log, __osm_ni_rcv_process_existing_switch);
 
 	/*
 	   If this switch has already been probed during this sweep,
@@ -586,30 +585,30 @@ __osm_ni_rcv_process_existing_switch(IN const osm_ni_rcv_t * const p_rcv,
 	   to retry to probe the switch.
 	 */
 	if (p_node->discovery_count == 1)
-		__osm_ni_rcv_process_switch(p_rcv, p_node, p_madw);
+		__osm_ni_rcv_process_switch(sm, p_node, p_madw);
 	else if (!p_node->sw || p_node->sw->discovery_count == 0) {
 		/* we don't have the SwitchInfo - retry to get it */
-		osm_log(p_rcv->p_log, OSM_LOG_DEBUG,
+		osm_log(sm->p_log, OSM_LOG_DEBUG,
 			"__osm_ni_rcv_process_existing_switch: "
 			"Retry to get SwitchInfo on node GUID:0x%" PRIx64 "\n",
 			cl_ntoh64(osm_node_get_node_guid(p_node)));
-		__osm_ni_rcv_process_switch(p_rcv, p_node, p_madw);
+		__osm_ni_rcv_process_switch(sm, p_node, p_madw);
 	}
 
-	OSM_LOG_EXIT(p_rcv->p_log);
+	OSM_LOG_EXIT(sm->p_log);
 }
 
 /**********************************************************************
  The plock must be held before calling this function.
 **********************************************************************/
 static void
-__osm_ni_rcv_process_new_switch(IN const osm_ni_rcv_t * const p_rcv,
+__osm_ni_rcv_process_new_switch(IN osm_sm_t * sm,
 				IN osm_node_t * const p_node,
 				IN const osm_madw_t * const p_madw)
 {
-	OSM_LOG_ENTER(p_rcv->p_log, __osm_ni_rcv_process_new_switch);
+	OSM_LOG_ENTER(sm->p_log, __osm_ni_rcv_process_new_switch);
 
-	__osm_ni_rcv_process_switch(p_rcv, p_node, p_madw);
+	__osm_ni_rcv_process_switch(sm, p_node, p_madw);
 
 	/*
 	   A node guid of 0 is the corner case that indicates
@@ -617,16 +616,16 @@ __osm_ni_rcv_process_new_switch(IN const osm_ni_rcv_t * const p_rcv,
 	   object with the SM's own port guid.
 	 */
 	if (osm_madw_get_ni_context_ptr(p_madw)->node_guid == 0)
-		p_rcv->p_subn->sm_port_guid = p_node->node_info.port_guid;
+		sm->p_subn->sm_port_guid = p_node->node_info.port_guid;
 
-	OSM_LOG_EXIT(p_rcv->p_log);
+	OSM_LOG_EXIT(sm->p_log);
 }
 
 /**********************************************************************
  The plock must NOT be held before calling this function.
 **********************************************************************/
 static void
-__osm_ni_rcv_process_new(IN const osm_ni_rcv_t * const p_rcv,
+__osm_ni_rcv_process_new(IN osm_sm_t * sm,
 			 IN const osm_madw_t * const p_madw)
 {
 	osm_node_t *p_node;
@@ -641,16 +640,16 @@ __osm_ni_rcv_process_new(IN const osm_ni_rcv_t * const p_rcv,
 	osm_ni_context_t *p_ni_context;
 	uint8_t port_num;
 
-	OSM_LOG_ENTER(p_rcv->p_log, __osm_ni_rcv_process_new);
+	OSM_LOG_ENTER(sm->p_log, __osm_ni_rcv_process_new);
 
 	p_smp = osm_madw_get_smp_ptr(p_madw);
 	p_ni = (ib_node_info_t *) ib_smp_get_payload_ptr(p_smp);
 	p_ni_context = osm_madw_get_ni_context_ptr(p_madw);
 	port_num = ib_node_info_get_local_port_num(p_ni);
 
-	osm_dump_smp_dr_path(p_rcv->p_log, p_smp, OSM_LOG_VERBOSE);
+	osm_dump_smp_dr_path(sm->p_log, p_smp, OSM_LOG_VERBOSE);
 
-	osm_log(p_rcv->p_log, OSM_LOG_VERBOSE,
+	osm_log(sm->p_log, OSM_LOG_VERBOSE,
 		"__osm_ni_rcv_process_new: "
 		"Discovered new %s node,"
 		"\n\t\t\t\tGUID 0x%" PRIx64 ", TID 0x%" PRIx64 "\n",
@@ -659,7 +658,7 @@ __osm_ni_rcv_process_new(IN const osm_ni_rcv_t * const p_rcv,
 
 	p_node = osm_node_new(p_madw);
 	if (p_node == NULL) {
-		osm_log(p_rcv->p_log, OSM_LOG_ERROR,
+		osm_log(sm->p_log, OSM_LOG_ERROR,
 			"__osm_ni_rcv_process_new: ERR 0D07: "
 			"Unable to create new node object\n");
 		goto Exit;
@@ -671,7 +670,7 @@ __osm_ni_rcv_process_new(IN const osm_ni_rcv_t * const p_rcv,
 	 */
 	p_port = osm_port_new(p_ni, p_node);
 	if (p_port == NULL) {
-		osm_log(p_rcv->p_log, OSM_LOG_ERROR,
+		osm_log(sm->p_log, OSM_LOG_ERROR,
 			"__osm_ni_rcv_process_new: ERR 0D14: "
 			"Unable to create new port object\n");
 		osm_node_delete(&p_node);
@@ -682,22 +681,22 @@ __osm_ni_rcv_process_new(IN const osm_ni_rcv_t * const p_rcv,
 	   Add the new port object to the database.
 	 */
 	p_port_check =
-	    (osm_port_t *) cl_qmap_insert(&p_rcv->p_subn->port_guid_tbl,
+	    (osm_port_t *) cl_qmap_insert(&sm->p_subn->port_guid_tbl,
 					  p_ni->port_guid, &p_port->map_item);
 	if (p_port_check != p_port) {
 		/*
 		   We should never be here!
 		   Somehow, this port GUID already exists in the table.
 		 */
-		osm_log(p_rcv->p_log, OSM_LOG_ERROR,
+		osm_log(sm->p_log, OSM_LOG_ERROR,
 			"__osm_ni_rcv_process_new: ERR 0D15: "
 			"Duplicate Port GUID 0x%" PRIx64
 			"! Found by the two directed routes:\n",
 			cl_ntoh64(p_ni->port_guid));
-		osm_dump_dr_path(p_rcv->p_log,
+		osm_dump_dr_path(sm->p_log,
 				 osm_physp_get_dr_path_ptr(p_port->p_physp),
 				 OSM_LOG_ERROR);
-		osm_dump_dr_path(p_rcv->p_log,
+		osm_dump_dr_path(sm->p_log,
 				 osm_physp_get_dr_path_ptr(p_port_check->
 							   p_physp),
 				 OSM_LOG_ERROR);
@@ -713,24 +712,24 @@ __osm_ni_rcv_process_new(IN const osm_ni_rcv_t * const p_rcv,
 	   then these ports may be new to us, but are not new on the subnet.
 	   If we are master, then the subnet as we know it is the updated one,
 	   and any new ports we encounter should cause trap 64. C14-72.1.1 */
-	if (p_rcv->p_subn->sm_state == IB_SMINFO_STATE_MASTER)
+	if (sm->p_subn->sm_state == IB_SMINFO_STATE_MASTER)
 		p_port->is_new = 1;
 
 	/* If there were RouterInfo or other router attribute,
 	   this would be elsewhere */
 	if (p_ni->node_type == IB_NODE_TYPE_ROUTER) {
 		if ((p_rtr = osm_router_new(p_port)) == NULL)
-			osm_log(p_rcv->p_log, OSM_LOG_ERROR,
+			osm_log(sm->p_log, OSM_LOG_ERROR,
 				"__osm_ni_rcv_process_new: ERR 0D1A: "
 				"Unable to create new router object\n");
 		else {
-			p_rtr_guid_tbl = &p_rcv->p_subn->rtr_guid_tbl;
+			p_rtr_guid_tbl = &sm->p_subn->rtr_guid_tbl;
 			p_rtr_check =
 			    (osm_router_t *) cl_qmap_insert(p_rtr_guid_tbl,
 							    p_ni->port_guid,
 							    &p_rtr->map_item);
 			if (p_rtr_check != p_rtr)
-				osm_log(p_rcv->p_log, OSM_LOG_ERROR,
+				osm_log(sm->p_log, OSM_LOG_ERROR,
 					"__osm_ni_rcv_process_new: ERR 0D1B: "
 					"Unable to add port GUID:0x%016" PRIx64
 					" to router table\n",
@@ -739,7 +738,7 @@ __osm_ni_rcv_process_new(IN const osm_ni_rcv_t * const p_rcv,
 	}
 
 	p_node_check =
-	    (osm_node_t *) cl_qmap_insert(&p_rcv->p_subn->node_guid_tbl,
+	    (osm_node_t *) cl_qmap_insert(&sm->p_subn->node_guid_tbl,
 					  p_ni->node_guid, &p_node->map_item);
 	if (p_node_check != p_node) {
 		/*
@@ -748,30 +747,30 @@ __osm_ni_rcv_process_new(IN const osm_ni_rcv_t * const p_rcv,
 		   We can simply clean-up, since the other thread will
 		   see this processing through to completion.
 		 */
-		osm_log(p_rcv->p_log, OSM_LOG_VERBOSE,
+		osm_log(sm->p_log, OSM_LOG_VERBOSE,
 			"__osm_ni_rcv_process_new: "
 			"Discovery race detected at node 0x%" PRIx64 "\n",
 			cl_ntoh64(p_ni->node_guid));
 		osm_node_delete(&p_node);
 		p_node = p_node_check;
-		__osm_ni_rcv_set_links(p_rcv, p_node, port_num, p_ni_context);
+		__osm_ni_rcv_set_links(sm, p_node, port_num, p_ni_context);
 		goto Exit;
 	} else
-		__osm_ni_rcv_set_links(p_rcv, p_node, port_num, p_ni_context);
+		__osm_ni_rcv_set_links(sm, p_node, port_num, p_ni_context);
 
 	p_node->discovery_count++;
-	__osm_ni_rcv_get_node_desc(p_rcv, p_node, p_madw);
+	__osm_ni_rcv_get_node_desc(sm, p_node, p_madw);
 
 	switch (p_ni->node_type) {
 	case IB_NODE_TYPE_CA:
 	case IB_NODE_TYPE_ROUTER:
-		__osm_ni_rcv_process_new_ca_or_router(p_rcv, p_node, p_madw);
+		__osm_ni_rcv_process_new_ca_or_router(sm, p_node, p_madw);
 		break;
 	case IB_NODE_TYPE_SWITCH:
-		__osm_ni_rcv_process_new_switch(p_rcv, p_node, p_madw);
+		__osm_ni_rcv_process_new_switch(sm, p_node, p_madw);
 		break;
 	default:
-		osm_log(p_rcv->p_log, OSM_LOG_ERROR,
+		osm_log(sm->p_log, OSM_LOG_ERROR,
 			"__osm_ni_rcv_process_new: ERR 0D16: "
 			"Unknown node type %u with GUID 0x%" PRIx64 "\n",
 			p_ni->node_type, cl_ntoh64(p_ni->node_guid));
@@ -779,14 +778,14 @@ __osm_ni_rcv_process_new(IN const osm_ni_rcv_t * const p_rcv,
 	}
 
 Exit:
-	OSM_LOG_EXIT(p_rcv->p_log);
+	OSM_LOG_EXIT(sm->p_log);
 }
 
 /**********************************************************************
  The plock must be held before calling this function.
 **********************************************************************/
 static void
-__osm_ni_rcv_process_existing(IN const osm_ni_rcv_t * const p_rcv,
+__osm_ni_rcv_process_existing(IN osm_sm_t * sm,
 			      IN osm_node_t * const p_node,
 			      IN const osm_madw_t * const p_madw)
 {
@@ -795,15 +794,15 @@ __osm_ni_rcv_process_existing(IN const osm_ni_rcv_t * const p_rcv,
 	osm_ni_context_t *p_ni_context;
 	uint8_t port_num;
 
-	OSM_LOG_ENTER(p_rcv->p_log, __osm_ni_rcv_process_existing);
+	OSM_LOG_ENTER(sm->p_log, __osm_ni_rcv_process_existing);
 
 	p_smp = osm_madw_get_smp_ptr(p_madw);
 	p_ni = (ib_node_info_t *) ib_smp_get_payload_ptr(p_smp);
 	p_ni_context = osm_madw_get_ni_context_ptr(p_madw);
 	port_num = ib_node_info_get_local_port_num(p_ni);
 
-	if (osm_log_is_active(p_rcv->p_log, OSM_LOG_VERBOSE))
-		osm_log(p_rcv->p_log, OSM_LOG_VERBOSE,
+	if (osm_log_is_active(sm->p_log, OSM_LOG_VERBOSE))
+		osm_log(sm->p_log, OSM_LOG_VERBOSE,
 			"__osm_ni_rcv_process_existing: "
 			"Rediscovered %s node 0x%" PRIx64
 			" TID 0x%" PRIx64 ", discovered %u times already\n",
@@ -820,82 +819,41 @@ __osm_ni_rcv_process_existing(IN const osm_ni_rcv_t * const p_rcv,
 	switch (p_ni->node_type) {
 	case IB_NODE_TYPE_CA:
 	case IB_NODE_TYPE_ROUTER:
-		__osm_ni_rcv_process_existing_ca_or_router(p_rcv, p_node,
+		__osm_ni_rcv_process_existing_ca_or_router(sm, p_node,
 							   p_madw);
 		break;
 
 	case IB_NODE_TYPE_SWITCH:
-		__osm_ni_rcv_process_existing_switch(p_rcv, p_node, p_madw);
+		__osm_ni_rcv_process_existing_switch(sm, p_node, p_madw);
 		break;
 
 	default:
-		osm_log(p_rcv->p_log, OSM_LOG_ERROR,
+		osm_log(sm->p_log, OSM_LOG_ERROR,
 			"__osm_ni_rcv_process_existing: ERR 0D09: "
 			"Unknown node type %u with GUID 0x%" PRIx64 "\n",
 			p_ni->node_type, cl_ntoh64(p_ni->node_guid));
 		break;
 	}
 
-	__osm_ni_rcv_set_links(p_rcv, p_node, port_num, p_ni_context);
+	__osm_ni_rcv_set_links(sm, p_node, port_num, p_ni_context);
 
-	OSM_LOG_EXIT(p_rcv->p_log);
-}
-
-/**********************************************************************
- **********************************************************************/
-void osm_ni_rcv_construct(IN osm_ni_rcv_t * const p_rcv)
-{
-	memset(p_rcv, 0, sizeof(*p_rcv));
-}
-
-/**********************************************************************
- **********************************************************************/
-void osm_ni_rcv_destroy(IN osm_ni_rcv_t * const p_rcv)
-{
-	CL_ASSERT(p_rcv);
-
-	OSM_LOG_ENTER(p_rcv->p_log, osm_ni_rcv_destroy);
-
-	OSM_LOG_EXIT(p_rcv->p_log);
-}
-
-/**********************************************************************
- **********************************************************************/
-ib_api_status_t
-osm_ni_rcv_init(IN osm_ni_rcv_t * const p_rcv,
-		IN osm_req_t * const p_req,
-		IN osm_subn_t * const p_subn,
-		IN osm_log_t * const p_log, IN cl_plock_t * const p_lock)
-{
-	ib_api_status_t status = IB_SUCCESS;
-
-	OSM_LOG_ENTER(p_log, osm_ni_rcv_init);
-
-	osm_ni_rcv_construct(p_rcv);
-
-	p_rcv->p_log = p_log;
-	p_rcv->p_subn = p_subn;
-	p_rcv->p_lock = p_lock;
-	p_rcv->p_gen_req = p_req;
-
-	OSM_LOG_EXIT(p_rcv->p_log);
-	return (status);
+	OSM_LOG_EXIT(sm->p_log);
 }
 
 /**********************************************************************
  **********************************************************************/
 void osm_ni_rcv_process(IN void *context, IN void *data)
 {
-	osm_ni_rcv_t *p_rcv = context;
+	osm_sm_t *sm = context;
 	osm_madw_t *p_madw = data;
 	ib_node_info_t *p_ni;
 	ib_smp_t *p_smp;
 	osm_node_t *p_node;
 	boolean_t process_new_flag = FALSE;
 
-	CL_ASSERT(p_rcv);
+	CL_ASSERT(sm);
 
-	OSM_LOG_ENTER(p_rcv->p_log, osm_ni_rcv_process);
+	OSM_LOG_ENTER(sm->p_log, osm_ni_rcv_process);
 
 	CL_ASSERT(p_madw);
 
@@ -905,18 +863,18 @@ void osm_ni_rcv_process(IN void *context, IN void *data)
 	CL_ASSERT(p_smp->attr_id == IB_MAD_ATTR_NODE_INFO);
 
 	if (p_ni->node_guid == 0) {
-		osm_log(p_rcv->p_log, OSM_LOG_ERROR,
+		osm_log(sm->p_log, OSM_LOG_ERROR,
 			"osm_ni_rcv_process: ERR 0D16: "
 			"Got Zero Node GUID!
Found on the directed route:\n"); - osm_dump_smp_dr_path(p_rcv->p_log, p_smp, OSM_LOG_ERROR); + osm_dump_smp_dr_path(sm->p_log, p_smp, OSM_LOG_ERROR); goto Exit; } if (p_ni->port_guid == 0) { - osm_log(p_rcv->p_log, OSM_LOG_ERROR, + osm_log(sm->p_log, OSM_LOG_ERROR, "osm_ni_rcv_process: ERR 0D17: " "Got Zero Port GUID! Found on the directed route:\n"); - osm_dump_smp_dr_path(p_rcv->p_log, p_smp, OSM_LOG_ERROR); + osm_dump_smp_dr_path(sm->p_log, p_smp, OSM_LOG_ERROR); goto Exit; } @@ -926,27 +884,27 @@ void osm_ni_rcv_process(IN void *context, IN void *data) During processing of this node, hold the shared lock. */ - CL_PLOCK_EXCL_ACQUIRE(p_rcv->p_lock); - p_node = osm_get_node_by_guid(p_rcv->p_subn, p_ni->node_guid); + CL_PLOCK_EXCL_ACQUIRE(sm->p_lock); + p_node = osm_get_node_by_guid(sm->p_subn, p_ni->node_guid); - osm_dump_node_info(p_rcv->p_log, p_ni, OSM_LOG_DEBUG); + osm_dump_node_info(sm->p_log, p_ni, OSM_LOG_DEBUG); if (!p_node) { - __osm_ni_rcv_process_new(p_rcv, p_madw); + __osm_ni_rcv_process_new(sm, p_madw); process_new_flag = TRUE; } else - __osm_ni_rcv_process_existing(p_rcv, p_node, p_madw); + __osm_ni_rcv_process_existing(sm, p_node, p_madw); - CL_PLOCK_RELEASE(p_rcv->p_lock); + CL_PLOCK_RELEASE(sm->p_lock); /* * If we processed a new node - need to signal to the SM that * change detected. 
*/ if (process_new_flag) - osm_sm_signal(&p_rcv->p_subn->p_osm->sm, + osm_sm_signal(&sm->p_subn->p_osm->sm, OSM_SIGNAL_CHANGE_DETECTED); Exit: - OSM_LOG_EXIT(p_rcv->p_log); + OSM_LOG_EXIT(sm->p_log); } diff --git a/opensm/opensm/osm_pkey_rcv.c b/opensm/opensm/osm_pkey_rcv.c index d045fa2..c510ab5 100644 --- a/opensm/opensm/osm_pkey_rcv.c +++ b/opensm/opensm/osm_pkey_rcv.c @@ -39,61 +39,14 @@ #include #include -#include #include #include -#include -#include -#include #include #include #include #include -#include -#include #include -#include - -/********************************************************************** - **********************************************************************/ -void osm_pkey_rcv_construct(IN osm_pkey_rcv_t * const p_rcv) -{ - memset(p_rcv, 0, sizeof(*p_rcv)); -} - -/********************************************************************** - **********************************************************************/ -void osm_pkey_rcv_destroy(IN osm_pkey_rcv_t * const p_rcv) -{ - CL_ASSERT(p_rcv); - - OSM_LOG_ENTER(p_rcv->p_log, osm_pkey_rcv_destroy); - - OSM_LOG_EXIT(p_rcv->p_log); -} - -/********************************************************************** - **********************************************************************/ -ib_api_status_t -osm_pkey_rcv_init(IN osm_pkey_rcv_t * const p_rcv, - IN osm_req_t * const p_req, - IN osm_subn_t * const p_subn, - IN osm_log_t * const p_log, IN cl_plock_t * const p_lock) -{ - ib_api_status_t status = IB_SUCCESS; - - OSM_LOG_ENTER(p_log, osm_pkey_rcv_init); - - osm_pkey_rcv_construct(p_rcv); - - p_rcv->p_log = p_log; - p_rcv->p_subn = p_subn; - p_rcv->p_lock = p_lock; - p_rcv->p_req = p_req; - - OSM_LOG_EXIT(p_log); - return (status); -} +#include /********************************************************************** **********************************************************************/ @@ -102,7 +55,7 @@ osm_pkey_rcv_init(IN osm_pkey_rcv_t * const p_rcv, */ void osm_pkey_rcv_process(IN void 
*context, IN void *data) { - osm_pkey_rcv_t *p_rcv = context; + osm_sm_t *sm = context; osm_madw_t *p_madw = data; ib_pkey_table_t *p_pkey_tbl; ib_smp_t *p_smp; @@ -115,9 +68,9 @@ void osm_pkey_rcv_process(IN void *context, IN void *data) uint8_t port_num; uint16_t block_num; - CL_ASSERT(p_rcv); + CL_ASSERT(sm); - OSM_LOG_ENTER(p_rcv->p_log, osm_pkey_rcv_process); + OSM_LOG_ENTER(sm->p_log, osm_pkey_rcv_process); CL_ASSERT(p_madw); @@ -131,10 +84,10 @@ void osm_pkey_rcv_process(IN void *context, IN void *data) CL_ASSERT(p_smp->attr_id == IB_MAD_ATTR_P_KEY_TABLE); - cl_plock_excl_acquire(p_rcv->p_lock); - p_port = osm_get_port_by_guid(p_rcv->p_subn, port_guid); + cl_plock_excl_acquire(sm->p_lock); + p_port = osm_get_port_by_guid(sm->p_subn, port_guid); if (!p_port) { - osm_log(p_rcv->p_log, OSM_LOG_ERROR, + osm_log(sm->p_log, OSM_LOG_ERROR, "osm_pkey_rcv_process: ERR 4806: " "No port object for port with GUID 0x%" PRIx64 "\n\t\t\t\tfor parent node GUID 0x%" PRIx64 @@ -163,8 +116,8 @@ void osm_pkey_rcv_process(IN void *context, IN void *data) We do not mind if this is a result of a set or get - all we want is to update the subnet. */ - if (osm_log_is_active(p_rcv->p_log, OSM_LOG_VERBOSE)) { - osm_log(p_rcv->p_log, OSM_LOG_VERBOSE, + if (osm_log_is_active(sm->p_log, OSM_LOG_VERBOSE)) { + osm_log(sm->p_log, OSM_LOG_VERBOSE, "osm_pkey_rcv_process: " "Got GetResp(PKey) block:%u port_num %u with GUID 0x%" PRIx64 " for parent node GUID 0x%" PRIx64 ", TID 0x%" @@ -177,21 +130,21 @@ void osm_pkey_rcv_process(IN void *context, IN void *data) If so, ignore it. 
*/ if (!osm_physp_is_valid(p_physp)) { - osm_log(p_rcv->p_log, OSM_LOG_ERROR, + osm_log(sm->p_log, OSM_LOG_ERROR, "osm_pkey_rcv_process: ERR 4807: " "Got invalid port number 0x%X\n", port_num); goto Exit; } - osm_dump_pkey_block(p_rcv->p_log, + osm_dump_pkey_block(sm->p_log, port_guid, block_num, port_num, p_pkey_tbl, OSM_LOG_DEBUG); - osm_physp_set_pkey_tbl(p_rcv->p_log, p_rcv->p_subn, + osm_physp_set_pkey_tbl(sm->p_log, sm->p_subn, p_physp, p_pkey_tbl, block_num); Exit: - cl_plock_release(p_rcv->p_lock); + cl_plock_release(sm->p_lock); - OSM_LOG_EXIT(p_rcv->p_log); + OSM_LOG_EXIT(sm->p_log); } diff --git a/opensm/opensm/osm_port_info_rcv.c b/opensm/opensm/osm_port_info_rcv.c index 1987e2c..3775665 100644 --- a/opensm/opensm/osm_port_info_rcv.c +++ b/opensm/opensm/osm_port_info_rcv.c @@ -54,8 +54,7 @@ #include #include #include -#include -#include +#include #include #include #include @@ -64,7 +63,6 @@ #include #include #include -#include #include #include #include @@ -72,16 +70,16 @@ /********************************************************************** **********************************************************************/ static void -__osm_pi_rcv_set_sm(IN const osm_pi_rcv_t * const p_rcv, +__osm_pi_rcv_set_sm(IN osm_sm_t * sm, IN osm_physp_t * const p_physp) { osm_bind_handle_t h_bind; osm_dr_path_t *p_dr_path; - OSM_LOG_ENTER(p_rcv->p_log, __osm_pi_rcv_set_sm); + OSM_LOG_ENTER(sm->p_log, __osm_pi_rcv_set_sm); - if (osm_log_is_active(p_rcv->p_log, OSM_LOG_DEBUG)) - osm_log(p_rcv->p_log, OSM_LOG_DEBUG, + if (osm_log_is_active(sm->p_log, OSM_LOG_DEBUG)) + osm_log(sm->p_log, OSM_LOG_DEBUG, "__osm_pi_rcv_set_sm: " "Setting IS_SM bit in port attributes\n"); @@ -93,7 +91,7 @@ __osm_pi_rcv_set_sm(IN const osm_pi_rcv_t * const p_rcv, */ osm_vendor_set_sm(h_bind, TRUE); - OSM_LOG_EXIT(p_rcv->p_log); + OSM_LOG_EXIT(sm->p_log); } /********************************************************************** @@ -114,7 +112,7 @@ static void pi_rcv_check_and_fix_lid(osm_log_t 
*log, ib_port_info_t * const pi, /********************************************************************** **********************************************************************/ static void -__osm_pi_rcv_process_endport(IN const osm_pi_rcv_t * const p_rcv, +__osm_pi_rcv_process_endport(IN osm_sm_t * sm, IN osm_physp_t * const p_physp, IN const ib_port_info_t * const p_pi) { @@ -125,7 +123,7 @@ __osm_pi_rcv_process_endport(IN const osm_pi_rcv_t * const p_rcv, cl_qmap_t *p_sm_tbl; osm_remote_sm_t *p_sm; - OSM_LOG_ENTER(p_rcv->p_log, __osm_pi_rcv_process_endport); + OSM_LOG_ENTER(sm->p_log, __osm_pi_rcv_process_endport); port_guid = osm_physp_get_port_guid(p_physp); @@ -133,25 +131,25 @@ __osm_pi_rcv_process_endport(IN const osm_pi_rcv_t * const p_rcv, if (osm_physp_get_port_num(p_physp) != 0) { /* track the minimal endport MTU and rate */ mtu = ib_port_info_get_mtu_cap(p_pi); - if (mtu < p_rcv->p_subn->min_ca_mtu) { - osm_log(p_rcv->p_log, OSM_LOG_VERBOSE, + if (mtu < sm->p_subn->min_ca_mtu) { + osm_log(sm->p_log, OSM_LOG_VERBOSE, "__osm_pi_rcv_process_endport: " "Setting endport minimal MTU to:%u defined by port:0x%" PRIx64 "\n", mtu, cl_ntoh64(port_guid)); - p_rcv->p_subn->min_ca_mtu = mtu; + sm->p_subn->min_ca_mtu = mtu; } rate = ib_port_info_compute_rate(p_pi); - if (rate < p_rcv->p_subn->min_ca_rate) { - osm_log(p_rcv->p_log, OSM_LOG_VERBOSE, + if (rate < sm->p_subn->min_ca_rate) { + osm_log(sm->p_log, OSM_LOG_VERBOSE, "__osm_pi_rcv_process_endport: " "Setting endport minimal rate to:%u defined by port:0x%" PRIx64 "\n", rate, cl_ntoh64(port_guid)); - p_rcv->p_subn->min_ca_rate = rate; + sm->p_subn->min_ca_rate = rate; } } - if (port_guid == p_rcv->p_subn->sm_port_guid) { + if (port_guid == sm->p_subn->sm_port_guid) { /* We received the PortInfo for our own port. */ @@ -159,28 +157,28 @@ __osm_pi_rcv_process_endport(IN const osm_pi_rcv_t * const p_rcv, /* Set the IS_SM bit to indicate our port hosts an SM. 
*/ - __osm_pi_rcv_set_sm(p_rcv, p_physp); + __osm_pi_rcv_set_sm(sm, p_physp); } else { /* Before querying the SM - we want to make sure we clean its state, so if the querying fails we recognize that this SM is not active. */ - p_sm_tbl = &p_rcv->p_subn->sm_guid_tbl; + p_sm_tbl = &sm->p_subn->sm_guid_tbl; p_sm = (osm_remote_sm_t *) cl_qmap_get(p_sm_tbl, port_guid); if (p_sm != (osm_remote_sm_t *) cl_qmap_end(p_sm_tbl)) /* clean it up */ p_sm->smi.pri_state = 0xF0 & p_sm->smi.pri_state; if (p_pi->capability_mask & IB_PORT_CAP_IS_SM) { - if (p_rcv->p_subn->opt.ignore_other_sm) - osm_log(p_rcv->p_log, OSM_LOG_VERBOSE, + if (sm->p_subn->opt.ignore_other_sm) + osm_log(sm->p_log, OSM_LOG_VERBOSE, "__osm_pi_rcv_process_endport: " "Ignoring SM on port 0x%" PRIx64 "\n", cl_ntoh64(port_guid)); else { if (osm_log_is_active - (p_rcv->p_log, OSM_LOG_VERBOSE)) - osm_log(p_rcv->p_log, OSM_LOG_VERBOSE, + (sm->p_log, OSM_LOG_VERBOSE)) + osm_log(sm->p_log, OSM_LOG_VERBOSE, "__osm_pi_rcv_process_endport: " "Detected another SM. Requesting SMInfo" "\n\t\t\t\tPort 0x%" PRIx64 @@ -193,7 +191,7 @@ __osm_pi_rcv_process_endport(IN const osm_pi_rcv_t * const p_rcv, memset(&context, 0, sizeof(context)); context.smi_context.set_method = FALSE; context.smi_context.port_guid = port_guid; - status = osm_req_get(p_rcv->p_req, + status = osm_req_get(&sm->req, osm_physp_get_dr_path_ptr (p_physp), IB_MAD_ATTR_SM_INFO, 0, @@ -201,7 +199,7 @@ __osm_pi_rcv_process_endport(IN const osm_pi_rcv_t * const p_rcv, &context); if (status != IB_SUCCESS) - osm_log(p_rcv->p_log, OSM_LOG_ERROR, + osm_log(sm->p_log, OSM_LOG_ERROR, "__osm_pi_rcv_process_endport: ERR 0F05: " "Failure requesting SMInfo (%s)\n", ib_get_err_str(status)); @@ -209,14 +207,14 @@ __osm_pi_rcv_process_endport(IN const osm_pi_rcv_t * const p_rcv, } } - OSM_LOG_EXIT(p_rcv->p_log); + OSM_LOG_EXIT(sm->p_log); } /********************************************************************** The plock must be held before calling this function. 
**********************************************************************/ static void -__osm_pi_rcv_process_switch_port(IN const osm_pi_rcv_t * const p_rcv, +__osm_pi_rcv_process_switch_port(IN osm_sm_t * sm, IN osm_node_t * const p_node, IN osm_physp_t * const p_physp, IN ib_port_info_t * const p_pi) @@ -229,7 +227,7 @@ __osm_pi_rcv_process_switch_port(IN const osm_pi_rcv_t * const p_rcv, uint8_t remote_port_num; osm_dr_path_t path; - OSM_LOG_ENTER(p_rcv->p_log, __osm_pi_rcv_process_switch_port); + OSM_LOG_ENTER(sm->p_log, __osm_pi_rcv_process_switch_port); /* Check the state of the physical port. @@ -240,7 +238,7 @@ __osm_pi_rcv_process_switch_port(IN const osm_pi_rcv_t * const p_rcv, /* if in_sweep_hop_0 is TRUE, then this means the SM is on the switch, and we got switchInfo of our local switch. Do not continue probing through the switch. */ - if (port_num != 0 && p_rcv->p_subn->in_sweep_hop_0 == FALSE) { + if (port_num != 0 && sm->p_subn->in_sweep_hop_0 == FALSE) { switch (ib_port_info_get_port_state(p_pi)) { case IB_LINK_DOWN: p_remote_physp = osm_physp_get_remote(p_physp); @@ -251,7 +249,7 @@ __osm_pi_rcv_process_switch_port(IN const osm_pi_rcv_t * const p_rcv, remote_port_num = osm_physp_get_port_num(p_remote_physp); - osm_log(p_rcv->p_log, OSM_LOG_VERBOSE, + osm_log(sm->p_log, OSM_LOG_VERBOSE, "__osm_pi_rcv_process_switch_port: " "Unlinking local node 0x%" PRIx64 ", port 0x%X" @@ -297,7 +295,7 @@ __osm_pi_rcv_process_switch_port(IN const osm_pi_rcv_t * const p_rcv, context.ni_context.port_num = osm_physp_get_port_num(p_physp); - status = osm_req_get(p_rcv->p_req, + status = osm_req_get(&sm->req, &path, IB_MAD_ATTR_NODE_INFO, 0, @@ -305,20 +303,20 @@ __osm_pi_rcv_process_switch_port(IN const osm_pi_rcv_t * const p_rcv, &context); if (status != IB_SUCCESS) - osm_log(p_rcv->p_log, OSM_LOG_ERROR, + osm_log(sm->p_log, OSM_LOG_ERROR, "__osm_pi_rcv_process_switch_port: ERR 0F02: " "Failure initiating NodeInfo request (%s)\n", ib_get_err_str(status)); } else - if 
(osm_log_is_active(p_rcv->p_log, OSM_LOG_DEBUG)) - osm_log(p_rcv->p_log, OSM_LOG_DEBUG, + if (osm_log_is_active(sm->p_log, OSM_LOG_DEBUG)) + osm_log(sm->p_log, OSM_LOG_DEBUG, "__osm_pi_rcv_process_switch_port: " "Skipping SMP responder port 0x%X\n", p_pi->local_port_num); break; default: - osm_log(p_rcv->p_log, OSM_LOG_ERROR, + osm_log(sm->p_log, OSM_LOG_ERROR, "__osm_pi_rcv_process_switch_port: ERR 0F03: " "Unknown link state = %u, port = 0x%X\n", ib_port_info_get_port_state(p_pi), @@ -331,7 +329,7 @@ __osm_pi_rcv_process_switch_port(IN const osm_pi_rcv_t * const p_rcv, p_node->sw->need_update = 0; if (port_num == 0) - pi_rcv_check_and_fix_lid(p_rcv->p_log, p_pi, p_physp); + pi_rcv_check_and_fix_lid(sm->p_log, p_pi, p_physp); /* Update the PortInfo attribute. @@ -344,31 +342,31 @@ __osm_pi_rcv_process_switch_port(IN const osm_pi_rcv_t * const p_rcv, !ib_switch_info_is_enhanced_port0(&p_node->sw->switch_info)) /* PortState is not used on BSP0 but just in case it is DOWN */ p_physp->port_info = *p_pi; - __osm_pi_rcv_process_endport(p_rcv, p_physp, p_pi); + __osm_pi_rcv_process_endport(sm, p_physp, p_pi); } - OSM_LOG_EXIT(p_rcv->p_log); + OSM_LOG_EXIT(sm->p_log); } /********************************************************************** **********************************************************************/ static void -__osm_pi_rcv_process_ca_or_router_port(IN const osm_pi_rcv_t * const p_rcv, +__osm_pi_rcv_process_ca_or_router_port(IN osm_sm_t * sm, IN osm_node_t * const p_node, IN osm_physp_t * const p_physp, IN ib_port_info_t * const p_pi) { - OSM_LOG_ENTER(p_rcv->p_log, __osm_pi_rcv_process_ca_or_router_port); + OSM_LOG_ENTER(sm->p_log, __osm_pi_rcv_process_ca_or_router_port); UNUSED_PARAM(p_node); - pi_rcv_check_and_fix_lid(p_rcv->p_log, p_pi, p_physp); + pi_rcv_check_and_fix_lid(sm->p_log, p_pi, p_physp); osm_physp_set_port_info(p_physp, p_pi); - __osm_pi_rcv_process_endport(p_rcv, p_physp, p_pi); + __osm_pi_rcv_process_endport(sm, p_physp, p_pi); - 
OSM_LOG_EXIT(p_rcv->p_log); + OSM_LOG_EXIT(sm->p_log); } #define IBM_VENDOR_ID (0x5076) @@ -450,62 +448,21 @@ static void get_pkey_table(IN osm_log_t * p_log, /********************************************************************** **********************************************************************/ static void -__osm_pi_rcv_get_pkey_slvl_vla_tables(IN const osm_pi_rcv_t * const p_rcv, +__osm_pi_rcv_get_pkey_slvl_vla_tables(IN osm_sm_t * sm, IN osm_node_t * const p_node, IN osm_physp_t * const p_physp) { - OSM_LOG_ENTER(p_rcv->p_log, __osm_pi_rcv_get_pkey_slvl_vla_tables); - - get_pkey_table(p_rcv->p_log, p_rcv->p_req, p_rcv->p_subn, - p_node, p_physp); - - OSM_LOG_EXIT(p_rcv->p_log); -} - -/********************************************************************** - **********************************************************************/ -void osm_pi_rcv_construct(IN osm_pi_rcv_t * const p_rcv) -{ - memset(p_rcv, 0, sizeof(*p_rcv)); -} - -/********************************************************************** - **********************************************************************/ -void osm_pi_rcv_destroy(IN osm_pi_rcv_t * const p_rcv) -{ - OSM_LOG_ENTER(p_rcv->p_log, osm_pi_rcv_destroy); - - CL_ASSERT(p_rcv); - OSM_LOG_EXIT(p_rcv->p_log); -} - -/********************************************************************** - **********************************************************************/ -ib_api_status_t -osm_pi_rcv_init(IN osm_pi_rcv_t * const p_rcv, - IN osm_req_t * const p_req, - IN osm_subn_t * const p_subn, - IN osm_log_t * const p_log, IN cl_plock_t * const p_lock) -{ - ib_api_status_t status = IB_SUCCESS; - OSM_LOG_ENTER(p_log, osm_pi_rcv_init); + OSM_LOG_ENTER(sm->p_log, __osm_pi_rcv_get_pkey_slvl_vla_tables); - osm_pi_rcv_construct(p_rcv); + get_pkey_table(sm->p_log, &sm->req, sm->p_subn, p_node, p_physp); - p_rcv->p_log = p_log; - p_rcv->p_subn = p_subn; - p_rcv->p_lock = p_lock; - p_rcv->p_req = p_req; - - OSM_LOG_EXIT(p_log); - return (status); + 
OSM_LOG_EXIT(sm->p_log); } /********************************************************************** **********************************************************************/ -void -osm_pi_rcv_process_set(IN const osm_pi_rcv_t * const p_rcv, - IN osm_node_t * const p_node, +static void +osm_pi_rcv_process_set(IN osm_sm_t * sm, IN osm_node_t * const p_node, IN const uint8_t port_num, IN osm_madw_t * const p_madw) { osm_physp_t *p_physp; @@ -515,7 +472,7 @@ osm_pi_rcv_process_set(IN const osm_pi_rcv_t * const p_rcv, osm_pi_context_t *p_context; osm_log_level_t level; - OSM_LOG_ENTER(p_rcv->p_log, osm_pi_rcv_process_set); + OSM_LOG_ENTER(sm->p_log, osm_pi_rcv_process_set); p_context = osm_madw_get_pi_context_ptr(p_madw); @@ -535,24 +492,24 @@ osm_pi_rcv_process_set(IN const osm_pi_rcv_t * const p_rcv, if (p_context->active_transition && (cl_ntoh16(p_smp->status) & 0x7fff) == 0x1c) { level = OSM_LOG_INFO; - osm_log(p_rcv->p_log, OSM_LOG_INFO, + osm_log(sm->p_log, OSM_LOG_INFO, "osm_pi_rcv_process_set: " "Received error status 0x%x for SetResp() during ACTIVE transition\n", cl_ntoh16(p_smp->status) & 0x7fff); /* Should there be a subsequent Get to validate that port is ACTIVE ? 
*/ } else { level = OSM_LOG_ERROR; - osm_log(p_rcv->p_log, OSM_LOG_ERROR, + osm_log(sm->p_log, OSM_LOG_ERROR, "osm_pi_rcv_process_set: ERR 0F10: " "Received error status for SetResp()\n"); } - osm_dump_port_info(p_rcv->p_log, + osm_dump_port_info(sm->p_log, osm_node_get_node_guid(p_node), port_guid, port_num, p_pi, level); } - if (osm_log_is_active(p_rcv->p_log, OSM_LOG_DEBUG)) - osm_log(p_rcv->p_log, OSM_LOG_DEBUG, + if (osm_log_is_active(sm->p_log, OSM_LOG_DEBUG)) + osm_log(sm->p_log, OSM_LOG_DEBUG, "osm_pi_rcv_process_set: " "Received logical SetResp() for GUID 0x%" PRIx64 ", port num 0x%X" @@ -565,14 +522,14 @@ osm_pi_rcv_process_set(IN const osm_pi_rcv_t * const p_rcv, osm_physp_set_port_info(p_physp, p_pi); - OSM_LOG_EXIT(p_rcv->p_log); + OSM_LOG_EXIT(sm->p_log); } /********************************************************************** **********************************************************************/ void osm_pi_rcv_process(IN void *context, IN void *data) { - osm_pi_rcv_t *p_rcv = context; + osm_sm_t *sm = context; osm_madw_t *p_madw = data; ib_port_info_t *p_pi; ib_smp_t *p_smp; @@ -585,9 +542,9 @@ void osm_pi_rcv_process(IN void *context, IN void *data) ib_net64_t node_guid; uint8_t port_num; - OSM_LOG_ENTER(p_rcv->p_log, osm_pi_rcv_process); + OSM_LOG_ENTER(sm->p_log, osm_pi_rcv_process); - CL_ASSERT(p_rcv); + CL_ASSERT(sm); CL_ASSERT(p_madw); p_smp = osm_madw_get_smp_ptr(p_madw); @@ -601,13 +558,13 @@ void osm_pi_rcv_process(IN void *context, IN void *data) port_guid = p_context->port_guid; node_guid = p_context->node_guid; - osm_dump_port_info(p_rcv->p_log, + osm_dump_port_info(sm->p_log, node_guid, port_guid, port_num, p_pi, OSM_LOG_DEBUG); /* On receipt of client reregister, clear the reregister bit so reregistering won't be sent again and again */ if (ib_port_info_get_client_rereg(p_pi)) { - osm_log(p_rcv->p_log, OSM_LOG_DEBUG, + osm_log(sm->p_log, OSM_LOG_DEBUG, "osm_pi_rcv_process: " "Client reregister received on response\n"); 
ib_port_info_set_client_rereg(p_pi, 0); @@ -620,22 +577,22 @@ void osm_pi_rcv_process(IN void *context, IN void *data) do anything with the response - just flag that we need a heavy sweep */ if (p_context->light_sweep == TRUE) { - osm_log(p_rcv->p_log, OSM_LOG_VERBOSE, + osm_log(sm->p_log, OSM_LOG_VERBOSE, "osm_pi_rcv_process: " "Got light sweep response from remote port of parent node " "GUID 0x%" PRIx64 " port 0x%016" PRIx64 ", Commencing heavy sweep\n", cl_ntoh64(node_guid), cl_ntoh64(port_guid)); - osm_sm_signal(&p_rcv->p_subn->p_osm->sm, + osm_sm_signal(&sm->p_subn->p_osm->sm, OSM_SIGNAL_CHANGE_DETECTED); goto Exit; } - CL_PLOCK_EXCL_ACQUIRE(p_rcv->p_lock); - p_port = osm_get_port_by_guid(p_rcv->p_subn, port_guid); + CL_PLOCK_EXCL_ACQUIRE(sm->p_lock); + p_port = osm_get_port_by_guid(sm->p_subn, port_guid); if (!p_port) { - CL_PLOCK_RELEASE(p_rcv->p_lock); - osm_log(p_rcv->p_log, OSM_LOG_ERROR, + CL_PLOCK_RELEASE(sm->p_lock); + osm_log(sm->p_log, OSM_LOG_ERROR, "osm_pi_rcv_process: ERR 0F06: " "No port object for port with GUID 0x%" PRIx64 "\n\t\t\t\tfor parent node GUID 0x%" PRIx64 @@ -660,7 +617,7 @@ void osm_pi_rcv_process(IN void *context, IN void *data) boolean around to determine if we were doing Get() or Set(). */ if (p_context->set_method) - osm_pi_rcv_process_set(p_rcv, p_node, port_num, p_madw); + osm_pi_rcv_process_set(sm, p_node, port_num, p_madw); else { p_port->discovery_count++; @@ -668,8 +625,8 @@ void osm_pi_rcv_process(IN void *context, IN void *data) This PortInfo arrived because we did a Get() method, most likely due to a subnet sweep in progress. 
*/ - if (osm_log_is_active(p_rcv->p_log, OSM_LOG_VERBOSE)) - osm_log(p_rcv->p_log, OSM_LOG_VERBOSE, + if (osm_log_is_active(sm->p_log, OSM_LOG_VERBOSE)) + osm_log(sm->p_log, OSM_LOG_VERBOSE, "osm_pi_rcv_process: " "Discovered port num 0x%X with GUID 0x%" PRIx64 " for parent node GUID 0x%" PRIx64 @@ -687,8 +644,8 @@ void osm_pi_rcv_process(IN void *context, IN void *data) continue processing as normal. */ if (!osm_physp_is_valid(p_physp)) { - if (osm_log_is_active(p_rcv->p_log, OSM_LOG_VERBOSE)) - osm_log(p_rcv->p_log, OSM_LOG_VERBOSE, + if (osm_log_is_active(sm->p_log, OSM_LOG_VERBOSE)) + osm_log(sm->p_log, OSM_LOG_VERBOSE, "osm_pi_rcv_process: " "Initializing port number 0x%X\n", port_num); @@ -718,13 +675,13 @@ void osm_pi_rcv_process(IN void *context, IN void *data) in the subnet. */ if (p_context->update_master_sm_base_lid == TRUE) { - osm_log(p_rcv->p_log, OSM_LOG_VERBOSE, + osm_log(sm->p_log, OSM_LOG_VERBOSE, "osm_pi_rcv_process: " "update_master_sm is TRUE. " "Updating master_sm_base_lid to:%u\n", p_pi->master_sm_base_lid); - p_rcv->p_subn->master_sm_base_lid = + sm->p_subn->master_sm_base_lid = p_pi->master_sm_base_lid; } @@ -737,16 +694,16 @@ void osm_pi_rcv_process(IN void *context, IN void *data) switch (osm_node_get_type(p_node)) { case IB_NODE_TYPE_CA: case IB_NODE_TYPE_ROUTER: - __osm_pi_rcv_process_ca_or_router_port(p_rcv, + __osm_pi_rcv_process_ca_or_router_port(sm, p_node, p_physp, p_pi); break; case IB_NODE_TYPE_SWITCH: - __osm_pi_rcv_process_switch_port(p_rcv, + __osm_pi_rcv_process_switch_port(sm, p_node, p_physp, p_pi); break; default: - osm_log(p_rcv->p_log, OSM_LOG_ERROR, + osm_log(sm->p_log, OSM_LOG_ERROR, "osm_pi_rcv_process: ERR 0F07: " "Unknown node type %u with GUID 0x%" PRIx64 "\n", osm_node_get_type(p_node), @@ -757,17 +714,17 @@ void osm_pi_rcv_process(IN void *context, IN void *data) /* Get the tables on the physp. 
*/ - if (p_physp->need_update || p_rcv->p_subn->need_update) - __osm_pi_rcv_get_pkey_slvl_vla_tables(p_rcv, p_node, + if (p_physp->need_update || sm->p_subn->need_update) + __osm_pi_rcv_get_pkey_slvl_vla_tables(sm, p_node, p_physp); } - CL_PLOCK_RELEASE(p_rcv->p_lock); + CL_PLOCK_RELEASE(sm->p_lock); Exit: /* Release the lock before jumping here!! */ - OSM_LOG_EXIT(p_rcv->p_log); + OSM_LOG_EXIT(sm->p_log); } diff --git a/opensm/opensm/osm_slvl_map_rcv.c b/opensm/opensm/osm_slvl_map_rcv.c index e3942fb..3f9c88a 100644 --- a/opensm/opensm/osm_slvl_map_rcv.c +++ b/opensm/opensm/osm_slvl_map_rcv.c @@ -51,61 +51,14 @@ #include #include -#include #include #include -#include -#include -#include #include #include #include #include -#include -#include #include -#include - -/********************************************************************** - **********************************************************************/ -void osm_slvl_rcv_construct(IN osm_slvl_rcv_t * const p_rcv) -{ - memset(p_rcv, 0, sizeof(*p_rcv)); -} - -/********************************************************************** - **********************************************************************/ -void osm_slvl_rcv_destroy(IN osm_slvl_rcv_t * const p_rcv) -{ - CL_ASSERT(p_rcv); - - OSM_LOG_ENTER(p_rcv->p_log, osm_slvl_rcv_destroy); - - OSM_LOG_EXIT(p_rcv->p_log); -} - -/********************************************************************** - **********************************************************************/ -ib_api_status_t -osm_slvl_rcv_init(IN osm_slvl_rcv_t * const p_rcv, - IN osm_req_t * const p_req, - IN osm_subn_t * const p_subn, - IN osm_log_t * const p_log, IN cl_plock_t * const p_lock) -{ - ib_api_status_t status = IB_SUCCESS; - - OSM_LOG_ENTER(p_log, osm_slvl_rcv_init); - - osm_slvl_rcv_construct(p_rcv); - - p_rcv->p_log = p_log; - p_rcv->p_subn = p_subn; - p_rcv->p_lock = p_lock; - p_rcv->p_req = p_req; - - OSM_LOG_EXIT(p_log); - return (status); -} +#include 
/********************************************************************** **********************************************************************/ @@ -114,7 +67,7 @@ osm_slvl_rcv_init(IN osm_slvl_rcv_t * const p_rcv, */ void osm_slvl_rcv_process(IN void *context, IN void *p_data) { - osm_slvl_rcv_t *p_rcv = context; + osm_sm_t *sm = context; osm_madw_t *p_madw = p_data; ib_slvl_table_t *p_slvl_tbl; ib_smp_t *p_smp; @@ -126,9 +79,9 @@ void osm_slvl_rcv_process(IN void *context, IN void *p_data) ib_net64_t node_guid; uint8_t out_port_num, in_port_num; - CL_ASSERT(p_rcv); + CL_ASSERT(sm); - OSM_LOG_ENTER(p_rcv->p_log, osm_slvl_rcv_process); + OSM_LOG_ENTER(sm->p_log, osm_slvl_rcv_process); CL_ASSERT(p_madw); @@ -141,12 +94,12 @@ void osm_slvl_rcv_process(IN void *context, IN void *p_data) CL_ASSERT(p_smp->attr_id == IB_MAD_ATTR_SLVL_TABLE); - cl_plock_excl_acquire(p_rcv->p_lock); - p_port = osm_get_port_by_guid(p_rcv->p_subn, port_guid); + cl_plock_excl_acquire(sm->p_lock); + p_port = osm_get_port_by_guid(sm->p_subn, port_guid); if (!p_port) { - cl_plock_release(p_rcv->p_lock); - osm_log(p_rcv->p_log, OSM_LOG_ERROR, + cl_plock_release(sm->p_lock); + osm_log(sm->p_log, OSM_LOG_ERROR, "osm_slvl_rcv_process: ERR 2C06: " "No port object for port with GUID 0x%" PRIx64 "\n\t\t\t\tfor parent node GUID 0x%" PRIx64 @@ -178,8 +131,8 @@ void osm_slvl_rcv_process(IN void *context, IN void *p_data) We do not mind if this is a result of a set or get - all we want is to update the subnet. */ - if (osm_log_is_active(p_rcv->p_log, OSM_LOG_VERBOSE)) { - osm_log(p_rcv->p_log, OSM_LOG_VERBOSE, + if (osm_log_is_active(sm->p_log, OSM_LOG_VERBOSE)) { + osm_log(sm->p_log, OSM_LOG_VERBOSE, "osm_slvl_rcv_process: " "Got SLtoVL get response in_port_num %u out_port_num %u with GUID 0x%" PRIx64 " for parent node GUID 0x%" PRIx64 ", TID 0x%" @@ -193,20 +146,20 @@ void osm_slvl_rcv_process(IN void *context, IN void *p_data) If so, Ignore it. 
*/ if (!osm_physp_is_valid(p_physp)) { - osm_log(p_rcv->p_log, OSM_LOG_ERROR, + osm_log(sm->p_log, OSM_LOG_ERROR, "osm_slvl_rcv_process: " "Got invalid port number 0x%X\n", out_port_num); goto Exit; } - osm_dump_slvl_map_table(p_rcv->p_log, + osm_dump_slvl_map_table(sm->p_log, port_guid, in_port_num, out_port_num, p_slvl_tbl, OSM_LOG_DEBUG); osm_physp_set_slvl_tbl(p_physp, p_slvl_tbl, in_port_num); Exit: - cl_plock_release(p_rcv->p_lock); + cl_plock_release(sm->p_lock); - OSM_LOG_EXIT(p_rcv->p_log); + OSM_LOG_EXIT(sm->p_log); } diff --git a/opensm/opensm/osm_sm.c b/opensm/opensm/osm_sm.c index b295a77..b60a615 100644 --- a/opensm/opensm/osm_sm.c +++ b/opensm/opensm/osm_sm.c @@ -68,6 +68,18 @@ #define OSM_SM_INITIAL_TID_VALUE 0x1233 +extern void osm_lft_rcv_process(IN void *context, IN void *data); +extern void osm_mft_rcv_process(IN void *context, IN void *data); +extern void osm_nd_rcv_process(IN void *context, IN void *data); +extern void osm_ni_rcv_process(IN void *context, IN void *data); +extern void osm_pkey_rcv_process(IN void *context, IN void *data); +extern void osm_pi_rcv_process(IN void *context, IN void *data); +extern void osm_slvl_rcv_process(IN void *context, IN void *p_data); +extern void osm_sminfo_rcv_process(IN void *context, IN void *data); +extern void osm_si_rcv_process(IN void *context, IN void *data); +extern void osm_trap_rcv_process(IN void *context, IN void *data); +extern void osm_vla_rcv_process(IN void *context, IN void *data); + /********************************************************************** **********************************************************************/ static void osm_sm_process(osm_sm_t * sm, osm_signal_t signal) @@ -143,29 +155,19 @@ void osm_sm_construct(IN osm_sm_t * const p_sm) cl_spinlock_construct(&p_sm->signal_lock); cl_event_construct(&p_sm->signal_event); cl_event_construct(&p_sm->subnet_up_event); + cl_event_wheel_construct(&p_sm->trap_aging_tracker); cl_thread_construct(&p_sm->sweeper); 
 	cl_spinlock_construct(&p_sm->mgrp_lock);
 	osm_req_construct(&p_sm->req);
 	osm_resp_construct(&p_sm->resp);
-	osm_ni_rcv_construct(&p_sm->ni_rcv);
-	osm_pi_rcv_construct(&p_sm->pi_rcv);
-	osm_nd_rcv_construct(&p_sm->nd_rcv);
 	osm_sm_mad_ctrl_construct(&p_sm->mad_ctrl);
-	osm_si_rcv_construct(&p_sm->si_rcv);
 	osm_lid_mgr_construct(&p_sm->lid_mgr);
 	osm_ucast_mgr_construct(&p_sm->ucast_mgr);
 	osm_link_mgr_construct(&p_sm->link_mgr);
 	osm_state_mgr_construct(&p_sm->state_mgr);
 	osm_drop_mgr_construct(&p_sm->drop_mgr);
-	osm_lft_rcv_construct(&p_sm->lft_rcv);
-	osm_mft_rcv_construct(&p_sm->mft_rcv);
 	osm_sweep_fail_ctrl_construct(&p_sm->sweep_fail_ctrl);
-	osm_sminfo_rcv_construct(&p_sm->sm_info_rcv);
-	osm_trap_rcv_construct(&p_sm->trap_rcv);
 	osm_sm_state_mgr_construct(&p_sm->sm_state_mgr);
-	osm_slvl_rcv_construct(&p_sm->slvl_rcv);
-	osm_vla_rcv_construct(&p_sm->vla_rcv);
-	osm_pkey_rcv_construct(&p_sm->pkey_rcv);
 	osm_mcast_mgr_construct(&p_sm->mcast_mgr);
 }
@@ -222,26 +224,16 @@ void osm_sm_shutdown(IN osm_sm_t * const p_sm)
 void osm_sm_destroy(IN osm_sm_t * const p_sm)
 {
 	OSM_LOG_ENTER(p_sm->p_log, osm_sm_destroy);
-	osm_trap_rcv_destroy(&p_sm->trap_rcv);
-	osm_sminfo_rcv_destroy(&p_sm->sm_info_rcv);
 	osm_req_destroy(&p_sm->req);
 	osm_resp_destroy(&p_sm->resp);
-	osm_ni_rcv_destroy(&p_sm->ni_rcv);
-	osm_pi_rcv_destroy(&p_sm->pi_rcv);
-	osm_si_rcv_destroy(&p_sm->si_rcv);
-	osm_nd_rcv_destroy(&p_sm->nd_rcv);
 	osm_lid_mgr_destroy(&p_sm->lid_mgr);
 	osm_ucast_mgr_destroy(&p_sm->ucast_mgr);
 	osm_link_mgr_destroy(&p_sm->link_mgr);
 	osm_drop_mgr_destroy(&p_sm->drop_mgr);
-	osm_lft_rcv_destroy(&p_sm->lft_rcv);
-	osm_mft_rcv_destroy(&p_sm->mft_rcv);
-	osm_slvl_rcv_destroy(&p_sm->slvl_rcv);
-	osm_vla_rcv_destroy(&p_sm->vla_rcv);
-	osm_pkey_rcv_destroy(&p_sm->pkey_rcv);
 	osm_state_mgr_destroy(&p_sm->state_mgr);
 	osm_sm_state_mgr_destroy(&p_sm->sm_state_mgr);
 	osm_mcast_mgr_destroy(&p_sm->mcast_mgr);
+	cl_event_wheel_destroy(&p_sm->trap_aging_tracker);
 	cl_timer_destroy(&p_sm->sweep_timer);
 	cl_event_destroy(&p_sm->signal_event);
 	cl_event_destroy(&p_sm->subnet_up_event);
@@ -319,24 +311,7 @@ osm_sm_init(IN osm_sm_t * const p_sm,
 	if (status != IB_SUCCESS)
 		goto Exit;
-	status = osm_ni_rcv_init(&p_sm->ni_rcv,
-				 &p_sm->req, p_subn, p_log, p_lock);
-	if (status != IB_SUCCESS)
-		goto Exit;
-
-	status = osm_pi_rcv_init(&p_sm->pi_rcv,
-				 &p_sm->req, p_subn, p_log, p_lock);
-	if (status != IB_SUCCESS)
-		goto Exit;
-
-	status = osm_si_rcv_init(&p_sm->si_rcv,
-				 p_sm->p_subn,
-				 p_sm->p_log, &p_sm->req, p_sm->p_lock);
-
-	if (status != IB_SUCCESS)
-		goto Exit;
-
-	status = osm_nd_rcv_init(&p_sm->nd_rcv, p_subn, p_log, p_lock);
+	status = cl_event_wheel_init(&p_sm->trap_aging_tracker);
 	if (status != IB_SUCCESS)
 		goto Exit;
@@ -381,32 +356,11 @@ osm_sm_init(IN osm_sm_t * const p_sm,
 	if (status != IB_SUCCESS)
 		goto Exit;
-	status = osm_lft_rcv_init(&p_sm->lft_rcv, p_subn, p_log, p_lock);
-	if (status != IB_SUCCESS)
-		goto Exit;
-
-	status = osm_mft_rcv_init(&p_sm->mft_rcv, p_subn, p_log, p_lock);
-	if (status != IB_SUCCESS)
-		goto Exit;
-
 	status = osm_sweep_fail_ctrl_init(&p_sm->sweep_fail_ctrl,
 					  p_log, p_sm, p_disp);
 	if (status != IB_SUCCESS)
 		goto Exit;
-	status = osm_sminfo_rcv_init(&p_sm->sm_info_rcv,
-				     p_subn,
-				     p_stats,
-				     &p_sm->resp,
-				     p_log, &p_sm->sm_state_mgr, p_lock);
-	if (status != IB_SUCCESS)
-		goto Exit;
-
-	status = osm_trap_rcv_init(&p_sm->trap_rcv,
-				   p_subn, p_stats, &p_sm->resp, p_log, p_lock);
-	if (status != IB_SUCCESS)
-		goto Exit;
-
 	status = osm_sm_state_mgr_init(&p_sm->sm_state_mgr,
 				       p_sm->p_subn, &p_sm->req, p_sm->p_log);
 	if (status != IB_SUCCESS)
@@ -417,80 +371,58 @@ osm_sm_init(IN osm_sm_t * const p_sm,
 	if (status != IB_SUCCESS)
 		goto Exit;
-	status = osm_slvl_rcv_init(&p_sm->slvl_rcv,
-				   &p_sm->req, p_subn, p_log, p_lock);
-	if (status != IB_SUCCESS)
-		goto Exit;
-
-	status = osm_vla_rcv_init(&p_sm->vla_rcv,
-				  &p_sm->req, p_subn, p_log, p_lock);
-	if (status != IB_SUCCESS)
-		goto Exit;
-
-	status = osm_pkey_rcv_init(&p_sm->pkey_rcv,
-				   &p_sm->req, p_subn, p_log, p_lock);
-	if (status != IB_SUCCESS)
-		goto Exit;
-
 	p_sm->ni_disp_h = cl_disp_register(p_disp, OSM_MSG_MAD_NODE_INFO,
-					   osm_ni_rcv_process, &p_sm->ni_rcv);
+					   osm_ni_rcv_process, p_sm);
 	if (p_sm->ni_disp_h == CL_DISP_INVALID_HANDLE)
 		goto Exit;
 	p_sm->pi_disp_h = cl_disp_register(p_disp, OSM_MSG_MAD_PORT_INFO,
-					   osm_pi_rcv_process, &p_sm->pi_rcv);
+					   osm_pi_rcv_process, p_sm);
 	if (p_sm->pi_disp_h == CL_DISP_INVALID_HANDLE)
 		goto Exit;
 	p_sm->si_disp_h = cl_disp_register(p_disp, OSM_MSG_MAD_SWITCH_INFO,
-					   osm_si_rcv_process, &p_sm->si_rcv);
+					   osm_si_rcv_process, p_sm);
 	if (p_sm->si_disp_h == CL_DISP_INVALID_HANDLE)
 		goto Exit;
 	p_sm->nd_disp_h = cl_disp_register(p_disp, OSM_MSG_MAD_NODE_DESC,
-					   osm_nd_rcv_process, &p_sm->nd_rcv);
+					   osm_nd_rcv_process, p_sm);
 	if (p_sm->nd_disp_h == CL_DISP_INVALID_HANDLE)
 		goto Exit;
 	p_sm->lft_disp_h = cl_disp_register(p_disp, OSM_MSG_MAD_LFT,
-					    osm_lft_rcv_process,
-					    &p_sm->lft_rcv);
+					    osm_lft_rcv_process, p_sm);
 	if (p_sm->lft_disp_h == CL_DISP_INVALID_HANDLE)
 		goto Exit;
 	p_sm->mft_disp_h = cl_disp_register(p_disp, OSM_MSG_MAD_MFT,
-					    osm_mft_rcv_process,
-					    &p_sm->mft_rcv);
+					    osm_mft_rcv_process, p_sm);
 	if (p_sm->mft_disp_h == CL_DISP_INVALID_HANDLE)
 		goto Exit;
 	p_sm->sm_info_disp_h = cl_disp_register(p_disp, OSM_MSG_MAD_SM_INFO,
-						osm_sminfo_rcv_process,
-						&p_sm->sm_info_rcv);
+						osm_sminfo_rcv_process, p_sm);
 	if (p_sm->sm_info_disp_h == CL_DISP_INVALID_HANDLE)
 		goto Exit;
 	p_sm->trap_disp_h = cl_disp_register(p_disp, OSM_MSG_MAD_NOTICE,
-					     osm_trap_rcv_process,
-					     &p_sm->trap_rcv);
+					     osm_trap_rcv_process, p_sm);
 	if (p_sm->trap_disp_h == CL_DISP_INVALID_HANDLE)
 		goto Exit;
 	p_sm->slvl_disp_h = cl_disp_register(p_disp, OSM_MSG_MAD_SLVL,
-					     osm_slvl_rcv_process,
-					     &p_sm->slvl_rcv);
+					     osm_slvl_rcv_process, p_sm);
 	if (p_sm->slvl_disp_h == CL_DISP_INVALID_HANDLE)
 		goto Exit;
 	p_sm->vla_disp_h = cl_disp_register(p_disp, OSM_MSG_MAD_VL_ARB,
-					    osm_vla_rcv_process,
-					    &p_sm->vla_rcv);
+					    osm_vla_rcv_process, p_sm);
 	if (p_sm->vla_disp_h == CL_DISP_INVALID_HANDLE)
 		goto Exit;
 	p_sm->pkey_disp_h = cl_disp_register(p_disp, OSM_MSG_MAD_PKEY,
-					     osm_pkey_rcv_process,
-					     &p_sm->pkey_rcv);
+					     osm_pkey_rcv_process, p_sm);
 	if (p_sm->pkey_disp_h == CL_DISP_INVALID_HANDLE)
 		goto Exit;
diff --git a/opensm/opensm/osm_sminfo_rcv.c b/opensm/opensm/osm_sminfo_rcv.c
index 2367941..b150edd 100644
--- a/opensm/opensm/osm_sminfo_rcv.c
+++ b/opensm/opensm/osm_sminfo_rcv.c
@@ -55,7 +55,6 @@
 #include
 #include
 #include
-#include
 #include
 #include
 #include
@@ -65,73 +64,27 @@
 #include
 /**********************************************************************
- **********************************************************************/
-void osm_sminfo_rcv_construct(IN osm_sminfo_rcv_t * const p_rcv)
-{
-	memset(p_rcv, 0, sizeof(*p_rcv));
-}
-
-/**********************************************************************
- **********************************************************************/
-void osm_sminfo_rcv_destroy(IN osm_sminfo_rcv_t * const p_rcv)
-{
-	CL_ASSERT(p_rcv);
-
-	OSM_LOG_ENTER(p_rcv->p_log, osm_sminfo_rcv_destroy);
-
-	OSM_LOG_EXIT(p_rcv->p_log);
-}
-
-/**********************************************************************
- **********************************************************************/
-ib_api_status_t
-osm_sminfo_rcv_init(IN osm_sminfo_rcv_t * const p_rcv,
-		    IN osm_subn_t * const p_subn,
-		    IN osm_stats_t * const p_stats,
-		    IN osm_resp_t * const p_resp,
-		    IN osm_log_t * const p_log,
-		    IN osm_sm_state_mgr_t * const p_sm_state_mgr,
-		    IN cl_plock_t * const p_lock)
-{
-	ib_api_status_t status = IB_SUCCESS;
-
-	OSM_LOG_ENTER(p_log, osm_sminfo_rcv_init);
-
-	osm_sminfo_rcv_construct(p_rcv);
-
-	p_rcv->p_log = p_log;
-	p_rcv->p_subn = p_subn;
-	p_rcv->p_lock = p_lock;
-	p_rcv->p_stats = p_stats;
-	p_rcv->p_resp = p_resp;
-	p_rcv->p_sm_state_mgr = p_sm_state_mgr;
-
-	OSM_LOG_EXIT(p_rcv->p_log);
-	return (status);
-}
-
-/**********************************************************************
  Return TRUE if the remote sm given (by ib_sm_info_t) is higher,
  return FALSE otherwise.
  By higher - we mean: SM with higher priority or with same priority
  and lower GUID.
 **********************************************************************/
 static inline boolean_t
-__osm_sminfo_rcv_remote_sm_is_higher(IN const osm_sminfo_rcv_t * p_rcv,
+__osm_sminfo_rcv_remote_sm_is_higher(IN osm_sm_t * sm,
 				     IN const ib_sm_info_t * p_remote_sm)
 {
 	return (osm_sm_is_greater_than(ib_sminfo_get_priority(p_remote_sm),
 				       p_remote_sm->guid,
-				       p_rcv->p_subn->opt.sm_priority,
-				       p_rcv->p_subn->sm_port_guid));
+				       sm->p_subn->opt.sm_priority,
+				       sm->p_subn->sm_port_guid));
 }
 /**********************************************************************
  **********************************************************************/
 static void
-__osm_sminfo_rcv_process_get_request(IN const osm_sminfo_rcv_t * const p_rcv,
+__osm_sminfo_rcv_process_get_request(IN osm_sm_t * sm,
 				     IN const osm_madw_t * const p_madw)
 {
 	uint8_t payload[IB_SMP_DATA_SIZE];
@@ -140,7 +93,7 @@ __osm_sminfo_rcv_process_get_request(IN const osm_sminfo_rcv_t * const p_rcv,
 	ib_api_status_t status;
 	ib_sm_info_t *p_remote_smi;
-	OSM_LOG_ENTER(p_rcv->p_log, __osm_sminfo_rcv_process_get_request);
+	OSM_LOG_ENTER(sm->p_log, __osm_sminfo_rcv_process_get_request);
 	CL_ASSERT(p_madw);
@@ -153,31 +106,31 @@ __osm_sminfo_rcv_process_get_request(IN const osm_sminfo_rcv_t * const p_rcv,
 	CL_ASSERT(p_smp->method == IB_MAD_METHOD_GET);
-	p_smi->guid = p_rcv->p_subn->sm_port_guid;
-	p_smi->act_count = cl_hton32(p_rcv->p_stats->qp0_mads_sent);
-	p_smi->pri_state = (uint8_t) (p_rcv->p_subn->sm_state |
-				      p_rcv->p_subn->opt.sm_priority << 4);
+	p_smi->guid = sm->p_subn->sm_port_guid;
+	p_smi->act_count = cl_hton32(sm->p_subn->p_osm->stats.qp0_mads_sent);
+	p_smi->pri_state = (uint8_t) (sm->p_subn->sm_state |
+				      sm->p_subn->opt.sm_priority << 4);
 	/*
 	   p.840 line 20 - Return 0 for the SM key unless we authenticate the
 	   requester as the master SM.
 	 */
 	p_remote_smi = ib_smp_get_payload_ptr(osm_madw_get_smp_ptr(p_madw));
 	if (ib_sminfo_get_state(p_remote_smi) == IB_SMINFO_STATE_MASTER) {
-		osm_log(p_rcv->p_log, OSM_LOG_DEBUG,
+		osm_log(sm->p_log, OSM_LOG_DEBUG,
 			"__osm_sminfo_rcv_process_get_request: "
 			"Responding to master SM with real sm_key\n");
-		p_smi->sm_key = p_rcv->p_subn->opt.sm_key;
+		p_smi->sm_key = sm->p_subn->opt.sm_key;
 	} else {
 		/* The requester is not authenticated as master - set sm_key to zero. */
-		osm_log(p_rcv->p_log, OSM_LOG_DEBUG,
+		osm_log(sm->p_log, OSM_LOG_DEBUG,
 			"__osm_sminfo_rcv_process_get_request: "
 			"Responding to SM not master with zero sm_key\n");
 		p_smi->sm_key = 0;
 	}
-	status = osm_resp_send(p_rcv->p_resp, p_madw, 0, payload);
+	status = osm_resp_send(&sm->resp, p_madw, 0, payload);
 	if (status != IB_SUCCESS) {
-		osm_log(p_rcv->p_log, OSM_LOG_ERROR,
+		osm_log(sm->p_log, OSM_LOG_ERROR,
 			"__osm_sminfo_rcv_process_get_request: ERR 2F02: "
 			"Error sending response (%s)\n",
 			ib_get_err_str(status));
@@ -185,7 +138,7 @@ __osm_sminfo_rcv_process_get_request(IN const osm_sminfo_rcv_t * const p_rcv,
 	}
 Exit:
-	OSM_LOG_EXIT(p_rcv->p_log);
+	OSM_LOG_EXIT(sm->p_log);
 }
 /**********************************************************************
@@ -222,18 +175,18 @@ __osm_sminfo_rcv_check_set_req_legality(IN const ib_smp_t * const p_smp)
 /**********************************************************************
  **********************************************************************/
 static void
-__osm_sminfo_rcv_process_set_request(IN const osm_sminfo_rcv_t * const p_rcv,
+__osm_sminfo_rcv_process_set_request(IN osm_sm_t * sm,
 				     IN const osm_madw_t * const p_madw)
 {
 	uint8_t payload[IB_SMP_DATA_SIZE];
 	ib_smp_t *p_smp;
 	ib_sm_info_t *p_smi = (ib_sm_info_t *) payload;
-	ib_sm_info_t *p_rcv_smi;
+	ib_sm_info_t *sm_smi;
 	ib_api_status_t status;
 	osm_sm_signal_t sm_signal;
 	ib_sm_info_t *p_remote_smi;
-	OSM_LOG_ENTER(p_rcv->p_log, __osm_sminfo_rcv_process_set_request);
+	OSM_LOG_ENTER(sm->p_log, __osm_sminfo_rcv_process_set_request);
 	CL_ASSERT(p_madw);
@@ -243,36 +196,36 @@ __osm_sminfo_rcv_process_set_request(IN const osm_sminfo_rcv_t * const p_rcv,
 	memset(payload, 0, sizeof(payload));
 	/* get the lock */
-	CL_PLOCK_EXCL_ACQUIRE(p_rcv->p_lock);
+	CL_PLOCK_EXCL_ACQUIRE(sm->p_lock);
 	p_smp = osm_madw_get_smp_ptr(p_madw);
-	p_rcv_smi = ib_smp_get_payload_ptr(p_smp);
+	sm_smi = ib_smp_get_payload_ptr(p_smp);
 	if (p_smp->method != IB_MAD_METHOD_SET) {
-		osm_log(p_rcv->p_log, OSM_LOG_ERROR,
+		osm_log(sm->p_log, OSM_LOG_ERROR,
 			"__osm_sminfo_rcv_process_set_request: ERR 2F03: "
 			"Unsupported method 0x%X\n", p_smp->method);
-		CL_PLOCK_RELEASE(p_rcv->p_lock);
+		CL_PLOCK_RELEASE(sm->p_lock);
 		goto Exit;
 	}
-	p_smi->guid = p_rcv->p_subn->sm_port_guid;
-	p_smi->act_count = cl_hton32(p_rcv->p_stats->qp0_mads_sent);
-	p_smi->pri_state = (uint8_t) (p_rcv->p_subn->sm_state |
-				      p_rcv->p_subn->opt.sm_priority << 4);
+	p_smi->guid = sm->p_subn->sm_port_guid;
+	p_smi->act_count = cl_hton32(sm->p_subn->p_osm->stats.qp0_mads_sent);
+	p_smi->pri_state = (uint8_t) (sm->p_subn->sm_state |
+				      sm->p_subn->opt.sm_priority << 4);
 	/*
 	   p.840 line 20 - Return 0 for the SM key unless we authenticate the
 	   requester as the master SM.
 	 */
 	p_remote_smi = ib_smp_get_payload_ptr(osm_madw_get_smp_ptr(p_madw));
 	if (ib_sminfo_get_state(p_remote_smi) == IB_SMINFO_STATE_MASTER) {
-		osm_log(p_rcv->p_log, OSM_LOG_DEBUG,
+		osm_log(sm->p_log, OSM_LOG_DEBUG,
 			"__osm_sminfo_rcv_process_set_request: "
 			"Responding to master SM with real sm_key\n");
-		p_smi->sm_key = p_rcv->p_subn->opt.sm_key;
+		p_smi->sm_key = sm->p_subn->opt.sm_key;
 	} else {
 		/* The requester is not authenticated as master - set sm_key to zero.
 		 */
-		osm_log(p_rcv->p_log, OSM_LOG_DEBUG,
+		osm_log(sm->p_log, OSM_LOG_DEBUG,
 			"__osm_sminfo_rcv_process_set_request: "
 			"Responding to SM not master with zero sm_key\n");
 		p_smi->sm_key = 0;
@@ -281,20 +234,20 @@ __osm_sminfo_rcv_process_set_request(IN const osm_sminfo_rcv_t * const p_rcv,
 	/* Check the legality of the packet */
 	status = __osm_sminfo_rcv_check_set_req_legality(p_smp);
 	if (status != IB_SUCCESS) {
-		osm_log(p_rcv->p_log, OSM_LOG_ERROR,
+		osm_log(sm->p_log, OSM_LOG_ERROR,
 			"__osm_sminfo_rcv_process_set_request: ERR 2F04: "
 			"Check legality failed. AttributeModifier:0x%X RemoteState:%s\n",
 			p_smp->attr_mod,
 			osm_get_sm_mgr_state_str(ib_sminfo_get_state
-						 (p_rcv_smi)));
+						 (sm_smi)));
 		/* send a response with error code */
-		status = osm_resp_send(p_rcv->p_resp, p_madw, 7, payload);
+		status = osm_resp_send(&sm->resp, p_madw, 7, payload);
 		if (status != IB_SUCCESS)
-			osm_log(p_rcv->p_log, OSM_LOG_ERROR,
+			osm_log(sm->p_log, OSM_LOG_ERROR,
 				"__osm_sminfo_rcv_process_set_request: ERR 2F05: "
 				"Error sending response (%s)\n",
 				ib_get_err_str(status));
-		CL_PLOCK_RELEASE(p_rcv->p_lock);
+		CL_PLOCK_RELEASE(sm->p_lock);
 		goto Exit;
 	}
@@ -320,38 +273,38 @@ __osm_sminfo_rcv_process_set_request(IN const osm_sminfo_rcv_t * const p_rcv,
 		   This code shouldn't be reached - checked in the check legality
 		 */
-		osm_log(p_rcv->p_log, OSM_LOG_ERROR,
+		osm_log(sm->p_log, OSM_LOG_ERROR,
 			"__osm_sminfo_rcv_process_set_request: ERR 2F06: "
 			"THIS CODE SHOULD NOT BE REACHED!!\n");
-		CL_PLOCK_RELEASE(p_rcv->p_lock);
+		CL_PLOCK_RELEASE(sm->p_lock);
 		goto Exit;
 	}
 	/* check legality of the needed transition in the SM state machine */
-	status = osm_sm_state_mgr_check_legality(p_rcv->p_sm_state_mgr,
+	status = osm_sm_state_mgr_check_legality(&sm->sm_state_mgr,
 						 sm_signal);
 	if (status != IB_SUCCESS) {
-		osm_log(p_rcv->p_log, OSM_LOG_ERROR,
+		osm_log(sm->p_log, OSM_LOG_ERROR,
 			"__osm_sminfo_rcv_process_set_request: ERR 2F07: "
 			"Failed check of legality of needed SM transition. AttributeModifier:0x%X RemoteState:%s\n",
 			p_smp->attr_mod,
 			osm_get_sm_mgr_state_str(ib_sminfo_get_state
-						 (p_rcv_smi)));
+						 (sm_smi)));
 		/* send a response with error code */
-		status = osm_resp_send(p_rcv->p_resp, p_madw, 7, payload);
+		status = osm_resp_send(&sm->resp, p_madw, 7, payload);
 		if (status != IB_SUCCESS)
-			osm_log(p_rcv->p_log, OSM_LOG_ERROR,
+			osm_log(sm->p_log, OSM_LOG_ERROR,
 				"__osm_sminfo_rcv_process_set_request: ERR 2F08: "
 				"Error sending response (%s)\n",
 				ib_get_err_str(status));
-		CL_PLOCK_RELEASE(p_rcv->p_lock);
+		CL_PLOCK_RELEASE(sm->p_lock);
 		goto Exit;
 	}
 	/* the SubnSet(SMInfo) command is ok. Send a response. */
-	status = osm_resp_send(p_rcv->p_resp, p_madw, 0, payload);
+	status = osm_resp_send(&sm->resp, p_madw, 0, payload);
 	if (status != IB_SUCCESS)
-		osm_log(p_rcv->p_log, OSM_LOG_ERROR,
+		osm_log(sm->p_log, OSM_LOG_ERROR,
 			"__osm_sminfo_rcv_process_set_request: ERR 2F09: "
 			"Error sending response (%s)\n",
 			ib_get_err_str(status));
@@ -362,26 +315,26 @@ __osm_sminfo_rcv_process_set_request(IN const osm_sminfo_rcv_t * const p_rcv,
 	/* p_sm_state_mgr in the master_guid variable - the guid of the */
 	/* current master. */
 	if (p_smp->attr_mod == IB_SMINFO_ATTR_MOD_STANDBY) {
-		osm_log(p_rcv->p_log, OSM_LOG_VERBOSE,
+		osm_log(sm->p_log, OSM_LOG_VERBOSE,
 			"__osm_sminfo_rcv_process_set_request: "
 			"Received a STANDBY signal. Updating "
 			"sm_state_mgr master_guid: 0x%016" PRIx64 "\n",
-			cl_ntoh64(p_rcv_smi->guid));
-		p_rcv->p_sm_state_mgr->master_guid = p_rcv_smi->guid;
+			cl_ntoh64(sm_smi->guid));
+		sm->sm_state_mgr.master_guid = sm_smi->guid;
 	}
 	/* call osm_sm_state_mgr_process with the received signal.
 	 */
-	CL_PLOCK_RELEASE(p_rcv->p_lock);
-	status = osm_sm_state_mgr_process(p_rcv->p_sm_state_mgr, sm_signal);
+	CL_PLOCK_RELEASE(sm->p_lock);
+	status = osm_sm_state_mgr_process(&sm->sm_state_mgr, sm_signal);
 	if (status != IB_SUCCESS)
-		osm_log(p_rcv->p_log, OSM_LOG_ERROR,
+		osm_log(sm->p_log, OSM_LOG_ERROR,
 			"__osm_sminfo_rcv_process_set_request: ERR 2F10: "
 			"Error in SM state transition (%s)\n",
 			ib_get_err_str(status));
 Exit:
-	OSM_LOG_EXIT(p_rcv->p_log);
+	OSM_LOG_EXIT(sm->p_log);
 }
 /**********************************************************************
@@ -389,18 +342,18 @@ __osm_sminfo_rcv_process_set_request(IN const osm_sminfo_rcv_t * const p_rcv,
  * If return OSM_SIGNAL_NONE - do not call osm_sm_signal.
  **********************************************************************/
 static osm_signal_t
-__osm_sminfo_rcv_process_get_sm(IN const osm_sminfo_rcv_t * const p_rcv,
+__osm_sminfo_rcv_process_get_sm(IN osm_sm_t * sm,
 				IN const osm_remote_sm_t * const p_sm)
 {
 	const ib_sm_info_t *p_smi;
 	osm_signal_t ret_val = OSM_SIGNAL_NONE;
-	OSM_LOG_ENTER(p_rcv->p_log, __osm_sminfo_rcv_process_get_sm);
+	OSM_LOG_ENTER(sm->p_log, __osm_sminfo_rcv_process_get_sm);
 	p_smi = &p_sm->smi;
-	if (osm_log_is_active(p_rcv->p_log, OSM_LOG_VERBOSE))
-		osm_log(p_rcv->p_log, OSM_LOG_VERBOSE,
+	if (osm_log_is_active(sm->p_log, OSM_LOG_VERBOSE))
+		osm_log(sm->p_log, OSM_LOG_VERBOSE,
 			"__osm_sminfo_rcv_process_get_sm: "
 			"Detected SM 0x%016" PRIx64 " in state %u\n",
 			cl_ntoh64(p_smi->guid), ib_sminfo_get_state(p_smi));
@@ -408,7 +361,7 @@ __osm_sminfo_rcv_process_get_sm(IN const osm_sminfo_rcv_t * const p_rcv,
 	/*
 	   Check the state of this SM vs. our own.
 	 */
-	switch (p_rcv->p_subn->sm_state) {
+	switch (sm->p_subn->sm_state) {
 	case IB_SMINFO_STATE_NOTACTIVE:
 		break;
@@ -419,27 +372,27 @@ __osm_sminfo_rcv_process_get_sm(IN const osm_sminfo_rcv_t * const p_rcv,
 		case IB_SMINFO_STATE_MASTER:
 			ret_val = OSM_SIGNAL_MASTER_OR_HIGHER_SM_DETECTED;
 			/* save on the p_sm_state_mgr the guid of the current master.
 			 */
-			osm_log(p_rcv->p_log, OSM_LOG_VERBOSE,
+			osm_log(sm->p_log, OSM_LOG_VERBOSE,
 				"__osm_sminfo_rcv_process_get_sm: "
 				"Found master SM. Updating sm_state_mgr master_guid: 0x%016"
 				PRIx64 "\n", cl_ntoh64(p_sm->p_port->guid));
-			p_rcv->p_sm_state_mgr->master_guid = p_sm->p_port->guid;
+			sm->sm_state_mgr.master_guid = p_sm->p_port->guid;
 			break;
 		case IB_SMINFO_STATE_DISCOVERING:
 		case IB_SMINFO_STATE_STANDBY:
-			if (__osm_sminfo_rcv_remote_sm_is_higher(p_rcv, p_smi)
+			if (__osm_sminfo_rcv_remote_sm_is_higher(sm, p_smi)
 			    == TRUE) {
 				/* the remote is a higher sm - need to stop sweeping */
 				ret_val = OSM_SIGNAL_MASTER_OR_HIGHER_SM_DETECTED;
-				/* save on the p_sm_state_mgr the guid of the higher SM we found - */
+				/* save on the sm_state_mgr the guid of the higher SM we found - */
 				/* we will poll it - as long as it lives - we should be in Standby. */
-				osm_log(p_rcv->p_log, OSM_LOG_VERBOSE,
+				osm_log(sm->p_log, OSM_LOG_VERBOSE,
 					"__osm_sminfo_rcv_process_get_sm: "
 					"Found higher SM. Updating sm_state_mgr master_guid:"
 					" 0x%016" PRIx64 "\n",
 					cl_ntoh64(p_sm->p_port->guid));
-				p_rcv->p_sm_state_mgr->master_guid =
+				sm->sm_state_mgr.master_guid =
 				    p_sm->p_port->guid;
 			}
 			break;
@@ -455,22 +408,20 @@ __osm_sminfo_rcv_process_get_sm(IN const osm_sminfo_rcv_t * const p_rcv,
 		case IB_SMINFO_STATE_MASTER:
 			/* This means the master is alive */
 			/* Signal that to the SM state mgr */
-			osm_sm_state_mgr_signal_master_is_alive(p_rcv->
-								p_sm_state_mgr);
+			osm_sm_state_mgr_signal_master_is_alive(&sm->sm_state_mgr);
 			break;
 		case IB_SMINFO_STATE_STANDBY:
 			/* This should be the response from the sm we are polling. */
 			/* If it is - then signal master is alive */
-			if (p_rcv->p_sm_state_mgr->master_guid ==
-			    p_sm->p_port->guid) {
+			if (sm->sm_state_mgr.master_guid == p_sm->p_port->guid) {
 				/* Make sure that it is an SM with higher priority than us.
 				   If we started polling it when it was master, and it moved
 				   to standby - then it might be with a lower priority than
 				   us - and then we don't want to continue polling it. */
 				if (__osm_sminfo_rcv_remote_sm_is_higher
-				    (p_rcv, p_smi) == TRUE)
+				    (sm, p_smi) == TRUE)
 					osm_sm_state_mgr_signal_master_is_alive
-					    (p_rcv->p_sm_state_mgr);
+					    (&sm->sm_state_mgr);
 			}
 			break;
 		default:
@@ -485,9 +436,9 @@ __osm_sminfo_rcv_process_get_sm(IN const osm_sminfo_rcv_t * const p_rcv,
 			/* If this is a response due to our polling, this means that we are
 			   waiting for a handover from this SM, and it is still alive -
 			   signal that. */
-			if (p_rcv->p_sm_state_mgr->p_polling_sm != NULL) {
-				osm_sm_state_mgr_signal_master_is_alive(p_rcv->
-									p_sm_state_mgr);
+			if (sm->sm_state_mgr.p_polling_sm != NULL) {
+				osm_sm_state_mgr_signal_master_is_alive(&sm->
+									sm_state_mgr);
 			} else {
 				/* This is a response we got while sweeping the subnet.
 				   We will handle a case of handover needed later on, when the sweep
@@ -504,14 +455,14 @@ __osm_sminfo_rcv_process_get_sm(IN const osm_sminfo_rcv_t * const p_rcv,
 		break;
 	}
-	OSM_LOG_EXIT(p_rcv->p_log);
+	OSM_LOG_EXIT(sm->p_log);
 	return ret_val;
 }
 /**********************************************************************
  **********************************************************************/
 static void
-__osm_sminfo_rcv_process_get_response(IN const osm_sminfo_rcv_t * const p_rcv,
+__osm_sminfo_rcv_process_get_response(IN osm_sm_t * sm,
 				      IN const osm_madw_t * const p_madw)
 {
 	const ib_smp_t *p_smp;
@@ -522,34 +473,34 @@ __osm_sminfo_rcv_process_get_response(IN const osm_sminfo_rcv_t * const p_rcv,
 	osm_remote_sm_t *p_sm;
 	osm_signal_t process_get_sm_ret_val = OSM_SIGNAL_NONE;
-	OSM_LOG_ENTER(p_rcv->p_log, __osm_sminfo_rcv_process_get_response);
+	OSM_LOG_ENTER(sm->p_log, __osm_sminfo_rcv_process_get_response);
 	CL_ASSERT(p_madw);
 	p_smp = osm_madw_get_smp_ptr(p_madw);
 	if (p_smp->method != IB_MAD_METHOD_GET_RESP) {
-		osm_log(p_rcv->p_log, OSM_LOG_ERROR,
+		osm_log(sm->p_log, OSM_LOG_ERROR,
 			"__osm_sminfo_rcv_process_get_response: ERR 2F11: "
 			"Unsupported method 0x%X\n", p_smp->method);
 		goto Exit;
 	}
 	p_smi = ib_smp_get_payload_ptr(p_smp);
-	p_sm_tbl = &p_rcv->p_subn->sm_guid_tbl;
+	p_sm_tbl = &sm->p_subn->sm_guid_tbl;
 	port_guid = p_smi->guid;
-	osm_dump_sm_info(p_rcv->p_log, p_smi, OSM_LOG_DEBUG);
+	osm_dump_sm_info(sm->p_log, p_smi, OSM_LOG_DEBUG);
 	/* Check that the sm_key of the found SM is the same as ours,
 	   or is zero. If not - OpenSM cannot continue with configuration!. */
-	if (p_smi->sm_key != 0 && p_smi->sm_key != p_rcv->p_subn->opt.sm_key) {
-		osm_log(p_rcv->p_log, OSM_LOG_ERROR,
+	if (p_smi->sm_key != 0 && p_smi->sm_key != sm->p_subn->opt.sm_key) {
+		osm_log(sm->p_log, OSM_LOG_ERROR,
 			"__osm_sminfo_rcv_process_get_response: ERR 2F18: "
 			"Got SM with sm_key that doesn't match our "
 			"local key. Exiting\n");
-		osm_log(p_rcv->p_log, OSM_LOG_SYS,
+		osm_log(sm->p_log, OSM_LOG_SYS,
 			"Found remote SM with non-matching sm_key. Exiting\n");
 		osm_exit_flag = TRUE;
 		goto Exit;
@@ -558,18 +509,18 @@ __osm_sminfo_rcv_process_get_response(IN const osm_sminfo_rcv_t * const p_rcv,
 	/*
 	   Determine if we already have another SM object for this SM.
 	 */
-	CL_PLOCK_EXCL_ACQUIRE(p_rcv->p_lock);
+	CL_PLOCK_EXCL_ACQUIRE(sm->p_lock);
-	p_port = osm_get_port_by_guid(p_rcv->p_subn, port_guid);
+	p_port = osm_get_port_by_guid(sm->p_subn, port_guid);
 	if (!p_port) {
-		osm_log(p_rcv->p_log, OSM_LOG_ERROR,
+		osm_log(sm->p_log, OSM_LOG_ERROR,
 			"__osm_sminfo_rcv_process_get_response: ERR 2F12: "
 			"No port object for this SM\n");
 		goto _unlock_and_exit;
 	}
 	if (osm_port_get_guid(p_port) != p_smi->guid) {
-		osm_log(p_rcv->p_log, OSM_LOG_ERROR,
+		osm_log(sm->p_log, OSM_LOG_ERROR,
 			"__osm_sminfo_rcv_process_get_response: ERR 2F13: "
 			"Bogus SM port GUID"
 			"\n\t\t\t\tExpected 0x%016" PRIx64
@@ -579,8 +530,8 @@ __osm_sminfo_rcv_process_get_response(IN const osm_sminfo_rcv_t * const p_rcv,
 		goto _unlock_and_exit;
 	}
-	if (port_guid == p_rcv->p_subn->sm_port_guid) {
-		osm_log(p_rcv->p_log, OSM_LOG_VERBOSE,
+	if (port_guid == sm->p_subn->sm_port_guid) {
+		osm_log(sm->p_log, OSM_LOG_VERBOSE,
 			"__osm_sminfo_rcv_process_get_response: "
 			"Self query response received - SM port 0x%016" PRIx64
 			"\n", cl_ntoh64(port_guid));
@@ -591,7 +542,7 @@ __osm_sminfo_rcv_process_get_response(IN const osm_sminfo_rcv_t * const p_rcv,
 	if (p_sm == (osm_remote_sm_t *) cl_qmap_end(p_sm_tbl)) {
 		p_sm = malloc(sizeof(*p_sm));
 		if (p_sm == NULL) {
-			osm_log(p_rcv->p_log, OSM_LOG_ERROR,
+			osm_log(sm->p_log, OSM_LOG_ERROR,
 				"__osm_sminfo_rcv_process_get_response: ERR 2F14: "
 				"Unable to allocate SM object\n");
 			goto _unlock_and_exit;
@@ -607,49 +558,49 @@ __osm_sminfo_rcv_process_get_response(IN const osm_sminfo_rcv_t * const p_rcv,
 	 */
 	p_sm->smi = *p_smi;
-	process_get_sm_ret_val = __osm_sminfo_rcv_process_get_sm(p_rcv, p_sm);
+	process_get_sm_ret_val = __osm_sminfo_rcv_process_get_sm(sm, p_sm);
 _unlock_and_exit:
-	CL_PLOCK_RELEASE(p_rcv->p_lock);
+	CL_PLOCK_RELEASE(sm->p_lock);
 	/* If process_get_sm_ret_val != OSM_SIGNAL_NONE then we have to signal
 	 * to the SM with that signal.
 	 */
 	if (process_get_sm_ret_val != OSM_SIGNAL_NONE)
-		osm_sm_signal(&p_rcv->p_subn->p_osm->sm,
+		osm_sm_signal(&sm->p_subn->p_osm->sm,
 			      process_get_sm_ret_val);
 Exit:
-	OSM_LOG_EXIT(p_rcv->p_log);
+	OSM_LOG_EXIT(sm->p_log);
 }
 /**********************************************************************
  **********************************************************************/
 static void
-__osm_sminfo_rcv_process_set_response(IN const osm_sminfo_rcv_t * const p_rcv,
+__osm_sminfo_rcv_process_set_response(IN osm_sm_t * sm,
 				      IN const osm_madw_t * const p_madw)
 {
 	const ib_smp_t *p_smp;
 	const ib_sm_info_t *p_smi;
-	OSM_LOG_ENTER(p_rcv->p_log, __osm_sminfo_rcv_process_set_response);
+	OSM_LOG_ENTER(sm->p_log, __osm_sminfo_rcv_process_set_response);
 	CL_ASSERT(p_madw);
 	p_smp = osm_madw_get_smp_ptr(p_madw);
 	if (p_smp->method != IB_MAD_METHOD_GET_RESP) {
-		osm_log(p_rcv->p_log, OSM_LOG_ERROR,
+		osm_log(sm->p_log, OSM_LOG_ERROR,
 			"__osm_sminfo_rcv_process_set_response: ERR 2F16: "
 			"Unsupported method 0x%X\n", p_smp->method);
 		goto Exit;
 	}
 	p_smi = ib_smp_get_payload_ptr(p_smp);
-	osm_dump_sm_info(p_rcv->p_log, p_smi, OSM_LOG_DEBUG);
+	osm_dump_sm_info(sm->p_log, p_smi, OSM_LOG_DEBUG);
 	/* Check the AttributeModifier */
 	if (p_smp->attr_mod != IB_SMINFO_ATTR_MOD_HANDOVER) {
-		osm_log(p_rcv->p_log, OSM_LOG_ERROR,
+		osm_log(sm->p_log, OSM_LOG_ERROR,
 			"__osm_sminfo_rcv_process_set_response: ERR 2F17: "
 			"Unsupported attribute modifier 0x%X\n",
 			p_smp->attr_mod);
@@ -662,19 +613,19 @@ __osm_sminfo_rcv_process_set_response(IN const osm_sminfo_rcv_t * const p_rcv,
 	 */
 Exit:
-	OSM_LOG_EXIT(p_rcv->p_log);
+	OSM_LOG_EXIT(sm->p_log);
 }
 /**********************************************************************
  **********************************************************************/
 void osm_sminfo_rcv_process(IN void *context, IN void *data)
 {
-	osm_sminfo_rcv_t *p_rcv = context;
+	osm_sm_t *sm = context;
 	osm_madw_t *p_madw = data;
 	ib_smp_t *p_smp;
 	osm_smi_context_t *p_smi_context;
-	OSM_LOG_ENTER(p_rcv->p_log, osm_sminfo_rcv_process);
+	OSM_LOG_ENTER(sm->p_log, osm_sminfo_rcv_process);
 	CL_ASSERT(p_madw);
@@ -696,7 +647,7 @@ void osm_sminfo_rcv_process(IN void *context, IN void *data)
 		   moving issue.
 		 */
 		if (p_smi_context->port_guid != p_smi->guid) {
-			osm_log(p_rcv->p_log, OSM_LOG_ERROR,
+			osm_log(sm->p_log, OSM_LOG_ERROR,
 				"osm_sminfo_rcv_process: ERR 2F19: "
 				"Unexpected SM port GUID in response"
 				"\n\t\t\t\tExpected 0x%016" PRIx64
@@ -708,18 +659,18 @@ void osm_sminfo_rcv_process(IN void *context, IN void *data)
 		if (p_smi_context->set_method == FALSE)
 			/* this is a response to a Get method */
-			__osm_sminfo_rcv_process_get_response(p_rcv, p_madw);
+			__osm_sminfo_rcv_process_get_response(sm, p_madw);
 		else
 			/* this is a response to a Set method */
-			__osm_sminfo_rcv_process_set_response(p_rcv, p_madw);
+			__osm_sminfo_rcv_process_set_response(sm, p_madw);
 	} else if (p_smp->method == IB_MAD_METHOD_GET)
 		/* This is a request */
 		/* This is a SubnGet request */
-		__osm_sminfo_rcv_process_get_request(p_rcv, p_madw);
+		__osm_sminfo_rcv_process_get_request(sm, p_madw);
 	else
 		/* This should be a SubnSet request */
-		__osm_sminfo_rcv_process_set_request(p_rcv, p_madw);
+		__osm_sminfo_rcv_process_set_request(sm, p_madw);
 Exit:
-	OSM_LOG_EXIT(p_rcv->p_log);
+	OSM_LOG_EXIT(sm->p_log);
 }
diff --git a/opensm/opensm/osm_sw_info_rcv.c b/opensm/opensm/osm_sw_info_rcv.c
index 55c43e6..d9bd21b 100644
--- a/opensm/opensm/osm_sw_info_rcv.c
+++ b/opensm/opensm/osm_sw_info_rcv.c
@@ -54,7 +54,6 @@
 #include
 #include
 #include
-#include
 #include
 #include
 #include
@@ -65,7 +64,7 @@
  The plock must be held before calling this function.
 **********************************************************************/
 static void
-__osm_si_rcv_get_port_info(IN const osm_si_rcv_t * const p_rcv,
+__osm_si_rcv_get_port_info(IN osm_sm_t * sm,
 			   IN osm_switch_t * const p_sw,
 			   IN const osm_madw_t * const p_madw)
 {
@@ -78,7 +77,7 @@ __osm_si_rcv_get_port_info(IN const osm_si_rcv_t * const p_rcv,
 	const ib_smp_t *p_smp;
 	ib_api_status_t status = IB_SUCCESS;
-	OSM_LOG_ENTER(p_rcv->p_log, __osm_si_rcv_get_port_info);
+	OSM_LOG_ENTER(sm->p_log, __osm_si_rcv_get_port_info);
 	CL_ASSERT(p_sw);
@@ -112,21 +111,21 @@ __osm_si_rcv_get_port_info(IN const osm_si_rcv_t * const p_rcv,
 			 p_smp->hop_count, p_smp->initial_path);
 	for (port_num = 0; port_num < num_ports; port_num++) {
-		status = osm_req_get(p_rcv->p_req,
+		status = osm_req_get(&sm->req,
 				     &dr_path,
 				     IB_MAD_ATTR_PORT_INFO,
 				     cl_hton32(port_num),
 				     CL_DISP_MSGID_NONE, &context);
 		if (status != IB_SUCCESS) {
 			/* continue the loop despite the error */
-			osm_log(p_rcv->p_log, OSM_LOG_ERROR,
+			osm_log(sm->p_log, OSM_LOG_ERROR,
 				"__osm_si_rcv_get_port_info: ERR 3602: "
 				"Failure initiating PortInfo request (%s)\n",
 				ib_get_err_str(status));
 		}
 	}
-	OSM_LOG_EXIT(p_rcv->p_log);
+	OSM_LOG_EXIT(sm->p_log);
 }
 #if 0
@@ -134,7 +133,7 @@ __osm_si_rcv_get_port_info(IN const osm_si_rcv_t * const p_rcv,
  The plock must be held before calling this function.
 **********************************************************************/
 static void
-__osm_si_rcv_get_fwd_tbl(IN const osm_si_rcv_t * const p_rcv,
+__osm_si_rcv_get_fwd_tbl(IN osm_sm_t * sm,
 			 IN osm_switch_t * const p_sw)
 {
 	osm_madw_context_t context;
@@ -145,7 +144,7 @@ __osm_si_rcv_get_fwd_tbl(IN const osm_si_rcv_t * const p_rcv,
 	uint32_t max_block_id_ho;
 	ib_api_status_t status = IB_SUCCESS;
-	OSM_LOG_ENTER(p_rcv->p_log, __osm_si_rcv_get_fwd_tbl);
+	OSM_LOG_ENTER(sm->p_log, __osm_si_rcv_get_fwd_tbl);
 	CL_ASSERT(p_sw);
@@ -165,34 +164,34 @@ __osm_si_rcv_get_fwd_tbl(IN const osm_si_rcv_t * const p_rcv,
 	p_dr_path = osm_physp_get_dr_path_ptr(p_physp);
 	for (block_id_ho = 0; block_id_ho <= max_block_id_ho; block_id_ho++) {
-		if (osm_log_is_active(p_rcv->p_log, OSM_LOG_DEBUG)) {
-			osm_log(p_rcv->p_log, OSM_LOG_DEBUG,
+		if (osm_log_is_active(sm->p_log, OSM_LOG_DEBUG)) {
+			osm_log(sm->p_log, OSM_LOG_DEBUG,
 				"__osm_si_rcv_get_fwd_tbl: "
 				"Retrieving FT block %u\n", block_id_ho);
 		}
-		status = osm_req_get(p_rcv->p_req,
+		status = osm_req_get(&sm->req,
 				     p_dr_path,
 				     IB_MAD_ATTR_LIN_FWD_TBL,
 				     cl_hton32(block_id_ho),
 				     CL_DISP_MSGID_NONE, &context);
 		if (status != IB_SUCCESS) {
 			/* continue the loop despite the error */
-			osm_log(p_rcv->p_log, OSM_LOG_ERROR,
+			osm_log(sm->p_log, OSM_LOG_ERROR,
 				"__osm_si_rcv_get_fwd_tbl: ERR 3603: "
 				"Failure initiating PortInfo request (%s)\n",
 				ib_get_err_str(status));
 		}
 	}
-	OSM_LOG_EXIT(p_rcv->p_log);
+	OSM_LOG_EXIT(sm->p_log);
 }
 /**********************************************************************
  The plock must be held before calling this function.
**********************************************************************/ static void -__osm_si_rcv_get_mcast_fwd_tbl(IN const osm_si_rcv_t * const p_rcv, +__osm_si_rcv_get_mcast_fwd_tbl(IN osm_sm_t * sm, IN osm_switch_t * const p_sw) { osm_madw_context_t context; @@ -207,7 +206,7 @@ __osm_si_rcv_get_mcast_fwd_tbl(IN const osm_si_rcv_t * const p_rcv, uint32_t attr_mod_ho; ib_api_status_t status = IB_SUCCESS; - OSM_LOG_ENTER(p_rcv->p_log, __osm_si_rcv_get_mcast_fwd_tbl); + OSM_LOG_ENTER(sm->p_log, __osm_si_rcv_get_mcast_fwd_tbl); CL_ASSERT(p_sw); @@ -216,7 +215,7 @@ __osm_si_rcv_get_mcast_fwd_tbl(IN const osm_si_rcv_t * const p_rcv, CL_ASSERT(osm_node_get_type(p_node) == IB_NODE_TYPE_SWITCH); if (osm_switch_get_mcast_fwd_tbl_size(p_sw) == 0) { - osm_log(p_rcv->p_log, OSM_LOG_DEBUG, + osm_log(sm->p_log, OSM_LOG_DEBUG, "__osm_si_rcv_get_mcast_fwd_tbl: " "Multicast not supported by switch 0x%016" PRIx64 "\n", cl_ntoh64(osm_node_get_node_guid(p_node))); @@ -234,7 +233,7 @@ __osm_si_rcv_get_mcast_fwd_tbl(IN const osm_si_rcv_t * const p_rcv, max_block_id_ho = osm_mcast_tbl_get_max_block(p_tbl); if (max_block_id_ho > IB_MCAST_MAX_BLOCK_ID) { - osm_log(p_rcv->p_log, OSM_LOG_ERROR, + osm_log(sm->p_log, OSM_LOG_ERROR, "__osm_si_rcv_get_mcast_fwd_tbl: ERR 3609: " "Out-of-range mcast block size = %u on switch 0x%016" PRIx64 "\n", max_block_id_ho, @@ -246,7 +245,7 @@ __osm_si_rcv_get_mcast_fwd_tbl(IN const osm_si_rcv_t * const p_rcv, CL_ASSERT(max_position <= IB_MCAST_POSITION_MAX); - osm_log(p_rcv->p_log, OSM_LOG_DEBUG, + osm_log(sm->p_log, OSM_LOG_DEBUG, "__osm_si_rcv_get_mcast_fwd_tbl: " "Max MFT block = %u, Max position = %u\n", max_block_id_ho, max_position); @@ -254,15 +253,15 @@ __osm_si_rcv_get_mcast_fwd_tbl(IN const osm_si_rcv_t * const p_rcv, p_dr_path = osm_physp_get_dr_path_ptr(p_physp); for (block_id_ho = 0; block_id_ho <= max_block_id_ho; block_id_ho++) { - if (osm_log_is_active(p_rcv->p_log, OSM_LOG_DEBUG)) { - osm_log(p_rcv->p_log, OSM_LOG_DEBUG, + if 
(osm_log_is_active(sm->p_log, OSM_LOG_DEBUG)) { + osm_log(sm->p_log, OSM_LOG_DEBUG, "__osm_si_rcv_get_mcast_fwd_tbl: " "Retrieving MFT block %u\n", block_id_ho); } for (position = 0; position <= max_position; position++) { - if (osm_log_is_active(p_rcv->p_log, OSM_LOG_DEBUG)) { - osm_log(p_rcv->p_log, OSM_LOG_DEBUG, + if (osm_log_is_active(sm->p_log, OSM_LOG_DEBUG)) { + osm_log(sm->p_log, OSM_LOG_DEBUG, "__osm_si_rcv_get_mcast_fwd_tbl: " "Retrieving MFT position %u\n", position); @@ -271,13 +270,13 @@ __osm_si_rcv_get_mcast_fwd_tbl(IN const osm_si_rcv_t * const p_rcv, attr_mod_ho = block_id_ho | position << IB_MCAST_POSITION_SHIFT; status = - osm_req_get(p_rcv->p_req, p_dr_path, + osm_req_get(&sm->req, p_dr_path, IB_MAD_ATTR_MCAST_FWD_TBL, cl_hton32(attr_mod_ho), CL_DISP_MSGID_NONE, &context); if (status != IB_SUCCESS) { /* continue the loop despite the error */ - osm_log(p_rcv->p_log, OSM_LOG_ERROR, + osm_log(sm->p_log, OSM_LOG_ERROR, "__osm_si_rcv_get_mcast_fwd_tbl: ERR 3607: " "Failure initiating PortInfo request (%s)\n", ib_get_err_str(status)); @@ -286,7 +285,7 @@ __osm_si_rcv_get_mcast_fwd_tbl(IN const osm_si_rcv_t * const p_rcv, } Exit: - OSM_LOG_EXIT(p_rcv->p_log); + OSM_LOG_EXIT(sm->p_log); } #endif @@ -294,7 +293,7 @@ __osm_si_rcv_get_mcast_fwd_tbl(IN const osm_si_rcv_t * const p_rcv, Lock must be held on entry to this function. 
**********************************************************************/ static void -__osm_si_rcv_process_new(IN const osm_si_rcv_t * const p_rcv, +__osm_si_rcv_process_new(IN osm_sm_t * sm, IN osm_node_t * const p_node, IN const osm_madw_t * const p_madw) { @@ -304,18 +303,18 @@ __osm_si_rcv_process_new(IN const osm_si_rcv_t * const p_rcv, ib_smp_t *p_smp; cl_qmap_t *p_sw_guid_tbl; - CL_ASSERT(p_rcv); + CL_ASSERT(sm); - OSM_LOG_ENTER(p_rcv->p_log, __osm_si_rcv_process_new); + OSM_LOG_ENTER(sm->p_log, __osm_si_rcv_process_new); CL_ASSERT(p_madw); - p_sw_guid_tbl = &p_rcv->p_subn->sw_guid_tbl; + p_sw_guid_tbl = &sm->p_subn->sw_guid_tbl; p_smp = osm_madw_get_smp_ptr(p_madw); p_si = (ib_switch_info_t *) ib_smp_get_payload_ptr(p_smp); - osm_dump_switch_info(p_rcv->p_log, p_si, OSM_LOG_DEBUG); + osm_dump_switch_info(sm->p_log, p_si, OSM_LOG_DEBUG); /* Allocate a new switch object for this switch, @@ -323,30 +322,30 @@ __osm_si_rcv_process_new(IN const osm_si_rcv_t * const p_rcv, */ p_sw = osm_switch_new(p_node, p_madw); if (p_sw == NULL) { - osm_log(p_rcv->p_log, OSM_LOG_ERROR, + osm_log(sm->p_log, OSM_LOG_ERROR, "__osm_si_rcv_process_new: ERR 3608: " "Unable to allocate new switch object\n"); goto Exit; } /* set subnet max mlid to the minimum MulticastFDBCap of all switches */ - if (p_sw->mcast_tbl.max_mlid_ho < p_rcv->p_subn->max_multicast_lid_ho) { - p_rcv->p_subn->max_multicast_lid_ho = + if (p_sw->mcast_tbl.max_mlid_ho < sm->p_subn->max_multicast_lid_ho) { + sm->p_subn->max_multicast_lid_ho = p_sw->mcast_tbl.max_mlid_ho; - osm_log(p_rcv->p_log, OSM_LOG_VERBOSE, + osm_log(sm->p_log, OSM_LOG_VERBOSE, "__osm_si_rcv_process_new: " "Subnet max multicast lid is 0x%X\n", - p_rcv->p_subn->max_multicast_lid_ho); + sm->p_subn->max_multicast_lid_ho); } /* set subnet max unicast lid to the minimum LinearFDBCap of all switches */ - if (p_sw->fwd_tbl.p_lin_tbl->size < p_rcv->p_subn->max_unicast_lid_ho) { - p_rcv->p_subn->max_unicast_lid_ho = + if (p_sw->fwd_tbl.p_lin_tbl->size < 
sm->p_subn->max_unicast_lid_ho) { + sm->p_subn->max_unicast_lid_ho = p_sw->fwd_tbl.p_lin_tbl->size; - osm_log(p_rcv->p_log, OSM_LOG_VERBOSE, + osm_log(sm->p_log, OSM_LOG_VERBOSE, "__osm_si_rcv_process_new: " "Subnet max unicast lid is 0x%X\n", - p_rcv->p_subn->max_unicast_lid_ho); + sm->p_subn->max_unicast_lid_ho); } p_check = (osm_switch_t *) cl_qmap_insert(p_sw_guid_tbl, @@ -357,7 +356,7 @@ __osm_si_rcv_process_new(IN const osm_si_rcv_t * const p_rcv, /* This shouldn't happen since we hold the lock! */ - osm_log(p_rcv->p_log, OSM_LOG_ERROR, + osm_log(sm->p_log, OSM_LOG_ERROR, "__osm_si_rcv_process_new: ERR 3605: " "Unable to add new switch object to database\n"); osm_switch_delete(&p_sw); @@ -376,7 +375,7 @@ __osm_si_rcv_process_new(IN const osm_si_rcv_t * const p_rcv, /* Get the PortInfo attribute for every port. */ - __osm_si_rcv_get_port_info(p_rcv, p_sw, p_madw); + __osm_si_rcv_get_port_info(sm, p_sw, p_madw); /* Don't bother retrieving the current unicast and multicast tables @@ -390,13 +389,13 @@ __osm_si_rcv_process_new(IN const osm_si_rcv_t * const p_rcv, The code to retrieve the tables was fully debugged. */ #if 0 - __osm_si_rcv_get_fwd_tbl(p_rcv, p_sw); - if (!p_rcv->p_subn->opt.disable_multicast) - __osm_si_rcv_get_mcast_fwd_tbl(p_rcv, p_sw); + __osm_si_rcv_get_fwd_tbl(sm, p_sw); + if (!sm->p_subn->opt.disable_multicast) + __osm_si_rcv_get_mcast_fwd_tbl(sm, p_sw); #endif Exit: - OSM_LOG_EXIT(p_rcv->p_log); + OSM_LOG_EXIT(sm->p_log); } /********************************************************************** @@ -405,7 +404,7 @@ __osm_si_rcv_process_new(IN const osm_si_rcv_t * const p_rcv, this can not be done internally as the event needs the lock... 
**********************************************************************/ static boolean_t -__osm_si_rcv_process_existing(IN const osm_si_rcv_t * const p_rcv, +__osm_si_rcv_process_existing(IN osm_sm_t * sm, IN osm_node_t * const p_node, IN const osm_madw_t * const p_madw) { @@ -415,7 +414,7 @@ __osm_si_rcv_process_existing(IN const osm_si_rcv_t * const p_rcv, ib_smp_t *p_smp; boolean_t is_change_detected = FALSE; - OSM_LOG_ENTER(p_rcv->p_log, __osm_si_rcv_process_existing); + OSM_LOG_ENTER(sm->p_log, __osm_si_rcv_process_existing); CL_ASSERT(p_madw); @@ -424,16 +423,16 @@ __osm_si_rcv_process_existing(IN const osm_si_rcv_t * const p_rcv, p_si_context = osm_madw_get_si_context_ptr(p_madw); if (p_si_context->set_method) { - if (osm_log_is_active(p_rcv->p_log, OSM_LOG_DEBUG)) { - osm_log(p_rcv->p_log, OSM_LOG_DEBUG, + if (osm_log_is_active(sm->p_log, OSM_LOG_DEBUG)) { + osm_log(sm->p_log, OSM_LOG_DEBUG, "__osm_si_rcv_process_existing: " "Received logical SetResp()\n"); } osm_switch_set_switch_info(p_sw, p_si); } else { - if (osm_log_is_active(p_rcv->p_log, OSM_LOG_DEBUG)) { - osm_log(p_rcv->p_log, OSM_LOG_DEBUG, + if (osm_log_is_active(sm->p_log, OSM_LOG_DEBUG)) { + osm_log(sm->p_log, OSM_LOG_DEBUG, "__osm_si_rcv_process_existing: " "Received logical GetResp()\n"); } @@ -449,7 +448,7 @@ __osm_si_rcv_process_existing(IN const osm_si_rcv_t * const p_rcv, /* If the mad was returned with an error - signal a change to the state manager. */ if (ib_smp_get_status(p_smp) != 0) { - osm_log(p_rcv->p_log, OSM_LOG_VERBOSE, + osm_log(sm->p_log, OSM_LOG_VERBOSE, "__osm_si_rcv_process_existing: " "GetResp() received with error in light sweep. " "Commencing heavy sweep\n"); @@ -461,7 +460,7 @@ __osm_si_rcv_process_existing(IN const osm_si_rcv_t * const p_rcv, a light sweep. 
*/ if (ib_switch_info_get_state_change(p_si)) { - osm_dump_switch_info(p_rcv->p_log, p_si, + osm_dump_switch_info(sm->p_log, p_si, OSM_LOG_DEBUG); is_change_detected = TRUE; } @@ -472,16 +471,16 @@ __osm_si_rcv_process_existing(IN const osm_si_rcv_t * const p_rcv, of the state change bit. */ p_sw->discovery_count++; - osm_log(p_rcv->p_log, OSM_LOG_VERBOSE, + osm_log(sm->p_log, OSM_LOG_VERBOSE, "__osm_si_rcv_process_existing: " "discovery_count is:%u\n", p_sw->discovery_count); /* If this is the first discovery - then get the port_info */ if (p_sw->discovery_count == 1) - __osm_si_rcv_get_port_info(p_rcv, p_sw, p_madw); + __osm_si_rcv_get_port_info(sm, p_sw, p_madw); else { - osm_log(p_rcv->p_log, OSM_LOG_DEBUG, + osm_log(sm->p_log, OSM_LOG_DEBUG, "__osm_si_rcv_process_existing: " "Not discovering again through switch:0x%" PRIx64 "\n", @@ -490,55 +489,15 @@ __osm_si_rcv_process_existing(IN const osm_si_rcv_t * const p_rcv, } } - OSM_LOG_EXIT(p_rcv->p_log); + OSM_LOG_EXIT(sm->p_log); return is_change_detected; } /********************************************************************** **********************************************************************/ -void osm_si_rcv_construct(IN osm_si_rcv_t * const p_rcv) -{ - memset(p_rcv, 0, sizeof(*p_rcv)); -} - -/********************************************************************** - **********************************************************************/ -void osm_si_rcv_destroy(IN osm_si_rcv_t * const p_rcv) -{ - CL_ASSERT(p_rcv); - - OSM_LOG_ENTER(p_rcv->p_log, osm_si_rcv_destroy); - - OSM_LOG_EXIT(p_rcv->p_log); -} - -/********************************************************************** - **********************************************************************/ -ib_api_status_t -osm_si_rcv_init(IN osm_si_rcv_t * const p_rcv, - IN osm_subn_t * const p_subn, - IN osm_log_t * const p_log, - IN osm_req_t * const p_req, IN cl_plock_t * const p_lock) -{ - ib_api_status_t status = IB_SUCCESS; - OSM_LOG_ENTER(p_log, 
osm_si_rcv_init); - - osm_si_rcv_construct(p_rcv); - - p_rcv->p_log = p_log; - p_rcv->p_subn = p_subn; - p_rcv->p_lock = p_lock; - p_rcv->p_req = p_req; - - OSM_LOG_EXIT(p_rcv->p_log); - return (status); -} - -/********************************************************************** - **********************************************************************/ void osm_si_rcv_process(IN void *context, IN void *data) { - osm_si_rcv_t *p_rcv = context; + osm_sm_t *sm = context; osm_madw_t *p_madw = data; ib_switch_info_t *p_si; ib_smp_t *p_smp; @@ -546,9 +505,9 @@ void osm_si_rcv_process(IN void *context, IN void *data) ib_net64_t node_guid; osm_si_context_t *p_context; - CL_ASSERT(p_rcv); + CL_ASSERT(sm); - OSM_LOG_ENTER(p_rcv->p_log, osm_si_rcv_process); + OSM_LOG_ENTER(sm->p_log, osm_si_rcv_process); CL_ASSERT(p_madw); @@ -563,19 +522,19 @@ void osm_si_rcv_process(IN void *context, IN void *data) node_guid = p_context->node_guid; - if (osm_log_is_active(p_rcv->p_log, OSM_LOG_DEBUG)) { - osm_log(p_rcv->p_log, OSM_LOG_DEBUG, + if (osm_log_is_active(sm->p_log, OSM_LOG_DEBUG)) { + osm_log(sm->p_log, OSM_LOG_DEBUG, "osm_si_rcv_process: " "Switch GUID 0x%016" PRIx64 ", TID 0x%" PRIx64 "\n", cl_ntoh64(node_guid), cl_ntoh64(p_smp->trans_id)); } - CL_PLOCK_EXCL_ACQUIRE(p_rcv->p_lock); + CL_PLOCK_EXCL_ACQUIRE(sm->p_lock); - p_node = osm_get_node_by_guid(p_rcv->p_subn, node_guid); + p_node = osm_get_node_by_guid(sm->p_subn, node_guid); if (!p_node) { - osm_log(p_rcv->p_log, OSM_LOG_ERROR, + osm_log(sm->p_log, OSM_LOG_ERROR, "osm_si_rcv_process: ERR 3606: " "SwitchInfo received for nonexistent node " "with GUID 0x%" PRIx64 "\n", cl_ntoh64(node_guid)); @@ -585,7 +544,7 @@ void osm_si_rcv_process(IN void *context, IN void *data) Hack for bad value in Mellanox switch */ if (cl_ntoh16(p_si->lin_top) > IB_LID_UCAST_END_HO) { - osm_log(p_rcv->p_log, OSM_LOG_ERROR, + osm_log(sm->p_log, OSM_LOG_ERROR, "osm_si_rcv_process: ERR 3610: " "\n\t\t\t\tBad LinearFDBTop value = 0x%X " "on switch 
0x%" PRIx64 @@ -600,25 +559,25 @@ void osm_si_rcv_process(IN void *context, IN void *data) Acquire the switch object for this switch. */ if (!p_node->sw) { - __osm_si_rcv_process_new(p_rcv, p_node, p_madw); + __osm_si_rcv_process_new(sm, p_node, p_madw); /* A new switch was found during the sweep so we need to ignore the current LFT settings. */ - p_rcv->p_subn->ignore_existing_lfts = TRUE; + sm->p_subn->ignore_existing_lfts = TRUE; } else { /* we might get back a request for signaling change was detected */ if (__osm_si_rcv_process_existing - (p_rcv, p_node, p_madw)) { - CL_PLOCK_RELEASE(p_rcv->p_lock); - osm_sm_signal(&p_rcv->p_subn->p_osm->sm, + (sm, p_node, p_madw)) { + CL_PLOCK_RELEASE(sm->p_lock); + osm_sm_signal(&sm->p_subn->p_osm->sm, OSM_SIGNAL_CHANGE_DETECTED); goto Exit; } } } - CL_PLOCK_RELEASE(p_rcv->p_lock); + CL_PLOCK_RELEASE(sm->p_lock); Exit: - OSM_LOG_EXIT(p_rcv->p_log); + OSM_LOG_EXIT(sm->p_log); } diff --git a/opensm/opensm/osm_trap_rcv.c b/opensm/opensm/osm_trap_rcv.c index ae11323..196bca2 100644 --- a/opensm/opensm/osm_trap_rcv.c +++ b/opensm/opensm/osm_trap_rcv.c @@ -53,7 +53,6 @@ #include #include #include -#include #include #include #include @@ -91,10 +90,10 @@ typedef struct _osm_trap_aging_tracker_context { /********************************************************************** **********************************************************************/ -static osm_physp_t *__get_physp_by_lid_and_num(IN osm_trap_rcv_t * const p_rcv, +static osm_physp_t *__get_physp_by_lid_and_num(IN osm_sm_t * sm, IN uint16_t lid, IN uint8_t num) { - cl_ptr_vector_t *p_vec = &(p_rcv->p_subn->port_lid_tbl); + cl_ptr_vector_t *p_vec = &(sm->p_subn->port_lid_tbl); osm_port_t *p_port; osm_physp_t *p_physp; @@ -119,12 +118,12 @@ uint64_t osm_trap_rcv_aging_tracker_callback(IN uint64_t key, IN uint32_t num_regs, IN void *context) { - osm_trap_rcv_t *p_rcv = (osm_trap_rcv_t *) context; + osm_sm_t *sm = context; uint16_t lid; uint8_t port_num; osm_physp_t *p_physp; - 
OSM_LOG_ENTER(p_rcv->p_log, osm_trap_rcv_aging_tracker_callback); + OSM_LOG_ENTER(sm->p_log, osm_trap_rcv_aging_tracker_callback); if (osm_exit_flag) /* We got an exit flag - do nothing */ @@ -133,16 +132,16 @@ osm_trap_rcv_aging_tracker_callback(IN uint64_t key, lid = cl_ntoh16((uint16_t) ((key & 0x0000FFFF00000000ULL) >> 32)); port_num = (uint8_t) ((key & 0x00FF000000000000ULL) >> 48); - p_physp = __get_physp_by_lid_and_num(p_rcv, lid, port_num); + p_physp = __get_physp_by_lid_and_num(sm, lid, port_num); if (!p_physp) - osm_log(p_rcv->p_log, OSM_LOG_VERBOSE, + osm_log(sm->p_log, OSM_LOG_VERBOSE, "osm_trap_rcv_aging_tracker_callback: " "Cannot find port num:0x%X with lid:%u\n", port_num, lid); /* make sure the physp is still valid */ /* If the health port was false - set it to true */ else if (osm_physp_is_valid(p_physp) && !osm_physp_is_healthy(p_physp)) { - osm_log(p_rcv->p_log, OSM_LOG_VERBOSE, + osm_log(sm->p_log, OSM_LOG_VERBOSE, "osm_trap_rcv_aging_tracker_callback: " "Clearing health bit of port num:%u with lid:%u\n", port_num, lid); @@ -151,7 +150,7 @@ osm_trap_rcv_aging_tracker_callback(IN uint64_t key, osm_physp_set_health(p_physp, TRUE); } - OSM_LOG_EXIT(p_rcv->p_log); + OSM_LOG_EXIT(sm->p_log); /* We want to remove the event from the tracker - so need to return zero. 
*/ @@ -159,59 +158,6 @@ osm_trap_rcv_aging_tracker_callback(IN uint64_t key, } /********************************************************************** - **********************************************************************/ -void osm_trap_rcv_construct(IN osm_trap_rcv_t * const p_rcv) -{ - memset(p_rcv, 0, sizeof(*p_rcv)); - cl_event_wheel_construct(&p_rcv->trap_aging_tracker); -} - -/********************************************************************** - **********************************************************************/ -void osm_trap_rcv_destroy(IN osm_trap_rcv_t * const p_rcv) -{ - CL_ASSERT(p_rcv); - - OSM_LOG_ENTER(p_rcv->p_log, osm_trap_rcv_destroy); - - cl_event_wheel_destroy(&p_rcv->trap_aging_tracker); - - OSM_LOG_EXIT(p_rcv->p_log); -} - -/********************************************************************** - **********************************************************************/ -ib_api_status_t -osm_trap_rcv_init(IN osm_trap_rcv_t * const p_rcv, - IN osm_subn_t * const p_subn, - IN osm_stats_t * const p_stats, - IN osm_resp_t * const p_resp, - IN osm_log_t * const p_log, IN cl_plock_t * const p_lock) -{ - ib_api_status_t status = IB_SUCCESS; - - OSM_LOG_ENTER(p_log, osm_trap_rcv_init); - - osm_trap_rcv_construct(p_rcv); - - p_rcv->p_log = p_log; - p_rcv->p_subn = p_subn; - p_rcv->p_lock = p_lock; - p_rcv->p_stats = p_stats; - p_rcv->p_resp = p_resp; - - if (cl_event_wheel_init(&p_rcv->trap_aging_tracker)) { - osm_log(p_log, OSM_LOG_ERROR, - "osm_trap_rcv_init: ERR 3800: " - "Failed to initialize cl_event_wheel\n"); - status = IB_NOT_DONE; - } - - OSM_LOG_EXIT(p_rcv->p_log); - return (status); -} - -/********************************************************************** * CRC calculation for notice identification **********************************************************************/ @@ -297,7 +243,7 @@ static int __print_num_received(IN uint32_t num_received) /********************************************************************** 
**********************************************************************/ static void -__osm_trap_rcv_process_request(IN osm_trap_rcv_t * const p_rcv, +__osm_trap_rcv_process_request(IN osm_sm_t * sm, IN const osm_madw_t * const p_madw) { uint8_t payload[sizeof(ib_mad_notice_attr_t)]; @@ -317,7 +263,7 @@ __osm_trap_rcv_process_request(IN osm_trap_rcv_t * const p_rcv, uint64_t event_wheel_timeout = OSM_DEFAULT_TRAP_SUPRESSION_TIMEOUT; boolean_t run_heavy_sweep = FALSE; - OSM_LOG_ENTER(p_rcv->p_log, __osm_trap_rcv_process_request); + OSM_LOG_ENTER(sm->p_log, __osm_trap_rcv_process_request); CL_ASSERT(p_madw); @@ -343,7 +289,7 @@ __osm_trap_rcv_process_request(IN osm_trap_rcv_t * const p_rcv, p_smp = osm_madw_get_smp_ptr(p_madw); if (p_smp->method != IB_MAD_METHOD_TRAP) { - osm_log(p_rcv->p_log, OSM_LOG_ERROR, + osm_log(sm->p_log, OSM_LOG_ERROR, "__osm_trap_rcv_process_request: ERR 3801: " "Unsupported method 0x%X\n", p_smp->method); goto Exit; @@ -369,19 +315,19 @@ __osm_trap_rcv_process_request(IN osm_trap_rcv_t * const p_rcv, /* Check if the sm_base_lid is 0. If yes - this means that the local lid wasn't configured yet. Don't send a response to the trap. */ - if (p_rcv->p_subn->sm_base_lid == 0) { - osm_log(p_rcv->p_log, OSM_LOG_DEBUG, + if (sm->p_subn->sm_base_lid == 0) { + osm_log(sm->p_log, OSM_LOG_DEBUG, "__osm_trap_rcv_process_request: " "Received SLID=0 Trap with local LID=0. Ignoring MAD\n"); goto Exit; } - osm_log(p_rcv->p_log, OSM_LOG_DEBUG, + osm_log(sm->p_log, OSM_LOG_DEBUG, "__osm_trap_rcv_process_request: " "Received SLID=0 Trap. 
Using local LID:0x%04X instead\n", - cl_ntoh16(p_rcv->p_subn->sm_base_lid) + cl_ntoh16(sm->p_subn->sm_base_lid) ); tmp_madw.mad_addr.addr_type.smi.source_lid = - p_rcv->p_subn->sm_base_lid; + sm->p_subn->sm_base_lid; } source_lid = tmp_madw.mad_addr.addr_type.smi.source_lid; @@ -393,7 +339,7 @@ __osm_trap_rcv_process_request(IN osm_trap_rcv_t * const p_rcv, CL_HTON16(130)) || (p_ntci->g_or_v.generic.trap_num == CL_HTON16(131))) - osm_log(p_rcv->p_log, OSM_LOG_ERROR, + osm_log(sm->p_log, OSM_LOG_ERROR, "__osm_trap_rcv_process_request: " "Received Generic Notice type:0x%02X num:%u Producer:%u (%s) " "from LID:0x%04X Port %d TID:0x%016" @@ -409,7 +355,7 @@ __osm_trap_rcv_process_request(IN osm_trap_rcv_t * const p_rcv, port_num, cl_ntoh64(p_smp->trans_id) ); else - osm_log(p_rcv->p_log, OSM_LOG_ERROR, + osm_log(sm->p_log, OSM_LOG_ERROR, "__osm_trap_rcv_process_request: " "Received Generic Notice type:0x%02X num:%u Producer:%u (%s) " "from LID:0x%04X TID:0x%016" PRIx64 @@ -424,7 +370,7 @@ __osm_trap_rcv_process_request(IN osm_trap_rcv_t * const p_rcv, cl_ntoh64(p_smp->trans_id) ); } else - osm_log(p_rcv->p_log, OSM_LOG_ERROR, + osm_log(sm->p_log, OSM_LOG_ERROR, "__osm_trap_rcv_process_request: " "Received Vendor Notice type:0x%02X vend:0x%06X dev:%u " "from LID:0x%04X TID:0x%016" PRIx64 "\n", @@ -436,20 +382,20 @@ __osm_trap_rcv_process_request(IN osm_trap_rcv_t * const p_rcv, ); } - osm_dump_notice(p_rcv->p_log, p_ntci, OSM_LOG_VERBOSE); + osm_dump_notice(sm->p_log, p_ntci, OSM_LOG_VERBOSE); - p_physp = osm_get_physp_by_mad_addr(p_rcv->p_log, - p_rcv->p_subn, &tmp_madw.mad_addr); + p_physp = osm_get_physp_by_mad_addr(sm->p_log, + sm->p_subn, &tmp_madw.mad_addr); if (p_physp) p_smp->m_key = p_physp->port_info.m_key; else - osm_log(p_rcv->p_log, OSM_LOG_ERROR, + osm_log(sm->p_log, OSM_LOG_ERROR, "__osm_trap_rcv_process_request: ERR 3809: " "Failed to find source physical port for trap\n"); - status = osm_resp_send(p_rcv->p_resp, &tmp_madw, 0, payload); + status = 
osm_resp_send(&sm->resp, &tmp_madw, 0, payload); if (status != IB_SUCCESS) { - osm_log(p_rcv->p_log, OSM_LOG_ERROR, + osm_log(sm->p_log, OSM_LOG_ERROR, "__osm_trap_rcv_process_request: ERR 3802: " "Error sending response (%s)\n", ib_get_err_str(status)); @@ -488,13 +434,13 @@ __osm_trap_rcv_process_request(IN osm_trap_rcv_t * const p_rcv, /* try to find it in the aging tracker */ num_received = - cl_event_wheel_num_regs(&p_rcv->trap_aging_tracker, + cl_event_wheel_num_regs(&sm->trap_aging_tracker, trap_key); /* Now we know how many times it provided this trap */ if (num_received > 10) { if (__print_num_received(num_received)) - osm_log(p_rcv->p_log, OSM_LOG_ERROR, + osm_log(sm->p_log, OSM_LOG_ERROR, "__osm_trap_rcv_process_request: ERR 3804: " "Received trap %u times consecutively\n", num_received); @@ -504,7 +450,7 @@ __osm_trap_rcv_process_request(IN osm_trap_rcv_t * const p_rcv, */ if (physp_change_trap == TRUE) { /* get the port */ - p_physp = __get_physp_by_lid_and_num(p_rcv, + p_physp = __get_physp_by_lid_and_num(sm, cl_ntoh16 (p_ntci-> data_details. @@ -513,7 +459,7 @@ __osm_trap_rcv_process_request(IN osm_trap_rcv_t * const p_rcv, port_num); if (!p_physp) - osm_log(p_rcv->p_log, OSM_LOG_ERROR, + osm_log(sm->p_log, OSM_LOG_ERROR, "__osm_trap_rcv_process_request: ERR 3805: " "Failed to find physical port by lid:0x%02X num:%u\n", cl_ntoh16(p_ntci->data_details. @@ -523,7 +469,7 @@ __osm_trap_rcv_process_request(IN osm_trap_rcv_t * const p_rcv, else { /* When babbling port policy option is enabled and Threshold for disabling a "babbling" port is exceeded */ - if (p_rcv->p_subn->opt. + if (sm->p_subn->opt. 
babbling_port_policy && num_received >= 250) { uint8_t @@ -536,7 +482,7 @@ __osm_trap_rcv_process_request(IN osm_trap_rcv_t * const p_rcv, /* If trap 131, might want to disable peer port if available */ /* but peer port has been observed not to respond to SM requests */ - osm_log(p_rcv->p_log, + osm_log(sm->p_log, OSM_LOG_ERROR, "__osm_trap_rcv_process_request: ERR 3810: " " Disabling physical port lid:0x%02X num:%u\n", @@ -577,7 +523,7 @@ __osm_trap_rcv_process_request(IN osm_trap_rcv_t * const p_rcv, active_transition = FALSE; status = - osm_req_set(&p_rcv->p_subn-> + osm_req_set(&sm->p_subn-> p_osm->sm.req, osm_physp_get_dr_path_ptr (p_physp), @@ -593,13 +539,13 @@ __osm_trap_rcv_process_request(IN osm_trap_rcv_t * const p_rcv, if (status == IB_SUCCESS) goto Exit; - osm_log(p_rcv->p_log, + osm_log(sm->p_log, OSM_LOG_ERROR, "__osm_trap_rcv_process_request: ERR 3811: " "Request to set PortInfo failed\n"); } - osm_log(p_rcv->p_log, OSM_LOG_VERBOSE, + osm_log(sm->p_log, OSM_LOG_VERBOSE, "__osm_trap_rcv_process_request: " "Marking unhealthy physical port by lid:0x%02X num:%u\n", cl_ntoh16(p_ntci->data_details. @@ -631,18 +577,18 @@ __osm_trap_rcv_process_request(IN osm_trap_rcv_t * const p_rcv, /* If physp_change_trap is TRUE - then use a callback to unset the healthy bit. If not - no need to use a callback. 
*/ if (physp_change_trap == TRUE) - cl_event_wheel_reg(&p_rcv->trap_aging_tracker, trap_key, cl_get_time_stamp() + event_wheel_timeout, osm_trap_rcv_aging_tracker_callback, /* no callback */ - p_rcv /* no context */ + cl_event_wheel_reg(&sm->trap_aging_tracker, trap_key, cl_get_time_stamp() + event_wheel_timeout, osm_trap_rcv_aging_tracker_callback, /* no callback */ + sm /* no context */ ); else - cl_event_wheel_reg(&p_rcv->trap_aging_tracker, trap_key, cl_get_time_stamp() + event_wheel_timeout, NULL, /* no callback */ + cl_event_wheel_reg(&sm->trap_aging_tracker, trap_key, cl_get_time_stamp() + event_wheel_timeout, NULL, /* no callback */ NULL /* no context */ ); /* If was already registered do nothing more */ if (num_received > 10 && run_heavy_sweep == FALSE) { if (__print_num_received(num_received)) - osm_log(p_rcv->p_log, OSM_LOG_VERBOSE, + osm_log(sm->p_log, OSM_LOG_VERBOSE, "__osm_trap_rcv_process_request: " "Continuously received this trap %u times. Ignoring\n", num_received); @@ -651,7 +597,7 @@ __osm_trap_rcv_process_request(IN osm_trap_rcv_t * const p_rcv, } /* do a sweep if we received a trap */ - if (p_rcv->p_subn->opt.sweep_on_trap) { + if (sm->p_subn->opt.sweep_on_trap) { /* if this is trap number 128 or run_heavy_sweep is TRUE - update the force_single_heavy_sweep flag of the subnet. Sweep also on traps 144/145 - these traps signal a change of a certain @@ -663,15 +609,15 @@ __osm_trap_rcv_process_request(IN osm_trap_rcv_t * const p_rcv, (cl_ntoh16(p_ntci->g_or_v.generic.trap_num) == 144) || (cl_ntoh16(p_ntci->g_or_v.generic.trap_num) == 145) || run_heavy_sweep)) { - osm_log(p_rcv->p_log, OSM_LOG_VERBOSE, + osm_log(sm->p_log, OSM_LOG_VERBOSE, "__osm_trap_rcv_process_request: " "Forcing immediate heavy sweep. 
" "Received trap:%u\n", cl_ntoh16(p_ntci->g_or_v.generic.trap_num)); - p_rcv->p_subn->force_immediate_heavy_sweep = TRUE; + sm->p_subn->force_immediate_heavy_sweep = TRUE; } - osm_sm_signal(&p_rcv->p_subn->p_osm->sm, OSM_SIGNAL_SWEEP); + osm_sm_signal(&sm->p_subn->p_osm->sm, OSM_SIGNAL_SWEEP); } /* If we reached here due to trap 129/130/131 - do not need to do @@ -685,7 +631,7 @@ __osm_trap_rcv_process_request(IN osm_trap_rcv_t * const p_rcv, accordingly. See IBA 1.2 p.739 or IBA 1.1 p.653 for details. */ if (is_gsi) { if (!tmp_madw.mad_addr.addr_type.gsi.global_route) { - osm_log(p_rcv->p_log, OSM_LOG_ERROR, + osm_log(sm->p_log, OSM_LOG_ERROR, "__osm_trap_rcv_process_request: ERR 3806: " "Received gsi trap with global_route FALSE. " "Cannot update issuer_gid!\n"); @@ -696,14 +642,14 @@ __osm_trap_rcv_process_request(IN osm_trap_rcv_t * const p_rcv, sizeof(ib_gid_t)); } else { /* Need to use the IssuerLID */ - p_tbl = &p_rcv->p_subn->port_lid_tbl; + p_tbl = &sm->p_subn->port_lid_tbl; CL_ASSERT(cl_ptr_vector_get_size(p_tbl) < 0x10000); if ((uint16_t) cl_ptr_vector_get_size(p_tbl) <= cl_ntoh16(source_lid)) { /* the source lid is out of range */ - osm_log(p_rcv->p_log, OSM_LOG_VERBOSE, + osm_log(sm->p_log, OSM_LOG_VERBOSE, "__osm_trap_rcv_process_request: " "source lid is out of range:0x%X\n", cl_ntoh16(source_lid)); @@ -713,7 +659,7 @@ __osm_trap_rcv_process_request(IN osm_trap_rcv_t * const p_rcv, p_port = cl_ptr_vector_get(p_tbl, cl_ntoh16(source_lid)); if (p_port == 0) { /* We have the lid - but no corresponding port */ - osm_log(p_rcv->p_log, OSM_LOG_VERBOSE, + osm_log(sm->p_log, OSM_LOG_VERBOSE, "__osm_trap_rcv_process_request: " "Cannot find port corresponding to lid:0x%X\n", cl_ntoh16(source_lid)); @@ -722,16 +668,16 @@ __osm_trap_rcv_process_request(IN osm_trap_rcv_t * const p_rcv, } p_ntci->issuer_gid.unicast.prefix = - p_rcv->p_subn->opt.subnet_prefix; + sm->p_subn->opt.subnet_prefix; p_ntci->issuer_gid.unicast.interface_id = p_port->guid; } /* we need a 
lock here as the InformInfo DB must be stable */ - CL_PLOCK_ACQUIRE(p_rcv->p_lock); - status = osm_report_notice(p_rcv->p_log, p_rcv->p_subn, p_ntci); - CL_PLOCK_RELEASE(p_rcv->p_lock); + CL_PLOCK_ACQUIRE(sm->p_lock); + status = osm_report_notice(sm->p_log, sm->p_subn, p_ntci); + CL_PLOCK_RELEASE(sm->p_lock); if (status != IB_SUCCESS) { - osm_log(p_rcv->p_log, OSM_LOG_ERROR, + osm_log(sm->p_log, OSM_LOG_ERROR, "__osm_trap_rcv_process_request: ERR 3803: " "Error sending trap reports (%s)\n", ib_get_err_str(status)); @@ -739,7 +685,7 @@ __osm_trap_rcv_process_request(IN osm_trap_rcv_t * const p_rcv, } Exit: - OSM_LOG_EXIT(p_rcv->p_log); + OSM_LOG_EXIT(sm->p_log); } #if 0 @@ -747,18 +693,18 @@ __osm_trap_rcv_process_request(IN osm_trap_rcv_t * const p_rcv, CURRENTLY WE ARE NOT CREATING TRAPS - SO THIS CALL IS AN ERROR **********************************************************************/ static void -__osm_trap_rcv_process_sm(IN const osm_trap_rcv_t * const p_rcv, +__osm_trap_rcv_process_sm(IN osm_sm_t * sm, IN const osm_remote_sm_t * const p_sm) { /* const ib_sm_info_t* p_smi; */ - OSM_LOG_ENTER(p_rcv->p_log, __osm_trap_rcv_process_sm); + OSM_LOG_ENTER(sm->p_log, __osm_trap_rcv_process_sm); - osm_log(p_rcv->p_log, OSM_LOG_ERROR, + osm_log(sm->p_log, OSM_LOG_ERROR, "__osm_trap_rcv_process_sm: ERR 3807: " "This function is not supported yet\n"); - OSM_LOG_EXIT(p_rcv->p_log); + OSM_LOG_EXIT(sm->p_log); } #endif @@ -766,28 +712,28 @@ __osm_trap_rcv_process_sm(IN const osm_trap_rcv_t * const p_rcv, CURRENTLY WE ARE NOT CREATING TRAPS - SO THIS CALL IN AN ERROR **********************************************************************/ static void -__osm_trap_rcv_process_response(IN const osm_trap_rcv_t * const p_rcv, +__osm_trap_rcv_process_response(IN osm_sm_t * sm, IN const osm_madw_t * const p_madw) { - OSM_LOG_ENTER(p_rcv->p_log, __osm_trap_rcv_process_response); + OSM_LOG_ENTER(sm->p_log, __osm_trap_rcv_process_response); - osm_log(p_rcv->p_log, OSM_LOG_ERROR, + 
osm_log(sm->p_log, OSM_LOG_ERROR, "__osm_trap_rcv_process_response: ERR 3808: " "This function is not supported yet\n"); - OSM_LOG_EXIT(p_rcv->p_log); + OSM_LOG_EXIT(sm->p_log); } /********************************************************************** **********************************************************************/ void osm_trap_rcv_process(IN void *context, IN void *data) { - osm_trap_rcv_t *p_rcv = context; + osm_sm_t *sm = context; osm_madw_t *p_madw = data; ib_smp_t *p_smp; - OSM_LOG_ENTER(p_rcv->p_log, osm_trap_rcv_process); + OSM_LOG_ENTER(sm->p_log, osm_trap_rcv_process); CL_ASSERT(p_madw); @@ -799,9 +745,9 @@ void osm_trap_rcv_process(IN void *context, IN void *data) SM's Trap. */ if (ib_smp_is_response(p_smp)) - __osm_trap_rcv_process_response(p_rcv, p_madw); + __osm_trap_rcv_process_response(sm, p_madw); else - __osm_trap_rcv_process_request(p_rcv, p_madw); + __osm_trap_rcv_process_request(sm, p_madw); - OSM_LOG_EXIT(p_rcv->p_log); + OSM_LOG_EXIT(sm->p_log); } diff --git a/opensm/opensm/osm_vl_arb_rcv.c b/opensm/opensm/osm_vl_arb_rcv.c index 78f15d6..23b081a 100644 --- a/opensm/opensm/osm_vl_arb_rcv.c +++ b/opensm/opensm/osm_vl_arb_rcv.c @@ -51,61 +51,15 @@ #include #include -#include #include #include -#include -#include #include #include #include #include #include -#include -#include #include -#include - -/********************************************************************** - **********************************************************************/ -void osm_vla_rcv_construct(IN osm_vla_rcv_t * const p_rcv) -{ - memset(p_rcv, 0, sizeof(*p_rcv)); -} - -/********************************************************************** - **********************************************************************/ -void osm_vla_rcv_destroy(IN osm_vla_rcv_t * const p_rcv) -{ - CL_ASSERT(p_rcv); - - OSM_LOG_ENTER(p_rcv->p_log, osm_vla_rcv_destroy); - - OSM_LOG_EXIT(p_rcv->p_log); -} - -/********************************************************************** - 
**********************************************************************/ -ib_api_status_t -osm_vla_rcv_init(IN osm_vla_rcv_t * const p_rcv, - IN osm_req_t * const p_req, - IN osm_subn_t * const p_subn, - IN osm_log_t * const p_log, IN cl_plock_t * const p_lock) -{ - ib_api_status_t status = IB_SUCCESS; - - OSM_LOG_ENTER(p_log, osm_vla_rcv_init); - - osm_vla_rcv_construct(p_rcv); - - p_rcv->p_log = p_log; - p_rcv->p_subn = p_subn; - p_rcv->p_lock = p_lock; - p_rcv->p_req = p_req; - - OSM_LOG_EXIT(p_log); - return (status); -} +#include /********************************************************************** **********************************************************************/ @@ -114,7 +68,7 @@ osm_vla_rcv_init(IN osm_vla_rcv_t * const p_rcv, */ void osm_vla_rcv_process(IN void *context, IN void *data) { - osm_vla_rcv_t *p_rcv = context; + osm_sm_t *sm = context; osm_madw_t *p_madw = data; ib_vl_arb_table_t *p_vla_tbl; ib_smp_t *p_smp; @@ -126,9 +80,9 @@ void osm_vla_rcv_process(IN void *context, IN void *data) ib_net64_t node_guid; uint8_t port_num, block_num; - CL_ASSERT(p_rcv); + CL_ASSERT(sm); - OSM_LOG_ENTER(p_rcv->p_log, osm_vla_rcv_process); + OSM_LOG_ENTER(sm->p_log, osm_vla_rcv_process); CL_ASSERT(p_madw); @@ -142,11 +96,11 @@ void osm_vla_rcv_process(IN void *context, IN void *data) CL_ASSERT(p_smp->attr_id == IB_MAD_ATTR_VL_ARBITRATION); - cl_plock_excl_acquire(p_rcv->p_lock); - p_port = osm_get_port_by_guid(p_rcv->p_subn, port_guid); + cl_plock_excl_acquire(sm->p_lock); + p_port = osm_get_port_by_guid(sm->p_subn, port_guid); if (!p_port) { - cl_plock_release(p_rcv->p_lock); - osm_log(p_rcv->p_log, OSM_LOG_ERROR, + cl_plock_release(sm->p_lock); + osm_log(sm->p_log, OSM_LOG_ERROR, "osm_vla_rcv_process: ERR 3F06: " "No port object for port with GUID 0x%" PRIx64 "\n\t\t\t\tfor parent node GUID 0x%" PRIx64 @@ -175,8 +129,8 @@ void osm_vla_rcv_process(IN void *context, IN void *data) We do not mind if this is a result of a set or get - all we want is to update 
the subnet. */ - if (osm_log_is_active(p_rcv->p_log, OSM_LOG_VERBOSE)) { - osm_log(p_rcv->p_log, OSM_LOG_VERBOSE, + if (osm_log_is_active(sm->p_log, OSM_LOG_VERBOSE)) { + osm_log(sm->p_log, OSM_LOG_VERBOSE, "osm_vla_rcv_process: " "Got GetResp(VLArb) block:%u port_num %u with GUID 0x%" PRIx64 " for parent node GUID 0x%" PRIx64 ", TID 0x%" @@ -189,18 +143,18 @@ void osm_vla_rcv_process(IN void *context, IN void *data) If so, Ignore it. */ if (!osm_physp_is_valid(p_physp)) { - osm_log(p_rcv->p_log, OSM_LOG_ERROR, + osm_log(sm->p_log, OSM_LOG_ERROR, "osm_vla_rcv_process: " "Got invalid port number 0x%X\n", port_num); goto Exit; } - osm_dump_vl_arb_table(p_rcv->p_log, + osm_dump_vl_arb_table(sm->p_log, port_guid, block_num, port_num, p_vla_tbl, OSM_LOG_DEBUG); if ((block_num < 1) || (block_num > 4)) { - osm_log(p_rcv->p_log, OSM_LOG_ERROR, + osm_log(sm->p_log, OSM_LOG_ERROR, "osm_vla_rcv_process: " "Got invalid block number 0x%X\n", block_num); goto Exit; @@ -208,7 +162,7 @@ void osm_vla_rcv_process(IN void *context, IN void *data) osm_physp_set_vla_tbl(p_physp, p_vla_tbl, block_num); Exit: - cl_plock_release(p_rcv->p_lock); + cl_plock_release(sm->p_lock); - OSM_LOG_EXIT(p_rcv->p_log); + OSM_LOG_EXIT(sm->p_log); } -- 1.5.3.4.206.g58ba4 From sashak at voltaire.com Thu Jan 3 02:01:13 2008 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Thu, 3 Jan 2008 10:01:13 +0000 Subject: [ofa-general] [PATCH 2/4] opensm: remove unneeded *_rcv.h header files In-Reply-To: <11993544753970-git-send-email-sashak@voltaire.com> References: <11993544753970-git-send-email-sashak@voltaire.com> Message-ID: <11993544751631-git-send-email-sashak@voltaire.com> Remove the *_rcv.h header files, which are no longer needed.
Signed-off-by: Sasha Khapyorsky --- opensm/include/Makefile.am | 11 - opensm/include/opensm/osm_lin_fwd_rcv.h | 245 ----------------------- opensm/include/opensm/osm_mcast_fwd_rcv.h | 245 ----------------------- opensm/include/opensm/osm_node_desc_rcv.h | 246 ----------------------- opensm/include/opensm/osm_node_info_rcv.h | 255 ------------------------ opensm/include/opensm/osm_pkey_rcv.h | 243 ----------------------- opensm/include/opensm/osm_port_info_rcv.h | 256 ------------------------ opensm/include/opensm/osm_slvl_map_rcv.h | 255 ------------------------ opensm/include/opensm/osm_sminfo_rcv.h | 273 ------------------------- opensm/include/opensm/osm_sw_info_rcv.h | 254 ------------------------ opensm/include/opensm/osm_trap_rcv.h | 306 ----------------------------- opensm/include/opensm/osm_vl_arb_rcv.h | 255 ------------------------ 12 files changed, 0 insertions(+), 2844 deletions(-) delete mode 100644 opensm/include/opensm/osm_lin_fwd_rcv.h delete mode 100644 opensm/include/opensm/osm_mcast_fwd_rcv.h delete mode 100644 opensm/include/opensm/osm_node_desc_rcv.h delete mode 100644 opensm/include/opensm/osm_node_info_rcv.h delete mode 100644 opensm/include/opensm/osm_pkey_rcv.h delete mode 100644 opensm/include/opensm/osm_port_info_rcv.h delete mode 100644 opensm/include/opensm/osm_slvl_map_rcv.h delete mode 100644 opensm/include/opensm/osm_sminfo_rcv.h delete mode 100644 opensm/include/opensm/osm_sw_info_rcv.h delete mode 100644 opensm/include/opensm/osm_trap_rcv.h delete mode 100644 opensm/include/opensm/osm_vl_arb_rcv.h diff --git a/opensm/include/Makefile.am b/opensm/include/Makefile.am index a46669d..cdb83c9 100644 --- a/opensm/include/Makefile.am +++ b/opensm/include/Makefile.am @@ -6,8 +6,6 @@ nobase_pkginclude_HEADERS = iba/ib_types.h iba/ib_cm_types.h EXTRA_DIST = \ $(srcdir)/opensm/osm_sa_path_record.h \ $(srcdir)/opensm/osm_lid_mgr.h \ - $(srcdir)/opensm/osm_vl_arb_rcv.h \ - $(srcdir)/opensm/osm_pkey_rcv.h \ $(srcdir)/opensm/osm_port.h \ 
$(srcdir)/opensm/osm_sm_state_mgr.h \ $(srcdir)/opensm/osm_state_mgr.h \ @@ -33,20 +31,16 @@ EXTRA_DIST = \ $(srcdir)/opensm/osm_sa_pkey_record.h \ $(srcdir)/opensm/osm_inform.h \ $(srcdir)/opensm/osm_path.h \ - $(srcdir)/opensm/osm_lin_fwd_rcv.h \ $(srcdir)/opensm/osm_service.h \ $(srcdir)/opensm/osm_switch.h \ - $(srcdir)/opensm/osm_sw_info_rcv.h \ $(srcdir)/opensm/osm_router.h \ $(srcdir)/opensm/osm_prefix_route.h \ $(srcdir)/opensm/osm_sa_slvl_record.h \ $(srcdir)/opensm/osm_opensm.h \ $(srcdir)/opensm/osm_sa.h \ $(srcdir)/opensm/osm_port_profile.h \ - $(srcdir)/opensm/osm_sminfo_rcv.h \ $(srcdir)/opensm/osm_multicast.h \ $(srcdir)/opensm/osm_sa_class_port_info.h \ - $(srcdir)/opensm/osm_node_info_rcv.h \ $(srcdir)/opensm/osm_base.h \ $(srcdir)/opensm/osm_sa_sminfo_record.h \ $(srcdir)/opensm/osm_mcast_mgr.h \ @@ -54,7 +48,6 @@ EXTRA_DIST = \ $(srcdir)/opensm/osm_event_plugin.h \ $(srcdir)/opensm/osm_mtree.h \ $(srcdir)/opensm/osm_sm.h \ - $(srcdir)/opensm/osm_trap_rcv.h \ $(srcdir)/opensm/osm_lin_fwd_tbl.h \ $(srcdir)/opensm/osm_ucast_mgr.h \ $(srcdir)/opensm/osm_db.h \ @@ -72,19 +65,15 @@ EXTRA_DIST = \ $(srcdir)/opensm/osm_sa_link_record.h \ $(srcdir)/opensm/osm_mcm_port.h \ $(srcdir)/opensm/osm_log.h \ - $(srcdir)/opensm/osm_mcast_fwd_rcv.h \ $(srcdir)/opensm/osm_fwd_tbl.h \ $(srcdir)/opensm/osm_db_pack.h \ $(srcdir)/opensm/osm_sm_mad_ctrl.h \ - $(srcdir)/opensm/osm_slvl_map_rcv.h \ $(srcdir)/opensm/osm_attrib_req.h \ - $(srcdir)/opensm/osm_node_desc_rcv.h \ $(srcdir)/opensm/osm_stats.h \ $(srcdir)/opensm/osm_sa_mcmember_record.h \ $(srcdir)/opensm/osm_sa_sw_info_record.h \ $(srcdir)/opensm/osm_vl15intf.h \ $(srcdir)/opensm/osm_drop_mgr.h \ - $(srcdir)/opensm/osm_port_info_rcv.h \ $(srcdir)/opensm/osm_perfmgr.h \ $(srcdir)/opensm/osm_perfmgr_db.h \ $(srcdir)/opensm/osm_qos_policy.h \ diff --git a/opensm/include/opensm/osm_lin_fwd_rcv.h b/opensm/include/opensm/osm_lin_fwd_rcv.h deleted file mode 100644 index 28402cc..0000000 --- 
a/opensm/include/opensm/osm_lin_fwd_rcv.h +++ /dev/null @@ -1,245 +0,0 @@ -/* - * Copyright (c) 2004, 2005 Voltaire, Inc. All rights reserved. - * Copyright (c) 2002-2005 Mellanox Technologies LTD. All rights reserved. - * Copyright (c) 1996-2003 Intel Corporation. All rights reserved. - * - * This software is available to you under a choice of one of two - * licenses. You may choose to be licensed under the terms of the GNU - * General Public License (GPL) Version 2, available from the file - * COPYING in the main directory of this source tree, or the - * OpenIB.org BSD license below: - * - * Redistribution and use in source and binary forms, with or - * without modification, are permitted provided that the following - * conditions are met: - * - * - Redistributions of source code must retain the above - * copyright notice, this list of conditions and the following - * disclaimer. - * - * - Redistributions in binary form must reproduce the above - * copyright notice, this list of conditions and the following - * disclaimer in the documentation and/or other materials - * provided with the distribution. - * - * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, - * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF - * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND - * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS - * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN - * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN - * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE - * SOFTWARE. - * - */ - -/* - * Abstract: - * Declaration of osm_lft_rcv_t. - * This object represents the LFT Receiver object. - * This object is part of the OpenSM family of objects. 
- * - * Environment: - * Linux User Mode - * - * $Revision: 1.4 $ - */ - -#ifndef _OSM_LFT_RCV_H_ -#define _OSM_LFT_RCV_H_ - -#include -#include -#include -#include -#include - -#ifdef __cplusplus -# define BEGIN_C_DECLS extern "C" { -# define END_C_DECLS } -#else /* !__cplusplus */ -# define BEGIN_C_DECLS -# define END_C_DECLS -#endif /* __cplusplus */ - -BEGIN_C_DECLS -/****h* OpenSM/LFT Receiver -* NAME -* LFT Receiver -* -* DESCRIPTION -* The LFT Receiver object encapsulates the information -* needed to receive the LFT attribute from a node. -* -* The LFT Receiver object is thread safe. -* -* This object should be treated as opaque and should be -* manipulated only through the provided functions. -* -* AUTHOR -* Steve King, Intel -* -*********/ -/****s* OpenSM: LFT Receiver/osm_lft_rcv_t -* NAME -* osm_lft_rcv_t -* -* DESCRIPTION -* LFT Receiver structure. -* -* This object should be treated as opaque and should -* be manipulated only through the provided functions. -* -* SYNOPSIS -*/ -typedef struct _osm_lft_rcv { - osm_subn_t *p_subn; - osm_log_t *p_log; - cl_plock_t *p_lock; -} osm_lft_rcv_t; -/* -* FIELDS -* p_subn -* Pointer to the Subnet object for this subnet. -* -* p_log -* Pointer to the log object. -* -* p_lock -* Pointer to the serializing lock. -* -* SEE ALSO -* LFT Receiver object -*********/ - -/****f* OpenSM: LFT Receiver/osm_lft_rcv_construct -* NAME -* osm_lft_rcv_construct -* -* DESCRIPTION -* This function constructs a LFT Receiver object. -* -* SYNOPSIS -*/ -void osm_lft_rcv_construct(IN osm_lft_rcv_t * const p_rcv); -/* -* PARAMETERS -* p_rcv -* [in] Pointer to a LFT Receiver object to construct. -* -* RETURN VALUE -* This function does not return a value. -* -* NOTES -* Allows calling osm_lft_rcv_init, osm_lft_rcv_destroy -* -* Calling osm_lft_rcv_construct is a prerequisite to calling any other -* method except osm_lft_rcv_init. 
-* -* SEE ALSO -* LFT Receiver object, osm_lft_rcv_init, -* osm_lft_rcv_destroy -*********/ - -/****f* OpenSM: LFT Receiver/osm_lft_rcv_destroy -* NAME -* osm_lft_rcv_destroy -* -* DESCRIPTION -* The osm_lft_rcv_destroy function destroys the object, releasing -* all resources. -* -* SYNOPSIS -*/ -void osm_lft_rcv_destroy(IN osm_lft_rcv_t * const p_rcv); -/* -* PARAMETERS -* p_rcv -* [in] Pointer to the object to destroy. -* -* RETURN VALUE -* This function does not return a value. -* -* NOTES -* Performs any necessary cleanup of the specified -* LFT Receiver object. -* Further operations should not be attempted on the destroyed object. -* This function should only be called after a call to -* osm_lft_rcv_construct or osm_lft_rcv_init. -* -* SEE ALSO -* LFT Receiver object, osm_lft_rcv_construct, -* osm_lft_rcv_init -*********/ - -/****f* OpenSM: LFT Receiver/osm_lft_rcv_init -* NAME -* osm_lft_rcv_init -* -* DESCRIPTION -* The osm_lft_rcv_init function initializes a -* LFT Receiver object for use. -* -* SYNOPSIS -*/ -ib_api_status_t osm_lft_rcv_init(IN osm_lft_rcv_t * const p_rcv, - IN osm_subn_t * const p_subn, - IN osm_log_t * const p_log, - IN cl_plock_t * const p_lock); -/* -* PARAMETERS -* p_rcv -* [in] Pointer to an osm_lft_rcv_t object to initialize. -* -* p_subn -* [in] Pointer to the Subnet object for this subnet. -* -* p_log -* [in] Pointer to the log object. -* -* p_lock -* [in] Pointer to the OpenSM serializing lock. -* -* RETURN VALUES -* CL_SUCCESS if the LFT Receiver object was initialized -* successfully. -* -* NOTES -* Allows calling other LFT Receiver methods. -* -* SEE ALSO -* LFT Receiver object, osm_lft_rcv_construct, -* osm_lft_rcv_destroy -*********/ - -/****f* OpenSM: LFT Receiver/osm_lft_rcv_process -* NAME -* osm_lft_rcv_process -* -* DESCRIPTION -* Process the LFT attribute. -* -* SYNOPSIS -*/ -void osm_lft_rcv_process(IN void *context, IN void *data); -/* -* PARAMETERS -* context -* [in] Pointer to an osm_lft_rcv_t object. 
-* -* data -* [in] Pointer to the MAD Wrapper containing the MAD -* that contains the node's LFT attribute. -* -* RETURN VALUES -* CL_SUCCESS if the LFT processing was successful. -* -* NOTES -* This function processes a LFT attribute. -* -* SEE ALSO -* LFT Receiver, Node Description Response Controller -*********/ - -END_C_DECLS -#endif /* _OSM_LFT_RCV_H_ */ diff --git a/opensm/include/opensm/osm_mcast_fwd_rcv.h b/opensm/include/opensm/osm_mcast_fwd_rcv.h deleted file mode 100644 index 9d81ee6..0000000 --- a/opensm/include/opensm/osm_mcast_fwd_rcv.h +++ /dev/null @@ -1,245 +0,0 @@ -/* - * Copyright (c) 2004, 2005 Voltaire, Inc. All rights reserved. - * Copyright (c) 2002-2005 Mellanox Technologies LTD. All rights reserved. - * Copyright (c) 1996-2003 Intel Corporation. All rights reserved. - * - * This software is available to you under a choice of one of two - * licenses. You may choose to be licensed under the terms of the GNU - * General Public License (GPL) Version 2, available from the file - * COPYING in the main directory of this source tree, or the - * OpenIB.org BSD license below: - * - * Redistribution and use in source and binary forms, with or - * without modification, are permitted provided that the following - * conditions are met: - * - * - Redistributions of source code must retain the above - * copyright notice, this list of conditions and the following - * disclaimer. - * - * - Redistributions in binary form must reproduce the above - * copyright notice, this list of conditions and the following - * disclaimer in the documentation and/or other materials - * provided with the distribution. - * - * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, - * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF - * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND - * NONINFRINGEMENT. 
IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS - * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN - * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN - * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE - * SOFTWARE. - * - */ - -/* - * Abstract: - * Declaration of osm_mft_rcv_t. - * This object represents the Multicast Forwarding Table Receiver object. - * This object is part of the OpenSM family of objects. - * - * Environment: - * Linux User Mode - * - * $Revision: 1.4 $ - */ - -#ifndef _OSM_MFT_RCV_H_ -#define _OSM_MFT_RCV_H_ - -#include -#include -#include -#include -#include - -#ifdef __cplusplus -# define BEGIN_C_DECLS extern "C" { -# define END_C_DECLS } -#else /* !__cplusplus */ -# define BEGIN_C_DECLS -# define END_C_DECLS -#endif /* __cplusplus */ - -BEGIN_C_DECLS -/****h* OpenSM/MFT Receiver -* NAME -* MFT Receiver -* -* DESCRIPTION -* The MFT Receiver object encapsulates the information -* needed to receive the MFT attribute from a node. -* -* The MFT Receiver object is thread safe. -* -* This object should be treated as opaque and should be -* manipulated only through the provided functions. -* -* AUTHOR -* Steve King, Intel -* -*********/ -/****s* OpenSM: MFT Receiver/osm_mft_rcv_t -* NAME -* osm_mft_rcv_t -* -* DESCRIPTION -* MFT Receiver structure. -* -* This object should be treated as opaque and should -* be manipulated only through the provided functions. -* -* SYNOPSIS -*/ -typedef struct _osm_mft_rcv { - osm_subn_t *p_subn; - osm_log_t *p_log; - cl_plock_t *p_lock; -} osm_mft_rcv_t; -/* -* FIELDS -* p_subn -* Pointer to the Subnet object for this subnet. -* -* p_log -* Pointer to the log object. -* -* p_lock -* Pointer to the serializing lock. -* -* SEE ALSO -* MFT Receiver object -*********/ - -/****f* OpenSM: MFT Receiver/osm_mft_rcv_construct -* NAME -* osm_mft_rcv_construct -* -* DESCRIPTION -* This function constructs a MFT Receiver object. 
-* -* SYNOPSIS -*/ -void osm_mft_rcv_construct(IN osm_mft_rcv_t * const p_rcv); -/* -* PARAMETERS -* p_rcv -* [in] Pointer to a MFT Receiver object to construct. -* -* RETURN VALUE -* This function does not return a value. -* -* NOTES -* Allows calling osm_mft_rcv_init, osm_mft_rcv_destroy -* -* Calling osm_mft_rcv_construct is a prerequisite to calling any other -* method except osm_mft_rcv_init. -* -* SEE ALSO -* MFT Receiver object, osm_mft_rcv_init, -* osm_mft_rcv_destroy -*********/ - -/****f* OpenSM: MFT Receiver/osm_mft_rcv_destroy -* NAME -* osm_mft_rcv_destroy -* -* DESCRIPTION -* The osm_mft_rcv_destroy function destroys the object, releasing -* all resources. -* -* SYNOPSIS -*/ -void osm_mft_rcv_destroy(IN osm_mft_rcv_t * const p_rcv); -/* -* PARAMETERS -* p_rcv -* [in] Pointer to the object to destroy. -* -* RETURN VALUE -* This function does not return a value. -* -* NOTES -* Performs any necessary cleanup of the specified -* MFT Receiver object. -* Further operations should not be attempted on the destroyed object. -* This function should only be called after a call to -* osm_mft_rcv_construct or osm_mft_rcv_init. -* -* SEE ALSO -* MFT Receiver object, osm_mft_rcv_construct, -* osm_mft_rcv_init -*********/ - -/****f* OpenSM: MFT Receiver/osm_mft_rcv_init -* NAME -* osm_mft_rcv_init -* -* DESCRIPTION -* The osm_mft_rcv_init function initializes a -* MFT Receiver object for use. -* -* SYNOPSIS -*/ -ib_api_status_t -osm_mft_rcv_init(IN osm_mft_rcv_t * const p_rcv, - IN osm_subn_t * const p_subn, - IN osm_log_t * const p_log, IN cl_plock_t * const p_lock); -/* -* PARAMETERS -* p_rcv -* [in] Pointer to an osm_mft_rcv_t object to initialize. -* -* p_subn -* [in] Pointer to the Subnet object for this subnet. -* -* p_log -* [in] Pointer to the log object. -* -* p_lock -* [in] Pointer to the OpenSM serializing lock. -* -* RETURN VALUES -* CL_SUCCESS if the MFT Receiver object was initialized -* successfully. 
-* -* NOTES -* Allows calling other MFT Receiver methods. -* -* SEE ALSO -* MFT Receiver object, osm_mft_rcv_construct, -* osm_mft_rcv_destroy -*********/ - -/****f* OpenSM: MFT Receiver/osm_mft_rcv_process -* NAME -* osm_mft_rcv_process -* -* DESCRIPTION -* Process the MFT attribute. -* -* SYNOPSIS -*/ -void osm_mft_rcv_process(IN void *context, IN void *data); -/* -* PARAMETERS -* context -* [in] Pointer to an osm_mft_rcv_t object. -* -* data -* [in] Pointer to the MAD Wrapper containing the MAD -* that contains the node's MFT attribute. -* -* RETURN VALUES -* CL_SUCCESS if the MFT processing was successful. -* -* NOTES -* This function processes a MFT attribute. -* -* SEE ALSO -* MFT Receiver, Node Description Response Controller -*********/ - -END_C_DECLS -#endif /* _OSM_MFT_RCV_H_ */ diff --git a/opensm/include/opensm/osm_node_desc_rcv.h b/opensm/include/opensm/osm_node_desc_rcv.h deleted file mode 100644 index fc336d7..0000000 --- a/opensm/include/opensm/osm_node_desc_rcv.h +++ /dev/null @@ -1,246 +0,0 @@ -/* - * Copyright (c) 2004, 2005 Voltaire, Inc. All rights reserved. - * Copyright (c) 2002-2005 Mellanox Technologies LTD. All rights reserved. - * Copyright (c) 1996-2003 Intel Corporation. All rights reserved. - * - * This software is available to you under a choice of one of two - * licenses. You may choose to be licensed under the terms of the GNU - * General Public License (GPL) Version 2, available from the file - * COPYING in the main directory of this source tree, or the - * OpenIB.org BSD license below: - * - * Redistribution and use in source and binary forms, with or - * without modification, are permitted provided that the following - * conditions are met: - * - * - Redistributions of source code must retain the above - * copyright notice, this list of conditions and the following - * disclaimer. 
- * - * - Redistributions in binary form must reproduce the above - * copyright notice, this list of conditions and the following - * disclaimer in the documentation and/or other materials - * provided with the distribution. - * - * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, - * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF - * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND - * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS - * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN - * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN - * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE - * SOFTWARE. - * - */ - -/* - * Abstract: - * Declaration of osm_nd_rcv_t. - * This object represents the NodeInfo Receiver object. - * attribute from a node. - * This object is part of the OpenSM family of objects. - * - * Environment: - * Linux User Mode - * - * $Revision: 1.4 $ - */ - -#ifndef _OSM_ND_RCV_H_ -#define _OSM_ND_RCV_H_ - -#include -#include -#include -#include -#include - -#ifdef __cplusplus -# define BEGIN_C_DECLS extern "C" { -# define END_C_DECLS } -#else /* !__cplusplus */ -# define BEGIN_C_DECLS -# define END_C_DECLS -#endif /* __cplusplus */ - -BEGIN_C_DECLS -/****h* OpenSM/Node Description Receiver -* NAME -* Node Description Receiver -* -* DESCRIPTION -* The Node Description Receiver object encapsulates the information -* needed to receive the NodeInfo attribute from a node. -* -* The Node Description Receiver object is thread safe. -* -* This object should be treated as opaque and should be -* manipulated only through the provided functions. -* -* AUTHOR -* Steve King, Intel -* -*********/ -/****s* OpenSM: Node Description Receiver/osm_nd_rcv_t -* NAME -* osm_nd_rcv_t -* -* DESCRIPTION -* Node Description Receiver structure. -* -* This object should be treated as opaque and should -* be manipulated only through the provided functions. 
-* -* SYNOPSIS -*/ -typedef struct _osm_nd_rcv { - osm_subn_t *p_subn; - osm_log_t *p_log; - cl_plock_t *p_lock; -} osm_nd_rcv_t; -/* -* FIELDS -* p_subn -* Pointer to the Subnet object for this subnet. -* -* p_log -* Pointer to the log object. -* -* p_lock -* Pointer to the serializing lock. -* -* SEE ALSO -* Node Description Receiver object -*********/ - -/****f* OpenSM: Node Description Receiver/osm_nd_rcv_construct -* NAME -* osm_nd_rcv_construct -* -* DESCRIPTION -* This function constructs a Node Description Receiver object. -* -* SYNOPSIS -*/ -void osm_nd_rcv_construct(IN osm_nd_rcv_t * const p_rcv); -/* -* PARAMETERS -* p_rcv -* [in] Pointer to a Node Description Receiver object to construct. -* -* RETURN VALUE -* This function does not return a value. -* -* NOTES -* Allows calling osm_nd_rcv_init, osm_nd_rcv_destroy -* -* Calling osm_nd_rcv_construct is a prerequisite to calling any other -* method except osm_nd_rcv_init. -* -* SEE ALSO -* Node Description Receiver object, osm_nd_rcv_init, -* osm_nd_rcv_destroy -*********/ - -/****f* OpenSM: Node Description Receiver/osm_nd_rcv_destroy -* NAME -* osm_nd_rcv_destroy -* -* DESCRIPTION -* The osm_nd_rcv_destroy function destroys the object, releasing -* all resources. -* -* SYNOPSIS -*/ -void osm_nd_rcv_destroy(IN osm_nd_rcv_t * const p_rcv); -/* -* PARAMETERS -* p_rcv -* [in] Pointer to the object to destroy. -* -* RETURN VALUE -* This function does not return a value. -* -* NOTES -* Performs any necessary cleanup of the specified -* Node Description Receiver object. -* Further operations should not be attempted on the destroyed object. -* This function should only be called after a call to -* osm_nd_rcv_construct or osm_nd_rcv_init. 
-* -* SEE ALSO -* Node Description Receiver object, osm_nd_rcv_construct, -* osm_nd_rcv_init -*********/ - -/****f* OpenSM: Node Description Receiver/osm_nd_rcv_init -* NAME -* osm_nd_rcv_init -* -* DESCRIPTION -* The osm_nd_rcv_init function initializes a -* Node Description Receiver object for use. -* -* SYNOPSIS -*/ -ib_api_status_t osm_nd_rcv_init(IN osm_nd_rcv_t * const p_rcv, - IN osm_subn_t * const p_subn, - IN osm_log_t * const p_log, - IN cl_plock_t * const p_lock); -/* -* PARAMETERS -* p_rcv -* [in] Pointer to an osm_nd_rcv_t object to initialize. -* -* p_subn -* [in] Pointer to the Subnet object for this subnet. -* -* p_log -* [in] Pointer to the log object. -* -* p_lock -* [in] Pointer to the OpenSM serializing lock. -* -* RETURN VALUES -* CL_SUCCESS if the Node Description Receiver object was initialized -* successfully. -* -* NOTES -* Allows calling other Node Description Receiver methods. -* -* SEE ALSO -* Node Description Receiver object, osm_nd_rcv_construct, -* osm_nd_rcv_destroy -*********/ - -/****f* OpenSM: Node Description Receiver/osm_nd_rcv_process -* NAME -* osm_nd_rcv_process -* -* DESCRIPTION -* Process the NodeInfo attribute. -* -* SYNOPSIS -*/ -void osm_nd_rcv_process(IN void *context, IN void *data); -/* -* PARAMETERS -* context -* [in] Pointer to an osm_nd_rcv_t object. -* -* data -* [in] Pointer to the MAD Wrapper containing the MAD -* that contains the node's NodeInfo attribute. -* -* RETURN VALUES -* CL_SUCCESS if the NodeInfo processing was successful. -* -* NOTES -* This function processes a NodeInfo attribute. -* -* SEE ALSO -* Node Description Receiver, Node Description Response Controller -*********/ - -END_C_DECLS -#endif /* _OSM_ND_RCV_H_ */ diff --git a/opensm/include/opensm/osm_node_info_rcv.h b/opensm/include/opensm/osm_node_info_rcv.h deleted file mode 100644 index ca684b4..0000000 --- a/opensm/include/opensm/osm_node_info_rcv.h +++ /dev/null @@ -1,255 +0,0 @@ -/* - * Copyright (c) 2004, 2005 Voltaire, Inc. 
All rights reserved. - * Copyright (c) 2002-2005 Mellanox Technologies LTD. All rights reserved. - * Copyright (c) 1996-2003 Intel Corporation. All rights reserved. - * - * This software is available to you under a choice of one of two - * licenses. You may choose to be licensed under the terms of the GNU - * General Public License (GPL) Version 2, available from the file - * COPYING in the main directory of this source tree, or the - * OpenIB.org BSD license below: - * - * Redistribution and use in source and binary forms, with or - * without modification, are permitted provided that the following - * conditions are met: - * - * - Redistributions of source code must retain the above - * copyright notice, this list of conditions and the following - * disclaimer. - * - * - Redistributions in binary form must reproduce the above - * copyright notice, this list of conditions and the following - * disclaimer in the documentation and/or other materials - * provided with the distribution. - * - * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, - * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF - * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND - * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS - * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN - * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN - * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE - * SOFTWARE. - * - */ - -/* - * Abstract: - * Declaration of osm_ni_rcv_t. - * This object represents the NodeInfo Receiver object. - * attribute from a node. - * This object is part of the OpenSM family of objects. 
- * - * Environment: - * Linux User Mode - * - * $Revision: 1.4 $ - */ - -#ifndef _OSM_NI_RCV_H_ -#define _OSM_NI_RCV_H_ - -#include -#include -#include -#include -#include -#include -#include - -#ifdef __cplusplus -# define BEGIN_C_DECLS extern "C" { -# define END_C_DECLS } -#else /* !__cplusplus */ -# define BEGIN_C_DECLS -# define END_C_DECLS -#endif /* __cplusplus */ - -BEGIN_C_DECLS -/****h* OpenSM/Node Info Receiver -* NAME -* Node Info Receiver -* -* DESCRIPTION -* The Node Info Receiver object encapsulates the information -* needed to receive the NodeInfo attribute from a node. -* -* The Node Info Receiver object is thread safe. -* -* This object should be treated as opaque and should be -* manipulated only through the provided functions. -* -* AUTHOR -* Steve King, Intel -* -*********/ -/****s* OpenSM: Node Info Receiver/osm_ni_rcv_t -* NAME -* osm_ni_rcv_t -* -* DESCRIPTION -* Node Info Receiver structure. -* -* This object should be treated as opaque and should -* be manipulated only through the provided functions. -* -* SYNOPSIS -*/ -typedef struct _osm_ni_rcv { - osm_subn_t *p_subn; - osm_req_t *p_gen_req; - osm_log_t *p_log; - cl_plock_t *p_lock; -} osm_ni_rcv_t; -/* -* FIELDS -* p_subn -* Pointer to the Subnet object for this subnet. -* -* p_gen_req_ctrl -* Pointer to the generic request controller. -* -* p_log -* Pointer to the log object. -* -* p_lock -* Pointer to the serializing lock. -* -* SEE ALSO -* Node Info Receiver object -*********/ - -/****f* OpenSM: Node Info Receiver/osm_ni_rcv_construct -* NAME -* osm_ni_rcv_construct -* -* DESCRIPTION -* This function constructs a Node Info Receiver object. -* -* SYNOPSIS -*/ -void osm_ni_rcv_construct(IN osm_ni_rcv_t * const p_ctrl); -/* -* PARAMETERS -* p_ctrl -* [in] Pointer to a Node Info Receiver object to construct. -* -* RETURN VALUE -* This function does not return a value. 
-* -* NOTES -* Allows calling osm_ni_rcv_destroy, -* -* Calling osm_ni_rcv_construct is a prerequisite to calling any other -* method except osm_ni_rcv_init. -* -* SEE ALSO -* Node Info Receiver object, osm_ni_rcv_init, osm_ni_rcv_destroy -*********/ - -/****f* OpenSM: Node Info Receiver/osm_ni_rcv_destroy -* NAME -* osm_ni_rcv_destroy -* -* DESCRIPTION -* The osm_ni_rcv_destroy function destroys the object, releasing -* all resources. -* -* SYNOPSIS -*/ -void osm_ni_rcv_destroy(IN osm_ni_rcv_t * const p_ctrl); -/* -* PARAMETERS -* p_ctrl -* [in] Pointer to the object to destroy. -* -* RETURN VALUE -* This function does not return a value. -* -* NOTES -* Performs any necessary cleanup of the specified -* Node Info Receiver object. -* Further operations should not be attempted on the destroyed object. -* This function should only be called after a call to -* osm_ni_rcv_construct or osm_ni_rcv_init. -* -* SEE ALSO -* Node Info Receiver object, osm_ni_rcv_construct, -* osm_ni_rcv_init -*********/ - -/****f* OpenSM: Node Info Receiver/osm_ni_rcv_init -* NAME -* osm_ni_rcv_init -* -* DESCRIPTION -* The osm_ni_rcv_init function initializes a -* Node Info Receiver object for use. -* -* SYNOPSIS -*/ -ib_api_status_t osm_ni_rcv_init(IN osm_ni_rcv_t * const p_ctrl, - IN osm_req_t * const p_req, - IN osm_subn_t * const p_subn, - IN osm_log_t * const p_log, - IN cl_plock_t * const p_lock); -/* -* PARAMETERS -* p_ctrl -* [in] Pointer to an osm_ni_rcv_t object to initialize. -* -* p_req -* [in] Pointer to an osm_req_t object. -* -* p_subn -* [in] Pointer to the Subnet object for this subnet. -* -* p_log -* [in] Pointer to the log object. -* -* p_lock -* [in] Pointer to the OpenSM serializing lock. -* -* RETURN VALUES -* CL_SUCCESS if the Node Info Receiver object was initialized -* successfully. -* -* NOTES -* Allows calling other Node Info Receiver methods. 
-* -* SEE ALSO -* Node Info Receiver object, osm_ni_rcv_construct, -* osm_ni_rcv_destroy -*********/ - -/****f* OpenSM: Node Info Receiver/osm_ni_rcv_process -* NAME -* osm_ni_rcv_process -* -* DESCRIPTION -* Process the NodeInfo attribute. -* -* SYNOPSIS -*/ -void osm_ni_rcv_process(IN void *context, IN void *data); -/* -* PARAMETERS -* context -* [in] Pointer to an osm_ni_rcv_t object. -* -* data -* [in] Pointer to the MAD Wrapper containing the MAD -* that contains the node's NodeInfo attribute. -* -* RETURN VALUES -* CL_SUCCESS if the NodeInfo processing was successful. -* -* NOTES -* This function processes a NodeInfo attribute. -* -* SEE ALSO -* Node Info Receiver, Node Info Response Controller -*********/ - -END_C_DECLS -#endif /* _OSM_NI_RCV_H_ */ diff --git a/opensm/include/opensm/osm_pkey_rcv.h b/opensm/include/opensm/osm_pkey_rcv.h deleted file mode 100644 index 14e6351..0000000 --- a/opensm/include/opensm/osm_pkey_rcv.h +++ /dev/null @@ -1,243 +0,0 @@ -/* - * Copyright (c) 2004, 2005 Voltaire, Inc. All rights reserved. - * Copyright (c) 2002-2005 Mellanox Technologies LTD. All rights reserved. - * Copyright (c) 1996-2003 Intel Corporation. All rights reserved. - * - * This software is available to you under a choice of one of two - * licenses. You may choose to be licensed under the terms of the GNU - * General Public License (GPL) Version 2, available from the file - * COPYING in the main directory of this source tree, or the - * OpenIB.org BSD license below: - * - * Redistribution and use in source and binary forms, with or - * without modification, are permitted provided that the following - * conditions are met: - * - * - Redistributions of source code must retain the above - * copyright notice, this list of conditions and the following - * disclaimer. 
- * - * - Redistributions in binary form must reproduce the above - * copyright notice, this list of conditions and the following - * disclaimer in the documentation and/or other materials - * provided with the distribution. - * - * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, - * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF - * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND - * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS - * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN - * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN - * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE - * SOFTWARE. - * - */ - -#ifndef _OSM_PKEY_RCV_H_ -#define _OSM_PKEY_RCV_H_ - -#include -#include -#include -#include -#include -#include -#include - -#ifdef __cplusplus -# define BEGIN_C_DECLS extern "C" { -# define END_C_DECLS } -#else /* !__cplusplus */ -# define BEGIN_C_DECLS -# define END_C_DECLS -#endif /* __cplusplus */ - -BEGIN_C_DECLS -/****h* OpenSM/P_Key Receiver -* NAME -* P_Key Receiver -* -* DESCRIPTION -* The P_Key Receiver object encapsulates the information -* needed to set or get the vl arbitration attribute from a port. -* -* The P_Key Receiver object is thread safe. -* -* This object should be treated as opaque and should be -* manipulated only through the provided functions. -* -* AUTHOR -* Yael Kalka, Mellanox -* -*********/ -/****s* OpenSM: P_Key Receiver/osm_pkey_rcv_t -* NAME -* osm_pkey_rcv_t -* -* DESCRIPTION -* P_Key Receiver structure. -* -* This object should be treated as opaque and should -* be manipulated only through the provided functions. -* -* SYNOPSIS -*/ -typedef struct _osm_pkey_rcv { - osm_subn_t *p_subn; - osm_req_t *p_req; - osm_log_t *p_log; - cl_plock_t *p_lock; -} osm_pkey_rcv_t; -/* -* FIELDS -* p_subn -* Pointer to the Subnet object for this subnet. -* -* p_req -* Pointer to the generic attribute request object. 
-* -* p_log -* Pointer to the log object. -* -* p_lock -* Pointer to the serializing lock. -* -* SEE ALSO -* P_Key Receiver object -*********/ - -/****f* OpenSM: P_Key Receiver/osm_pkey_rcv_construct -* NAME -* osm_pkey_rcv_construct -* -* DESCRIPTION -* This function constructs a P_Key Receiver object. -* -* SYNOPSIS -*/ -void osm_pkey_rcv_construct(IN osm_pkey_rcv_t * const p_ctrl); -/* -* PARAMETERS -* p_ctrl -* [in] Pointer to a P_Key Receiver object to construct. -* -* RETURN VALUE -* This function does not return a value. -* -* NOTES -* Allows calling osm_pkey_rcv_destroy -* -* Calling osm_pkey_rcv_construct is a prerequisite to calling any other -* method except osm_pkey_rcv_init. -* -* SEE ALSO -* P_Key Receiver object, osm_pkey_rcv_init, -* osm_pkey_rcv_destroy -*********/ - -/****f* OpenSM: P_Key Receiver/osm_pkey_rcv_destroy -* NAME -* osm_pkey_rcv_destroy -* -* DESCRIPTION -* The osm_pkey_rcv_destroy function destroys the object, releasing -* all resources. -* -* SYNOPSIS -*/ -void osm_pkey_rcv_destroy(IN osm_pkey_rcv_t * const p_ctrl); -/* -* PARAMETERS -* p_ctrl -* [in] Pointer to the object to destroy. -* -* RETURN VALUE -* This function does not return a value. -* -* NOTES -* Performs any necessary cleanup of the specified -* P_Key Receiver object. -* Further operations should not be attempted on the destroyed object. -* This function should only be called after a call to -* osm_pkey_rcv_construct or osm_pkey_rcv_init. -* -* SEE ALSO -* P_Key Receiver object, osm_pkey_rcv_construct, -* osm_pkey_rcv_init -*********/ - -/****f* OpenSM: P_Key Receiver/osm_pkey_rcv_init -* NAME -* osm_pkey_rcv_init -* -* DESCRIPTION -* The osm_pkey_rcv_init function initializes a -* P_Key Receiver object for use. 
-* -* SYNOPSIS -*/ -ib_api_status_t osm_pkey_rcv_init(IN osm_pkey_rcv_t * const p_ctrl, - IN osm_req_t * const p_req, - IN osm_subn_t * const p_subn, - IN osm_log_t * const p_log, - IN cl_plock_t * const p_lock); -/* -* PARAMETERS -* p_ctrl -* [in] Pointer to an osm_pkey_rcv_t object to initialize. -* -* p_req -* [in] Pointer to an osm_req_t object. -* -* p_subn -* [in] Pointer to the Subnet object for this subnet. -* -* p_log -* [in] Pointer to the log object. -* -* p_lock -* [in] Pointer to the OpenSM serializing lock. -* -* RETURN VALUES -* CL_SUCCESS if the P_Key Receiver object was initialized -* successfully. -* -* NOTES -* Allows calling other P_Key Receiver methods. -* -* SEE ALSO -* P_Key Receiver object, osm_pkey_rcv_construct, -* osm_pkey_rcv_destroy -*********/ - -/****f* OpenSM: P_Key Receiver/osm_pkey_rcv_process -* NAME -* osm_pkey_rcv_process -* -* DESCRIPTION -* Process the vl arbitration attribute. -* -* SYNOPSIS -*/ -void osm_pkey_rcv_process(IN void *context, IN void *data); -/* -* PARAMETERS -* context -* [in] Pointer to an osm_pkey_rcv_t object. -* -* data -* [in] Pointer to the MAD Wrapper containing the MAD -* that contains the node's SLtoVL attribute. -* -* RETURN VALUES -* CL_SUCCESS if the SLtoVL processing was successful. -* -* NOTES -* This function processes a SLtoVL attribute. -* -* SEE ALSO -* P_Key Receiver, P_Key Response Controller -*********/ - -END_C_DECLS -#endif /* _OSM_PKEY_RCV_H_ */ diff --git a/opensm/include/opensm/osm_port_info_rcv.h b/opensm/include/opensm/osm_port_info_rcv.h deleted file mode 100644 index c4c7d96..0000000 --- a/opensm/include/opensm/osm_port_info_rcv.h +++ /dev/null @@ -1,256 +0,0 @@ -/* - * Copyright (c) 2004-2006 Voltaire, Inc. All rights reserved. - * Copyright (c) 2002-2005 Mellanox Technologies LTD. All rights reserved. - * Copyright (c) 1996-2003 Intel Corporation. All rights reserved. - * - * This software is available to you under a choice of one of two - * licenses. 
You may choose to be licensed under the terms of the GNU - * General Public License (GPL) Version 2, available from the file - * COPYING in the main directory of this source tree, or the - * OpenIB.org BSD license below: - * - * Redistribution and use in source and binary forms, with or - * without modification, are permitted provided that the following - * conditions are met: - * - * - Redistributions of source code must retain the above - * copyright notice, this list of conditions and the following - * disclaimer. - * - * - Redistributions in binary form must reproduce the above - * copyright notice, this list of conditions and the following - * disclaimer in the documentation and/or other materials - * provided with the distribution. - * - * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, - * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF - * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND - * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS - * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN - * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN - * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE - * SOFTWARE. - * - */ - -/* - * Abstract: - * Declaration of osm_pi_rcv_t. - * This object represents the PortInfo Receiver object. - * attribute from a node. - * This object is part of the OpenSM family of objects. 
- * - * Environment: - * Linux User Mode - * - * $Revision: 1.4 $ - */ - -#ifndef _OSM_PI_RCV_H_ -#define _OSM_PI_RCV_H_ - -#include -#include -#include -#include -#include -#include -#include - -#ifdef __cplusplus -# define BEGIN_C_DECLS extern "C" { -# define END_C_DECLS } -#else /* !__cplusplus */ -# define BEGIN_C_DECLS -# define END_C_DECLS -#endif /* __cplusplus */ - -BEGIN_C_DECLS -/****h* OpenSM/Port Info Receiver -* NAME -* Port Info Receiver -* -* DESCRIPTION -* The Port Info Receiver object encapsulates the information -* needed to receive the PortInfo attribute from a node. -* -* The Port Info Receiver object is thread safe. -* -* This object should be treated as opaque and should be -* manipulated only through the provided functions. -* -* AUTHOR -* Steve King, Intel -* -*********/ -/****s* OpenSM: Port Info Receiver/osm_pi_rcv_t -* NAME -* osm_pi_rcv_t -* -* DESCRIPTION -* Port Info Receiver structure. -* -* This object should be treated as opaque and should -* be manipulated only through the provided functions. -* -* SYNOPSIS -*/ -typedef struct _osm_pi_rcv { - osm_subn_t *p_subn; - osm_req_t *p_req; - osm_log_t *p_log; - cl_plock_t *p_lock; -} osm_pi_rcv_t; -/* -* FIELDS -* p_subn -* Pointer to the Subnet object for this subnet. -* -* p_req -* Pointer to the generic attribute request object. -* -* p_log -* Pointer to the log object. -* -* p_lock -* Pointer to the serializing lock. -* -* SEE ALSO -* Port Info Receiver object -*********/ - -/****f* OpenSM: Port Info Receiver/osm_pi_rcv_construct -* NAME -* osm_pi_rcv_construct -* -* DESCRIPTION -* This function constructs a Port Info Receiver object. -* -* SYNOPSIS -*/ -void osm_pi_rcv_construct(IN osm_pi_rcv_t * const p_ctrl); -/* -* PARAMETERS -* p_ctrl -* [in] Pointer to a Port Info Receiver object to construct. -* -* RETURN VALUE -* This function does not return a value. 
-* -* NOTES -* Allows calling osm_pi_rcv_destroy -* -* Calling osm_pi_rcv_construct is a prerequisite to calling any other -* method except osm_pi_rcv_init. -* -* SEE ALSO -* Port Info Receiver object, osm_pi_rcv_init, -* osm_pi_rcv_destroy -*********/ - -/****f* OpenSM: Port Info Receiver/osm_pi_rcv_destroy -* NAME -* osm_pi_rcv_destroy -* -* DESCRIPTION -* The osm_pi_rcv_destroy function destroys the object, releasing -* all resources. -* -* SYNOPSIS -*/ -void osm_pi_rcv_destroy(IN osm_pi_rcv_t * const p_ctrl); -/* -* PARAMETERS -* p_ctrl -* [in] Pointer to the object to destroy. -* -* RETURN VALUE -* This function does not return a value. -* -* NOTES -* Performs any necessary cleanup of the specified -* Port Info Receiver object. -* Further operations should not be attempted on the destroyed object. -* This function should only be called after a call to -* osm_pi_rcv_construct or osm_pi_rcv_init. -* -* SEE ALSO -* Port Info Receiver object, osm_pi_rcv_construct, -* osm_pi_rcv_init -*********/ - -/****f* OpenSM: Port Info Receiver/osm_pi_rcv_init -* NAME -* osm_pi_rcv_init -* -* DESCRIPTION -* The osm_pi_rcv_init function initializes a -* Port Info Receiver object for use. -* -* SYNOPSIS -*/ -ib_api_status_t osm_pi_rcv_init(IN osm_pi_rcv_t * const p_ctrl, - IN osm_req_t * const p_req, - IN osm_subn_t * const p_subn, - IN osm_log_t * const p_log, - IN cl_plock_t * const p_lock); -/* -* PARAMETERS -* p_ctrl -* [in] Pointer to an osm_pi_rcv_t object to initialize. -* -* p_req -* [in] Pointer to an osm_req_t object. -* -* p_subn -* [in] Pointer to the Subnet object for this subnet. -* -* p_log -* [in] Pointer to the log object. -* -* p_lock -* [in] Pointer to the OpenSM serializing lock. -* -* RETURN VALUES -* CL_SUCCESS if the Port Info Receiver object was initialized -* successfully. -* -* NOTES -* Allows calling other Port Info Receiver methods. 
-* -* SEE ALSO -* Port Info Receiver object, osm_pi_rcv_construct, -* osm_pi_rcv_destroy -*********/ - -/****f* OpenSM: Port Info Receiver/osm_pi_rcv_process -* NAME -* osm_pi_rcv_process -* -* DESCRIPTION -* Process the PortInfo attribute. -* -* SYNOPSIS -*/ -void osm_pi_rcv_process(IN void *context, IN void *data); -/* -* PARAMETERS -* context -* [in] Pointer to an osm_pi_rcv_t object. -* -* data -* [in] Pointer to the MAD Wrapper containing the MAD -* that contains the node's PortInfo attribute. -* -* RETURN VALUES -* None. -* -* NOTES -* This function processes a PortInfo attribute. -* -* SEE ALSO -* Port Info Receiver, Port Info Response Controller -*********/ - -END_C_DECLS -#endif /* _OSM_PI_RCV_H_ */ diff --git a/opensm/include/opensm/osm_slvl_map_rcv.h b/opensm/include/opensm/osm_slvl_map_rcv.h deleted file mode 100644 index 2476588..0000000 --- a/opensm/include/opensm/osm_slvl_map_rcv.h +++ /dev/null @@ -1,255 +0,0 @@ -/* - * Copyright (c) 2004, 2005 Voltaire, Inc. All rights reserved. - * Copyright (c) 2002-2005 Mellanox Technologies LTD. All rights reserved. - * Copyright (c) 1996-2003 Intel Corporation. All rights reserved. - * - * This software is available to you under a choice of one of two - * licenses. You may choose to be licensed under the terms of the GNU - * General Public License (GPL) Version 2, available from the file - * COPYING in the main directory of this source tree, or the - * OpenIB.org BSD license below: - * - * Redistribution and use in source and binary forms, with or - * without modification, are permitted provided that the following - * conditions are met: - * - * - Redistributions of source code must retain the above - * copyright notice, this list of conditions and the following - * disclaimer. - * - * - Redistributions in binary form must reproduce the above - * copyright notice, this list of conditions and the following - * disclaimer in the documentation and/or other materials - * provided with the distribution. 
- * - * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, - * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF - * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND - * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS - * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN - * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN - * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE - * SOFTWARE. - * - */ - -/* - * Abstract: - * Declaration of osm_slvl_rcv_t. - * This object represents the SLtoVL Map Receiver object. - * This object is part of the OpenSM family of objects. - * - * Environment: - * Linux User Mode - * - * $Revision: 1.3 $ - */ - -#ifndef _OSM_SLVL_RCV_H_ -#define _OSM_SLVL_RCV_H_ - -#include -#include -#include -#include -#include -#include -#include - -#ifdef __cplusplus -# define BEGIN_C_DECLS extern "C" { -# define END_C_DECLS } -#else /* !__cplusplus */ -# define BEGIN_C_DECLS -# define END_C_DECLS -#endif /* __cplusplus */ - -BEGIN_C_DECLS -/****h* OpenSM/Slvl Map Receiver -* NAME -* Slvl Map Receiver -* -* DESCRIPTION -* The Slvl Map Receiver object encapsulates the information -* needed to set or get the SLtoVL map attribute from a port. -* -* The Slvl Map Receiver object is thread safe. -* -* This object should be treated as opaque and should be -* manipulated only through the provided functions. -* -* AUTHOR -* Eitan Zahavi, Mellanox -* -*********/ -/****s* OpenSM: Slvl Map Receiver/osm_slvl_rcv_t -* NAME -* osm_slvl_rcv_t -* -* DESCRIPTION -* Slvl Map Receiver structure. -* -* This object should be treated as opaque and should -* be manipulated only through the provided functions. -* -* SYNOPSIS -*/ -typedef struct _osm_slvl_rcv { - osm_subn_t *p_subn; - osm_req_t *p_req; - osm_log_t *p_log; - cl_plock_t *p_lock; -} osm_slvl_rcv_t; -/* -* FIELDS -* p_subn -* Pointer to the Subnet object for this subnet. 
-* -* p_req -* Pointer to the generic attribute request object. -* -* p_log -* Pointer to the log object. -* -* p_lock -* Pointer to the serializing lock. -* -* SEE ALSO -* Slvl Map Receiver object -*********/ - -/****f* OpenSM: Slvl Map Receiver/osm_slvl_rcv_construct -* NAME -* osm_slvl_rcv_construct -* -* DESCRIPTION -* This function constructs a Slvl Map Receiver object. -* -* SYNOPSIS -*/ -void osm_slvl_rcv_construct(IN osm_slvl_rcv_t * const p_ctrl); -/* -* PARAMETERS -* p_ctrl -* [in] Pointer to a Slvl Map Receiver object to construct. -* -* RETURN VALUE -* This function does not return a value. -* -* NOTES -* Allows calling osm_slvl_rcv_destroy -* -* Calling osm_slvl_rcv_construct is a prerequisite to calling any other -* method except osm_slvl_rcv_init. -* -* SEE ALSO -* Slvl Map Receiver object, osm_slvl_rcv_init, -* osm_slvl_rcv_destroy -*********/ - -/****f* OpenSM: Slvl Map Receiver/osm_slvl_rcv_destroy -* NAME -* osm_slvl_rcv_destroy -* -* DESCRIPTION -* The osm_slvl_rcv_destroy function destroys the object, releasing -* all resources. -* -* SYNOPSIS -*/ -void osm_slvl_rcv_destroy(IN osm_slvl_rcv_t * const p_ctrl); -/* -* PARAMETERS -* p_ctrl -* [in] Pointer to the object to destroy. -* -* RETURN VALUE -* This function does not return a value. -* -* NOTES -* Performs any necessary cleanup of the specified -* Slvl Map Receiver object. -* Further operations should not be attempted on the destroyed object. -* This function should only be called after a call to -* osm_slvl_rcv_construct or osm_slvl_rcv_init. -* -* SEE ALSO -* Slvl Map Receiver object, osm_slvl_rcv_construct, -* osm_slvl_rcv_init -*********/ - -/****f* OpenSM: Slvl Map Receiver/osm_slvl_rcv_init -* NAME -* osm_slvl_rcv_init -* -* DESCRIPTION -* The osm_slvl_rcv_init function initializes a -* Slvl Map Receiver object for use. 
-* -* SYNOPSIS -*/ -ib_api_status_t osm_slvl_rcv_init(IN osm_slvl_rcv_t * const p_ctrl, - IN osm_req_t * const p_req, - IN osm_subn_t * const p_subn, - IN osm_log_t * const p_log, - IN cl_plock_t * const p_lock); -/* -* PARAMETERS -* p_ctrl -* [in] Pointer to an osm_slvl_rcv_t object to initialize. -* -* p_req -* [in] Pointer to an osm_req_t object. -* -* p_subn -* [in] Pointer to the Subnet object for this subnet. -* -* p_log -* [in] Pointer to the log object. -* -* p_lock -* [in] Pointer to the OpenSM serializing lock. -* -* RETURN VALUES -* CL_SUCCESS if the Slvl Map Receiver object was initialized -* successfully. -* -* NOTES -* Allows calling other Slvl Map Receiver methods. -* -* SEE ALSO -* Slvl Map Receiver object, osm_slvl_rcv_construct, -* osm_slvl_rcv_destroy -*********/ - -/****f* OpenSM: Slvl Map Receiver/osm_slvl_rcv_process -* NAME -* osm_slvl_rcv_process -* -* DESCRIPTION -* Process the SLtoVL map attribute. -* -* SYNOPSIS -*/ -void osm_slvl_rcv_process(IN void *context, IN void *p_data); -/* -* PARAMETERS -* context -* [in] Pointer to an osm_slvl_rcv_t object. -* -* p_data -* [in] Pointer to the MAD Wrapper containing the MAD -* that contains the node's SLtoVL attribute. -* -* RETURN VALUES -* CL_SUCCESS if the SLtoVL processing was successful. -* -* NOTES -* This function processes a SLtoVL attribute. -* -* SEE ALSO -* Slvl Map Receiver, Slvl Map Response Controller -*********/ - -END_C_DECLS -#endif /* _OSM_SLVL_RCV_H_ */ diff --git a/opensm/include/opensm/osm_sminfo_rcv.h b/opensm/include/opensm/osm_sminfo_rcv.h deleted file mode 100644 index 9bbe7f8..0000000 --- a/opensm/include/opensm/osm_sminfo_rcv.h +++ /dev/null @@ -1,273 +0,0 @@ -/* - * Copyright (c) 2004, 2005 Voltaire, Inc. All rights reserved. - * Copyright (c) 2002-2005 Mellanox Technologies LTD. All rights reserved. - * Copyright (c) 1996-2003 Intel Corporation. All rights reserved. - * - * This software is available to you under a choice of one of two - * licenses. 
You may choose to be licensed under the terms of the GNU - * General Public License (GPL) Version 2, available from the file - * COPYING in the main directory of this source tree, or the - * OpenIB.org BSD license below: - * - * Redistribution and use in source and binary forms, with or - * without modification, are permitted provided that the following - * conditions are met: - * - * - Redistributions of source code must retain the above - * copyright notice, this list of conditions and the following - * disclaimer. - * - * - Redistributions in binary form must reproduce the above - * copyright notice, this list of conditions and the following - * disclaimer in the documentation and/or other materials - * provided with the distribution. - * - * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, - * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF - * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND - * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS - * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN - * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN - * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE - * SOFTWARE. - * - */ - -/* - * Abstract: - * Declaration of osm_sminfo_rcv_t. - * This object represents the SMInfo Receiver object. - * This object is part of the OpenSM family of objects. 
- * - * Environment: - * Linux User Mode - * - * $Revision: 1.4 $ - */ - -#ifndef _OSM_SMINFO_RCV_H_ -#define _OSM_SMINFO_RCV_H_ - -#include -#include -#include -#include -#include -#include -#include -#include - -#ifdef __cplusplus -# define BEGIN_C_DECLS extern "C" { -# define END_C_DECLS } -#else /* !__cplusplus */ -# define BEGIN_C_DECLS -# define END_C_DECLS -#endif /* __cplusplus */ - -BEGIN_C_DECLS -/****h* OpenSM/SMInfo Receiver -* NAME -* SMInfo Receiver -* -* DESCRIPTION -* The SMInfo Receiver object encapsulates the information -* needed to receive the SMInfo attribute from a node. -* -* The SMInfo Receiver object is thread safe. -* -* This object should be treated as opaque and should be -* manipulated only through the provided functions. -* -* AUTHOR -* Steve King, Intel -* -*********/ -/****s* OpenSM: SMInfo Receiver/osm_sminfo_rcv_t -* NAME -* osm_sminfo_rcv_t -* -* DESCRIPTION -* SMInfo Receiver structure. -* -* This object should be treated as opaque and should -* be manipulated only through the provided functions. -* -* SYNOPSIS -*/ -typedef struct _osm_sminfo_rcv { - osm_subn_t *p_subn; - osm_stats_t *p_stats; - osm_log_t *p_log; - osm_resp_t *p_resp; - struct _osm_sm_state_mgr *p_sm_state_mgr; - cl_plock_t *p_lock; -} osm_sminfo_rcv_t; -/* -* FIELDS -* p_subn -* Pointer to the Subnet object for this subnet. -* -* p_stats -* Pointer to the OpenSM statistics block. -* -* p_log -* Pointer to the log object. -* -* p_resp -* Pointer to the generic MAD responder object. -* -* p_sm_state_mgr -* Pointer to the SM State Manager object. -* -* p_lock -* Pointer to the serializing lock. -* -* SEE ALSO -* SMInfo Receiver object -*********/ - -/****f* OpenSM: SMInfo Receiver/osm_sminfo_rcv_construct -* NAME -* osm_sminfo_rcv_construct -* -* DESCRIPTION -* This function constructs a SMInfo Receiver object. 
-* -* SYNOPSIS -*/ -void osm_sminfo_rcv_construct(IN osm_sminfo_rcv_t * const p_rcv); -/* -* PARAMETERS -* p_rcv -* [in] Pointer to a SMInfo Receiver object to construct. -* -* RETURN VALUE -* This function does not return a value. -* -* NOTES -* Allows calling osm_sminfo_rcv_init, osm_sminfo_rcv_destroy -* -* Calling osm_sminfo_rcv_construct is a prerequisite to calling any other -* method except osm_sminfo_rcv_init. -* -* SEE ALSO -* SMInfo Receiver object, osm_sminfo_rcv_init, -* osm_sminfo_rcv_destroy -*********/ - -/****f* OpenSM: SMInfo Receiver/osm_sminfo_rcv_destroy -* NAME -* osm_sminfo_rcv_destroy -* -* DESCRIPTION -* The osm_sminfo_rcv_destroy function destroys the object, releasing -* all resources. -* -* SYNOPSIS -*/ -void osm_sminfo_rcv_destroy(IN osm_sminfo_rcv_t * const p_rcv); -/* -* PARAMETERS -* p_rcv -* [in] Pointer to the object to destroy. -* -* RETURN VALUE -* This function does not return a value. -* -* NOTES -* Performs any necessary cleanup of the specified -* SMInfo Receiver object. -* Further operations should not be attempted on the destroyed object. -* This function should only be called after a call to -* osm_sminfo_rcv_construct or osm_sminfo_rcv_init. -* -* SEE ALSO -* SMInfo Receiver object, osm_sminfo_rcv_construct, -* osm_sminfo_rcv_init -*********/ - -/****f* OpenSM: SMInfo Receiver/osm_sminfo_rcv_init -* NAME -* osm_sminfo_rcv_init -* -* DESCRIPTION -* The osm_sminfo_rcv_init function initializes a -* SMInfo Receiver object for use. -* -* SYNOPSIS -*/ -ib_api_status_t osm_sminfo_rcv_init(IN osm_sminfo_rcv_t * const p_rcv, - IN osm_subn_t * const p_subn, - IN osm_stats_t * const p_stats, - IN osm_resp_t * const p_resp, - IN osm_log_t * const p_log, - IN struct _osm_sm_state_mgr *const - p_sm_state_mgr, - IN cl_plock_t * const p_lock); -/* -* PARAMETERS -* p_rcv -* [in] Pointer to an osm_sminfo_rcv_t object to initialize. -* -* p_subn -* [in] Pointer to the Subnet object for this subnet. 
-* -* p_stats -* [in] Pointer to the OpenSM statistics block. -* -* p_resp -* [in] Pointer to the generic MAD Responder object. -* -* p_log -* [in] Pointer to the log object. -* -* p_sm_state_mgr -* [in] Pointer to the SM State Manager object. -* -* p_lock -* [in] Pointer to the OpenSM serializing lock. -* -* RETURN VALUES -* IB_SUCCESS if the SMInfo Receiver object was initialized -* successfully. -* -* NOTES -* Allows calling other SMInfo Receiver methods. -* -* SEE ALSO -* SMInfo Receiver object, osm_sminfo_rcv_construct, -* osm_sminfo_rcv_destroy -*********/ - -/****f* OpenSM: SMInfo Receiver/osm_sminfo_rcv_process -* NAME -* osm_sminfo_rcv_process -* -* DESCRIPTION -* Process the SMInfo attribute. -* -* SYNOPSIS -*/ -void osm_sminfo_rcv_process(IN void *context, IN void *data); -/* -* PARAMETERS -* context -* [in] Pointer to an osm_sminfo_rcv_t object. -* -* data -* [in] Pointer to the MAD Wrapper containing the MAD -* that contains the node's SMInfo attribute. -* -* RETURN VALUES -* IB_SUCCESS if the SMInfo processing was successful. -* -* NOTES -* This function processes a SMInfo attribute. -* -* SEE ALSO -* SMInfo Receiver, SMInfo Response Controller -*********/ - -END_C_DECLS -#endif /* _OSM_SMINFO_RCV_H_ */ diff --git a/opensm/include/opensm/osm_sw_info_rcv.h b/opensm/include/opensm/osm_sw_info_rcv.h deleted file mode 100644 index 8c4ce3e..0000000 --- a/opensm/include/opensm/osm_sw_info_rcv.h +++ /dev/null @@ -1,254 +0,0 @@ -/* - * Copyright (c) 2004, 2005 Voltaire, Inc. All rights reserved. - * Copyright (c) 2002-2005 Mellanox Technologies LTD. All rights reserved. - * Copyright (c) 1996-2003 Intel Corporation. All rights reserved. - * - * This software is available to you under a choice of one of two - * licenses. 
You may choose to be licensed under the terms of the GNU - * General Public License (GPL) Version 2, available from the file - * COPYING in the main directory of this source tree, or the - * OpenIB.org BSD license below: - * - * Redistribution and use in source and binary forms, with or - * without modification, are permitted provided that the following - * conditions are met: - * - * - Redistributions of source code must retain the above - * copyright notice, this list of conditions and the following - * disclaimer. - * - * - Redistributions in binary form must reproduce the above - * copyright notice, this list of conditions and the following - * disclaimer in the documentation and/or other materials - * provided with the distribution. - * - * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, - * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF - * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND - * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS - * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN - * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN - * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE - * SOFTWARE. - * - */ - -/* - * Abstract: - * Declaration of osm_si_rcv_t. - * This object represents the SwitchInfo Receiver object. - * attribute from a node. - * This object is part of the OpenSM family of objects. 
- * - * Environment: - * Linux User Mode - * - * $Revision: 1.4 $ - */ - -#ifndef _OSM_SI_RCV_H_ -#define _OSM_SI_RCV_H_ - -#include -#include -#include -#include -#include -#include - -#ifdef __cplusplus -# define BEGIN_C_DECLS extern "C" { -# define END_C_DECLS } -#else /* !__cplusplus */ -# define BEGIN_C_DECLS -# define END_C_DECLS -#endif /* __cplusplus */ - -BEGIN_C_DECLS -/****h* OpenSM/Switch Info Receiver -* NAME -* Switch Info Receiver -* -* DESCRIPTION -* The Switch Info Receiver object encapsulates the information -* needed to receive the SwitchInfo attribute from a node. -* -* The Switch Info Receiver object is thread safe. -* -* This object should be treated as opaque and should be -* manipulated only through the provided functions. -* -* AUTHOR -* Steve King, Intel -* -*********/ -/****s* OpenSM: Switch Info Receiver/osm_si_rcv_t -* NAME -* osm_si_rcv_t -* -* DESCRIPTION -* Switch Info Receiver structure. -* -* This object should be treated as opaque and should -* be manipulated only through the provided functions. -* -* SYNOPSIS -*/ -typedef struct _osm_si_rcv { - osm_subn_t *p_subn; - osm_log_t *p_log; - osm_req_t *p_req; - cl_plock_t *p_lock; -} osm_si_rcv_t; -/* -* FIELDS -* p_subn -* Pointer to the Subnet object for this subnet. -* -* p_log -* Pointer to the log object. -* -* p_req -* Pointer to the Request object. -* -* p_lock -* Pointer to the serializing lock. -* -* SEE ALSO -* Switch Info Receiver object -*********/ - -/****f* OpenSM: Switch Info Receiver/osm_si_rcv_construct -* NAME -* osm_si_rcv_construct -* -* DESCRIPTION -* This function constructs a Switch Info Receiver object. -* -* SYNOPSIS -*/ -void osm_si_rcv_construct(IN osm_si_rcv_t * const p_ctrl); -/* -* PARAMETERS -* p_ctrl -* [in] Pointer to a Switch Info Receiver object to construct. -* -* RETURN VALUE -* This function does not return a value. -* -* NOTES -* Allows calling osm_si_rcv_destroy. 
-* -* Calling osm_si_rcv_construct is a prerequisite to calling any other -* method except osm_si_rcv_init. -* -* SEE ALSO -* Switch Info Receiver object, osm_si_rcv_init, osm_si_rcv_destroy -*********/ - -/****f* OpenSM: Switch Info Receiver/osm_si_rcv_destroy -* NAME -* osm_si_rcv_destroy -* -* DESCRIPTION -* The osm_si_rcv_destroy function destroys the object, releasing -* all resources. -* -* SYNOPSIS -*/ -void osm_si_rcv_destroy(IN osm_si_rcv_t * const p_ctrl); -/* -* PARAMETERS -* p_ctrl -* [in] Pointer to the object to destroy. -* -* RETURN VALUE -* This function does not return a value. -* -* NOTES -* Performs any necessary cleanup of the specified -* Switch Info Receiver object. -* Further operations should not be attempted on the destroyed object. -* This function should only be called after a call to -* osm_si_rcv_construct or osm_si_rcv_init. -* -* SEE ALSO -* Switch Info Receiver object, osm_si_rcv_construct, -* osm_si_rcv_init -*********/ - -/****f* OpenSM: Switch Info Receiver/osm_si_rcv_init -* NAME -* osm_si_rcv_init -* -* DESCRIPTION -* The osm_si_rcv_init function initializes a -* Switch Info Receiver object for use. -* -* SYNOPSIS -*/ -ib_api_status_t osm_si_rcv_init(IN osm_si_rcv_t * const p_ctrl, - IN osm_subn_t * const p_subn, - IN osm_log_t * const p_log, - IN osm_req_t * const p_req, - IN cl_plock_t * const p_lock); -/* -* PARAMETERS -* p_ctrl -* [in] Pointer to an osm_si_rcv_t object to initialize. -* -* p_subn -* [in] Pointer to the Subnet object for this subnet. -* -* p_log -* [in] Pointer to the log object. -* -* p_req -* [in] Pointer to an osm_req_t object. -* -* p_lock -* [in] Pointer to the OpenSM serializing lock. -* -* RETURN VALUES -* IB_SUCCESS if the Switch Info Receiver object was initialized -* successfully. -* -* NOTES -* Allows calling other Switch Info Receiver methods. 
-* -* SEE ALSO -* Switch Info Receiver object, osm_si_rcv_construct, -* osm_si_rcv_destroy -*********/ - -/****f* OpenSM: Switch Info Receiver/osm_si_rcv_process -* NAME -* osm_si_rcv_process -* -* DESCRIPTION -* Process the SwitchInfo attribute. -* -* SYNOPSIS -*/ -void osm_si_rcv_process(IN void *context, IN void *data); -/* -* PARAMETERS -* context -* [in] Pointer to an osm_si_rcv_t object. -* -* data -* [in] Pointer to the MAD Wrapper containing the MAD -* that contains the node's SwitchInfo attribute. -* -* RETURN VALUES -* CL_SUCCESS if the SwitchInfo processing was successful. -* -* NOTES -* This function processes a SwitchInfo attribute. -* -* SEE ALSO -* Switch Info Receiver, Switch Info Response Controller -*********/ - -END_C_DECLS -#endif /* _OSM_SI_RCV_H_ */ diff --git a/opensm/include/opensm/osm_trap_rcv.h b/opensm/include/opensm/osm_trap_rcv.h deleted file mode 100644 index e06f6b3..0000000 --- a/opensm/include/opensm/osm_trap_rcv.h +++ /dev/null @@ -1,306 +0,0 @@ -/* - * Copyright (c) 2004, 2005 Voltaire, Inc. All rights reserved. - * Copyright (c) 2002-2005 Mellanox Technologies LTD. All rights reserved. - * Copyright (c) 1996-2003 Intel Corporation. All rights reserved. - * - * This software is available to you under a choice of one of two - * licenses. You may choose to be licensed under the terms of the GNU - * General Public License (GPL) Version 2, available from the file - * COPYING in the main directory of this source tree, or the - * OpenIB.org BSD license below: - * - * Redistribution and use in source and binary forms, with or - * without modification, are permitted provided that the following - * conditions are met: - * - * - Redistributions of source code must retain the above - * copyright notice, this list of conditions and the following - * disclaimer. 
- * - * - Redistributions in binary form must reproduce the above - * copyright notice, this list of conditions and the following - * disclaimer in the documentation and/or other materials - * provided with the distribution. - * - * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, - * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF - * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND - * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS - * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN - * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN - * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE - * SOFTWARE. - * - */ - -/* - * Abstract: - * Declaration of osm_trap_rcv_t. - * This object represents the Trap Receiver object. - * This object is part of the OpenSM family of objects. - * - * Environment: - * Linux User Mode - * - * $Revision: 1.3 $ - */ - -#ifndef _OSM_TRAP_RCV_H_ -#define _OSM_TRAP_RCV_H_ - -#include -#include -#include -#include -#include -#include -#include -#include -#include - -#ifdef __cplusplus -# define BEGIN_C_DECLS extern "C" { -# define END_C_DECLS } -#else /* !__cplusplus */ -# define BEGIN_C_DECLS -# define END_C_DECLS -#endif /* __cplusplus */ - -BEGIN_C_DECLS -/****h* OpenSM/Trap Receiver -* NAME -* Trap Receiver -* -* DESCRIPTION -* The Trap Receiver object encapsulates the information -* needed to receive the Trap attribute from a node. -* -* The Trap Receiver object is thread safe. -* -* This object should be treated as opaque and should be -* manipulated only through the provided functions. -* -* AUTHOR -* Steve King, Intel -* -*********/ -/****s* OpenSM: Trap Receiver/osm_trap_rcv_t -* NAME -* osm_trap_rcv_t -* -* DESCRIPTION -* Trap Receiver structure. -* -* This object should be treated as opaque and should -* be manipulated only through the provided functions. 
-* -* SYNOPSIS -*/ -typedef struct _osm_trap_rcv { - osm_subn_t *p_subn; - osm_stats_t *p_stats; - osm_log_t *p_log; - osm_resp_t *p_resp; - cl_plock_t *p_lock; - cl_event_wheel_t trap_aging_tracker; -} osm_trap_rcv_t; -/* -* FIELDS -* p_subn -* Pointer to the Subnet object for this subnet. -* -* p_stats -* Pointer to the OpenSM statistics block. -* -* p_log -* Pointer to the log object. -* -* p_resp -* Pointer to the generic MAD responder object. -* -* p_lock -* Pointer to the serializing lock. -* -* trap_aging_tracker -* An event wheel tracking erceived traps and their aging. -* Basically we can start a timer every time we receive a specific -* trap and check to seee if not expired next time it is received. -* -* SEE ALSO -* Trap Receiver object -*********/ - -/****f* OpenSM: Trap Receiver/osm_trap_rcv_construct -* NAME -* osm_trap_rcv_construct -* -* DESCRIPTION -* This function constructs a Trap Receiver object. -* -* SYNOPSIS -*/ -void osm_trap_rcv_construct(IN osm_trap_rcv_t * const p_rcv); -/* -* PARAMETERS -* p_rcv -* [in] Pointer to a Trap Receiver object to construct. -* -* RETURN VALUE -* This function does not return a value. -* -* NOTES -* Allows calling osm_trap_rcv_init, osm_trap_rcv_destroy -* -* Calling osm_trap_rcv_construct is a prerequisite to calling any other -* method except osm_trap_rcv_init. -* -* SEE ALSO -* Trap Receiver object, osm_trap_rcv_init, -* osm_trap_rcv_destroy -*********/ - -/****f* OpenSM: Trap Receiver/osm_trap_rcv_destroy -* NAME -* osm_trap_rcv_destroy -* -* DESCRIPTION -* The osm_trap_rcv_destroy function destroys the object, releasing -* all resources. -* -* SYNOPSIS -*/ -void osm_trap_rcv_destroy(IN osm_trap_rcv_t * const p_rcv); -/* -* PARAMETERS -* p_rcv -* [in] Pointer to the object to destroy. -* -* RETURN VALUE -* This function does not return a value. -* -* NOTES -* Performs any necessary cleanup of the specified -* Trap Receiver object. -* Further operations should not be attempted on the destroyed object. 
-* This function should only be called after a call to -* osm_trap_rcv_construct or osm_trap_rcv_init. -* -* SEE ALSO -* Trap Receiver object, osm_trap_rcv_construct, -* osm_trap_rcv_init -*********/ - -/****f* OpenSM: Trap Receiver/osm_trap_rcv_init -* NAME -* osm_trap_rcv_init -* -* DESCRIPTION -* The osm_trap_rcv_init function initializes a -* Trap Receiver object for use. -* -* SYNOPSIS -*/ -ib_api_status_t osm_trap_rcv_init(IN osm_trap_rcv_t * const p_rcv, - IN osm_subn_t * const p_subn, - IN osm_stats_t * const p_stats, - IN osm_resp_t * const p_resp, - IN osm_log_t * const p_log, - IN cl_plock_t * const p_lock); -/* -* PARAMETERS -* p_rcv -* [in] Pointer to an osm_trap_rcv_t object to initialize. -* -* p_subn -* [in] Pointer to the Subnet object for this subnet. -* -* p_stats -* [in] Pointer to the OpenSM statistics block. -* -* p_resp -* [in] Pointer to the generic MAD Responder object. -* -* p_log -* [in] Pointer to the log object. -* -* p_lock -* [in] Pointer to the OpenSM serializing lock. -* -* RETURN VALUES -* IB_SUCCESS if the Trap Receiver object was initialized -* successfully. -* -* NOTES -* Allows calling other Trap Receiver methods. -* -* SEE ALSO -* Trap Receiver object, osm_trap_rcv_construct, -* osm_trap_rcv_destroy -*********/ - -/****f* OpenSM: Trap Receiver/osm_trap_rcv_process -* NAME -* osm_trap_rcv_process -* -* DESCRIPTION -* Process the Trap attribute. -* -* SYNOPSIS -*/ -void osm_trap_rcv_process(IN void *context, IN void *data); -/* -* PARAMETERS -* context -* [in] Pointer to an osm_trap_rcv_t object. -* -* data -* [in] Pointer to the MAD Wrapper containing the MAD -* that contains the node's Trap attribute. -* -* RETURN VALUES -* IB_SUCCESS if the Trap processing was successful. -* -* NOTES -* This function processes a Trap attribute. 
-* -* SEE ALSO -* Trap Receiver, Trap Response Controller -*********/ - -/****f* OpenSM: Trap Receiver/osm_trap_rcv_aging_tracker_callback -* NAME -* osm_trap_rcv_aging_tracker_callback -* -* DESCRIPTION -* Callback function called by the aging tracker mechanism. -* -* SYNOPSIS -*/ -uint64_t -osm_trap_rcv_aging_tracker_callback(IN uint64_t key, - IN uint32_t num_regs, IN void *context); - -/* -* PARAMETERS -* key -* [in] The key by which the event was inserted. -* -* num_regs -* [in] The number of times the same event (key) was registered. -* -* context -* [in] Pointer to the context given in the registering of the event. -* -* RETURN VALUES -* None. -* -* NOTES -* This function is called by the cl_event_wheel when the aging tracker -* event has ended. -* -* SEE ALSO -* Trap Receiver, Trap Response Controller -*********/ - -END_C_DECLS -#endif /* _OSM_TRAP_RCV_H_ */ diff --git a/opensm/include/opensm/osm_vl_arb_rcv.h b/opensm/include/opensm/osm_vl_arb_rcv.h deleted file mode 100644 index 2ca91f6..0000000 --- a/opensm/include/opensm/osm_vl_arb_rcv.h +++ /dev/null @@ -1,255 +0,0 @@ -/* - * Copyright (c) 2004, 2005 Voltaire, Inc. All rights reserved. - * Copyright (c) 2002-2005 Mellanox Technologies LTD. All rights reserved. - * Copyright (c) 1996-2003 Intel Corporation. All rights reserved. - * - * This software is available to you under a choice of one of two - * licenses. You may choose to be licensed under the terms of the GNU - * General Public License (GPL) Version 2, available from the file - * COPYING in the main directory of this source tree, or the - * OpenIB.org BSD license below: - * - * Redistribution and use in source and binary forms, with or - * without modification, are permitted provided that the following - * conditions are met: - * - * - Redistributions of source code must retain the above - * copyright notice, this list of conditions and the following - * disclaimer. 
- * - * - Redistributions in binary form must reproduce the above - * copyright notice, this list of conditions and the following - * disclaimer in the documentation and/or other materials - * provided with the distribution. - * - * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, - * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF - * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND - * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS - * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN - * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN - * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE - * SOFTWARE. - * - */ - -/* - * Abstract: - * Declaration of osm_vla_rcv_t. - * This object represents the VL Arbitration Receiver object. - * This object is part of the OpenSM family of objects. - * - * Environment: - * Linux User Mode - * - * $Revision: 1.3 $ - */ - -#ifndef _OSM_VLA_RCV_H_ -#define _OSM_VLA_RCV_H_ - -#include -#include -#include -#include -#include -#include -#include - -#ifdef __cplusplus -# define BEGIN_C_DECLS extern "C" { -# define END_C_DECLS } -#else /* !__cplusplus */ -# define BEGIN_C_DECLS -# define END_C_DECLS -#endif /* __cplusplus */ - -BEGIN_C_DECLS -/****h* OpenSM/VL Arbitration Receiver -* NAME -* VL Arbitration Receiver -* -* DESCRIPTION -* The VL Arbitration Receiver object encapsulates the information -* needed to set or get the vl arbitration attribute from a port. -* -* The VL Arbitration Receiver object is thread safe. -* -* This object should be treated as opaque and should be -* manipulated only through the provided functions. -* -* AUTHOR -* Eitan Zahavi, Mellanox -* -*********/ -/****s* OpenSM: VL Arbitration Receiver/osm_vla_rcv_t -* NAME -* osm_vla_rcv_t -* -* DESCRIPTION -* VL Arbitration Receiver structure. -* -* This object should be treated as opaque and should -* be manipulated only through the provided functions. 
-* -* SYNOPSIS -*/ -typedef struct _osm_vla_rcv { - osm_subn_t *p_subn; - osm_req_t *p_req; - osm_log_t *p_log; - cl_plock_t *p_lock; -} osm_vla_rcv_t; -/* -* FIELDS -* p_subn -* Pointer to the Subnet object for this subnet. -* -* p_req -* Pointer to the generic attribute request object. -* -* p_log -* Pointer to the log object. -* -* p_lock -* Pointer to the serializing lock. -* -* SEE ALSO -* VL Arbitration Receiver object -*********/ - -/****f* OpenSM: VL Arbitration Receiver/osm_vla_rcv_construct -* NAME -* osm_vla_rcv_construct -* -* DESCRIPTION -* This function constructs a VL Arbitration Receiver object. -* -* SYNOPSIS -*/ -void osm_vla_rcv_construct(IN osm_vla_rcv_t * const p_ctrl); -/* -* PARAMETERS -* p_ctrl -* [in] Pointer to a VL Arbitration Receiver object to construct. -* -* RETURN VALUE -* This function does not return a value. -* -* NOTES -* Allows calling osm_vla_rcv_destroy -* -* Calling osm_vla_rcv_construct is a prerequisite to calling any other -* method except osm_vla_rcv_init. -* -* SEE ALSO -* VL Arbitration Receiver object, osm_vla_rcv_init, -* osm_vla_rcv_destroy -*********/ - -/****f* OpenSM: VL Arbitration Receiver/osm_vla_rcv_destroy -* NAME -* osm_vla_rcv_destroy -* -* DESCRIPTION -* The osm_vla_rcv_destroy function destroys the object, releasing -* all resources. -* -* SYNOPSIS -*/ -void osm_vla_rcv_destroy(IN osm_vla_rcv_t * const p_ctrl); -/* -* PARAMETERS -* p_ctrl -* [in] Pointer to the object to destroy. -* -* RETURN VALUE -* This function does not return a value. -* -* NOTES -* Performs any necessary cleanup of the specified -* VL Arbitration Receiver object. -* Further operations should not be attempted on the destroyed object. -* This function should only be called after a call to -* osm_vla_rcv_construct or osm_vla_rcv_init. 
-* -* SEE ALSO -* VL Arbitration Receiver object, osm_vla_rcv_construct, -* osm_vla_rcv_init -*********/ - -/****f* OpenSM: VL Arbitration Receiver/osm_vla_rcv_init -* NAME -* osm_vla_rcv_init -* -* DESCRIPTION -* The osm_vla_rcv_init function initializes a -* VL Arbitration Receiver object for use. -* -* SYNOPSIS -*/ -ib_api_status_t osm_vla_rcv_init(IN osm_vla_rcv_t * const p_ctrl, - IN osm_req_t * const p_req, - IN osm_subn_t * const p_subn, - IN osm_log_t * const p_log, - IN cl_plock_t * const p_lock); -/* -* PARAMETERS -* p_ctrl -* [in] Pointer to an osm_vla_rcv_t object to initialize. -* -* p_req -* [in] Pointer to an osm_req_t object. -* -* p_subn -* [in] Pointer to the Subnet object for this subnet. -* -* p_log -* [in] Pointer to the log object. -* -* p_lock -* [in] Pointer to the OpenSM serializing lock. -* -* RETURN VALUES -* CL_SUCCESS if the VL Arbitration Receiver object was initialized -* successfully. -* -* NOTES -* Allows calling other VL Arbitration Receiver methods. -* -* SEE ALSO -* VL Arbitration Receiver object, osm_vla_rcv_construct, -* osm_vla_rcv_destroy -*********/ - -/****f* OpenSM: VL Arbitration Receiver/osm_vla_rcv_process -* NAME -* osm_vla_rcv_process -* -* DESCRIPTION -* Process the vl arbitration attribute. -* -* SYNOPSIS -*/ -void osm_vla_rcv_process(IN void *context, IN void *data); -/* -* PARAMETERS -* context -* [in] Pointer to an osm_vla_rcv_t object. -* -* data -* [in] Pointer to the MAD Wrapper containing the MAD -* that contains the node's SLtoVL attribute. -* -* RETURN VALUES -* CL_SUCCESS if the SLtoVL processing was successful. -* -* NOTES -* This function processes a SLtoVL attribute. 
-* -* SEE ALSO -* VL Arbitration Receiver, VL Arbitration Response Controller -*********/ - -END_C_DECLS -#endif /* _OSM_VLA_RCV_H_ */
-- 1.5.3.4.206.g58ba4

From sashak at voltaire.com Thu Jan 3 02:01:14 2008
From: sashak at voltaire.com (Sasha Khapyorsky)
Date: Thu, 3 Jan 2008 10:01:14 +0000
Subject: [ofa-general] [PATCH 3/4] opensm: cleanup SA related osm_*_rcv_t objects
In-Reply-To: <11993544753970-git-send-email-sashak@voltaire.com>
References: <11993544753970-git-send-email-sashak@voltaire.com>
Message-ID: <11993544762885-git-send-email-sashak@voltaire.com>

This removes the SA related dummy *_rcv_t objects, eliminates data duplication, simplifies flows, etc. A reference to osm_sa_t is used instead of the original objects.

Signed-off-by: Sasha Khapyorsky
---
 opensm/include/opensm/osm_inform.h | 8 +- opensm/include/opensm/osm_qos_policy.h | 4 +- opensm/include/opensm/osm_sa.h | 136 +++--- opensm/include/opensm/osm_sa_class_port_info.h | 1 - opensm/include/opensm/osm_sa_mad_ctrl.h | 17 +- opensm/include/opensm/osm_sa_mcmember_record.h | 50 --- opensm/include/opensm/osm_sa_response.h | 36 -- opensm/include/opensm/osm_sa_service_record.h | 2 - opensm/opensm/osm_inform.c | 13 +- opensm/opensm/osm_prtn.c | 8 +- opensm/opensm/osm_sa.c | 225 +++-------- opensm/opensm/osm_sa_class_port_info.c | 82 +--- opensm/opensm/osm_sa_guidinfo_record.c | 133 ++---- opensm/opensm/osm_sa_informinfo.c | 254 +++++------- opensm/opensm/osm_sa_lft_record.c | 127 ++---- opensm/opensm/osm_sa_link_record.c | 168 +++----- opensm/opensm/osm_sa_mad_ctrl.c | 8 +- opensm/opensm/osm_sa_mcmember_record.c | 539 +++++++++++------------- opensm/opensm/osm_sa_mft_record.c | 127 ++---- opensm/opensm/osm_sa_multipath_record.c | 306 ++++++-------- opensm/opensm/osm_sa_node_record.c | 137 +++---- opensm/opensm/osm_sa_path_record.c | 386 ++++++++---------- opensm/opensm/osm_sa_pkey_record.c | 149 +++---- opensm/opensm/osm_sa_portinfo_record.c | 157 +++----- opensm/opensm/osm_sa_response.c | 62 +---
opensm/opensm/osm_sa_service_record.c | 271 +++++------- opensm/opensm/osm_sa_slvl_record.c | 131 ++---- opensm/opensm/osm_sa_sminfo_record.c | 192 ++++------ opensm/opensm/osm_sa_sw_info_record.c | 174 +++----- opensm/opensm/osm_sa_vlarb_record.c | 167 +++----- 30 files changed, 1566 insertions(+), 2504 deletions(-) diff --git a/opensm/include/opensm/osm_inform.h b/opensm/include/opensm/osm_inform.h index 0ec6a1b..5da513e 100644 --- a/opensm/include/opensm/osm_inform.h +++ b/opensm/include/opensm/osm_inform.h @@ -57,7 +57,7 @@ #include #include #include -#include +#include #ifdef __cplusplus # define BEGIN_C_DECLS extern "C" { @@ -102,7 +102,7 @@ BEGIN_C_DECLS typedef struct _osm_infr_t { cl_list_item_t list_item; osm_bind_handle_t h_bind; - osm_infr_rcv_t *p_infr_rcv; + osm_sa_t *sa; osm_mad_addr_t report_addr; ib_inform_info_record_t inform_record; } osm_infr_t; @@ -114,8 +114,8 @@ typedef struct _osm_infr_t { * h_bind * A handle of lower level mad srvc * -* p_infr_rcv -* The receiver of inform_info's +* sa +* A pointer to osm_sa object * * report_addr * Report address diff --git a/opensm/include/opensm/osm_qos_policy.h b/opensm/include/opensm/osm_qos_policy.h index d61c269..82b6258 100644 --- a/opensm/include/opensm/osm_qos_policy.h +++ b/opensm/include/opensm/osm_qos_policy.h @@ -1,5 +1,5 @@ /* - * Copyright (c) 2004-2006 Voltaire, Inc. All rights reserved. + * Copyright (c) 2004-2007 Voltaire, Inc. All rights reserved. * Copyright (c) 2002-2005 Mellanox Technologies LTD. All rights reserved. * Copyright (c) 1996-2003 Intel Corporation. All rights reserved. * @@ -52,8 +52,6 @@ #include #include #include -#include -#include #define YYSTYPE char * #define OSM_QOS_POLICY_MAX_PORTS_ON_SWITCH 128 diff --git a/opensm/include/opensm/osm_sa.h b/opensm/include/opensm/osm_sa.h index a945833..82ca1dc 100644 --- a/opensm/include/opensm/osm_sa.h +++ b/opensm/include/opensm/osm_sa.h @@ -1,5 +1,5 @@ /* - * Copyright (c) 2004-2006 Voltaire, Inc. All rights reserved. 
+ * Copyright (c) 2004-2007 Voltaire, Inc. All rights reserved. * Copyright (c) 2002-2005 Mellanox Technologies LTD. All rights reserved. * Copyright (c) 1996-2003 Intel Corporation. All rights reserved. * @@ -58,26 +58,9 @@ #include #include #include -#include #include -#include -#include -#include -#include -#include -#include -#include #include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include +#include #ifdef __cplusplus # define BEGIN_C_DECLS extern "C" { @@ -136,6 +119,7 @@ typedef enum _osm_sa_state { */ typedef struct _osm_sa { osm_sa_state_t state; + osm_sm_t *sm; osm_subn_t *p_subn; osm_vendor_t *p_vendor; osm_log_t *p_log; @@ -144,41 +128,8 @@ typedef struct _osm_sa { cl_plock_t *p_lock; atomic32_t sa_trans_id; osm_sa_mad_ctrl_t mad_ctrl; - osm_sa_resp_t resp; - osm_cpi_rcv_t cpi_rcv; - osm_nr_rcv_t nr_rcv; - osm_pir_rcv_t pir_rcv; - osm_gir_rcv_t gir_rcv; - osm_lr_rcv_t lr_rcv; - osm_pr_rcv_t pr_rcv; - osm_smir_rcv_t smir_rcv; - osm_mcmr_recv_t mcmr_rcv; - osm_sr_rcv_t sr_rcv; -#if defined (VENDOR_RMPP_SUPPORT) && defined (DUAL_SIDED_RMPP) - osm_mpr_rcv_t mpr_rcv; -#endif - - /* InformInfo Receiver */ - osm_infr_rcv_t infr_rcv; - - /* VL Arbitrartion Query */ - osm_vlarb_rec_rcv_t vlarb_rec_rcv; - - /* SLtoVL Map Query */ - osm_slvl_rec_rcv_t slvl_rec_rcv; - - /* P_Key table Query */ - osm_pkey_rec_rcv_t pkey_rec_rcv; - - /* LinearForwardingTable Query */ - osm_lftr_rcv_t lftr_rcv; - - /* SwitchInfo Query */ - osm_sir_rcv_t sir_rcv; - - /* MulticastForwardingTable Query */ - osm_mftr_rcv_t mftr_rcv; + cl_timer_t sr_timer; cl_disp_reg_handle_t cpi_disp_h; cl_disp_reg_handle_t nr_disp_h; cl_disp_reg_handle_t pir_disp_h; @@ -205,6 +156,9 @@ typedef struct _osm_sa { * state * State of this SA object * +* sm +* Pointer to the Subnet Manager object. +* * p_subn * Pointer to the Subnet object for this subnet. 
* @@ -229,19 +183,6 @@ typedef struct _osm_sa { * mad_ctrl * Mad Controller * -* resp -* Response object -* -* nr -* -* pir_rcv -* -* lr -* -* pr -* -* smir -* * SEE ALSO * SM object *********/ @@ -424,6 +365,38 @@ osm_sa_vendor_send(IN osm_bind_handle_t h_bind, IN boolean_t const resp_expected, IN osm_subn_t * const p_subn); +/****f* IBA Base: Types/osm_sa_send_error +* NAME +* osm_sa_send_error +* +* DESCRIPTION +* Sends a generic SA response with the specified error status. +* The payload is simply replicated from the request MAD. +* +* SYNOPSIS +*/ +void +osm_sa_send_error(IN osm_sa_t * sa, + IN const osm_madw_t * const p_madw, + IN const ib_net16_t sa_status); +/* +* PARAMETERS +* sa +* [in] Pointer to an osm_sa_t object. +* +* p_madw +* [in] Original MAD to which the response must be sent. +* +* sa_status +* [in] Status to send in the response. +* +* RETURN VALUES +* None. +* +* SEE ALSO +* SA object +*********/ + struct _osm_opensm_t; /****f* OpenSM: SA/osm_sa_db_file_dump * NAME @@ -465,5 +438,36 @@ int osm_sa_db_file_load(struct _osm_opensm_t *p_osm); * *********/ +/****f* OpenSM: MC Member Record Receiver/osm_mcmr_rcv_find_or_create_new_mgrp +* NAME +* osm_mcmr_rcv_find_or_create_new_mgrp +* +* DESCRIPTION +* Create new Multicast group +* +* SYNOPSIS +*/ + +ib_api_status_t +osm_mcmr_rcv_find_or_create_new_mgrp(IN osm_sa_t * sa, + IN uint64_t comp_mask, + IN ib_member_rec_t * + const p_recvd_mcmember_rec, + OUT osm_mgrp_t ** pp_mgrp); +/* +* PARAMETERS +* p_sa +* [in] Pointer to an osm_sa_t object. 
+* p_recvd_mcmember_rec +* [in] Received Multicast member record +* +* pp_mgrp +* [out] pointer the osm_mgrp_t object +* +* RETURN VALUES +* IB_SUCCESS, IB_ERROR +* +*********/ + END_C_DECLS #endif /* _OSM_SA_H_ */ diff --git a/opensm/include/opensm/osm_sa_class_port_info.h b/opensm/include/opensm/osm_sa_class_port_info.h index 6e4c069..52b3c9e 100644 --- a/opensm/include/opensm/osm_sa_class_port_info.h +++ b/opensm/include/opensm/osm_sa_class_port_info.h @@ -51,7 +51,6 @@ #include #include #include -#include #include #include #include diff --git a/opensm/include/opensm/osm_sa_mad_ctrl.h b/opensm/include/opensm/osm_sa_mad_ctrl.h index bd7751e..a51c0b6 100644 --- a/opensm/include/opensm/osm_sa_mad_ctrl.h +++ b/opensm/include/opensm/osm_sa_mad_ctrl.h @@ -55,7 +55,6 @@ #include #include #include -#include #ifdef __cplusplus # define BEGIN_C_DECLS extern "C" { @@ -83,6 +82,8 @@ BEGIN_C_DECLS * Ranjit Pandit, Intel * *********/ + +struct _osm_sa; /****s* OpenSM: SA MAD Controller/osm_sa_mad_ctrl_t * NAME * osm_sa_mad_ctrl_t @@ -96,6 +97,7 @@ BEGIN_C_DECLS * SYNOPSIS */ typedef struct _osm_sa_mad_ctrl { + struct _osm_sa *sa; osm_log_t *p_log; osm_mad_pool_t *p_mad_pool; osm_vendor_t *p_vendor; @@ -104,10 +106,12 @@ typedef struct _osm_sa_mad_ctrl { cl_disp_reg_handle_t h_disp; osm_stats_t *p_stats; osm_subn_t *p_subn; - osm_sa_resp_t *p_resp; } osm_sa_mad_ctrl_t; /* * FIELDS +* sa +* Pointer to the SA object. +* * p_log * Pointer to the log object. * @@ -129,9 +133,6 @@ typedef struct _osm_sa_mad_ctrl { * p_stats * Pointer to the OpenSM statistics block. 
* -* p_resp -* Pointer to the SA response manager -* * SEE ALSO * SA MAD Controller object * SA MADr object @@ -209,7 +210,7 @@ void osm_sa_mad_ctrl_destroy(IN osm_sa_mad_ctrl_t * const p_ctrl); * SYNOPSIS */ ib_api_status_t osm_sa_mad_ctrl_init(IN osm_sa_mad_ctrl_t * const p_ctrl, - IN osm_sa_resp_t * const p_resp, + IN struct _osm_sa * sa, IN osm_mad_pool_t * const p_mad_pool, IN osm_vendor_t * const p_vendor, IN osm_subn_t * const p_subn, @@ -221,8 +222,8 @@ ib_api_status_t osm_sa_mad_ctrl_init(IN osm_sa_mad_ctrl_t * const p_ctrl, * p_ctrl * [in] Pointer to an osm_sa_mad_ctrl_t object to initialize. * -* p_resp -* [in] Pointer to the response SA manager object +* sa +* [in] Pointer to the SA object. * * p_mad_pool * [in] Pointer to the MAD pool. diff --git a/opensm/include/opensm/osm_sa_mcmember_record.h b/opensm/include/opensm/osm_sa_mcmember_record.h index 8540a89..09db580 100644 --- a/opensm/include/opensm/osm_sa_mcmember_record.h +++ b/opensm/include/opensm/osm_sa_mcmember_record.h @@ -103,7 +103,6 @@ typedef struct _osm_mcmr { osm_mad_pool_t *p_mad_pool; osm_log_t *p_log; cl_plock_t *p_lock; - uint16_t mlid_ho; } osm_mcmr_recv_t; /* @@ -302,54 +301,5 @@ osm_mcmr_rcv_create_new_mgrp(IN osm_mcmr_recv_t * const p_mcmr, * *********/ -/****f* OpenSM: MC Member Record Receiver/osm_mcmr_rcv_find_or_create_new_mgrp -* NAME -* osm_mcmr_rcv_find_or_create_new_mgrp -* -* DESCRIPTION -* Create new Multicast group -* -* SYNOPSIS -*/ - -ib_api_status_t -osm_mcmr_rcv_find_or_create_new_mgrp(IN osm_mcmr_recv_t * const p_mcmr, - IN uint64_t comp_mask, - IN ib_member_rec_t * - const p_recvd_mcmember_rec, - OUT osm_mgrp_t ** pp_mgrp); -/* -* PARAMETERS -* p_mcmr -* [in] Pointer to an osm_mcmr_recv_t object. 
-* p_recvd_mcmember_rec -* [in] Received Multicast member record -* -* pp_mgrp -* [out] pointer the osm_mgrp_t object -* -* RETURN VALUES -* IB_SUCCESS, IB_ERROR -* -* NOTES -* -* -* SEE ALSO -* -*********/ - -#define JOIN_MC_COMP_MASK (IB_MCR_COMPMASK_MGID | \ - IB_MCR_COMPMASK_PORT_GID | \ - IB_MCR_COMPMASK_JOIN_STATE) - -#define REQUIRED_MC_CREATE_COMP_MASK (IB_MCR_COMPMASK_MGID | \ - IB_MCR_COMPMASK_PORT_GID | \ - IB_MCR_COMPMASK_JOIN_STATE | \ - IB_MCR_COMPMASK_QKEY | \ - IB_MCR_COMPMASK_TCLASS | \ - IB_MCR_COMPMASK_PKEY | \ - IB_MCR_COMPMASK_FLOW | \ - IB_MCR_COMPMASK_SL) - END_C_DECLS #endif /* _OSM_MCMR_H_ */ diff --git a/opensm/include/opensm/osm_sa_response.h b/opensm/include/opensm/osm_sa_response.h index 8e2c15e..53c4f95 100644 --- a/opensm/include/opensm/osm_sa_response.h +++ b/opensm/include/opensm/osm_sa_response.h @@ -208,41 +208,5 @@ osm_sa_resp_init(IN osm_sa_resp_t * const p_resp, * osm_sa_resp_destroy *********/ -/****f* IBA Base: Types/osm_sa_send_error -* NAME -* osm_sa_send_error -* -* DESCRIPTION -* Sends a generic SA response with the specified error status. -* The payload is simply replicated from the request MAD. -* -* SYNOPSIS -*/ -void -osm_sa_send_error(IN osm_sa_resp_t * const p_resp, - IN const osm_madw_t * const p_madw, - IN const ib_net16_t sa_status); -/* -* PARAMETERS -* p_resp -* [in] Pointer to an osm_sa_resp_t object. -* -* p_madw -* [in] Original MAD to which the response must be sent. -* -* sa_status -* [in] Status to send in the response. -* -* RETURN VALUES -* None. -* -* NOTES -* Allows calling other SA Response methods. 
-* -* SEE ALSO -* SA Response object, osm_sa_resp_construct, -* osm_sa_resp_destroy -*********/ - END_C_DECLS #endif /* _OSM_SA_RESP_H_ */ diff --git a/opensm/include/opensm/osm_sa_service_record.h b/opensm/include/opensm/osm_sa_service_record.h index 43859e0..63bcc46 100644 --- a/opensm/include/opensm/osm_sa_service_record.h +++ b/opensm/include/opensm/osm_sa_service_record.h @@ -50,7 +50,6 @@ #define _OSM_SR_H_ #include -#include #include #include #include @@ -102,7 +101,6 @@ typedef struct _osm_sr_rcv { osm_mad_pool_t *p_mad_pool; osm_log_t *p_log; cl_plock_t *p_lock; - cl_timer_t sr_timer; } osm_sr_rcv_t; /* * FIELDS diff --git a/opensm/opensm/osm_inform.c b/opensm/opensm/osm_inform.c index e488e3b..151b1dc 100644 --- a/opensm/opensm/osm_inform.c +++ b/opensm/opensm/osm_inform.c @@ -50,7 +50,6 @@ #include #include #include -#include #include #include #include @@ -115,7 +114,7 @@ __match_inf_rec(IN const cl_list_item_t * const p_list_item, IN void *context) { osm_infr_t *p_infr_rec = (osm_infr_t *) context; osm_infr_t *p_infr = (osm_infr_t *) p_list_item; - osm_log_t *p_log = p_infr_rec->p_infr_rcv->p_log; + osm_log_t *p_log = p_infr_rec->sa->p_log; cl_status_t status = CL_NOT_FOUND; ib_gid_t all_zero_gid; @@ -339,7 +338,7 @@ static ib_api_status_t __osm_send_report(IN osm_infr_t * p_infr_rec, /* the info ib_sa_mad_t *p_sa_mad; static atomic32_t trap_fwd_trans_id = 0x02DAB000; ib_api_status_t status; - osm_log_t *p_log = p_infr_rec->p_infr_rcv->p_log; + osm_log_t *p_log = p_infr_rec->sa->p_log; OSM_LOG_ENTER(p_log, __osm_send_report); @@ -354,7 +353,7 @@ static ib_api_status_t __osm_send_report(IN osm_infr_t * p_infr_rec, /* the info cl_ntoh16(p_infr_rec->report_addr.dest_lid), trap_fwd_trans_id); /* get the MAD to send */ - p_report_madw = osm_mad_pool_get(p_infr_rec->p_infr_rcv->p_mad_pool, + p_report_madw = osm_mad_pool_get(p_infr_rec->sa->p_mad_pool, p_infr_rec->h_bind, MAD_BLOCK_SIZE, &(p_infr_rec->report_addr)); @@ -387,7 +386,7 @@ static ib_api_status_t 
__osm_send_report(IN osm_infr_t * p_infr_rec,	/* the info
 	/* The TRUE is for: response is expected */
 	status = osm_sa_vendor_send(p_report_madw->h_bind, p_report_madw, TRUE,
-				    p_infr_rec->p_infr_rcv->p_subn);
+				    p_infr_rec->sa->p_subn);
 	if (status != IB_SUCCESS) {
 		osm_log(p_log, OSM_LOG_ERROR,
 			"__osm_send_report: ERR 0204: "
@@ -416,8 +415,8 @@ __match_notice_to_inf_rec(IN cl_list_item_t * const p_list_item,
 	osm_infr_t *p_infr_rec = (osm_infr_t *) p_list_item;
 	ib_inform_info_t *p_ii = &(p_infr_rec->inform_record.inform_info);
 	cl_status_t status = CL_NOT_FOUND;
-	osm_log_t *p_log = p_infr_rec->p_infr_rcv->p_log;
-	osm_subn_t *p_subn = p_infr_rec->p_infr_rcv->p_subn;
+	osm_log_t *p_log = p_infr_rec->sa->p_log;
+	osm_subn_t *p_subn = p_infr_rec->sa->p_subn;
 	ib_gid_t source_gid;
 	osm_port_t *p_src_port;
 	osm_port_t *p_dest_port;
diff --git a/opensm/opensm/osm_prtn.c b/opensm/opensm/osm_prtn.c
index f0168fc..15a9c2a 100644
--- a/opensm/opensm/osm_prtn.c
+++ b/opensm/opensm/osm_prtn.c
@@ -224,8 +224,7 @@ ib_api_status_t osm_prtn_add_mcgroup(osm_log_t * p_log,
 	/* don't update rate, mtu */
 	comp_mask = IB_MCR_COMPMASK_MTU | IB_MCR_COMPMASK_MTU_SEL |
 	    IB_MCR_COMPMASK_RATE | IB_MCR_COMPMASK_RATE_SEL;
-	status = osm_mcmr_rcv_find_or_create_new_mgrp(&p_sa->mcmr_rcv,
-						      comp_mask, &mc_rec,
+	status = osm_mcmr_rcv_find_or_create_new_mgrp(p_sa, comp_mask, &mc_rec,
 						      &p_mgrp);
 	if (!p_mgrp || status != IB_SUCCESS)
 		osm_log(p_log, OSM_LOG_ERROR,
@@ -243,9 +242,8 @@ ib_api_status_t osm_prtn_add_mcgroup(osm_log_t * p_log,
 	mc_rec.scope_state = ib_member_set_scope_state(scope,
 						       IB_MC_REC_STATE_FULL_MEMBER);
 	ib_mgid_set_scope(&mc_rec.mgid, scope);
-	status =
-	    osm_mcmr_rcv_find_or_create_new_mgrp(&p_sa->mcmr_rcv, comp_mask,
-						 &mc_rec, &p_mgrp);
+	status = osm_mcmr_rcv_find_or_create_new_mgrp(p_sa, comp_mask, &mc_rec,
+						      &p_mgrp);
 	if (p_mgrp)
 		p_mgrp->well_known = TRUE;
diff --git a/opensm/opensm/osm_sa.c b/opensm/opensm/osm_sa.c
index 248f20d..740fef5 100644
--- a/opensm/opensm/osm_sa.c
+++ b/opensm/opensm/osm_sa.c
@@ -59,7 +59,6 @@
 #include
 #include
 #include
-#include
 #include
 #include
 #include
@@ -73,6 +72,26 @@
 #define OSM_SA_INITIAL_TID_VALUE 0xabc
+extern void osm_cpi_rcv_process(IN void *context, IN void *data);
+extern void osm_gir_rcv_process(IN void *context, IN void *data);
+extern void osm_infr_rcv_process(IN void *context, IN void *data);
+extern void osm_infir_rcv_process(IN void *context, IN void *data);
+extern void osm_lftr_rcv_process(IN void *context, IN void *data);
+extern void osm_lr_rcv_process(IN void *context, IN void *data);
+extern void osm_mcmr_rcv_process(IN void *context, IN void *data);
+extern void osm_mftr_rcv_process(IN void *context, IN void *data);
+extern void osm_mpr_rcv_process(IN void *context, IN void *data);
+extern void osm_nr_rcv_process(IN void *context, IN void *data);
+extern void osm_pr_rcv_process(IN void *context, IN void *data);
+extern void osm_pkey_rec_rcv_process(IN void *context, IN void *data);
+extern void osm_pir_rcv_process(IN void *context, IN void *data);
+extern void osm_sr_rcv_process(IN void *context, IN void *data);
+extern void osm_slvl_rec_rcv_process(IN void *context, IN void *data);
+extern void osm_smir_rcv_process(IN void *context, IN void *data);
+extern void osm_sir_rcv_process(IN void *context, IN void *data);
+extern void osm_vlarb_rec_rcv_process(IN void *context, IN void *data);
+extern void osm_sr_rcv_lease_cb(IN void *context);
+
 /**********************************************************************
  **********************************************************************/
 void osm_sa_construct(IN osm_sa_t * const p_sa)
@@ -81,25 +100,7 @@ void osm_sa_construct(IN osm_sa_t * const p_sa)
 	p_sa->state = OSM_SA_STATE_INIT;
 	p_sa->sa_trans_id = OSM_SA_INITIAL_TID_VALUE;
-	osm_sa_resp_construct(&p_sa->resp);
-	osm_nr_rcv_construct(&p_sa->nr_rcv);
-	osm_pir_rcv_construct(&p_sa->pir_rcv);
-	osm_gir_rcv_construct(&p_sa->gir_rcv);
-	osm_lr_rcv_construct(&p_sa->lr_rcv);
-	osm_pr_rcv_construct(&p_sa->pr_rcv);
-#if defined (VENDOR_RMPP_SUPPORT) && defined (DUAL_SIDED_RMPP)
-	osm_mpr_rcv_construct(&p_sa->mpr_rcv);
-#endif
-	osm_smir_rcv_construct(&p_sa->smir_rcv);
-	osm_mcmr_rcv_construct(&p_sa->mcmr_rcv);
-	osm_sr_rcv_construct(&p_sa->sr_rcv);
-	osm_infr_rcv_construct(&p_sa->infr_rcv);
-	osm_vlarb_rec_rcv_construct(&p_sa->vlarb_rec_rcv);
-	osm_slvl_rec_rcv_construct(&p_sa->slvl_rec_rcv);
-	osm_pkey_rec_rcv_construct(&p_sa->pkey_rec_rcv);
-	osm_lftr_rcv_construct(&p_sa->lftr_rcv);
-	osm_sir_rcv_construct(&p_sa->sir_rcv);
-	osm_mftr_rcv_construct(&p_sa->mftr_rcv);
+	cl_timer_construct(&p_sa->sr_timer);
 }
 /**********************************************************************
@@ -109,6 +110,8 @@ void osm_sa_shutdown(IN osm_sa_t * const p_sa)
 	ib_api_status_t status;
 	OSM_LOG_ENTER(p_sa->p_log, osm_sa_shutdown);
+	cl_timer_stop(&p_sa->sr_timer);
+
 	/* unbind from the mad service */
 	status = osm_sa_mad_ctrl_unbind(&p_sa->mad_ctrl);
@@ -145,25 +148,7 @@ void osm_sa_destroy(IN osm_sa_t * const p_sa)
 	p_sa->state = OSM_SA_STATE_INIT;
-	osm_nr_rcv_destroy(&p_sa->nr_rcv);
-	osm_pir_rcv_destroy(&p_sa->pir_rcv);
-	osm_gir_rcv_destroy(&p_sa->gir_rcv);
-	osm_lr_rcv_destroy(&p_sa->lr_rcv);
-	osm_pr_rcv_destroy(&p_sa->pr_rcv);
-#if defined (VENDOR_RMPP_SUPPORT) && defined (DUAL_SIDED_RMPP)
-	osm_mpr_rcv_destroy(&p_sa->mpr_rcv);
-#endif
-	osm_smir_rcv_destroy(&p_sa->smir_rcv);
-	osm_mcmr_rcv_destroy(&p_sa->mcmr_rcv);
-	osm_sr_rcv_destroy(&p_sa->sr_rcv);
-	osm_infr_rcv_destroy(&p_sa->infr_rcv);
-	osm_vlarb_rec_rcv_destroy(&p_sa->vlarb_rec_rcv);
-	osm_slvl_rec_rcv_destroy(&p_sa->slvl_rec_rcv);
-	osm_pkey_rec_rcv_destroy(&p_sa->pkey_rec_rcv);
-	osm_lftr_rcv_destroy(&p_sa->lftr_rcv);
-	osm_sir_rcv_destroy(&p_sa->sir_rcv);
-	osm_mftr_rcv_destroy(&p_sa->mftr_rcv);
-	osm_sa_resp_destroy(&p_sa->resp);
+	cl_timer_destroy(&p_sa->sr_timer);
 	OSM_LOG_EXIT(p_sa->p_log);
 }
@@ -184,6 +169,7 @@ osm_sa_init(IN osm_sm_t * const p_sm,
 	OSM_LOG_ENTER(p_log, osm_sa_init);
+
p_sa->sm = p_sm; p_sa->p_subn = p_subn; p_sa->p_vendor = p_vendor; p_sa->p_mad_pool = p_mad_pool; @@ -193,228 +179,113 @@ osm_sa_init(IN osm_sm_t * const p_sm, p_sa->state = OSM_SA_STATE_READY; - status = osm_sa_resp_init(&p_sa->resp, p_sa->p_mad_pool, p_subn, p_log); - if (status != IB_SUCCESS) - goto Exit; - status = osm_sa_mad_ctrl_init(&p_sa->mad_ctrl, - &p_sa->resp, + p_sa, p_sa->p_mad_pool, p_sa->p_vendor, p_subn, p_log, p_stats, p_disp); if (status != IB_SUCCESS) goto Exit; - status = osm_cpi_rcv_init(&p_sa->cpi_rcv, - &p_sa->resp, - p_sa->p_mad_pool, p_subn, p_log, p_lock); - if (status != IB_SUCCESS) - goto Exit; - - status = osm_nr_rcv_init(&p_sa->nr_rcv, - &p_sa->resp, - p_sa->p_mad_pool, p_subn, p_log, p_lock); - if (status != IB_SUCCESS) - goto Exit; - - status = osm_pir_rcv_init(&p_sa->pir_rcv, - &p_sa->resp, - p_sa->p_mad_pool, p_subn, p_log, p_lock); - if (status != IB_SUCCESS) - goto Exit; - - status = osm_gir_rcv_init(&p_sa->gir_rcv, - &p_sa->resp, - p_sa->p_mad_pool, p_subn, p_log, p_lock); - if (status != IB_SUCCESS) - goto Exit; - - status = osm_lr_rcv_init(&p_sa->lr_rcv, - &p_sa->resp, - p_sa->p_mad_pool, p_subn, p_log, p_lock); - if (status != IB_SUCCESS) - goto Exit; - - status = osm_pr_rcv_init(&p_sa->pr_rcv, - &p_sa->resp, - p_sa->p_mad_pool, p_subn, p_log, p_lock); - if (status != IB_SUCCESS) - goto Exit; - -#if defined (VENDOR_RMPP_SUPPORT) && defined (DUAL_SIDED_RMPP) - status = osm_mpr_rcv_init(&p_sa->mpr_rcv, - &p_sa->resp, - p_sa->p_mad_pool, p_subn, p_log, p_lock); - if (status != IB_SUCCESS) - goto Exit; -#endif - - status = osm_smir_rcv_init(&p_sa->smir_rcv, - &p_sa->resp, - p_sa->p_mad_pool, - p_subn, p_stats, p_log, p_lock); - if (status != IB_SUCCESS) - goto Exit; - - status = osm_mcmr_rcv_init(p_sm, - &p_sa->mcmr_rcv, - &p_sa->resp, - p_sa->p_mad_pool, p_subn, p_log, p_lock); - if (status != IB_SUCCESS) - goto Exit; - - status = osm_sr_rcv_init(&p_sa->sr_rcv, - &p_sa->resp, - p_sa->p_mad_pool, p_subn, p_log, p_lock); - if 
(status != IB_SUCCESS) - goto Exit; - - status = osm_infr_rcv_init(&p_sa->infr_rcv, - &p_sa->resp, - p_sa->p_mad_pool, p_subn, p_log, p_lock); - if (status != IB_SUCCESS) - goto Exit; - - status = osm_vlarb_rec_rcv_init(&p_sa->vlarb_rec_rcv, - &p_sa->resp, - p_sa->p_mad_pool, - p_subn, p_log, p_lock); - if (status != IB_SUCCESS) - goto Exit; - - status = osm_slvl_rec_rcv_init(&p_sa->slvl_rec_rcv, - &p_sa->resp, - p_sa->p_mad_pool, p_subn, p_log, p_lock); - if (status != IB_SUCCESS) - goto Exit; - - status = osm_pkey_rec_rcv_init(&p_sa->pkey_rec_rcv, - &p_sa->resp, - p_sa->p_mad_pool, p_subn, p_log, p_lock); - if (status != IB_SUCCESS) - goto Exit; - - status = osm_lftr_rcv_init(&p_sa->lftr_rcv, - &p_sa->resp, - p_sa->p_mad_pool, p_subn, p_log, p_lock); - if (status != IB_SUCCESS) - goto Exit; - - status = osm_sir_rcv_init(&p_sa->sir_rcv, - &p_sa->resp, - p_sa->p_mad_pool, p_subn, p_log, p_lock); - if (status != IB_SUCCESS) - goto Exit; - - status = osm_mftr_rcv_init(&p_sa->mftr_rcv, - &p_sa->resp, - p_sa->p_mad_pool, p_subn, p_log, p_lock); + status = cl_timer_init(&p_sa->sr_timer, osm_sr_rcv_lease_cb, p_sa); if (status != IB_SUCCESS) goto Exit; p_sa->cpi_disp_h = cl_disp_register(p_disp, OSM_MSG_MAD_CLASS_PORT_INFO, - osm_cpi_rcv_process, - &p_sa->cpi_rcv); + osm_cpi_rcv_process, p_sa); if (p_sa->cpi_disp_h == CL_DISP_INVALID_HANDLE) goto Exit; p_sa->nr_disp_h = cl_disp_register(p_disp, OSM_MSG_MAD_NODE_RECORD, - osm_nr_rcv_process, &p_sa->nr_rcv); + osm_nr_rcv_process, p_sa); if (p_sa->nr_disp_h == CL_DISP_INVALID_HANDLE) goto Exit; p_sa->pir_disp_h = cl_disp_register(p_disp, OSM_MSG_MAD_PORTINFO_RECORD, - osm_pir_rcv_process, - &p_sa->pir_rcv); + osm_pir_rcv_process, p_sa); if (p_sa->pir_disp_h == CL_DISP_INVALID_HANDLE) goto Exit; p_sa->gir_disp_h = cl_disp_register(p_disp, OSM_MSG_MAD_GUIDINFO_RECORD, - osm_gir_rcv_process, - &p_sa->gir_rcv); + osm_gir_rcv_process, p_sa); if (p_sa->gir_disp_h == CL_DISP_INVALID_HANDLE) goto Exit; p_sa->lr_disp_h = 
cl_disp_register(p_disp, OSM_MSG_MAD_LINK_RECORD, - osm_lr_rcv_process, &p_sa->lr_rcv); + osm_lr_rcv_process, p_sa); if (p_sa->lr_disp_h == CL_DISP_INVALID_HANDLE) goto Exit; p_sa->pr_disp_h = cl_disp_register(p_disp, OSM_MSG_MAD_PATH_RECORD, - osm_pr_rcv_process, &p_sa->pr_rcv); + osm_pr_rcv_process, p_sa); if (p_sa->pr_disp_h == CL_DISP_INVALID_HANDLE) goto Exit; #if defined (VENDOR_RMPP_SUPPORT) && defined (DUAL_SIDED_RMPP) p_sa->mpr_disp_h = cl_disp_register(p_disp, OSM_MSG_MAD_MULTIPATH_RECORD, - osm_mpr_rcv_process, &p_sa->mpr_rcv); + osm_mpr_rcv_process, p_sa); if (p_sa->mpr_disp_h == CL_DISP_INVALID_HANDLE) goto Exit; #endif p_sa->smir_disp_h = cl_disp_register(p_disp, OSM_MSG_MAD_SMINFO_RECORD, - osm_smir_rcv_process, - &p_sa->smir_rcv); + osm_smir_rcv_process, p_sa); if (p_sa->smir_disp_h == CL_DISP_INVALID_HANDLE) goto Exit; p_sa->mcmr_disp_h = cl_disp_register(p_disp, OSM_MSG_MAD_MCMEMBER_RECORD, - osm_mcmr_rcv_process, &p_sa->mcmr_rcv); + osm_mcmr_rcv_process, p_sa); if (p_sa->mcmr_disp_h == CL_DISP_INVALID_HANDLE) goto Exit; p_sa->sr_disp_h = cl_disp_register(p_disp, OSM_MSG_MAD_SERVICE_RECORD, - osm_sr_rcv_process, &p_sa->sr_rcv); + osm_sr_rcv_process, p_sa); if (p_sa->sr_disp_h == CL_DISP_INVALID_HANDLE) goto Exit; p_sa->infr_disp_h = cl_disp_register(p_disp, OSM_MSG_MAD_INFORM_INFO, - osm_infr_rcv_process, - &p_sa->infr_rcv); + osm_infr_rcv_process, p_sa); if (p_sa->infr_disp_h == CL_DISP_INVALID_HANDLE) goto Exit; p_sa->infir_disp_h = cl_disp_register(p_disp, OSM_MSG_MAD_INFORM_INFO_RECORD, - osm_infir_rcv_process, &p_sa->infr_rcv); + osm_infir_rcv_process, p_sa); if (p_sa->infir_disp_h == CL_DISP_INVALID_HANDLE) goto Exit; p_sa->vlarb_disp_h = cl_disp_register(p_disp, OSM_MSG_MAD_VL_ARB_RECORD, - osm_vlarb_rec_rcv_process, - &p_sa->vlarb_rec_rcv); + osm_vlarb_rec_rcv_process, p_sa); if (p_sa->vlarb_disp_h == CL_DISP_INVALID_HANDLE) goto Exit; p_sa->slvl_disp_h = cl_disp_register(p_disp, OSM_MSG_MAD_SLVL_TBL_RECORD, - osm_slvl_rec_rcv_process, 
&p_sa->slvl_rec_rcv); + osm_slvl_rec_rcv_process, p_sa); if (p_sa->slvl_disp_h == CL_DISP_INVALID_HANDLE) goto Exit; p_sa->pkey_disp_h = cl_disp_register(p_disp, OSM_MSG_MAD_PKEY_TBL_RECORD, - osm_pkey_rec_rcv_process, &p_sa->pkey_rec_rcv); + osm_pkey_rec_rcv_process, p_sa); if (p_sa->pkey_disp_h == CL_DISP_INVALID_HANDLE) goto Exit; p_sa->lft_disp_h = cl_disp_register(p_disp, OSM_MSG_MAD_LFT_RECORD, - osm_lftr_rcv_process, - &p_sa->lftr_rcv); + osm_lftr_rcv_process, p_sa); if (p_sa->lft_disp_h == CL_DISP_INVALID_HANDLE) goto Exit; p_sa->sir_disp_h = cl_disp_register(p_disp, OSM_MSG_MAD_SWITCH_INFO_RECORD, - osm_sir_rcv_process, &p_sa->sir_rcv); + osm_sir_rcv_process, p_sa); if (p_sa->sir_disp_h == CL_DISP_INVALID_HANDLE) goto Exit; p_sa->mft_disp_h = cl_disp_register(p_disp, OSM_MSG_MAD_MFT_RECORD, - osm_mftr_rcv_process, - &p_sa->mftr_rcv); + osm_mftr_rcv_process, p_sa); if (p_sa->mft_disp_h == CL_DISP_INVALID_HANDLE) goto Exit; @@ -711,7 +582,7 @@ static osm_mgrp_t *load_mcgroup(osm_opensm_t * p_osm, ib_net16_t mlid, comp_mask = IB_MCR_COMPMASK_MTU | IB_MCR_COMPMASK_MTU_SEL | IB_MCR_COMPMASK_RATE | IB_MCR_COMPMASK_RATE_SEL; - if (osm_mcmr_rcv_find_or_create_new_mgrp(&p_osm->sa.mcmr_rcv, + if (osm_mcmr_rcv_find_or_create_new_mgrp(&p_osm->sa, comp_mask, p_mcm_rec, &p_mgrp) != IB_SUCCESS || !p_mgrp || p_mgrp->mlid != mlid) { @@ -761,7 +632,7 @@ static int load_svcr(osm_opensm_t * p_osm, ib_service_record_t * sr, osm_svcr_insert_to_db(&p_osm->subn, &p_osm->log, p_svcr); if (lease_period != 0xffffffff) - cl_timer_trim(&p_osm->sa.sr_rcv.sr_timer, 1000); + cl_timer_trim(&p_osm->sa.sr_timer, 1000); _out: cl_plock_release(&p_osm->lock); @@ -776,7 +647,7 @@ static int load_infr(osm_opensm_t * p_osm, ib_inform_info_record_t * iir, int ret = 0; infr.h_bind = p_osm->sa.mad_ctrl.h_bind; - infr.p_infr_rcv = &p_osm->sa.infr_rcv; + infr.sa = &p_osm->sa; /* other possible way to restore mad_addr partially is to extract qpn from InformInfo and to find lid by gid */ 
 	infr.report_addr = *addr;
diff --git a/opensm/opensm/osm_sa_class_port_info.c b/opensm/opensm/osm_sa_class_port_info.c
index 8a49398..4f62761 100644
--- a/opensm/opensm/osm_sa_class_port_info.c
+++ b/opensm/opensm/osm_sa_class_port_info.c
@@ -55,8 +55,6 @@
 #include
 #include
 #include
-#include
-#include
 #include
 #include
 #include
@@ -74,48 +72,8 @@ static uint32_t __msecs_to_rtv_table[MAX_MSECS_TO_RTV] = { 1, 2, 4, 8,
 /**********************************************************************
  **********************************************************************/
-void osm_cpi_rcv_construct(IN osm_cpi_rcv_t * const p_rcv)
-{
-	memset(p_rcv, 0, sizeof(*p_rcv));
-}
-
-/**********************************************************************
- **********************************************************************/
-void osm_cpi_rcv_destroy(IN osm_cpi_rcv_t * const p_rcv)
-{
-	OSM_LOG_ENTER(p_rcv->p_log, osm_cpi_rcv_destroy);
-	OSM_LOG_EXIT(p_rcv->p_log);
-}
-
-/**********************************************************************
- **********************************************************************/
-ib_api_status_t
-osm_cpi_rcv_init(IN osm_cpi_rcv_t * const p_rcv,
-		 IN osm_sa_resp_t * const p_resp,
-		 IN osm_mad_pool_t * const p_mad_pool,
-		 IN osm_subn_t * const p_subn,
-		 IN osm_log_t * const p_log, IN cl_plock_t * const p_lock)
-{
-	ib_api_status_t status = IB_SUCCESS;
-
-	OSM_LOG_ENTER(p_log, osm_cpi_rcv_init);
-
-	osm_cpi_rcv_construct(p_rcv);
-
-	p_rcv->p_log = p_log;
-	p_rcv->p_subn = p_subn;
-	p_rcv->p_lock = p_lock;
-	p_rcv->p_resp = p_resp;
-	p_rcv->p_mad_pool = p_mad_pool;
-
-	OSM_LOG_EXIT(p_rcv->p_log);
-	return (status);
-}
-
-/**********************************************************************
- **********************************************************************/
 static void
-__osm_cpi_rcv_respond(IN osm_cpi_rcv_t * const p_rcv,
+__osm_cpi_rcv_respond(IN osm_sa_t * sa,
 		      IN const osm_madw_t * const p_madw)
 {
 	osm_madw_t *p_resp_madw;
@@ -126,18 +84,18 @@
__osm_cpi_rcv_respond(IN osm_cpi_rcv_t * const p_rcv, ib_gid_t zero_gid; uint8_t rtv; - OSM_LOG_ENTER(p_rcv->p_log, __osm_cpi_rcv_respond); + OSM_LOG_ENTER(sa->p_log, __osm_cpi_rcv_respond); memset(&zero_gid, 0, sizeof(ib_gid_t)); /* Get a MAD to reply. Address of Mad is in the received mad_wrapper */ - p_resp_madw = osm_mad_pool_get(p_rcv->p_mad_pool, + p_resp_madw = osm_mad_pool_get(sa->p_mad_pool, p_madw->h_bind, MAD_BLOCK_SIZE, &p_madw->mad_addr); if (!p_resp_madw) { - osm_log(p_rcv->p_log, OSM_LOG_ERROR, + osm_log(sa->p_log, OSM_LOG_ERROR, "__osm_cpi_rcv_respond: ERR 1408: " "Unable to allocate MAD\n"); goto Exit; @@ -159,12 +117,12 @@ __osm_cpi_rcv_respond(IN osm_cpi_rcv_t * const p_rcv, p_resp_cpi->class_ver = 2; /* Calculate encoded response time value */ /* transaction timeout is in msec */ - if (p_rcv->p_subn->opt.transaction_timeout > + if (sa->p_subn->opt.transaction_timeout > __msecs_to_rtv_table[MAX_MSECS_TO_RTV]) rtv = MAX_MSECS_TO_RTV - 1; else { for (rtv = 0; rtv < MAX_MSECS_TO_RTV; rtv++) { - if (p_rcv->p_subn->opt.transaction_timeout <= + if (sa->p_subn->opt.transaction_timeout <= __msecs_to_rtv_table[rtv]) break; } @@ -209,28 +167,28 @@ __osm_cpi_rcv_respond(IN osm_cpi_rcv_t * const p_rcv, p_resp_cpi->cap_mask = OSM_CAP_IS_SUBN_GET_SET_NOTICE_SUP | OSM_CAP_IS_PORT_INFO_CAPMASK_MATCH_SUPPORTED; #endif - if (p_rcv->p_subn->opt.qos) + if (sa->p_subn->opt.qos) ib_class_set_cap_mask2(p_resp_cpi, OSM_CAP2_IS_QOS_SUPPORTED); - if (p_rcv->p_subn->opt.no_multicast_option != TRUE) + if (sa->p_subn->opt.no_multicast_option != TRUE) p_resp_cpi->cap_mask |= OSM_CAP_IS_UD_MCAST_SUP; p_resp_cpi->cap_mask = cl_hton16(p_resp_cpi->cap_mask); - if (osm_log_is_active(p_rcv->p_log, OSM_LOG_FRAMES)) - osm_dump_sa_mad(p_rcv->p_log, p_resp_sa_mad, OSM_LOG_FRAMES); + if (osm_log_is_active(sa->p_log, OSM_LOG_FRAMES)) + osm_dump_sa_mad(sa->p_log, p_resp_sa_mad, OSM_LOG_FRAMES); status = osm_sa_vendor_send(p_resp_madw->h_bind, p_resp_madw, FALSE, - p_rcv->p_subn); + 
sa->p_subn); if (status != IB_SUCCESS) { - osm_log(p_rcv->p_log, OSM_LOG_ERROR, + osm_log(sa->p_log, OSM_LOG_ERROR, "__osm_cpi_rcv_respond: ERR 1409: " "Unable to send MAD (%s)\n", ib_get_err_str(status)); - /* osm_mad_pool_put( p_rcv->p_mad_pool, p_resp_madw ); */ + /* osm_mad_pool_put( sa->p_mad_pool, p_resp_madw ); */ goto Exit; } Exit: - OSM_LOG_EXIT(p_rcv->p_log); + OSM_LOG_EXIT(sa->p_log); } /********************************************************************** @@ -238,11 +196,11 @@ __osm_cpi_rcv_respond(IN osm_cpi_rcv_t * const p_rcv, **********************************************************************/ void osm_cpi_rcv_process(IN void *context, IN void *data) { - osm_cpi_rcv_t *p_rcv = context; + osm_sa_t *sa = context; osm_madw_t *p_madw = data; const ib_sa_mad_t *p_sa_mad; - OSM_LOG_ENTER(p_rcv->p_log, osm_cpi_rcv_process); + OSM_LOG_ENTER(sa->p_log, osm_cpi_rcv_process); CL_ASSERT(p_madw); @@ -250,11 +208,11 @@ void osm_cpi_rcv_process(IN void *context, IN void *data) /* we only support GET */ if (p_sa_mad->method != IB_MAD_METHOD_GET) { - osm_log(p_rcv->p_log, OSM_LOG_ERROR, + osm_log(sa->p_log, OSM_LOG_ERROR, "osm_cpi_rcv_process: ERR 1403: " "Unsupported Method (%s)\n", ib_get_sa_method_str(p_sa_mad->method)); - osm_sa_send_error(p_rcv->p_resp, p_madw, + osm_sa_send_error(sa, p_madw, IB_SA_MAD_STATUS_REQ_INVALID); goto Exit; } @@ -265,8 +223,8 @@ void osm_cpi_rcv_process(IN void *context, IN void *data) CLASS PORT INFO does not really look on the SMDB - no lock required. 
 	 */
-	__osm_cpi_rcv_respond(p_rcv, p_madw);
+	__osm_cpi_rcv_respond(sa, p_madw);
 Exit:
-	OSM_LOG_EXIT(p_rcv->p_log);
+	OSM_LOG_EXIT(sa->p_log);
 }
diff --git a/opensm/opensm/osm_sa_guidinfo_record.c b/opensm/opensm/osm_sa_guidinfo_record.c
index a758888..a2c47bb 100644
--- a/opensm/opensm/osm_sa_guidinfo_record.c
+++ b/opensm/opensm/osm_sa_guidinfo_record.c
@@ -54,10 +54,9 @@
 #include
 #include
 #include
-#include
+#include
 #include
 #include
-#include
 #include
 #include
 #include
@@ -71,52 +70,14 @@ typedef struct _osm_gir_search_ctxt {
 	const ib_guidinfo_record_t *p_rcvd_rec;
 	ib_net64_t comp_mask;
 	cl_qlist_t *p_list;
-	osm_gir_rcv_t *p_rcv;
+	osm_sa_t *sa;
 	const osm_physp_t *p_req_physp;
 } osm_gir_search_ctxt_t;
 /**********************************************************************
  **********************************************************************/
-void osm_gir_rcv_construct(IN osm_gir_rcv_t * const p_rcv)
-{
-	memset(p_rcv, 0, sizeof(*p_rcv));
-}
-
-/**********************************************************************
- **********************************************************************/
-void osm_gir_rcv_destroy(IN osm_gir_rcv_t * const p_rcv)
-{
-	OSM_LOG_ENTER(p_rcv->p_log, osm_gir_rcv_destroy);
-	OSM_LOG_EXIT(p_rcv->p_log);
-}
-
-/**********************************************************************
- **********************************************************************/
-ib_api_status_t
-osm_gir_rcv_init(IN osm_gir_rcv_t * const p_rcv,
-		 IN osm_sa_resp_t * const p_resp,
-		 IN osm_mad_pool_t * const p_mad_pool,
-		 IN osm_subn_t * const p_subn,
-		 IN osm_log_t * const p_log, IN cl_plock_t * const p_lock)
-{
-	OSM_LOG_ENTER(p_log, osm_gir_rcv_init);
-
-	osm_gir_rcv_construct(p_rcv);
-
-	p_rcv->p_log = p_log;
-	p_rcv->p_subn = p_subn;
-	p_rcv->p_lock = p_lock;
-	p_rcv->p_resp = p_resp;
-	p_rcv->p_mad_pool = p_mad_pool;
-
-	OSM_LOG_EXIT(p_log);
-	return IB_SUCCESS;
-}
-
-/**********************************************************************
-
**********************************************************************/ static ib_api_status_t -__osm_gir_rcv_new_gir(IN osm_gir_rcv_t * const p_rcv, +__osm_gir_rcv_new_gir(IN osm_sa_t * sa, IN const osm_node_t * const p_node, IN cl_qlist_t * const p_list, IN ib_net64_t const match_port_guid, @@ -127,19 +88,19 @@ __osm_gir_rcv_new_gir(IN osm_gir_rcv_t * const p_rcv, osm_gir_item_t *p_rec_item; ib_api_status_t status = IB_SUCCESS; - OSM_LOG_ENTER(p_rcv->p_log, __osm_gir_rcv_new_gir); + OSM_LOG_ENTER(sa->p_log, __osm_gir_rcv_new_gir); p_rec_item = malloc(sizeof(*p_rec_item)); if (p_rec_item == NULL) { - osm_log(p_rcv->p_log, OSM_LOG_ERROR, + osm_log(sa->p_log, OSM_LOG_ERROR, "__osm_gir_rcv_new_gir: ERR 5102: " "rec_item alloc failed\n"); status = IB_INSUFFICIENT_RESOURCES; goto Exit; } - if (osm_log_is_active(p_rcv->p_log, OSM_LOG_DEBUG)) { - osm_log(p_rcv->p_log, OSM_LOG_DEBUG, + if (osm_log_is_active(sa->p_log, OSM_LOG_DEBUG)) { + osm_log(sa->p_log, OSM_LOG_DEBUG, "__osm_gir_rcv_new_gir: " "New GUIDInfoRecord: lid 0x%X, block num %d\n", cl_ntoh16(match_lid), block_num); @@ -156,14 +117,14 @@ __osm_gir_rcv_new_gir(IN osm_gir_rcv_t * const p_rcv, cl_qlist_insert_tail(p_list, &p_rec_item->list_item); Exit: - OSM_LOG_EXIT(p_rcv->p_log); + OSM_LOG_EXIT(sa->p_log); return (status); } /********************************************************************** **********************************************************************/ static void -__osm_sa_gir_create_gir(IN osm_gir_rcv_t * const p_rcv, +__osm_sa_gir_create_gir(IN osm_sa_t * sa, IN const osm_node_t * const p_node, IN cl_qlist_t * const p_list, IN ib_net64_t const match_port_guid, @@ -181,10 +142,10 @@ __osm_sa_gir_create_gir(IN osm_gir_rcv_t * const p_rcv, ib_net64_t port_guid; uint8_t block_num, start_block_num, end_block_num, num_blocks; - OSM_LOG_ENTER(p_rcv->p_log, __osm_sa_gir_create_gir); + OSM_LOG_ENTER(sa->p_log, __osm_sa_gir_create_gir); - if (osm_log_is_active(p_rcv->p_log, OSM_LOG_DEBUG)) { - 
osm_log(p_rcv->p_log, OSM_LOG_DEBUG, + if (osm_log_is_active(sa->p_log, OSM_LOG_DEBUG)) { + osm_log(sa->p_log, OSM_LOG_DEBUG, "__osm_sa_gir_create_gir: " "Looking for GUIDRecord with LID: 0x%X GUID:0x%016" PRIx64 "\n", cl_ntoh16(match_lid), @@ -209,7 +170,7 @@ __osm_sa_gir_create_gir(IN osm_gir_rcv_t * const p_rcv, /* Check to see if the found p_physp and the requester physp share a pkey. If not, continue */ - if (!osm_physp_share_pkey(p_rcv->p_log, p_physp, p_req_physp)) + if (!osm_physp_share_pkey(sa->p_log, p_physp, p_req_physp)) continue; port_guid = osm_physp_get_port_guid(p_physp); @@ -248,8 +209,8 @@ __osm_sa_gir_create_gir(IN osm_gir_rcv_t * const p_rcv, /* We validate that the lid belongs to this node. */ - if (osm_log_is_active(p_rcv->p_log, OSM_LOG_DEBUG)) { - osm_log(p_rcv->p_log, OSM_LOG_DEBUG, + if (osm_log_is_active(sa->p_log, OSM_LOG_DEBUG)) { + osm_log(sa->p_log, OSM_LOG_DEBUG, "__osm_sa_gir_create_gir: " "Comparing LID: 0x%X <= 0x%X <= 0x%X\n", base_lid_ho, match_lid_ho, max_lid_ho); @@ -262,13 +223,13 @@ __osm_sa_gir_create_gir(IN osm_gir_rcv_t * const p_rcv, for (block_num = start_block_num; block_num <= end_block_num; block_num++) - __osm_gir_rcv_new_gir(p_rcv, p_node, p_list, port_guid, + __osm_gir_rcv_new_gir(sa, p_node, p_list, port_guid, cl_ntoh16(base_lid_ho), p_physp, block_num); } - OSM_LOG_EXIT(p_rcv->p_log); + OSM_LOG_EXIT(sa->p_log); } /********************************************************************** @@ -282,14 +243,14 @@ __osm_sa_gir_by_comp_mask_cb(IN cl_map_item_t * const p_map_item, const osm_node_t *const p_node = (osm_node_t *) p_map_item; const ib_guidinfo_record_t *const p_rcvd_rec = p_ctxt->p_rcvd_rec; const osm_physp_t *const p_req_physp = p_ctxt->p_req_physp; - osm_gir_rcv_t *const p_rcv = p_ctxt->p_rcv; + osm_sa_t *sa = p_ctxt->sa; const ib_guid_info_t *p_comp_gi; ib_net64_t const comp_mask = p_ctxt->comp_mask; ib_net64_t match_port_guid = 0; ib_net16_t match_lid = 0; uint8_t match_block_num = 255; - 
OSM_LOG_ENTER(p_ctxt->p_rcv->p_log, __osm_sa_gir_by_comp_mask_cb); + OSM_LOG_ENTER(p_ctxt->sa->p_log, __osm_sa_gir_by_comp_mask_cb); if (comp_mask & IB_GIR_COMPMASK_LID) match_lid = p_rcvd_rec->lid; @@ -341,19 +302,19 @@ __osm_sa_gir_by_comp_mask_cb(IN cl_map_item_t * const p_map_item, goto Exit; } - __osm_sa_gir_create_gir(p_rcv, p_node, p_ctxt->p_list, + __osm_sa_gir_create_gir(sa, p_node, p_ctxt->p_list, match_port_guid, match_lid, p_req_physp, match_block_num); Exit: - OSM_LOG_EXIT(p_ctxt->p_rcv->p_log); + OSM_LOG_EXIT(p_ctxt->sa->p_log); } /********************************************************************** **********************************************************************/ void osm_gir_rcv_process(IN void *ctx, IN void *data) { - osm_gir_rcv_t *p_rcv = ctx; + osm_sa_t *sa = ctx; osm_madw_t *p_madw = data; const ib_sa_mad_t *p_rcvd_mad; const ib_guidinfo_record_t *p_rcvd_rec; @@ -371,9 +332,9 @@ void osm_gir_rcv_process(IN void *ctx, IN void *data) ib_api_status_t status; osm_physp_t *p_req_physp; - CL_ASSERT(p_rcv); + CL_ASSERT(sa); - OSM_LOG_ENTER(p_rcv->p_log, osm_gir_rcv_process); + OSM_LOG_ENTER(sa->p_log, osm_gir_rcv_process); CL_ASSERT(p_madw); @@ -386,29 +347,29 @@ void osm_gir_rcv_process(IN void *ctx, IN void *data) /* we only support SubnAdmGet and SubnAdmGetTable methods */ if ((p_rcvd_mad->method != IB_MAD_METHOD_GET) && (p_rcvd_mad->method != IB_MAD_METHOD_GETTABLE)) { - osm_log(p_rcv->p_log, OSM_LOG_ERROR, + osm_log(sa->p_log, OSM_LOG_ERROR, "osm_gir_rcv_process: ERR 5105: " "Unsupported Method (%s)\n", ib_get_sa_method_str(p_rcvd_mad->method)); - osm_sa_send_error(p_rcv->p_resp, p_madw, + osm_sa_send_error(sa, p_madw, IB_MAD_STATUS_UNSUP_METHOD_ATTR); goto Exit; } /* update the requester physical port. 
*/ - p_req_physp = osm_get_physp_by_mad_addr(p_rcv->p_log, - p_rcv->p_subn, + p_req_physp = osm_get_physp_by_mad_addr(sa->p_log, + sa->p_subn, osm_madw_get_mad_addr_ptr (p_madw)); if (p_req_physp == NULL) { - osm_log(p_rcv->p_log, OSM_LOG_ERROR, + osm_log(sa->p_log, OSM_LOG_ERROR, "osm_gir_rcv_process: ERR 5104: " "Cannot find requester physical port\n"); goto Exit; } - if (osm_log_is_active(p_rcv->p_log, OSM_LOG_DEBUG)) - osm_dump_guidinfo_record(p_rcv->p_log, p_rcvd_rec, + if (osm_log_is_active(sa->p_log, OSM_LOG_DEBUG)) + osm_dump_guidinfo_record(sa->p_log, p_rcvd_rec, OSM_LOG_DEBUG); cl_qlist_init(&rec_list); @@ -416,15 +377,15 @@ void osm_gir_rcv_process(IN void *ctx, IN void *data) context.p_rcvd_rec = p_rcvd_rec; context.p_list = &rec_list; context.comp_mask = p_rcvd_mad->comp_mask; - context.p_rcv = p_rcv; + context.sa = sa; context.p_req_physp = p_req_physp; - cl_plock_acquire(p_rcv->p_lock); + cl_plock_acquire(sa->p_lock); - cl_qmap_apply_func(&p_rcv->p_subn->node_guid_tbl, + cl_qmap_apply_func(&sa->p_subn->node_guid_tbl, __osm_sa_gir_by_comp_mask_cb, &context); - cl_plock_release(p_rcv->p_lock); + cl_plock_release(sa->p_lock); num_rec = cl_qlist_count(&rec_list); @@ -434,16 +395,16 @@ void osm_gir_rcv_process(IN void *ctx, IN void *data) */ if (p_rcvd_mad->method == IB_MAD_METHOD_GET) { if (num_rec == 0) { - osm_sa_send_error(p_rcv->p_resp, p_madw, + osm_sa_send_error(sa, p_madw, IB_SA_MAD_STATUS_NO_RECORDS); goto Exit; } if (num_rec > 1) { - osm_log(p_rcv->p_log, OSM_LOG_ERROR, + osm_log(sa->p_log, OSM_LOG_ERROR, "osm_gir_rcv_process: ERR 5103: " "Got more than one record for SubnAdmGet (%u)\n", num_rec); - osm_sa_send_error(p_rcv->p_resp, p_madw, + osm_sa_send_error(sa, p_madw, IB_SA_MAD_STATUS_TOO_MANY_RECORDS); /* need to set the mem free ... 
*/ @@ -466,7 +427,7 @@ void osm_gir_rcv_process(IN void *ctx, IN void *data) (MAD_BLOCK_SIZE - IB_SA_MAD_HDR_SIZE) / sizeof(ib_guidinfo_record_t); if (trim_num_rec < num_rec) { - osm_log(p_rcv->p_log, OSM_LOG_VERBOSE, + osm_log(sa->p_log, OSM_LOG_VERBOSE, "osm_gir_rcv_process: " "Number of records:%u trimmed to:%u to fit in one MAD\n", num_rec, trim_num_rec); @@ -474,11 +435,11 @@ void osm_gir_rcv_process(IN void *ctx, IN void *data) } #endif - osm_log(p_rcv->p_log, OSM_LOG_DEBUG, + osm_log(sa->p_log, OSM_LOG_DEBUG, "osm_gir_rcv_process: " "Returning %u records\n", num_rec); if ((p_rcvd_mad->method == IB_MAD_METHOD_GET) && (num_rec == 0)) { - osm_sa_send_error(p_rcv->p_resp, p_madw, + osm_sa_send_error(sa, p_madw, IB_SA_MAD_STATUS_NO_RECORDS); goto Exit; } @@ -486,13 +447,13 @@ void osm_gir_rcv_process(IN void *ctx, IN void *data) /* * Get a MAD to reply. Address of Mad is in the received mad_wrapper */ - p_resp_madw = osm_mad_pool_get(p_rcv->p_mad_pool, + p_resp_madw = osm_mad_pool_get(sa->p_mad_pool, p_madw->h_bind, num_rec * sizeof(ib_guidinfo_record_t) + IB_SA_MAD_HDR_SIZE, &p_madw->mad_addr); if (!p_resp_madw) { - osm_log(p_rcv->p_log, OSM_LOG_ERROR, + osm_log(sa->p_log, OSM_LOG_ERROR, "osm_gir_rcv_process: ERR 5106: " "osm_mad_pool_get failed\n"); @@ -502,7 +463,7 @@ void osm_gir_rcv_process(IN void *ctx, IN void *data) free(p_rec_item); } - osm_sa_send_error(p_rcv->p_resp, p_madw, + osm_sa_send_error(sa, p_madw, IB_SA_MAD_STATUS_NO_RESOURCES); goto Exit; } @@ -553,9 +514,9 @@ void osm_gir_rcv_process(IN void *ctx, IN void *data) CL_ASSERT(cl_is_qlist_empty(&rec_list)); status = osm_sa_vendor_send(p_resp_madw->h_bind, p_resp_madw, FALSE, - p_rcv->p_subn); + sa->p_subn); if (status != IB_SUCCESS) { - osm_log(p_rcv->p_log, OSM_LOG_ERROR, + osm_log(sa->p_log, OSM_LOG_ERROR, "osm_gir_rcv_process: ERR 5107: " "osm_sa_vendor_send status = %s\n", ib_get_err_str(status)); @@ -563,5 +524,5 @@ void osm_gir_rcv_process(IN void *ctx, IN void *data) } Exit: - 
OSM_LOG_EXIT(p_rcv->p_log); + OSM_LOG_EXIT(sa->p_log); } diff --git a/opensm/opensm/osm_sa_informinfo.c b/opensm/opensm/osm_sa_informinfo.c index db58bc0..92a7fa1 100644 --- a/opensm/opensm/osm_sa_informinfo.c +++ b/opensm/opensm/osm_sa_informinfo.c @@ -55,12 +55,10 @@ #include #include #include -#include +#include #include #include #include -#include -#include #include #include #include @@ -77,51 +75,11 @@ typedef struct _osm_iir_search_ctxt { cl_qlist_t *p_list; ib_gid_t subscriber_gid; ib_net16_t subscriber_enum; - osm_infr_rcv_t *p_rcv; + osm_sa_t *sa; osm_physp_t *p_req_physp; } osm_iir_search_ctxt_t; /********************************************************************** - **********************************************************************/ -void osm_infr_rcv_construct(IN osm_infr_rcv_t * const p_rcv) -{ - memset(p_rcv, 0, sizeof(*p_rcv)); -} - -/********************************************************************** - **********************************************************************/ -void osm_infr_rcv_destroy(IN osm_infr_rcv_t * const p_rcv) -{ - CL_ASSERT(p_rcv); - - OSM_LOG_ENTER(p_rcv->p_log, osm_infr_rcv_destroy); - OSM_LOG_EXIT(p_rcv->p_log); -} - -/********************************************************************** - **********************************************************************/ -ib_api_status_t -osm_infr_rcv_init(IN osm_infr_rcv_t * const p_rcv, - IN osm_sa_resp_t * const p_resp, - IN osm_mad_pool_t * const p_mad_pool, - IN osm_subn_t * const p_subn, - IN osm_log_t * const p_log, IN cl_plock_t * const p_lock) -{ - OSM_LOG_ENTER(p_log, osm_infr_rcv_init); - - osm_infr_rcv_construct(p_rcv); - - p_rcv->p_log = p_log; - p_rcv->p_subn = p_subn; - p_rcv->p_lock = p_lock; - p_rcv->p_resp = p_resp; - p_rcv->p_mad_pool = p_mad_pool; - - OSM_LOG_EXIT(p_rcv->p_log); - return IB_SUCCESS; -} - -/********************************************************************** o13-14.1.1: Except for Set(InformInfo) requests with Inform- 
Info:LIDRangeBegin=0xFFFF, managers that support event forwarding shall, upon receiving a Set(InformInfo), verify that the requester @@ -129,7 +87,7 @@ originating the Set(InformInfo) and a Trap() source identified by Inform- can access each other - can use path record to verify that. **********************************************************************/ static boolean_t -__validate_ports_access_rights(IN osm_infr_rcv_t * const p_rcv, +__validate_ports_access_rights(IN osm_sa_t * sa, IN osm_infr_t * p_infr_rec) { boolean_t valid = TRUE; @@ -143,11 +101,11 @@ __validate_ports_access_rights(IN osm_infr_rcv_t * const p_rcv, const cl_ptr_vector_t *p_tbl; ib_gid_t zero_gid; - OSM_LOG_ENTER(p_rcv->p_log, __validate_ports_access_rights); + OSM_LOG_ENTER(sa->p_log, __validate_ports_access_rights); /* get the requester physp from the request address */ - p_requester_physp = osm_get_physp_by_mad_addr(p_rcv->p_log, - p_rcv->p_subn, + p_requester_physp = osm_get_physp_by_mad_addr(sa->p_log, + sa->p_subn, &p_infr_rec->report_addr); memset(&zero_gid, 0, sizeof(zero_gid)); @@ -158,10 +116,10 @@ __validate_ports_access_rights(IN osm_infr_rcv_t * const p_rcv, p_infr_rec->inform_record.inform_info.gid.unicast. interface_id; - p_port = osm_get_port_by_guid(p_rcv->p_subn, portguid); + p_port = osm_get_port_by_guid(sa->p_subn, portguid); if (p_port == NULL) { - osm_log(p_rcv->p_log, OSM_LOG_ERROR, + osm_log(sa->p_log, OSM_LOG_ERROR, "__validate_ports_access_rights: ERR 4301: " "Invalid port guid: 0x%016" PRIx64 "\n", cl_ntoh64(portguid)); @@ -175,8 +133,8 @@ __validate_ports_access_rights(IN osm_infr_rcv_t * const p_rcv, /* make sure that the requester and destination port can access each other according to the current partitioning. 
*/ if (!osm_physp_share_pkey - (p_rcv->p_log, p_physp, p_requester_physp)) { - osm_log(p_rcv->p_log, OSM_LOG_DEBUG, + (sa->p_log, p_physp, p_requester_physp)) { + osm_log(sa->p_log, OSM_LOG_DEBUG, "__validate_ports_access_rights: " "port and requester don't share pkey\n"); valid = FALSE; @@ -203,12 +161,12 @@ __validate_ports_access_rights(IN osm_infr_rcv_t * const p_rcv, /* go over all defined lids within the range and make sure that the requester port can access them according to current partitioning. */ for (lid = lid_range_begin; lid <= lid_range_end; lid++) { - p_tbl = &p_rcv->p_subn->port_lid_tbl; + p_tbl = &sa->p_subn->port_lid_tbl; if (cl_ptr_vector_get_size(p_tbl) > lid) { p_port = cl_ptr_vector_get(p_tbl, lid); } else { /* lid requested is out of range */ - osm_log(p_rcv->p_log, OSM_LOG_ERROR, + osm_log(sa->p_log, OSM_LOG_ERROR, "__validate_ports_access_rights: ERR 4302: " "Given LID (0x%X) is out of range:0x%X\n", lid, cl_ptr_vector_get_size(p_tbl)); @@ -222,8 +180,8 @@ __validate_ports_access_rights(IN osm_infr_rcv_t * const p_rcv, /* make sure that the requester and destination port can access each other according to the current partitioning. 
*/ if (!osm_physp_share_pkey - (p_rcv->p_log, p_physp, p_requester_physp)) { - osm_log(p_rcv->p_log, OSM_LOG_DEBUG, + (sa->p_log, p_physp, p_requester_physp)) { + osm_log(sa->p_log, OSM_LOG_DEBUG, "__validate_ports_access_rights: " "port and requester don't share pkey\n"); valid = FALSE; @@ -233,27 +191,27 @@ __validate_ports_access_rights(IN osm_infr_rcv_t * const p_rcv, } Exit: - OSM_LOG_EXIT(p_rcv->p_log); + OSM_LOG_EXIT(sa->p_log); return valid; } /********************************************************************** **********************************************************************/ static boolean_t -__validate_infr(IN osm_infr_rcv_t * const p_rcv, IN osm_infr_t * p_infr_rec) +__validate_infr(IN osm_sa_t * sa, IN osm_infr_t * p_infr_rec) { boolean_t valid = TRUE; - OSM_LOG_ENTER(p_rcv->p_log, __validate_infr); + OSM_LOG_ENTER(sa->p_log, __validate_infr); - valid = __validate_ports_access_rights(p_rcv, p_infr_rec); + valid = __validate_ports_access_rights(sa, p_infr_rec); if (!valid) { - osm_log(p_rcv->p_log, OSM_LOG_DEBUG, + osm_log(sa->p_log, OSM_LOG_DEBUG, "__validate_infr: " "Invalid Access for InformInfo\n"); valid = FALSE; } - OSM_LOG_EXIT(p_rcv->p_log); + OSM_LOG_EXIT(sa->p_log); return valid; } @@ -263,7 +221,7 @@ with an InformInfo attribute that is a copy of the data in the Set(InformInfo) request. 
 **********************************************************************/
 static void
-__osm_infr_rcv_respond(IN osm_infr_rcv_t * const p_rcv,
+__osm_infr_rcv_respond(IN osm_sa_t * sa,
 		       IN const osm_madw_t * const p_madw)
 {
 	osm_madw_t *p_resp_madw;
@@ -272,10 +230,10 @@ __osm_infr_rcv_respond(IN osm_infr_rcv_t * const p_rcv,
 	ib_inform_info_t *p_resp_infr;
 	ib_api_status_t status;
 
-	OSM_LOG_ENTER(p_rcv->p_log, __osm_infr_rcv_respond);
+	OSM_LOG_ENTER(sa->p_log, __osm_infr_rcv_respond);
 
-	if (osm_log_is_active(p_rcv->p_log, OSM_LOG_DEBUG)) {
-		osm_log(p_rcv->p_log, OSM_LOG_DEBUG,
+	if (osm_log_is_active(sa->p_log, OSM_LOG_DEBUG)) {
+		osm_log(sa->p_log, OSM_LOG_DEBUG,
 			"__osm_infr_rcv_respond: "
 			"Generating successful InformInfo response\n");
 	}
@@ -283,11 +241,11 @@ __osm_infr_rcv_respond(IN osm_infr_rcv_t * const p_rcv,
 	/*
 	   Get a MAD to reply. Address of Mad is in the received mad_wrapper
 	 */
-	p_resp_madw = osm_mad_pool_get(p_rcv->p_mad_pool,
+	p_resp_madw = osm_mad_pool_get(sa->p_mad_pool,
 				       p_madw->h_bind,
 				       MAD_BLOCK_SIZE, &p_madw->mad_addr);
 	if (!p_resp_madw) {
-		osm_log(p_rcv->p_log, OSM_LOG_ERROR,
+		osm_log(sa->p_log, OSM_LOG_ERROR,
 			"__osm_infr_rcv_respond: ERR 4303: "
 			"Unable to allocate MAD\n");
 		goto Exit;
@@ -306,24 +264,24 @@ __osm_infr_rcv_respond(IN osm_infr_rcv_t * const p_rcv,
 	    (ib_inform_info_t *) ib_sa_mad_get_payload_ptr(p_resp_sa_mad);
 
 	status = osm_sa_vendor_send(p_resp_madw->h_bind, p_resp_madw, FALSE,
-				    p_rcv->p_subn);
+				    sa->p_subn);
 
 	if (status != IB_SUCCESS) {
-		osm_log(p_rcv->p_log, OSM_LOG_ERROR,
+		osm_log(sa->p_log, OSM_LOG_ERROR,
 			"__osm_infr_rcv_respond: ERR 4304: "
 			"Unable to send MAD (%s)\n", ib_get_err_str(status));
-		/* osm_mad_pool_put( p_rcv->p_mad_pool, p_resp_madw ); */
+		/* osm_mad_pool_put( sa->p_mad_pool, p_resp_madw ); */
 		goto Exit;
 	}
 
 Exit:
-	OSM_LOG_EXIT(p_rcv->p_log);
+	OSM_LOG_EXIT(sa->p_log);
 }
 
 /**********************************************************************
  **********************************************************************/
 static void
-__osm_sa_inform_info_rec_by_comp_mask(IN osm_infr_rcv_t * const p_rcv,
+__osm_sa_inform_info_rec_by_comp_mask(IN osm_sa_t * sa,
 				      IN const osm_infr_t * const p_infr,
 				      osm_iir_search_ctxt_t * const p_ctxt)
 {
@@ -335,7 +293,7 @@ __osm_sa_inform_info_rec_by_comp_mask(IN osm_infr_rcv_t * const p_rcv,
 	const osm_physp_t *p_req_physp;
 	osm_iir_item_t *p_rec_item;
 
-	OSM_LOG_ENTER(p_rcv->p_log, __osm_sa_inform_info_rec_by_comp_mask);
+	OSM_LOG_ENTER(sa->p_log, __osm_sa_inform_info_rec_by_comp_mask);
 
 	p_rcvd_rec = p_ctxt->p_rcvd_rec;
 	comp_mask = p_ctxt->comp_mask;
@@ -358,9 +316,9 @@ __osm_sa_inform_info_rec_by_comp_mask(IN osm_infr_rcv_t * const p_rcv,
 
 	/* Ensure pkey is shared before returning any records */
 	portguid = p_infr->inform_record.subscriber_gid.unicast.interface_id;
-	p_subscriber_port = osm_get_port_by_guid(p_rcv->p_subn, portguid);
+	p_subscriber_port = osm_get_port_by_guid(sa->p_subn, portguid);
 	if (p_subscriber_port == NULL) {
-		osm_log(p_rcv->p_log, OSM_LOG_ERROR,
+		osm_log(sa->p_log, OSM_LOG_ERROR,
 			"__osm_sa_inform_info_rec_by_comp_mask: ERR 430D: "
 			"Invalid subscriber port guid: 0x%016" PRIx64 "\n",
 			cl_ntoh64(portguid));
@@ -372,8 +330,8 @@ __osm_sa_inform_info_rec_by_comp_mask(IN osm_infr_rcv_t * const p_rcv,
 	/* make sure that the requester and subscriber port can access each
 	   other according to the current partitioning.
 	 */
 	if (!osm_physp_share_pkey
-	    (p_rcv->p_log, p_req_physp, p_subscriber_physp)) {
-		osm_log(p_rcv->p_log, OSM_LOG_DEBUG,
+	    (sa->p_log, p_req_physp, p_subscriber_physp)) {
+		osm_log(sa->p_log, OSM_LOG_DEBUG,
 			"__osm_sa_inform_info_rec_by_comp_mask: "
 			"requester and subscriber ports don't share pkey\n");
 		goto Exit;
@@ -381,7 +339,7 @@ __osm_sa_inform_info_rec_by_comp_mask(IN osm_infr_rcv_t * const p_rcv,
 
 	p_rec_item = malloc(sizeof(*p_rec_item));
 	if (p_rec_item == NULL) {
-		osm_log(p_rcv->p_log, OSM_LOG_ERROR,
+		osm_log(sa->p_log, OSM_LOG_ERROR,
 			"__osm_sa_inform_info_rec_by_comp_mask: ERR 430E: "
 			"rec_item alloc failed\n");
 		goto Exit;
@@ -392,7 +350,7 @@ __osm_sa_inform_info_rec_by_comp_mask(IN osm_infr_rcv_t * const p_rcv,
 	cl_qlist_insert_tail(p_ctxt->p_list, &p_rec_item->list_item);
 
 Exit:
-	OSM_LOG_EXIT(p_rcv->p_log);
+	OSM_LOG_EXIT(sa->p_log);
 }
 
 /**********************************************************************
@@ -404,14 +362,14 @@ __osm_sa_inform_info_rec_by_comp_mask_cb(IN cl_list_item_t * const p_list_item,
 	const osm_infr_t *const p_infr = (osm_infr_t *) p_list_item;
 	osm_iir_search_ctxt_t *const p_ctxt = (osm_iir_search_ctxt_t *) context;
 
-	__osm_sa_inform_info_rec_by_comp_mask(p_ctxt->p_rcv, p_infr, p_ctxt);
+	__osm_sa_inform_info_rec_by_comp_mask(p_ctxt->sa, p_infr, p_ctxt);
 }
 
 /**********************************************************************
 Received a Get(InformInfoRecord) or GetTable(InformInfoRecord) MAD
 **********************************************************************/
 static void
-osm_infr_rcv_process_get_method(IN osm_infr_rcv_t * const p_rcv,
+osm_infr_rcv_process_get_method(IN osm_sa_t * sa,
 				IN const osm_madw_t * const p_madw)
 {
 	ib_sa_mad_t *p_rcvd_mad;
@@ -430,7 +388,7 @@ osm_infr_rcv_process_get_method(IN osm_infr_rcv_t * const p_rcv,
 	ib_api_status_t status = IB_SUCCESS;
 	osm_physp_t *p_req_physp;
 
-	OSM_LOG_ENTER(p_rcv->p_log, osm_infr_rcv_process_get_method);
+	OSM_LOG_ENTER(sa->p_log, osm_infr_rcv_process_get_method);
 	CL_ASSERT(p_madw);
 	p_rcvd_mad = osm_madw_get_sa_mad_ptr(p_madw);
@@ -438,19 +396,19 @@ osm_infr_rcv_process_get_method(IN osm_infr_rcv_t * const p_rcv,
 	    (ib_inform_info_record_t *) ib_sa_mad_get_payload_ptr(p_rcvd_mad);
 
 	/* update the requester physical port. */
-	p_req_physp = osm_get_physp_by_mad_addr(p_rcv->p_log,
-						p_rcv->p_subn,
+	p_req_physp = osm_get_physp_by_mad_addr(sa->p_log,
+						sa->p_subn,
 						osm_madw_get_mad_addr_ptr
 						(p_madw));
 	if (p_req_physp == NULL) {
-		osm_log(p_rcv->p_log, OSM_LOG_ERROR,
+		osm_log(sa->p_log, OSM_LOG_ERROR,
 			"osm_infr_rcv_process_get_method: ERR 4309: "
 			"Cannot find requester physical port\n");
 		goto Exit;
 	}
 
-	if (osm_log_is_active(p_rcv->p_log, OSM_LOG_DEBUG))
-		osm_dump_inform_info_record(p_rcv->p_log, p_rcvd_rec,
+	if (osm_log_is_active(sa->p_log, OSM_LOG_DEBUG))
+		osm_dump_inform_info_record(sa->p_log, p_rcvd_rec,
 					    OSM_LOG_DEBUG);
 
 	cl_qlist_init(&rec_list);
@@ -460,10 +418,10 @@ osm_infr_rcv_process_get_method(IN osm_infr_rcv_t * const p_rcv,
 	context.comp_mask = p_rcvd_mad->comp_mask;
 	context.subscriber_gid = p_rcvd_rec->subscriber_gid;
 	context.subscriber_enum = p_rcvd_rec->subscriber_enum;
-	context.p_rcv = p_rcv;
+	context.sa = sa;
 	context.p_req_physp = p_req_physp;
 
-	osm_log(p_rcv->p_log, OSM_LOG_DEBUG,
+	osm_log(sa->p_log, OSM_LOG_DEBUG,
 		"osm_infr_rcv_process_get_method: "
 		"Query Subscriber GID:0x%016" PRIx64 " : 0x%016" PRIx64
 		"(%02X) Enum:0x%X(%02X)\n",
@@ -473,12 +431,12 @@ osm_infr_rcv_process_get_method(IN osm_infr_rcv_t * const p_rcv,
 		cl_ntoh16(p_rcvd_rec->subscriber_enum),
 		(p_rcvd_mad->comp_mask & IB_IIR_COMPMASK_ENUM) != 0);
 
-	cl_plock_acquire(p_rcv->p_lock);
+	cl_plock_acquire(sa->p_lock);
 
-	cl_qlist_apply_func(&p_rcv->p_subn->sa_infr_list,
+	cl_qlist_apply_func(&sa->p_subn->sa_infr_list,
 			    __osm_sa_inform_info_rec_by_comp_mask_cb, &context);
 
-	cl_plock_release(p_rcv->p_lock);
+	cl_plock_release(sa->p_lock);
 
 	num_rec = cl_qlist_count(&rec_list);
 
@@ -488,16 +446,16 @@ osm_infr_rcv_process_get_method(IN osm_infr_rcv_t * const p_rcv,
 	 */
 	if (p_rcvd_mad->method == IB_MAD_METHOD_GET) {
 		if (num_rec == 0) {
-			osm_sa_send_error(p_rcv->p_resp, p_madw,
+			osm_sa_send_error(sa, p_madw,
 					  IB_SA_MAD_STATUS_NO_RECORDS);
 			goto Exit;
 		}
 		if (num_rec > 1) {
-			osm_log(p_rcv->p_log, OSM_LOG_ERROR,
+			osm_log(sa->p_log, OSM_LOG_ERROR,
 				"osm_infr_rcv_process_get_method: ERR 430A: "
 				"More than one record for SubnAdmGet (%u)\n",
 				num_rec);
-			osm_sa_send_error(p_rcv->p_resp, p_madw,
+			osm_sa_send_error(sa, p_madw,
 					  IB_SA_MAD_STATUS_TOO_MANY_RECORDS);
 
 			/* need to set the mem free ... */
@@ -521,7 +479,7 @@ osm_infr_rcv_process_get_method(IN osm_infr_rcv_t * const p_rcv,
 	    (MAD_BLOCK_SIZE -
 	     IB_SA_MAD_HDR_SIZE) / sizeof(ib_inform_info_record_t);
 	if (trim_num_rec < num_rec) {
-		osm_log(p_rcv->p_log, OSM_LOG_VERBOSE,
+		osm_log(sa->p_log, OSM_LOG_VERBOSE,
 			"osm_infr_rcv_process_get_method: "
 			"Number of records:%u trimmed to:%u to fit in one MAD\n",
 			num_rec, trim_num_rec);
@@ -529,21 +487,21 @@ osm_infr_rcv_process_get_method(IN osm_infr_rcv_t * const p_rcv,
 	}
 #endif
 
-	osm_log(p_rcv->p_log, OSM_LOG_DEBUG,
+	osm_log(sa->p_log, OSM_LOG_DEBUG,
 		"osm_infr_rcv_process_get_method: "
 		"Returning %u records\n", num_rec);
 
 	/*
 	 * Get a MAD to reply.
 	 * Address of Mad is in the received mad_wrapper
 	 */
-	p_resp_madw = osm_mad_pool_get(p_rcv->p_mad_pool,
+	p_resp_madw = osm_mad_pool_get(sa->p_mad_pool,
 				       p_madw->h_bind,
 				       num_rec *
 				       sizeof(ib_inform_info_record_t) +
 				       IB_SA_MAD_HDR_SIZE, &p_madw->mad_addr);
 
 	if (!p_resp_madw) {
-		osm_log(p_rcv->p_log, OSM_LOG_ERROR,
+		osm_log(sa->p_log, OSM_LOG_ERROR,
 			"osm_infr_rcv_process_get_method: ERR 430B: "
 			"osm_mad_pool_get failed\n");
 
@@ -553,7 +511,7 @@ osm_infr_rcv_process_get_method(IN osm_infr_rcv_t * const p_rcv,
 			free(p_rec_item);
 		}
 
-		osm_sa_send_error(p_rcv->p_resp, p_madw,
+		osm_sa_send_error(sa, p_madw,
 				  IB_SA_MAD_STATUS_NO_RESOURCES);
 
 		goto Exit;
@@ -611,9 +569,9 @@ osm_infr_rcv_process_get_method(IN osm_infr_rcv_t * const p_rcv,
 	CL_ASSERT(cl_is_qlist_empty(&rec_list));
 
 	status = osm_sa_vendor_send(p_resp_madw->h_bind, p_resp_madw, FALSE,
-				    p_rcv->p_subn);
+				    sa->p_subn);
 	if (status != IB_SUCCESS) {
-		osm_log(p_rcv->p_log, OSM_LOG_ERROR,
+		osm_log(sa->p_log, OSM_LOG_ERROR,
 			"osm_infr_rcv_process_get_method: ERR 430C: "
 			"osm_sa_vendor_send status = %s\n",
 			ib_get_err_str(status));
@@ -621,14 +579,14 @@ osm_infr_rcv_process_get_method(IN osm_infr_rcv_t * const p_rcv,
 	}
 
 Exit:
-	OSM_LOG_EXIT(p_rcv->p_log);
+	OSM_LOG_EXIT(sa->p_log);
 }
 
 /*********************************************************************
 Received a Set(InformInfo) MAD
 **********************************************************************/
 static void
-osm_infr_rcv_process_set_method(IN osm_infr_rcv_t * const p_rcv,
+osm_infr_rcv_process_set_method(IN osm_sa_t * sa,
 				IN const osm_madw_t * const p_madw)
 {
 	ib_sa_mad_t *p_sa_mad;
@@ -639,7 +597,7 @@ osm_infr_rcv_process_set_method(IN osm_infr_rcv_t * const p_rcv,
 	uint8_t resp_time_val;
 	ib_api_status_t res;
 
-	OSM_LOG_ENTER(p_rcv->p_log, osm_infr_rcv_process_set_method);
+	OSM_LOG_ENTER(sa->p_log, osm_infr_rcv_process_set_method);
 
 	CL_ASSERT(p_madw);
 
@@ -648,13 +606,13 @@ osm_infr_rcv_process_set_method(IN osm_infr_rcv_t * const p_rcv,
 	    (ib_inform_info_t *)
 	    ib_sa_mad_get_payload_ptr(p_sa_mad);
 
 #if 0
-	if (osm_log_is_active(p_rcv->p_log, OSM_LOG_DEBUG))
-		osm_dump_inform_info(p_rcv->p_log, p_recvd_inform_info,
+	if (osm_log_is_active(sa->p_log, OSM_LOG_DEBUG))
+		osm_dump_inform_info(sa->p_log, p_recvd_inform_info,
 				     OSM_LOG_DEBUG);
 #endif
 
 	/* Grab the lock */
-	cl_plock_excl_acquire(p_rcv->p_lock);
+	cl_plock_excl_acquire(sa->p_lock);
 
 	/* define the inform record */
 	inform_info_rec.inform_record.inform_info = *p_recvd_inform_info;
@@ -664,23 +622,23 @@ osm_infr_rcv_process_set_method(IN osm_infr_rcv_t * const p_rcv,
 
 	/* we will need to know the mad srvc to send back through */
 	inform_info_rec.h_bind = p_madw->h_bind;
-	inform_info_rec.p_infr_rcv = p_rcv;
+	inform_info_rec.sa = sa;
 
 	/* update the subscriber GID according to mad address */
-	res = osm_get_gid_by_mad_addr(p_rcv->p_log,
-				      p_rcv->p_subn,
+	res = osm_get_gid_by_mad_addr(sa->p_log,
+				      sa->p_subn,
 				      &p_madw->mad_addr,
 				      &inform_info_rec.inform_record.
 				      subscriber_gid);
 	if (res != IB_SUCCESS) {
-		cl_plock_release(p_rcv->p_lock);
+		cl_plock_release(sa->p_lock);
 
-		osm_log(p_rcv->p_log, OSM_LOG_ERROR,
+		osm_log(sa->p_log, OSM_LOG_ERROR,
 			"osm_infr_rcv_process_set_method: ERR 4308 "
 			"Subscribe Request from unknown LID: 0x%04X\n",
 			cl_ntoh16(p_madw->mad_addr.dest_lid) );
-		osm_sa_send_error(p_rcv->p_resp, p_madw,
+		osm_sa_send_error(sa, p_madw,
 				  IB_SA_MAD_STATUS_REQ_INVALID);
 		goto Exit;
 	}
@@ -690,13 +648,13 @@ osm_infr_rcv_process_set_method(IN osm_infr_rcv_t * const p_rcv,
 
 	/* Subscribe values above 1 are undefined */
 	if (p_recvd_inform_info->subscribe > 1) {
-		cl_plock_release(p_rcv->p_lock);
+		cl_plock_release(sa->p_lock);
 
-		osm_log(p_rcv->p_log, OSM_LOG_ERROR,
+		osm_log(sa->p_log, OSM_LOG_ERROR,
 			"osm_infr_rcv_process_set_method: ERR 4308 "
 			"Invalid subscribe: %d\n",
 			p_recvd_inform_info->subscribe);
-		osm_sa_send_error(p_rcv->p_resp, p_madw,
+		osm_sa_send_error(sa, p_madw,
 				  IB_SA_MAD_STATUS_REQ_INVALID);
 		goto Exit;
 	}
@@ -714,7 +672,7 @@ osm_infr_rcv_process_set_method(IN osm_infr_rcv_t * const p_rcv,
 						 inform_info_rec.report_addr.addr_type.
 						 gsi.remote_qp);
 
-		osm_log(p_rcv->p_log, OSM_LOG_DEBUG,
+		osm_log(sa->p_log, OSM_LOG_DEBUG,
 			"osm_infr_rcv_process_set_method: "
 			"Subscribe Request with QPN: 0x%06X\n",
 			cl_ntoh32(inform_info_rec.report_addr.addr_type.gsi.
@@ -725,7 +683,7 @@ osm_infr_rcv_process_set_method(IN osm_infr_rcv_t * const p_rcv,
 						 generic.qpn_resp_time_val,
 						 &qpn, &resp_time_val);
 
-		osm_log(p_rcv->p_log, OSM_LOG_DEBUG,
+		osm_log(sa->p_log, OSM_LOG_DEBUG,
 			"osm_infr_rcv_process_set_method: "
 			"UnSubscribe Request with QPN: 0x%06X\n",
 			cl_ntoh32(qpn) );
@@ -733,21 +691,21 @@ osm_infr_rcv_process_set_method(IN osm_infr_rcv_t * const p_rcv,
 
 	/* If record exists with matching InformInfo */
 	p_infr =
-	    osm_infr_get_by_rec(p_rcv->p_subn, p_rcv->p_log, &inform_info_rec);
+	    osm_infr_get_by_rec(sa->p_subn, sa->p_log, &inform_info_rec);
 
 	/* check to see if the request was for subscribe */
 	if (p_recvd_inform_info->subscribe) {
 		/* validate the request for a new or update InformInfo */
-		if (__validate_infr(p_rcv, &inform_info_rec) != TRUE) {
-			cl_plock_release(p_rcv->p_lock);
+		if (__validate_infr(sa, &inform_info_rec) != TRUE) {
+			cl_plock_release(sa->p_lock);
 
-			osm_log(p_rcv->p_log, OSM_LOG_ERROR,
+			osm_log(sa->p_log, OSM_LOG_ERROR,
 				"osm_infr_rcv_process_set_method: ERR 4305: "
 				"Failed to validate a new inform object\n");
 
 			/* o13-13.1.1: we need to set the subscribe bit to 0 */
 			p_recvd_inform_info->subscribe = 0;
-			osm_sa_send_error(p_rcv->p_resp, p_madw,
+			osm_sa_send_error(sa, p_madw,
 					  IB_SA_MAD_STATUS_REQ_INVALID);
 			goto Exit;
 		}
@@ -757,21 +715,21 @@ osm_infr_rcv_process_set_method(IN osm_infr_rcv_t * const p_rcv,
 			/* Create the instance of the osm_infr_t object */
 			p_infr = osm_infr_new(&inform_info_rec);
 			if (p_infr == NULL) {
-				cl_plock_release(p_rcv->p_lock);
+				cl_plock_release(sa->p_lock);
 
-				osm_log(p_rcv->p_log, OSM_LOG_ERROR,
+				osm_log(sa->p_log, OSM_LOG_ERROR,
 					"osm_infr_rcv_process_set_method: ERR 4306: "
 					"Failed to create a new inform object\n");
 
 				/* o13-13.1.1: we need to set the subscribe bit to 0 */
 				p_recvd_inform_info->subscribe = 0;
-				osm_sa_send_error(p_rcv->p_resp, p_madw,
+				osm_sa_send_error(sa, p_madw,
 						  IB_SA_MAD_STATUS_NO_RESOURCES);
 				goto Exit;
 			}
 
 			/* Add this new osm_infr_t object to subnet object */
-			osm_infr_insert_to_db(p_rcv->p_subn, p_rcv->p_log,
+			osm_infr_insert_to_db(sa->p_subn, sa->p_log,
 					      p_infr);
 		} else {
 			/* Update the old instance of the osm_infr_t object */
@@ -780,43 +738,43 @@ osm_infr_rcv_process_set_method(IN osm_infr_rcv_t * const p_rcv,
 	} else {
 		/* We got an UnSubscribe request */
 		if (p_infr == NULL) {
-			cl_plock_release(p_rcv->p_lock);
+			cl_plock_release(sa->p_lock);
 
 			/* No Such Item - So Error */
-			osm_log(p_rcv->p_log, OSM_LOG_ERROR,
+			osm_log(sa->p_log, OSM_LOG_ERROR,
 				"osm_infr_rcv_process_set_method: ERR 4307: "
 				"Failed to UnSubscribe to non existing inform object\n");
 
 			/* o13-13.1.1: we need to set the subscribe bit to 0 */
 			p_recvd_inform_info->subscribe = 0;
-			osm_sa_send_error(p_rcv->p_resp, p_madw,
+			osm_sa_send_error(sa, p_madw,
 					  IB_SA_MAD_STATUS_REQ_INVALID);
 			goto Exit;
 		} else {
 			/* Delete this object from the subnet list of informs */
-			osm_infr_remove_from_db(p_rcv->p_subn, p_rcv->p_log,
+			osm_infr_remove_from_db(sa->p_subn, sa->p_log,
 						p_infr);
 		}
 	}
 
-	cl_plock_release(p_rcv->p_lock);
+	cl_plock_release(sa->p_lock);
 
 	/* send the success response */
-	__osm_infr_rcv_respond(p_rcv, p_madw);
+	__osm_infr_rcv_respond(sa, p_madw);
 
 Exit:
-	OSM_LOG_EXIT(p_rcv->p_log);
+	OSM_LOG_EXIT(sa->p_log);
 }
 
 /*********************************************************************
 **********************************************************************/
 void osm_infr_rcv_process(IN void *context, IN void *data)
 {
-	osm_infr_rcv_t *p_rcv = context;
+	osm_sa_t *sa = context;
 	osm_madw_t *p_madw = data;
 	ib_sa_mad_t *p_sa_mad;
 
-	OSM_LOG_ENTER(p_rcv->p_log, osm_infr_rcv_process);
+	OSM_LOG_ENTER(sa->p_log, osm_infr_rcv_process);
 
 	CL_ASSERT(p_madw);
 
@@ -825,30 +783,30 @@ void osm_infr_rcv_process(IN void *context, IN void *data)
 	CL_ASSERT(p_sa_mad->attr_id == IB_MAD_ATTR_INFORM_INFO);
 
 	if (p_sa_mad->method != IB_MAD_METHOD_SET) {
-		osm_log(p_rcv->p_log, OSM_LOG_DEBUG,
+		osm_log(sa->p_log, OSM_LOG_DEBUG,
 			"osm_infr_rcv_process: "
 			"Unsupported Method (%s)\n",
 			ib_get_sa_method_str(p_sa_mad->method));
-		osm_sa_send_error(p_rcv->p_resp, p_madw,
+		osm_sa_send_error(sa, p_madw,
 				  IB_MAD_STATUS_UNSUP_METHOD_ATTR);
 		goto Exit;
 	}
 
-	osm_infr_rcv_process_set_method(p_rcv, p_madw);
+	osm_infr_rcv_process_set_method(sa, p_madw);
 
 Exit:
-	OSM_LOG_EXIT(p_rcv->p_log);
+	OSM_LOG_EXIT(sa->p_log);
 }
 
 /*********************************************************************
 **********************************************************************/
 void osm_infir_rcv_process(IN void *context, IN void *data)
 {
-	osm_infr_rcv_t *p_rcv = context;
+	osm_sa_t *sa = context;
 	osm_madw_t *p_madw = data;
 	ib_sa_mad_t *p_sa_mad;
 
-	OSM_LOG_ENTER(p_rcv->p_log, osm_infr_rcv_process);
+	OSM_LOG_ENTER(sa->p_log, osm_infr_rcv_process);
 
 	CL_ASSERT(p_madw);
 
@@ -858,17 +816,17 @@ void osm_infir_rcv_process(IN void *context, IN void *data)
 
 	if ((p_sa_mad->method != IB_MAD_METHOD_GET) &&
 	    (p_sa_mad->method != IB_MAD_METHOD_GETTABLE)) {
-		osm_log(p_rcv->p_log, OSM_LOG_DEBUG,
+		osm_log(sa->p_log, OSM_LOG_DEBUG,
 			"osm_infir_rcv_process: "
 			"Unsupported Method (%s)\n",
 			ib_get_sa_method_str(p_sa_mad->method));
-		osm_sa_send_error(p_rcv->p_resp, p_madw,
+		osm_sa_send_error(sa, p_madw,
 				  IB_MAD_STATUS_UNSUP_METHOD_ATTR);
 		goto Exit;
 	}
 
-	osm_infr_rcv_process_get_method(p_rcv, p_madw);
+	osm_infr_rcv_process_get_method(sa, p_madw);
 
 Exit:
-	OSM_LOG_EXIT(p_rcv->p_log);
+	OSM_LOG_EXIT(sa->p_log);
 }
diff --git a/opensm/opensm/osm_sa_lft_record.c b/opensm/opensm/osm_sa_lft_record.c
index 5f3f208..b6a86d5 100644
--- a/opensm/opensm/osm_sa_lft_record.c
+++ b/opensm/opensm/osm_sa_lft_record.c
@@ -53,9 +53,8 @@
 #include
 #include
 #include
-#include
-#include
 #include
+#include
 #include
 #include
 #include
@@ -69,52 +68,14 @@ typedef struct _osm_lftr_search_ctxt {
 	const ib_lft_record_t *p_rcvd_rec;
 	ib_net64_t comp_mask;
 	cl_qlist_t *p_list;
-	osm_lftr_rcv_t *p_rcv;
+	osm_sa_t *sa;
 	const osm_physp_t *p_req_physp;
 } osm_lftr_search_ctxt_t;
 
 /**********************************************************************
  **********************************************************************/
-void osm_lftr_rcv_construct(IN osm_lftr_rcv_t * const p_rcv)
-{
-	memset(p_rcv, 0, sizeof(*p_rcv));
-}
-
-/**********************************************************************
- **********************************************************************/
-void osm_lftr_rcv_destroy(IN osm_lftr_rcv_t * const p_rcv)
-{
-	OSM_LOG_ENTER(p_rcv->p_log, osm_lftr_rcv_destroy);
-	OSM_LOG_EXIT(p_rcv->p_log);
-}
-
-/**********************************************************************
- **********************************************************************/
-ib_api_status_t
-osm_lftr_rcv_init(IN osm_lftr_rcv_t * const p_rcv,
-		  IN osm_sa_resp_t * const p_resp,
-		  IN osm_mad_pool_t * const p_mad_pool,
-		  IN osm_subn_t * const p_subn,
-		  IN osm_log_t * const p_log, IN cl_plock_t * const p_lock)
-{
-	OSM_LOG_ENTER(p_log, osm_lftr_rcv_init);
-
-	osm_lftr_rcv_construct(p_rcv);
-
-	p_rcv->p_log = p_log;
-	p_rcv->p_subn = p_subn;
-	p_rcv->p_lock = p_lock;
-	p_rcv->p_resp = p_resp;
-	p_rcv->p_mad_pool = p_mad_pool;
-
-	OSM_LOG_EXIT(p_log);
-	return IB_SUCCESS;
-}
-
-/**********************************************************************
- **********************************************************************/
 static ib_api_status_t
-__osm_lftr_rcv_new_lftr(IN osm_lftr_rcv_t * const p_rcv,
+__osm_lftr_rcv_new_lftr(IN osm_sa_t * sa,
 			IN const osm_switch_t * const p_sw,
 			IN cl_qlist_t * const p_list,
 			IN ib_net16_t const lid, IN ib_net16_t const block)
@@ -122,19 +83,19 @@ __osm_lftr_rcv_new_lftr(IN osm_lftr_rcv_t * const p_rcv,
 	osm_lftr_item_t *p_rec_item;
 	ib_api_status_t status = IB_SUCCESS;
 
-	OSM_LOG_ENTER(p_rcv->p_log,
-		      __osm_lftr_rcv_new_lftr);
+	OSM_LOG_ENTER(sa->p_log, __osm_lftr_rcv_new_lftr);
 
 	p_rec_item = malloc(sizeof(*p_rec_item));
 	if (p_rec_item == NULL) {
-		osm_log(p_rcv->p_log, OSM_LOG_ERROR,
+		osm_log(sa->p_log, OSM_LOG_ERROR,
 			"__osm_lftr_rcv_new_lftr: ERR 4402: "
 			"rec_item alloc failed\n");
 		status = IB_INSUFFICIENT_RESOURCES;
 		goto Exit;
 	}
 
-	if (osm_log_is_active(p_rcv->p_log, OSM_LOG_DEBUG))
-		osm_log(p_rcv->p_log, OSM_LOG_DEBUG,
+	if (osm_log_is_active(sa->p_log, OSM_LOG_DEBUG))
+		osm_log(sa->p_log, OSM_LOG_DEBUG,
 			"__osm_lftr_rcv_new_lftr: "
 			"New LinearForwardingTable: sw 0x%016" PRIx64
 			"\n\t\t\t\tblock 0x%02X lid 0x%02X\n",
@@ -154,28 +115,28 @@ __osm_lftr_rcv_new_lftr(IN osm_lftr_rcv_t * const p_rcv,
 	cl_qlist_insert_tail(p_list, &p_rec_item->list_item);
 
 Exit:
-	OSM_LOG_EXIT(p_rcv->p_log);
+	OSM_LOG_EXIT(sa->p_log);
 	return (status);
 }
 
 /**********************************************************************
  **********************************************************************/
-static osm_port_t *__osm_lftr_get_port_by_guid(IN osm_lftr_rcv_t * const p_rcv,
+static osm_port_t *__osm_lftr_get_port_by_guid(IN osm_sa_t * sa,
 					       IN uint64_t port_guid)
 {
 	osm_port_t *p_port;
 
-	CL_PLOCK_ACQUIRE(p_rcv->p_lock);
+	CL_PLOCK_ACQUIRE(sa->p_lock);
 
-	p_port = osm_get_port_by_guid(p_rcv->p_subn, port_guid);
+	p_port = osm_get_port_by_guid(sa->p_subn, port_guid);
 	if (!p_port) {
-		osm_log(p_rcv->p_log, OSM_LOG_DEBUG,
+		osm_log(sa->p_log, OSM_LOG_DEBUG,
 			"__osm_lftr_get_port_by_guid ERR 4404: "
 			"Invalid port GUID 0x%016" PRIx64 "\n", port_guid);
 		p_port = NULL;
 	}
 
-	CL_PLOCK_RELEASE(p_rcv->p_lock);
+	CL_PLOCK_RELEASE(sa->p_lock);
 	return p_port;
 }
 
@@ -189,7 +150,7 @@ __osm_lftr_rcv_by_comp_mask(IN cl_map_item_t * const p_map_item,
 	    (osm_lftr_search_ctxt_t *) context;
 	const osm_switch_t *const p_sw = (osm_switch_t *) p_map_item;
 	const ib_lft_record_t *const p_rcvd_rec = p_ctxt->p_rcvd_rec;
-	osm_lftr_rcv_t *const p_rcv = p_ctxt->p_rcv;
+	osm_sa_t *sa = p_ctxt->sa;
 	ib_net64_t const comp_mask =
 	    p_ctxt->comp_mask;
 	const osm_physp_t *const p_req_physp = p_ctxt->p_req_physp;
 	osm_port_t *p_port;
@@ -199,10 +160,10 @@ __osm_lftr_rcv_by_comp_mask(IN cl_map_item_t * const p_map_item,
 
 	/* In switches, the port guid is the node guid. */
 	p_port =
-	    __osm_lftr_get_port_by_guid(p_rcv,
+	    __osm_lftr_get_port_by_guid(sa,
 					p_sw->p_node->node_info.port_guid);
 	if (!p_port) {
-		osm_log(p_rcv->p_log, OSM_LOG_ERROR,
+		osm_log(sa->p_log, OSM_LOG_ERROR,
 			"__osm_lftr_rcv_by_comp_mask: ERR 4405: "
 			"Failed to find Port by Node Guid:0x%016" PRIx64 "\n",
 			cl_ntoh64(p_sw->p_node->node_info.node_guid)
@@ -214,7 +175,7 @@ __osm_lftr_rcv_by_comp_mask(IN cl_map_item_t * const p_map_item,
 	   the same partition. */
 	p_physp = p_port->p_physp;
 	if (!p_physp) {
-		osm_log(p_rcv->p_log, OSM_LOG_ERROR,
+		osm_log(sa->p_log, OSM_LOG_ERROR,
 			"__osm_lftr_rcv_by_comp_mask: ERR 4406: "
 			"Failed to find default physical Port by Node Guid:0x%016"
 			PRIx64 "\n",
@@ -222,7 +183,7 @@ __osm_lftr_rcv_by_comp_mask(IN cl_map_item_t * const p_map_item,
 		    );
 		return;
 	}
-	if (!osm_physp_share_pkey(p_rcv->p_log, p_req_physp, p_physp))
+	if (!osm_physp_share_pkey(sa->p_log, p_req_physp, p_physp))
 		return;
 
 	/* get the port 0 of the switch */
@@ -230,7 +191,7 @@ __osm_lftr_rcv_by_comp_mask(IN cl_map_item_t * const p_map_item,
 
 	/* compare the lids - if required */
 	if (comp_mask & IB_LFTR_COMPMASK_LID) {
-		osm_log(p_rcv->p_log, OSM_LOG_DEBUG,
+		osm_log(sa->p_log, OSM_LOG_DEBUG,
 			"__osm_lftr_rcv_by_comp_mask: "
 			"Comparing lid:0x%02X to port lid range: 0x%02X .. 0x%02X\n",
 			cl_ntoh16(p_rcvd_rec->lid), min_lid_ho, max_lid_ho);
@@ -251,7 +212,7 @@ __osm_lftr_rcv_by_comp_mask(IN cl_map_item_t * const p_map_item,
 
 	/* so we can add these blocks one by one ...
 	 */
 	for (block = min_block; block <= max_block; block++)
-		__osm_lftr_rcv_new_lftr(p_rcv, p_sw, p_ctxt->p_list,
+		__osm_lftr_rcv_new_lftr(sa, p_sw, p_ctxt->p_list,
 					osm_port_get_base_lid(p_port),
 					cl_hton16(block));
 }
@@ -260,7 +221,7 @@ __osm_lftr_rcv_by_comp_mask(IN cl_map_item_t * const p_map_item,
  **********************************************************************/
 void osm_lftr_rcv_process(IN void *ctx, IN void *data)
 {
-	osm_lftr_rcv_t *p_rcv = ctx;
+	osm_sa_t *sa = ctx;
 	osm_madw_t *p_madw = data;
 	const ib_sa_mad_t *p_rcvd_mad;
 	const ib_lft_record_t *p_rcvd_rec;
@@ -278,9 +239,9 @@ void osm_lftr_rcv_process(IN void *ctx, IN void *data)
 	ib_api_status_t status = IB_SUCCESS;
 	osm_physp_t *p_req_physp;
 
-	CL_ASSERT(p_rcv);
+	CL_ASSERT(sa);
 
-	OSM_LOG_ENTER(p_rcv->p_log, osm_lftr_rcv_process);
+	OSM_LOG_ENTER(sa->p_log, osm_lftr_rcv_process);
 
 	CL_ASSERT(p_madw);
 
@@ -292,22 +253,22 @@ void osm_lftr_rcv_process(IN void *ctx, IN void *data)
 	/* we only support SubnAdmGet and SubnAdmGetTable methods */
 	if ((p_rcvd_mad->method != IB_MAD_METHOD_GET) &&
 	    (p_rcvd_mad->method != IB_MAD_METHOD_GETTABLE)) {
-		osm_log(p_rcv->p_log, OSM_LOG_ERROR,
+		osm_log(sa->p_log, OSM_LOG_ERROR,
 			"osm_lftr_rcv_process: ERR 4408: "
 			"Unsupported Method (%s)\n",
 			ib_get_sa_method_str(p_rcvd_mad->method));
-		osm_sa_send_error(p_rcv->p_resp, p_madw,
+		osm_sa_send_error(sa, p_madw,
 				  IB_MAD_STATUS_UNSUP_METHOD_ATTR);
 		goto Exit;
 	}
 
 	/* update the requester physical port.
 	 */
-	p_req_physp = osm_get_physp_by_mad_addr(p_rcv->p_log,
-						p_rcv->p_subn,
+	p_req_physp = osm_get_physp_by_mad_addr(sa->p_log,
+						sa->p_subn,
 						osm_madw_get_mad_addr_ptr
 						(p_madw));
 	if (p_req_physp == NULL) {
-		osm_log(p_rcv->p_log, OSM_LOG_ERROR,
+		osm_log(sa->p_log, OSM_LOG_ERROR,
 			"osm_lftr_rcv_process: ERR 4407: "
 			"Cannot find requester physical port\n");
 		goto Exit;
@@ -318,16 +279,16 @@ void osm_lftr_rcv_process(IN void *ctx, IN void *data)
 	context.p_rcvd_rec = p_rcvd_rec;
 	context.p_list = &rec_list;
 	context.comp_mask = p_rcvd_mad->comp_mask;
-	context.p_rcv = p_rcv;
+	context.sa = sa;
 	context.p_req_physp = p_req_physp;
 
-	cl_plock_acquire(p_rcv->p_lock);
+	cl_plock_acquire(sa->p_lock);
 
 	/* Go over all switches */
-	cl_qmap_apply_func(&p_rcv->p_subn->sw_guid_tbl,
+	cl_qmap_apply_func(&sa->p_subn->sw_guid_tbl,
 			   __osm_lftr_rcv_by_comp_mask, &context);
 
-	cl_plock_release(p_rcv->p_lock);
+	cl_plock_release(sa->p_lock);
 
 	num_rec = cl_qlist_count(&rec_list);
 
@@ -337,16 +298,16 @@ void osm_lftr_rcv_process(IN void *ctx, IN void *data)
 	 */
 	if (p_rcvd_mad->method == IB_MAD_METHOD_GET) {
 		if (num_rec == 0) {
-			osm_sa_send_error(p_rcv->p_resp, p_madw,
+			osm_sa_send_error(sa, p_madw,
 					  IB_SA_MAD_STATUS_NO_RECORDS);
 			goto Exit;
 		}
 		if (num_rec > 1) {
-			osm_log(p_rcv->p_log, OSM_LOG_ERROR,
+			osm_log(sa->p_log, OSM_LOG_ERROR,
 				"osm_lftr_rcv_process: ERR 4409: "
 				"Got more than one record for SubnAdmGet (%u)\n",
 				num_rec);
-			osm_sa_send_error(p_rcv->p_resp, p_madw,
+			osm_sa_send_error(sa, p_madw,
 					  IB_SA_MAD_STATUS_TOO_MANY_RECORDS);
 
 			/* need to set the mem free ...
 			 */
@@ -369,7 +330,7 @@ void osm_lftr_rcv_process(IN void *ctx, IN void *data)
 	trim_num_rec =
 	    (MAD_BLOCK_SIZE - IB_SA_MAD_HDR_SIZE) / sizeof(ib_lft_record_t);
 	if (trim_num_rec < num_rec) {
-		osm_log(p_rcv->p_log, OSM_LOG_VERBOSE,
+		osm_log(sa->p_log, OSM_LOG_VERBOSE,
 			"osm_lftr_rcv_process: "
 			"Number of records:%u trimmed to:%u to fit in one MAD\n",
 			num_rec, trim_num_rec);
@@ -377,11 +338,11 @@ void osm_lftr_rcv_process(IN void *ctx, IN void *data)
 	}
 #endif
 
-	osm_log(p_rcv->p_log, OSM_LOG_DEBUG,
+	osm_log(sa->p_log, OSM_LOG_DEBUG,
 		"osm_lftr_rcv_process: " "Returning %u records\n", num_rec);
 
 	if ((p_rcvd_mad->method != IB_MAD_METHOD_GETTABLE) && (num_rec == 0)) {
-		osm_sa_send_error(p_rcv->p_resp, p_madw,
+		osm_sa_send_error(sa, p_madw,
 				  IB_SA_MAD_STATUS_NO_RECORDS);
 		goto Exit;
 	}
@@ -389,13 +350,13 @@ void osm_lftr_rcv_process(IN void *ctx, IN void *data)
 	/*
 	 * Get a MAD to reply. Address of Mad is in the received mad_wrapper
 	 */
-	p_resp_madw = osm_mad_pool_get(p_rcv->p_mad_pool,
+	p_resp_madw = osm_mad_pool_get(sa->p_mad_pool,
 				       p_madw->h_bind,
 				       num_rec * sizeof(ib_lft_record_t) +
 				       IB_SA_MAD_HDR_SIZE, &p_madw->mad_addr);
 
 	if (!p_resp_madw) {
-		osm_log(p_rcv->p_log, OSM_LOG_ERROR,
+		osm_log(sa->p_log, OSM_LOG_ERROR,
 			"osm_lftr_rcv_process: ERR 4410: "
 			"osm_mad_pool_get failed\n");
 
@@ -405,7 +366,7 @@ void osm_lftr_rcv_process(IN void *ctx, IN void *data)
 			free(p_rec_item);
 		}
 
-		osm_sa_send_error(p_rcv->p_resp, p_madw,
+		osm_sa_send_error(sa, p_madw,
 				  IB_SA_MAD_STATUS_NO_RESOURCES);
 
 		goto Exit;
@@ -458,9 +419,9 @@ void osm_lftr_rcv_process(IN void *ctx, IN void *data)
 	CL_ASSERT(cl_is_qlist_empty(&rec_list));
 
 	status = osm_sa_vendor_send(p_resp_madw->h_bind, p_resp_madw, FALSE,
-				    p_rcv->p_subn);
+				    sa->p_subn);
 	if (status != IB_SUCCESS) {
-		osm_log(p_rcv->p_log, OSM_LOG_ERROR,
+		osm_log(sa->p_log, OSM_LOG_ERROR,
 			"osm_lftr_rcv_process: ERR 4411: "
 			"osm_sa_vendor_send status = %s\n",
 			ib_get_err_str(status));
@@ -468,5 +429,5 @@ void osm_lftr_rcv_process(IN void *ctx, IN void *data)
 	}
 Exit:
-	OSM_LOG_EXIT(p_rcv->p_log);
+	OSM_LOG_EXIT(sa->p_log);
 }
diff --git a/opensm/opensm/osm_sa_link_record.c b/opensm/opensm/osm_sa_link_record.c
index ba239be..1b833eb 100644
--- a/opensm/opensm/osm_sa_link_record.c
+++ b/opensm/opensm/osm_sa_link_record.c
@@ -53,10 +53,9 @@
 #include
 #include
 #include
-#include
+#include
 #include
 #include
-#include
 #include
 #include
 #include
@@ -68,46 +67,8 @@ typedef struct _osm_lr_item {
 
 /**********************************************************************
  **********************************************************************/
-void osm_lr_rcv_construct(IN osm_lr_rcv_t * const p_rcv)
-{
-	memset(p_rcv, 0, sizeof(*p_rcv));
-}
-
-/**********************************************************************
- **********************************************************************/
-void osm_lr_rcv_destroy(IN osm_lr_rcv_t * const p_rcv)
-{
-	OSM_LOG_ENTER(p_rcv->p_log, osm_lr_rcv_destroy);
-	OSM_LOG_EXIT(p_rcv->p_log);
-}
-
-/**********************************************************************
- **********************************************************************/
-ib_api_status_t
-osm_lr_rcv_init(IN osm_lr_rcv_t * const p_rcv,
-		IN osm_sa_resp_t * const p_resp,
-		IN osm_mad_pool_t * const p_mad_pool,
-		IN osm_subn_t * const p_subn,
-		IN osm_log_t * const p_log, IN cl_plock_t * const p_lock)
-{
-	OSM_LOG_ENTER(p_log, osm_lr_rcv_init);
-
-	osm_lr_rcv_construct(p_rcv);
-
-	p_rcv->p_log = p_log;
-	p_rcv->p_subn = p_subn;
-	p_rcv->p_lock = p_lock;
-	p_rcv->p_resp = p_resp;
-	p_rcv->p_mad_pool = p_mad_pool;
-
-	OSM_LOG_EXIT(p_rcv->p_log);
-	return IB_SUCCESS;
-}
-
-/**********************************************************************
- **********************************************************************/
 static void
-__osm_lr_rcv_build_physp_link(IN osm_lr_rcv_t * const p_rcv,
+__osm_lr_rcv_build_physp_link(IN osm_sa_t * sa,
 			      IN const ib_net16_t from_lid,
 			      IN const ib_net16_t to_lid,
 			      IN const uint8_t from_port,
@@ -117,7 +78,7 @@ __osm_lr_rcv_build_physp_link(IN osm_lr_rcv_t * const p_rcv,
 
 	p_lr_item = malloc(sizeof(*p_lr_item));
 	if (p_lr_item == NULL) {
-		osm_log(p_rcv->p_log, OSM_LOG_ERROR,
+		osm_log(sa->p_log, OSM_LOG_ERROR,
 			"__osm_lr_rcv_build_physp_link: ERR 1801: "
 			"Unable to acquire link record\n"
 			"\t\t\t\tFrom port 0x%u\n"
@@ -153,7 +114,7 @@ __get_base_lid(IN const osm_physp_t * p_physp, OUT ib_net16_t * p_base_lid)
 /**********************************************************************
  **********************************************************************/
 static void
-__osm_lr_rcv_get_physp_link(IN osm_lr_rcv_t * const p_rcv,
+__osm_lr_rcv_get_physp_link(IN osm_sa_t * sa,
 			    IN const ib_link_record_t * const p_lr,
 			    IN const osm_physp_t * p_src_physp,
 			    IN const osm_physp_t * p_dest_physp,
@@ -167,7 +128,7 @@ __osm_lr_rcv_get_physp_link(IN osm_lr_rcv_t * const p_rcv,
 	ib_net16_t to_base_lid;
 	ib_net16_t lmc_mask;
 
-	OSM_LOG_ENTER(p_rcv->p_log, __osm_lr_rcv_get_physp_link);
+	OSM_LOG_ENTER(sa->p_log, __osm_lr_rcv_get_physp_link);
 
 	/*
 	   If only one end of the link is specified, determine
@@ -215,20 +176,20 @@ __osm_lr_rcv_get_physp_link(IN osm_lr_rcv_t * const p_rcv,
 
 	/* Check that the p_src_physp, p_dest_physp and p_req_physp
 	   all share a pkey (doesn't have to be the same p_key).
*/ - if (!osm_physp_share_pkey(p_rcv->p_log, p_src_physp, p_dest_physp)) { - osm_log(p_rcv->p_log, OSM_LOG_DEBUG, + if (!osm_physp_share_pkey(sa->p_log, p_src_physp, p_dest_physp)) { + osm_log(sa->p_log, OSM_LOG_DEBUG, "__osm_lr_rcv_get_physp_link: " "Source and Dest PhysPorts do not share PKey\n"); goto Exit; } - if (!osm_physp_share_pkey(p_rcv->p_log, p_src_physp, p_req_physp)) { - osm_log(p_rcv->p_log, OSM_LOG_DEBUG, + if (!osm_physp_share_pkey(sa->p_log, p_src_physp, p_req_physp)) { + osm_log(sa->p_log, OSM_LOG_DEBUG, "__osm_lr_rcv_get_physp_link: " "Source and Requester PhysPorts do not share PKey\n"); goto Exit; } - if (!osm_physp_share_pkey(p_rcv->p_log, p_req_physp, p_dest_physp)) { - osm_log(p_rcv->p_log, OSM_LOG_DEBUG, + if (!osm_physp_share_pkey(sa->p_log, p_req_physp, p_dest_physp)) { + osm_log(sa->p_log, OSM_LOG_DEBUG, "__osm_lr_rcv_get_physp_link: " "Requester and Dest PhysPorts do not share PKey\n"); goto Exit; @@ -248,7 +209,7 @@ __osm_lr_rcv_get_physp_link(IN osm_lr_rcv_t * const p_rcv, __get_base_lid(p_src_physp, &from_base_lid); __get_base_lid(p_dest_physp, &to_base_lid); - lmc_mask = ~((1 << p_rcv->p_subn->opt.lmc) - 1); + lmc_mask = ~((1 << sa->p_subn->opt.lmc) - 1); lmc_mask = cl_hton16(lmc_mask); if (comp_mask & IB_LR_COMPMASK_FROM_LID) @@ -259,8 +220,8 @@ __osm_lr_rcv_get_physp_link(IN osm_lr_rcv_t * const p_rcv, if (to_base_lid != (p_lr->to_lid & lmc_mask)) goto Exit; - if (osm_log_is_active(p_rcv->p_log, OSM_LOG_DEBUG)) - osm_log(p_rcv->p_log, OSM_LOG_DEBUG, + if (osm_log_is_active(sa->p_log, OSM_LOG_DEBUG)) + osm_log(sa->p_log, OSM_LOG_DEBUG, "__osm_lr_rcv_get_physp_link: " "Acquiring link record\n" "\t\t\t\tsrc port 0x%" PRIx64 " (port 0x%X)" @@ -271,18 +232,17 @@ __osm_lr_rcv_get_physp_link(IN osm_lr_rcv_t * const p_rcv, dest_port_num); - __osm_lr_rcv_build_physp_link(p_rcv, from_base_lid, - to_base_lid, src_port_num, - dest_port_num, p_list); + __osm_lr_rcv_build_physp_link(sa, from_base_lid, to_base_lid, + src_port_num, dest_port_num, 
p_list); Exit: - OSM_LOG_EXIT(p_rcv->p_log); + OSM_LOG_EXIT(sa->p_log); } /********************************************************************** **********************************************************************/ static void -__osm_lr_rcv_get_port_links(IN osm_lr_rcv_t * const p_rcv, +__osm_lr_rcv_get_port_links(IN osm_sa_t * sa, IN const ib_link_record_t * const p_lr, IN const osm_port_t * p_src_port, IN const osm_port_t * p_dest_port, @@ -299,7 +259,7 @@ __osm_lr_rcv_get_port_links(IN osm_lr_rcv_t * const p_rcv, uint8_t dest_num_ports; uint8_t dest_port_num; - OSM_LOG_ENTER(p_rcv->p_log, __osm_lr_rcv_get_port_links); + OSM_LOG_ENTER(sa->p_log, __osm_lr_rcv_get_port_links); if (p_src_port) { if (p_dest_port) { @@ -327,7 +287,7 @@ __osm_lr_rcv_get_port_links(IN osm_lr_rcv_t * const p_rcv, if (osm_physp_is_valid(p_src_physp) && osm_physp_is_valid(p_dest_physp)) __osm_lr_rcv_get_physp_link - (p_rcv, p_lr, p_src_physp, + (sa, p_lr, p_src_physp, p_dest_physp, comp_mask, p_list, p_req_physp); } @@ -348,7 +308,7 @@ __osm_lr_rcv_get_port_links(IN osm_lr_rcv_t * const p_rcv, port_num); if (osm_physp_is_valid(p_src_physp)) __osm_lr_rcv_get_physp_link - (p_rcv, p_lr, p_src_physp, + (sa, p_lr, p_src_physp, NULL, comp_mask, p_list, p_req_physp); } @@ -363,7 +323,7 @@ __osm_lr_rcv_get_port_links(IN osm_lr_rcv_t * const p_rcv, port_num); if (osm_physp_is_valid(p_src_physp)) __osm_lr_rcv_get_physp_link - (p_rcv, p_lr, p_src_physp, + (sa, p_lr, p_src_physp, NULL, comp_mask, p_list, p_req_physp); } @@ -386,7 +346,7 @@ __osm_lr_rcv_get_port_links(IN osm_lr_rcv_t * const p_rcv, port_num); if (osm_physp_is_valid(p_dest_physp)) __osm_lr_rcv_get_physp_link - (p_rcv, p_lr, NULL, + (sa, p_lr, NULL, p_dest_physp, comp_mask, p_list, p_req_physp); } @@ -401,7 +361,7 @@ __osm_lr_rcv_get_port_links(IN osm_lr_rcv_t * const p_rcv, port_num); if (osm_physp_is_valid(p_dest_physp)) __osm_lr_rcv_get_physp_link - (p_rcv, p_lr, NULL, + (sa, p_lr, NULL, p_dest_physp, comp_mask, p_list, 
p_req_physp); } @@ -410,7 +370,7 @@ __osm_lr_rcv_get_port_links(IN osm_lr_rcv_t * const p_rcv, /* Process the world (recurse once back into this function). */ - p_node_tbl = &p_rcv->p_subn->node_guid_tbl; + p_node_tbl = &sa->p_subn->node_guid_tbl; p_node = (osm_node_t *)cl_qmap_head(p_node_tbl); while (p_node != (osm_node_t *)cl_qmap_end(p_node_tbl)) { @@ -422,9 +382,9 @@ __osm_lr_rcv_get_port_links(IN osm_lr_rcv_t * const p_rcv, p_src_physp = osm_node_get_any_physp_ptr(p_node); if (osm_physp_is_valid(p_src_physp)) { p_src_port = (osm_port_t *) - cl_qmap_get(&p_rcv->p_subn->port_guid_tbl, + cl_qmap_get(&sa->p_subn->port_guid_tbl, osm_physp_get_port_guid(p_src_physp)); - __osm_lr_rcv_get_port_links(p_rcv, p_lr, + __osm_lr_rcv_get_port_links(sa, p_lr, p_src_port, NULL, comp_mask, p_list, p_req_physp); @@ -435,14 +395,14 @@ __osm_lr_rcv_get_port_links(IN osm_lr_rcv_t * const p_rcv, } } - OSM_LOG_EXIT(p_rcv->p_log); + OSM_LOG_EXIT(sa->p_log); } /********************************************************************** Returns the SA status to return to the client. 
**********************************************************************/ static ib_net16_t -__osm_lr_rcv_get_end_points(IN osm_lr_rcv_t * const p_rcv, +__osm_lr_rcv_get_end_points(IN osm_sa_t * sa, IN const osm_madw_t * const p_madw, OUT const osm_port_t ** const pp_src_port, OUT const osm_port_t ** const pp_dest_port) @@ -453,7 +413,7 @@ __osm_lr_rcv_get_end_points(IN osm_lr_rcv_t * const p_rcv, ib_api_status_t status; ib_net16_t sa_status = IB_SA_MAD_STATUS_SUCCESS; - OSM_LOG_ENTER(p_rcv->p_log, __osm_lr_rcv_get_end_points); + OSM_LOG_ENTER(sa->p_log, __osm_lr_rcv_get_end_points); /* Determine what fields are valid and then get a pointer @@ -467,7 +427,7 @@ __osm_lr_rcv_get_end_points(IN osm_lr_rcv_t * const p_rcv, *pp_dest_port = NULL; if (p_sa_mad->comp_mask & IB_LR_COMPMASK_FROM_LID) { - status = osm_get_port_by_base_lid(p_rcv->p_subn, + status = osm_get_port_by_base_lid(sa->p_subn, p_lr->from_lid, pp_src_port); if ((status != IB_SUCCESS) || (*pp_src_port == NULL)) { @@ -476,7 +436,7 @@ __osm_lr_rcv_get_end_points(IN osm_lr_rcv_t * const p_rcv, don't enter it as an error in our own log. Return an error response to the client. */ - osm_log(p_rcv->p_log, OSM_LOG_VERBOSE, + osm_log(sa->p_log, OSM_LOG_VERBOSE, "__osm_lr_rcv_get_end_points: " "No source port with LID = 0x%X\n", cl_ntoh16(p_lr->from_lid)); @@ -487,7 +447,7 @@ __osm_lr_rcv_get_end_points(IN osm_lr_rcv_t * const p_rcv, } if (p_sa_mad->comp_mask & IB_LR_COMPMASK_TO_LID) { - status = osm_get_port_by_base_lid(p_rcv->p_subn, + status = osm_get_port_by_base_lid(sa->p_subn, p_lr->to_lid, pp_dest_port); if ((status != IB_SUCCESS) || (*pp_dest_port == NULL)) { @@ -496,7 +456,7 @@ __osm_lr_rcv_get_end_points(IN osm_lr_rcv_t * const p_rcv, don't enter it as an error in our own log. Return an error response to the client. 
*/ - osm_log(p_rcv->p_log, OSM_LOG_VERBOSE, + osm_log(sa->p_log, OSM_LOG_VERBOSE, "__osm_lr_rcv_get_end_points: " "No dest port with LID = 0x%X\n", cl_ntoh16(p_lr->to_lid)); @@ -507,14 +467,14 @@ __osm_lr_rcv_get_end_points(IN osm_lr_rcv_t * const p_rcv, } Exit: - OSM_LOG_EXIT(p_rcv->p_log); + OSM_LOG_EXIT(sa->p_log); return (sa_status); } /********************************************************************** **********************************************************************/ static void -__osm_lr_rcv_respond(IN osm_lr_rcv_t * const p_rcv, +__osm_lr_rcv_respond(IN osm_sa_t * sa, IN const osm_madw_t * const p_madw, IN cl_qlist_t * const p_list) { @@ -530,7 +490,7 @@ __osm_lr_rcv_respond(IN osm_lr_rcv_t * const p_rcv, osm_lr_item_t *p_lr_item; const ib_sa_mad_t *p_rcvd_mad = osm_madw_get_sa_mad_ptr(p_madw); - OSM_LOG_ENTER(p_rcv->p_log, __osm_lr_rcv_respond); + OSM_LOG_ENTER(sa->p_log, __osm_lr_rcv_respond); num_rec = cl_qlist_count(p_list); /* @@ -538,11 +498,11 @@ __osm_lr_rcv_respond(IN osm_lr_rcv_t * const p_rcv, * If we do a SubnAdmGet and got more than one record it is an error ! */ if ((p_rcvd_mad->method == IB_MAD_METHOD_GET) && (num_rec > 1)) { - osm_log(p_rcv->p_log, OSM_LOG_ERROR, + osm_log(sa->p_log, OSM_LOG_ERROR, "__osm_lr_rcv_respond: ERR 1806: " "Got more than one record for SubnAdmGet (%zu)\n", num_rec); - osm_sa_send_error(p_rcv->p_resp, p_madw, + osm_sa_send_error(sa, p_madw, IB_SA_MAD_STATUS_TOO_MANY_RECORDS); /* need to set the mem free ... 
*/ @@ -559,7 +519,7 @@ __osm_lr_rcv_respond(IN osm_lr_rcv_t * const p_rcv, trim_num_rec = (MAD_BLOCK_SIZE - IB_SA_MAD_HDR_SIZE) / sizeof(ib_link_record_t); if (trim_num_rec < num_rec) { - osm_log(p_rcv->p_log, OSM_LOG_VERBOSE, + osm_log(sa->p_log, OSM_LOG_VERBOSE, "__osm_lr_rcv_respond: " "Number of records:%u trimmed to:%u to fit in one MAD\n", num_rec, trim_num_rec); @@ -567,8 +527,8 @@ __osm_lr_rcv_respond(IN osm_lr_rcv_t * const p_rcv, } #endif - if (osm_log_is_active(p_rcv->p_log, OSM_LOG_DEBUG)) { - osm_log(p_rcv->p_log, OSM_LOG_DEBUG, + if (osm_log_is_active(sa->p_log, OSM_LOG_DEBUG)) { + osm_log(sa->p_log, OSM_LOG_DEBUG, "__osm_lr_rcv_respond: " "Generating response with %zu records", num_rec); } @@ -576,12 +536,12 @@ __osm_lr_rcv_respond(IN osm_lr_rcv_t * const p_rcv, /* Get a MAD to reply. Address of Mad is in the received mad_wrapper */ - p_resp_madw = osm_mad_pool_get(p_rcv->p_mad_pool, + p_resp_madw = osm_mad_pool_get(sa->p_mad_pool, p_madw->h_bind, num_rec * sizeof(ib_link_record_t) + IB_SA_MAD_HDR_SIZE, &p_madw->mad_addr); if (!p_resp_madw) { - osm_log(p_rcv->p_log, OSM_LOG_ERROR, + osm_log(sa->p_log, OSM_LOG_ERROR, "__osm_lr_rcv_respond: ERR 1802: " "Unable to allocate MAD\n"); /* Release the quick pool items */ @@ -648,24 +608,24 @@ __osm_lr_rcv_respond(IN osm_lr_rcv_t * const p_rcv, } status = osm_sa_vendor_send(p_resp_madw->h_bind, p_resp_madw, FALSE, - p_rcv->p_subn); + sa->p_subn); if (status != IB_SUCCESS) { - osm_log(p_rcv->p_log, OSM_LOG_ERROR, + osm_log(sa->p_log, OSM_LOG_ERROR, "__osm_lr_rcv_respond: ERR 1803: " "Unable to send MAD (%s)\n", ib_get_err_str(status)); - /* osm_mad_pool_put( p_rcv->p_mad_pool, p_resp_madw ); */ + /* osm_mad_pool_put( sa->p_mad_pool, p_resp_madw ); */ goto Exit; } Exit: - OSM_LOG_EXIT(p_rcv->p_log); + OSM_LOG_EXIT(sa->p_log); } /********************************************************************** **********************************************************************/ void osm_lr_rcv_process(IN void *context, 
IN void *data) { - osm_lr_rcv_t *p_rcv = context; + osm_sa_t *sa = context; osm_madw_t *p_madw = data; const ib_link_record_t *p_lr; const ib_sa_mad_t *p_sa_mad; @@ -675,7 +635,7 @@ void osm_lr_rcv_process(IN void *context, IN void *data) ib_net16_t sa_status; osm_physp_t *p_req_physp; - OSM_LOG_ENTER(p_rcv->p_log, osm_lr_rcv_process); + OSM_LOG_ENTER(sa->p_log, osm_lr_rcv_process); CL_ASSERT(p_madw); @@ -687,29 +647,29 @@ void osm_lr_rcv_process(IN void *context, IN void *data) /* we only support SubnAdmGet and SubnAdmGetTable methods */ if ((p_sa_mad->method != IB_MAD_METHOD_GET) && (p_sa_mad->method != IB_MAD_METHOD_GETTABLE)) { - osm_log(p_rcv->p_log, OSM_LOG_ERROR, + osm_log(sa->p_log, OSM_LOG_ERROR, "osm_lr_rcv_process: ERR 1804: " "Unsupported Method (%s)\n", ib_get_sa_method_str(p_sa_mad->method)); - osm_sa_send_error(p_rcv->p_resp, p_madw, + osm_sa_send_error(sa, p_madw, IB_MAD_STATUS_UNSUP_METHOD_ATTR); goto Exit; } /* update the requester physical port. */ - p_req_physp = osm_get_physp_by_mad_addr(p_rcv->p_log, - p_rcv->p_subn, + p_req_physp = osm_get_physp_by_mad_addr(sa->p_log, + sa->p_subn, osm_madw_get_mad_addr_ptr (p_madw)); if (p_req_physp == NULL) { - osm_log(p_rcv->p_log, OSM_LOG_ERROR, + osm_log(sa->p_log, OSM_LOG_ERROR, "osm_lr_rcv_process: ERR 1805: " "Cannot find requester physical port\n"); goto Exit; } - if (osm_log_is_active(p_rcv->p_log, OSM_LOG_DEBUG)) - osm_dump_link_record(p_rcv->p_log, p_lr, OSM_LOG_DEBUG); + if (osm_log_is_active(sa->p_log, OSM_LOG_DEBUG)) + osm_dump_link_record(sa->p_log, p_lr, OSM_LOG_DEBUG); cl_qlist_init(&lr_list); @@ -717,28 +677,28 @@ void osm_lr_rcv_process(IN void *context, IN void *data) Most SA functions (including this one) are read-only on the subnet object, so we grab the lock non-exclusively. 
*/ - cl_plock_acquire(p_rcv->p_lock); + cl_plock_acquire(sa->p_lock); - sa_status = __osm_lr_rcv_get_end_points(p_rcv, p_madw, + sa_status = __osm_lr_rcv_get_end_points(sa, p_madw, &p_src_port, &p_dest_port); if (sa_status == IB_SA_MAD_STATUS_SUCCESS) - __osm_lr_rcv_get_port_links(p_rcv, p_lr, p_src_port, + __osm_lr_rcv_get_port_links(sa, p_lr, p_src_port, p_dest_port, p_sa_mad->comp_mask, &lr_list, p_req_physp); - cl_plock_release(p_rcv->p_lock); + cl_plock_release(sa->p_lock); if ((cl_qlist_count(&lr_list) == 0) && (p_sa_mad->method == IB_MAD_METHOD_GET)) { - osm_sa_send_error(p_rcv->p_resp, p_madw, + osm_sa_send_error(sa, p_madw, IB_SA_MAD_STATUS_NO_RECORDS); goto Exit; } - __osm_lr_rcv_respond(p_rcv, p_madw, &lr_list); + __osm_lr_rcv_respond(sa, p_madw, &lr_list); Exit: - OSM_LOG_EXIT(p_rcv->p_log); + OSM_LOG_EXIT(sa->p_log); } diff --git a/opensm/opensm/osm_sa_mad_ctrl.c b/opensm/opensm/osm_sa_mad_ctrl.c index b54193c..9a9b4c2 100644 --- a/opensm/opensm/osm_sa_mad_ctrl.c +++ b/opensm/opensm/osm_sa_mad_ctrl.c @@ -51,11 +51,11 @@ #include #include #include -#include #include +#include #include #include -#include +#include /****f* opensm: SA/__osm_sa_mad_ctrl_disp_done_callback * NAME @@ -506,7 +506,7 @@ void osm_sa_mad_ctrl_destroy(IN osm_sa_mad_ctrl_t * const p_ctrl) **********************************************************************/ ib_api_status_t osm_sa_mad_ctrl_init(IN osm_sa_mad_ctrl_t * const p_ctrl, - IN osm_sa_resp_t * const p_resp, + IN osm_sa_t * sa, IN osm_mad_pool_t * const p_mad_pool, IN osm_vendor_t * const p_vendor, IN osm_subn_t * const p_subn, @@ -520,13 +520,13 @@ osm_sa_mad_ctrl_init(IN osm_sa_mad_ctrl_t * const p_ctrl, osm_sa_mad_ctrl_construct(p_ctrl); + p_ctrl->sa = sa; p_ctrl->p_log = p_log; p_ctrl->p_disp = p_disp; p_ctrl->p_mad_pool = p_mad_pool; p_ctrl->p_vendor = p_vendor; p_ctrl->p_stats = p_stats; p_ctrl->p_subn = p_subn; - p_ctrl->p_resp = p_resp; p_ctrl->h_disp = cl_disp_register(p_disp, CL_DISP_MSGID_NONE, NULL, p_ctrl); diff 
--git a/opensm/opensm/osm_sa_mcmember_record.c b/opensm/opensm/osm_sa_mcmember_record.c index ddb1ca5..8eb97ad 100644 --- a/opensm/opensm/osm_sa_mcmember_record.c +++ b/opensm/opensm/osm_sa_mcmember_record.c @@ -56,20 +56,30 @@ #include #include #include -#include -#include +#include #include #include #include #include -#include -#include #include #include #include #include #include +#define JOIN_MC_COMP_MASK (IB_MCR_COMPMASK_MGID | \ + IB_MCR_COMPMASK_PORT_GID | \ + IB_MCR_COMPMASK_JOIN_STATE) + +#define REQUIRED_MC_CREATE_COMP_MASK (IB_MCR_COMPMASK_MGID | \ + IB_MCR_COMPMASK_PORT_GID | \ + IB_MCR_COMPMASK_JOIN_STATE | \ + IB_MCR_COMPMASK_QKEY | \ + IB_MCR_COMPMASK_TCLASS | \ + IB_MCR_COMPMASK_PKEY | \ + IB_MCR_COMPMASK_FLOW | \ + IB_MCR_COMPMASK_SL) + typedef struct _osm_mcmr_item { cl_list_item_t list_item; ib_member_rec_t rec; @@ -78,7 +88,7 @@ typedef struct _osm_mcmr_item { typedef struct osm_sa_mcmr_search_ctxt { const ib_member_rec_t *p_mcmember_rec; osm_mgrp_t *p_mgrp; - osm_mcmr_recv_t *p_rcv; + osm_sa_t *sa; cl_qlist_t *p_list; /* hold results */ ib_net64_t comp_mask; const osm_physp_t *p_req_physp; @@ -86,49 +96,6 @@ typedef struct osm_sa_mcmr_search_ctxt { } osm_sa_mcmr_search_ctxt_t; /********************************************************************** - **********************************************************************/ -void osm_mcmr_rcv_construct(IN osm_mcmr_recv_t * const p_rcv) -{ - memset(p_rcv, 0, sizeof(*p_rcv)); -} - -/********************************************************************** - **********************************************************************/ -void osm_mcmr_rcv_destroy(IN osm_mcmr_recv_t * const p_rcv) -{ - CL_ASSERT(p_rcv); - - OSM_LOG_ENTER(p_rcv->p_log, osm_mcmr_rcv_destroy); - OSM_LOG_EXIT(p_rcv->p_log); -} - -/********************************************************************** - **********************************************************************/ -ib_api_status_t -osm_mcmr_rcv_init(IN osm_sm_t * const p_sm, 
- IN osm_mcmr_recv_t * const p_rcv, - IN osm_sa_resp_t * const p_resp, - IN osm_mad_pool_t * const p_mad_pool, - IN osm_subn_t * const p_subn, - IN osm_log_t * const p_log, IN cl_plock_t * const p_lock) -{ - OSM_LOG_ENTER(p_log, osm_mcmr_rcv_init); - - osm_mcmr_rcv_construct(p_rcv); - - p_rcv->p_log = p_log; - p_rcv->p_subn = p_subn; - p_rcv->p_sm = p_sm; - p_rcv->p_lock = p_lock; - p_rcv->p_resp = p_resp; - p_rcv->p_mad_pool = p_mad_pool; - p_rcv->mlid_ho = 0xC000; - - OSM_LOG_EXIT(p_rcv->p_log); - return IB_SUCCESS; -} - -/********************************************************************** A search function that compares the given mgrp with the search context if there is a match by mgid the p_mgrp is copied to the search context p_mgrp component @@ -145,10 +112,10 @@ __search_mgrp_by_mgid(IN cl_map_item_t * const p_map_item, IN void *context) osm_sa_mcmr_search_ctxt_t *p_ctxt = (osm_sa_mcmr_search_ctxt_t *) context; const ib_member_rec_t *p_recvd_mcmember_rec; - osm_mcmr_recv_t *p_rcv; + osm_sa_t *sa; p_recvd_mcmember_rec = p_ctxt->p_mcmember_rec; - p_rcv = p_ctxt->p_rcv; + sa = p_ctxt->sa; /* ignore groups marked for deletion */ if (p_mgrp->to_be_deleted) @@ -161,7 +128,7 @@ __search_mgrp_by_mgid(IN cl_map_item_t * const p_map_item, IN void *context) return; if (p_ctxt->p_mgrp) { - osm_log(p_rcv->p_log, OSM_LOG_ERROR, + osm_log(sa->p_log, OSM_LOG_ERROR, "__search_mgrp_by_mgid: ERR 1B03: " "Multiple MC groups for same MGID\n"); return; @@ -174,13 +141,13 @@ __search_mgrp_by_mgid(IN cl_map_item_t * const p_map_item, IN void *context) /********************************************************************** Look for a MGRP in the mgrp_mlid_tbl by mlid **********************************************************************/ -static osm_mgrp_t *__get_mgrp_by_mlid(IN osm_mcmr_recv_t * const p_rcv, +static osm_mgrp_t *__get_mgrp_by_mlid(IN osm_sa_t * sa, IN ib_net16_t const mlid) { cl_map_item_t *map_item; - map_item = cl_qmap_get(&p_rcv->p_subn->mgrp_mlid_tbl, mlid); 
- if (map_item == cl_qmap_end(&p_rcv->p_subn->mgrp_mlid_tbl)) { + map_item = cl_qmap_get(&sa->p_subn->mgrp_mlid_tbl, mlid); + if (map_item == cl_qmap_end(&sa->p_subn->mgrp_mlid_tbl)) { return NULL; } return (osm_mgrp_t *) map_item; @@ -191,17 +158,17 @@ static osm_mgrp_t *__get_mgrp_by_mlid(IN osm_mcmr_recv_t * const p_rcv, Look for a MGRP in the mgrp_mlid_tbl by mgid ***********************************************************************/ static ib_api_status_t -__get_mgrp_by_mgid(IN osm_mcmr_recv_t * const p_rcv, +__get_mgrp_by_mgid(IN osm_sa_t * sa, IN ib_member_rec_t * p_recvd_mcmember_rec, OUT osm_mgrp_t ** pp_mgrp) { osm_sa_mcmr_search_ctxt_t mcmr_search_context; mcmr_search_context.p_mcmember_rec = p_recvd_mcmember_rec; - mcmr_search_context.p_rcv = p_rcv; + mcmr_search_context.sa = sa; mcmr_search_context.p_mgrp = NULL; - cl_qmap_apply_func(&p_rcv->p_subn->mgrp_mlid_tbl, + cl_qmap_apply_func(&sa->p_subn->mgrp_mlid_tbl, __search_mgrp_by_mgid, &mcmr_search_context); if (mcmr_search_context.p_mgrp == NULL) { @@ -236,9 +203,9 @@ Return an mlid to the pool of free mlids. But this implementation is not a pool - it simply scans through the MGRP database for unused mlids... *********************************************************************/ -static void __free_mlid(IN osm_mcmr_recv_t * const p_rcv, IN uint16_t mlid) +static void __free_mlid(IN osm_sa_t * sa, IN uint16_t mlid) { - UNUSED_PARAM(p_rcv); + UNUSED_PARAM(sa); UNUSED_PARAM(mlid); } @@ -248,16 +215,16 @@ TODO: Implement a more scalable - O(1) solution based on pool of available mlids. 
**********************************************************************/ static ib_net16_t -__get_new_mlid(IN osm_mcmr_recv_t * const p_rcv, IN ib_net16_t requested_mlid) +__get_new_mlid(IN osm_sa_t * sa, IN ib_net16_t requested_mlid) { - osm_subn_t *p_subn = p_rcv->p_subn; + osm_subn_t *p_subn = sa->p_subn; osm_mgrp_t *p_mgrp; uint8_t *used_mlids_array; uint16_t idx; uint16_t mlid; /* the result */ uint16_t max_num_mlids; - OSM_LOG_ENTER(p_rcv->p_log, __get_new_mlid); + OSM_LOG_ENTER(sa->p_log, __get_new_mlid); if (requested_mlid && cl_ntoh16(requested_mlid) >= IB_LID_MCAST_START_HO && cl_ntoh16(requested_mlid) < p_subn->max_multicast_lid_ho @@ -272,7 +239,7 @@ __get_new_mlid(IN osm_mcmr_recv_t * const p_rcv, IN ib_net16_t requested_mlid) p_mgrp = (osm_mgrp_t *) cl_qmap_head(&p_subn->mgrp_mlid_tbl); if (p_mgrp == (osm_mgrp_t *) cl_qmap_end(&p_subn->mgrp_mlid_tbl)) { mlid = IB_LID_MCAST_START_HO; - osm_log(p_rcv->p_log, OSM_LOG_VERBOSE, + osm_log(sa->p_log, OSM_LOG_VERBOSE, "__get_new_mlid: " "No multicast groups found using minimal mlid:0x%04X\n", mlid); @@ -280,7 +247,7 @@ __get_new_mlid(IN osm_mcmr_recv_t * const p_rcv, IN ib_net16_t requested_mlid) } max_num_mlids = - p_rcv->p_subn->max_multicast_lid_ho - IB_LID_MCAST_START_HO; + sa->p_subn->max_multicast_lid_ho - IB_LID_MCAST_START_HO; /* track all used mlids in the array (by mlid index) */ used_mlids_array = (uint8_t *) malloc(sizeof(uint8_t) * max_num_mlids); @@ -293,7 +260,7 @@ __get_new_mlid(IN osm_mcmr_recv_t * const p_rcv, IN ib_net16_t requested_mlid) while (p_mgrp != (osm_mgrp_t *) cl_qmap_end(&p_subn->mgrp_mlid_tbl)) { /* ignore mgrps marked for deletion */ if (p_mgrp->to_be_deleted == FALSE) { - osm_log(p_rcv->p_log, OSM_LOG_DEBUG, + osm_log(sa->p_log, OSM_LOG_DEBUG, "__get_new_mlid: " "Found mgrp with lid:0x%X MGID: 0x%016" PRIx64 " : " "0x%016" PRIx64 "\n", @@ -305,8 +272,8 @@ __get_new_mlid(IN osm_mcmr_recv_t * const p_rcv, IN ib_net16_t requested_mlid) /* Map in table */ if 
(cl_ntoh16(p_mgrp->mlid) > - p_rcv->p_subn->max_multicast_lid_ho) { - osm_log(p_rcv->p_log, OSM_LOG_ERROR, + sa->p_subn->max_multicast_lid_ho) { + osm_log(sa->p_log, OSM_LOG_ERROR, "__get_new_mlid: ERR 1B27: " "Found mgrp with mlid:0x%04X > max allowed mlid:0x%04X\n", cl_ntoh16(p_mgrp->mlid), @@ -326,11 +293,11 @@ __get_new_mlid(IN osm_mcmr_recv_t * const p_rcv, IN ib_net16_t requested_mlid) /* did it go above the maximal mlid allowed */ if (idx < max_num_mlids) { mlid = idx + IB_LID_MCAST_START_HO; - osm_log(p_rcv->p_log, OSM_LOG_VERBOSE, + osm_log(sa->p_log, OSM_LOG_VERBOSE, "__get_new_mlid: " "Found available mlid:0x%04X at idx:%u\n", mlid, idx); } else { - osm_log(p_rcv->p_log, OSM_LOG_ERROR, + osm_log(sa->p_log, OSM_LOG_ERROR, "__get_new_mlid: ERR 1B23: " "All available:%u mlids are taken\n", max_num_mlids); mlid = 0; @@ -339,7 +306,7 @@ __get_new_mlid(IN osm_mcmr_recv_t * const p_rcv, IN ib_net16_t requested_mlid) free(used_mlids_array); Exit: - OSM_LOG_EXIT(p_rcv->p_log); + OSM_LOG_EXIT(sa->p_log); return cl_hton16(mlid); } @@ -351,18 +318,18 @@ silently drop it. Since it was an intermediate group no need to re-route it. **********************************************************************/ static void -__cleanup_mgrp(IN osm_mcmr_recv_t * const p_rcv, IN ib_net16_t const mlid) +__cleanup_mgrp(IN osm_sa_t * sa, IN ib_net16_t const mlid) { osm_mgrp_t *p_mgrp; - p_mgrp = __get_mgrp_by_mlid(p_rcv, mlid); + p_mgrp = __get_mgrp_by_mlid(sa, mlid); if (p_mgrp) { /* Remove MGRP only if osm_mcm_port_t count is 0 and * Not a well known group */ if (cl_is_qmap_empty(&p_mgrp->mcm_port_tbl) && (p_mgrp->well_known == FALSE)) { - cl_qmap_remove_item(&p_rcv->p_subn->mgrp_mlid_tbl, + cl_qmap_remove_item(&sa->p_subn->mgrp_mlid_tbl, (cl_map_item_t *) p_mgrp); osm_mgrp_delete(p_mgrp); } @@ -374,7 +341,7 @@ Add a port to the group. Calculating its PROXY_JOIN by the Port and requester gids. 
**********************************************************************/ static ib_api_status_t -__add_new_mgrp_port(IN osm_mcmr_recv_t * p_rcv, +__add_new_mgrp_port(IN osm_sa_t * sa, IN osm_mgrp_t * p_mgrp, IN ib_member_rec_t * p_recvd_mcmember_rec, IN osm_mad_addr_t * p_mad_addr, @@ -386,11 +353,11 @@ __add_new_mgrp_port(IN osm_mcmr_recv_t * p_rcv, /* set the proxy_join if the requester gid is not identical to the joined gid */ - res = osm_get_gid_by_mad_addr(p_rcv->p_log, - p_rcv->p_subn, + res = osm_get_gid_by_mad_addr(sa->p_log, + sa->p_subn, p_mad_addr, &requester_gid); if (res != IB_SUCCESS) { - osm_log(p_rcv->p_log, OSM_LOG_ERROR, + osm_log(sa->p_log, OSM_LOG_ERROR, "__add_new_mgrp_port: ERR 1B29: " "Could not find GID for requester\n"); @@ -400,7 +367,7 @@ __add_new_mgrp_port(IN osm_mcmr_recv_t * p_rcv, if (!memcmp(&p_recvd_mcmember_rec->port_gid, &requester_gid, sizeof(ib_gid_t))) { proxy_join = FALSE; - osm_log(p_rcv->p_log, OSM_LOG_DEBUG, + osm_log(sa->p_log, OSM_LOG_DEBUG, "__add_new_mgrp_port: " "Create new port with proxy_join FALSE\n"); } else { @@ -408,7 +375,7 @@ __add_new_mgrp_port(IN osm_mcmr_recv_t * p_rcv, The check that the requester is in the same partition as the PortGID is done before - just need to update the proxy_join. 
*/ proxy_join = TRUE; - osm_log(p_rcv->p_log, OSM_LOG_DEBUG, + osm_log(sa->p_log, OSM_LOG_DEBUG, "__add_new_mgrp_port: " "Create new port with proxy_join TRUE\n"); } @@ -418,7 +385,7 @@ __add_new_mgrp_port(IN osm_mcmr_recv_t * p_rcv, p_recvd_mcmember_rec->scope_state, proxy_join); if (*pp_mcmr_port == NULL) { - osm_log(p_rcv->p_log, OSM_LOG_ERROR, + osm_log(sa->p_log, OSM_LOG_ERROR, "__add_new_mgrp_port: ERR 1B06: " "osm_mgrp_add_port failed\n"); @@ -450,7 +417,7 @@ __check_create_comp_mask(ib_net64_t comp_mask, Generate the response MAD **********************************************************************/ static void -__osm_mcmr_rcv_respond(IN const osm_mcmr_recv_t * const p_rcv, +__osm_mcmr_rcv_respond(IN osm_sa_t * sa, IN const osm_madw_t * const p_madw, IN ib_member_rec_t * p_mcmember_rec) { @@ -459,12 +426,12 @@ __osm_mcmr_rcv_respond(IN const osm_mcmr_recv_t * const p_rcv, ib_member_rec_t *p_resp_mcmember_rec; ib_api_status_t status; - OSM_LOG_ENTER(p_rcv->p_log, __osm_mcmr_rcv_respond); + OSM_LOG_ENTER(sa->p_log, __osm_mcmr_rcv_respond); /* * Get a MAD to reply. Address of Mad is in the received mad_wrapper */ - p_resp_madw = osm_mad_pool_get(p_rcv->p_mad_pool, + p_resp_madw = osm_mad_pool_get(sa->p_mad_pool, p_madw->h_bind, sizeof(ib_member_rec_t) + IB_SA_MAD_HDR_SIZE, @@ -506,17 +473,17 @@ __osm_mcmr_rcv_respond(IN const osm_mcmr_recv_t * const p_rcv, p_resp_mcmember_rec->pkt_life |= 2 << 6; /* exactly */ status = osm_sa_vendor_send(p_resp_madw->h_bind, p_resp_madw, FALSE, - p_rcv->p_subn); + sa->p_subn); if (status != IB_SUCCESS) { - osm_log(p_rcv->p_log, OSM_LOG_ERROR, + osm_log(sa->p_log, OSM_LOG_ERROR, "__osm_mcmr_rcv_respond: ERR 1B07: " "Unable to send MAD (%s) for TID <0x%" PRIx64 ">\n", ib_get_err_str(status), p_resp_sa_mad->trans_id); } Exit: - OSM_LOG_EXIT(p_rcv->p_log); + OSM_LOG_EXIT(sa->p_log); return; } @@ -666,7 +633,7 @@ __validate_port_caps(osm_log_t * const p_log, * part of the partition for that MCMemberRecord. 
**********************************************************************/ static boolean_t -__validate_modify(IN osm_mcmr_recv_t * const p_rcv, +__validate_modify(IN osm_sa_t * sa, IN osm_mgrp_t * p_mgrp, IN osm_mad_addr_t * p_mad_addr, IN ib_member_rec_t * p_recvd_mcmember_rec, @@ -683,7 +650,7 @@ __validate_modify(IN osm_mcmr_recv_t * const p_rcv, /* o15-0.2.1: If this is a new port being added - nothing to check */ if (!osm_mgrp_is_port_present(p_mgrp, portguid, pp_mcm_port)) { - osm_log(p_rcv->p_log, OSM_LOG_DEBUG, + osm_log(sa->p_log, OSM_LOG_DEBUG, "__validate_modify: " "This is a new port in the MC group\n"); return TRUE; @@ -694,12 +661,12 @@ __validate_modify(IN osm_mcmr_recv_t * const p_rcv, if ((*pp_mcm_port)->proxy_join == FALSE) { /* The proxy_join is not set. Modifying can by done only if the requester GID == PortGID */ - res = osm_get_gid_by_mad_addr(p_rcv->p_log, - p_rcv->p_subn, + res = osm_get_gid_by_mad_addr(sa->p_log, + sa->p_subn, p_mad_addr, &request_gid); if (res != IB_SUCCESS) { - osm_log(p_rcv->p_log, OSM_LOG_DEBUG, + osm_log(sa->p_log, OSM_LOG_DEBUG, "__validate_modify: " "Could not find port for requested address\n"); return FALSE; @@ -708,7 +675,7 @@ __validate_modify(IN osm_mcmr_recv_t * const p_rcv, if (memcmp (&((*pp_mcm_port)->port_gid), &request_gid, sizeof(ib_gid_t))) { - osm_log(p_rcv->p_log, OSM_LOG_DEBUG, + osm_log(sa->p_log, OSM_LOG_DEBUG, "__validate_modify: " "No ProxyJoin but different ports: stored:0x%016" PRIx64 " request:0x%016" PRIx64 "\n", @@ -722,16 +689,16 @@ __validate_modify(IN osm_mcmr_recv_t * const p_rcv, } else { /* The proxy_join is set. 
Modification allowed only if the requester is part of the partition for this MCMemberRecord */ - p_request_physp = osm_get_physp_by_mad_addr(p_rcv->p_log, - p_rcv->p_subn, + p_request_physp = osm_get_physp_by_mad_addr(sa->p_log, + sa->p_subn, p_mad_addr); if (p_request_physp == NULL) return FALSE; - if (!osm_physp_has_pkey(p_rcv->p_log, p_mgrp->mcmember_rec.pkey, + if (!osm_physp_has_pkey(sa->p_log, p_mgrp->mcmember_rec.pkey, p_request_physp)) { /* the request port is not part of the partition for this mgrp */ - osm_log(p_rcv->p_log, OSM_LOG_DEBUG, + osm_log(sa->p_log, OSM_LOG_DEBUG, "__validate_modify: " "ProxyJoin but port not in partition. stored:0x%016" PRIx64 " request:0x%016" PRIx64 "\n", @@ -769,7 +736,7 @@ __validate_modify(IN osm_mcmr_recv_t * const p_rcv, * by the stored MCMemberRecord:P_Key. */ static boolean_t -__validate_delete(IN osm_mcmr_recv_t * const p_rcv, +__validate_delete(IN osm_sa_t * sa, IN osm_mgrp_t * p_mgrp, IN osm_mad_addr_t * p_mad_addr, IN ib_member_rec_t * p_recvd_mcmember_rec, @@ -783,7 +750,7 @@ __validate_delete(IN osm_mcmr_recv_t * const p_rcv, /* 1 */ if (!osm_mgrp_is_port_present(p_mgrp, portguid, pp_mcm_port)) { - osm_log(p_rcv->p_log, OSM_LOG_DEBUG, + osm_log(sa->p_log, OSM_LOG_DEBUG, "__validate_delete: " "Failed to find the port in the MC group\n"); return FALSE; @@ -792,7 +759,7 @@ __validate_delete(IN osm_mcmr_recv_t * const p_rcv, /* 2 */ if (!(p_recvd_mcmember_rec->scope_state & 0x0F & (*pp_mcm_port)->scope_state)) { - osm_log(p_rcv->p_log, OSM_LOG_DEBUG, + osm_log(sa->p_log, OSM_LOG_DEBUG, "__validate_delete: " "Could not find any matching bits in the stored and requested JoinStates\n"); return FALSE; @@ -802,7 +769,7 @@ __validate_delete(IN osm_mcmr_recv_t * const p_rcv, if (((p_recvd_mcmember_rec->scope_state & 0x0F) | (0x0F & (*pp_mcm_port)->scope_state)) != (0x0F & (*pp_mcm_port)->scope_state)) { - osm_log(p_rcv->p_log, OSM_LOG_DEBUG, + osm_log(sa->p_log, OSM_LOG_DEBUG, "__validate_delete: " "Some bits in the request 
JoinState (0x%X) are not set in the stored port (0x%X)\n", (p_recvd_mcmember_rec->scope_state & 0x0F), @@ -813,9 +780,9 @@ __validate_delete(IN osm_mcmr_recv_t * const p_rcv, /* 4 */ /* Validate according the the proxy_join (o15-0.1.2) */ - if (__validate_modify(p_rcv, p_mgrp, p_mad_addr, p_recvd_mcmember_rec, + if (__validate_modify(sa, p_mgrp, p_mad_addr, p_recvd_mcmember_rec, pp_mcm_port) == FALSE) { - osm_log(p_rcv->p_log, OSM_LOG_DEBUG, + osm_log(sa->p_log, OSM_LOG_DEBUG, "__validate_delete: " "proxy_join validation failure\n"); return FALSE; @@ -866,17 +833,17 @@ __validate_delete(IN osm_mcmr_recv_t * const p_rcv, * only source for this signature with link-local scope) */ static ib_api_status_t -__validate_requested_mgid(IN osm_mcmr_recv_t * const p_rcv, +__validate_requested_mgid(IN osm_sa_t * sa, IN const ib_member_rec_t * p_mcm_rec) { uint16_t signature; boolean_t valid = TRUE; - OSM_LOG_ENTER(p_rcv->p_log, __validate_requested_mgid); + OSM_LOG_ENTER(sa->p_log, __validate_requested_mgid); /* 14-a: mcast GID must start with 0xFF */ if (p_mcm_rec->mgid.multicast.header[0] != 0xFF) { - osm_log(p_rcv->p_log, OSM_LOG_ERROR, + osm_log(sa->p_log, OSM_LOG_ERROR, "__validate_requested_mgid: ERR 1B01: " "Wrong MGID Prefix 0x%02X must be 0xFF\n", cl_ntoh16(p_mcm_rec->mgid.multicast.header[0]) @@ -889,7 +856,7 @@ __validate_requested_mgid(IN osm_mcmr_recv_t * const p_rcv, memcpy(&signature, &(p_mcm_rec->mgid.multicast.raw_group_id), sizeof(signature)); signature = cl_ntoh16(signature); - osm_log(p_rcv->p_log, OSM_LOG_DEBUG, + osm_log(sa->p_log, OSM_LOG_DEBUG, "__validate_requested_mgid: " "MGID Signed as 0x%04X\n", signature); @@ -913,7 +880,7 @@ __validate_requested_mgid(IN osm_mcmr_recv_t * const p_rcv, * */ if (signature == 0x401B || signature == 0x601B) { - osm_log(p_rcv->p_log, OSM_LOG_DEBUG, + osm_log(sa->p_log, OSM_LOG_DEBUG, "__validate_requested_mgid: " "Skipping MGID Validation for IPoIB Signed (0x%04X) MGIDs\n", signature); @@ -922,7 +889,7 @@ 
__validate_requested_mgid(IN osm_mcmr_recv_t * const p_rcv, /* 14-b: the 3 upper bits in the "flags" should be zero: */ if (p_mcm_rec->mgid.multicast.header[1] & 0xE0) { - osm_log(p_rcv->p_log, OSM_LOG_ERROR, + osm_log(sa->p_log, OSM_LOG_ERROR, "__validate_requested_mgid: ERR 1B28: " "MGID uses Reserved Flags: flags=0x%X\n", (p_mcm_rec->mgid.multicast.header[1] & 0xE0) >> 4); @@ -935,7 +902,7 @@ __validate_requested_mgid(IN osm_mcmr_recv_t * const p_rcv, if ((signature == 0xA01B) && ((p_mcm_rec->mgid.multicast.header[1] & 0x0F) == IB_MC_SCOPE_LINK_LOCAL)) { - osm_log(p_rcv->p_log, OSM_LOG_ERROR, + osm_log(sa->p_log, OSM_LOG_ERROR, "__validate_requested_mgid: ERR 1B24: " "MGID uses 0xA01B signature but with link-local scope\n"); valid = FALSE; @@ -950,7 +917,7 @@ __validate_requested_mgid(IN osm_mcmr_recv_t * const p_rcv, */ Exit: - OSM_LOG_EXIT(p_rcv->p_log); + OSM_LOG_EXIT(sa->p_log); return (valid); } @@ -959,7 +926,7 @@ __validate_requested_mgid(IN osm_mcmr_recv_t * const p_rcv, Also set the default MTU and Rate if not provided by the user. **********************************************************************/ static boolean_t -__mgrp_request_is_realizable(IN osm_mcmr_recv_t * const p_rcv, +__mgrp_request_is_realizable(IN osm_sa_t * sa, IN ib_net64_t comp_mask, IN ib_member_rec_t * p_mcm_rec, IN const osm_physp_t * const p_physp) @@ -968,9 +935,9 @@ __mgrp_request_is_realizable(IN osm_mcmr_recv_t * const p_rcv, uint8_t mtu_required, mtu, port_mtu; uint8_t rate_sel = 2; /* exactly */ uint8_t rate_required, rate, port_rate; - osm_log_t *p_log = p_rcv->p_log; + osm_log_t *p_log = sa->p_log; - OSM_LOG_ENTER(p_rcv->p_log, __mgrp_request_is_realizable); + OSM_LOG_ENTER(sa->p_log, __mgrp_request_is_realizable); /* * End of o15-0.2.3 specifies: @@ -987,7 +954,7 @@ __mgrp_request_is_realizable(IN osm_mcmr_recv_t * const p_rcv, if (!(comp_mask & IB_MCR_COMPMASK_MTU) || !(comp_mask & IB_MCR_COMPMASK_MTU_SEL) || (mtu_sel = (p_mcm_rec->mtu >> 6)) == 3) - mtu = port_mtu ? 
port_mtu : p_rcv->p_subn->min_ca_mtu; + mtu = port_mtu ? port_mtu : sa->p_subn->min_ca_mtu; else { mtu_required = (uint8_t) (p_mcm_rec->mtu & 0x3F); mtu = mtu_required; @@ -1003,8 +970,8 @@ __mgrp_request_is_realizable(IN osm_mcmr_recv_t * const p_rcv, /* we provide the largest MTU possible if we can */ if (port_mtu) mtu = port_mtu; - else if (mtu_required < p_rcv->p_subn->min_ca_mtu) - mtu = p_rcv->p_subn->min_ca_mtu; + else if (mtu_required < sa->p_subn->min_ca_mtu) + mtu = sa->p_subn->min_ca_mtu; else mtu++; break; @@ -1036,7 +1003,7 @@ __mgrp_request_is_realizable(IN osm_mcmr_recv_t * const p_rcv, if (!(comp_mask & IB_MCR_COMPMASK_RATE) || !(comp_mask & IB_MCR_COMPMASK_RATE_SEL) || (rate_sel = (p_mcm_rec->rate >> 6)) == 3) - rate = port_rate ? port_rate : p_rcv->p_subn->min_ca_rate; + rate = port_rate ? port_rate : sa->p_subn->min_ca_rate; else { rate_required = (uint8_t) (p_mcm_rec->rate & 0x3F); rate = rate_required; @@ -1052,8 +1019,8 @@ __mgrp_request_is_realizable(IN osm_mcmr_recv_t * const p_rcv, /* we provide the largest RATE possible if we can */ if (port_rate) rate = port_rate; - else if (rate_required < p_rcv->p_subn->min_ca_rate) - rate = p_rcv->p_subn->min_ca_rate; + else if (rate_required < sa->p_subn->min_ca_rate) + rate = sa->p_subn->min_ca_rate; else rate++; break; @@ -1080,35 +1047,15 @@ __mgrp_request_is_realizable(IN osm_mcmr_recv_t * const p_rcv, } p_mcm_rec->rate = (rate_sel << 6) | rate; - OSM_LOG_EXIT(p_rcv->p_log); + OSM_LOG_EXIT(sa->p_log); return TRUE; } /********************************************************************** - Call this function to find or create a new mgrp. 
-**********************************************************************/ -ib_api_status_t -osm_mcmr_rcv_find_or_create_new_mgrp(IN osm_mcmr_recv_t * const p_rcv, - IN ib_net64_t comp_mask, - IN ib_member_rec_t * - const p_recvd_mcmember_rec, - OUT osm_mgrp_t ** pp_mgrp) -{ - ib_api_status_t status; - - status = __get_mgrp_by_mgid(p_rcv, p_recvd_mcmember_rec, pp_mgrp); - if (status == IB_SUCCESS) - return status; - return osm_mcmr_rcv_create_new_mgrp(p_rcv, comp_mask, - p_recvd_mcmember_rec, NULL, - pp_mgrp); -} - -/********************************************************************** Call this function to create a new mgrp. **********************************************************************/ ib_api_status_t -osm_mcmr_rcv_create_new_mgrp(IN osm_mcmr_recv_t * const p_rcv, +osm_mcmr_rcv_create_new_mgrp(IN osm_sa_t * sa, IN ib_net64_t comp_mask, IN const ib_member_rec_t * const p_recvd_mcmember_rec, @@ -1123,7 +1070,7 @@ osm_mcmr_rcv_create_new_mgrp(IN osm_mcmr_recv_t * const p_rcv, ib_api_status_t status = IB_SUCCESS; ib_member_rec_t mcm_rec = *p_recvd_mcmember_rec; /* copy for modifications */ - OSM_LOG_ENTER(p_rcv->p_log, osm_mcmr_rcv_create_new_mgrp); + OSM_LOG_ENTER(sa->p_log, osm_mcmr_rcv_create_new_mgrp); /* but what if the given MGID was not 0 ? */ zero_mgid = 1; @@ -1138,16 +1085,16 @@ osm_mcmr_rcv_create_new_mgrp(IN osm_mcmr_recv_t * const p_rcv, we allocate a new mlid number before we might use it for MGID ... 
*/ - mlid = __get_new_mlid(p_rcv, mcm_rec.mlid); + mlid = __get_new_mlid(sa, mcm_rec.mlid); if (mlid == 0) { - osm_log(p_rcv->p_log, OSM_LOG_ERROR, + osm_log(sa->p_log, OSM_LOG_ERROR, "osm_mcmr_rcv_create_new_mgrp: ERR 1B19: " "__get_new_mlid failed\n"); status = IB_SA_MAD_STATUS_NO_RESOURCES; goto Exit; } - osm_log(p_rcv->p_log, OSM_LOG_DEBUG, + osm_log(sa->p_log, OSM_LOG_DEBUG, "osm_mcmr_rcv_create_new_mgrp: " "Obtained new mlid 0x%X\n", cl_ntoh16(mlid)); @@ -1172,12 +1119,12 @@ osm_mcmr_rcv_create_new_mgrp(IN osm_mcmr_recv_t * const p_rcv, /* HACK: use the SA port gid to make it globally unique */ memcpy((&p_mgid->raw[4]), - &p_rcv->p_subn->opt.subnet_prefix, sizeof(uint64_t)); + &sa->p_subn->opt.subnet_prefix, sizeof(uint64_t)); /* HACK: how do we get a unique number - use the mlid twice */ memcpy(&p_mgid->raw[10], &mlid, sizeof(uint16_t)); memcpy(&p_mgid->raw[12], &mlid, sizeof(uint16_t)); - osm_log(p_rcv->p_log, OSM_LOG_DEBUG, + osm_log(sa->p_log, OSM_LOG_DEBUG, "osm_mcmr_rcv_create_new_mgrp: " "Allocated new MGID:0x%016" PRIx64 " : " "0x%016" PRIx64 "\n", @@ -1185,24 +1132,24 @@ osm_mcmr_rcv_create_new_mgrp(IN osm_mcmr_recv_t * const p_rcv, cl_ntoh64(p_mgid->unicast.interface_id)); } else { /* a specific MGID was requested so validate the resulting MGID */ - valid = __validate_requested_mgid(p_rcv, &mcm_rec); + valid = __validate_requested_mgid(sa, &mcm_rec); if (!valid) { - osm_log(p_rcv->p_log, OSM_LOG_ERROR, + osm_log(sa->p_log, OSM_LOG_ERROR, "osm_mcmr_rcv_create_new_mgrp: ERR 1B22: " "Invalid requested MGID\n"); - __free_mlid(p_rcv, mlid); + __free_mlid(sa, mlid); status = IB_SA_MAD_STATUS_REQ_INVALID; goto Exit; } } /* check the requested parameters are realizable */ - if (__mgrp_request_is_realizable(p_rcv, comp_mask, &mcm_rec, p_physp) == + if (__mgrp_request_is_realizable(sa, comp_mask, &mcm_rec, p_physp) == FALSE) { - osm_log(p_rcv->p_log, OSM_LOG_ERROR, + osm_log(sa->p_log, OSM_LOG_ERROR, "osm_mcmr_rcv_create_new_mgrp: ERR 1B26: " "Requested MGRP 
parameters are not realizable\n"); - __free_mlid(p_rcv, mlid); + __free_mlid(sa, mlid); status = IB_SA_MAD_STATUS_REQ_INVALID; goto Exit; } @@ -1210,10 +1157,10 @@ osm_mcmr_rcv_create_new_mgrp(IN osm_mcmr_recv_t * const p_rcv, /* create a new MC Group */ *pp_mgrp = osm_mgrp_new(mlid); if (*pp_mgrp == NULL) { - osm_log(p_rcv->p_log, OSM_LOG_ERROR, + osm_log(sa->p_log, OSM_LOG_ERROR, "osm_mcmr_rcv_create_new_mgrp: ERR 1B08: " "osm_mgrp_new failed\n"); - __free_mlid(p_rcv, mlid); + __free_mlid(sa, mlid); status = IB_SA_MAD_STATUS_NO_RESOURCES; goto Exit; } @@ -1236,36 +1183,56 @@ osm_mcmr_rcv_create_new_mgrp(IN osm_mcmr_recv_t * const p_rcv, one whose deletion was delayed for an idle time we need to deallocate it first */ p_prev_mgrp = - (osm_mgrp_t *) cl_qmap_get(&p_rcv->p_subn->mgrp_mlid_tbl, mlid); + (osm_mgrp_t *) cl_qmap_get(&sa->p_subn->mgrp_mlid_tbl, mlid); if (p_prev_mgrp != - (osm_mgrp_t *) cl_qmap_end(&p_rcv->p_subn->mgrp_mlid_tbl)) { - osm_log(p_rcv->p_log, OSM_LOG_DEBUG, + (osm_mgrp_t *) cl_qmap_end(&sa->p_subn->mgrp_mlid_tbl)) { + osm_log(sa->p_log, OSM_LOG_DEBUG, "osm_mcmr_rcv_create_new_mgrp: " "Found previous group for mlid:0x%04x - Need to destroy it\n", cl_ntoh16(mlid)); - cl_qmap_remove_item(&p_rcv->p_subn->mgrp_mlid_tbl, + cl_qmap_remove_item(&sa->p_subn->mgrp_mlid_tbl, (cl_map_item_t *) p_prev_mgrp); osm_mgrp_delete(p_prev_mgrp); } - cl_qmap_insert(&p_rcv->p_subn->mgrp_mlid_tbl, + cl_qmap_insert(&sa->p_subn->mgrp_mlid_tbl, mlid, &(*pp_mgrp)->map_item); /* Send a Report to any InformInfo registerd for Trap 66: MCGroup create */ - osm_mgrp_send_create_notice(p_rcv->p_subn, p_rcv->p_log, *pp_mgrp); + osm_mgrp_send_create_notice(sa->p_subn, sa->p_log, *pp_mgrp); Exit: - OSM_LOG_EXIT(p_rcv->p_log); + OSM_LOG_EXIT(sa->p_log); return status; } +/********************************************************************** + Call this function to find or create a new mgrp. 
+**********************************************************************/ +ib_api_status_t +osm_mcmr_rcv_find_or_create_new_mgrp(IN osm_sa_t * sa, + IN ib_net64_t comp_mask, + IN ib_member_rec_t * + const p_recvd_mcmember_rec, + OUT osm_mgrp_t ** pp_mgrp) +{ + ib_api_status_t status; + + status = __get_mgrp_by_mgid(sa, p_recvd_mcmember_rec, pp_mgrp); + if (status == IB_SUCCESS) + return status; + return osm_mcmr_rcv_create_new_mgrp(sa, comp_mask, + p_recvd_mcmember_rec, NULL, + pp_mgrp); +} + /********************************************************************* Process a request for leaving the group **********************************************************************/ static void -__osm_mcmr_rcv_leave_mgrp(IN osm_mcmr_recv_t * const p_rcv, +__osm_mcmr_rcv_leave_mgrp(IN osm_sa_t * sa, IN const osm_madw_t * const p_madw) { boolean_t valid; @@ -1281,7 +1248,7 @@ __osm_mcmr_rcv_leave_mgrp(IN osm_mcmr_recv_t * const p_rcv, uint8_t port_join_state; uint8_t new_join_state; - OSM_LOG_ENTER(p_rcv->p_log, __osm_mcmr_rcv_leave_mgrp); + OSM_LOG_ENTER(sa->p_log, __osm_mcmr_rcv_leave_mgrp); p_mgrp = NULL; p_sa_mad = osm_madw_get_sa_mad_ptr(p_madw); @@ -1290,20 +1257,20 @@ __osm_mcmr_rcv_leave_mgrp(IN osm_mcmr_recv_t * const p_rcv, mcmember_rec = *p_recvd_mcmember_rec; - if (osm_log_is_active(p_rcv->p_log, OSM_LOG_DEBUG)) { - osm_log(p_rcv->p_log, OSM_LOG_DEBUG, + if (osm_log_is_active(sa->p_log, OSM_LOG_DEBUG)) { + osm_log(sa->p_log, OSM_LOG_DEBUG, "__osm_mcmr_rcv_leave_mgrp: Dump of record\n"); - osm_dump_mc_record(p_rcv->p_log, &mcmember_rec, OSM_LOG_DEBUG); + osm_dump_mc_record(sa->p_log, &mcmember_rec, OSM_LOG_DEBUG); } - CL_PLOCK_EXCL_ACQUIRE(p_rcv->p_lock); - status = __get_mgrp_by_mgid(p_rcv, p_recvd_mcmember_rec, &p_mgrp); + CL_PLOCK_EXCL_ACQUIRE(sa->p_lock); + status = __get_mgrp_by_mgid(sa, p_recvd_mcmember_rec, &p_mgrp); if (status == IB_SUCCESS) { mlid = p_mgrp->mlid; portguid = p_recvd_mcmember_rec->port_gid.unicast.interface_id; /* check validity of the delete 
request o15-0.1.14 */ - valid = __validate_delete(p_rcv, + valid = __validate_delete(sa, p_mgrp, osm_madw_get_mad_addr_ptr(p_madw), p_recvd_mcmember_rec, &p_mcm_port); @@ -1327,9 +1294,9 @@ __osm_mcmr_rcv_leave_mgrp(IN osm_mcmr_recv_t * const p_rcv, mcmember_rec.scope_state = p_mcm_port->scope_state; - CL_PLOCK_RELEASE(p_rcv->p_lock); + CL_PLOCK_RELEASE(sa->p_lock); - osm_log(p_rcv->p_log, OSM_LOG_DEBUG, + osm_log(sa->p_log, OSM_LOG_DEBUG, "__osm_mcmr_rcv_leave_mgrp: " "After update JoinState != 0. Updating from 0x%X to 0x%X\n", port_join_state, new_join_state); @@ -1339,20 +1306,20 @@ __osm_mcmr_rcv_leave_mgrp(IN osm_mcmr_recv_t * const p_rcv, p_mcm_port->scope_state; /* OK we can leave */ - /* note: osm_sm_mcgrp_leave() will release p_rcv->p_lock */ + /* note: osm_sm_mcgrp_leave() will release sa->p_lock */ status = - osm_sm_mcgrp_leave(p_rcv->p_sm, mlid, + osm_sm_mcgrp_leave(sa->sm, mlid, portguid); if (status != IB_SUCCESS) { - osm_log(p_rcv->p_log, OSM_LOG_ERROR, + osm_log(sa->p_log, OSM_LOG_ERROR, "__osm_mcmr_rcv_leave_mgrp: ERR 1B09: " "osm_sm_mcgrp_leave failed\n"); } } } else { - CL_PLOCK_RELEASE(p_rcv->p_lock); - osm_log(p_rcv->p_log, OSM_LOG_ERROR, + CL_PLOCK_RELEASE(sa->p_lock); + osm_log(sa->p_log, OSM_LOG_ERROR, "__osm_mcmr_rcv_leave_mgrp: ERR 1B25: " "Received an invalid delete request for " "MGID: 0x%016" PRIx64 " : " @@ -1368,24 +1335,24 @@ __osm_mcmr_rcv_leave_mgrp(IN osm_mcmr_recv_t * const p_rcv, cl_ntoh64(p_recvd_mcmember_rec->port_gid. 
unicast.interface_id)); sa_status = IB_SA_MAD_STATUS_REQ_INVALID; - osm_sa_send_error(p_rcv->p_resp, p_madw, sa_status); + osm_sa_send_error(sa, p_madw, sa_status); goto Exit; } } else { - CL_PLOCK_RELEASE(p_rcv->p_lock); - osm_log(p_rcv->p_log, OSM_LOG_DEBUG, + CL_PLOCK_RELEASE(sa->p_lock); + osm_log(sa->p_log, OSM_LOG_DEBUG, "__osm_mcmr_rcv_leave_mgrp: " "Failed since multicast group not present\n"); sa_status = IB_SA_MAD_STATUS_REQ_INVALID; - osm_sa_send_error(p_rcv->p_resp, p_madw, sa_status); + osm_sa_send_error(sa, p_madw, sa_status); goto Exit; } /* Send an SA response */ - __osm_mcmr_rcv_respond(p_rcv, p_madw, &mcmember_rec); + __osm_mcmr_rcv_respond(sa, p_madw, &mcmember_rec); Exit: - OSM_LOG_EXIT(p_rcv->p_log); + OSM_LOG_EXIT(sa->p_log); return; } @@ -1393,7 +1360,7 @@ __osm_mcmr_rcv_leave_mgrp(IN osm_mcmr_recv_t * const p_rcv, Handle a join (or create) request **********************************************************************/ static void -__osm_mcmr_rcv_join_mgrp(IN osm_mcmr_recv_t * const p_rcv, +__osm_mcmr_rcv_join_mgrp(IN osm_sa_t * sa, IN const osm_madw_t * const p_madw) { boolean_t valid; @@ -1413,7 +1380,7 @@ __osm_mcmr_rcv_join_mgrp(IN osm_mcmr_recv_t * const p_rcv, osm_mcast_req_type_t req_type; uint8_t join_state; - OSM_LOG_ENTER(p_rcv->p_log, __osm_mcmr_rcv_join_mgrp); + OSM_LOG_ENTER(sa->p_log, __osm_mcmr_rcv_join_mgrp); p_mgrp = NULL; p_sa_mad = osm_madw_get_sa_mad_ptr(p_madw); @@ -1424,25 +1391,25 @@ __osm_mcmr_rcv_join_mgrp(IN osm_mcmr_recv_t * const p_rcv, mcmember_rec = *p_recvd_mcmember_rec; - if (osm_log_is_active(p_rcv->p_log, OSM_LOG_DEBUG)) { - osm_log(p_rcv->p_log, OSM_LOG_DEBUG, + if (osm_log_is_active(sa->p_log, OSM_LOG_DEBUG)) { + osm_log(sa->p_log, OSM_LOG_DEBUG, "__osm_mcmr_rcv_join_mgrp: " "Dump of incoming record\n"); - osm_dump_mc_record(p_rcv->p_log, &mcmember_rec, OSM_LOG_DEBUG); + osm_dump_mc_record(sa->p_log, &mcmember_rec, OSM_LOG_DEBUG); } - CL_PLOCK_EXCL_ACQUIRE(p_rcv->p_lock); + 
CL_PLOCK_EXCL_ACQUIRE(sa->p_lock); /* make sure the requested port guid is known to the SM */ - p_port = osm_get_port_by_guid(p_rcv->p_subn, portguid); + p_port = osm_get_port_by_guid(sa->p_subn, portguid); if (!p_port) { - CL_PLOCK_RELEASE(p_rcv->p_lock); + CL_PLOCK_RELEASE(sa->p_lock); - osm_log(p_rcv->p_log, OSM_LOG_DEBUG, + osm_log(sa->p_log, OSM_LOG_DEBUG, "__osm_mcmr_rcv_join_mgrp: " "Unknown port GUID 0x%016" PRIx64 "\n", portguid); sa_status = IB_SA_MAD_STATUS_REQ_INVALID; - osm_sa_send_error(p_rcv->p_resp, p_madw, sa_status); + osm_sa_send_error(sa, p_madw, sa_status); goto Exit; } @@ -1450,22 +1417,22 @@ __osm_mcmr_rcv_join_mgrp(IN osm_mcmr_recv_t * const p_rcv, /* Check that the p_physp and the requester physp are in the same partition. */ p_request_physp = - osm_get_physp_by_mad_addr(p_rcv->p_log, - p_rcv->p_subn, + osm_get_physp_by_mad_addr(sa->p_log, + sa->p_subn, osm_madw_get_mad_addr_ptr(p_madw)); if (p_request_physp == NULL) { - CL_PLOCK_RELEASE(p_rcv->p_lock); + CL_PLOCK_RELEASE(sa->p_lock); goto Exit; } - if (!osm_physp_share_pkey(p_rcv->p_log, p_physp, p_request_physp)) { - CL_PLOCK_RELEASE(p_rcv->p_lock); + if (!osm_physp_share_pkey(sa->p_log, p_physp, p_request_physp)) { + CL_PLOCK_RELEASE(sa->p_lock); - osm_log(p_rcv->p_log, OSM_LOG_DEBUG, + osm_log(sa->p_log, OSM_LOG_DEBUG, "__osm_mcmr_rcv_join_mgrp: " "Port and requester don't share pkey\n"); sa_status = IB_SA_MAD_STATUS_REQ_INVALID; - osm_sa_send_error(p_rcv->p_resp, p_madw, sa_status); + osm_sa_send_error(sa, p_madw, sa_status); goto Exit; } @@ -1473,12 +1440,12 @@ __osm_mcmr_rcv_join_mgrp(IN osm_mcmr_recv_t * const p_rcv, &join_state); /* do we need to create a new group? 
*/ - status = __get_mgrp_by_mgid(p_rcv, p_recvd_mcmember_rec, &p_mgrp); + status = __get_mgrp_by_mgid(sa, p_recvd_mcmember_rec, &p_mgrp); if ((status == IB_NOT_FOUND) || p_mgrp->to_be_deleted) { /* check for JoinState.FullMember = 1 o15.0.1.9 */ if ((join_state & 0x01) != 0x01) { - CL_PLOCK_RELEASE(p_rcv->p_lock); - osm_log(p_rcv->p_log, OSM_LOG_ERROR, + CL_PLOCK_RELEASE(sa->p_lock); + osm_log(sa->p_log, OSM_LOG_ERROR, "__osm_mcmr_rcv_join_mgrp: ERR 1B10: " "Provided Join State != FullMember - required for create, " "MGID: 0x%016" PRIx64 " : " @@ -1490,7 +1457,7 @@ __osm_mcmr_rcv_join_mgrp(IN osm_mcmr_recv_t * const p_rcv, interface_id), cl_ntoh64(portguid), p_port->p_node->print_desc); sa_status = IB_SA_MAD_STATUS_REQ_INVALID; - osm_sa_send_error(p_rcv->p_resp, p_madw, sa_status); + osm_sa_send_error(sa, p_madw, sa_status); goto Exit; } @@ -1498,24 +1465,24 @@ __osm_mcmr_rcv_join_mgrp(IN osm_mcmr_recv_t * const p_rcv, valid = __check_create_comp_mask(p_sa_mad->comp_mask, p_recvd_mcmember_rec); if (valid) { - status = osm_mcmr_rcv_create_new_mgrp(p_rcv, + status = osm_mcmr_rcv_create_new_mgrp(sa, p_sa_mad-> comp_mask, p_recvd_mcmember_rec, p_physp, &p_mgrp); if (status != IB_SUCCESS) { - CL_PLOCK_RELEASE(p_rcv->p_lock); + CL_PLOCK_RELEASE(sa->p_lock); sa_status = status; - osm_sa_send_error(p_rcv->p_resp, p_madw, + osm_sa_send_error(sa, p_madw, sa_status); goto Exit; } /* copy the MGID to the result */ mcmember_rec.mgid = p_mgrp->mcmember_rec.mgid; } else { - CL_PLOCK_RELEASE(p_rcv->p_lock); + CL_PLOCK_RELEASE(sa->p_lock); - osm_log(p_rcv->p_log, OSM_LOG_ERROR, + osm_log(sa->p_log, OSM_LOG_ERROR, "__osm_mcmr_rcv_join_mgrp: ERR 1B11: " "method = %s, " "scope_state = 0x%x, " @@ -1535,7 +1502,7 @@ __osm_mcmr_rcv_join_mgrp(IN osm_mcmr_recv_t * const p_rcv, p_port->p_node->print_desc); sa_status = IB_SA_MAD_STATUS_INSUF_COMPS; - osm_sa_send_error(p_rcv->p_resp, p_madw, sa_status); + osm_sa_send_error(sa, p_madw, sa_status); goto Exit; } is_new_group = 1; @@ -1568,20 
+1535,20 @@ __osm_mcmr_rcv_join_mgrp(IN osm_mcmr_recv_t * const p_rcv, * * We need to check #3 and #5 here: */ - valid = __validate_more_comp_fields(p_rcv->p_log, + valid = __validate_more_comp_fields(sa->p_log, p_mgrp, p_recvd_mcmember_rec, p_sa_mad->comp_mask) - && __validate_port_caps(p_rcv->p_log, p_mgrp, p_physp) + && __validate_port_caps(sa->p_log, p_mgrp, p_physp) && (join_state != 0); if (!valid) { /* since we might have created the new group we need to cleanup */ - __cleanup_mgrp(p_rcv, mlid); + __cleanup_mgrp(sa, mlid); - CL_PLOCK_RELEASE(p_rcv->p_lock); + CL_PLOCK_RELEASE(sa->p_lock); - osm_log(p_rcv->p_log, OSM_LOG_ERROR, + osm_log(sa->p_log, OSM_LOG_ERROR, "__osm_mcmr_rcv_join_mgrp: ERR 1B12: " "__validate_more_comp_fields, __validate_port_caps, " "or JoinState = 0 failed from port 0x%016" PRIx64 @@ -1589,7 +1556,7 @@ __osm_mcmr_rcv_join_mgrp(IN osm_mcmr_recv_t * const p_rcv, cl_ntoh64(portguid), p_port->p_node->print_desc); sa_status = IB_SA_MAD_STATUS_REQ_INVALID; - osm_sa_send_error(p_rcv->p_resp, p_madw, sa_status); + osm_sa_send_error(sa, p_madw, sa_status); goto Exit; } @@ -1601,14 +1568,14 @@ __osm_mcmr_rcv_join_mgrp(IN osm_mcmr_recv_t * const p_rcv, * o15-0.2.1 requires validation of the requesting port * in the case of modification: */ - valid = __validate_modify(p_rcv, + valid = __validate_modify(sa, p_mgrp, osm_madw_get_mad_addr_ptr(p_madw), p_recvd_mcmember_rec, &p_mcmr_port); if (!valid) { - CL_PLOCK_RELEASE(p_rcv->p_lock); + CL_PLOCK_RELEASE(sa->p_lock); - osm_log(p_rcv->p_log, OSM_LOG_ERROR, + osm_log(sa->p_log, OSM_LOG_ERROR, "__osm_mcmr_rcv_join_mgrp: ERR 1B13: " "__validate_modify failed from port 0x%016" PRIx64 " (%s), " @@ -1617,13 +1584,13 @@ __osm_mcmr_rcv_join_mgrp(IN osm_mcmr_recv_t * const p_rcv, p_port->p_node->print_desc); sa_status = IB_SA_MAD_STATUS_REQ_INVALID; - osm_sa_send_error(p_rcv->p_resp, p_madw, sa_status); + osm_sa_send_error(sa, p_madw, sa_status); goto Exit; } } /* create or update existing port (join-state will 
be updated) */ - status = __add_new_mgrp_port(p_rcv, + status = __add_new_mgrp_port(sa, p_mgrp, p_recvd_mcmember_rec, osm_madw_get_mad_addr_ptr(p_madw), @@ -1631,15 +1598,15 @@ __osm_mcmr_rcv_join_mgrp(IN osm_mcmr_recv_t * const p_rcv, if (status != IB_SUCCESS) { /* we fail to add the port so we might need to delete the group */ - __cleanup_mgrp(p_rcv, mlid); + __cleanup_mgrp(sa, mlid); - CL_PLOCK_RELEASE(p_rcv->p_lock); + CL_PLOCK_RELEASE(sa->p_lock); if (status == IB_INVALID_PARAMETER) sa_status = IB_SA_MAD_STATUS_REQ_INVALID; else sa_status = IB_SA_MAD_STATUS_NO_RESOURCES; - osm_sa_send_error(p_rcv->p_resp, p_madw, sa_status); + osm_sa_send_error(sa, p_madw, sa_status); goto Exit; } @@ -1650,48 +1617,48 @@ __osm_mcmr_rcv_join_mgrp(IN osm_mcmr_recv_t * const p_rcv, __copy_from_create_mc_rec(&mcmember_rec, &p_mgrp->mcmember_rec); /* Release the lock as we don't need it. */ - CL_PLOCK_RELEASE(p_rcv->p_lock); + CL_PLOCK_RELEASE(sa->p_lock); /* do the actual routing (actually schedule the update) */ status = - osm_sm_mcgrp_join(p_rcv->p_sm, + osm_sm_mcgrp_join(sa->sm, mlid, p_recvd_mcmember_rec->port_gid.unicast. interface_id, req_type); if (status != IB_SUCCESS) { - osm_log(p_rcv->p_log, OSM_LOG_ERROR, + osm_log(sa->p_log, OSM_LOG_ERROR, "__osm_mcmr_rcv_join_mgrp: ERR 1B14: " "osm_sm_mcgrp_join failed from port 0x%016" PRIx64 " (%s), " "sending IB_SA_MAD_STATUS_NO_RESOURCES\n", cl_ntoh64(portguid), p_port->p_node->print_desc); - CL_PLOCK_EXCL_ACQUIRE(p_rcv->p_lock); + CL_PLOCK_EXCL_ACQUIRE(sa->p_lock); /* the request for routing failed so we need to remove the port */ - p_mgrp = __get_mgrp_by_mlid(p_rcv, mlid); + p_mgrp = __get_mgrp_by_mlid(sa, mlid); if (p_mgrp != NULL) { - osm_mgrp_remove_port(p_rcv->p_subn, - p_rcv->p_log, + osm_mgrp_remove_port(sa->p_subn, + sa->p_log, p_mgrp, p_recvd_mcmember_rec->port_gid. 
unicast.interface_id); - __cleanup_mgrp(p_rcv, mlid); + __cleanup_mgrp(sa, mlid); } - CL_PLOCK_RELEASE(p_rcv->p_lock); + CL_PLOCK_RELEASE(sa->p_lock); sa_status = IB_SA_MAD_STATUS_NO_RESOURCES; - osm_sa_send_error(p_rcv->p_resp, p_madw, sa_status); + osm_sa_send_error(sa, p_madw, sa_status); goto Exit; } /* failed to route */ - if (osm_log_is_active(p_rcv->p_log, OSM_LOG_DEBUG)) - osm_dump_mc_record(p_rcv->p_log, &mcmember_rec, OSM_LOG_DEBUG); + if (osm_log_is_active(sa->p_log, OSM_LOG_DEBUG)) + osm_dump_mc_record(sa->p_log, &mcmember_rec, OSM_LOG_DEBUG); - __osm_mcmr_rcv_respond(p_rcv, p_madw, &mcmember_rec); + __osm_mcmr_rcv_respond(sa, p_madw, &mcmember_rec); Exit: - OSM_LOG_EXIT(p_rcv->p_log); + OSM_LOG_EXIT(sa->p_log); return; } @@ -1699,18 +1666,18 @@ __osm_mcmr_rcv_join_mgrp(IN osm_mcmr_recv_t * const p_rcv, Add a patched multicast group to the results list **********************************************************************/ static ib_api_status_t -__osm_mcmr_rcv_new_mcmr(IN osm_mcmr_recv_t * const p_rcv, +__osm_mcmr_rcv_new_mcmr(IN osm_sa_t * sa, IN const ib_member_rec_t * p_rcvd_rec, IN cl_qlist_t * const p_list) { osm_mcmr_item_t *p_rec_item; ib_api_status_t status = IB_SUCCESS; - OSM_LOG_ENTER(p_rcv->p_log, __osm_mcmr_rcv_new_mcmr); + OSM_LOG_ENTER(sa->p_log, __osm_mcmr_rcv_new_mcmr); p_rec_item = malloc(sizeof(*p_rec_item)); if (p_rec_item == NULL) { - osm_log(p_rcv->p_log, OSM_LOG_ERROR, + osm_log(sa->p_log, OSM_LOG_ERROR, "__osm_mcmr_rcv_new_mcmr: ERR 1B15: " "rec_item alloc failed\n"); status = IB_INSUFFICIENT_RESOURCES; @@ -1725,7 +1692,7 @@ __osm_mcmr_rcv_new_mcmr(IN osm_mcmr_recv_t * const p_rcv, cl_qlist_insert_tail(p_list, &p_rec_item->list_item); Exit: - OSM_LOG_EXIT(p_rcv->p_log); + OSM_LOG_EXIT(sa->p_log); return (status); } @@ -1739,7 +1706,7 @@ __osm_sa_mcm_by_comp_mask_cb(IN cl_map_item_t * const p_map_item, const osm_mgrp_t *const p_mgrp = (osm_mgrp_t *) p_map_item; osm_sa_mcmr_search_ctxt_t *const p_ctxt = (osm_sa_mcmr_search_ctxt_t 
*) context; - osm_mcmr_recv_t *const p_rcv = p_ctxt->p_rcv; + osm_sa_t *sa = p_ctxt->sa; const ib_member_rec_t *p_rcvd_rec = p_ctxt->p_mcmember_rec; const osm_physp_t *p_req_physp = p_ctxt->p_req_physp; @@ -1755,15 +1722,15 @@ __osm_sa_mcm_by_comp_mask_cb(IN cl_map_item_t * const p_map_item, ib_gid_t port_gid; boolean_t proxy_join = FALSE; - OSM_LOG_ENTER(p_rcv->p_log, __osm_sa_mcm_by_comp_mask_cb); + OSM_LOG_ENTER(sa->p_log, __osm_sa_mcm_by_comp_mask_cb); - osm_log(p_rcv->p_log, OSM_LOG_DEBUG, + osm_log(sa->p_log, OSM_LOG_DEBUG, "__osm_sa_mcm_by_comp_mask_cb: " "Checking mlid:0x%X\n", cl_ntoh16(p_mgrp->mlid)); /* the group might be marked for deletion */ if (p_mgrp->to_be_deleted) { - osm_log(p_rcv->p_log, OSM_LOG_DEBUG, + osm_log(sa->p_log, OSM_LOG_DEBUG, "__osm_sa_mcm_by_comp_mask_cb: " "Group mlid:0x%X is marked to be deleted\n", cl_ntoh16(p_mgrp->mlid)); @@ -1783,7 +1750,7 @@ __osm_sa_mcm_by_comp_mask_cb(IN cl_map_item_t * const p_map_item, /* if the requester physical port doesn't have the pkey that is defined for the group - exit. 
*/ - if (!osm_physp_has_pkey(p_rcv->p_log, p_mgrp->mcmember_rec.pkey, + if (!osm_physp_has_pkey(sa->p_log, p_mgrp->mcmember_rec.pkey, p_req_physp)) goto Exit; @@ -1830,7 +1797,7 @@ __osm_sa_mcm_by_comp_mask_cb(IN cl_map_item_t * const p_map_item, goto Exit; /* need to validate mtu, rate, and pkt_lifetime fields */ - if (__validate_more_comp_fields(p_rcv->p_log, + if (__validate_more_comp_fields(sa->p_log, p_mgrp, p_rcvd_rec, comp_mask) == FALSE) goto Exit; @@ -1861,7 +1828,7 @@ __osm_sa_mcm_by_comp_mask_cb(IN cl_map_item_t * const p_map_item, /* Many MC records returned */ if ((p_ctxt->trusted_req == TRUE) && !(IB_MCR_COMPMASK_PORT_GID & comp_mask)) { - osm_log(p_rcv->p_log, OSM_LOG_DEBUG, + osm_log(sa->p_log, OSM_LOG_DEBUG, "__osm_sa_mcm_by_comp_mask_cb: " "Trusted req is TRUE and no specific port defined\n"); @@ -1878,7 +1845,7 @@ __osm_sa_mcm_by_comp_mask_cb(IN cl_map_item_t * const p_map_item, memcpy(&(match_rec.port_gid), &(p_mcm_port->port_gid), sizeof(ib_gid_t)); - osm_log(p_rcv->p_log, OSM_LOG_DEBUG, + osm_log(sa->p_log, OSM_LOG_DEBUG, "__osm_sa_mcm_by_comp_mask_cb: " "Record of port_gid: 0x%016" PRIx64 "0x%016" PRIx64 @@ -1893,7 +1860,7 @@ __osm_sa_mcm_by_comp_mask_cb(IN cl_map_item_t * const p_map_item, match_rec.proxy_join = (uint8_t) (p_mcm_port->proxy_join); - __osm_mcmr_rcv_new_mcmr(p_rcv, &match_rec, + __osm_mcmr_rcv_new_mcmr(sa, &match_rec, p_ctxt->p_list); } p_item = cl_qmap_next(p_item); @@ -1911,18 +1878,18 @@ __osm_sa_mcm_by_comp_mask_cb(IN cl_map_item_t * const p_map_item, memcpy(&(match_rec.port_gid), &port_gid, sizeof(ib_gid_t)); match_rec.proxy_join = (uint8_t) proxy_join; - __osm_mcmr_rcv_new_mcmr(p_rcv, &match_rec, p_ctxt->p_list); + __osm_mcmr_rcv_new_mcmr(sa, &match_rec, p_ctxt->p_list); } Exit: - OSM_LOG_EXIT(p_rcv->p_log); + OSM_LOG_EXIT(sa->p_log); } /********************************************************************** Handle a query request **********************************************************************/ static void 
-__osm_mcmr_query_mgrp(IN osm_mcmr_recv_t * const p_rcv, +__osm_mcmr_query_mgrp(IN osm_sa_t * sa, IN const osm_madw_t * const p_madw) { const ib_sa_mad_t *p_rcvd_mad; @@ -1943,7 +1910,7 @@ __osm_mcmr_query_mgrp(IN osm_mcmr_recv_t * const p_rcv, osm_physp_t *p_req_physp; boolean_t trusted_req; - OSM_LOG_ENTER(p_rcv->p_log, __osm_mcmr_query_mgrp); + OSM_LOG_ENTER(sa->p_log, __osm_mcmr_query_mgrp); p_rcvd_mad = osm_madw_get_sa_mad_ptr(p_madw); p_rcvd_rec = (ib_member_rec_t *) ib_sa_mad_get_payload_ptr(p_rcvd_mad); @@ -1956,12 +1923,12 @@ __osm_mcmr_query_mgrp(IN osm_mcmr_recv_t * const p_rcv, trusted_req = (p_rcvd_mad->sm_key != 0); /* update the requester physical port. */ - p_req_physp = osm_get_physp_by_mad_addr(p_rcv->p_log, - p_rcv->p_subn, + p_req_physp = osm_get_physp_by_mad_addr(sa->p_log, + sa->p_subn, osm_madw_get_mad_addr_ptr (p_madw)); if (p_req_physp == NULL) { - osm_log(p_rcv->p_log, OSM_LOG_ERROR, + osm_log(sa->p_log, OSM_LOG_ERROR, "__osm_mcmr_query_mgrp: ERR 1B04: " "Cannot find requester physical port\n"); goto Exit; @@ -1972,17 +1939,17 @@ __osm_mcmr_query_mgrp(IN osm_mcmr_recv_t * const p_rcv, context.p_mcmember_rec = p_rcvd_rec; context.p_list = &rec_list; context.comp_mask = p_rcvd_mad->comp_mask; - context.p_rcv = p_rcv; + context.sa = sa; context.p_req_physp = p_req_physp; context.trusted_req = trusted_req; - CL_PLOCK_ACQUIRE(p_rcv->p_lock); + CL_PLOCK_ACQUIRE(sa->p_lock); /* simply go over all MCGs and match */ - cl_qmap_apply_func(&p_rcv->p_subn->mgrp_mlid_tbl, + cl_qmap_apply_func(&sa->p_subn->mgrp_mlid_tbl, __osm_sa_mcm_by_comp_mask_cb, &context); - CL_PLOCK_RELEASE(p_rcv->p_lock); + CL_PLOCK_RELEASE(sa->p_lock); num_rec = cl_qlist_count(&rec_list); @@ -1991,11 +1958,11 @@ __osm_mcmr_query_mgrp(IN osm_mcmr_recv_t * const p_rcv, * If we do a SubnAdmGet and got more than one record it is an error ! 
*/ if ((p_rcvd_mad->method == IB_MAD_METHOD_GET) && (num_rec > 1)) { - osm_log(p_rcv->p_log, OSM_LOG_ERROR, + osm_log(sa->p_log, OSM_LOG_ERROR, "__osm_mcmr_query_mgrp: ERR 1B05: " "Got more than one record for SubnAdmGet (%u)\n", num_rec); - osm_sa_send_error(p_rcv->p_resp, p_madw, + osm_sa_send_error(sa, p_madw, IB_SA_MAD_STATUS_TOO_MANY_RECORDS); /* need to set the mem free ... */ @@ -2016,7 +1983,7 @@ __osm_mcmr_query_mgrp(IN osm_mcmr_recv_t * const p_rcv, trim_num_rec = (MAD_BLOCK_SIZE - IB_SA_MAD_HDR_SIZE) / sizeof(ib_member_rec_t); if (trim_num_rec < num_rec) { - osm_log(p_rcv->p_log, OSM_LOG_VERBOSE, + osm_log(sa->p_log, OSM_LOG_VERBOSE, "__osm_mcmr_query_mgrp: " "Number of records:%u trimmed to:%u to fit in one MAD\n", num_rec, trim_num_rec); @@ -2024,11 +1991,11 @@ __osm_mcmr_query_mgrp(IN osm_mcmr_recv_t * const p_rcv, } #endif - osm_log(p_rcv->p_log, OSM_LOG_DEBUG, + osm_log(sa->p_log, OSM_LOG_DEBUG, "__osm_mcmr_query_mgrp: " "Returning %u records\n", num_rec); if ((p_rcvd_mad->method == IB_MAD_METHOD_GET) && (num_rec == 0)) { - osm_sa_send_error(p_rcv->p_resp, p_madw, + osm_sa_send_error(sa, p_madw, IB_SA_MAD_STATUS_NO_RECORDS); goto Exit; } @@ -2036,14 +2003,14 @@ __osm_mcmr_query_mgrp(IN osm_mcmr_recv_t * const p_rcv, /* * Get a MAD to reply. 
Address of Mad is in the received mad_wrapper */ - p_resp_madw = osm_mad_pool_get(p_rcv->p_mad_pool, + p_resp_madw = osm_mad_pool_get(sa->p_mad_pool, p_madw->h_bind, num_rec * sizeof(ib_member_rec_t) + IB_SA_MAD_HDR_SIZE, osm_madw_get_mad_addr_ptr(p_madw)); if (!p_resp_madw) { - osm_log(p_rcv->p_log, OSM_LOG_ERROR, + osm_log(sa->p_log, OSM_LOG_ERROR, "__osm_mcmr_query_mgrp: ERR 1B16: " "osm_mad_pool_get failed\n"); @@ -2053,7 +2020,7 @@ __osm_mcmr_query_mgrp(IN osm_mcmr_recv_t * const p_rcv, free(p_rec_item); } - osm_sa_send_error(p_rcv->p_resp, p_madw, + osm_sa_send_error(sa, p_madw, IB_SA_MAD_STATUS_NO_RESOURCES); goto Exit; } @@ -2121,9 +2088,9 @@ __osm_mcmr_query_mgrp(IN osm_mcmr_recv_t * const p_rcv, CL_ASSERT(cl_is_qlist_empty(&rec_list)); status = osm_sa_vendor_send(p_resp_madw->h_bind, p_resp_madw, FALSE, - p_rcv->p_subn); + sa->p_subn); if (status != IB_SUCCESS) { - osm_log(p_rcv->p_log, OSM_LOG_ERROR, + osm_log(sa->p_log, OSM_LOG_ERROR, "__osm_mcmr_query_mgrp: ERR 1B17: " "osm_sa_vendor_send status = %s\n", ib_get_err_str(status)); @@ -2131,23 +2098,23 @@ __osm_mcmr_query_mgrp(IN osm_mcmr_recv_t * const p_rcv, } Exit: - OSM_LOG_EXIT(p_rcv->p_log); + OSM_LOG_EXIT(sa->p_log); } /********************************************************************** **********************************************************************/ void osm_mcmr_rcv_process(IN void *context, IN void *data) { - osm_mcmr_recv_t *p_rcv = context; + osm_sa_t *sa = context; osm_madw_t *p_madw = data; ib_sa_mad_t *p_sa_mad; ib_net16_t sa_status = IB_SA_MAD_STATUS_REQ_INVALID; ib_member_rec_t *p_recvd_mcmember_rec; boolean_t valid; - CL_ASSERT(p_rcv); + CL_ASSERT(sa); - OSM_LOG_ENTER(p_rcv->p_log, osm_mcmr_rcv_process); + OSM_LOG_ENTER(sa->p_log, osm_mcmr_rcv_process); CL_ASSERT(p_madw); @@ -2161,7 +2128,7 @@ void osm_mcmr_rcv_process(IN void *context, IN void *data) case IB_MAD_METHOD_SET: valid = __check_join_comp_mask(p_sa_mad->comp_mask); if (!valid) { - osm_log(p_rcv->p_log, 
OSM_LOG_ERROR, + osm_log(sa->p_log, OSM_LOG_ERROR, "osm_mcmr_rcv_process: ERR 1B18: " "component mask = 0x%016" PRIx64 ", " "expected comp mask = 0x%016" PRIx64 " ," @@ -2180,52 +2147,52 @@ void osm_mcmr_rcv_process(IN void *context, IN void *data) cl_ntoh64(p_recvd_mcmember_rec->port_gid. unicast.interface_id)); - osm_sa_send_error(p_rcv->p_resp, p_madw, sa_status); + osm_sa_send_error(sa, p_madw, sa_status); goto Exit; } /* * Join or Create Multicast Group */ - __osm_mcmr_rcv_join_mgrp(p_rcv, p_madw); + __osm_mcmr_rcv_join_mgrp(sa, p_madw); break; case IB_MAD_METHOD_DELETE: valid = __check_join_comp_mask(p_sa_mad->comp_mask); if (!valid) { - osm_log(p_rcv->p_log, OSM_LOG_ERROR, + osm_log(sa->p_log, OSM_LOG_ERROR, "osm_mcmr_rcv_process: ERR 1B20: " "component mask = 0x%016" PRIx64 ", " "expected comp mask = 0x%016" PRIx64 "\n", cl_ntoh64(p_sa_mad->comp_mask), CL_NTOH64(JOIN_MC_COMP_MASK)); - osm_sa_send_error(p_rcv->p_resp, p_madw, sa_status); + osm_sa_send_error(sa, p_madw, sa_status); goto Exit; } /* * Leave Multicast Group */ - __osm_mcmr_rcv_leave_mgrp(p_rcv, p_madw); + __osm_mcmr_rcv_leave_mgrp(sa, p_madw); break; case IB_MAD_METHOD_GET: case IB_MAD_METHOD_GETTABLE: /* * Querying a Multicast Group */ - __osm_mcmr_query_mgrp(p_rcv, p_madw); + __osm_mcmr_query_mgrp(sa, p_madw); break; default: - osm_log(p_rcv->p_log, OSM_LOG_ERROR, + osm_log(sa->p_log, OSM_LOG_ERROR, "osm_mcmr_rcv_process: ERR 1B21: " "Unsupported Method (%s)\n", ib_get_sa_method_str(p_sa_mad->method)); - osm_sa_send_error(p_rcv->p_resp, p_madw, + osm_sa_send_error(sa, p_madw, IB_MAD_STATUS_UNSUP_METHOD_ATTR); break; } Exit: - OSM_LOG_EXIT(p_rcv->p_log); + OSM_LOG_EXIT(sa->p_log); return; } diff --git a/opensm/opensm/osm_sa_mft_record.c b/opensm/opensm/osm_sa_mft_record.c index f9ac527..30506a6 100644 --- a/opensm/opensm/osm_sa_mft_record.c +++ b/opensm/opensm/osm_sa_mft_record.c @@ -52,9 +52,8 @@ #include #include #include -#include -#include #include +#include #include #include #include @@ 
-68,52 +67,14 @@ typedef struct _osm_mftr_search_ctxt { const ib_mft_record_t *p_rcvd_rec; ib_net64_t comp_mask; cl_qlist_t *p_list; - osm_mftr_rcv_t *p_rcv; + osm_sa_t *sa; const osm_physp_t *p_req_physp; } osm_mftr_search_ctxt_t; /********************************************************************** **********************************************************************/ -void osm_mftr_rcv_construct(IN osm_mftr_rcv_t * const p_rcv) -{ - memset(p_rcv, 0, sizeof(*p_rcv)); -} - -/********************************************************************** - **********************************************************************/ -void osm_mftr_rcv_destroy(IN osm_mftr_rcv_t * const p_rcv) -{ - OSM_LOG_ENTER(p_rcv->p_log, osm_mftr_rcv_destroy); - OSM_LOG_EXIT(p_rcv->p_log); -} - -/********************************************************************** - **********************************************************************/ -ib_api_status_t -osm_mftr_rcv_init(IN osm_mftr_rcv_t * const p_rcv, - IN osm_sa_resp_t * const p_resp, - IN osm_mad_pool_t * const p_mad_pool, - IN osm_subn_t * const p_subn, - IN osm_log_t * const p_log, IN cl_plock_t * const p_lock) -{ - OSM_LOG_ENTER(p_log, osm_mftr_rcv_init); - - osm_mftr_rcv_construct(p_rcv); - - p_rcv->p_log = p_log; - p_rcv->p_subn = p_subn; - p_rcv->p_lock = p_lock; - p_rcv->p_resp = p_resp; - p_rcv->p_mad_pool = p_mad_pool; - - OSM_LOG_EXIT(p_log); - return IB_SUCCESS; -} - -/********************************************************************** - **********************************************************************/ static ib_api_status_t -__osm_mftr_rcv_new_mftr(IN osm_mftr_rcv_t * const p_rcv, +__osm_mftr_rcv_new_mftr(IN osm_sa_t * sa, IN osm_switch_t * const p_sw, IN cl_qlist_t * const p_list, IN ib_net16_t const lid, @@ -123,19 +84,19 @@ __osm_mftr_rcv_new_mftr(IN osm_mftr_rcv_t * const p_rcv, ib_api_status_t status = IB_SUCCESS; uint16_t position_block_num; - OSM_LOG_ENTER(p_rcv->p_log, __osm_mftr_rcv_new_mftr); + 
OSM_LOG_ENTER(sa->p_log, __osm_mftr_rcv_new_mftr); p_rec_item = malloc(sizeof(*p_rec_item)); if (p_rec_item == NULL) { - osm_log(p_rcv->p_log, OSM_LOG_ERROR, + osm_log(sa->p_log, OSM_LOG_ERROR, "__osm_mftr_rcv_new_mftr: ERR 4A02: " "rec_item alloc failed\n"); status = IB_INSUFFICIENT_RESOURCES; goto Exit; } - if (osm_log_is_active(p_rcv->p_log, OSM_LOG_DEBUG)) { - osm_log(p_rcv->p_log, OSM_LOG_DEBUG, + if (osm_log_is_active(sa->p_log, OSM_LOG_DEBUG)) { + osm_log(sa->p_log, OSM_LOG_DEBUG, "__osm_mftr_rcv_new_mftr: " "New MulticastForwardingTable: sw 0x%016" PRIx64 "\n\t\t\t\tblock %u position %u lid 0x%02X\n", @@ -158,27 +119,27 @@ __osm_mftr_rcv_new_mftr(IN osm_mftr_rcv_t * const p_rcv, cl_qlist_insert_tail(p_list, &p_rec_item->list_item); Exit: - OSM_LOG_EXIT(p_rcv->p_log); + OSM_LOG_EXIT(sa->p_log); return (status); } /********************************************************************** **********************************************************************/ -static osm_port_t *__osm_mftr_get_port_by_guid(IN osm_mftr_rcv_t * const p_rcv, +static osm_port_t *__osm_mftr_get_port_by_guid(IN osm_sa_t * sa, IN uint64_t port_guid) { osm_port_t *p_port; - CL_PLOCK_ACQUIRE(p_rcv->p_lock); + CL_PLOCK_ACQUIRE(sa->p_lock); - p_port = osm_get_port_by_guid(p_rcv->p_subn, port_guid); + p_port = osm_get_port_by_guid(sa->p_subn, port_guid); if (!p_port) { - osm_log(p_rcv->p_log, OSM_LOG_DEBUG, + osm_log(sa->p_log, OSM_LOG_DEBUG, "__osm_mftr_get_port_by_guid ERR 4A04: " "Invalid port GUID 0x%016" PRIx64 "\n", port_guid); } - CL_PLOCK_RELEASE(p_rcv->p_lock); + CL_PLOCK_RELEASE(sa->p_lock); return p_port; } @@ -192,7 +153,7 @@ __osm_mftr_rcv_by_comp_mask(IN cl_map_item_t * const p_map_item, (osm_mftr_search_ctxt_t *) context; osm_switch_t *const p_sw = (osm_switch_t *) p_map_item; const ib_mft_record_t *const p_rcvd_rec = p_ctxt->p_rcvd_rec; - osm_mftr_rcv_t *const p_rcv = p_ctxt->p_rcv; + osm_sa_t *sa = p_ctxt->sa; ib_net64_t const comp_mask = p_ctxt->comp_mask; const osm_physp_t 
*const p_req_physp = p_ctxt->p_req_physp;
 	osm_port_t *p_port;
@@ -204,10 +165,10 @@ __osm_mftr_rcv_by_comp_mask(IN cl_map_item_t * const p_map_item,
 
 	/* In switches, the port guid is the node guid. */
 	p_port =
-	    __osm_mftr_get_port_by_guid(p_rcv,
+	    __osm_mftr_get_port_by_guid(sa,
 					p_sw->p_node->node_info.port_guid);
 	if (!p_port) {
-		osm_log(p_rcv->p_log, OSM_LOG_ERROR,
+		osm_log(sa->p_log, OSM_LOG_ERROR,
 			"__osm_mftr_rcv_by_comp_mask: ERR 4A05: "
 			"Failed to find Port by Node Guid:0x%016" PRIx64
 			"\n", cl_ntoh64(p_sw->p_node->node_info.node_guid)
@@ -219,7 +180,7 @@ __osm_mftr_rcv_by_comp_mask(IN cl_map_item_t * const p_map_item,
 	   the same partition. */
 	p_physp = p_port->p_physp;
 	if (!p_physp) {
-		osm_log(p_rcv->p_log, OSM_LOG_ERROR,
+		osm_log(sa->p_log, OSM_LOG_ERROR,
 			"__osm_mftr_rcv_by_comp_mask: ERR 4A06: "
 			"Failed to find default physical Port by Node Guid:0x%016"
 			PRIx64 "\n",
@@ -227,7 +188,7 @@ __osm_mftr_rcv_by_comp_mask(IN cl_map_item_t * const p_map_item,
 		    );
 		return;
 	}
-	if (!osm_physp_share_pkey(p_rcv->p_log, p_req_physp, p_physp))
+	if (!osm_physp_share_pkey(sa->p_log, p_req_physp, p_physp))
 		return;
 
 	/* get the port 0 of the switch */
@@ -235,7 +196,7 @@ __osm_mftr_rcv_by_comp_mask(IN cl_map_item_t * const p_map_item,
 
 	/* compare the lids - if required */
 	if (comp_mask & IB_MFTR_COMPMASK_LID) {
-		osm_log(p_rcv->p_log, OSM_LOG_DEBUG,
+		osm_log(sa->p_log, OSM_LOG_DEBUG,
 			"__osm_mftr_rcv_by_comp_mask: "
 			"Comparing lid:0x%02X to port lid range: 0x%02X .. 0x%02X\n",
 			cl_ntoh16(p_rcvd_rec->lid), min_lid_ho, max_lid_ho);
@@ -282,7 +243,7 @@ __osm_mftr_rcv_by_comp_mask(IN cl_map_item_t * const p_map_item,
 	for (block = min_block; block <= max_block; block++)
 		for (position = min_position; position <= max_position;
 		     position++)
-			__osm_mftr_rcv_new_mftr(p_rcv, p_sw, p_ctxt->p_list,
+			__osm_mftr_rcv_new_mftr(sa, p_sw, p_ctxt->p_list,
 						osm_port_get_base_lid(p_port),
 						block, position);
 }
@@ -291,7 +252,7 @@ __osm_mftr_rcv_by_comp_mask(IN cl_map_item_t * const p_map_item,
  **********************************************************************/
 void osm_mftr_rcv_process(IN void *ctx, IN void *data)
 {
-	osm_mftr_rcv_t *p_rcv = ctx;
+	osm_sa_t *sa = ctx;
 	osm_madw_t *p_madw = data;
 	const ib_sa_mad_t *p_rcvd_mad;
 	const ib_mft_record_t *p_rcvd_rec;
@@ -309,9 +270,9 @@ void osm_mftr_rcv_process(IN void *ctx, IN void *data)
 	ib_api_status_t status = IB_SUCCESS;
 	osm_physp_t *p_req_physp;
 
-	CL_ASSERT(p_rcv);
+	CL_ASSERT(sa);
 
-	OSM_LOG_ENTER(p_rcv->p_log, osm_mftr_rcv_process);
+	OSM_LOG_ENTER(sa->p_log, osm_mftr_rcv_process);
 
 	CL_ASSERT(p_madw);
 
@@ -323,22 +284,22 @@ void osm_mftr_rcv_process(IN void *ctx, IN void *data)
 	/* we only support SubnAdmGet and SubnAdmGetTable methods */
 	if ((p_rcvd_mad->method != IB_MAD_METHOD_GET) &&
 	    (p_rcvd_mad->method != IB_MAD_METHOD_GETTABLE)) {
-		osm_log(p_rcv->p_log, OSM_LOG_ERROR,
+		osm_log(sa->p_log, OSM_LOG_ERROR,
 			"osm_mftr_rcv_process: ERR 4A08: "
 			"Unsupported Method (%s)\n",
 			ib_get_sa_method_str(p_rcvd_mad->method));
-		osm_sa_send_error(p_rcv->p_resp, p_madw,
+		osm_sa_send_error(sa, p_madw,
 				  IB_MAD_STATUS_UNSUP_METHOD_ATTR);
 		goto Exit;
 	}
 
 	/* update the requester physical port. */
-	p_req_physp = osm_get_physp_by_mad_addr(p_rcv->p_log,
-						p_rcv->p_subn,
+	p_req_physp = osm_get_physp_by_mad_addr(sa->p_log,
+						sa->p_subn,
 						osm_madw_get_mad_addr_ptr
 						(p_madw));
 	if (p_req_physp == NULL) {
-		osm_log(p_rcv->p_log, OSM_LOG_ERROR,
+		osm_log(sa->p_log, OSM_LOG_ERROR,
 			"osm_mftr_rcv_process: ERR 4A07: "
 			"Cannot find requester physical port\n");
 		goto Exit;
@@ -349,16 +310,16 @@ void osm_mftr_rcv_process(IN void *ctx, IN void *data)
 	context.p_rcvd_rec = p_rcvd_rec;
 	context.p_list = &rec_list;
 	context.comp_mask = p_rcvd_mad->comp_mask;
-	context.p_rcv = p_rcv;
+	context.sa = sa;
 	context.p_req_physp = p_req_physp;
 
-	cl_plock_acquire(p_rcv->p_lock);
+	cl_plock_acquire(sa->p_lock);
 
 	/* Go over all switches */
-	cl_qmap_apply_func(&p_rcv->p_subn->sw_guid_tbl,
+	cl_qmap_apply_func(&sa->p_subn->sw_guid_tbl,
 			   __osm_mftr_rcv_by_comp_mask, &context);
 
-	cl_plock_release(p_rcv->p_lock);
+	cl_plock_release(sa->p_lock);
 
 	num_rec = cl_qlist_count(&rec_list);
 
@@ -368,16 +329,16 @@ void osm_mftr_rcv_process(IN void *ctx, IN void *data)
 	 */
 	if (p_rcvd_mad->method == IB_MAD_METHOD_GET) {
 		if (num_rec == 0) {
-			osm_sa_send_error(p_rcv->p_resp, p_madw,
+			osm_sa_send_error(sa, p_madw,
 					  IB_SA_MAD_STATUS_NO_RECORDS);
 			goto Exit;
 		}
 		if (num_rec > 1) {
-			osm_log(p_rcv->p_log, OSM_LOG_ERROR,
+			osm_log(sa->p_log, OSM_LOG_ERROR,
 				"osm_mftr_rcv_process: ERR 4A09: "
 				"Got more than one record for SubnAdmGet (%u)\n",
 				num_rec);
-			osm_sa_send_error(p_rcv->p_resp, p_madw,
+			osm_sa_send_error(sa, p_madw,
 					  IB_SA_MAD_STATUS_TOO_MANY_RECORDS);
 
 			/* need to set the mem free ...
 */
@@ -400,7 +361,7 @@ void osm_mftr_rcv_process(IN void *ctx, IN void *data)
 	trim_num_rec =
 	    (MAD_BLOCK_SIZE - IB_SA_MAD_HDR_SIZE) / sizeof(ib_mft_record_t);
 	if (trim_num_rec < num_rec) {
-		osm_log(p_rcv->p_log, OSM_LOG_VERBOSE,
+		osm_log(sa->p_log, OSM_LOG_VERBOSE,
 			"osm_mftr_rcv_process: "
 			"Number of records:%u trimmed to:%u to fit in one MAD\n",
 			num_rec, trim_num_rec);
@@ -408,11 +369,11 @@ void osm_mftr_rcv_process(IN void *ctx, IN void *data)
 	}
 #endif
 
-	osm_log(p_rcv->p_log, OSM_LOG_DEBUG,
+	osm_log(sa->p_log, OSM_LOG_DEBUG,
 		"osm_mftr_rcv_process: " "Returning %u records\n", num_rec);
 
 	if ((p_rcvd_mad->method != IB_MAD_METHOD_GETTABLE) && (num_rec == 0)) {
-		osm_sa_send_error(p_rcv->p_resp, p_madw,
+		osm_sa_send_error(sa, p_madw,
 				  IB_SA_MAD_STATUS_NO_RECORDS);
 		goto Exit;
 	}
@@ -420,13 +381,13 @@ void osm_mftr_rcv_process(IN void *ctx, IN void *data)
 	/*
 	 * Get a MAD to reply. Address of Mad is in the received mad_wrapper
 	 */
-	p_resp_madw = osm_mad_pool_get(p_rcv->p_mad_pool,
+	p_resp_madw = osm_mad_pool_get(sa->p_mad_pool,
 				       p_madw->h_bind,
 				       num_rec * sizeof(ib_mft_record_t) +
 				       IB_SA_MAD_HDR_SIZE, &p_madw->mad_addr);
 
 	if (!p_resp_madw) {
-		osm_log(p_rcv->p_log, OSM_LOG_ERROR,
+		osm_log(sa->p_log, OSM_LOG_ERROR,
 			"osm_mftr_rcv_process: ERR 4A10: "
 			"osm_mad_pool_get failed\n");
 
@@ -436,7 +397,7 @@ void osm_mftr_rcv_process(IN void *ctx, IN void *data)
 			free(p_rec_item);
 		}
 
-		osm_sa_send_error(p_rcv->p_resp, p_madw,
+		osm_sa_send_error(sa, p_madw,
 				  IB_SA_MAD_STATUS_NO_RESOURCES);
 
 		goto Exit;
@@ -489,9 +450,9 @@ void osm_mftr_rcv_process(IN void *ctx, IN void *data)
 	CL_ASSERT(cl_is_qlist_empty(&rec_list));
 
 	status = osm_sa_vendor_send(p_resp_madw->h_bind, p_resp_madw, FALSE,
-				    p_rcv->p_subn);
+				    sa->p_subn);
 	if (status != IB_SUCCESS) {
-		osm_log(p_rcv->p_log, OSM_LOG_ERROR,
+		osm_log(sa->p_log, OSM_LOG_ERROR,
 			"osm_mftr_rcv_process: ERR 4A11: "
 			"osm_sa_vendor_send status = %s\n",
 			ib_get_err_str(status));
@@ -499,5 +460,5 @@ void osm_mftr_rcv_process(IN void *ctx, IN void *data)
 	}
 
 Exit:
-	OSM_LOG_EXIT(p_rcv->p_log);
+	OSM_LOG_EXIT(sa->p_log);
 }
diff --git a/opensm/opensm/osm_sa_multipath_record.c b/opensm/opensm/osm_sa_multipath_record.c
index 6851cce..1fa81d6 100644
--- a/opensm/opensm/osm_sa_multipath_record.c
+++ b/opensm/opensm/osm_sa_multipath_record.c
@@ -56,13 +56,11 @@
 #include
 #include
 #include
-#include
+#include
 #include
 #include
 #include
 #include
-#include
-#include
 #include
 #include
 #include
@@ -89,44 +87,6 @@ typedef struct _osm_path_parms {
 
 /**********************************************************************
  **********************************************************************/
-void osm_mpr_rcv_construct(IN osm_mpr_rcv_t * const p_rcv)
-{
-	memset(p_rcv, 0, sizeof(*p_rcv));
-}
-
-/**********************************************************************
- **********************************************************************/
-void osm_mpr_rcv_destroy(IN osm_mpr_rcv_t * const p_rcv)
-{
-	OSM_LOG_ENTER(p_rcv->p_log, osm_mpr_rcv_destroy);
-	OSM_LOG_EXIT(p_rcv->p_log);
-}
-
-/**********************************************************************
- **********************************************************************/
-ib_api_status_t
-osm_mpr_rcv_init(IN osm_mpr_rcv_t * const p_rcv,
-		 IN osm_sa_resp_t * const p_resp,
-		 IN osm_mad_pool_t * const p_mad_pool,
-		 IN osm_subn_t * const p_subn,
-		 IN osm_log_t * const p_log, IN cl_plock_t * const p_lock)
-{
-	OSM_LOG_ENTER(p_log, osm_mpr_rcv_init);
-
-	osm_mpr_rcv_construct(p_rcv);
-
-	p_rcv->p_log = p_log;
-	p_rcv->p_subn = p_subn;
-	p_rcv->p_lock = p_lock;
-	p_rcv->p_resp = p_resp;
-	p_rcv->p_mad_pool = p_mad_pool;
-
-	OSM_LOG_EXIT(p_rcv->p_log);
-	return IB_SUCCESS;
-}
-
-/**********************************************************************
- **********************************************************************/
 static inline boolean_t
 __osm_sa_multipath_rec_is_tavor_port(IN const osm_port_t * const p_port)
 {
@@ -201,7 +161,7 @@ __osm_sa_multipath_rec_apply_tavor_mtu_limit(IN
const ib_multipath_rec_t * /********************************************************************** **********************************************************************/ static ib_api_status_t -__osm_mpr_rcv_get_path_parms(IN osm_mpr_rcv_t * const p_rcv, +__osm_mpr_rcv_get_path_parms(IN osm_sa_t * sa, IN const ib_multipath_rec_t * const p_mpr, IN const osm_port_t * const p_src_port, IN const osm_port_t * const p_dest_port, @@ -232,7 +192,7 @@ __osm_mpr_rcv_get_path_parms(IN osm_mpr_rcv_t * const p_rcv, osm_qos_level_t *p_qos_level = NULL; uint16_t valid_sl_mask = 0xffff; - OSM_LOG_ENTER(p_rcv->p_log, __osm_mpr_rcv_get_path_parms); + OSM_LOG_ENTER(sa->p_log, __osm_mpr_rcv_get_path_parms); dest_lid = cl_hton16(dest_lid_ho); @@ -250,13 +210,13 @@ __osm_mpr_rcv_get_path_parms(IN osm_mpr_rcv_t * const p_rcv, and at least one end of the path is Tavor we override the port MTU with 1K. */ - if (p_rcv->p_subn->opt.enable_quirks && + if (sa->p_subn->opt.enable_quirks && __osm_sa_multipath_rec_apply_tavor_mtu_limit(p_mpr, p_src_port, p_dest_port, comp_mask)) if (mtu > IB_MTU_LEN_1024) { mtu = IB_MTU_LEN_1024; - osm_log(p_rcv->p_log, OSM_LOG_DEBUG, + osm_log(sa->p_log, OSM_LOG_DEBUG, "__osm_mpr_rcv_get_path_parms: " "Optimized Path MTU to 1K for Mellanox Tavor device\n"); } @@ -279,7 +239,7 @@ __osm_mpr_rcv_get_path_parms(IN osm_mpr_rcv_t * const p_rcv, */ p_physp = osm_switch_get_route_by_lid(p_node->sw, dest_lid); if (p_physp == 0) { - osm_log(p_rcv->p_log, OSM_LOG_ERROR, + osm_log(sa->p_log, OSM_LOG_ERROR, "__osm_mpr_rcv_get_path_parms: ERR 4514: " "Can't find routing to LID 0x%X from switch for GUID 0x%016" PRIx64 "\n", dest_lid_ho, @@ -289,7 +249,7 @@ __osm_mpr_rcv_get_path_parms(IN osm_mpr_rcv_t * const p_rcv, } } - if (p_rcv->p_subn->opt.qos) { + if (sa->p_subn->opt.qos) { /* * Whether this node is switch or CA, the IN port for @@ -304,8 +264,8 @@ __osm_mpr_rcv_get_path_parms(IN osm_mpr_rcv_t * const p_rcv, valid_sl_mask &= ~(1 << i); } if (!valid_sl_mask) { - if 
(osm_log_is_active(p_rcv->p_log, OSM_LOG_DEBUG)) - osm_log(p_rcv->p_log, OSM_LOG_DEBUG, + if (osm_log_is_active(sa->p_log, OSM_LOG_DEBUG)) + osm_log(sa->p_log, OSM_LOG_DEBUG, "__osm_mpr_rcv_get_path_parms: " "All the SLs lead to VL15 on this path\n"); status = IB_NOT_FOUND; @@ -325,7 +285,7 @@ __osm_mpr_rcv_get_path_parms(IN osm_mpr_rcv_t * const p_rcv, p_dest_physp = osm_switch_get_route_by_lid(p_node->sw, dest_lid); if (p_dest_physp == 0) { - osm_log(p_rcv->p_log, OSM_LOG_ERROR, + osm_log(sa->p_log, OSM_LOG_ERROR, "__osm_mpr_rcv_get_path_parms: ERR 4515: " "Can't find routing to LID 0x%X from switch for GUID 0x%016" PRIx64 "\n", dest_lid_ho, @@ -346,7 +306,7 @@ __osm_mpr_rcv_get_path_parms(IN osm_mpr_rcv_t * const p_rcv, p_physp = osm_physp_get_remote(p_physp); if (p_physp == 0) { - osm_log(p_rcv->p_log, OSM_LOG_ERROR, + osm_log(sa->p_log, OSM_LOG_ERROR, "__osm_mpr_rcv_get_path_parms: ERR 4505: " "Can't find remote phys port when routing to LID 0x%X from node GUID 0x%016" PRIx64 "\n", dest_lid_ho, @@ -372,7 +332,7 @@ __osm_mpr_rcv_get_path_parms(IN osm_mpr_rcv_t * const p_rcv, If this isn't a switch, we should have reached the destination by now! 
*/ - osm_log(p_rcv->p_log, OSM_LOG_ERROR, + osm_log(sa->p_log, OSM_LOG_ERROR, "__osm_mpr_rcv_get_path_parms: ERR 4503: " "Internal error, bad path\n"); status = IB_ERROR; @@ -396,7 +356,7 @@ __osm_mpr_rcv_get_path_parms(IN osm_mpr_rcv_t * const p_rcv, p_physp = osm_switch_get_route_by_lid(p_node->sw, dest_lid); if (p_physp == 0) { - osm_log(p_rcv->p_log, OSM_LOG_ERROR, + osm_log(sa->p_log, OSM_LOG_ERROR, "__osm_mpr_rcv_get_path_parms: ERR 4516: " "Dead end on path to LID 0x%X from switch for GUID 0x%016" PRIx64 "\n", dest_lid_ho, @@ -415,7 +375,7 @@ __osm_mpr_rcv_get_path_parms(IN osm_mpr_rcv_t * const p_rcv, if (rate > ib_port_info_compute_rate(p_pi)) rate = ib_port_info_compute_rate(p_pi); - if (p_rcv->p_subn->opt.qos) { + if (sa->p_subn->opt.qos) { /* * Check SL2VL table of the switch and update valid SLs */ @@ -426,8 +386,8 @@ __osm_mpr_rcv_get_path_parms(IN osm_mpr_rcv_t * const p_rcv, valid_sl_mask &= ~(1 << i); } if (!valid_sl_mask) { - if (osm_log_is_active(p_rcv->p_log, OSM_LOG_DEBUG)) - osm_log(p_rcv->p_log, OSM_LOG_DEBUG, + if (osm_log_is_active(sa->p_log, OSM_LOG_DEBUG)) + osm_log(sa->p_log, OSM_LOG_DEBUG, "__osm_mpr_rcv_get_path_parms: " "All the SLs lead to VL15 " "on this path\n"); @@ -448,8 +408,8 @@ __osm_mpr_rcv_get_path_parms(IN osm_mpr_rcv_t * const p_rcv, if (rate > ib_port_info_compute_rate(p_pi)) rate = ib_port_info_compute_rate(p_pi); - if (osm_log_is_active(p_rcv->p_log, OSM_LOG_DEBUG)) { - osm_log(p_rcv->p_log, OSM_LOG_DEBUG, + if (osm_log_is_active(sa->p_log, OSM_LOG_DEBUG)) { + osm_log(sa->p_log, OSM_LOG_DEBUG, "__osm_mpr_rcv_get_path_parms: " "Path min MTU = %u, min rate = %u\n", mtu, rate); } @@ -458,15 +418,15 @@ __osm_mpr_rcv_get_path_parms(IN osm_mpr_rcv_t * const p_rcv, * Get QoS Level object according to the MultiPath request * and adjust MultiPath parameters according to QoS settings */ - if (p_rcv->p_subn->opt.qos && - p_rcv->p_subn->p_qos_policy && + if (sa->p_subn->opt.qos && + sa->p_subn->p_qos_policy && (p_qos_level = - 
osm_qos_policy_get_qos_level_by_mpr(p_rcv->p_subn->p_qos_policy, + osm_qos_policy_get_qos_level_by_mpr(sa->p_subn->p_qos_policy, p_mpr, p_src_physp, p_dest_physp, comp_mask))) { - if (osm_log_is_active(p_rcv->p_log, OSM_LOG_DEBUG)) - osm_log(p_rcv->p_log, OSM_LOG_DEBUG, + if (osm_log_is_active(sa->p_log, OSM_LOG_DEBUG)) + osm_log(sa->p_log, OSM_LOG_DEBUG, "__osm_mpr_rcv_get_path_parms: " "MultiPathRecord request matches QoS Level '%s' (%s)\n", p_qos_level->name, @@ -651,7 +611,7 @@ __osm_mpr_rcv_get_path_parms(IN osm_mpr_rcv_t * const p_rcv, required_pkey = p_mpr->pkey; if (!osm_physp_share_this_pkey (p_src_physp, p_dest_physp, required_pkey)) { - osm_log(p_rcv->p_log, OSM_LOG_ERROR, + osm_log(sa->p_log, OSM_LOG_ERROR, "__osm_mpr_rcv_get_path_parms: ERR 4518: " "Ports do not share specified PKey 0x%04x\n" "\t\tsrc %" PRIx64 " dst %" PRIx64 "\n", @@ -664,7 +624,7 @@ __osm_mpr_rcv_get_path_parms(IN osm_mpr_rcv_t * const p_rcv, } if (p_qos_level && p_qos_level->pkey_range_len && !osm_qos_level_has_pkey(p_qos_level, required_pkey)) { - osm_log(p_rcv->p_log, OSM_LOG_ERROR, + osm_log(sa->p_log, OSM_LOG_ERROR, "__osm_mpr_rcv_get_path_parms: ERR 451C: " "Ports do not share PKeys defined by QoS level\n"); status = IB_NOT_FOUND; @@ -680,7 +640,7 @@ __osm_mpr_rcv_get_path_parms(IN osm_mpr_rcv_t * const p_rcv, p_src_physp, p_dest_physp); if (!required_pkey) { - osm_log(p_rcv->p_log, OSM_LOG_ERROR, + osm_log(sa->p_log, OSM_LOG_ERROR, "__osm_mpr_rcv_get_path_parms: ERR 451D: " "Ports do not share PKeys defined by QoS level\n"); status = IB_NOT_FOUND; @@ -695,7 +655,7 @@ __osm_mpr_rcv_get_path_parms(IN osm_mpr_rcv_t * const p_rcv, required_pkey = osm_physp_find_common_pkey(p_src_physp, p_dest_physp); if (!required_pkey) { - osm_log(p_rcv->p_log, OSM_LOG_ERROR, + osm_log(sa->p_log, OSM_LOG_ERROR, "__osm_mpr_rcv_get_path_parms: ERR 4519: " "Ports do not have any shared PKeys\n" "\t\tsrc %" PRIx64 " dst %" PRIx64 "\n", @@ -709,11 +669,11 @@ __osm_mpr_rcv_get_path_parms(IN 
osm_mpr_rcv_t * const p_rcv, if (required_pkey) { p_prtn = - (osm_prtn_t *) cl_qmap_get(&p_rcv->p_subn->prtn_pkey_tbl, + (osm_prtn_t *) cl_qmap_get(&sa->p_subn->prtn_pkey_tbl, required_pkey & cl_ntoh16((uint16_t) ~ 0x8000)); if (p_prtn == - (osm_prtn_t *) cl_qmap_end(&p_rcv->p_subn->prtn_pkey_tbl)) + (osm_prtn_t *) cl_qmap_end(&sa->p_subn->prtn_pkey_tbl)) p_prtn = NULL; } @@ -729,7 +689,7 @@ __osm_mpr_rcv_get_path_parms(IN osm_mpr_rcv_t * const p_rcv, if (p_qos_level && p_qos_level->sl_set && p_qos_level->sl != required_sl) { - osm_log(p_rcv->p_log, OSM_LOG_ERROR, + osm_log(sa->p_log, OSM_LOG_ERROR, "__osm_mpr_rcv_get_path_parms: ERR 451E: " "QoS constaraints: required MultiPathRecord SL (%u) " "doesn't match QoS policy SL (%u)\n", @@ -746,7 +706,7 @@ __osm_mpr_rcv_get_path_parms(IN osm_mpr_rcv_t * const p_rcv, required_sl = p_qos_level->sl; if (required_pkey && p_prtn && p_prtn->sl != p_qos_level->sl) - osm_log(p_rcv->p_log, OSM_LOG_DEBUG, + osm_log(sa->p_log, OSM_LOG_DEBUG, "__osm_mpr_rcv_get_path_parms: " "QoS level SL (%u) overrides partition SL (%u)\n", p_qos_level->sl, p_prtn->sl); @@ -756,21 +716,21 @@ __osm_mpr_rcv_get_path_parms(IN osm_mpr_rcv_t * const p_rcv, * No specific SL in request or in QoS level - use partition SL */ p_prtn = - (osm_prtn_t *) cl_qmap_get(&p_rcv->p_subn->prtn_pkey_tbl, + (osm_prtn_t *) cl_qmap_get(&sa->p_subn->prtn_pkey_tbl, required_pkey & cl_ntoh16((uint16_t) ~ 0x8000)); if (!p_prtn) { required_sl = OSM_DEFAULT_SL; /* this may be possible when pkey tables are created somehow in previous runs or things are going wrong here */ - osm_log(p_rcv->p_log, OSM_LOG_ERROR, + osm_log(sa->p_log, OSM_LOG_ERROR, "__osm_mpr_rcv_get_path_parms: ERR 451A: " "No partition found for PKey 0x%04x - using default SL %d\n", cl_ntoh16(required_pkey), required_sl); } else required_sl = p_prtn->sl; - } else if (p_rcv->p_subn->opt.qos) { + } else if (sa->p_subn->opt.qos) { if (valid_sl_mask & (1 << OSM_DEFAULT_SL)) required_sl = OSM_DEFAULT_SL; else { @@ 
-782,8 +742,8 @@ __osm_mpr_rcv_get_path_parms(IN osm_mpr_rcv_t * const p_rcv, } else required_sl = OSM_DEFAULT_SL; - if (p_rcv->p_subn->opt.qos && !(valid_sl_mask & (1 << required_sl))) { - osm_log(p_rcv->p_log, OSM_LOG_ERROR, + if (sa->p_subn->opt.qos && !(valid_sl_mask & (1 << required_sl))) { + osm_log(sa->p_log, OSM_LOG_ERROR, "__osm_mpr_rcv_get_path_parms: ERR 451F: " "Selected SL (%u) leads to VL15\n", required_sl); status = IB_NOT_FOUND; @@ -802,22 +762,22 @@ __osm_mpr_rcv_get_path_parms(IN osm_mpr_rcv_t * const p_rcv, p_parms->sl = required_sl; p_parms->hops = hops; - if (osm_log_is_active(p_rcv->p_log, OSM_LOG_DEBUG)) - osm_log(p_rcv->p_log, OSM_LOG_DEBUG, + if (osm_log_is_active(sa->p_log, OSM_LOG_DEBUG)) + osm_log(sa->p_log, OSM_LOG_DEBUG, "__osm_mpr_rcv_get_path_parms: MultiPath params:" " mtu = %u, rate = %u, packet lifetime = %u," " pkey = 0x%04X, sl = %u, hops = %u\n", mtu, rate, pkt_life, cl_ntoh16(required_pkey), required_sl, hops); Exit: - OSM_LOG_EXIT(p_rcv->p_log); + OSM_LOG_EXIT(sa->p_log); return (status); } /********************************************************************** **********************************************************************/ static void -__osm_mpr_rcv_build_pr(IN osm_mpr_rcv_t * const p_rcv, +__osm_mpr_rcv_build_pr(IN osm_sa_t * sa, IN const osm_port_t * const p_src_port, IN const osm_port_t * const p_dest_port, IN const uint16_t src_lid_ho, @@ -829,7 +789,7 @@ __osm_mpr_rcv_build_pr(IN osm_mpr_rcv_t * const p_rcv, const osm_physp_t *p_src_physp; const osm_physp_t *p_dest_physp; - OSM_LOG_ENTER(p_rcv->p_log, __osm_mpr_rcv_build_pr); + OSM_LOG_ENTER(sa->p_log, __osm_mpr_rcv_build_pr); p_src_physp = p_src_port->p_physp; p_dest_physp = p_dest_port->p_physp; @@ -864,13 +824,13 @@ __osm_mpr_rcv_build_pr(IN osm_mpr_rcv_t * const p_rcv, if (p_parms->reversible) p_pr->num_path = 0x80; - OSM_LOG_EXIT(p_rcv->p_log); + OSM_LOG_EXIT(sa->p_log); } /********************************************************************** 
**********************************************************************/ static osm_mpr_item_t * -__osm_mpr_rcv_get_lid_pair_path(IN osm_mpr_rcv_t * const p_rcv, +__osm_mpr_rcv_get_lid_pair_path(IN osm_sa_t * sa, IN const ib_multipath_rec_t * const p_mpr, IN const osm_port_t * const p_src_port, IN const osm_port_t * const p_dest_port, @@ -884,24 +844,24 @@ __osm_mpr_rcv_get_lid_pair_path(IN osm_mpr_rcv_t * const p_rcv, osm_mpr_item_t *p_pr_item; ib_api_status_t status, rev_path_status; - OSM_LOG_ENTER(p_rcv->p_log, __osm_mpr_rcv_get_lid_pair_path); + OSM_LOG_ENTER(sa->p_log, __osm_mpr_rcv_get_lid_pair_path); - if (osm_log_is_active(p_rcv->p_log, OSM_LOG_DEBUG)) - osm_log(p_rcv->p_log, OSM_LOG_DEBUG, + if (osm_log_is_active(sa->p_log, OSM_LOG_DEBUG)) + osm_log(sa->p_log, OSM_LOG_DEBUG, "__osm_mpr_rcv_get_lid_pair_path: " "Src LID 0x%X, Dest LID 0x%X\n", src_lid_ho, dest_lid_ho); p_pr_item = malloc(sizeof(*p_pr_item)); if (p_pr_item == NULL) { - osm_log(p_rcv->p_log, OSM_LOG_ERROR, + osm_log(sa->p_log, OSM_LOG_ERROR, "__osm_mpr_rcv_get_lid_pair_path: ERR 4501: " "Unable to allocate path record\n"); goto Exit; } memset(p_pr_item, 0, sizeof(*p_pr_item)); - status = __osm_mpr_rcv_get_path_parms(p_rcv, p_mpr, p_src_port, + status = __osm_mpr_rcv_get_path_parms(sa, p_mpr, p_src_port, p_dest_port, dest_lid_ho, comp_mask, &path_parms); @@ -913,7 +873,7 @@ __osm_mpr_rcv_get_lid_pair_path(IN osm_mpr_rcv_t * const p_rcv, /* now try the reversible path */ rev_path_status = - __osm_mpr_rcv_get_path_parms(p_rcv, p_mpr, p_dest_port, p_src_port, + __osm_mpr_rcv_get_path_parms(sa, p_mpr, p_dest_port, p_src_port, src_lid_ho, comp_mask, &rev_path_parms); path_parms.reversible = (rev_path_status == IB_SUCCESS); @@ -926,7 +886,7 @@ __osm_mpr_rcv_get_lid_pair_path(IN osm_mpr_rcv_t * const p_rcv, */ if (comp_mask & IB_MPR_COMPMASK_REVERSIBLE) { if ((!path_parms.reversible && (p_mpr->num_path & 0x80))) { - osm_log(p_rcv->p_log, OSM_LOG_DEBUG, + osm_log(sa->p_log, OSM_LOG_DEBUG, 
"__osm_mpr_rcv_get_lid_pair_path: " "Requested reversible path but failed to get one\n"); @@ -940,19 +900,19 @@ __osm_mpr_rcv_get_lid_pair_path(IN osm_mpr_rcv_t * const p_rcv, p_pr_item->p_dest_port = p_dest_port; p_pr_item->hops = path_parms.hops; - __osm_mpr_rcv_build_pr(p_rcv, p_src_port, p_dest_port, src_lid_ho, + __osm_mpr_rcv_build_pr(sa, p_src_port, p_dest_port, src_lid_ho, dest_lid_ho, preference, &path_parms, &p_pr_item->path_rec); Exit: - OSM_LOG_EXIT(p_rcv->p_log); + OSM_LOG_EXIT(sa->p_log); return (p_pr_item); } /********************************************************************** **********************************************************************/ static uint32_t -__osm_mpr_rcv_get_port_pair_paths(IN osm_mpr_rcv_t * const p_rcv, +__osm_mpr_rcv_get_port_pair_paths(IN osm_sa_t * sa, IN const ib_multipath_rec_t * const p_mpr, IN const osm_port_t * const p_req_port, IN const osm_port_t * const p_src_port, @@ -973,10 +933,10 @@ __osm_mpr_rcv_get_port_pair_paths(IN osm_mpr_rcv_t * const p_rcv, uintn_t src_offset; uintn_t dest_offset; - OSM_LOG_ENTER(p_rcv->p_log, __osm_mpr_rcv_get_port_pair_paths); + OSM_LOG_ENTER(sa->p_log, __osm_mpr_rcv_get_port_pair_paths); - if (osm_log_is_active(p_rcv->p_log, OSM_LOG_DEBUG)) - osm_log(p_rcv->p_log, OSM_LOG_DEBUG, + if (osm_log_is_active(sa->p_log, OSM_LOG_DEBUG)) + osm_log(sa->p_log, OSM_LOG_DEBUG, "__osm_mpr_rcv_get_port_pair_paths: " "Src port 0x%016" PRIx64 ", " "Dst port 0x%016" PRIx64 "\n", @@ -985,10 +945,10 @@ __osm_mpr_rcv_get_port_pair_paths(IN osm_mpr_rcv_t * const p_rcv, /* Check that the req_port, src_port and dest_port all share a pkey. The check is done on the default physical port of the ports. 
*/ - if (osm_port_share_pkey(p_rcv->p_log, p_req_port, p_src_port) == FALSE - || osm_port_share_pkey(p_rcv->p_log, p_req_port, + if (osm_port_share_pkey(sa->p_log, p_req_port, p_src_port) == FALSE + || osm_port_share_pkey(sa->p_log, p_req_port, p_dest_port) == FALSE - || osm_port_share_pkey(p_rcv->p_log, p_src_port, + || osm_port_share_pkey(sa->p_log, p_src_port, p_dest_port) == FALSE) /* One of the pairs doesn't share a pkey so the path is disqualified. */ goto Exit; @@ -1042,8 +1002,8 @@ __osm_mpr_rcv_get_port_pair_paths(IN osm_mpr_rcv_t * const p_rcv, osm_port_get_lid_range_ho(p_dest_port, &dest_lid_min_ho, &dest_lid_max_ho); - if (osm_log_is_active(p_rcv->p_log, OSM_LOG_DEBUG)) - osm_log(p_rcv->p_log, OSM_LOG_DEBUG, + if (osm_log_is_active(sa->p_log, OSM_LOG_DEBUG)) + osm_log(sa->p_log, OSM_LOG_DEBUG, "__osm_mpr_rcv_get_port_pair_paths: " "Src LID [0x%X-0x%X], " "Dest LID [0x%X-0x%X]\n", @@ -1062,7 +1022,7 @@ __osm_mpr_rcv_get_port_pair_paths(IN osm_mpr_rcv_t * const p_rcv, /* These paths are "fully redundant" */ - p_pr_item = __osm_mpr_rcv_get_lid_pair_path(p_rcv, p_mpr, + p_pr_item = __osm_mpr_rcv_get_lid_pair_path(sa, p_mpr, p_src_port, p_dest_port, src_lid_ho, @@ -1128,7 +1088,7 @@ __osm_mpr_rcv_get_port_pair_paths(IN osm_mpr_rcv_t * const p_rcv, if (src_offset == dest_offset) continue; /* already reported */ - p_pr_item = __osm_mpr_rcv_get_lid_pair_path(p_rcv, p_mpr, + p_pr_item = __osm_mpr_rcv_get_lid_pair_path(sa, p_mpr, p_src_port, p_dest_port, src_lid_ho, @@ -1143,7 +1103,7 @@ __osm_mpr_rcv_get_port_pair_paths(IN osm_mpr_rcv_t * const p_rcv, } Exit: - OSM_LOG_EXIT(p_rcv->p_log); + OSM_LOG_EXIT(sa->p_log); return path_num; } @@ -1153,7 +1113,7 @@ __osm_mpr_rcv_get_port_pair_paths(IN osm_mpr_rcv_t * const p_rcv, /********************************************************************** **********************************************************************/ static osm_mpr_item_t * -__osm_mpr_rcv_get_apm_port_pair_paths(IN osm_mpr_rcv_t * const p_rcv, 
+__osm_mpr_rcv_get_apm_port_pair_paths(IN osm_sa_t * sa, IN const ib_multipath_rec_t * const p_mpr, IN const osm_port_t * const p_src_port, IN const osm_port_t * const p_dest_port, @@ -1171,10 +1131,10 @@ __osm_mpr_rcv_get_apm_port_pair_paths(IN osm_mpr_rcv_t * const p_rcv, uintn_t iterations; int src_lids, dest_lids; - OSM_LOG_ENTER(p_rcv->p_log, __osm_mpr_rcv_get_apm_port_pair_paths); + OSM_LOG_ENTER(sa->p_log, __osm_mpr_rcv_get_apm_port_pair_paths); - if (osm_log_is_active(p_rcv->p_log, OSM_LOG_DEBUG)) - osm_log(p_rcv->p_log, OSM_LOG_DEBUG, + if (osm_log_is_active(sa->p_log, OSM_LOG_DEBUG)) + osm_log(sa->p_log, OSM_LOG_DEBUG, "__osm_mpr_rcv_get_apm_port_pair_paths: " "Src port 0x%016" PRIx64 ", " "Dst port 0x%016" PRIx64 ", base offs %d\n", @@ -1194,7 +1154,7 @@ __osm_mpr_rcv_get_apm_port_pair_paths(IN osm_mpr_rcv_t * const p_rcv, src_lid_ho += base_offs % src_lids; dest_lid_ho += base_offs % dest_lids; - osm_log(p_rcv->p_log, OSM_LOG_DEBUG, + osm_log(sa->p_log, OSM_LOG_DEBUG, "__osm_mpr_rcv_get_apm_port_pair_paths: " "Src LIDs [0x%X-0x%X] hashed %d, " "Dest LIDs [0x%X-0x%X] hashed %d\n", @@ -1207,7 +1167,7 @@ __osm_mpr_rcv_get_apm_port_pair_paths(IN osm_mpr_rcv_t * const p_rcv, /* These paths are "fully redundant" */ - p_pr_item = __osm_mpr_rcv_get_lid_pair_path(p_rcv, p_mpr, + p_pr_item = __osm_mpr_rcv_get_lid_pair_path(sa, p_mpr, p_src_port, p_dest_port, src_lid_ho, @@ -1215,7 +1175,7 @@ __osm_mpr_rcv_get_apm_port_pair_paths(IN osm_mpr_rcv_t * const p_rcv, comp_mask, 0); if (p_pr_item) { - osm_log(p_rcv->p_log, OSM_LOG_DEBUG, + osm_log(sa->p_log, OSM_LOG_DEBUG, "__osm_mpr_rcv_get_apm_port_pair_paths: " "Found matching path from Src LID 0x%X to Dest LID 0x%X with %d hops\n", src_lid_ho, dest_lid_ho, p_pr_item->hops); @@ -1229,14 +1189,14 @@ __osm_mpr_rcv_get_apm_port_pair_paths(IN osm_mpr_rcv_t * const p_rcv, dest_lid_ho = dest_lid_min_ho; } - OSM_LOG_EXIT(p_rcv->p_log); + OSM_LOG_EXIT(sa->p_log); return p_pr_item; } 
 /**********************************************************************
  **********************************************************************/
 static ib_net16_t
-__osm_mpr_rcv_get_gids(IN osm_mpr_rcv_t * const p_rcv,
+__osm_mpr_rcv_get_gids(IN osm_sa_t * sa,
 		       IN const ib_gid_t * gids, IN int ngids,
 		       IN int is_sgid, OUT osm_port_t ** pp_port)
 {
@@ -1244,19 +1204,19 @@ __osm_mpr_rcv_get_gids(IN osm_mpr_rcv_t * const p_rcv,
 	ib_net16_t ib_status = IB_SUCCESS;
 	int i;

-	OSM_LOG_ENTER(p_rcv->p_log, __osm_mpr_rcv_get_gids);
+	OSM_LOG_ENTER(sa->p_log, __osm_mpr_rcv_get_gids);

 	for (i = 0; i < ngids; i++, gids++) {
 		if (!ib_gid_is_link_local(gids)) {
 			if ((is_sgid && ib_gid_is_multicast(gids)) ||
 			    (ib_gid_get_subnet_prefix(gids) !=
-			     p_rcv->p_subn->opt.subnet_prefix)) {
+			     sa->p_subn->opt.subnet_prefix)) {
 				/*
 				   This 'error' is the client's fault (bad gid) so
 				   don't enter it as an error in our own log.
 				   Return an error response to the client.
 				 */
-				osm_log(p_rcv->p_log, OSM_LOG_VERBOSE,
+				osm_log(sa->p_log, OSM_LOG_VERBOSE,
 					"__osm_mpr_rcv_get_gids: ERR 451B: "
 					"%sGID 0x%016" PRIx64
 					" is multicast or non local subnet prefix\n",
@@ -1269,7 +1229,7 @@ __osm_mpr_rcv_get_gids(IN osm_mpr_rcv_t * const p_rcv,
 		}

 		p_port =
-		    osm_get_port_by_guid(p_rcv->p_subn,
+		    osm_get_port_by_guid(sa->p_subn,
 					 gids->unicast.interface_id);
 		if (!p_port) {
 			/*
@@ -1277,7 +1237,7 @@ __osm_mpr_rcv_get_gids(IN osm_mpr_rcv_t * const p_rcv,
 			   don't enter it as an error in our own log.
 			   Return an error response to the client.
 			 */
-			osm_log(p_rcv->p_log, OSM_LOG_ERROR,
+			osm_log(sa->p_log, OSM_LOG_ERROR,
 				"__osm_mpr_rcv_get_gids: ERR 4506: "
 				"No port with GUID 0x%016" PRIx64 "\n",
 				cl_ntoh64(gids->unicast.interface_id));
@@ -1290,7 +1250,7 @@ __osm_mpr_rcv_get_gids(IN osm_mpr_rcv_t * const p_rcv,
 	}

 Exit:
-	OSM_LOG_EXIT(p_rcv->p_log);
+	OSM_LOG_EXIT(sa->p_log);
 	return ib_status;
 }

@@ -1298,7 +1258,7 @@ __osm_mpr_rcv_get_gids(IN osm_mpr_rcv_t * const p_rcv,
 /**********************************************************************
  **********************************************************************/
 static ib_net16_t
-__osm_mpr_rcv_get_end_points(IN osm_mpr_rcv_t * const p_rcv,
+__osm_mpr_rcv_get_end_points(IN osm_sa_t * sa,
 			     IN const osm_madw_t * const p_madw,
 			     OUT osm_port_t ** pp_ports,
 			     OUT int *nsrc, OUT int *ndest)
@@ -1309,7 +1269,7 @@ __osm_mpr_rcv_get_end_points(IN osm_mpr_rcv_t * const p_rcv,
 	ib_net16_t sa_status = IB_SA_MAD_STATUS_SUCCESS;
 	ib_gid_t *gids;

-	OSM_LOG_ENTER(p_rcv->p_log, __osm_mpr_rcv_get_end_points);
+	OSM_LOG_ENTER(sa->p_log, __osm_mpr_rcv_get_end_points);

 	/*
 	   Determine what fields are valid and then get a pointer
@@ -1332,7 +1292,7 @@ __osm_mpr_rcv_get_end_points(IN osm_mpr_rcv_t * const p_rcv,
 		if (*nsrc > IB_MULTIPATH_MAX_GIDS)
 			*nsrc = IB_MULTIPATH_MAX_GIDS;
 		sa_status =
-		    __osm_mpr_rcv_get_gids(p_rcv, gids, *nsrc, 1, pp_ports);
+		    __osm_mpr_rcv_get_gids(sa, gids, *nsrc, 1, pp_ports);
 		if (sa_status != IB_SUCCESS)
 			goto Exit;
 	}
@@ -1342,12 +1302,12 @@ __osm_mpr_rcv_get_end_points(IN osm_mpr_rcv_t * const p_rcv,
 		if (*ndest + *nsrc > IB_MULTIPATH_MAX_GIDS)
 			*ndest = IB_MULTIPATH_MAX_GIDS - *nsrc;
 		sa_status =
-		    __osm_mpr_rcv_get_gids(p_rcv, gids + *nsrc, *ndest, 0,
+		    __osm_mpr_rcv_get_gids(sa, gids + *nsrc, *ndest, 0,
 					   pp_ports + *nsrc);
 	}

 Exit:
-	OSM_LOG_EXIT(p_rcv->p_log);
+	OSM_LOG_EXIT(sa->p_log);
 	return (sa_status);
 }

@@ -1357,7 +1317,7 @@ __osm_mpr_rcv_get_end_points(IN osm_mpr_rcv_t * const p_rcv,
 /**********************************************************************
  **********************************************************************/
 static void
-__osm_mpr_rcv_get_apm_paths(IN osm_mpr_rcv_t * const p_rcv,
+__osm_mpr_rcv_get_apm_paths(IN osm_sa_t * sa,
 			    IN const ib_multipath_rec_t * const p_mpr,
 			    IN const osm_port_t * const p_req_port,
 			    IN osm_port_t ** _pp_ports,
@@ -1369,7 +1329,7 @@ __osm_mpr_rcv_get_apm_paths(IN osm_mpr_rcv_t * const p_rcv,
 	int base_offs, src_lid_ho, dest_lid_ho;
 	int sumA, sumB, minA, minB;

-	OSM_LOG_ENTER(p_rcv->p_log, __osm_mpr_rcv_get_apm_paths);
+	OSM_LOG_ENTER(sa->p_log, __osm_mpr_rcv_get_apm_paths);

 	/*
 	 * We want to:
@@ -1404,27 +1364,27 @@ __osm_mpr_rcv_get_apm_paths(IN osm_mpr_rcv_t * const p_rcv,
 	dest_lid_ho = osm_port_get_base_lid(pp_ports[2]);

 	base_offs = src_lid_ho < dest_lid_ho ?
-	    __hash_lids(src_lid_ho, dest_lid_ho, p_rcv->p_subn->opt.lmc) :
-	    __hash_lids(dest_lid_ho, src_lid_ho, p_rcv->p_subn->opt.lmc);
+	    __hash_lids(src_lid_ho, dest_lid_ho, sa->p_subn->opt.lmc) :
+	    __hash_lids(dest_lid_ho, src_lid_ho, sa->p_subn->opt.lmc);

 	matrix[0][0] =
-	    __osm_mpr_rcv_get_apm_port_pair_paths(p_rcv, p_mpr, pp_ports[0],
+	    __osm_mpr_rcv_get_apm_port_pair_paths(sa, p_mpr, pp_ports[0],
 						  pp_ports[2], base_offs,
 						  comp_mask, p_list);
 	matrix[0][1] =
-	    __osm_mpr_rcv_get_apm_port_pair_paths(p_rcv, p_mpr, pp_ports[0],
+	    __osm_mpr_rcv_get_apm_port_pair_paths(sa, p_mpr, pp_ports[0],
 						  pp_ports[3], base_offs,
 						  comp_mask, p_list);
 	matrix[1][0] =
-	    __osm_mpr_rcv_get_apm_port_pair_paths(p_rcv, p_mpr, pp_ports[1],
+	    __osm_mpr_rcv_get_apm_port_pair_paths(sa, p_mpr, pp_ports[1],
 						  pp_ports[2], base_offs + 1,
 						  comp_mask, p_list);
 	matrix[1][1] =
-	    __osm_mpr_rcv_get_apm_port_pair_paths(p_rcv, p_mpr, pp_ports[1],
+	    __osm_mpr_rcv_get_apm_port_pair_paths(sa, p_mpr, pp_ports[1],
 						  pp_ports[3], base_offs + 1,
 						  comp_mask, p_list);

-	osm_log(p_rcv->p_log, OSM_LOG_DEBUG, "__osm_mpr_rcv_get_apm_paths: "
+	osm_log(sa->p_log, OSM_LOG_DEBUG, "__osm_mpr_rcv_get_apm_paths: "
 		"APM matrix:\n"
 		"\t{0,0} 0x%X->0x%X (%d)\t| {0,1} 0x%X->0x%X (%d)\n"
 		"\t{1,0} 0x%X->0x%X (%d)\t| {1,1} 0x%X->0x%X (%d)\n",
@@ -1446,7 +1406,7 @@ __osm_mpr_rcv_get_apm_paths(IN osm_mpr_rcv_t * const p_rcv,
 	/* and the winner is... */
 	if (minA <= minB || (minA == minB && sumA < sumB)) {
 		/* Diag A */
-		osm_log(p_rcv->p_log, OSM_LOG_DEBUG,
+		osm_log(sa->p_log, OSM_LOG_DEBUG,
 			"__osm_mpr_rcv_get_apm_paths: "
 			"Diag {0,0} & {1,1} is the best:\n"
 			"\t{0,0} 0x%X->0x%X (%d)\t & {1,1} 0x%X->0x%X (%d)\n",
@@ -1460,7 +1420,7 @@ __osm_mpr_rcv_get_apm_paths(IN osm_mpr_rcv_t * const p_rcv,
 		free(matrix[1][0]);
 	} else {
 		/* Diag B */
-		osm_log(p_rcv->p_log, OSM_LOG_DEBUG,
+		osm_log(sa->p_log, OSM_LOG_DEBUG,
 			"__osm_mpr_rcv_get_apm_paths: "
 			"Diag {0,1} & {1,0} is the best:\n"
 			"\t{0,1} 0x%X->0x%X (%d)\t & {1,0} 0x%X->0x%X (%d)\n",
@@ -1474,13 +1434,13 @@ __osm_mpr_rcv_get_apm_paths(IN osm_mpr_rcv_t * const p_rcv,
 		free(matrix[1][1]);
 	}

-	OSM_LOG_EXIT(p_rcv->p_log);
+	OSM_LOG_EXIT(sa->p_log);
 }

 /**********************************************************************
  **********************************************************************/
 static void
-__osm_mpr_rcv_process_pairs(IN osm_mpr_rcv_t * const p_rcv,
+__osm_mpr_rcv_process_pairs(IN osm_sa_t * sa,
 			    IN const ib_multipath_rec_t * const p_mpr,
 			    IN osm_port_t * const p_req_port,
 			    IN osm_port_t ** pp_ports,
@@ -1493,7 +1453,7 @@ __osm_mpr_rcv_process_pairs(IN osm_mpr_rcv_t * const p_rcv,
 	osm_port_t **pp_dest_port, **pp_ed;
 	uint32_t max_paths, num_paths, total_paths = 0;

-	OSM_LOG_ENTER(p_rcv->p_log, __osm_mpr_rcv_process_pairs);
+	OSM_LOG_ENTER(sa->p_log, __osm_mpr_rcv_process_pairs);

 	if (comp_mask & IB_MPR_COMPMASK_NUMBPATH)
 		max_paths = p_mpr->num_path & 0x7F;
@@ -1505,7 +1465,7 @@ __osm_mpr_rcv_process_pairs(IN osm_mpr_rcv_t * const p_rcv,
 		for (pp_dest_port = pp_es, pp_ed = pp_es + ndest;
 		     pp_dest_port < pp_ed; pp_dest_port++) {
 			num_paths =
-			    __osm_mpr_rcv_get_port_pair_paths(p_rcv, p_mpr,
+			    __osm_mpr_rcv_get_port_pair_paths(sa, p_mpr,
 							      p_req_port,
 							      *pp_src_port,
 							      *pp_dest_port,
@@ -1514,7 +1474,7 @@ __osm_mpr_rcv_process_pairs(IN osm_mpr_rcv_t * const p_rcv,
 							      comp_mask,
 							      p_list);
 			total_paths += num_paths;
-			osm_log(p_rcv->p_log, OSM_LOG_DEBUG,
+			osm_log(sa->p_log, OSM_LOG_DEBUG,
 				"__osm_mpr_rcv_process_pairs: "
 				"%d paths %d total paths %d max paths\n",
 				num_paths, total_paths, max_paths);
@@ -1525,13 +1485,13 @@ __osm_mpr_rcv_process_pairs(IN osm_mpr_rcv_t * const p_rcv,
 	}

 Exit:
-	OSM_LOG_EXIT(p_rcv->p_log);
+	OSM_LOG_EXIT(sa->p_log);
 }

 /**********************************************************************
  **********************************************************************/
 static void
-__osm_mpr_rcv_respond(IN osm_mpr_rcv_t * const p_rcv,
+__osm_mpr_rcv_respond(IN osm_sa_t * sa,
 		      IN const osm_madw_t * const p_madw,
 		      IN cl_qlist_t * const p_list)
 {
@@ -1546,14 +1506,14 @@ __osm_mpr_rcv_respond(IN osm_mpr_rcv_t * const p_rcv,
 	osm_mpr_item_t *p_mpr_item;
 	uint32_t i;

-	OSM_LOG_ENTER(p_rcv->p_log, __osm_mpr_rcv_respond);
+	OSM_LOG_ENTER(sa->p_log, __osm_mpr_rcv_respond);

 	p_sa_mad = osm_madw_get_sa_mad_ptr(p_madw);
 	p_mpr = (ib_multipath_rec_t *) ib_sa_mad_get_payload_ptr(p_sa_mad);

 	num_rec = cl_qlist_count(p_list);

-	osm_log(p_rcv->p_log, OSM_LOG_DEBUG,
+	osm_log(sa->p_log, OSM_LOG_DEBUG,
 		"__osm_mpr_rcv_respond: "
 		"Generating response with %zu records\n", num_rec);

@@ -1562,11 +1522,11 @@ __osm_mpr_rcv_respond(IN osm_mpr_rcv_t * const p_rcv,
 	/*
 	   Get a MAD to reply. Address of Mad is in the received mad_wrapper
 	 */
-	p_resp_madw = osm_mad_pool_get(p_rcv->p_mad_pool, p_madw->h_bind,
+	p_resp_madw = osm_mad_pool_get(sa->p_mad_pool, p_madw->h_bind,
 				       mad_size, &p_madw->mad_addr);

 	if (!p_resp_madw) {
-		osm_log(p_rcv->p_log, OSM_LOG_ERROR,
+		osm_log(sa->p_log, OSM_LOG_ERROR,
 			"__osm_mpr_rcv_respond: "
 			"ERR 4502: Unable to allocate MAD\n");

@@ -1576,7 +1536,7 @@ __osm_mpr_rcv_respond(IN osm_mpr_rcv_t * const p_rcv,
 			free(p_mpr_item);
 		}

-		osm_sa_send_error(p_rcv->p_resp, p_madw,
+		osm_sa_send_error(sa, p_madw,
 				  IB_SA_MAD_STATUS_NO_RESOURCES);
 		goto Exit;
 	}
@@ -1614,27 +1574,27 @@ __osm_mpr_rcv_respond(IN osm_mpr_rcv_t * const p_rcv,

 	CL_ASSERT(cl_is_qlist_empty(p_list));

-	osm_dump_sa_mad(p_rcv->p_log, p_resp_sa_mad, OSM_LOG_FRAMES);
+	osm_dump_sa_mad(sa->p_log, p_resp_sa_mad, OSM_LOG_FRAMES);

 	status = osm_sa_vendor_send(p_resp_madw->h_bind, p_resp_madw, FALSE,
-				    p_rcv->p_subn);
+				    sa->p_subn);

 	if (status != IB_SUCCESS) {
-		osm_log(p_rcv->p_log, OSM_LOG_ERROR,
+		osm_log(sa->p_log, OSM_LOG_ERROR,
 			"__osm_mpr_rcv_respond: ERR 4507: "
 			"Unable to send MAD (%s)\n", ib_get_err_str(status));
-		/* osm_mad_pool_put( p_rcv->p_mad_pool, p_resp_madw ); */
+		/* osm_mad_pool_put( sa->p_mad_pool, p_resp_madw ); */
 	}

 Exit:
-	OSM_LOG_EXIT(p_rcv->p_log);
+	OSM_LOG_EXIT(sa->p_log);
 }

 /**********************************************************************
  **********************************************************************/
 void osm_mpr_rcv_process(IN void *context, IN void *data)
 {
-	osm_mpr_rcv_t *p_rcv = context;
+	osm_sa_t *sa = context;
 	osm_madw_t *p_madw = data;
 	const ib_multipath_rec_t *p_mpr;
 	const ib_sa_mad_t *p_sa_mad;
@@ -1644,7 +1604,7 @@ void osm_mpr_rcv_process(IN void *context, IN void *data)
 	ib_net16_t sa_status;
 	int nsrc, ndest;

-	OSM_LOG_ENTER(p_rcv->p_log, osm_mpr_rcv_process);
+	OSM_LOG_ENTER(sa->p_log, osm_mpr_rcv_process);

 	CL_ASSERT(p_madw);

@@ -1654,38 +1614,38 @@ void osm_mpr_rcv_process(IN void *context, IN void *data)
CL_ASSERT(p_sa_mad->attr_id == IB_MAD_ATTR_MULTIPATH_RECORD); if ((p_sa_mad->rmpp_flags & IB_RMPP_FLAG_ACTIVE) != IB_RMPP_FLAG_ACTIVE) { - osm_log(p_rcv->p_log, OSM_LOG_ERROR, + osm_log(sa->p_log, OSM_LOG_ERROR, "osm_mpr_rcv_process: ERR 4510: " "Invalid request since RMPP_FLAG_ACTIVE is not set\n"); - osm_sa_send_error(p_rcv->p_resp, p_madw, + osm_sa_send_error(sa, p_madw, IB_SA_MAD_STATUS_REQ_INVALID); goto Exit; } /* we only support SubnAdmGetMulti method */ if (p_sa_mad->method != IB_MAD_METHOD_GETMULTI) { - osm_log(p_rcv->p_log, OSM_LOG_ERROR, + osm_log(sa->p_log, OSM_LOG_ERROR, "osm_mpr_rcv_process: ERR 4513: " "Unsupported Method (%s)\n", ib_get_sa_method_str(p_sa_mad->method)); - osm_sa_send_error(p_rcv->p_resp, p_madw, + osm_sa_send_error(sa, p_madw, IB_MAD_STATUS_UNSUP_METHOD_ATTR); goto Exit; } /* update the requester physical port. */ - requester_port = osm_get_port_by_mad_addr(p_rcv->p_log, p_rcv->p_subn, + requester_port = osm_get_port_by_mad_addr(sa->p_log, sa->p_subn, osm_madw_get_mad_addr_ptr (p_madw)); if (requester_port == NULL) { - osm_log(p_rcv->p_log, OSM_LOG_ERROR, + osm_log(sa->p_log, OSM_LOG_ERROR, "osm_mpr_rcv_process: ERR 4517: " "Cannot find requester physical port\n"); goto Exit; } - if (osm_log_is_active(p_rcv->p_log, OSM_LOG_DEBUG)) - osm_dump_multipath_record(p_rcv->p_log, p_mpr, OSM_LOG_DEBUG); + if (osm_log_is_active(sa->p_log, OSM_LOG_DEBUG)) + osm_dump_multipath_record(sa->p_log, p_mpr, OSM_LOG_DEBUG); cl_qlist_init(&pr_list); @@ -1693,40 +1653,40 @@ void osm_mpr_rcv_process(IN void *context, IN void *data) Most SA functions (including this one) are read-only on the subnet object, so we grab the lock non-exclusively. 
*/ - cl_plock_acquire(p_rcv->p_lock); + cl_plock_acquire(sa->p_lock); - sa_status = __osm_mpr_rcv_get_end_points(p_rcv, p_madw, pp_ports, + sa_status = __osm_mpr_rcv_get_end_points(sa, p_madw, pp_ports, &nsrc, &ndest); if (sa_status != IB_SA_MAD_STATUS_SUCCESS || !nsrc || !ndest) { if (sa_status == IB_SA_MAD_STATUS_SUCCESS && (!nsrc || !ndest)) - osm_log(p_rcv->p_log, OSM_LOG_ERROR, + osm_log(sa->p_log, OSM_LOG_ERROR, "osm_mpr_rcv_process_cb: ERR 4512: " "__osm_mpr_rcv_get_end_points failed, not enough GIDs " "(nsrc %d ndest %d)\n", nsrc, ndest); - cl_plock_release(p_rcv->p_lock); + cl_plock_release(sa->p_lock); if (sa_status == IB_SA_MAD_STATUS_SUCCESS) - osm_sa_send_error(p_rcv->p_resp, p_madw, + osm_sa_send_error(sa, p_madw, IB_SA_MAD_STATUS_REQ_INVALID); else - osm_sa_send_error(p_rcv->p_resp, p_madw, sa_status); + osm_sa_send_error(sa, p_madw, sa_status); goto Exit; } /* APM request */ if (nsrc == 2 && ndest == 2 && (p_mpr->num_path & 0x7F) == 2) - __osm_mpr_rcv_get_apm_paths(p_rcv, p_mpr, requester_port, + __osm_mpr_rcv_get_apm_paths(sa, p_mpr, requester_port, pp_ports, p_sa_mad->comp_mask, &pr_list); else - __osm_mpr_rcv_process_pairs(p_rcv, p_mpr, requester_port, + __osm_mpr_rcv_process_pairs(sa, p_mpr, requester_port, pp_ports, nsrc, ndest, p_sa_mad->comp_mask, &pr_list); - cl_plock_release(p_rcv->p_lock); - __osm_mpr_rcv_respond(p_rcv, p_madw, &pr_list); + cl_plock_release(sa->p_lock); + __osm_mpr_rcv_respond(sa, p_madw, &pr_list); Exit: - OSM_LOG_EXIT(p_rcv->p_log); + OSM_LOG_EXIT(sa->p_log); } #endif diff --git a/opensm/opensm/osm_sa_node_record.c b/opensm/opensm/osm_sa_node_record.c index e78e827..a9a3708 100644 --- a/opensm/opensm/osm_sa_node_record.c +++ b/opensm/opensm/osm_sa_node_record.c @@ -53,9 +53,8 @@ #include #include #include -#include -#include #include +#include #include #include #include @@ -69,52 +68,14 @@ typedef struct _osm_nr_search_ctxt { const ib_node_record_t *p_rcvd_rec; ib_net64_t comp_mask; cl_qlist_t *p_list; - osm_nr_rcv_t 
*p_rcv; + osm_sa_t *sa; const osm_physp_t *p_req_physp; } osm_nr_search_ctxt_t; /********************************************************************** **********************************************************************/ -void osm_nr_rcv_construct(IN osm_nr_rcv_t * const p_rcv) -{ - memset(p_rcv, 0, sizeof(*p_rcv)); -} - -/********************************************************************** - **********************************************************************/ -void osm_nr_rcv_destroy(IN osm_nr_rcv_t * const p_rcv) -{ - OSM_LOG_ENTER(p_rcv->p_log, osm_nr_rcv_destroy); - OSM_LOG_EXIT(p_rcv->p_log); -} - -/********************************************************************** - **********************************************************************/ -ib_api_status_t -osm_nr_rcv_init(IN osm_nr_rcv_t * const p_rcv, - IN osm_sa_resp_t * const p_resp, - IN osm_mad_pool_t * const p_mad_pool, - IN osm_subn_t * const p_subn, - IN osm_log_t * const p_log, IN cl_plock_t * const p_lock) -{ - OSM_LOG_ENTER(p_log, osm_nr_rcv_init); - - osm_nr_rcv_construct(p_rcv); - - p_rcv->p_log = p_log; - p_rcv->p_subn = p_subn; - p_rcv->p_lock = p_lock; - p_rcv->p_resp = p_resp; - p_rcv->p_mad_pool = p_mad_pool; - - OSM_LOG_EXIT(p_log); - return IB_SUCCESS; -} - -/********************************************************************** - **********************************************************************/ static ib_api_status_t -__osm_nr_rcv_new_nr(IN osm_nr_rcv_t * const p_rcv, +__osm_nr_rcv_new_nr(IN osm_sa_t * sa, IN const osm_node_t * const p_node, IN cl_qlist_t * const p_list, IN ib_net64_t const port_guid, IN ib_net16_t const lid) @@ -122,19 +83,19 @@ __osm_nr_rcv_new_nr(IN osm_nr_rcv_t * const p_rcv, osm_nr_item_t *p_rec_item; ib_api_status_t status = IB_SUCCESS; - OSM_LOG_ENTER(p_rcv->p_log, __osm_nr_rcv_new_nr); + OSM_LOG_ENTER(sa->p_log, __osm_nr_rcv_new_nr); p_rec_item = malloc(sizeof(*p_rec_item)); if (p_rec_item == NULL) { - osm_log(p_rcv->p_log, OSM_LOG_ERROR, + 
osm_log(sa->p_log, OSM_LOG_ERROR, "__osm_nr_rcv_new_nr: ERR 1D02: " "rec_item alloc failed\n"); status = IB_INSUFFICIENT_RESOURCES; goto Exit; } - if (osm_log_is_active(p_rcv->p_log, OSM_LOG_DEBUG)) - osm_log(p_rcv->p_log, OSM_LOG_DEBUG, + if (osm_log_is_active(sa->p_log, OSM_LOG_DEBUG)) + osm_log(sa->p_log, OSM_LOG_DEBUG, "__osm_nr_rcv_new_nr: " "New NodeRecord: node 0x%016" PRIx64 "\n\t\t\t\tport 0x%016" PRIx64 ", lid 0x%X\n", @@ -153,14 +114,14 @@ __osm_nr_rcv_new_nr(IN osm_nr_rcv_t * const p_rcv, cl_qlist_insert_tail(p_list, &p_rec_item->list_item); Exit: - OSM_LOG_EXIT(p_rcv->p_log); + OSM_LOG_EXIT(sa->p_log); return (status); } /********************************************************************** **********************************************************************/ static void -__osm_nr_rcv_create_nr(IN osm_nr_rcv_t * const p_rcv, +__osm_nr_rcv_create_nr(IN osm_sa_t * sa, IN const osm_node_t * const p_node, IN cl_qlist_t * const p_list, IN ib_net64_t const match_port_guid, @@ -177,10 +138,10 @@ __osm_nr_rcv_create_nr(IN osm_nr_rcv_t * const p_rcv, uint8_t lmc; ib_net64_t port_guid; - OSM_LOG_ENTER(p_rcv->p_log, __osm_nr_rcv_create_nr); + OSM_LOG_ENTER(sa->p_log, __osm_nr_rcv_create_nr); - if (osm_log_is_active(p_rcv->p_log, OSM_LOG_DEBUG)) { - osm_log(p_rcv->p_log, OSM_LOG_DEBUG, + if (osm_log_is_active(sa->p_log, OSM_LOG_DEBUG)) { + osm_log(sa->p_log, OSM_LOG_DEBUG, "__osm_nr_rcv_create_nr: " "Looking for NodeRecord with LID: 0x%X GUID:0x%016" PRIx64 "\n", cl_ntoh16(match_lid), @@ -205,7 +166,7 @@ __osm_nr_rcv_create_nr(IN osm_nr_rcv_t * const p_rcv, /* Check to see if the found p_physp and the requester physp share a pkey. If not - continue */ - if (!osm_physp_share_pkey(p_rcv->p_log, p_physp, p_req_physp)) + if (!osm_physp_share_pkey(sa->p_log, p_physp, p_req_physp)) continue; port_guid = osm_physp_get_port_guid(p_physp); @@ -223,8 +184,8 @@ __osm_nr_rcv_create_nr(IN osm_nr_rcv_t * const p_rcv, /* We validate that the lid belongs to this node. 
*/ - if (osm_log_is_active(p_rcv->p_log, OSM_LOG_DEBUG)) { - osm_log(p_rcv->p_log, OSM_LOG_DEBUG, + if (osm_log_is_active(sa->p_log, OSM_LOG_DEBUG)) { + osm_log(sa->p_log, OSM_LOG_DEBUG, "__osm_nr_rcv_create_nr: " "Comparing LID: 0x%X <= 0x%X <= 0x%X\n", base_lid_ho, match_lid_ho, max_lid_ho); @@ -235,11 +196,11 @@ __osm_nr_rcv_create_nr(IN osm_nr_rcv_t * const p_rcv, continue; } - __osm_nr_rcv_new_nr(p_rcv, p_node, p_list, port_guid, base_lid); + __osm_nr_rcv_new_nr(sa, p_node, p_list, port_guid, base_lid); } - OSM_LOG_EXIT(p_rcv->p_log); + OSM_LOG_EXIT(sa->p_log); } /********************************************************************** @@ -252,14 +213,14 @@ __osm_nr_rcv_by_comp_mask(IN cl_map_item_t * const p_map_item, IN void *context) const osm_node_t *const p_node = (osm_node_t *) p_map_item; const ib_node_record_t *const p_rcvd_rec = p_ctxt->p_rcvd_rec; const osm_physp_t *const p_req_physp = p_ctxt->p_req_physp; - osm_nr_rcv_t *const p_rcv = p_ctxt->p_rcv; + osm_sa_t *sa = p_ctxt->sa; ib_net64_t const comp_mask = p_ctxt->comp_mask; ib_net64_t match_port_guid = 0; ib_net16_t match_lid = 0; - OSM_LOG_ENTER(p_ctxt->p_rcv->p_log, __osm_nr_rcv_by_comp_mask); + OSM_LOG_ENTER(p_ctxt->sa->p_log, __osm_nr_rcv_by_comp_mask); - osm_dump_node_info(p_ctxt->p_rcv->p_log, + osm_dump_node_info(p_ctxt->sa->p_log, &p_node->node_info, OSM_LOG_VERBOSE); if (comp_mask & IB_NR_COMPMASK_LID) @@ -269,8 +230,8 @@ __osm_nr_rcv_by_comp_mask(IN cl_map_item_t * const p_map_item, IN void *context) /* DEBUG TOP */ - if (osm_log_is_active(p_rcv->p_log, OSM_LOG_DEBUG)) { - osm_log(p_rcv->p_log, OSM_LOG_DEBUG, + if (osm_log_is_active(sa->p_log, OSM_LOG_DEBUG)) { + osm_log(sa->p_log, OSM_LOG_DEBUG, "__osm_nr_rcv_by_comp_mask: " "Looking for node 0x%016" PRIx64 ", found 0x%016" PRIx64 "\n", @@ -345,18 +306,18 @@ __osm_nr_rcv_by_comp_mask(IN cl_map_item_t * const p_map_item, IN void *context) goto Exit; } - __osm_nr_rcv_create_nr(p_rcv, p_node, p_ctxt->p_list, + __osm_nr_rcv_create_nr(sa, 
p_node, p_ctxt->p_list, match_port_guid, match_lid, p_req_physp); Exit: - OSM_LOG_EXIT(p_ctxt->p_rcv->p_log); + OSM_LOG_EXIT(p_ctxt->sa->p_log); } /********************************************************************** **********************************************************************/ void osm_nr_rcv_process(IN void *ctx, IN void *data) { - osm_nr_rcv_t *p_rcv = ctx; + osm_sa_t *sa = ctx; osm_madw_t *p_madw = data; const ib_sa_mad_t *p_rcvd_mad; const ib_node_record_t *p_rcvd_rec; @@ -374,9 +335,9 @@ void osm_nr_rcv_process(IN void *ctx, IN void *data) ib_api_status_t status; osm_physp_t *p_req_physp; - CL_ASSERT(p_rcv); + CL_ASSERT(sa); - OSM_LOG_ENTER(p_rcv->p_log, osm_nr_rcv_process); + OSM_LOG_ENTER(sa->p_log, osm_nr_rcv_process); CL_ASSERT(p_madw); @@ -388,44 +349,44 @@ void osm_nr_rcv_process(IN void *ctx, IN void *data) /* we only support SubnAdmGet and SubnAdmGetTable methods */ if ((p_rcvd_mad->method != IB_MAD_METHOD_GET) && (p_rcvd_mad->method != IB_MAD_METHOD_GETTABLE)) { - osm_log(p_rcv->p_log, OSM_LOG_ERROR, + osm_log(sa->p_log, OSM_LOG_ERROR, "osm_nr_rcv_process: ERR 1D05: " "Unsupported Method (%s)\n", ib_get_sa_method_str(p_rcvd_mad->method)); - osm_sa_send_error(p_rcv->p_resp, p_madw, + osm_sa_send_error(sa, p_madw, IB_MAD_STATUS_UNSUP_METHOD_ATTR); goto Exit; } /* update the requester physical port. 
*/ - p_req_physp = osm_get_physp_by_mad_addr(p_rcv->p_log, - p_rcv->p_subn, + p_req_physp = osm_get_physp_by_mad_addr(sa->p_log, + sa->p_subn, osm_madw_get_mad_addr_ptr (p_madw)); if (p_req_physp == NULL) { - osm_log(p_rcv->p_log, OSM_LOG_ERROR, + osm_log(sa->p_log, OSM_LOG_ERROR, "osm_nr_rcv_process: ERR 1D04: " "Cannot find requester physical port\n"); goto Exit; } - if (osm_log_is_active(p_rcv->p_log, OSM_LOG_DEBUG)) - osm_dump_node_record(p_rcv->p_log, p_rcvd_rec, OSM_LOG_DEBUG); + if (osm_log_is_active(sa->p_log, OSM_LOG_DEBUG)) + osm_dump_node_record(sa->p_log, p_rcvd_rec, OSM_LOG_DEBUG); cl_qlist_init(&rec_list); context.p_rcvd_rec = p_rcvd_rec; context.p_list = &rec_list; context.comp_mask = p_rcvd_mad->comp_mask; - context.p_rcv = p_rcv; + context.sa = sa; context.p_req_physp = p_req_physp; - cl_plock_acquire(p_rcv->p_lock); + cl_plock_acquire(sa->p_lock); - cl_qmap_apply_func(&p_rcv->p_subn->node_guid_tbl, + cl_qmap_apply_func(&sa->p_subn->node_guid_tbl, __osm_nr_rcv_by_comp_mask, &context); - cl_plock_release(p_rcv->p_lock); + cl_plock_release(sa->p_lock); num_rec = cl_qlist_count(&rec_list); @@ -434,11 +395,11 @@ void osm_nr_rcv_process(IN void *ctx, IN void *data) * If we do a SubnAdmGet and got more than one record it is an error ! */ if ((p_rcvd_mad->method == IB_MAD_METHOD_GET) && (num_rec > 1)) { - osm_log(p_rcv->p_log, OSM_LOG_ERROR, + osm_log(sa->p_log, OSM_LOG_ERROR, "osm_nr_rcv_process: ERR 1D03: " "Got more than one record for SubnAdmGet (%u)\n", num_rec); - osm_sa_send_error(p_rcv->p_resp, p_madw, + osm_sa_send_error(sa, p_madw, IB_SA_MAD_STATUS_TOO_MANY_RECORDS); /* need to set the mem free ... 
*/ @@ -458,7 +419,7 @@ void osm_nr_rcv_process(IN void *ctx, IN void *data) trim_num_rec = (MAD_BLOCK_SIZE - IB_SA_MAD_HDR_SIZE) / sizeof(ib_node_record_t); if (trim_num_rec < num_rec) { - osm_log(p_rcv->p_log, OSM_LOG_VERBOSE, + osm_log(sa->p_log, OSM_LOG_VERBOSE, "osm_nr_rcv_process: " "Number of records:%u trimmed to:%u to fit in one MAD\n", num_rec, trim_num_rec); @@ -466,11 +427,11 @@ void osm_nr_rcv_process(IN void *ctx, IN void *data) } #endif - osm_log(p_rcv->p_log, OSM_LOG_DEBUG, + osm_log(sa->p_log, OSM_LOG_DEBUG, "osm_nr_rcv_process: " "Returning %u records\n", num_rec); if ((p_rcvd_mad->method == IB_MAD_METHOD_GET) && (num_rec == 0)) { - osm_sa_send_error(p_rcv->p_resp, p_madw, + osm_sa_send_error(sa, p_madw, IB_SA_MAD_STATUS_NO_RECORDS); goto Exit; } @@ -478,13 +439,13 @@ void osm_nr_rcv_process(IN void *ctx, IN void *data) /* * Get a MAD to reply. Address of Mad is in the received mad_wrapper */ - p_resp_madw = osm_mad_pool_get(p_rcv->p_mad_pool, + p_resp_madw = osm_mad_pool_get(sa->p_mad_pool, p_madw->h_bind, num_rec * sizeof(ib_node_record_t) + IB_SA_MAD_HDR_SIZE, &p_madw->mad_addr); if (!p_resp_madw) { - osm_log(p_rcv->p_log, OSM_LOG_ERROR, + osm_log(sa->p_log, OSM_LOG_ERROR, "osm_nr_rcv_process: ERR 1D06: " "osm_mad_pool_get failed\n"); @@ -494,7 +455,7 @@ void osm_nr_rcv_process(IN void *ctx, IN void *data) free(p_rec_item); } - osm_sa_send_error(p_rcv->p_resp, p_madw, + osm_sa_send_error(sa, p_madw, IB_SA_MAD_STATUS_NO_RESOURCES); goto Exit; } @@ -545,9 +506,9 @@ void osm_nr_rcv_process(IN void *ctx, IN void *data) CL_ASSERT(cl_is_qlist_empty(&rec_list)); status = osm_sa_vendor_send(p_resp_madw->h_bind, p_resp_madw, FALSE, - p_rcv->p_subn); + sa->p_subn); if (status != IB_SUCCESS) { - osm_log(p_rcv->p_log, OSM_LOG_ERROR, + osm_log(sa->p_log, OSM_LOG_ERROR, "osm_nr_rcv_process: ERR 1D07: " "osm_sa_vendor_send status = %s\n", ib_get_err_str(status)); @@ -555,5 +516,5 @@ void osm_nr_rcv_process(IN void *ctx, IN void *data) } Exit: - 
OSM_LOG_EXIT(p_rcv->p_log); + OSM_LOG_EXIT(sa->p_log); } diff --git a/opensm/opensm/osm_sa_path_record.c b/opensm/opensm/osm_sa_path_record.c index 2ea6211..749a936 100644 --- a/opensm/opensm/osm_sa_path_record.c +++ b/opensm/opensm/osm_sa_path_record.c @@ -55,13 +55,11 @@ #include #include #include +#include #include -#include #include #include #include -#include -#include #include #include #include @@ -70,7 +68,6 @@ #include #include #include -#include #include extern uint8_t osm_get_lash_sl(osm_opensm_t * p_osm, @@ -94,7 +91,7 @@ typedef struct _osm_path_parms { typedef struct osm_sa_pr_mcmr_search_ctxt { ib_gid_t *p_mgid; osm_mgrp_t *p_mgrp; - osm_pr_rcv_t *p_rcv; + osm_sa_t *sa; } osm_sa_pr_mcmr_search_ctxt_t; static const ib_gid_t zero_gid = { {0x00, 0x00, 0x00, 0x00, @@ -105,44 +102,6 @@ static const ib_gid_t zero_gid = { {0x00, 0x00, 0x00, 0x00, /********************************************************************** **********************************************************************/ -void osm_pr_rcv_construct(IN osm_pr_rcv_t * const p_rcv) -{ - memset(p_rcv, 0, sizeof(*p_rcv)); -} - -/********************************************************************** - **********************************************************************/ -void osm_pr_rcv_destroy(IN osm_pr_rcv_t * const p_rcv) -{ - OSM_LOG_ENTER(p_rcv->p_log, osm_pr_rcv_destroy); - OSM_LOG_EXIT(p_rcv->p_log); -} - -/********************************************************************** - **********************************************************************/ -ib_api_status_t -osm_pr_rcv_init(IN osm_pr_rcv_t * const p_rcv, - IN osm_sa_resp_t * const p_resp, - IN osm_mad_pool_t * const p_mad_pool, - IN osm_subn_t * const p_subn, - IN osm_log_t * const p_log, IN cl_plock_t * const p_lock) -{ - OSM_LOG_ENTER(p_log, osm_pr_rcv_init); - - osm_pr_rcv_construct(p_rcv); - - p_rcv->p_log = p_log; - p_rcv->p_subn = p_subn; - p_rcv->p_lock = p_lock; - p_rcv->p_resp = p_resp; - p_rcv->p_mad_pool = p_mad_pool; - 
- OSM_LOG_EXIT(p_rcv->p_log); - return IB_SUCCESS; -} - -/********************************************************************** - **********************************************************************/ static inline boolean_t __osm_sa_path_rec_is_tavor_port(IN const osm_port_t * const p_port) { @@ -214,7 +173,7 @@ __osm_sa_path_rec_apply_tavor_mtu_limit(IN const ib_path_rec_t * const p_pr, /********************************************************************** **********************************************************************/ static ib_api_status_t -__osm_pr_rcv_get_path_parms(IN osm_pr_rcv_t * const p_rcv, +__osm_pr_rcv_get_path_parms(IN osm_sa_t * sa, IN const ib_path_rec_t * const p_pr, IN const osm_port_t * const p_src_port, IN const osm_port_t * const p_dest_port, @@ -246,7 +205,7 @@ __osm_pr_rcv_get_path_parms(IN osm_pr_rcv_t * const p_rcv, uint16_t valid_sl_mask = 0xffff; int is_lash; - OSM_LOG_ENTER(p_rcv->p_log, __osm_pr_rcv_get_path_parms); + OSM_LOG_ENTER(sa->p_log, __osm_pr_rcv_get_path_parms); dest_lid = cl_hton16(dest_lid_ho); @@ -254,7 +213,7 @@ __osm_pr_rcv_get_path_parms(IN osm_pr_rcv_t * const p_rcv, p_physp = p_src_port->p_physp; p_src_physp = p_physp; p_pi = &p_physp->port_info; - p_osm = p_rcv->p_subn->p_osm; + p_osm = sa->p_subn->p_osm; mtu = ib_port_info_get_mtu_cap(p_pi); rate = ib_port_info_compute_rate(p_pi); @@ -265,12 +224,12 @@ __osm_pr_rcv_get_path_parms(IN osm_pr_rcv_t * const p_rcv, and at least one end of the path is Tavor we override the port MTU with 1K. 
*/ - if (p_rcv->p_subn->opt.enable_quirks && + if (sa->p_subn->opt.enable_quirks && __osm_sa_path_rec_apply_tavor_mtu_limit(p_pr, p_src_port, p_dest_port, comp_mask)) if (mtu > IB_MTU_LEN_1024) { mtu = IB_MTU_LEN_1024; - osm_log(p_rcv->p_log, OSM_LOG_DEBUG, + osm_log(sa->p_log, OSM_LOG_DEBUG, "__osm_pr_rcv_get_path_parms: " "Optimized Path MTU to 1K for Mellanox Tavor device\n"); } @@ -293,7 +252,7 @@ __osm_pr_rcv_get_path_parms(IN osm_pr_rcv_t * const p_rcv, */ p_physp = osm_switch_get_route_by_lid(p_node->sw, dest_lid); if (p_physp == 0) { - osm_log(p_rcv->p_log, OSM_LOG_ERROR, + osm_log(sa->p_log, OSM_LOG_ERROR, "__osm_pr_rcv_get_path_parms: ERR 1F02: " "Cannot find routing to LID 0x%X from switch for GUID 0x%016" PRIx64 "\n", dest_lid_ho, @@ -303,7 +262,7 @@ __osm_pr_rcv_get_path_parms(IN osm_pr_rcv_t * const p_rcv, } } - if (p_rcv->p_subn->opt.qos) { + if (sa->p_subn->opt.qos) { /* * Whether this node is switch or CA, the IN port for @@ -318,8 +277,8 @@ __osm_pr_rcv_get_path_parms(IN osm_pr_rcv_t * const p_rcv, valid_sl_mask &= ~(1 << i); } if (!valid_sl_mask) { - if (osm_log_is_active(p_rcv->p_log, OSM_LOG_DEBUG)) - osm_log(p_rcv->p_log, OSM_LOG_DEBUG, + if (osm_log_is_active(sa->p_log, OSM_LOG_DEBUG)) + osm_log(sa->p_log, OSM_LOG_DEBUG, "__osm_pr_rcv_get_path_parms: " "All the SLs lead to VL15 on this path\n"); status = IB_NOT_FOUND; @@ -339,7 +298,7 @@ __osm_pr_rcv_get_path_parms(IN osm_pr_rcv_t * const p_rcv, p_dest_physp = osm_switch_get_route_by_lid(p_node->sw, dest_lid); if (p_dest_physp == 0) { - osm_log(p_rcv->p_log, OSM_LOG_ERROR, + osm_log(sa->p_log, OSM_LOG_ERROR, "__osm_pr_rcv_get_path_parms: ERR 1F03: " "Cannot find routing to LID 0x%X from switch for GUID 0x%016" PRIx64 "\n", dest_lid_ho, @@ -360,7 +319,7 @@ __osm_pr_rcv_get_path_parms(IN osm_pr_rcv_t * const p_rcv, p_physp = osm_physp_get_remote(p_physp); if (p_physp == 0) { - osm_log(p_rcv->p_log, OSM_LOG_ERROR, + osm_log(sa->p_log, OSM_LOG_ERROR, "__osm_pr_rcv_get_path_parms: ERR 1F05: " 
"Cannot find remote phys port when routing to LID 0x%X from node GUID 0x%016" PRIx64 "\n", dest_lid_ho, @@ -385,7 +344,7 @@ __osm_pr_rcv_get_path_parms(IN osm_pr_rcv_t * const p_rcv, If this isn't a switch, we should have reached the destination by now! */ - osm_log(p_rcv->p_log, OSM_LOG_ERROR, + osm_log(sa->p_log, OSM_LOG_ERROR, "__osm_pr_rcv_get_path_parms: ERR 1F06: " "Internal error, bad path\n"); status = IB_ERROR; @@ -409,7 +368,7 @@ __osm_pr_rcv_get_path_parms(IN osm_pr_rcv_t * const p_rcv, p_physp = osm_switch_get_route_by_lid(p_node->sw, dest_lid); if (p_physp == 0) { - osm_log(p_rcv->p_log, OSM_LOG_ERROR, + osm_log(sa->p_log, OSM_LOG_ERROR, "__osm_pr_rcv_get_path_parms: ERR 1F07: " "Dead end on path to LID 0x%X from switch for GUID 0x%016" PRIx64 "\n", dest_lid_ho, @@ -428,7 +387,7 @@ __osm_pr_rcv_get_path_parms(IN osm_pr_rcv_t * const p_rcv, if (rate > ib_port_info_compute_rate(p_pi)) rate = ib_port_info_compute_rate(p_pi); - if (p_rcv->p_subn->opt.qos) { + if (sa->p_subn->opt.qos) { /* * Check SL2VL table of the switch and update valid SLs */ @@ -439,8 +398,8 @@ __osm_pr_rcv_get_path_parms(IN osm_pr_rcv_t * const p_rcv, valid_sl_mask &= ~(1 << i); } if (!valid_sl_mask) { - if (osm_log_is_active(p_rcv->p_log, OSM_LOG_DEBUG)) - osm_log(p_rcv->p_log, OSM_LOG_DEBUG, + if (osm_log_is_active(sa->p_log, OSM_LOG_DEBUG)) + osm_log(sa->p_log, OSM_LOG_DEBUG, "__osm_pr_rcv_get_path_parms: " "All the SLs lead to VL15 " "on this path\n"); @@ -461,8 +420,8 @@ __osm_pr_rcv_get_path_parms(IN osm_pr_rcv_t * const p_rcv, if (rate > ib_port_info_compute_rate(p_pi)) rate = ib_port_info_compute_rate(p_pi); - if (osm_log_is_active(p_rcv->p_log, OSM_LOG_DEBUG)) - osm_log(p_rcv->p_log, OSM_LOG_DEBUG, + if (osm_log_is_active(sa->p_log, OSM_LOG_DEBUG)) + osm_log(sa->p_log, OSM_LOG_DEBUG, "__osm_pr_rcv_get_path_parms: " "Path min MTU = %u, min rate = %u\n", mtu, rate); @@ -470,14 +429,14 @@ __osm_pr_rcv_get_path_parms(IN osm_pr_rcv_t * const p_rcv, * Get QoS Level object according 
 	   to the path request
 	 * and adjust path parameters according to QoS settings
 	 */
-	if (p_rcv->p_subn->opt.qos &&
-	    p_rcv->p_subn->p_qos_policy &&
+	if (sa->p_subn->opt.qos &&
+	    sa->p_subn->p_qos_policy &&
 	    (p_qos_level =
-	     osm_qos_policy_get_qos_level_by_pr(p_rcv->p_subn->p_qos_policy,
+	     osm_qos_policy_get_qos_level_by_pr(sa->p_subn->p_qos_policy,
 						p_pr, p_src_physp, p_dest_physp,
 						comp_mask))) {
-		if (osm_log_is_active(p_rcv->p_log, OSM_LOG_DEBUG))
-			osm_log(p_rcv->p_log, OSM_LOG_DEBUG,
+		if (osm_log_is_active(sa->p_log, OSM_LOG_DEBUG))
+			osm_log(sa->p_log, OSM_LOG_DEBUG,
 				"__osm_pr_rcv_get_path_parms: "
 				"PathRecord request matches QoS Level '%s' (%s)\n",
 				p_qos_level->name,
@@ -664,7 +623,7 @@ __osm_pr_rcv_get_path_parms(IN osm_pr_rcv_t * const p_rcv,
 		 */
 		pkey = p_pr->pkey;
 		if (!osm_physp_share_this_pkey(p_src_physp, p_dest_physp, pkey)) {
-			osm_log(p_rcv->p_log, OSM_LOG_ERROR,
+			osm_log(sa->p_log, OSM_LOG_ERROR,
 				"__osm_pr_rcv_get_path_parms: ERR 1F1A: "
 				"Ports do not share specified PKey 0x%04x\n",
 				cl_ntoh16(pkey));
@@ -673,7 +632,7 @@ __osm_pr_rcv_get_path_parms(IN osm_pr_rcv_t * const p_rcv,
 		}
 		if (p_qos_level && p_qos_level->pkey_range_len &&
 		    !osm_qos_level_has_pkey(p_qos_level, pkey)) {
-			osm_log(p_rcv->p_log, OSM_LOG_ERROR,
+			osm_log(sa->p_log, OSM_LOG_ERROR,
 				"__osm_pr_rcv_get_path_parms: ERR 1F1D: "
 				"Ports do not share PKeys defined by QoS level\n");
 			status = IB_NOT_FOUND;
@@ -688,7 +647,7 @@ __osm_pr_rcv_get_path_parms(IN osm_pr_rcv_t * const p_rcv,
 		pkey = osm_qos_level_get_shared_pkey(p_qos_level,
 						     p_src_physp, p_dest_physp);
 		if (!pkey) {
-			osm_log(p_rcv->p_log, OSM_LOG_ERROR,
+			osm_log(sa->p_log, OSM_LOG_ERROR,
 				"__osm_pr_rcv_get_path_parms: ERR 1F1E: "
 				"Ports do not share PKeys defined by QoS level\n");
 			status = IB_NOT_FOUND;
@@ -701,7 +660,7 @@ __osm_pr_rcv_get_path_parms(IN osm_pr_rcv_t * const p_rcv,
 		 */
 		pkey = osm_physp_find_common_pkey(p_src_physp, p_dest_physp);
 		if (!pkey) {
-			osm_log(p_rcv->p_log, OSM_LOG_ERROR,
+			osm_log(sa->p_log, OSM_LOG_ERROR,
 				"__osm_pr_rcv_get_path_parms: ERR 1F1B: "
 				"Ports do not have any shared PKeys\n");
 			status = IB_NOT_FOUND;
@@ -711,11 +670,11 @@ __osm_pr_rcv_get_path_parms(IN osm_pr_rcv_t * const p_rcv,
 	if (pkey) {
 		p_prtn =
-		    (osm_prtn_t *) cl_qmap_get(&p_rcv->p_subn->prtn_pkey_tbl,
+		    (osm_prtn_t *) cl_qmap_get(&sa->p_subn->prtn_pkey_tbl,
 					       pkey & cl_hton16((uint16_t) ~ 0x8000));
 		if (p_prtn ==
-		    (osm_prtn_t *) cl_qmap_end(&p_rcv->p_subn->prtn_pkey_tbl))
+		    (osm_prtn_t *) cl_qmap_end(&sa->p_subn->prtn_pkey_tbl))
 			p_prtn = NULL;
 	}
@@ -733,7 +692,7 @@ __osm_pr_rcv_get_path_parms(IN osm_pr_rcv_t * const p_rcv,
 		if (p_qos_level && p_qos_level->sl_set
 		    && (p_qos_level->sl != sl)) {
-			osm_log(p_rcv->p_log, OSM_LOG_ERROR,
+			osm_log(sa->p_log, OSM_LOG_ERROR,
 				"__osm_pr_rcv_get_path_parms: ERR 1F1F: "
 				"QoS constaraints: required PathRecord SL (%u) "
 				"doesn't match QoS policy SL (%u)\n", sl,
@@ -744,7 +703,7 @@ __osm_pr_rcv_get_path_parms(IN osm_pr_rcv_t * const p_rcv,
 		if (is_lash
 		    && osm_get_lash_sl(p_osm, p_src_port, p_dest_port) != sl) {
-			osm_log(p_rcv->p_log, OSM_LOG_ERROR,
+			osm_log(sa->p_log, OSM_LOG_ERROR,
 				"__osm_pr_rcv_get_path_parms: ERR 1F23: "
 				"Required PathRecord SL (%u) doesn't "
 				"match LASH SL\n", sl);
@@ -759,7 +718,6 @@ __osm_pr_rcv_get_path_parms(IN osm_pr_rcv_t * const p_rcv,
 		 * slid and dest_lid are stored in network in lash.
 		 */
 		sl = osm_get_lash_sl(p_osm, p_src_port, p_dest_port);
-
 	} else if (p_qos_level && p_qos_level->sl_set) {
 		/*
 		 * No specific SL was requested, and we're not in
@@ -768,7 +726,7 @@ __osm_pr_rcv_get_path_parms(IN osm_pr_rcv_t * const p_rcv,
 		sl = p_qos_level->sl;
 		if (pkey && p_prtn && p_prtn->sl != p_qos_level->sl)
-			osm_log(p_rcv->p_log, OSM_LOG_DEBUG,
+			osm_log(sa->p_log, OSM_LOG_DEBUG,
 				"__osm_pr_rcv_get_path_parms: "
 				"QoS level SL (%u) overrides partition SL (%u)\n",
 				p_qos_level->sl, p_prtn->sl);
@@ -781,13 +739,13 @@ __osm_pr_rcv_get_path_parms(IN osm_pr_rcv_t * const p_rcv,
 			sl = OSM_DEFAULT_SL;
 			/* this may be possible when pkey tables are created somehow in
 			   previous runs or things are going wrong here */
-			osm_log(p_rcv->p_log, OSM_LOG_ERROR,
+			osm_log(sa->p_log, OSM_LOG_ERROR,
 				"__osm_pr_rcv_get_path_parms: ERR 1F1C: "
 				"No partition found for PKey 0x%04x - using default SL %d\n",
 				cl_ntoh16(pkey), sl);
 		} else
 			sl = p_prtn->sl;
-	} else if (p_rcv->p_subn->opt.qos) {
+	} else if (sa->p_subn->opt.qos) {
 		if (valid_sl_mask & (1 << OSM_DEFAULT_SL))
 			sl = OSM_DEFAULT_SL;
 		else {
@@ -799,8 +757,8 @@ __osm_pr_rcv_get_path_parms(IN osm_pr_rcv_t * const p_rcv,
 	} else
 		sl = OSM_DEFAULT_SL;
-	if (p_rcv->p_subn->opt.qos && !(valid_sl_mask & (1 << sl))) {
-		osm_log(p_rcv->p_log, OSM_LOG_ERROR,
+	if (sa->p_subn->opt.qos && !(valid_sl_mask & (1 << sl))) {
+		osm_log(sa->p_log, OSM_LOG_ERROR,
 			"__osm_pr_rcv_get_path_parms: ERR 1F24: "
 			"Selected SL (%u) leads to VL15\n", sl);
 		status = IB_NOT_FOUND;
@@ -818,21 +776,21 @@ __osm_pr_rcv_get_path_parms(IN osm_pr_rcv_t * const p_rcv,
 	p_parms->pkey = pkey;
 	p_parms->sl = sl;
-	if (osm_log_is_active(p_rcv->p_log, OSM_LOG_DEBUG))
-		osm_log(p_rcv->p_log, OSM_LOG_DEBUG,
+	if (osm_log_is_active(sa->p_log, OSM_LOG_DEBUG))
+		osm_log(sa->p_log, OSM_LOG_DEBUG,
 			"__osm_pr_rcv_get_path_parms: Path params:"
 			" mtu = %u, rate = %u, packet lifetime = %u,"
 			" pkey = 0x%04X, sl = %u\n",
 			mtu, rate, pkt_life, cl_ntoh16(pkey), sl);
 Exit:
-	OSM_LOG_EXIT(p_rcv->p_log);
+	OSM_LOG_EXIT(sa->p_log);
 	return (status);
 }
 /**********************************************************************
  **********************************************************************/
 static void
-__osm_pr_rcv_build_pr(IN osm_pr_rcv_t * const p_rcv,
+__osm_pr_rcv_build_pr(IN osm_sa_t * sa,
 		      IN const osm_port_t * const p_src_port,
 		      IN const osm_port_t * const p_dest_port,
 		      IN const ib_gid_t * const p_dgid,
@@ -846,7 +804,7 @@ __osm_pr_rcv_build_pr(IN osm_pr_rcv_t * const p_rcv,
 	const osm_physp_t *p_dest_physp;
 	boolean_t is_nonzero_gid = 0;
-	OSM_LOG_ENTER(p_rcv->p_log, __osm_pr_rcv_build_pr);
+	OSM_LOG_ENTER(sa->p_log, __osm_pr_rcv_build_pr);
 	p_src_physp = p_src_port->p_physp;
@@ -897,13 +855,13 @@ __osm_pr_rcv_build_pr(IN osm_pr_rcv_t * const p_rcv,
 	if (p_parms->reversible)
 		p_pr->num_path = 0x80;
-	OSM_LOG_EXIT(p_rcv->p_log);
+	OSM_LOG_EXIT(sa->p_log);
 }
 /**********************************************************************
  **********************************************************************/
 static osm_pr_item_t *
-__osm_pr_rcv_get_lid_pair_path(IN osm_pr_rcv_t * const p_rcv,
+__osm_pr_rcv_get_lid_pair_path(IN osm_sa_t * sa,
 			       IN const ib_path_rec_t * const p_pr,
 			       IN const osm_port_t * const p_src_port,
 			       IN const osm_port_t * const p_dest_port,
@@ -918,24 +876,24 @@ __osm_pr_rcv_get_lid_pair_path(IN osm_pr_rcv_t * const p_rcv,
 	osm_pr_item_t *p_pr_item;
 	ib_api_status_t status, rev_path_status;
-	OSM_LOG_ENTER(p_rcv->p_log, __osm_pr_rcv_get_lid_pair_path);
+	OSM_LOG_ENTER(sa->p_log, __osm_pr_rcv_get_lid_pair_path);
-	if (osm_log_is_active(p_rcv->p_log, OSM_LOG_DEBUG))
-		osm_log(p_rcv->p_log, OSM_LOG_DEBUG,
+	if (osm_log_is_active(sa->p_log, OSM_LOG_DEBUG))
+		osm_log(sa->p_log, OSM_LOG_DEBUG,
 			"__osm_pr_rcv_get_lid_pair_path: "
 			"Src LID 0x%X, Dest LID 0x%X\n",
 			src_lid_ho, dest_lid_ho);
 	p_pr_item = malloc(sizeof(*p_pr_item));
 	if (p_pr_item == NULL) {
-		osm_log(p_rcv->p_log, OSM_LOG_ERROR,
+		osm_log(sa->p_log, OSM_LOG_ERROR,
 			"__osm_pr_rcv_get_lid_pair_path: ERR 1F01: "
 			"Unable to allocate path record\n");
 		goto Exit;
 	}
 	memset(p_pr_item, 0, sizeof(*p_pr_item));
-	status = __osm_pr_rcv_get_path_parms(p_rcv, p_pr, p_src_port,
+	status = __osm_pr_rcv_get_path_parms(sa, p_pr, p_src_port,
 					     p_dest_port, dest_lid_ho,
 					     comp_mask, &path_parms);
@@ -946,7 +904,7 @@ __osm_pr_rcv_get_lid_pair_path(IN osm_pr_rcv_t * const p_rcv,
 	}
 	/* now try the reversible path */
-	rev_path_status = __osm_pr_rcv_get_path_parms(p_rcv, p_pr, p_dest_port,
+	rev_path_status = __osm_pr_rcv_get_path_parms(sa, p_pr, p_dest_port,
 						      p_src_port, src_lid_ho,
 						      comp_mask,
 						      &rev_path_parms);
@@ -960,7 +918,7 @@ __osm_pr_rcv_get_lid_pair_path(IN osm_pr_rcv_t * const p_rcv,
 	 */
 	if (comp_mask & IB_PR_COMPMASK_REVERSIBLE) {
 		if ((!path_parms.reversible && (p_pr->num_path & 0x80))) {
-			osm_log(p_rcv->p_log, OSM_LOG_DEBUG,
+			osm_log(sa->p_log, OSM_LOG_DEBUG,
 				"__osm_pr_rcv_get_lid_pair_path: "
 				"Requested reversible path but failed to get one\n");
@@ -970,19 +928,19 @@ __osm_pr_rcv_get_lid_pair_path(IN osm_pr_rcv_t * const p_rcv,
 		}
 	}
-	__osm_pr_rcv_build_pr(p_rcv, p_src_port, p_dest_port, p_dgid,
+	__osm_pr_rcv_build_pr(sa, p_src_port, p_dest_port, p_dgid,
 			      src_lid_ho, dest_lid_ho, preference, &path_parms,
 			      &p_pr_item->path_rec);
 Exit:
-	OSM_LOG_EXIT(p_rcv->p_log);
+	OSM_LOG_EXIT(sa->p_log);
 	return (p_pr_item);
 }
 /**********************************************************************
  **********************************************************************/
 static void
-__osm_pr_rcv_get_port_pair_paths(IN osm_pr_rcv_t * const p_rcv,
+__osm_pr_rcv_get_port_pair_paths(IN osm_sa_t * sa,
 				 IN const osm_madw_t * const p_madw,
 				 IN const osm_port_t * const p_req_port,
 				 IN const osm_port_t * const p_src_port,
@@ -1006,10 +964,10 @@ __osm_pr_rcv_get_port_pair_paths(IN osm_pr_rcv_t * const p_rcv,
 	uintn_t src_offset;
 	uintn_t dest_offset;
-	OSM_LOG_ENTER(p_rcv->p_log, __osm_pr_rcv_get_port_pair_paths);
+	OSM_LOG_ENTER(sa->p_log, __osm_pr_rcv_get_port_pair_paths);
-	if (osm_log_is_active(p_rcv->p_log, OSM_LOG_DEBUG))
-		osm_log(p_rcv->p_log, OSM_LOG_DEBUG,
+	if (osm_log_is_active(sa->p_log, OSM_LOG_DEBUG))
+		osm_log(sa->p_log, OSM_LOG_DEBUG,
 			"__osm_pr_rcv_get_port_pair_paths: "
 			"Src port 0x%016" PRIx64 ", "
 			"Dst port 0x%016" PRIx64 "\n",
@@ -1018,10 +976,10 @@ __osm_pr_rcv_get_port_pair_paths(IN osm_pr_rcv_t * const p_rcv,
 	/* Check that the req_port, src_port and dest_port all share a
 	   pkey. The check is done on the default physical port of the ports. */
-	if (osm_port_share_pkey(p_rcv->p_log, p_req_port, p_src_port) == FALSE
-	    || osm_port_share_pkey(p_rcv->p_log, p_req_port,
+	if (osm_port_share_pkey(sa->p_log, p_req_port, p_src_port) == FALSE
+	    || osm_port_share_pkey(sa->p_log, p_req_port,
 				   p_dest_port) == FALSE
-	    || osm_port_share_pkey(p_rcv->p_log, p_src_port,
+	    || osm_port_share_pkey(sa->p_log, p_src_port,
 				   p_dest_port) == FALSE)
 		/* One of the pairs doesn't share a pkey so the path is disqualified. */
 		goto Exit;
@@ -1092,21 +1050,21 @@ __osm_pr_rcv_get_port_pair_paths(IN osm_pr_rcv_t * const p_rcv,
 				      &src_lid_max_ho);
 	if (src_lid_min_ho == 0) {
-		osm_log(p_rcv->p_log, OSM_LOG_ERROR,
+		osm_log(sa->p_log, OSM_LOG_ERROR,
 			"__osm_pr_rcv_get_port_pair_paths: ERR 1F20:"
 			"Obtained source LID of 0. No such LID possible\n");
 		goto Exit;
 	}
 	if (dest_lid_min_ho == 0) {
-		osm_log(p_rcv->p_log, OSM_LOG_ERROR,
+		osm_log(sa->p_log, OSM_LOG_ERROR,
 			"__osm_pr_rcv_get_port_pair_paths: ERR 1F21:"
 			"Obtained destination LID of 0. No such LID possible\n");
 		goto Exit;
 	}
-	if (osm_log_is_active(p_rcv->p_log, OSM_LOG_DEBUG))
-		osm_log(p_rcv->p_log, OSM_LOG_DEBUG,
+	if (osm_log_is_active(sa->p_log, OSM_LOG_DEBUG))
+		osm_log(sa->p_log, OSM_LOG_DEBUG,
 			"__osm_pr_rcv_get_port_pair_paths: "
 			"Src LIDs [0x%X-0x%X], "
 			"Dest LIDs [0x%X-0x%X]\n",
@@ -1136,7 +1094,7 @@ __osm_pr_rcv_get_port_pair_paths(IN osm_pr_rcv_t * const p_rcv,
 		   These paths are "fully redundant"
 		 */
-		p_pr_item = __osm_pr_rcv_get_lid_pair_path(p_rcv, p_pr,
+		p_pr_item = __osm_pr_rcv_get_lid_pair_path(sa, p_pr,
 							   p_src_port,
 							   p_dest_port, p_dgid,
 							   src_lid_ho,
@@ -1202,7 +1160,7 @@ __osm_pr_rcv_get_port_pair_paths(IN osm_pr_rcv_t * const p_rcv,
 		if (src_offset == dest_offset)
 			continue;	/* already reported */
-		p_pr_item = __osm_pr_rcv_get_lid_pair_path(p_rcv, p_pr,
+		p_pr_item = __osm_pr_rcv_get_lid_pair_path(sa, p_pr,
 							   p_src_port,
 							   p_dest_port, p_dgid,
 							   src_lid_ho,
@@ -1217,13 +1175,13 @@ __osm_pr_rcv_get_port_pair_paths(IN osm_pr_rcv_t * const p_rcv,
 	}
 Exit:
-	OSM_LOG_EXIT(p_rcv->p_log);
+	OSM_LOG_EXIT(sa->p_log);
 }
 /**********************************************************************
  **********************************************************************/
 static ib_net16_t
-__osm_pr_rcv_get_end_points(IN osm_pr_rcv_t * const p_rcv,
+__osm_pr_rcv_get_end_points(IN osm_sa_t * sa,
 			    IN const osm_madw_t * const p_madw,
 			    OUT const osm_port_t ** const pp_src_port,
 			    OUT const osm_port_t ** const pp_dest_port,
@@ -1238,7 +1196,7 @@ __osm_pr_rcv_get_end_points(IN osm_pr_rcv_t * const p_rcv,
 	osm_router_t *p_rtr;
 	osm_port_t *p_rtr_port;
-	OSM_LOG_ENTER(p_rcv->p_log, __osm_pr_rcv_get_end_points);
+	OSM_LOG_ENTER(sa->p_log, __osm_pr_rcv_get_end_points);
 	/*
 	   Determine what fields are valid and then get a pointer
@@ -1258,13 +1216,13 @@ __osm_pr_rcv_get_end_points(IN osm_pr_rcv_t * const p_rcv,
 	if (comp_mask & IB_PR_COMPMASK_SGID) {
 		if (!ib_gid_is_link_local(&p_pr->sgid)) {
 			if (ib_gid_get_subnet_prefix(&p_pr->sgid) !=
-			    p_rcv->p_subn->opt.subnet_prefix) {
+			    sa->p_subn->opt.subnet_prefix) {
 				/*
 				   This 'error' is the client's fault (bad gid) so
 				   don't enter it as an error in our own log.
 				   Return an error response to the client.
 				 */
-				osm_log(p_rcv->p_log, OSM_LOG_VERBOSE,
+				osm_log(sa->p_log, OSM_LOG_VERBOSE,
 					"__osm_pr_rcv_get_end_points: "
 					"Non local SGID subnet prefix 0x%016"
 					PRIx64 "\n",
@@ -1275,7 +1233,7 @@ __osm_pr_rcv_get_end_points(IN osm_pr_rcv_t * const p_rcv,
 			}
 		}
-		*pp_src_port = osm_get_port_by_guid(p_rcv->p_subn,
+		*pp_src_port = osm_get_port_by_guid(sa->p_subn,
 						    p_pr->sgid.unicast.
 						    interface_id);
 		if (!*pp_src_port) {
@@ -1284,7 +1242,7 @@ __osm_pr_rcv_get_end_points(IN osm_pr_rcv_t * const p_rcv,
 			   don't enter it as an error in our own log.
 			   Return an error response to the client.
 			 */
-			osm_log(p_rcv->p_log, OSM_LOG_VERBOSE,
+			osm_log(sa->p_log, OSM_LOG_VERBOSE,
 				"__osm_pr_rcv_get_end_points: "
 				"No source port with GUID 0x%016" PRIx64 "\n",
 				cl_ntoh64(p_pr->sgid.unicast.interface_id));
@@ -1295,7 +1253,7 @@ __osm_pr_rcv_get_end_points(IN osm_pr_rcv_t * const p_rcv,
 	} else {
 		*pp_src_port = 0;
 		if (comp_mask & IB_PR_COMPMASK_SLID) {
-			status = cl_ptr_vector_at(&p_rcv->p_subn->port_lid_tbl,
+			status = cl_ptr_vector_at(&sa->p_subn->port_lid_tbl,
 						  cl_ntoh16(p_pr->slid),
 						  (void **)pp_src_port);
@@ -1305,7 +1263,7 @@ __osm_pr_rcv_get_end_points(IN osm_pr_rcv_t * const p_rcv,
 				   don't enter it as an error in our own log.
 				   Return an error response to the client.
 				 */
-				osm_log(p_rcv->p_log, OSM_LOG_VERBOSE,
+				osm_log(sa->p_log, OSM_LOG_VERBOSE,
 					"__osm_pr_rcv_get_end_points: "
 					"No source port with LID = 0x%X\n",
 					cl_ntoh16(p_pr->slid));
@@ -1324,8 +1282,8 @@ __osm_pr_rcv_get_end_points(IN osm_pr_rcv_t * const p_rcv,
 		if (!ib_gid_is_link_local(&p_pr->dgid)) {
 			if (!ib_gid_is_multicast(&p_pr->dgid) &&
 			    ib_gid_get_subnet_prefix(&p_pr->dgid) !=
-			    p_rcv->p_subn->opt.subnet_prefix) {
-				osm_log(p_rcv->p_log, OSM_LOG_VERBOSE,
+			    sa->p_subn->opt.subnet_prefix) {
+				osm_log(sa->p_log, OSM_LOG_VERBOSE,
 					"__osm_pr_rcv_get_end_points: "
 					"Non local DGID subnet prefix 0x%016"
 					PRIx64 "\n",
@@ -1335,10 +1293,10 @@ __osm_pr_rcv_get_end_points(IN osm_pr_rcv_t * const p_rcv,
 				   this prefix, if any: */
 				osm_prefix_route_t *route = NULL;
 				osm_prefix_route_t *r = (osm_prefix_route_t *)
-					cl_qlist_head(&p_rcv->p_subn->prefix_routes_list);
+					cl_qlist_head(&sa->p_subn->prefix_routes_list);
 				while (r != (osm_prefix_route_t *)
-				       cl_qlist_end(&p_rcv->p_subn->prefix_routes_list))
+				       cl_qlist_end(&sa->p_subn->prefix_routes_list))
 				{
 					if (r->prefix == p_pr->dgid.unicast.prefix ||
 					    r->prefix == 0)
@@ -1360,23 +1318,23 @@ __osm_pr_rcv_get_end_points(IN osm_pr_rcv_t * const p_rcv,
 				} else if (route->guid == 0) {
 					/* first router */
 					p_rtr = (osm_router_t *)
-						cl_qmap_head(&p_rcv->
+						cl_qmap_head(&sa->
 							     p_subn->
 							     rtr_guid_tbl);
 				} else {
 					p_rtr = (osm_router_t *)
-						cl_qmap_get(&p_rcv->
+						cl_qmap_get(&sa->
 							    p_subn->
 							    rtr_guid_tbl,
 							    route->guid);
 				}
 				if (p_rtr ==
-				    (osm_router_t *) cl_qmap_end(&p_rcv->
+				    (osm_router_t *) cl_qmap_end(&sa->
 								 p_subn->
 								 rtr_guid_tbl))
 				{
-					osm_log(p_rcv->p_log, OSM_LOG_ERROR,
+					osm_log(sa->p_log, OSM_LOG_ERROR,
 						"__osm_pr_rcv_get_end_points: ERR 1F22: "
 						"Off subnet DGID but router not found\n");
 					sa_status =
@@ -1391,14 +1349,14 @@ __osm_pr_rcv_get_end_points(IN osm_pr_rcv_t * const p_rcv,
 			}
 		}
-		*pp_dest_port = osm_get_port_by_guid(p_rcv->p_subn, dest_guid);
+		*pp_dest_port = osm_get_port_by_guid(sa->p_subn, dest_guid);
 		if (!*pp_dest_port) {
 			/*
 			   This 'error' is the client's fault (bad gid) so
 			   don't enter it as an error in our own log.
 			   Return an error response to the client.
 			 */
-			osm_log(p_rcv->p_log, OSM_LOG_VERBOSE,
+			osm_log(sa->p_log, OSM_LOG_VERBOSE,
 				"__osm_pr_rcv_get_end_points: "
 				"No dest port with GUID 0x%016" PRIx64 "\n",
 				cl_ntoh64(dest_guid));
@@ -1409,7 +1367,7 @@ __osm_pr_rcv_get_end_points(IN osm_pr_rcv_t * const p_rcv,
 	} else {
 		*pp_dest_port = 0;
 		if (comp_mask & IB_PR_COMPMASK_DLID) {
-			status = cl_ptr_vector_at(&p_rcv->p_subn->port_lid_tbl,
+			status = cl_ptr_vector_at(&sa->p_subn->port_lid_tbl,
 						  cl_ntoh16(p_pr->dlid),
 						  (void **)pp_dest_port);
@@ -1419,7 +1377,7 @@ __osm_pr_rcv_get_end_points(IN osm_pr_rcv_t * const p_rcv,
 				   don't enter it as an error in our own log.
 				   Return an error response to the client.
 				 */
-				osm_log(p_rcv->p_log, OSM_LOG_VERBOSE,
+				osm_log(sa->p_log, OSM_LOG_VERBOSE,
 					"__osm_pr_rcv_get_end_points: "
 					"No dest port with LID = 0x%X\n",
 					cl_ntoh16(p_pr->dlid));
@@ -1431,14 +1389,14 @@ __osm_pr_rcv_get_end_points(IN osm_pr_rcv_t * const p_rcv,
 	}
 Exit:
-	OSM_LOG_EXIT(p_rcv->p_log);
+	OSM_LOG_EXIT(sa->p_log);
 	return (sa_status);
 }
 /**********************************************************************
  **********************************************************************/
 static void
-__osm_pr_rcv_process_world(IN osm_pr_rcv_t * const p_rcv,
+__osm_pr_rcv_process_world(IN osm_sa_t * sa,
 			   IN const osm_madw_t * const p_madw,
 			   IN const osm_port_t * const requester_port,
 			   IN const ib_gid_t * const p_dgid,
@@ -1449,7 +1407,7 @@ __osm_pr_rcv_process_world(IN osm_pr_rcv_t * const p_rcv,
 	const osm_port_t *p_dest_port;
 	const osm_port_t *p_src_port;
-	OSM_LOG_ENTER(p_rcv->p_log, __osm_pr_rcv_process_world);
+	OSM_LOG_ENTER(sa->p_log, __osm_pr_rcv_process_world);
 	/*
 	   Iterate the entire port space over itself.
@@ -1459,13 +1417,13 @@ __osm_pr_rcv_process_world(IN osm_pr_rcv_t * const p_rcv,
 	   We compute both A -> B and B -> A, since we don't have
 	   any check to determine the reversability of the paths.
 	 */
-	p_tbl = &p_rcv->p_subn->port_guid_tbl;
+	p_tbl = &sa->p_subn->port_guid_tbl;
 	p_dest_port = (osm_port_t *) cl_qmap_head(p_tbl);
 	while (p_dest_port != (osm_port_t *) cl_qmap_end(p_tbl)) {
 		p_src_port = (osm_port_t *) cl_qmap_head(p_tbl);
 		while (p_src_port != (osm_port_t *) cl_qmap_end(p_tbl)) {
-			__osm_pr_rcv_get_port_pair_paths(p_rcv, p_madw,
+			__osm_pr_rcv_get_port_pair_paths(sa, p_madw,
 							 requester_port,
 							 p_src_port,
 							 p_dest_port, p_dgid,
@@ -1479,13 +1437,13 @@ __osm_pr_rcv_process_world(IN osm_pr_rcv_t * const p_rcv,
 		    (osm_port_t *) cl_qmap_next(&p_dest_port->map_item);
 	}
-	OSM_LOG_EXIT(p_rcv->p_log);
+	OSM_LOG_EXIT(sa->p_log);
 }
 /**********************************************************************
  **********************************************************************/
 static void
-__osm_pr_rcv_process_half(IN osm_pr_rcv_t * const p_rcv,
+__osm_pr_rcv_process_half(IN osm_sa_t * sa,
 			  IN const osm_madw_t * const p_madw,
 			  IN const osm_port_t * const requester_port,
 			  IN const osm_port_t * const p_src_port,
@@ -1497,14 +1455,14 @@ __osm_pr_rcv_process_half(IN osm_pr_rcv_t * const p_rcv,
 	const cl_qmap_t *p_tbl;
 	const osm_port_t *p_port;
-	OSM_LOG_ENTER(p_rcv->p_log, __osm_pr_rcv_process_half);
+	OSM_LOG_ENTER(sa->p_log, __osm_pr_rcv_process_half);
 	/*
 	   Iterate over every port, looking for matches...
 	   A path record from a port to itself is legit, so no
 	   need to special case that one.
 	 */
-	p_tbl = &p_rcv->p_subn->port_guid_tbl;
+	p_tbl = &sa->p_subn->port_guid_tbl;
 	if (p_src_port) {
 		/*
@@ -1512,7 +1470,7 @@ __osm_pr_rcv_process_half(IN osm_pr_rcv_t * const p_rcv,
 		 */
 		p_port = (osm_port_t *) cl_qmap_head(p_tbl);
 		while (p_port != (osm_port_t *) cl_qmap_end(p_tbl)) {
-			__osm_pr_rcv_get_port_pair_paths(p_rcv, p_madw,
+			__osm_pr_rcv_get_port_pair_paths(sa, p_madw,
 							 requester_port,
 							 p_src_port, p_port,
 							 p_dgid, comp_mask,
@@ -1525,7 +1483,7 @@ __osm_pr_rcv_process_half(IN osm_pr_rcv_t * const p_rcv,
 		 */
 		p_port = (osm_port_t *) cl_qmap_head(p_tbl);
 		while (p_port != (osm_port_t *) cl_qmap_end(p_tbl)) {
-			__osm_pr_rcv_get_port_pair_paths(p_rcv, p_madw,
+			__osm_pr_rcv_get_port_pair_paths(sa, p_madw,
 							 requester_port,
 							 p_port, p_dest_port,
 							 p_dgid, comp_mask, p_list);
@@ -1533,13 +1491,13 @@ __osm_pr_rcv_process_half(IN osm_pr_rcv_t * const p_rcv,
 		}
 	}
-	OSM_LOG_EXIT(p_rcv->p_log);
+	OSM_LOG_EXIT(sa->p_log);
 }
 /**********************************************************************
  **********************************************************************/
 static void
-__osm_pr_rcv_process_pair(IN osm_pr_rcv_t * const p_rcv,
+__osm_pr_rcv_process_pair(IN osm_sa_t * sa,
 			  IN const osm_madw_t * const p_madw,
 			  IN const osm_port_t * const requester_port,
 			  IN const osm_port_t * const p_src_port,
@@ -1548,13 +1506,13 @@ __osm_pr_rcv_process_pair(IN osm_pr_rcv_t * const p_rcv,
 			  IN const ib_net64_t comp_mask,
 			  IN cl_qlist_t * const p_list)
 {
-	OSM_LOG_ENTER(p_rcv->p_log, __osm_pr_rcv_process_pair);
+	OSM_LOG_ENTER(sa->p_log, __osm_pr_rcv_process_pair);
-	__osm_pr_rcv_get_port_pair_paths(p_rcv, p_madw, requester_port,
+	__osm_pr_rcv_get_port_pair_paths(sa, p_madw, requester_port,
 					 p_src_port, p_dest_port, p_dgid,
 					 comp_mask, p_list);
-	OSM_LOG_EXIT(p_rcv->p_log);
+	OSM_LOG_EXIT(sa->p_log);
 }
 /**********************************************************************
@@ -1566,11 +1524,11 @@ __search_mgrp_by_mgid(IN cl_map_item_t * const p_map_item, IN void *context)
 	osm_sa_pr_mcmr_search_ctxt_t *p_ctxt =
 	    (osm_sa_pr_mcmr_search_ctxt_t *) context;
 	const ib_gid_t *p_recvd_mgid;
-	osm_pr_rcv_t *p_rcv;
+	osm_sa_t *sa;
 	/* uint32_t i; */
 	p_recvd_mgid = p_ctxt->p_mgid;
-	p_rcv = p_ctxt->p_rcv;
+	sa = p_ctxt->sa;
 	/* ignore groups marked for deletion */
 	if (p_mgrp->to_be_deleted)
@@ -1592,7 +1550,7 @@ __search_mgrp_by_mgid(IN cl_map_item_t * const p_map_item, IN void *context)
 #endif
 	if (p_ctxt->p_mgrp) {
-		osm_log(p_rcv->p_log, OSM_LOG_ERROR,
+		osm_log(sa->p_log, OSM_LOG_ERROR,
 			"__search_mgrp_by_mgid: ERR 1F08: "
 			"Multiple MC groups for same MGID\n");
 		return;
@@ -1603,17 +1561,17 @@ __search_mgrp_by_mgid(IN cl_map_item_t * const p_map_item, IN void *context)
 /**********************************************************************
  **********************************************************************/
 static ib_api_status_t
-__get_mgrp_by_mgid(IN osm_pr_rcv_t * const p_rcv,
+__get_mgrp_by_mgid(IN osm_sa_t * sa,
 		   IN ib_path_rec_t * p_recvd_path_rec,
 		   OUT osm_mgrp_t ** pp_mgrp)
 {
 	osm_sa_pr_mcmr_search_ctxt_t mcmr_search_context;
 	mcmr_search_context.p_mgid = &p_recvd_path_rec->dgid;
-	mcmr_search_context.p_rcv = p_rcv;
+	mcmr_search_context.sa = sa;
 	mcmr_search_context.p_mgrp = NULL;
-	cl_qmap_apply_func(&p_rcv->p_subn->mgrp_mlid_tbl,
+	cl_qmap_apply_func(&sa->p_subn->mgrp_mlid_tbl,
 			   __search_mgrp_by_mgid, &mcmr_search_context);
 	if (mcmr_search_context.p_mgrp == NULL)
@@ -1625,14 +1583,14 @@ __get_mgrp_by_mgid(IN osm_pr_rcv_t * const p_rcv,
 /**********************************************************************
  **********************************************************************/
-static osm_mgrp_t *__get_mgrp_by_mlid(IN const osm_pr_rcv_t * const p_rcv,
+static osm_mgrp_t *__get_mgrp_by_mlid(IN osm_sa_t * sa,
 				      IN ib_net16_t const mlid)
 {
 	cl_map_item_t *map_item;
-	map_item = cl_qmap_get(&p_rcv->p_subn->mgrp_mlid_tbl, mlid);
+	map_item = cl_qmap_get(&sa->p_subn->mgrp_mlid_tbl, mlid);
-	if (map_item == cl_qmap_end(&p_rcv->p_subn->mgrp_mlid_tbl))
+	if (map_item == cl_qmap_end(&sa->p_subn->mgrp_mlid_tbl))
 		return NULL;
 	return (osm_mgrp_t *) map_item;
@@ -1641,7 +1599,7 @@ static osm_mgrp_t *__get_mgrp_by_mlid(IN const osm_pr_rcv_t * const p_rcv,
 /**********************************************************************
  **********************************************************************/
 static void
-__osm_pr_get_mgrp(IN osm_pr_rcv_t * const p_rcv,
+__osm_pr_get_mgrp(IN osm_sa_t * sa,
 		  IN const osm_madw_t * const p_madw, OUT osm_mgrp_t ** pp_mgrp)
 {
 	ib_path_rec_t *p_pr;
@@ -1649,7 +1607,7 @@ __osm_pr_get_mgrp(IN osm_pr_rcv_t * const p_rcv,
 	ib_net64_t comp_mask;
 	ib_api_status_t status;
-	OSM_LOG_ENTER(p_rcv->p_log, __osm_pr_get_mgrp);
+	OSM_LOG_ENTER(sa->p_log, __osm_pr_get_mgrp);
 	p_sa_mad = osm_madw_get_sa_mad_ptr(p_madw);
 	p_pr = (ib_path_rec_t *) ib_sa_mad_get_payload_ptr(p_sa_mad);
@@ -1657,9 +1615,9 @@ __osm_pr_get_mgrp(IN osm_pr_rcv_t * const p_rcv,
 	comp_mask = p_sa_mad->comp_mask;
 	if (comp_mask & IB_PR_COMPMASK_DGID) {
-		status = __get_mgrp_by_mgid(p_rcv, p_pr, pp_mgrp);
+		status = __get_mgrp_by_mgid(sa, p_pr, pp_mgrp);
 		if (status != IB_SUCCESS) {
-			osm_log(p_rcv->p_log, OSM_LOG_ERROR,
+			osm_log(sa->p_log, OSM_LOG_ERROR,
 				"__osm_pr_get_mgrp: ERR 1F09: "
 				"No MC group found for PathRecord destination GID\n");
 			goto Exit;
@@ -1672,29 +1630,29 @@ __osm_pr_get_mgrp(IN osm_pr_rcv_t * const p_rcv,
 			/* the same as the DLID in the PathRecord */
 			if ((*pp_mgrp)->mlid != p_pr->dlid) {
 				/* Note: perhaps this might be better indicated as an invalid request */
-				osm_log(p_rcv->p_log, OSM_LOG_ERROR,
+				osm_log(sa->p_log, OSM_LOG_ERROR,
 					"__osm_pr_get_mgrp: ERR 1F10: "
 					"MC group MLID does not match PathRecord destination LID\n");
 				*pp_mgrp = NULL;
 				goto Exit;
 			}
 		} else {
-			*pp_mgrp = __get_mgrp_by_mlid(p_rcv, p_pr->dlid);
+			*pp_mgrp = __get_mgrp_by_mlid(sa, p_pr->dlid);
 			if (*pp_mgrp == NULL)
-				osm_log(p_rcv->p_log, OSM_LOG_ERROR,
+				osm_log(sa->p_log, OSM_LOG_ERROR,
 					"__osm_pr_get_mgrp: ERR 1F11: "
 					"No MC group found for PathRecord destination LID\n");
 		}
 	}
 Exit:
-	OSM_LOG_EXIT(p_rcv->p_log);
+	OSM_LOG_EXIT(sa->p_log);
 }
 /**********************************************************************
  **********************************************************************/
 static ib_api_status_t
-__osm_pr_match_mgrp_attributes(IN osm_pr_rcv_t * const p_rcv,
+__osm_pr_match_mgrp_attributes(IN osm_sa_t * sa,
 			       IN const osm_madw_t * const p_madw,
 			       IN const osm_mgrp_t * const p_mgrp)
 {
@@ -1706,7 +1664,7 @@ __osm_pr_match_mgrp_attributes(IN osm_pr_rcv_t * const p_rcv,
 	uint8_t sl;
 	uint8_t hop_limit;
-	OSM_LOG_ENTER(p_rcv->p_log, __osm_pr_match_mgrp_attributes);
+	OSM_LOG_ENTER(sa->p_log, __osm_pr_match_mgrp_attributes);
 	p_sa_mad = osm_madw_get_sa_mad_ptr(p_madw);
 	p_pr = (ib_path_rec_t *) ib_sa_mad_get_payload_ptr(p_sa_mad);
@@ -1753,14 +1711,14 @@ __osm_pr_match_mgrp_attributes(IN osm_pr_rcv_t * const p_rcv,
 	status = IB_SUCCESS;
 Exit:
-	OSM_LOG_EXIT(p_rcv->p_log);
+	OSM_LOG_EXIT(sa->p_log);
 	return (status);
 }
 /**********************************************************************
  **********************************************************************/
 static int
-__osm_pr_rcv_check_mcast_dest(IN osm_pr_rcv_t * const p_rcv,
+__osm_pr_rcv_check_mcast_dest(IN osm_sa_t * sa,
 			      IN const osm_madw_t * const p_madw)
 {
 	const ib_path_rec_t *p_pr;
@@ -1768,7 +1726,7 @@ __osm_pr_rcv_check_mcast_dest(IN osm_pr_rcv_t * const p_rcv,
 	ib_net64_t comp_mask;
 	int is_multicast = 0;
-	OSM_LOG_ENTER(p_rcv->p_log, __osm_pr_rcv_check_mcast_dest);
+	OSM_LOG_ENTER(sa->p_log, __osm_pr_rcv_check_mcast_dest);
 	p_sa_mad = osm_madw_get_sa_mad_ptr(p_madw);
 	p_pr = (ib_path_rec_t *) ib_sa_mad_get_payload_ptr(p_sa_mad);
@@ -1786,7 +1744,7 @@ __osm_pr_rcv_check_mcast_dest(IN osm_pr_rcv_t * const p_rcv,
 		    cl_ntoh16(p_pr->dlid) <= IB_LID_MCAST_END_HO)
 			is_multicast = 1;
 		else if (is_multicast) {
-			osm_log(p_rcv->p_log, OSM_LOG_ERROR,
+			osm_log(sa->p_log, OSM_LOG_ERROR,
 				"__osm_pr_rcv_check_mcast_dest: ERR 1F12: "
 				"PathRecord request indicates MGID but not MLID\n");
 			is_multicast = -1;
@@ -1794,14 +1752,14 @@ __osm_pr_rcv_check_mcast_dest(IN osm_pr_rcv_t * const p_rcv,
 	}
 Exit:
-	OSM_LOG_EXIT(p_rcv->p_log);
+	OSM_LOG_EXIT(sa->p_log);
 	return (is_multicast);
 }
 /**********************************************************************
  **********************************************************************/
 static void
-__osm_pr_rcv_respond(IN osm_pr_rcv_t * const p_rcv,
+__osm_pr_rcv_respond(IN osm_sa_t * sa,
 		     IN const osm_madw_t * const p_madw,
 		     IN cl_qlist_t * const p_list)
 {
@@ -1814,11 +1772,11 @@ __osm_pr_rcv_respond(IN osm_pr_rcv_t * const p_rcv,
 #endif
 	ib_path_rec_t *p_resp_pr;
 	ib_api_status_t status;
-	const ib_sa_mad_t *p_rcvd_mad = osm_madw_get_sa_mad_ptr(p_madw);
+	const ib_sa_mad_t *sad_mad = osm_madw_get_sa_mad_ptr(p_madw);
 	osm_pr_item_t *p_pr_item;
 	uint32_t i;
-	OSM_LOG_ENTER(p_rcv->p_log, __osm_pr_rcv_respond);
+	OSM_LOG_ENTER(sa->p_log, __osm_pr_rcv_respond);
 	num_rec = cl_qlist_count(p_list);
@@ -1826,18 +1784,18 @@ __osm_pr_rcv_respond(IN osm_pr_rcv_t * const p_rcv,
 	 * C15-0.1.30:
 	 * If we do a SubnAdmGet and got more than one record it is an error !
 	 */
-	if (p_rcvd_mad->method == IB_MAD_METHOD_GET) {
+	if (sad_mad->method == IB_MAD_METHOD_GET) {
 		if (num_rec == 0) {
-			osm_sa_send_error(p_rcv->p_resp, p_madw,
+			osm_sa_send_error(sa, p_madw,
 					  IB_SA_MAD_STATUS_NO_RECORDS);
 			goto Exit;
 		}
 		if (num_rec > 1) {
-			osm_log(p_rcv->p_log, OSM_LOG_ERROR,
+			osm_log(sa->p_log, OSM_LOG_ERROR,
 				"__osm_pr_rcv_respond: ERR 1F13: "
 				"Got more than one record for SubnAdmGet (%zu)\n",
 				num_rec);
-			osm_sa_send_error(p_rcv->p_resp, p_madw,
+			osm_sa_send_error(sa, p_madw,
 					  IB_SA_MAD_STATUS_TOO_MANY_RECORDS);
 			/* need to set the mem free ... */
 			p_pr_item =
@@ -1857,7 +1815,7 @@ __osm_pr_rcv_respond(IN osm_pr_rcv_t * const p_rcv,
 	trim_num_rec =
 	    (MAD_BLOCK_SIZE - IB_SA_MAD_HDR_SIZE) / sizeof(ib_path_rec_t);
 	if (trim_num_rec < num_rec) {
-		osm_log(p_rcv->p_log, OSM_LOG_VERBOSE,
+		osm_log(sa->p_log, OSM_LOG_VERBOSE,
 			"__osm_pr_rcv_respond: "
 			"Number of records:%u trimmed to:%u to fit in one MAD\n",
 			num_rec, trim_num_rec);
@@ -1865,12 +1823,12 @@ __osm_pr_rcv_respond(IN osm_pr_rcv_t * const p_rcv,
 	}
 #endif
-	osm_log(p_rcv->p_log, OSM_LOG_DEBUG,
+	osm_log(sa->p_log, OSM_LOG_DEBUG,
 		"__osm_pr_rcv_respond: "
 		"Generating response with %zu records\n", num_rec);
-	if ((p_rcvd_mad->method == IB_MAD_METHOD_GET) && (num_rec == 0)) {
-		osm_sa_send_error(p_rcv->p_resp, p_madw,
+	if ((sad_mad->method == IB_MAD_METHOD_GET) && (num_rec == 0)) {
+		osm_sa_send_error(sa, p_madw,
 				  IB_SA_MAD_STATUS_NO_RECORDS);
 		goto Exit;
 	}
@@ -1878,11 +1836,11 @@ __osm_pr_rcv_respond(IN osm_pr_rcv_t * const p_rcv,
 	/*
 	 * Get a MAD to reply. Address of Mad is in the received mad_wrapper
 	 */
-	p_resp_madw = osm_mad_pool_get(p_rcv->p_mad_pool, p_madw->h_bind,
+	p_resp_madw = osm_mad_pool_get(sa->p_mad_pool, p_madw->h_bind,
 				       num_rec * sizeof(ib_path_rec_t) +
 				       IB_SA_MAD_HDR_SIZE, &p_madw->mad_addr);
 	if (!p_resp_madw) {
-		osm_log(p_rcv->p_log, OSM_LOG_ERROR,
+		osm_log(sa->p_log, OSM_LOG_ERROR,
 			"__osm_pr_rcv_respond: ERR 1F14: "
 			"Unable to allocate MAD\n");
@@ -1892,7 +1850,7 @@ __osm_pr_rcv_respond(IN osm_pr_rcv_t * const p_rcv,
 			free(p_pr_item);
 		}
-		osm_sa_send_error(p_rcv->p_resp, p_madw,
+		osm_sa_send_error(sa, p_madw,
 				  IB_SA_MAD_STATUS_NO_RESOURCES);
 		goto Exit;
 	}
@@ -1937,24 +1895,24 @@ __osm_pr_rcv_respond(IN osm_pr_rcv_t * const p_rcv,
 	CL_ASSERT(cl_is_qlist_empty(p_list));
 	status = osm_sa_vendor_send(p_resp_madw->h_bind, p_resp_madw, FALSE,
-				    p_rcv->p_subn);
+				    sa->p_subn);
 	if (status != IB_SUCCESS) {
-		osm_log(p_rcv->p_log, OSM_LOG_ERROR,
+		osm_log(sa->p_log, OSM_LOG_ERROR,
 			"__osm_pr_rcv_respond: ERR 1F15: "
 			"Unable to send MAD (%s)\n",
 			ib_get_err_str(status));
-		/* osm_mad_pool_put( p_rcv->p_mad_pool, p_resp_madw ); */
+		/* osm_mad_pool_put( sa->p_mad_pool, p_resp_madw ); */
 	}
 Exit:
-	OSM_LOG_EXIT(p_rcv->p_log);
+	OSM_LOG_EXIT(sa->p_log);
 }
 /**********************************************************************
  **********************************************************************/
 void osm_pr_rcv_process(IN void *context, IN void *data)
 {
-	osm_pr_rcv_t *p_rcv = context;
+	osm_sa_t *sa = context;
 	osm_madw_t *p_madw = data;
 	const ib_path_rec_t *p_pr;
 	const ib_sa_mad_t *p_sa_mad;
@@ -1966,7 +1924,7 @@ void osm_pr_rcv_process(IN void *context, IN void *data)
 	osm_port_t *requester_port;
 	int ret;
-	OSM_LOG_ENTER(p_rcv->p_log, osm_pr_rcv_process);
+	OSM_LOG_ENTER(sa->p_log, osm_pr_rcv_process);
 	CL_ASSERT(p_madw);
@@ -1978,28 +1936,28 @@ void osm_pr_rcv_process(IN void *context, IN void *data)
 	/* we only support SubnAdmGet and SubnAdmGetTable methods */
 	if ((p_sa_mad->method != IB_MAD_METHOD_GET) &&
 	    (p_sa_mad->method != IB_MAD_METHOD_GETTABLE)) {
-		osm_log(p_rcv->p_log, OSM_LOG_ERROR,
+		osm_log(sa->p_log, OSM_LOG_ERROR,
 			"osm_pr_rcv_process: ERR 1F17: "
 			"Unsupported Method (%s)\n",
 			ib_get_sa_method_str(p_sa_mad->method));
-		osm_sa_send_error(p_rcv->p_resp, p_madw,
+		osm_sa_send_error(sa, p_madw,
 				  IB_MAD_STATUS_UNSUP_METHOD_ATTR);
 		goto Exit;
 	}
 	/* update the requester physical port. */
-	requester_port = osm_get_port_by_mad_addr(p_rcv->p_log, p_rcv->p_subn,
+	requester_port = osm_get_port_by_mad_addr(sa->p_log, sa->p_subn,
 						  osm_madw_get_mad_addr_ptr
 						  (p_madw));
 	if (requester_port == NULL) {
-		osm_log(p_rcv->p_log, OSM_LOG_ERROR,
+		osm_log(sa->p_log, OSM_LOG_ERROR,
 			"osm_pr_rcv_process: ERR 1F16: "
 			"Cannot find requester physical port\n");
 		goto Exit;
 	}
-	if (osm_log_is_active(p_rcv->p_log, OSM_LOG_DEBUG))
-		osm_dump_path_record(p_rcv->p_log, p_pr, OSM_LOG_DEBUG);
+	if (osm_log_is_active(sa->p_log, OSM_LOG_DEBUG))
+		osm_dump_path_record(sa->p_log, p_pr, OSM_LOG_DEBUG);
 	cl_qlist_init(&pr_list);
@@ -2007,13 +1965,13 @@ void osm_pr_rcv_process(IN void *context, IN void *data)
 	   Most SA functions (including this one) are read-only on the
 	   subnet object, so we grab the lock non-exclusively.
 	 */
-	cl_plock_acquire(p_rcv->p_lock);
+	cl_plock_acquire(sa->p_lock);
 	/* Handle multicast destinations separately */
-	if ((ret = __osm_pr_rcv_check_mcast_dest(p_rcv, p_madw)) < 0) {
+	if ((ret = __osm_pr_rcv_check_mcast_dest(sa, p_madw)) < 0) {
 		/* Multicast DGID with unicast DLID */
-		cl_plock_release(p_rcv->p_lock);
-		osm_sa_send_error(p_rcv->p_resp, p_madw,
+		cl_plock_release(sa->p_lock);
+		osm_sa_send_error(sa, p_madw,
 				  IB_MAD_STATUS_INVALID_FIELD);
 		goto Exit;
 	}
@@ -2021,10 +1979,10 @@ void osm_pr_rcv_process(IN void *context, IN void *data)
 	if (ret > 0)
 		goto McastDest;
-	osm_log(p_rcv->p_log, OSM_LOG_DEBUG,
+	osm_log(sa->p_log, OSM_LOG_DEBUG,
 		"osm_pr_rcv_process: " "Unicast destination requested\n");
-	sa_status = __osm_pr_rcv_get_end_points(p_rcv, p_madw,
+	sa_status = __osm_pr_rcv_get_end_points(sa, p_madw,
 						&p_src_port, &p_dest_port,
 						&dgid);
@@ -2035,14 +1993,14 @@ void osm_pr_rcv_process(IN void *context, IN void *data)
 	 */
 	if (p_src_port) {
 		if (p_dest_port)
-			__osm_pr_rcv_process_pair(p_rcv, p_madw,
+			__osm_pr_rcv_process_pair(sa, p_madw,
 						  requester_port,
 						  p_src_port,
 						  p_dest_port, &dgid,
 						  p_sa_mad->comp_mask,
 						  &pr_list);
 		else
-			__osm_pr_rcv_process_half(p_rcv, p_madw,
+			__osm_pr_rcv_process_half(sa, p_madw,
 						  requester_port,
 						  p_src_port, NULL,
 						  &dgid,
@@ -2050,7 +2008,7 @@ void osm_pr_rcv_process(IN void *context, IN void *data)
 						  &pr_list);
 	} else {
 		if (p_dest_port)
-			__osm_pr_rcv_process_half(p_rcv, p_madw,
+			__osm_pr_rcv_process_half(sa, p_madw,
 						  requester_port,
 						  NULL, p_dest_port,
 						  &dgid, p_sa_mad->comp_mask,
@@ -2059,7 +2017,7 @@ void osm_pr_rcv_process(IN void *context, IN void *data)
 			/*
 			   Katie, bar the door!
 			 */
-			__osm_pr_rcv_process_world(p_rcv, p_madw,
+			__osm_pr_rcv_process_world(sa, p_madw,
 						   requester_port,
 						   &dgid,
 						   p_sa_mad->comp_mask,
@@ -2069,7 +2027,7 @@ void osm_pr_rcv_process(IN void *context, IN void *data)
 	goto Unlock;
 McastDest:
-	osm_log(p_rcv->p_log, OSM_LOG_DEBUG,
+	osm_log(sa->p_log, OSM_LOG_DEBUG,
 		"osm_pr_rcv_process: " "Multicast destination requested\n");
 	{
 		osm_mgrp_t *p_mgrp = NULL;
@@ -2080,15 +2038,15 @@ void osm_pr_rcv_process(IN void *context, IN void *data)
 		uint8_t hop_limit;
 		/* First, get the MC info */
-		__osm_pr_get_mgrp(p_rcv, p_madw, &p_mgrp);
+		__osm_pr_get_mgrp(sa, p_madw, &p_mgrp);
 		if (!p_mgrp)
 			goto Unlock;
 		/* Make sure the rest of the PathRecord matches the MC group attributes */
-		status = __osm_pr_match_mgrp_attributes(p_rcv, p_madw, p_mgrp);
+		status = __osm_pr_match_mgrp_attributes(sa, p_madw, p_mgrp);
 		if (status != IB_SUCCESS) {
-			osm_log(p_rcv->p_log, OSM_LOG_ERROR,
+			osm_log(sa->p_log, OSM_LOG_ERROR,
 				"osm_pr_rcv_process: ERR 1F19: "
 				"MC group attributes don't match PathRecord request\n");
 			goto Unlock;
@@ -2096,7 +2054,7 @@ void osm_pr_rcv_process(IN void *context, IN void *data)
 		p_pr_item = malloc(sizeof(*p_pr_item));
 		if (p_pr_item == NULL) {
-			osm_log(p_rcv->p_log, OSM_LOG_ERROR,
+			osm_log(sa->p_log, OSM_LOG_ERROR,
 				"osm_pr_rcv_process: ERR 1F18: "
 				"Unable to allocate path record for MC group\n");
 			goto Unlock;
@@ -2142,11 +2100,11 @@ void osm_pr_rcv_process(IN void *context, IN void *data)
 	}
 Unlock:
-	cl_plock_release(p_rcv->p_lock);
+	cl_plock_release(sa->p_lock);
 	/* Now, (finally)
respond to the PathRecord request */ - __osm_pr_rcv_respond(p_rcv, p_madw, &pr_list); + __osm_pr_rcv_respond(sa, p_madw, &pr_list); Exit: - OSM_LOG_EXIT(p_rcv->p_log); + OSM_LOG_EXIT(sa->p_log); } diff --git a/opensm/opensm/osm_sa_pkey_record.c b/opensm/opensm/osm_sa_pkey_record.c index 1e9f50f..e7547df 100644 --- a/opensm/opensm/osm_sa_pkey_record.c +++ b/opensm/opensm/osm_sa_pkey_record.c @@ -43,10 +43,9 @@ #include #include #include -#include +#include #include #include -#include #include #include #include @@ -61,52 +60,14 @@ typedef struct _osm_pkey_search_ctxt { ib_net64_t comp_mask; uint16_t block_num; cl_qlist_t *p_list; - osm_pkey_rec_rcv_t *p_rcv; + osm_sa_t *sa; const osm_physp_t *p_req_physp; } osm_pkey_search_ctxt_t; /********************************************************************** **********************************************************************/ -void osm_pkey_rec_rcv_construct(IN osm_pkey_rec_rcv_t * const p_rcv) -{ - memset(p_rcv, 0, sizeof(*p_rcv)); -} - -/********************************************************************** - **********************************************************************/ -void osm_pkey_rec_rcv_destroy(IN osm_pkey_rec_rcv_t * const p_rcv) -{ - OSM_LOG_ENTER(p_rcv->p_log, osm_pkey_rec_rcv_destroy); - OSM_LOG_EXIT(p_rcv->p_log); -} - -/********************************************************************** - **********************************************************************/ -ib_api_status_t -osm_pkey_rec_rcv_init(IN osm_pkey_rec_rcv_t * const p_rcv, - IN osm_sa_resp_t * const p_resp, - IN osm_mad_pool_t * const p_mad_pool, - IN osm_subn_t * const p_subn, - IN osm_log_t * const p_log, IN cl_plock_t * const p_lock) -{ - OSM_LOG_ENTER(p_log, osm_pkey_rec_rcv_init); - - osm_pkey_rec_rcv_construct(p_rcv); - - p_rcv->p_log = p_log; - p_rcv->p_subn = p_subn; - p_rcv->p_lock = p_lock; - p_rcv->p_resp = p_resp; - p_rcv->p_mad_pool = p_mad_pool; - - OSM_LOG_EXIT(p_log); - return IB_SUCCESS; -} - 
-/********************************************************************** - **********************************************************************/ static void -__osm_sa_pkey_create(IN osm_pkey_rec_rcv_t * const p_rcv, +__osm_sa_pkey_create(IN osm_sa_t * sa, IN osm_physp_t * const p_physp, IN osm_pkey_search_ctxt_t * const p_ctxt, IN uint16_t block) @@ -115,11 +76,11 @@ __osm_sa_pkey_create(IN osm_pkey_rec_rcv_t * const p_rcv, uint16_t lid; ib_api_status_t status = IB_SUCCESS; - OSM_LOG_ENTER(p_rcv->p_log, __osm_sa_pkey_create); + OSM_LOG_ENTER(sa->p_log, __osm_sa_pkey_create); p_rec_item = malloc(sizeof(*p_rec_item)); if (p_rec_item == NULL) { - osm_log(p_rcv->p_log, OSM_LOG_ERROR, + osm_log(sa->p_log, OSM_LOG_ERROR, "__osm_sa_pkey_create: ERR 4602: " "rec_item alloc failed\n"); status = IB_INSUFFICIENT_RESOURCES; @@ -131,8 +92,8 @@ __osm_sa_pkey_create(IN osm_pkey_rec_rcv_t * const p_rcv, else lid = osm_node_get_base_lid(p_physp->p_node, 0); - if (osm_log_is_active(p_rcv->p_log, OSM_LOG_DEBUG)) { - osm_log(p_rcv->p_log, OSM_LOG_DEBUG, + if (osm_log_is_active(sa->p_log, OSM_LOG_DEBUG)) { + osm_log(sa->p_log, OSM_LOG_DEBUG, "__osm_sa_pkey_create: " "New P_Key table for: port 0x%016" PRIx64 ", lid 0x%X, port 0x%X Block:%u\n", @@ -151,40 +112,40 @@ __osm_sa_pkey_create(IN osm_pkey_rec_rcv_t * const p_rcv, cl_qlist_insert_tail(p_ctxt->p_list, &p_rec_item->list_item); Exit: - OSM_LOG_EXIT(p_rcv->p_log); + OSM_LOG_EXIT(sa->p_log); } /********************************************************************** **********************************************************************/ static void -__osm_sa_pkey_check_physp(IN osm_pkey_rec_rcv_t * const p_rcv, +__osm_sa_pkey_check_physp(IN osm_sa_t * sa, IN osm_physp_t * const p_physp, osm_pkey_search_ctxt_t * const p_ctxt) { ib_net64_t comp_mask = p_ctxt->comp_mask; uint16_t block, num_blocks; - OSM_LOG_ENTER(p_rcv->p_log, __osm_sa_pkey_check_physp); + OSM_LOG_ENTER(sa->p_log, __osm_sa_pkey_check_physp); /* we got here with the 
phys port - all is left is to get the right block */ if (comp_mask & IB_PKEY_COMPMASK_BLOCK) { - __osm_sa_pkey_create(p_rcv, p_physp, p_ctxt, p_ctxt->block_num); + __osm_sa_pkey_create(sa, p_physp, p_ctxt, p_ctxt->block_num); } else { num_blocks = osm_pkey_tbl_get_num_blocks(osm_physp_get_pkey_tbl (p_physp)); for (block = 0; block < num_blocks; block++) { - __osm_sa_pkey_create(p_rcv, p_physp, p_ctxt, block); + __osm_sa_pkey_create(sa, p_physp, p_ctxt, block); } } - OSM_LOG_EXIT(p_rcv->p_log); + OSM_LOG_EXIT(sa->p_log); } /********************************************************************** **********************************************************************/ static void -__osm_sa_pkey_by_comp_mask(IN osm_pkey_rec_rcv_t * const p_rcv, +__osm_sa_pkey_by_comp_mask(IN osm_sa_t * sa, IN const osm_port_t * const p_port, osm_pkey_search_ctxt_t * const p_ctxt) { @@ -195,7 +156,7 @@ __osm_sa_pkey_by_comp_mask(IN osm_pkey_rec_rcv_t * const p_rcv, uint8_t num_ports; const osm_physp_t *p_req_physp; - OSM_LOG_ENTER(p_rcv->p_log, __osm_sa_pkey_by_comp_mask); + OSM_LOG_ENTER(sa->p_log, __osm_sa_pkey_by_comp_mask); p_rcvd_rec = p_ctxt->p_rcvd_rec; comp_mask = p_ctxt->comp_mask; @@ -207,7 +168,7 @@ __osm_sa_pkey_by_comp_mask(IN osm_pkey_rec_rcv_t * const p_rcv, if (p_port->p_node->node_info.node_type != IB_NODE_TYPE_SWITCH) { /* we put it in the comp mask and port num */ port_num = p_port->p_physp->port_num; - osm_log(p_rcv->p_log, OSM_LOG_DEBUG, + osm_log(sa->p_log, OSM_LOG_DEBUG, "__osm_sa_pkey_by_comp_mask: " "Using Physical Default Port Number: 0x%X (for End Node)\n", port_num); @@ -222,11 +183,11 @@ __osm_sa_pkey_by_comp_mask(IN osm_pkey_rec_rcv_t * const p_rcv, with the p_req_physp. 
*/ if (osm_physp_is_valid(p_physp) && (osm_physp_share_pkey - (p_rcv->p_log, p_req_physp, p_physp))) - __osm_sa_pkey_check_physp(p_rcv, p_physp, + (sa->p_log, p_req_physp, p_physp))) + __osm_sa_pkey_check_physp(sa, p_physp, p_ctxt); } else { - osm_log(p_rcv->p_log, OSM_LOG_ERROR, + osm_log(sa->p_log, OSM_LOG_ERROR, "__osm_sa_pkey_by_comp_mask: ERR 4603: " "Given Physical Port Number: 0x%X is out of range should be < 0x%X\n", port_num, @@ -244,14 +205,14 @@ __osm_sa_pkey_by_comp_mask(IN osm_pkey_rec_rcv_t * const p_rcv, /* if the requester and the p_physp don't share a pkey - continue */ if (!osm_physp_share_pkey - (p_rcv->p_log, p_req_physp, p_physp)) + (sa->p_log, p_req_physp, p_physp)) continue; - __osm_sa_pkey_check_physp(p_rcv, p_physp, p_ctxt); + __osm_sa_pkey_check_physp(sa, p_physp, p_ctxt); } } Exit: - OSM_LOG_EXIT(p_rcv->p_log); + OSM_LOG_EXIT(sa->p_log); } /********************************************************************** @@ -264,14 +225,14 @@ __osm_sa_pkey_by_comp_mask_cb(IN cl_map_item_t * const p_map_item, osm_pkey_search_ctxt_t *const p_ctxt = (osm_pkey_search_ctxt_t *) context; - __osm_sa_pkey_by_comp_mask(p_ctxt->p_rcv, p_port, p_ctxt); + __osm_sa_pkey_by_comp_mask(p_ctxt->sa, p_port, p_ctxt); } /********************************************************************** **********************************************************************/ void osm_pkey_rec_rcv_process(IN void *ctx, IN void *data) { - osm_pkey_rec_rcv_t *p_rcv = ctx; + osm_sa_t *sa = ctx; osm_madw_t *p_madw = data; const ib_sa_mad_t *p_rcvd_mad; const ib_pkey_table_record_t *p_rcvd_rec; @@ -293,9 +254,9 @@ void osm_pkey_rec_rcv_process(IN void *ctx, IN void *data) ib_net64_t comp_mask; osm_physp_t *p_req_physp; - CL_ASSERT(p_rcv); + CL_ASSERT(sa); - OSM_LOG_ENTER(p_rcv->p_log, osm_pkey_rec_rcv_process); + OSM_LOG_ENTER(sa->p_log, osm_pkey_rec_rcv_process); CL_ASSERT(p_madw); @@ -309,11 +270,11 @@ void osm_pkey_rec_rcv_process(IN void *ctx, IN void *data) /* we only support 
SubnAdmGet and SubnAdmGetTable methods */ if ((p_rcvd_mad->method != IB_MAD_METHOD_GET) && (p_rcvd_mad->method != IB_MAD_METHOD_GETTABLE)) { - osm_log(p_rcv->p_log, OSM_LOG_ERROR, + osm_log(sa->p_log, OSM_LOG_ERROR, "osm_pkey_rec_rcv_process: ERR 4605: " "Unsupported Method (%s)\n", ib_get_sa_method_str(p_rcvd_mad->method)); - osm_sa_send_error(p_rcv->p_resp, p_madw, + osm_sa_send_error(sa, p_madw, IB_MAD_STATUS_UNSUP_METHOD_ATTR); goto Exit; } @@ -323,25 +284,25 @@ void osm_pkey_rec_rcv_process(IN void *ctx, IN void *data) to trusted requests. Check that the requester is a trusted one. */ - if (p_rcvd_mad->sm_key != p_rcv->p_subn->opt.sm_key) { + if (p_rcvd_mad->sm_key != sa->p_subn->opt.sm_key) { /* This is not a trusted requester! */ - osm_log(p_rcv->p_log, OSM_LOG_ERROR, + osm_log(sa->p_log, OSM_LOG_ERROR, "osm_pkey_rec_rcv_process ERR 4608: " "Request from non-trusted requester: " "Given SM_Key:0x%016" PRIx64 "\n", cl_ntoh64(p_rcvd_mad->sm_key)); - osm_sa_send_error(p_rcv->p_resp, p_madw, + osm_sa_send_error(sa, p_madw, IB_SA_MAD_STATUS_REQ_INVALID); goto Exit; } /* update the requester physical port. 
*/ - p_req_physp = osm_get_physp_by_mad_addr(p_rcv->p_log, - p_rcv->p_subn, + p_req_physp = osm_get_physp_by_mad_addr(sa->p_log, + sa->p_subn, osm_madw_get_mad_addr_ptr (p_madw)); if (p_req_physp == NULL) { - osm_log(p_rcv->p_log, OSM_LOG_ERROR, + osm_log(sa->p_log, OSM_LOG_ERROR, "osm_pkey_rec_rcv_process: ERR 4604: " "Cannot find requester physical port\n"); goto Exit; @@ -354,11 +315,11 @@ void osm_pkey_rec_rcv_process(IN void *ctx, IN void *data) context.p_rcvd_rec = p_rcvd_rec; context.p_list = &rec_list; context.comp_mask = p_rcvd_mad->comp_mask; - context.p_rcv = p_rcv; + context.sa = sa; context.block_num = p_rcvd_rec->block_num; context.p_req_physp = p_req_physp; - osm_log(p_rcv->p_log, OSM_LOG_DEBUG, + osm_log(sa->p_log, OSM_LOG_DEBUG, "osm_pkey_rec_rcv_process: " "Got Query Lid:0x%04X(%02X), Block:0x%02X(%02X), Port:0x%02X(%02X)\n", cl_ntoh16(p_rcvd_rec->lid), @@ -366,7 +327,7 @@ void osm_pkey_rec_rcv_process(IN void *ctx, IN void *data) (comp_mask & IB_PKEY_COMPMASK_PORT) != 0, p_rcvd_rec->block_num, (comp_mask & IB_PKEY_COMPMASK_BLOCK) != 0); - cl_plock_acquire(p_rcv->p_lock); + cl_plock_acquire(sa->p_lock); /* If the user specified a LID, it obviously narrows our @@ -374,16 +335,16 @@ void osm_pkey_rec_rcv_process(IN void *ctx, IN void *data) */ if (comp_mask & IB_PKEY_COMPMASK_LID) { - p_tbl = &p_rcv->p_subn->port_lid_tbl; + p_tbl = &sa->p_subn->port_lid_tbl; CL_ASSERT(cl_ptr_vector_get_size(p_tbl) < 0x10000); status = - osm_get_port_by_base_lid(p_rcv->p_subn, p_rcvd_rec->lid, + osm_get_port_by_base_lid(sa->p_subn, p_rcvd_rec->lid, &p_port); if ((status != IB_SUCCESS) || (p_port == NULL)) { status = IB_NOT_FOUND; - osm_log(p_rcv->p_log, OSM_LOG_ERROR, + osm_log(sa->p_log, OSM_LOG_ERROR, "osm_pkey_rec_rcv_process: ERR 460B: " "No port found with LID 0x%x\n", cl_ntoh16(p_rcvd_rec->lid)); @@ -394,15 +355,15 @@ void osm_pkey_rec_rcv_process(IN void *ctx, IN void *data) /* if we got a unique port - no need for a port search */ if (p_port) /* this does the 
loop on all the port phys ports */ - __osm_sa_pkey_by_comp_mask(p_rcv, p_port, &context); + __osm_sa_pkey_by_comp_mask(sa, p_port, &context); else { - cl_qmap_apply_func(&p_rcv->p_subn->port_guid_tbl, + cl_qmap_apply_func(&sa->p_subn->port_guid_tbl, __osm_sa_pkey_by_comp_mask_cb, &context); } } - cl_plock_release(p_rcv->p_lock); + cl_plock_release(sa->p_lock); num_rec = cl_qlist_count(&rec_list); @@ -412,16 +373,16 @@ void osm_pkey_rec_rcv_process(IN void *ctx, IN void *data) */ if (p_rcvd_mad->method == IB_MAD_METHOD_GET) { if (num_rec == 0) { - osm_sa_send_error(p_rcv->p_resp, p_madw, + osm_sa_send_error(sa, p_madw, IB_SA_MAD_STATUS_NO_RECORDS); goto Exit; } if (num_rec > 1) { - osm_log(p_rcv->p_log, OSM_LOG_ERROR, + osm_log(sa->p_log, OSM_LOG_ERROR, "osm_pkey_rec_rcv_process: ERR 460A: " "Got more than one record for SubnAdmGet (%u)\n", num_rec); - osm_sa_send_error(p_rcv->p_resp, p_madw, + osm_sa_send_error(sa, p_madw, IB_SA_MAD_STATUS_TOO_MANY_RECORDS); /* need to set the mem free ... */ @@ -444,7 +405,7 @@ void osm_pkey_rec_rcv_process(IN void *ctx, IN void *data) (MAD_BLOCK_SIZE - IB_SA_MAD_HDR_SIZE) / sizeof(ib_pkey_table_record_t); if (trim_num_rec < num_rec) { - osm_log(p_rcv->p_log, OSM_LOG_VERBOSE, + osm_log(sa->p_log, OSM_LOG_VERBOSE, "osm_pkey_rec_rcv_process: " "Number of records:%u trimmed to:%u to fit in one MAD\n", num_rec, trim_num_rec); @@ -452,11 +413,11 @@ void osm_pkey_rec_rcv_process(IN void *ctx, IN void *data) } #endif - osm_log(p_rcv->p_log, OSM_LOG_DEBUG, + osm_log(sa->p_log, OSM_LOG_DEBUG, "osm_pkey_rec_rcv_process: " "Returning %u records\n", num_rec); if ((p_rcvd_mad->method == IB_MAD_METHOD_GET) && (num_rec == 0)) { - osm_sa_send_error(p_rcv->p_resp, p_madw, + osm_sa_send_error(sa, p_madw, IB_SA_MAD_STATUS_NO_RECORDS); goto Exit; } @@ -464,14 +425,14 @@ void osm_pkey_rec_rcv_process(IN void *ctx, IN void *data) /* * Get a MAD to reply. 
Address of Mad is in the received mad_wrapper */ - p_resp_madw = osm_mad_pool_get(p_rcv->p_mad_pool, + p_resp_madw = osm_mad_pool_get(sa->p_mad_pool, p_madw->h_bind, num_rec * sizeof(ib_pkey_table_record_t) + IB_SA_MAD_HDR_SIZE, &p_madw->mad_addr); if (!p_resp_madw) { - osm_log(p_rcv->p_log, OSM_LOG_ERROR, + osm_log(sa->p_log, OSM_LOG_ERROR, "osm_pkey_rec_rcv_process: ERR 4606: " "osm_mad_pool_get failed\n"); @@ -481,7 +442,7 @@ void osm_pkey_rec_rcv_process(IN void *ctx, IN void *data) free(p_rec_item); } - osm_sa_send_error(p_rcv->p_resp, p_madw, + osm_sa_send_error(sa, p_madw, IB_SA_MAD_STATUS_NO_RESOURCES); goto Exit; } @@ -534,9 +495,9 @@ void osm_pkey_rec_rcv_process(IN void *ctx, IN void *data) CL_ASSERT(cl_is_qlist_empty(&rec_list)); status = osm_sa_vendor_send(p_resp_madw->h_bind, p_resp_madw, FALSE, - p_rcv->p_subn); + sa->p_subn); if (status != IB_SUCCESS) { - osm_log(p_rcv->p_log, OSM_LOG_ERROR, + osm_log(sa->p_log, OSM_LOG_ERROR, "osm_pkey_rec_rcv_process: ERR 4607: " "osm_sa_vendor_send status = %s\n", ib_get_err_str(status)); @@ -544,5 +505,5 @@ void osm_pkey_rec_rcv_process(IN void *ctx, IN void *data) } Exit: - OSM_LOG_EXIT(p_rcv->p_log); + OSM_LOG_EXIT(sa->p_log); } diff --git a/opensm/opensm/osm_sa_portinfo_record.c b/opensm/opensm/osm_sa_portinfo_record.c index ed3684c..16dd852 100644 --- a/opensm/opensm/osm_sa_portinfo_record.c +++ b/opensm/opensm/osm_sa_portinfo_record.c @@ -55,11 +55,10 @@ #include #include #include -#include +#include #include #include #include -#include #include #include #include @@ -73,72 +72,34 @@ typedef struct _osm_pir_search_ctxt { const ib_portinfo_record_t *p_rcvd_rec; ib_net64_t comp_mask; cl_qlist_t *p_list; - osm_pir_rcv_t *p_rcv; + osm_sa_t *sa; const osm_physp_t *p_req_physp; boolean_t is_enhanced_comp_mask; } osm_pir_search_ctxt_t; /********************************************************************** **********************************************************************/ -void osm_pir_rcv_construct(IN 
osm_pir_rcv_t * const p_rcv) -{ - memset(p_rcv, 0, sizeof(*p_rcv)); -} - -/********************************************************************** - **********************************************************************/ -void osm_pir_rcv_destroy(IN osm_pir_rcv_t * const p_rcv) -{ - OSM_LOG_ENTER(p_rcv->p_log, osm_pir_rcv_destroy); - OSM_LOG_EXIT(p_rcv->p_log); -} - -/********************************************************************** - **********************************************************************/ -ib_api_status_t -osm_pir_rcv_init(IN osm_pir_rcv_t * const p_rcv, - IN osm_sa_resp_t * const p_resp, - IN osm_mad_pool_t * const p_mad_pool, - IN osm_subn_t * const p_subn, - IN osm_log_t * const p_log, IN cl_plock_t * const p_lock) -{ - OSM_LOG_ENTER(p_log, osm_pir_rcv_init); - - osm_pir_rcv_construct(p_rcv); - - p_rcv->p_log = p_log; - p_rcv->p_subn = p_subn; - p_rcv->p_lock = p_lock; - p_rcv->p_resp = p_resp; - p_rcv->p_mad_pool = p_mad_pool; - - OSM_LOG_EXIT(p_log); - return IB_SUCCESS; -} - -/********************************************************************** - **********************************************************************/ static ib_api_status_t -__osm_pir_rcv_new_pir(IN osm_pir_rcv_t * const p_rcv, +__osm_pir_rcv_new_pir(IN osm_sa_t * sa, IN const osm_physp_t * const p_physp, IN cl_qlist_t * const p_list, IN ib_net16_t const lid) { osm_pir_item_t *p_rec_item; ib_api_status_t status = IB_SUCCESS; - OSM_LOG_ENTER(p_rcv->p_log, __osm_pir_rcv_new_pir); + OSM_LOG_ENTER(sa->p_log, __osm_pir_rcv_new_pir); p_rec_item = malloc(sizeof(*p_rec_item)); if (p_rec_item == NULL) { - osm_log(p_rcv->p_log, OSM_LOG_ERROR, + osm_log(sa->p_log, OSM_LOG_ERROR, "__osm_pir_rcv_new_pir: ERR 2102: " "rec_item alloc failed\n"); status = IB_INSUFFICIENT_RESOURCES; goto Exit; } - if (osm_log_is_active(p_rcv->p_log, OSM_LOG_DEBUG)) - osm_log(p_rcv->p_log, OSM_LOG_DEBUG, + if (osm_log_is_active(sa->p_log, OSM_LOG_DEBUG)) + osm_log(sa->p_log, OSM_LOG_DEBUG, 
"__osm_pir_rcv_new_pir: " "New PortInfoRecord: port 0x%016" PRIx64 ", lid 0x%X, port 0x%X\n", @@ -154,14 +115,14 @@ __osm_pir_rcv_new_pir(IN osm_pir_rcv_t * const p_rcv, cl_qlist_insert_tail(p_list, &p_rec_item->list_item); Exit: - OSM_LOG_EXIT(p_rcv->p_log); + OSM_LOG_EXIT(sa->p_log); return (status); } /********************************************************************** **********************************************************************/ static void -__osm_sa_pir_create(IN osm_pir_rcv_t * const p_rcv, +__osm_sa_pir_create(IN osm_sa_t * sa, IN const osm_physp_t * const p_physp, IN osm_pir_search_ctxt_t * const p_ctxt) { @@ -171,14 +132,14 @@ __osm_sa_pir_create(IN osm_pir_rcv_t * const p_rcv, uint16_t match_lid_ho; osm_physp_t *p_node_physp; - OSM_LOG_ENTER(p_rcv->p_log, __osm_sa_pir_create); + OSM_LOG_ENTER(sa->p_log, __osm_sa_pir_create); if (p_physp->p_node->sw) { p_node_physp = osm_node_get_physp_ptr(p_physp->p_node, 0); base_lid_ho = cl_ntoh16(osm_physp_get_base_lid(p_node_physp)); lmc = osm_switch_sp0_is_lmc_capable(p_physp->p_node->sw, - p_rcv-> + sa-> p_subn) ? osm_physp_get_lmc(p_node_physp) : 0; } else { @@ -193,8 +154,8 @@ __osm_sa_pir_create(IN osm_pir_rcv_t * const p_rcv, /* We validate that the lid belongs to this node. 
*/ - if (osm_log_is_active(p_rcv->p_log, OSM_LOG_DEBUG)) - osm_log(p_rcv->p_log, OSM_LOG_DEBUG, + if (osm_log_is_active(sa->p_log, OSM_LOG_DEBUG)) + osm_log(sa->p_log, OSM_LOG_DEBUG, "__osm_sa_pir_create: " "Comparing LID: 0x%X <= 0x%X <= 0x%X\n", base_lid_ho, match_lid_ho, max_lid_ho); @@ -203,17 +164,17 @@ __osm_sa_pir_create(IN osm_pir_rcv_t * const p_rcv, goto Exit; } - __osm_pir_rcv_new_pir(p_rcv, p_physp, p_ctxt->p_list, + __osm_pir_rcv_new_pir(sa, p_physp, p_ctxt->p_list, cl_hton16(base_lid_ho)); Exit: - OSM_LOG_EXIT(p_rcv->p_log); + OSM_LOG_EXIT(sa->p_log); } /********************************************************************** **********************************************************************/ static void -__osm_sa_pir_check_physp(IN osm_pir_rcv_t * const p_rcv, +__osm_sa_pir_check_physp(IN osm_sa_t * sa, IN const osm_physp_t * const p_physp, osm_pir_search_ctxt_t * const p_ctxt) { @@ -222,14 +183,14 @@ __osm_sa_pir_check_physp(IN osm_pir_rcv_t * const p_rcv, const ib_port_info_t *p_comp_pi; const ib_port_info_t *p_pi; - OSM_LOG_ENTER(p_rcv->p_log, __osm_sa_pir_check_physp); + OSM_LOG_ENTER(sa->p_log, __osm_sa_pir_check_physp); p_rcvd_rec = p_ctxt->p_rcvd_rec; comp_mask = p_ctxt->comp_mask; p_comp_pi = &p_rcvd_rec->port_info; p_pi = &p_physp->port_info; - osm_dump_port_info(p_rcv->p_log, + osm_dump_port_info(sa->p_log, osm_node_get_node_guid(p_physp->p_node), p_physp->port_guid, p_physp->port_num, @@ -436,16 +397,16 @@ __osm_sa_pir_check_physp(IN osm_pir_rcv_t * const p_rcv, goto Exit; } - __osm_sa_pir_create(p_rcv, p_physp, p_ctxt); + __osm_sa_pir_create(sa, p_physp, p_ctxt); Exit: - OSM_LOG_EXIT(p_rcv->p_log); + OSM_LOG_EXIT(sa->p_log); } /********************************************************************** **********************************************************************/ static void -__osm_sa_pir_by_comp_mask(IN osm_pir_rcv_t * const p_rcv, +__osm_sa_pir_by_comp_mask(IN osm_sa_t * sa, IN const osm_node_t * const p_node, 
osm_pir_search_ctxt_t * const p_ctxt) { @@ -456,7 +417,7 @@ __osm_sa_pir_by_comp_mask(IN osm_pir_rcv_t * const p_rcv, uint8_t num_ports; const osm_physp_t *p_req_physp; - OSM_LOG_ENTER(p_rcv->p_log, __osm_sa_pir_by_comp_mask); + OSM_LOG_ENTER(sa->p_log, __osm_sa_pir_by_comp_mask); p_rcvd_rec = p_ctxt->p_rcvd_rec; comp_mask = p_ctxt->comp_mask; @@ -472,9 +433,9 @@ __osm_sa_pir_by_comp_mask(IN osm_pir_rcv_t * const p_rcv, /* Check that the p_physp is valid, and that the p_physp and the p_req_physp share a pkey. */ if (osm_physp_is_valid(p_physp) && - osm_physp_share_pkey(p_rcv->p_log, p_req_physp, + osm_physp_share_pkey(sa->p_log, p_req_physp, p_physp)) - __osm_sa_pir_check_physp(p_rcv, p_physp, + __osm_sa_pir_check_physp(sa, p_physp, p_ctxt); } } else { @@ -487,14 +448,14 @@ __osm_sa_pir_by_comp_mask(IN osm_pir_rcv_t * const p_rcv, /* if the requester and the p_physp don't share a pkey - continue */ if (!osm_physp_share_pkey - (p_rcv->p_log, p_req_physp, p_physp)) + (sa->p_log, p_req_physp, p_physp)) continue; - __osm_sa_pir_check_physp(p_rcv, p_physp, p_ctxt); + __osm_sa_pir_check_physp(sa, p_physp, p_ctxt); } } - OSM_LOG_EXIT(p_rcv->p_log); + OSM_LOG_EXIT(sa->p_log); } /********************************************************************** @@ -506,14 +467,14 @@ __osm_sa_pir_by_comp_mask_cb(IN cl_map_item_t * const p_map_item, const osm_node_t *const p_node = (osm_node_t *) p_map_item; osm_pir_search_ctxt_t *const p_ctxt = (osm_pir_search_ctxt_t *) context; - __osm_sa_pir_by_comp_mask(p_ctxt->p_rcv, p_node, p_ctxt); + __osm_sa_pir_by_comp_mask(p_ctxt->sa, p_node, p_ctxt); } /********************************************************************** **********************************************************************/ void osm_pir_rcv_process(IN void *ctx, IN void *data) { - osm_pir_rcv_t *p_rcv = ctx; + osm_sa_t *sa = ctx; osm_madw_t *p_madw = data; const ib_sa_mad_t *p_rcvd_mad; const ib_portinfo_record_t *p_rcvd_rec; @@ -536,9 +497,9 @@ void osm_pir_rcv_process(IN 
void *ctx, IN void *data) osm_physp_t *p_req_physp; boolean_t trusted_req = TRUE; - CL_ASSERT(p_rcv); + CL_ASSERT(sa); - OSM_LOG_ENTER(p_rcv->p_log, osm_pir_rcv_process); + OSM_LOG_ENTER(sa->p_log, osm_pir_rcv_process); CL_ASSERT(p_madw); @@ -552,32 +513,32 @@ void osm_pir_rcv_process(IN void *ctx, IN void *data) /* we only support SubnAdmGet and SubnAdmGetTable methods */ if ((p_rcvd_mad->method != IB_MAD_METHOD_GET) && (p_rcvd_mad->method != IB_MAD_METHOD_GETTABLE)) { - osm_log(p_rcv->p_log, OSM_LOG_ERROR, + osm_log(sa->p_log, OSM_LOG_ERROR, "osm_pir_rcv_process: ERR 2105: " "Unsupported Method (%s)\n", ib_get_sa_method_str(p_rcvd_mad->method)); - osm_sa_send_error(p_rcv->p_resp, p_madw, + osm_sa_send_error(sa, p_madw, IB_MAD_STATUS_UNSUP_METHOD_ATTR); goto Exit; } /* update the requester physical port. */ - p_req_physp = osm_get_physp_by_mad_addr(p_rcv->p_log, - p_rcv->p_subn, + p_req_physp = osm_get_physp_by_mad_addr(sa->p_log, + sa->p_subn, osm_madw_get_mad_addr_ptr (p_madw)); if (p_req_physp == NULL) { - osm_log(p_rcv->p_log, OSM_LOG_ERROR, + osm_log(sa->p_log, OSM_LOG_ERROR, "osm_pir_rcv_process: ERR 2104: " "Cannot find requester physical port\n"); goto Exit; } - if (osm_log_is_active(p_rcv->p_log, OSM_LOG_DEBUG)) - osm_dump_portinfo_record(p_rcv->p_log, p_rcvd_rec, + if (osm_log_is_active(sa->p_log, OSM_LOG_DEBUG)) + osm_dump_portinfo_record(sa->p_log, p_rcvd_rec, OSM_LOG_DEBUG); - p_tbl = &p_rcv->p_subn->port_lid_tbl; + p_tbl = &sa->p_subn->port_lid_tbl; p_pi = &p_rcvd_rec->port_info; cl_qlist_init(&rec_list); @@ -585,12 +546,12 @@ void osm_pir_rcv_process(IN void *ctx, IN void *data) context.p_rcvd_rec = p_rcvd_rec; context.p_list = &rec_list; context.comp_mask = p_rcvd_mad->comp_mask; - context.p_rcv = p_rcv; + context.sa = sa; context.p_req_physp = p_req_physp; context.is_enhanced_comp_mask = cl_ntoh32(p_rcvd_mad->attr_mod) & (1 << 31); - cl_plock_acquire(p_rcv->p_lock); + cl_plock_acquire(sa->p_lock); CL_ASSERT(cl_ptr_vector_get_size(p_tbl) < 
0x10000); @@ -600,11 +561,11 @@ void osm_pir_rcv_process(IN void *ctx, IN void *data) */ if (comp_mask & IB_PIR_COMPMASK_LID) { status = - osm_get_port_by_base_lid(p_rcv->p_subn, p_rcvd_rec->lid, + osm_get_port_by_base_lid(sa->p_subn, p_rcvd_rec->lid, &p_port); if ((status != IB_SUCCESS) || (p_port == NULL)) { status = IB_NOT_FOUND; - osm_log(p_rcv->p_log, OSM_LOG_ERROR, + osm_log(sa->p_log, OSM_LOG_ERROR, "osm_pir_rcv_process: ERR 2109: " "No port found with LID 0x%x\n", cl_ntoh16(p_rcvd_rec->lid)); @@ -616,7 +577,7 @@ void osm_pir_rcv_process(IN void *ctx, IN void *data) cl_ntoh16(p_pi->base_lid)); else { status = IB_NOT_FOUND; - osm_log(p_rcv->p_log, OSM_LOG_ERROR, + osm_log(sa->p_log, OSM_LOG_ERROR, "osm_pir_rcv_process: ERR 2103: " "Given LID (0x%X) is out of range:0x%X\n", cl_ntoh16(p_pi->base_lid), @@ -626,15 +587,15 @@ void osm_pir_rcv_process(IN void *ctx, IN void *data) if (status == IB_SUCCESS) { if (p_port) - __osm_sa_pir_by_comp_mask(p_rcv, p_port->p_node, + __osm_sa_pir_by_comp_mask(sa, p_port->p_node, &context); else - cl_qmap_apply_func(&p_rcv->p_subn->node_guid_tbl, + cl_qmap_apply_func(&sa->p_subn->node_guid_tbl, __osm_sa_pir_by_comp_mask_cb, &context); } - cl_plock_release(p_rcv->p_lock); + cl_plock_release(sa->p_lock); num_rec = cl_qlist_count(&rec_list); @@ -644,16 +605,16 @@ void osm_pir_rcv_process(IN void *ctx, IN void *data) */ if (p_rcvd_mad->method == IB_MAD_METHOD_GET) { if (num_rec == 0) { - osm_sa_send_error(p_rcv->p_resp, p_madw, + osm_sa_send_error(sa, p_madw, IB_SA_MAD_STATUS_NO_RECORDS); goto Exit; } if (num_rec > 1) { - osm_log(p_rcv->p_log, OSM_LOG_ERROR, + osm_log(sa->p_log, OSM_LOG_ERROR, "osm_pir_rcv_process: ERR 2108: " "Got more than one record for SubnAdmGet (%u)\n", num_rec); - osm_sa_send_error(p_rcv->p_resp, p_madw, + osm_sa_send_error(sa, p_madw, IB_SA_MAD_STATUS_TOO_MANY_RECORDS); /* need to set the mem free ... 
*/ @@ -676,7 +637,7 @@ void osm_pir_rcv_process(IN void *ctx, IN void *data) (MAD_BLOCK_SIZE - IB_SA_MAD_HDR_SIZE) / sizeof(ib_portinfo_record_t); if (trim_num_rec < num_rec) { - osm_log(p_rcv->p_log, OSM_LOG_VERBOSE, + osm_log(sa->p_log, OSM_LOG_VERBOSE, "osm_pir_rcv_process: " "Number of records:%u trimmed to:%u to fit in one MAD\n", num_rec, trim_num_rec); @@ -684,11 +645,11 @@ void osm_pir_rcv_process(IN void *ctx, IN void *data) } #endif - osm_log(p_rcv->p_log, OSM_LOG_DEBUG, + osm_log(sa->p_log, OSM_LOG_DEBUG, "osm_pir_rcv_process: " "Returning %u records\n", num_rec); if ((p_rcvd_mad->method == IB_MAD_METHOD_GET) && (num_rec == 0)) { - osm_sa_send_error(p_rcv->p_resp, p_madw, + osm_sa_send_error(sa, p_madw, IB_SA_MAD_STATUS_NO_RECORDS); goto Exit; } @@ -696,13 +657,13 @@ void osm_pir_rcv_process(IN void *ctx, IN void *data) /* * Get a MAD to reply. Address of Mad is in the received mad_wrapper */ - p_resp_madw = osm_mad_pool_get(p_rcv->p_mad_pool, + p_resp_madw = osm_mad_pool_get(sa->p_mad_pool, p_madw->h_bind, num_rec * sizeof(ib_portinfo_record_t) + IB_SA_MAD_HDR_SIZE, &p_madw->mad_addr); if (!p_resp_madw) { - osm_log(p_rcv->p_log, OSM_LOG_ERROR, + osm_log(sa->p_log, OSM_LOG_ERROR, "osm_pir_rcv_process: ERR 2106: " "osm_mad_pool_get failed\n"); @@ -712,7 +673,7 @@ void osm_pir_rcv_process(IN void *ctx, IN void *data) free(p_rec_item); } - osm_sa_send_error(p_rcv->p_resp, p_madw, + osm_sa_send_error(sa, p_madw, IB_SA_MAD_STATUS_NO_RESOURCES); goto Exit; @@ -777,9 +738,9 @@ void osm_pir_rcv_process(IN void *ctx, IN void *data) CL_ASSERT(cl_is_qlist_empty(&rec_list)); status = osm_sa_vendor_send(p_resp_madw->h_bind, p_resp_madw, FALSE, - p_rcv->p_subn); + sa->p_subn); if (status != IB_SUCCESS) { - osm_log(p_rcv->p_log, OSM_LOG_ERROR, + osm_log(sa->p_log, OSM_LOG_ERROR, "osm_pir_rcv_process: ERR 2107: " "osm_sa_vendor_send status = %s\n", ib_get_err_str(status)); @@ -787,5 +748,5 @@ void osm_pir_rcv_process(IN void *ctx, IN void *data) } Exit: - 
OSM_LOG_EXIT(p_rcv->p_log); + OSM_LOG_EXIT(sa->p_log); } diff --git a/opensm/opensm/osm_sa_response.c b/opensm/opensm/osm_sa_response.c index e47ac1d..d63fa70 100644 --- a/opensm/opensm/osm_sa_response.c +++ b/opensm/opensm/osm_sa_response.c @@ -1,5 +1,5 @@ /* - * Copyright (c) 2004-2006 Voltaire, Inc. All rights reserved. + * Copyright (c) 2004-2007 Voltaire, Inc. All rights reserved. * Copyright (c) 2002-2005 Mellanox Technologies LTD. All rights reserved. * Copyright (c) 1996-2003 Intel Corporation. All rights reserved. * @@ -52,51 +52,15 @@ #include #include #include -#include -#include #include +#include #include #include /********************************************************************** **********************************************************************/ -void osm_sa_resp_construct(IN osm_sa_resp_t * const p_resp) -{ - memset(p_resp, 0, sizeof(*p_resp)); -} - -/********************************************************************** - **********************************************************************/ -void osm_sa_resp_destroy(IN osm_sa_resp_t * const p_resp) -{ - CL_ASSERT(p_resp); -} - -/********************************************************************** - **********************************************************************/ -ib_api_status_t -osm_sa_resp_init(IN osm_sa_resp_t * const p_resp, - IN osm_mad_pool_t * const p_pool, - IN osm_subn_t * const p_subn, IN osm_log_t * const p_log) -{ - ib_api_status_t status = IB_SUCCESS; - - OSM_LOG_ENTER(p_log, osm_sa_resp_init); - - osm_sa_resp_construct(p_resp); - - p_resp->p_subn = p_subn; - p_resp->p_log = p_log; - p_resp->p_pool = p_pool; - - OSM_LOG_EXIT(p_log); - return (status); -} - -/********************************************************************** - **********************************************************************/ void -osm_sa_send_error(IN osm_sa_resp_t * const p_resp, +osm_sa_send_error(IN osm_sa_t * sa, IN const osm_madw_t * const p_madw, IN const ib_net16_t sa_status) { @@ 
-105,22 +69,22 @@ osm_sa_send_error(IN osm_sa_resp_t * const p_resp, ib_sa_mad_t *p_sa_mad; ib_api_status_t status; - OSM_LOG_ENTER(p_resp->p_log, osm_sa_send_error); + OSM_LOG_ENTER(sa->p_log, osm_sa_send_error); /* avoid races - if we are exiting - exit */ if (osm_exit_flag) { - osm_log(p_resp->p_log, OSM_LOG_DEBUG, + osm_log(sa->p_log, OSM_LOG_DEBUG, "osm_sa_send_error: " "Ignoring requested send after exit\n"); goto Exit; } - p_resp_madw = osm_mad_pool_get(p_resp->p_pool, + p_resp_madw = osm_mad_pool_get(sa->p_mad_pool, p_madw->h_bind, MAD_BLOCK_SIZE, &p_madw->mad_addr); if (p_resp_madw == NULL) { - osm_log(p_resp->p_log, OSM_LOG_ERROR, + osm_log(sa->p_log, OSM_LOG_ERROR, "osm_sa_send_error: ERR 2301: " "Unable to acquire response MAD\n"); goto Exit; @@ -150,20 +114,20 @@ osm_sa_send_error(IN osm_sa_resp_t * const p_resp, if (p_resp_sa_mad->attr_id == IB_MAD_ATTR_MULTIPATH_RECORD) p_resp_sa_mad->attr_id = IB_MAD_ATTR_PATH_RECORD; - if (osm_log_is_active(p_resp->p_log, OSM_LOG_FRAMES)) - osm_dump_sa_mad(p_resp->p_log, p_resp_sa_mad, OSM_LOG_FRAMES); + if (osm_log_is_active(sa->p_log, OSM_LOG_FRAMES)) + osm_dump_sa_mad(sa->p_log, p_resp_sa_mad, OSM_LOG_FRAMES); status = osm_sa_vendor_send(osm_madw_get_bind_handle(p_resp_madw), - p_resp_madw, FALSE, p_resp->p_subn); + p_resp_madw, FALSE, sa->p_subn); if (status != IB_SUCCESS) { - osm_log(p_resp->p_log, OSM_LOG_ERROR, + osm_log(sa->p_log, OSM_LOG_ERROR, "osm_sa_send_error: ERR 2302: " "Error sending MAD (%s)\n", ib_get_err_str(status)); - /* osm_mad_pool_put( p_resp->p_pool, p_resp_madw ); */ + /* osm_mad_pool_put( sa->p_mad_pool, p_resp_madw ); */ goto Exit; } Exit: - OSM_LOG_EXIT(p_resp->p_log); + OSM_LOG_EXIT(sa->p_log); } diff --git a/opensm/opensm/osm_sa_service_record.c b/opensm/opensm/osm_sa_service_record.c index fb0193e..19de389 100644 --- a/opensm/opensm/osm_sa_service_record.c +++ b/opensm/opensm/osm_sa_service_record.c @@ -55,12 +55,10 @@ #include #include #include -#include +#include #include #include 
#include -#include -#include #include #include #include @@ -75,7 +73,7 @@ typedef struct osm_sr_match_item { cl_qlist_t sr_list; ib_service_record_t *p_service_rec; ib_net64_t comp_mask; - osm_sr_rcv_t *p_rcv; + osm_sa_t *sa; } osm_sr_match_item_t; typedef struct _osm_sr_search_ctxt { @@ -85,53 +83,8 @@ typedef struct _osm_sr_search_ctxt { /********************************************************************** **********************************************************************/ -void osm_sr_rcv_construct(IN osm_sr_rcv_t * const p_rcv) -{ - memset(p_rcv, 0, sizeof(*p_rcv)); - cl_timer_construct(&p_rcv->sr_timer); -} - -/********************************************************************** - **********************************************************************/ -void osm_sr_rcv_destroy(IN osm_sr_rcv_t * const p_rcv) -{ - OSM_LOG_ENTER(p_rcv->p_log, osm_sr_rcv_destroy); - cl_timer_trim(&p_rcv->sr_timer, 1); - cl_timer_destroy(&p_rcv->sr_timer); - OSM_LOG_EXIT(p_rcv->p_log); -} - -/********************************************************************** - **********************************************************************/ -ib_api_status_t -osm_sr_rcv_init(IN osm_sr_rcv_t * const p_rcv, - IN osm_sa_resp_t * const p_resp, - IN osm_mad_pool_t * const p_mad_pool, - IN osm_subn_t * const p_subn, - IN osm_log_t * const p_log, IN cl_plock_t * const p_lock) -{ - ib_api_status_t status; - - OSM_LOG_ENTER(p_log, osm_sr_rcv_init); - - osm_sr_rcv_construct(p_rcv); - - p_rcv->p_log = p_log; - p_rcv->p_subn = p_subn; - p_rcv->p_lock = p_lock; - p_rcv->p_resp = p_resp; - p_rcv->p_mad_pool = p_mad_pool; - - status = cl_timer_init(&p_rcv->sr_timer, osm_sr_rcv_lease_cb, p_rcv); - - OSM_LOG_EXIT(p_rcv->p_log); - return (status); -} - -/********************************************************************** - **********************************************************************/ static boolean_t -__match_service_pkey_with_ports_pkey(IN osm_sr_rcv_t * const p_rcv, 
+__match_service_pkey_with_ports_pkey(IN osm_sa_t * sa, IN const osm_madw_t * const p_madw, ib_service_record_t * const p_service_rec, ib_net64_t const comp_mask) @@ -142,12 +95,12 @@ __match_service_pkey_with_ports_pkey(IN osm_sr_rcv_t * const p_rcv, osm_port_t *service_port; /* update the requester physical port. */ - p_req_physp = osm_get_physp_by_mad_addr(p_rcv->p_log, - p_rcv->p_subn, + p_req_physp = osm_get_physp_by_mad_addr(sa->p_log, + sa->p_subn, osm_madw_get_mad_addr_ptr (p_madw)); if (p_req_physp == NULL) { - osm_log(p_rcv->p_log, OSM_LOG_ERROR, + osm_log(sa->p_log, OSM_LOG_ERROR, "__match_service_pkey_with_ports_pkey: ERR 2404: " "Cannot find requester physical port\n"); valid = FALSE; @@ -159,7 +112,7 @@ __match_service_pkey_with_ports_pkey(IN osm_sr_rcv_t * const p_rcv, ServiceGid port (if such exists) */ /* Make sure it matches the p_req_physp */ if (!osm_physp_has_pkey - (p_rcv->p_log, p_service_rec->service_pkey, p_req_physp)) { + (sa->p_log, p_service_rec->service_pkey, p_req_physp)) { valid = FALSE; goto Exit; } @@ -169,9 +122,9 @@ __match_service_pkey_with_ports_pkey(IN osm_sr_rcv_t * const p_rcv, service_guid = p_service_rec->service_gid.unicast.interface_id; service_port = - osm_get_port_by_guid(p_rcv->p_subn, service_guid); + osm_get_port_by_guid(sa->p_subn, service_guid); if (!service_port) { - osm_log(p_rcv->p_log, OSM_LOG_ERROR, + osm_log(sa->p_log, OSM_LOG_ERROR, "__match_service_pkey_with_ports_pkey: ERR 2405: " "No port object for port 0x%016" PRIx64 "\n", cl_ntoh64(service_guid)); @@ -179,7 +132,7 @@ __match_service_pkey_with_ports_pkey(IN osm_sr_rcv_t * const p_rcv, goto Exit; } /* check on the table of the default physical port of the service port */ - if (!osm_physp_has_pkey(p_rcv->p_log, + if (!osm_physp_has_pkey(sa->p_log, p_service_rec->service_pkey, service_port->p_physp)) { valid = FALSE; @@ -195,12 +148,12 @@ __match_service_pkey_with_ports_pkey(IN osm_sr_rcv_t * const p_rcv, 
/********************************************************************** **********************************************************************/ static boolean_t -__match_name_to_key_association(IN osm_sr_rcv_t * const p_rcv, +__match_name_to_key_association(IN osm_sa_t * sa, ib_service_record_t * p_service_rec, ib_net64_t comp_mask) { UNUSED_PARAM(p_service_rec); - UNUSED_PARAM(p_rcv); + UNUSED_PARAM(sa); if ((comp_mask & (IB_SR_COMPMASK_SKEY | IB_SR_COMPMASK_SNAME)) == (IB_SR_COMPMASK_SKEY | IB_SR_COMPMASK_SNAME)) { @@ -216,36 +169,36 @@ __match_name_to_key_association(IN osm_sr_rcv_t * const p_rcv, /********************************************************************** **********************************************************************/ static boolean_t -__validate_sr(IN osm_sr_rcv_t * const p_rcv, IN const osm_madw_t * const p_madw) +__validate_sr(IN osm_sa_t * sa, IN const osm_madw_t * const p_madw) { boolean_t valid = TRUE; ib_sa_mad_t *p_sa_mad; ib_service_record_t *p_recvd_service_rec; - OSM_LOG_ENTER(p_rcv->p_log, __validate_sr); + OSM_LOG_ENTER(sa->p_log, __validate_sr); p_sa_mad = osm_madw_get_sa_mad_ptr(p_madw); p_recvd_service_rec = (ib_service_record_t *) ib_sa_mad_get_payload_ptr(p_sa_mad); - valid = __match_service_pkey_with_ports_pkey(p_rcv, + valid = __match_service_pkey_with_ports_pkey(sa, p_madw, p_recvd_service_rec, p_sa_mad->comp_mask); if (!valid) { - osm_log(p_rcv->p_log, OSM_LOG_DEBUG, + osm_log(sa->p_log, OSM_LOG_DEBUG, "__validate_sr: " "No Match for Service Pkey\n"); valid = FALSE; goto Exit; } - valid = __match_name_to_key_association(p_rcv, + valid = __match_name_to_key_association(sa, p_recvd_service_rec, p_sa_mad->comp_mask); if (!valid) { - osm_log(p_rcv->p_log, OSM_LOG_DEBUG, + osm_log(sa->p_log, OSM_LOG_DEBUG, "__validate_sr: " "Service Record Name to key matching failed\n"); valid = FALSE; @@ -253,14 +206,14 @@ __validate_sr(IN osm_sr_rcv_t * const p_rcv, IN const osm_madw_t * const p_madw) } Exit: - OSM_LOG_EXIT(p_rcv->p_log); 
+ OSM_LOG_EXIT(sa->p_log); return valid; } /********************************************************************** **********************************************************************/ static void -__osm_sr_rcv_respond(IN osm_sr_rcv_t * const p_rcv, +__osm_sr_rcv_respond(IN osm_sa_t * sa, IN const osm_madw_t * const p_madw, IN cl_qlist_t * const p_list) { @@ -277,7 +230,7 @@ __osm_sr_rcv_respond(IN osm_sr_rcv_t * const p_rcv, const ib_sa_mad_t *p_rcvd_mad = osm_madw_get_sa_mad_ptr(p_madw); boolean_t trusted_req = TRUE; - OSM_LOG_ENTER(p_rcv->p_log, __osm_sr_rcv_respond); + OSM_LOG_ENTER(sa->p_log, __osm_sr_rcv_respond); num_rec = cl_qlist_count(p_list); @@ -286,11 +239,11 @@ __osm_sr_rcv_respond(IN osm_sr_rcv_t * const p_rcv, * If we do a SubnAdmGet and got more than one record it is an error ! */ if ((p_rcvd_mad->method == IB_MAD_METHOD_GET) && (num_rec > 1)) { - osm_log(p_rcv->p_log, OSM_LOG_ERROR, + osm_log(sa->p_log, OSM_LOG_ERROR, "__osm_sr_rcv_respond: ERR 2406: " "Got more than one record for SubnAdmGet (%u).\n", num_rec); - osm_sa_send_error(p_rcv->p_resp, p_madw, + osm_sa_send_error(sa, p_madw, IB_SA_MAD_STATUS_TOO_MANY_RECORDS); /* need to set the mem free ... 
*/ @@ -307,7 +260,7 @@ __osm_sr_rcv_respond(IN osm_sr_rcv_t * const p_rcv, trim_num_rec = (MAD_BLOCK_SIZE - IB_SA_MAD_HDR_SIZE) / sizeof(ib_service_record_t); if (trim_num_rec < num_rec) { - osm_log(p_rcv->p_log, OSM_LOG_VERBOSE, + osm_log(sa->p_log, OSM_LOG_VERBOSE, "__osm_sr_rcv_respond: " "Number of records:%u trimmed to:%u to fit in one MAD\n", num_rec, trim_num_rec); @@ -315,8 +268,8 @@ __osm_sr_rcv_respond(IN osm_sr_rcv_t * const p_rcv, } #endif - if (osm_log_is_active(p_rcv->p_log, OSM_LOG_DEBUG)) { - osm_log(p_rcv->p_log, OSM_LOG_DEBUG, + if (osm_log_is_active(sa->p_log, OSM_LOG_DEBUG)) { + osm_log(sa->p_log, OSM_LOG_DEBUG, "__osm_sr_rcv_respond: " "Generating response with %u records\n", num_rec); } @@ -324,12 +277,12 @@ __osm_sr_rcv_respond(IN osm_sr_rcv_t * const p_rcv, /* Get a MAD to reply. Address of Mad is in the received mad_wrapper */ - p_resp_madw = osm_mad_pool_get(p_rcv->p_mad_pool, + p_resp_madw = osm_mad_pool_get(sa->p_mad_pool, p_madw->h_bind, num_rec * sizeof(ib_service_record_t) + IB_SA_MAD_HDR_SIZE, &p_madw->mad_addr); if (!p_resp_madw) { - osm_log(p_rcv->p_log, OSM_LOG_ERROR, + osm_log(sa->p_log, OSM_LOG_ERROR, "__osm_sr_rcv_respond: ERR 2402: " "Unable to allocate MAD\n"); /* Release the quick pool items */ @@ -417,18 +370,18 @@ __osm_sr_rcv_respond(IN osm_sr_rcv_t * const p_rcv, } status = osm_sa_vendor_send(p_resp_madw->h_bind, p_resp_madw, FALSE, - p_rcv->p_subn); + sa->p_subn); if (status != IB_SUCCESS) { - osm_log(p_rcv->p_log, OSM_LOG_ERROR, + osm_log(sa->p_log, OSM_LOG_ERROR, "__osm_sr_rcv_respond: ERR 2407: " "Unable to send MAD (%s)\n", ib_get_err_str(status)); - /* osm_mad_pool_put( p_rcv->p_mad_pool, p_resp_madw ); */ + /* osm_mad_pool_put( sa->p_mad_pool, p_resp_madw ); */ goto Exit; } Exit: - OSM_LOG_EXIT(p_rcv->p_log); + OSM_LOG_EXIT(sa->p_log); } /********************************************************************** @@ -635,10 +588,10 @@ __get_matching_sr(IN cl_list_item_t * const p_list_item, IN void *context) If not - 
then it cannot receive this ServiceRecord. */ /* The check is relevant only if the service_pkey is valid */ if (!ib_pkey_is_invalid(p_svcr->service_record.service_pkey)) { - if (!osm_physp_has_pkey(p_sr_item->p_rcv->p_log, + if (!osm_physp_has_pkey(p_sr_item->sa->p_log, p_svcr->service_record.service_pkey, p_req_physp)) { - osm_log(p_sr_item->p_rcv->p_log, OSM_LOG_VERBOSE, + osm_log(p_sr_item->sa->p_log, OSM_LOG_VERBOSE, "__get_matching_sr: " "requester port doesn't have the service_pkey: 0x%X\n", cl_ntoh16(p_svcr->service_record.service_pkey)); @@ -648,7 +601,7 @@ __get_matching_sr(IN cl_list_item_t * const p_list_item, IN void *context) p_sr_pool_item = malloc(sizeof(*p_sr_pool_item)); if (p_sr_pool_item == NULL) { - osm_log(p_sr_item->p_rcv->p_log, OSM_LOG_ERROR, + osm_log(p_sr_item->sa->p_log, OSM_LOG_ERROR, "__get_matching_sr: ERR 2408: " "Unable to acquire Service Record from pool\n"); goto Exit; @@ -665,7 +618,7 @@ __get_matching_sr(IN cl_list_item_t * const p_list_item, IN void *context) /********************************************************************** **********************************************************************/ static void -osm_sr_rcv_process_get_method(IN osm_sr_rcv_t * const p_rcv, +osm_sr_rcv_process_get_method(IN osm_sa_t * sa, IN const osm_madw_t * const p_madw) { ib_sa_mad_t *p_sa_mad; @@ -674,17 +627,17 @@ osm_sr_rcv_process_get_method(IN osm_sr_rcv_t * const p_rcv, osm_sr_search_ctxt_t context; osm_physp_t *p_req_physp; - OSM_LOG_ENTER(p_rcv->p_log, osm_sr_rcv_process_get_method); + OSM_LOG_ENTER(sa->p_log, osm_sr_rcv_process_get_method); CL_ASSERT(p_madw); /* update the requester physical port. 
*/ - p_req_physp = osm_get_physp_by_mad_addr(p_rcv->p_log, - p_rcv->p_subn, + p_req_physp = osm_get_physp_by_mad_addr(sa->p_log, + sa->p_subn, osm_madw_get_mad_addr_ptr (p_madw)); if (p_req_physp == NULL) { - osm_log(p_rcv->p_log, OSM_LOG_ERROR, + osm_log(sa->p_log, OSM_LOG_ERROR, "osm_sr_rcv_process_get_method: ERR 2409: " "Cannot find requester physical port\n"); goto Exit; @@ -694,49 +647,49 @@ osm_sr_rcv_process_get_method(IN osm_sr_rcv_t * const p_rcv, p_recvd_service_rec = (ib_service_record_t *) ib_sa_mad_get_payload_ptr(p_sa_mad); - if (osm_log_is_active(p_rcv->p_log, OSM_LOG_DEBUG)) { - osm_dump_service_record(p_rcv->p_log, + if (osm_log_is_active(sa->p_log, OSM_LOG_DEBUG)) { + osm_dump_service_record(sa->p_log, p_recvd_service_rec, OSM_LOG_DEBUG); } cl_qlist_init(&sr_match_item.sr_list); sr_match_item.p_service_rec = p_recvd_service_rec; sr_match_item.comp_mask = p_sa_mad->comp_mask; - sr_match_item.p_rcv = p_rcv; + sr_match_item.sa = sa; context.p_sr_item = &sr_match_item; context.p_req_physp = p_req_physp; /* Grab the lock */ - cl_plock_excl_acquire(p_rcv->p_lock); + cl_plock_excl_acquire(sa->p_lock); - cl_qlist_apply_func(&p_rcv->p_subn->sa_sr_list, + cl_qlist_apply_func(&sa->p_subn->sa_sr_list, __get_matching_sr, &context); - cl_plock_release(p_rcv->p_lock); + cl_plock_release(sa->p_lock); if ((p_sa_mad->method == IB_MAD_METHOD_GET) && (cl_qlist_count(&sr_match_item.sr_list) == 0)) { - osm_log(p_rcv->p_log, OSM_LOG_DEBUG, + osm_log(sa->p_log, OSM_LOG_DEBUG, "osm_sr_rcv_process_get_method: " "No records matched the Service Record query\n"); - osm_sa_send_error(p_rcv->p_resp, p_madw, + osm_sa_send_error(sa, p_madw, IB_SA_MAD_STATUS_NO_RECORDS); goto Exit; } - __osm_sr_rcv_respond(p_rcv, p_madw, &sr_match_item.sr_list); + __osm_sr_rcv_respond(sa, p_madw, &sr_match_item.sr_list); Exit: - OSM_LOG_EXIT(p_rcv->p_log); + OSM_LOG_EXIT(sa->p_log); return; } /********************************************************************** 
**********************************************************************/ static void -osm_sr_rcv_process_set_method(IN osm_sr_rcv_t * const p_rcv, +osm_sr_rcv_process_set_method(IN osm_sa_t * sa, IN const osm_madw_t * const p_madw) { ib_sa_mad_t *p_sa_mad; @@ -747,7 +700,7 @@ osm_sr_rcv_process_set_method(IN osm_sr_rcv_t * const p_rcv, osm_sr_item_t *p_sr_item; cl_qlist_t sr_list; - OSM_LOG_ENTER(p_rcv->p_log, osm_sr_rcv_process_set_method); + OSM_LOG_ENTER(sa->p_log, osm_sr_rcv_process_set_method); CL_ASSERT(p_madw); @@ -757,78 +710,78 @@ osm_sr_rcv_process_set_method(IN osm_sr_rcv_t * const p_rcv, comp_mask = p_sa_mad->comp_mask; - if (osm_log_is_active(p_rcv->p_log, OSM_LOG_DEBUG)) { - osm_dump_service_record(p_rcv->p_log, + if (osm_log_is_active(sa->p_log, OSM_LOG_DEBUG)) { + osm_dump_service_record(sa->p_log, p_recvd_service_rec, OSM_LOG_DEBUG); } if ((comp_mask & (IB_SR_COMPMASK_SID | IB_SR_COMPMASK_SGID)) != (IB_SR_COMPMASK_SID | IB_SR_COMPMASK_SGID)) { - osm_log(p_rcv->p_log, OSM_LOG_VERBOSE, + osm_log(sa->p_log, OSM_LOG_VERBOSE, "osm_sr_rcv_process_set_method: " "Component Mask RID check failed for METHOD_SET\n"); - osm_sa_send_error(p_rcv->p_resp, p_madw, sa_status); + osm_sa_send_error(sa, p_madw, sa_status); goto Exit; } /* if we were not provided with a service lease make it infinite */ if ((comp_mask & IB_SR_COMPMASK_SLEASE) != IB_SR_COMPMASK_SLEASE) { - osm_log(p_rcv->p_log, OSM_LOG_DEBUG, + osm_log(sa->p_log, OSM_LOG_DEBUG, "osm_sr_rcv_process_set_method: " "ServiceLease Component Mask not set - using infinite lease\n"); p_recvd_service_rec->service_lease = 0xFFFFFFFF; } /* Grab the lock */ - cl_plock_excl_acquire(p_rcv->p_lock); + cl_plock_excl_acquire(sa->p_lock); /* If Record exists with matching RID */ - p_svcr = osm_svcr_get_by_rid(p_rcv->p_subn, - p_rcv->p_log, p_recvd_service_rec); + p_svcr = osm_svcr_get_by_rid(sa->p_subn, + sa->p_log, p_recvd_service_rec); if (p_svcr == NULL) { /* Create the instance of the osm_svcr_t object */ p_svcr = 
osm_svcr_new(p_recvd_service_rec); if (p_svcr == NULL) { - cl_plock_release(p_rcv->p_lock); + cl_plock_release(sa->p_lock); - osm_log(p_rcv->p_log, OSM_LOG_ERROR, + osm_log(sa->p_log, OSM_LOG_ERROR, "osm_sr_rcv_process_set_method: ERR 2411: " "osm_svcr_get_by_rid failed\n"); - osm_sa_send_error(p_rcv->p_resp, p_madw, + osm_sa_send_error(sa, p_madw, IB_SA_MAD_STATUS_NO_RESOURCES); goto Exit; } /* Add this new osm_svcr_t object to subnet object */ - osm_svcr_insert_to_db(p_rcv->p_subn, p_rcv->p_log, p_svcr); + osm_svcr_insert_to_db(sa->p_subn, sa->p_log, p_svcr); } else { /* Update the old instance of the osm_svcr_t object */ osm_svcr_init(p_svcr, p_recvd_service_rec); } - cl_plock_release(p_rcv->p_lock); + cl_plock_release(sa->p_lock); if (p_recvd_service_rec->service_lease != 0xFFFFFFFF) { #if 0 - cl_timer_trim(&p_rcv->sr_timer, + cl_timer_trim(&sa->sr_timer, p_recvd_service_rec->service_lease * 1000); #endif /* This was a bug since no check was made to see if too long */ /* just make sure the timer works - get a call back within a second */ - cl_timer_trim(&p_rcv->sr_timer, 1000); + cl_timer_trim(&sa->sr_timer, 1000); p_svcr->modified_time = cl_get_time_stamp_sec(); } p_sr_item = malloc(sizeof(*p_sr_item)); if (p_sr_item == NULL) { - osm_log(p_rcv->p_log, OSM_LOG_ERROR, + osm_log(sa->p_log, OSM_LOG_ERROR, "osm_sr_rcv_process_set_method: ERR 2412: " "Unable to acquire Service record\n"); - osm_sa_send_error(p_rcv->p_resp, p_madw, + osm_sa_send_error(sa, p_madw, IB_SA_MAD_STATUS_NO_RESOURCES); goto Exit; } @@ -843,17 +796,17 @@ osm_sr_rcv_process_set_method(IN osm_sr_rcv_t * const p_rcv, cl_qlist_insert_tail(&sr_list, &p_sr_item->list_item); - __osm_sr_rcv_respond(p_rcv, p_madw, &sr_list); + __osm_sr_rcv_respond(sa, p_madw, &sr_list); Exit: - OSM_LOG_EXIT(p_rcv->p_log); + OSM_LOG_EXIT(sa->p_log); return; } /********************************************************************** **********************************************************************/ static void 
-osm_sr_rcv_process_delete_method(IN osm_sr_rcv_t * const p_rcv, +osm_sr_rcv_process_delete_method(IN osm_sa_t * sa, IN const osm_madw_t * const p_madw) { ib_sa_mad_t *p_sa_mad; @@ -863,7 +816,7 @@ osm_sr_rcv_process_delete_method(IN osm_sr_rcv_t * const p_rcv, osm_sr_item_t *p_sr_item; cl_qlist_t sr_list; - OSM_LOG_ENTER(p_rcv->p_log, osm_sr_rcv_process_delete_method); + OSM_LOG_ENTER(sa->p_log, osm_sr_rcv_process_delete_method); CL_ASSERT(p_madw); @@ -873,38 +826,38 @@ osm_sr_rcv_process_delete_method(IN osm_sr_rcv_t * const p_rcv, comp_mask = p_sa_mad->comp_mask; - if (osm_log_is_active(p_rcv->p_log, OSM_LOG_DEBUG)) { - osm_dump_service_record(p_rcv->p_log, + if (osm_log_is_active(sa->p_log, OSM_LOG_DEBUG)) { + osm_dump_service_record(sa->p_log, p_recvd_service_rec, OSM_LOG_DEBUG); } /* Grab the lock */ - cl_plock_excl_acquire(p_rcv->p_lock); + cl_plock_excl_acquire(sa->p_lock); /* If Record exists with matching RID */ - p_svcr = osm_svcr_get_by_rid(p_rcv->p_subn, - p_rcv->p_log, p_recvd_service_rec); + p_svcr = osm_svcr_get_by_rid(sa->p_subn, + sa->p_log, p_recvd_service_rec); if (p_svcr == NULL) { - cl_plock_release(p_rcv->p_lock); - osm_log(p_rcv->p_log, OSM_LOG_DEBUG, + cl_plock_release(sa->p_lock); + osm_log(sa->p_log, OSM_LOG_DEBUG, "osm_sr_rcv_process_delete_method: " "No records matched the RID\n"); - osm_sa_send_error(p_rcv->p_resp, p_madw, + osm_sa_send_error(sa, p_madw, IB_SA_MAD_STATUS_NO_RECORDS); goto Exit; } else { - osm_svcr_remove_from_db(p_rcv->p_subn, p_rcv->p_log, p_svcr); + osm_svcr_remove_from_db(sa->p_subn, sa->p_log, p_svcr); } - cl_plock_release(p_rcv->p_lock); + cl_plock_release(sa->p_lock); p_sr_item = malloc(sizeof(*p_sr_item)); if (p_sr_item == NULL) { - osm_log(p_rcv->p_log, OSM_LOG_ERROR, + osm_log(sa->p_log, OSM_LOG_ERROR, "osm_sr_rcv_process_delete_method: ERR 2413: " "Unable to acquire Service record\n"); - osm_sa_send_error(p_rcv->p_resp, p_madw, + osm_sa_send_error(sa, p_madw, IB_SA_MAD_STATUS_NO_RESOURCES); goto Exit; } @@ 
-918,10 +871,10 @@ osm_sr_rcv_process_delete_method(IN osm_sr_rcv_t * const p_rcv, if (p_svcr) osm_svcr_delete(p_svcr); - __osm_sr_rcv_respond(p_rcv, p_madw, &sr_list); + __osm_sr_rcv_respond(sa, p_madw, &sr_list); Exit: - OSM_LOG_EXIT(p_rcv->p_log); + OSM_LOG_EXIT(sa->p_log); return; } @@ -929,13 +882,13 @@ osm_sr_rcv_process_delete_method(IN osm_sr_rcv_t * const p_rcv, **********************************************************************/ void osm_sr_rcv_process(IN void *context, IN void *data) { - osm_sr_rcv_t *p_rcv = context; + osm_sa_t *sa = context; osm_madw_t *p_madw = data; ib_sa_mad_t *p_sa_mad; ib_net16_t sa_status = IB_SA_MAD_STATUS_REQ_INVALID; boolean_t valid; - OSM_LOG_ENTER(p_rcv->p_log, osm_sr_rcv_process); + OSM_LOG_ENTER(sa->p_log, osm_sr_rcv_process); CL_ASSERT(p_madw); @@ -945,43 +898,43 @@ void osm_sr_rcv_process(IN void *context, IN void *data) switch (p_sa_mad->method) { case IB_MAD_METHOD_SET: - valid = __validate_sr(p_rcv, p_madw); + valid = __validate_sr(sa, p_madw); if (!valid) { - osm_log(p_rcv->p_log, OSM_LOG_VERBOSE, + osm_log(sa->p_log, OSM_LOG_VERBOSE, "osm_sr_rcv_process: " "Component Mask check failed for set request\n"); - osm_sa_send_error(p_rcv->p_resp, p_madw, sa_status); + osm_sa_send_error(sa, p_madw, sa_status); goto Exit; } - osm_sr_rcv_process_set_method(p_rcv, p_madw); + osm_sr_rcv_process_set_method(sa, p_madw); break; case IB_MAD_METHOD_DELETE: - valid = __validate_sr(p_rcv, p_madw); + valid = __validate_sr(sa, p_madw); if (!valid) { - osm_log(p_rcv->p_log, OSM_LOG_DEBUG, + osm_log(sa->p_log, OSM_LOG_DEBUG, "osm_sr_rcv_process: " "Component Mask check failed for delete request\n"); - osm_sa_send_error(p_rcv->p_resp, p_madw, sa_status); + osm_sa_send_error(sa, p_madw, sa_status); goto Exit; } - osm_sr_rcv_process_delete_method(p_rcv, p_madw); + osm_sr_rcv_process_delete_method(sa, p_madw); break; case IB_MAD_METHOD_GET: case IB_MAD_METHOD_GETTABLE: - osm_sr_rcv_process_get_method(p_rcv, p_madw); + 
osm_sr_rcv_process_get_method(sa, p_madw); break; default: - osm_log(p_rcv->p_log, OSM_LOG_DEBUG, + osm_log(sa->p_log, OSM_LOG_DEBUG, "osm_sr_rcv_process: " "Unsupported Method (%s)\n", ib_get_sa_method_str(p_sa_mad->method)); - osm_sa_send_error(p_rcv->p_resp, p_madw, + osm_sa_send_error(sa, p_madw, IB_MAD_STATUS_UNSUP_METHOD_ATTR); break; } Exit: - OSM_LOG_EXIT(p_rcv->p_log); + OSM_LOG_EXIT(sa->p_log); return; } @@ -989,7 +942,7 @@ void osm_sr_rcv_process(IN void *context, IN void *data) **********************************************************************/ void osm_sr_rcv_lease_cb(IN void *context) { - osm_sr_rcv_t *p_rcv = (osm_sr_rcv_t *) context; + osm_sa_t *sa = context; cl_list_item_t *p_list_item; cl_list_item_t *p_next_list_item; osm_svcr_t *p_svcr; @@ -997,13 +950,13 @@ void osm_sr_rcv_lease_cb(IN void *context) uint32_t elapsed_time; uint32_t trim_time = 20; /* maxiaml timer refresh is 20 seconds */ - OSM_LOG_ENTER(p_rcv->p_log, osm_sr_rcv_lease_cb); + OSM_LOG_ENTER(sa->p_log, osm_sr_rcv_lease_cb); - cl_plock_excl_acquire(p_rcv->p_lock); + cl_plock_excl_acquire(sa->p_lock); - p_list_item = cl_qlist_head(&p_rcv->p_subn->sa_sr_list); + p_list_item = cl_qlist_head(&sa->p_subn->sa_sr_list); - while (p_list_item != cl_qlist_end(&p_rcv->p_subn->sa_sr_list)) { + while (p_list_item != cl_qlist_end(&sa->p_subn->sa_sr_list)) { p_svcr = (osm_svcr_t *) p_list_item; if (p_svcr->service_record.service_lease == 0xFFFFFFFF) { @@ -1027,7 +980,7 @@ void osm_sr_rcv_lease_cb(IN void *context) */ p_svcr->lease_period -= elapsed_time; - osm_log(p_rcv->p_log, OSM_LOG_DEBUG, + osm_log(sa->p_log, OSM_LOG_DEBUG, "osm_sr_rcv_lease_cb: " "Remaining time for Service Name:%s is:0x%X\n", p_svcr->service_record.service_name, @@ -1049,8 +1002,8 @@ void osm_sr_rcv_lease_cb(IN void *context) p_next_list_item = cl_qlist_next(p_list_item); /* Remove the service Record */ - osm_svcr_remove_from_db(p_rcv->p_subn, - p_rcv->p_log, p_svcr); + osm_svcr_remove_from_db(sa->p_subn, + sa->p_log, 
p_svcr); osm_svcr_delete(p_svcr); @@ -1060,11 +1013,11 @@ void osm_sr_rcv_lease_cb(IN void *context) } /* Release the Lock */ - cl_plock_release(p_rcv->p_lock); + cl_plock_release(sa->p_lock); if (trim_time != 0xFFFFFFFF) { - cl_timer_trim(&p_rcv->sr_timer, trim_time * 1000); /* Convert to milli seconds */ + cl_timer_trim(&sa->sr_timer, trim_time * 1000); /* Convert to milli seconds */ } - OSM_LOG_EXIT(p_rcv->p_log); + OSM_LOG_EXIT(sa->p_log); } diff --git a/opensm/opensm/osm_sa_slvl_record.c b/opensm/opensm/osm_sa_slvl_record.c index fd48296..87891b0 100644 --- a/opensm/opensm/osm_sa_slvl_record.c +++ b/opensm/opensm/osm_sa_slvl_record.c @@ -55,10 +55,9 @@ #include #include #include -#include +#include #include #include -#include #include #include #include @@ -73,52 +72,14 @@ typedef struct _osm_slvl_search_ctxt { ib_net64_t comp_mask; uint8_t in_port_num; cl_qlist_t *p_list; - osm_slvl_rec_rcv_t *p_rcv; + osm_sa_t *sa; const osm_physp_t *p_req_physp; } osm_slvl_search_ctxt_t; /********************************************************************** **********************************************************************/ -void osm_slvl_rec_rcv_construct(IN osm_slvl_rec_rcv_t * const p_rcv) -{ - memset(p_rcv, 0, sizeof(*p_rcv)); -} - -/********************************************************************** - **********************************************************************/ -void osm_slvl_rec_rcv_destroy(IN osm_slvl_rec_rcv_t * const p_rcv) -{ - OSM_LOG_ENTER(p_rcv->p_log, osm_slvl_rec_rcv_destroy); - OSM_LOG_EXIT(p_rcv->p_log); -} - -/********************************************************************** - **********************************************************************/ -ib_api_status_t -osm_slvl_rec_rcv_init(IN osm_slvl_rec_rcv_t * const p_rcv, - IN osm_sa_resp_t * const p_resp, - IN osm_mad_pool_t * const p_mad_pool, - IN osm_subn_t * const p_subn, - IN osm_log_t * const p_log, IN cl_plock_t * const p_lock) -{ - OSM_LOG_ENTER(p_log, 
osm_slvl_rec_rcv_init); - - osm_slvl_rec_rcv_construct(p_rcv); - - p_rcv->p_log = p_log; - p_rcv->p_subn = p_subn; - p_rcv->p_lock = p_lock; - p_rcv->p_resp = p_resp; - p_rcv->p_mad_pool = p_mad_pool; - - OSM_LOG_EXIT(p_log); - return IB_SUCCESS; -} - -/********************************************************************** - **********************************************************************/ static void -__osm_sa_slvl_create(IN osm_slvl_rec_rcv_t * const p_rcv, +__osm_sa_slvl_create(IN osm_sa_t * sa, IN const osm_physp_t * const p_physp, IN osm_slvl_search_ctxt_t * const p_ctxt, IN uint8_t in_port_idx) @@ -127,11 +88,11 @@ __osm_sa_slvl_create(IN osm_slvl_rec_rcv_t * const p_rcv, uint16_t lid; ib_api_status_t status = IB_SUCCESS; - OSM_LOG_ENTER(p_rcv->p_log, __osm_sa_slvl_create); + OSM_LOG_ENTER(sa->p_log, __osm_sa_slvl_create); p_rec_item = malloc(sizeof(*p_rec_item)); if (p_rec_item == NULL) { - osm_log(p_rcv->p_log, OSM_LOG_ERROR, + osm_log(sa->p_log, OSM_LOG_ERROR, "__osm_sa_slvl_create: ERR 2602: " "rec_item alloc failed\n"); status = IB_INSUFFICIENT_RESOURCES; @@ -143,8 +104,8 @@ __osm_sa_slvl_create(IN osm_slvl_rec_rcv_t * const p_rcv, else lid = osm_node_get_base_lid(p_physp->p_node, 0); - if (osm_log_is_active(p_rcv->p_log, OSM_LOG_DEBUG)) - osm_log(p_rcv->p_log, OSM_LOG_DEBUG, + if (osm_log_is_active(sa->p_log, OSM_LOG_DEBUG)) + osm_log(sa->p_log, OSM_LOG_DEBUG, "__osm_sa_slvl_create: " "New SLtoVL Map for: OUT port 0x%016" PRIx64 ", lid 0x%X, port 0x%X to In Port:%u\n", @@ -163,13 +124,13 @@ __osm_sa_slvl_create(IN osm_slvl_rec_rcv_t * const p_rcv, cl_qlist_insert_tail(p_ctxt->p_list, &p_rec_item->list_item); Exit: - OSM_LOG_EXIT(p_rcv->p_log); + OSM_LOG_EXIT(sa->p_log); } /********************************************************************** **********************************************************************/ static void -__osm_sa_slvl_by_comp_mask(IN osm_slvl_rec_rcv_t * const p_rcv, +__osm_sa_slvl_by_comp_mask(IN osm_sa_t * sa, IN const 
osm_port_t * const p_port, osm_slvl_search_ctxt_t * const p_ctxt) { @@ -182,7 +143,7 @@ __osm_sa_slvl_by_comp_mask(IN osm_slvl_rec_rcv_t * const p_rcv, uint8_t out_port_start, out_port_end; const osm_physp_t *p_req_physp; - OSM_LOG_ENTER(p_rcv->p_log, __osm_sa_slvl_by_comp_mask); + OSM_LOG_ENTER(sa->p_log, __osm_sa_slvl_by_comp_mask); p_rcvd_rec = p_ctxt->p_rcvd_rec; comp_mask = p_ctxt->comp_mask; @@ -194,15 +155,15 @@ __osm_sa_slvl_by_comp_mask(IN osm_slvl_rec_rcv_t * const p_rcv, p_req_physp = p_ctxt->p_req_physp; if (p_port->p_node->node_info.node_type != IB_NODE_TYPE_SWITCH) { - osm_log(p_rcv->p_log, OSM_LOG_DEBUG, + osm_log(sa->p_log, OSM_LOG_DEBUG, "__osm_sa_slvl_by_comp_mask: " "Using Physical Default Port Number: 0x%X (for End Node)\n", p_port->p_physp->port_num); p_out_physp = p_port->p_physp; /* check that the p_out_physp and the p_req_physp share a pkey */ if (osm_physp_share_pkey - (p_rcv->p_log, p_req_physp, p_out_physp)) - __osm_sa_slvl_create(p_rcv, p_out_physp, p_ctxt, 0); + (sa->p_log, p_req_physp, p_out_physp)) + __osm_sa_slvl_create(sa, p_out_physp, p_ctxt, 0); } else { if (comp_mask & IB_SLVL_COMPMASK_OUT_PORT) out_port_start = out_port_end = @@ -234,15 +195,15 @@ __osm_sa_slvl_by_comp_mask(IN osm_slvl_rec_rcv_t * const p_rcv, /* if the requester and the p_out_physp don't share a pkey - continue */ if (!osm_physp_share_pkey - (p_rcv->p_log, p_req_physp, p_out_physp)) + (sa->p_log, p_req_physp, p_out_physp)) continue; - __osm_sa_slvl_create(p_rcv, p_out_physp, p_ctxt, + __osm_sa_slvl_create(sa, p_out_physp, p_ctxt, in_port_num); } } } - OSM_LOG_EXIT(p_rcv->p_log); + OSM_LOG_EXIT(sa->p_log); } /********************************************************************** @@ -255,14 +216,14 @@ __osm_sa_slvl_by_comp_mask_cb(IN cl_map_item_t * const p_map_item, osm_slvl_search_ctxt_t *const p_ctxt = (osm_slvl_search_ctxt_t *) context; - __osm_sa_slvl_by_comp_mask(p_ctxt->p_rcv, p_port, p_ctxt); + __osm_sa_slvl_by_comp_mask(p_ctxt->sa, p_port, p_ctxt); } 
/********************************************************************** **********************************************************************/ void osm_slvl_rec_rcv_process(IN void *ctx, IN void *data) { - osm_slvl_rec_rcv_t *p_rcv = ctx; + osm_sa_t *sa = ctx; osm_madw_t *p_madw = data; const ib_sa_mad_t *p_rcvd_mad; const ib_slvl_table_record_t *p_rcvd_rec; @@ -284,9 +245,9 @@ void osm_slvl_rec_rcv_process(IN void *ctx, IN void *data) ib_net64_t comp_mask; osm_physp_t *p_req_physp; - CL_ASSERT(p_rcv); + CL_ASSERT(sa); - OSM_LOG_ENTER(p_rcv->p_log, osm_slvl_rec_rcv_process); + OSM_LOG_ENTER(sa->p_log, osm_slvl_rec_rcv_process); CL_ASSERT(p_madw); @@ -300,22 +261,22 @@ void osm_slvl_rec_rcv_process(IN void *ctx, IN void *data) /* we only support SubnAdmGet and SubnAdmGetTable methods */ if ((p_rcvd_mad->method != IB_MAD_METHOD_GET) && (p_rcvd_mad->method != IB_MAD_METHOD_GETTABLE)) { - osm_log(p_rcv->p_log, OSM_LOG_ERROR, + osm_log(sa->p_log, OSM_LOG_ERROR, "osm_slvl_rec_rcv_process: ERR 2604: " "Unsupported Method (%s)\n", ib_get_sa_method_str(p_rcvd_mad->method)); - osm_sa_send_error(p_rcv->p_resp, p_madw, + osm_sa_send_error(sa, p_madw, IB_MAD_STATUS_UNSUP_METHOD_ATTR); goto Exit; } /* update the requester physical port. 
*/ - p_req_physp = osm_get_physp_by_mad_addr(p_rcv->p_log, - p_rcv->p_subn, + p_req_physp = osm_get_physp_by_mad_addr(sa->p_log, + sa->p_subn, osm_madw_get_mad_addr_ptr (p_madw)); if (p_req_physp == NULL) { - osm_log(p_rcv->p_log, OSM_LOG_ERROR, + osm_log(sa->p_log, OSM_LOG_ERROR, "osm_slvl_rec_rcv_process: ERR 2603: " "Cannot find requester physical port\n"); goto Exit; @@ -328,13 +289,13 @@ void osm_slvl_rec_rcv_process(IN void *ctx, IN void *data) context.p_rcvd_rec = p_rcvd_rec; context.p_list = &rec_list; context.comp_mask = p_rcvd_mad->comp_mask; - context.p_rcv = p_rcv; + context.sa = sa; context.in_port_num = p_rcvd_rec->in_port_num; context.p_req_physp = p_req_physp; - cl_plock_acquire(p_rcv->p_lock); + cl_plock_acquire(sa->p_lock); - osm_log(p_rcv->p_log, OSM_LOG_DEBUG, + osm_log(sa->p_log, OSM_LOG_DEBUG, "osm_slvl_rec_rcv_process: " "Got Query Lid:0x%04X(%02X), In-Port:0x%02X(%02X), Out-Port:0x%02X(%02X)\n", cl_ntoh16(p_rcvd_rec->lid), @@ -350,16 +311,16 @@ void osm_slvl_rec_rcv_process(IN void *ctx, IN void *data) */ if (comp_mask & IB_SLVL_COMPMASK_LID) { - p_tbl = &p_rcv->p_subn->port_lid_tbl; + p_tbl = &sa->p_subn->port_lid_tbl; CL_ASSERT(cl_ptr_vector_get_size(p_tbl) < 0x10000); status = - osm_get_port_by_base_lid(p_rcv->p_subn, p_rcvd_rec->lid, + osm_get_port_by_base_lid(sa->p_subn, p_rcvd_rec->lid, &p_port); if ((status != IB_SUCCESS) || (p_port == NULL)) { status = IB_NOT_FOUND; - osm_log(p_rcv->p_log, OSM_LOG_ERROR, + osm_log(sa->p_log, OSM_LOG_ERROR, "osm_slvl_rec_rcv_process: ERR 2608: " "No port found with LID 0x%x\n", cl_ntoh16(p_rcvd_rec->lid)); @@ -370,14 +331,14 @@ void osm_slvl_rec_rcv_process(IN void *ctx, IN void *data) /* if we have a unique port - no need for a port search */ if (p_port) /* this does the loop on all the port phys ports */ - __osm_sa_slvl_by_comp_mask(p_rcv, p_port, &context); + __osm_sa_slvl_by_comp_mask(sa, p_port, &context); else - cl_qmap_apply_func(&p_rcv->p_subn->port_guid_tbl, + 
cl_qmap_apply_func(&sa->p_subn->port_guid_tbl, __osm_sa_slvl_by_comp_mask_cb, &context); } - cl_plock_release(p_rcv->p_lock); + cl_plock_release(sa->p_lock); num_rec = cl_qlist_count(&rec_list); @@ -387,16 +348,16 @@ void osm_slvl_rec_rcv_process(IN void *ctx, IN void *data) */ if (p_rcvd_mad->method == IB_MAD_METHOD_GET) { if (num_rec == 0) { - osm_sa_send_error(p_rcv->p_resp, p_madw, + osm_sa_send_error(sa, p_madw, IB_SA_MAD_STATUS_NO_RECORDS); goto Exit; } if (num_rec > 1) { - osm_log(p_rcv->p_log, OSM_LOG_ERROR, + osm_log(sa->p_log, OSM_LOG_ERROR, "osm_slvl_rec_rcv_process: ERR 2607: " "Got more than one record for SubnAdmGet (%u)\n", num_rec); - osm_sa_send_error(p_rcv->p_resp, p_madw, + osm_sa_send_error(sa, p_madw, IB_SA_MAD_STATUS_TOO_MANY_RECORDS); /* need to set the mem free ... */ @@ -419,7 +380,7 @@ void osm_slvl_rec_rcv_process(IN void *ctx, IN void *data) (MAD_BLOCK_SIZE - IB_SA_MAD_HDR_SIZE) / sizeof(ib_slvl_table_record_t); if (trim_num_rec < num_rec) { - osm_log(p_rcv->p_log, OSM_LOG_VERBOSE, + osm_log(sa->p_log, OSM_LOG_VERBOSE, "osm_slvl_rec_rcv_process: " "Number of records:%u trimmed to:%u to fit in one MAD\n", num_rec, trim_num_rec); @@ -427,11 +388,11 @@ void osm_slvl_rec_rcv_process(IN void *ctx, IN void *data) } #endif - osm_log(p_rcv->p_log, OSM_LOG_DEBUG, + osm_log(sa->p_log, OSM_LOG_DEBUG, "osm_slvl_rec_rcv_process: " "Returning %u records\n", num_rec); if ((p_rcvd_mad->method == IB_MAD_METHOD_GET) && (num_rec == 0)) { - osm_sa_send_error(p_rcv->p_resp, p_madw, + osm_sa_send_error(sa, p_madw, IB_SA_MAD_STATUS_NO_RECORDS); goto Exit; } @@ -439,14 +400,14 @@ void osm_slvl_rec_rcv_process(IN void *ctx, IN void *data) /* * Get a MAD to reply. 
Address of Mad is in the received mad_wrapper */ - p_resp_madw = osm_mad_pool_get(p_rcv->p_mad_pool, + p_resp_madw = osm_mad_pool_get(sa->p_mad_pool, p_madw->h_bind, num_rec * sizeof(ib_slvl_table_record_t) + IB_SA_MAD_HDR_SIZE, &p_madw->mad_addr); if (!p_resp_madw) { - osm_log(p_rcv->p_log, OSM_LOG_ERROR, + osm_log(sa->p_log, OSM_LOG_ERROR, "osm_slvl_rec_rcv_process: ERR 2605: " "osm_mad_pool_get failed\n"); @@ -456,7 +417,7 @@ void osm_slvl_rec_rcv_process(IN void *ctx, IN void *data) free(p_rec_item); } - osm_sa_send_error(p_rcv->p_resp, p_madw, + osm_sa_send_error(sa, p_madw, IB_SA_MAD_STATUS_NO_RESOURCES); goto Exit; } @@ -509,9 +470,9 @@ void osm_slvl_rec_rcv_process(IN void *ctx, IN void *data) CL_ASSERT(cl_is_qlist_empty(&rec_list)); status = osm_sa_vendor_send(p_resp_madw->h_bind, p_resp_madw, FALSE, - p_rcv->p_subn); + sa->p_subn); if (status != IB_SUCCESS) { - osm_log(p_rcv->p_log, OSM_LOG_ERROR, + osm_log(sa->p_log, OSM_LOG_ERROR, "osm_slvl_rec_rcv_process: ERR 2606: " "osm_vendor_send status = %s\n", ib_get_err_str(status)); @@ -519,5 +480,5 @@ void osm_slvl_rec_rcv_process(IN void *ctx, IN void *data) } Exit: - OSM_LOG_EXIT(p_rcv->p_log); + OSM_LOG_EXIT(sa->p_log); } diff --git a/opensm/opensm/osm_sa_sminfo_record.c b/opensm/opensm/osm_sa_sminfo_record.c index 6f84ac7..b89173c 100644 --- a/opensm/opensm/osm_sa_sminfo_record.c +++ b/opensm/opensm/osm_sa_sminfo_record.c @@ -55,20 +55,18 @@ #include #include #include -#include -#include +#include #include #include #include #include -#include -#include #include #include #include #include #include #include +#include typedef struct _osm_smir_item { cl_list_item_t list_item; @@ -79,53 +77,12 @@ typedef struct _osm_smir_search_ctxt { const ib_sminfo_record_t *p_rcvd_rec; ib_net64_t comp_mask; cl_qlist_t *p_list; - osm_smir_rcv_t *p_rcv; + osm_sa_t *sa; const osm_physp_t *p_req_physp; } osm_smir_search_ctxt_t; -/********************************************************************** - 
**********************************************************************/ -void osm_smir_rcv_construct(IN osm_smir_rcv_t * const p_rcv) -{ - memset(p_rcv, 0, sizeof(*p_rcv)); -} - -/********************************************************************** - **********************************************************************/ -void osm_smir_rcv_destroy(IN osm_smir_rcv_t * const p_rcv) -{ - CL_ASSERT(p_rcv); - OSM_LOG_ENTER(p_rcv->p_log, osm_smir_rcv_destroy); - OSM_LOG_EXIT(p_rcv->p_log); -} - -/********************************************************************** - **********************************************************************/ -ib_api_status_t -osm_smir_rcv_init(IN osm_smir_rcv_t * const p_rcv, - IN osm_sa_resp_t * const p_resp, - IN osm_mad_pool_t * const p_mad_pool, - IN osm_subn_t * const p_subn, - IN osm_stats_t * const p_stats, - IN osm_log_t * const p_log, IN cl_plock_t * const p_lock) -{ - OSM_LOG_ENTER(p_log, osm_smir_rcv_init); - - osm_smir_rcv_construct(p_rcv); - - p_rcv->p_log = p_log; - p_rcv->p_subn = p_subn; - p_rcv->p_lock = p_lock; - p_rcv->p_resp = p_resp; - p_rcv->p_stats = p_stats; - p_rcv->p_mad_pool = p_mad_pool; - - OSM_LOG_EXIT(p_rcv->p_log); - return IB_SUCCESS; -} - static ib_api_status_t -__osm_smir_rcv_new_smir(IN osm_smir_rcv_t * const p_rcv, +__osm_smir_rcv_new_smir(IN osm_sa_t * sa, IN const osm_port_t * const p_port, IN cl_qlist_t * const p_list, IN ib_net64_t const guid, @@ -136,19 +93,19 @@ __osm_smir_rcv_new_smir(IN osm_smir_rcv_t * const p_rcv, osm_smir_item_t *p_rec_item; ib_api_status_t status = IB_SUCCESS; - OSM_LOG_ENTER(p_rcv->p_log, __osm_smir_rcv_new_smir); + OSM_LOG_ENTER(sa->p_log, __osm_smir_rcv_new_smir); p_rec_item = malloc(sizeof(*p_rec_item)); if (p_rec_item == NULL) { - osm_log(p_rcv->p_log, OSM_LOG_ERROR, + osm_log(sa->p_log, OSM_LOG_ERROR, "__osm_smir_rcv_new_smir: ERR 2801: " "rec_item alloc failed\n"); status = IB_INSUFFICIENT_RESOURCES; goto Exit; } - if (osm_log_is_active(p_rcv->p_log, OSM_LOG_DEBUG)) 
- osm_log(p_rcv->p_log, OSM_LOG_DEBUG, + if (osm_log_is_active(sa->p_log, OSM_LOG_DEBUG)) + osm_log(sa->p_log, OSM_LOG_DEBUG, "__osm_smir_rcv_new_smir: " "New SMInfo: GUID 0x%016" PRIx64 "\n", cl_ntoh64(guid) ); @@ -163,14 +120,14 @@ __osm_smir_rcv_new_smir(IN osm_smir_rcv_t * const p_rcv, cl_qlist_insert_tail(p_list, &p_rec_item->list_item); Exit: - OSM_LOG_EXIT(p_rcv->p_log); + OSM_LOG_EXIT(sa->p_log); return (status); } /********************************************************************** **********************************************************************/ static void -__osm_sa_smir_by_comp_mask(IN osm_smir_rcv_t * const p_rcv, +__osm_sa_smir_by_comp_mask(IN osm_sa_t * sa, IN const osm_remote_sm_t * const p_rem_sm, osm_smir_search_ctxt_t * const p_ctxt) { @@ -178,7 +135,7 @@ __osm_sa_smir_by_comp_mask(IN osm_smir_rcv_t * const p_rcv, const osm_physp_t *const p_req_physp = p_ctxt->p_req_physp; ib_net64_t const comp_mask = p_ctxt->comp_mask; - OSM_LOG_ENTER(p_rcv->p_log, __osm_sa_smir_by_comp_mask); + OSM_LOG_ENTER(sa->p_log, __osm_sa_smir_by_comp_mask); if (comp_mask & IB_SMIR_COMPMASK_GUID) { if (p_rem_sm->smi.guid != p_rcvd_rec->sm_info.guid) @@ -199,13 +156,13 @@ __osm_sa_smir_by_comp_mask(IN osm_smir_rcv_t * const p_rcv, /* Implement any other needed search cases */ - __osm_smir_rcv_new_smir(p_rcv, p_rem_sm->p_port, p_ctxt->p_list, + __osm_smir_rcv_new_smir(sa, p_rem_sm->p_port, p_ctxt->p_list, p_rem_sm->smi.guid, p_rem_sm->smi.act_count, p_rem_sm->smi.pri_state, p_req_physp); Exit: - OSM_LOG_EXIT(p_rcv->p_log); + OSM_LOG_EXIT(sa->p_log); } /********************************************************************** @@ -218,16 +175,16 @@ __osm_sa_smir_by_comp_mask_cb(IN cl_map_item_t * const p_map_item, osm_smir_search_ctxt_t *const p_ctxt = (osm_smir_search_ctxt_t *) context; - __osm_sa_smir_by_comp_mask(p_ctxt->p_rcv, p_rem_sm, p_ctxt); + __osm_sa_smir_by_comp_mask(p_ctxt->sa, p_rem_sm, p_ctxt); } 
/********************************************************************** **********************************************************************/ void osm_smir_rcv_process(IN void *ctx, IN void *data) { - osm_smir_rcv_t *p_rcv = ctx; + osm_sa_t *sa = ctx; osm_madw_t *p_madw = data; - const ib_sa_mad_t *p_rcvd_mad; + const ib_sa_mad_t *sad_mad; const ib_sminfo_record_t *p_rcvd_rec; const cl_qmap_t *p_tbl; const osm_port_t *p_port = NULL; @@ -252,59 +209,59 @@ void osm_smir_rcv_process(IN void *ctx, IN void *data) cl_qmap_t *p_sm_guid_tbl; uint8_t pri_state; - CL_ASSERT(p_rcv); + CL_ASSERT(sa); - OSM_LOG_ENTER(p_rcv->p_log, osm_smir_rcv_process); + OSM_LOG_ENTER(sa->p_log, osm_smir_rcv_process); CL_ASSERT(p_madw); - p_rcvd_mad = osm_madw_get_sa_mad_ptr(p_madw); + sad_mad = osm_madw_get_sa_mad_ptr(p_madw); p_rcvd_rec = - (ib_sminfo_record_t *) ib_sa_mad_get_payload_ptr(p_rcvd_mad); - comp_mask = p_rcvd_mad->comp_mask; + (ib_sminfo_record_t *) ib_sa_mad_get_payload_ptr(sad_mad); + comp_mask = sad_mad->comp_mask; - CL_ASSERT(p_rcvd_mad->attr_id == IB_MAD_ATTR_SMINFO_RECORD); + CL_ASSERT(sad_mad->attr_id == IB_MAD_ATTR_SMINFO_RECORD); /* we only support SubnAdmGet and SubnAdmGetTable methods */ - if ((p_rcvd_mad->method != IB_MAD_METHOD_GET) && - (p_rcvd_mad->method != IB_MAD_METHOD_GETTABLE)) { - osm_log(p_rcv->p_log, OSM_LOG_ERROR, + if ((sad_mad->method != IB_MAD_METHOD_GET) && + (sad_mad->method != IB_MAD_METHOD_GETTABLE)) { + osm_log(sa->p_log, OSM_LOG_ERROR, "osm_smir_rcv_process: ERR 2804: " "Unsupported Method (%s)\n", - ib_get_sa_method_str(p_rcvd_mad->method)); - osm_sa_send_error(p_rcv->p_resp, p_madw, + ib_get_sa_method_str(sad_mad->method)); + osm_sa_send_error(sa, p_madw, IB_MAD_STATUS_UNSUP_METHOD_ATTR); goto Exit; } /* update the requester physical port. 
*/ - p_req_physp = osm_get_physp_by_mad_addr(p_rcv->p_log, - p_rcv->p_subn, + p_req_physp = osm_get_physp_by_mad_addr(sa->p_log, + sa->p_subn, osm_madw_get_mad_addr_ptr (p_madw)); if (p_req_physp == NULL) { - osm_log(p_rcv->p_log, OSM_LOG_ERROR, + osm_log(sa->p_log, OSM_LOG_ERROR, "osm_smir_rcv_process: ERR 2803: " "Cannot find requester physical port\n"); goto Exit; } - if (osm_log_is_active(p_rcv->p_log, OSM_LOG_DEBUG)) - osm_dump_sm_info_record(p_rcv->p_log, p_rcvd_rec, + if (osm_log_is_active(sa->p_log, OSM_LOG_DEBUG)) + osm_dump_sm_info_record(sa->p_log, p_rcvd_rec, OSM_LOG_DEBUG); - p_tbl = &p_rcv->p_subn->sm_guid_tbl; + p_tbl = &sa->p_subn->sm_guid_tbl; p_smi = &p_rcvd_rec->sm_info; cl_qlist_init(&rec_list); context.p_rcvd_rec = p_rcvd_rec; context.p_list = &rec_list; - context.comp_mask = p_rcvd_mad->comp_mask; - context.p_rcv = p_rcv; + context.comp_mask = sad_mad->comp_mask; + context.sa = sa; context.p_req_physp = p_req_physp; - cl_plock_acquire(p_rcv->p_lock); + cl_plock_acquire(sa->p_lock); /* If the user specified a LID, it obviously narrows our @@ -312,11 +269,11 @@ void osm_smir_rcv_process(IN void *ctx, IN void *data) */ if (comp_mask & IB_SMIR_COMPMASK_LID) { status = - osm_get_port_by_base_lid(p_rcv->p_subn, p_rcvd_rec->lid, + osm_get_port_by_base_lid(sa->p_subn, p_rcvd_rec->lid, &p_port); if ((status != IB_SUCCESS) || (p_port == NULL)) { status = IB_NOT_FOUND; - osm_log(p_rcv->p_log, OSM_LOG_ERROR, + osm_log(sa->p_log, OSM_LOG_ERROR, "osm_smir_rcv_process: ERR 2806: " "No port found with LID 0x%x\n", cl_ntoh16(p_rcvd_rec->lid)); @@ -326,23 +283,23 @@ void osm_smir_rcv_process(IN void *ctx, IN void *data) if (status == IB_SUCCESS) { /* Handle our own SM first */ local_port = - osm_get_port_by_guid(p_rcv->p_subn, - p_rcv->p_subn->sm_port_guid); + osm_get_port_by_guid(sa->p_subn, + sa->p_subn->sm_port_guid); if (!local_port) { - cl_plock_release(p_rcv->p_lock); - osm_log(p_rcv->p_log, OSM_LOG_ERROR, + cl_plock_release(sa->p_lock); + 
osm_log(sa->p_log, OSM_LOG_ERROR, "osm_smir_rcv_process: ERR 2809: " "No port found with GUID 0x%016" PRIx64 "\n", - cl_ntoh64(p_rcv->p_subn->sm_port_guid)); + cl_ntoh64(sa->p_subn->sm_port_guid)); goto Exit; } if (!p_port || local_port == p_port) { if (FALSE == - osm_physp_share_pkey(p_rcv->p_log, p_req_physp, + osm_physp_share_pkey(sa->p_log, p_req_physp, local_port->p_physp)) { - cl_plock_release(p_rcv->p_lock); - osm_log(p_rcv->p_log, OSM_LOG_ERROR, + cl_plock_release(sa->p_lock); + osm_log(sa->p_log, OSM_LOG_ERROR, "osm_smir_rcv_process: ERR 2805: " "Cannot get SMInfo record due to pkey violation\n"); goto Exit; @@ -350,29 +307,28 @@ void osm_smir_rcv_process(IN void *ctx, IN void *data) /* Check that other search components specified match */ if (comp_mask & IB_SMIR_COMPMASK_GUID) { - if (p_rcv->p_subn->sm_port_guid != p_smi->guid) + if (sa->p_subn->sm_port_guid != p_smi->guid) goto Remotes; } if (comp_mask & IB_SMIR_COMPMASK_PRIORITY) { - if (p_rcv->p_subn->opt.sm_priority != + if (sa->p_subn->opt.sm_priority != ib_sminfo_get_priority(p_smi)) goto Remotes; } if (comp_mask & IB_SMIR_COMPMASK_SMSTATE) { - if (p_rcv->p_subn->sm_state != + if (sa->p_subn->sm_state != ib_sminfo_get_state(p_smi)) goto Remotes; } /* Now, add local SMInfo to list */ - pri_state = p_rcv->p_subn->sm_state & 0x0F; + pri_state = sa->p_subn->sm_state & 0x0F; pri_state |= - (p_rcv->p_subn->opt.sm_priority & 0x0F) << 4; - __osm_smir_rcv_new_smir(p_rcv, local_port, + (sa->p_subn->opt.sm_priority & 0x0F) << 4; + __osm_smir_rcv_new_smir(sa, local_port, context.p_list, - p_rcv->p_subn->sm_port_guid, - cl_ntoh32(p_rcv->p_stats-> - qp0_mads_sent), + sa->p_subn->sm_port_guid, + cl_ntoh32(sa->p_subn->p_osm->stats.qp0_mads_sent), pri_state, p_req_physp); } @@ -380,29 +336,29 @@ void osm_smir_rcv_process(IN void *ctx, IN void *data) if (p_port && p_port != local_port) { /* Find remote SM corresponding to p_port */ port_guid = osm_port_get_guid(p_port); - p_sm_guid_tbl = &p_rcv->p_subn->sm_guid_tbl; 
+ p_sm_guid_tbl = &sa->p_subn->sm_guid_tbl; p_rem_sm = (osm_remote_sm_t *) cl_qmap_get(p_sm_guid_tbl, port_guid); if (p_rem_sm != (osm_remote_sm_t *) cl_qmap_end(p_sm_guid_tbl)) - __osm_sa_smir_by_comp_mask(p_rcv, p_rem_sm, + __osm_sa_smir_by_comp_mask(sa, p_rem_sm, &context); else { - osm_log(p_rcv->p_log, OSM_LOG_ERROR, + osm_log(sa->p_log, OSM_LOG_ERROR, "osm_smir_rcv_process: ERR 280A: " "No remote SM for GUID 0x%016" PRIx64 "\n", cl_ntoh64(port_guid)); } } else { /* Go over all other known (remote) SMs */ - cl_qmap_apply_func(&p_rcv->p_subn->sm_guid_tbl, + cl_qmap_apply_func(&sa->p_subn->sm_guid_tbl, __osm_sa_smir_by_comp_mask_cb, &context); } } - cl_plock_release(p_rcv->p_lock); + cl_plock_release(sa->p_lock); num_rec = cl_qlist_count(&rec_list); @@ -410,18 +366,18 @@ void osm_smir_rcv_process(IN void *ctx, IN void *data) * C15-0.1.30: * If we do a SubnAdmGet and got more than one record it is an error ! */ - if (p_rcvd_mad->method == IB_MAD_METHOD_GET) { + if (sad_mad->method == IB_MAD_METHOD_GET) { if (num_rec == 0) { - osm_sa_send_error(p_rcv->p_resp, p_madw, + osm_sa_send_error(sa, p_madw, IB_SA_MAD_STATUS_NO_RECORDS); goto Exit; } if (num_rec > 1) { - osm_log(p_rcv->p_log, OSM_LOG_ERROR, + osm_log(sa->p_log, OSM_LOG_ERROR, "osm_smir_rcv_process: ERR 2808: " "Got more than one record for SubnAdmGet (%u)\n", num_rec); - osm_sa_send_error(p_rcv->p_resp, p_madw, + osm_sa_send_error(sa, p_madw, IB_SA_MAD_STATUS_TOO_MANY_RECORDS); /* need to set the mem free ... 
*/ @@ -443,7 +399,7 @@ void osm_smir_rcv_process(IN void *ctx, IN void *data) trim_num_rec = (MAD_BLOCK_SIZE - IB_SA_MAD_HDR_SIZE) / sizeof(ib_sminfo_record_t); if (trim_num_rec < num_rec) { - osm_log(p_rcv->p_log, OSM_LOG_VERBOSE, + osm_log(sa->p_log, OSM_LOG_VERBOSE, "osm_smir_rcv_process: " "Number of records:%u trimmed to:%u to fit in one MAD\n", num_rec, trim_num_rec); @@ -451,11 +407,11 @@ void osm_smir_rcv_process(IN void *ctx, IN void *data) } #endif - osm_log(p_rcv->p_log, OSM_LOG_DEBUG, + osm_log(sa->p_log, OSM_LOG_DEBUG, "osm_smir_rcv_process: " "Returning %u records\n", num_rec); - if ((p_rcvd_mad->method == IB_MAD_METHOD_GET) && (num_rec == 0)) { - osm_sa_send_error(p_rcv->p_resp, p_madw, + if ((sad_mad->method == IB_MAD_METHOD_GET) && (num_rec == 0)) { + osm_sa_send_error(sa, p_madw, IB_SA_MAD_STATUS_NO_RECORDS); goto Exit; } @@ -463,13 +419,13 @@ void osm_smir_rcv_process(IN void *ctx, IN void *data) /* * Get a MAD to reply. Address of Mad is in the received mad_wrapper */ - p_resp_madw = osm_mad_pool_get(p_rcv->p_mad_pool, + p_resp_madw = osm_mad_pool_get(sa->p_mad_pool, p_madw->h_bind, num_rec * sizeof(ib_sminfo_record_t) + IB_SA_MAD_HDR_SIZE, &p_madw->mad_addr); if (!p_resp_madw) { - osm_log(p_rcv->p_log, OSM_LOG_ERROR, + osm_log(sa->p_log, OSM_LOG_ERROR, "osm_smir_rcv_process: ERR 2807: " "osm_mad_pool_get failed\n"); @@ -479,7 +435,7 @@ void osm_smir_rcv_process(IN void *ctx, IN void *data) free(p_rec_item); } - osm_sa_send_error(p_rcv->p_resp, p_madw, + osm_sa_send_error(sa, p_madw, IB_SA_MAD_STATUS_NO_RESOURCES); goto Exit; @@ -493,7 +449,7 @@ void osm_smir_rcv_process(IN void *ctx, IN void *data) Then copy all records from the list into the response payload. 
*/ - memcpy(p_resp_sa_mad, p_rcvd_mad, IB_SA_MAD_HDR_SIZE); + memcpy(p_resp_sa_mad, sad_mad, IB_SA_MAD_HDR_SIZE); p_resp_sa_mad->method |= IB_MAD_METHOD_RESP_MASK; /* C15-0.1.5 - always return SM_Key = 0 (table 185 p 884) */ p_resp_sa_mad->sm_key = 0; @@ -534,14 +490,14 @@ void osm_smir_rcv_process(IN void *ctx, IN void *data) CL_ASSERT(cl_is_qlist_empty(&rec_list)); status = osm_sa_vendor_send(p_resp_madw->h_bind, p_resp_madw, FALSE, - p_rcv->p_subn); + sa->p_subn); if (status != IB_SUCCESS) { - osm_log(p_rcv->p_log, OSM_LOG_ERROR, + osm_log(sa->p_log, OSM_LOG_ERROR, "osm_smir_rcv_process: ERR 2802: " "Error sending MAD (%s)\n", ib_get_err_str(status)); goto Exit; } Exit: - OSM_LOG_EXIT(p_rcv->p_log); + OSM_LOG_EXIT(sa->p_log); } diff --git a/opensm/opensm/osm_sa_sw_info_record.c b/opensm/opensm/osm_sa_sw_info_record.c index a9947e1..f1eddda 100644 --- a/opensm/opensm/osm_sa_sw_info_record.c +++ b/opensm/opensm/osm_sa_sw_info_record.c @@ -52,16 +52,12 @@ #include #include #include -#include -#include #include +#include #include #include #include -#define OSM_SIR_RCV_POOL_MIN_SIZE 32 -#define OSM_SIR_RCV_POOL_GROW_SIZE 32 - typedef struct _osm_sir_item { cl_list_item_t list_item; ib_switch_info_record_t rec; @@ -71,71 +67,33 @@ typedef struct _osm_sir_search_ctxt { const ib_switch_info_record_t *p_rcvd_rec; ib_net64_t comp_mask; cl_qlist_t *p_list; - osm_sir_rcv_t *p_rcv; + osm_sa_t *sa; const osm_physp_t *p_req_physp; } osm_sir_search_ctxt_t; /********************************************************************** **********************************************************************/ -void osm_sir_rcv_construct(IN osm_sir_rcv_t * const p_rcv) -{ - memset(p_rcv, 0, sizeof(*p_rcv)); -} - -/********************************************************************** - **********************************************************************/ -void osm_sir_rcv_destroy(IN osm_sir_rcv_t * const p_rcv) -{ - OSM_LOG_ENTER(p_rcv->p_log, osm_sir_rcv_destroy); - 
OSM_LOG_EXIT(p_rcv->p_log); -} - -/********************************************************************** - **********************************************************************/ -ib_api_status_t -osm_sir_rcv_init(IN osm_sir_rcv_t * const p_rcv, - IN osm_sa_resp_t * const p_resp, - IN osm_mad_pool_t * const p_mad_pool, - IN osm_subn_t * const p_subn, - IN osm_log_t * const p_log, IN cl_plock_t * const p_lock) -{ - OSM_LOG_ENTER(p_log, osm_sir_rcv_init); - - osm_sir_rcv_construct(p_rcv); - - p_rcv->p_log = p_log; - p_rcv->p_subn = p_subn; - p_rcv->p_lock = p_lock; - p_rcv->p_resp = p_resp; - p_rcv->p_mad_pool = p_mad_pool; - - OSM_LOG_EXIT(p_log); - return IB_SUCCESS; -} - -/********************************************************************** - **********************************************************************/ static ib_api_status_t -__osm_sir_rcv_new_sir(IN osm_sir_rcv_t * const p_rcv, +__osm_sir_rcv_new_sir(IN osm_sa_t * sa, IN const osm_switch_t * const p_sw, IN cl_qlist_t * const p_list, IN ib_net16_t const lid) { osm_sir_item_t *p_rec_item; ib_api_status_t status = IB_SUCCESS; - OSM_LOG_ENTER(p_rcv->p_log, __osm_sir_rcv_new_sir); + OSM_LOG_ENTER(sa->p_log, __osm_sir_rcv_new_sir); p_rec_item = malloc(sizeof(*p_rec_item)); if (p_rec_item == NULL) { - osm_log(p_rcv->p_log, OSM_LOG_ERROR, + osm_log(sa->p_log, OSM_LOG_ERROR, "__osm_sir_rcv_new_sir: ERR 5308: " "rec_item alloc failed\n"); status = IB_INSUFFICIENT_RESOURCES; goto Exit; } - if (osm_log_is_active(p_rcv->p_log, OSM_LOG_DEBUG)) - osm_log(p_rcv->p_log, OSM_LOG_DEBUG, + if (osm_log_is_active(sa->p_log, OSM_LOG_DEBUG)) + osm_log(sa->p_log, OSM_LOG_DEBUG, "__osm_sir_rcv_new_sir: " "New SwitchInfoRecord: lid 0x%X\n", cl_ntoh16(lid) ); @@ -148,35 +106,35 @@ __osm_sir_rcv_new_sir(IN osm_sir_rcv_t * const p_rcv, cl_qlist_insert_tail(p_list, &p_rec_item->list_item); Exit: - OSM_LOG_EXIT(p_rcv->p_log); + OSM_LOG_EXIT(sa->p_log); return (status); } 
/********************************************************************** **********************************************************************/ -static osm_port_t *__osm_sir_get_port_by_guid(IN osm_sir_rcv_t * const p_rcv, +static osm_port_t *__osm_sir_get_port_by_guid(IN osm_sa_t * sa, IN uint64_t port_guid) { osm_port_t *p_port; - CL_PLOCK_ACQUIRE(p_rcv->p_lock); + CL_PLOCK_ACQUIRE(sa->p_lock); - p_port = osm_get_port_by_guid(p_rcv->p_subn, port_guid); + p_port = osm_get_port_by_guid(sa->p_subn, port_guid); if (!p_port) { - osm_log(p_rcv->p_log, OSM_LOG_DEBUG, + osm_log(sa->p_log, OSM_LOG_DEBUG, "__osm_sir_get_port_by_guid ERR 5309: " "Invalid port GUID 0x%016" PRIx64 "\n", port_guid); p_port = NULL; } - CL_PLOCK_RELEASE(p_rcv->p_lock); + CL_PLOCK_RELEASE(sa->p_lock); return p_port; } /********************************************************************** **********************************************************************/ static void -__osm_sir_rcv_create_sir(IN osm_sir_rcv_t * const p_rcv, +__osm_sir_rcv_create_sir(IN osm_sa_t * sa, IN const osm_switch_t * const p_sw, IN cl_qlist_t * const p_list, IN ib_net16_t const match_lid, @@ -188,10 +146,10 @@ __osm_sir_rcv_create_sir(IN osm_sir_rcv_t * const p_rcv, ib_net16_t min_lid_ho; ib_net16_t max_lid_ho; - OSM_LOG_ENTER(p_rcv->p_log, __osm_sir_rcv_create_sir); + OSM_LOG_ENTER(sa->p_log, __osm_sir_rcv_create_sir); - if (osm_log_is_active(p_rcv->p_log, OSM_LOG_DEBUG)) { - osm_log(p_rcv->p_log, OSM_LOG_DEBUG, + if (osm_log_is_active(sa->p_log, OSM_LOG_DEBUG)) { + osm_log(sa->p_log, OSM_LOG_DEBUG, "__osm_sir_rcv_create_sir: " "Looking for SwitchInfoRecord with LID: 0x%X\n", cl_ntoh16(match_lid) @@ -200,10 +158,10 @@ __osm_sir_rcv_create_sir(IN osm_sir_rcv_t * const p_rcv, /* In switches, the port guid is the node guid. 
*/ p_port = - __osm_sir_get_port_by_guid(p_rcv, + __osm_sir_get_port_by_guid(sa, p_sw->p_node->node_info.port_guid); if (!p_port) { - osm_log(p_rcv->p_log, OSM_LOG_ERROR, + osm_log(sa->p_log, OSM_LOG_ERROR, "__osm_sir_rcv_create_sir: ERR 530A: " "Failed to find Port by Node Guid:0x%016" PRIx64 "\n", cl_ntoh64(p_sw->p_node->node_info.node_guid) @@ -215,7 +173,7 @@ __osm_sir_rcv_create_sir(IN osm_sir_rcv_t * const p_rcv, the same partition. */ p_physp = p_port->p_physp; if (!p_physp) { - osm_log(p_rcv->p_log, OSM_LOG_ERROR, + osm_log(sa->p_log, OSM_LOG_ERROR, "__osm_sir_rcv_create_sir: ERR 530B: " "Failed to find default physical Port by Node Guid:0x%016" PRIx64 "\n", @@ -223,7 +181,7 @@ __osm_sir_rcv_create_sir(IN osm_sir_rcv_t * const p_rcv, ); goto Exit; } - if (!osm_physp_share_pkey(p_rcv->p_log, p_req_physp, p_physp)) + if (!osm_physp_share_pkey(sa->p_log, p_req_physp, p_physp)) goto Exit; /* get the port 0 of the switch */ @@ -234,8 +192,8 @@ __osm_sir_rcv_create_sir(IN osm_sir_rcv_t * const p_rcv, /* We validate that the lid belongs to this switch. 
*/ - if (osm_log_is_active(p_rcv->p_log, OSM_LOG_DEBUG)) { - osm_log(p_rcv->p_log, OSM_LOG_DEBUG, + if (osm_log_is_active(sa->p_log, OSM_LOG_DEBUG)) { + osm_log(sa->p_log, OSM_LOG_DEBUG, "__osm_sir_rcv_create_sir: " "Comparing LID: 0x%X <= 0x%X <= 0x%X\n", min_lid_ho, match_lid_ho, max_lid_ho); @@ -246,11 +204,11 @@ __osm_sir_rcv_create_sir(IN osm_sir_rcv_t * const p_rcv, } - __osm_sir_rcv_new_sir(p_rcv, p_sw, p_list, + __osm_sir_rcv_new_sir(sa, p_sw, p_list, osm_port_get_base_lid(p_port)); Exit: - OSM_LOG_EXIT(p_rcv->p_log); + OSM_LOG_EXIT(sa->p_log); } /********************************************************************** @@ -264,13 +222,13 @@ __osm_sir_rcv_by_comp_mask(IN cl_map_item_t * const p_map_item, const osm_switch_t *const p_sw = (osm_switch_t *) p_map_item; const ib_switch_info_record_t *const p_rcvd_rec = p_ctxt->p_rcvd_rec; const osm_physp_t *const p_req_physp = p_ctxt->p_req_physp; - osm_sir_rcv_t *const p_rcv = p_ctxt->p_rcv; + osm_sa_t *sa = p_ctxt->sa; ib_net64_t const comp_mask = p_ctxt->comp_mask; ib_net16_t match_lid = 0; - OSM_LOG_ENTER(p_ctxt->p_rcv->p_log, __osm_sir_rcv_by_comp_mask); + OSM_LOG_ENTER(p_ctxt->sa->p_log, __osm_sir_rcv_by_comp_mask); - osm_dump_switch_info(p_ctxt->p_rcv->p_log, + osm_dump_switch_info(p_ctxt->sa->p_log, &p_sw->switch_info, OSM_LOG_VERBOSE); if (comp_mask & IB_SWIR_COMPMASK_LID) { @@ -279,20 +237,20 @@ __osm_sir_rcv_by_comp_mask(IN cl_map_item_t * const p_map_item, goto Exit; } - __osm_sir_rcv_create_sir(p_rcv, p_sw, p_ctxt->p_list, + __osm_sir_rcv_create_sir(sa, p_sw, p_ctxt->p_list, match_lid, p_req_physp); Exit: - OSM_LOG_EXIT(p_ctxt->p_rcv->p_log); + OSM_LOG_EXIT(p_ctxt->sa->p_log); } /********************************************************************** **********************************************************************/ void osm_sir_rcv_process(IN void *ctx, IN void *data) { - osm_sir_rcv_t *p_rcv = ctx; + osm_sa_t *sa = ctx; osm_madw_t *p_madw = data; - const ib_sa_mad_t *p_rcvd_mad; + const 
ib_sa_mad_t *sad_mad; const ib_switch_info_record_t *p_rcvd_rec; ib_switch_info_record_t *p_resp_rec; cl_qlist_t rec_list; @@ -308,61 +266,61 @@ void osm_sir_rcv_process(IN void *ctx, IN void *data) ib_api_status_t status; osm_physp_t *p_req_physp; - CL_ASSERT(p_rcv); + CL_ASSERT(sa); - OSM_LOG_ENTER(p_rcv->p_log, osm_sir_rcv_process); + OSM_LOG_ENTER(sa->p_log, osm_sir_rcv_process); CL_ASSERT(p_madw); - p_rcvd_mad = osm_madw_get_sa_mad_ptr(p_madw); + sad_mad = osm_madw_get_sa_mad_ptr(p_madw); p_rcvd_rec = - (ib_switch_info_record_t *) ib_sa_mad_get_payload_ptr(p_rcvd_mad); + (ib_switch_info_record_t *) ib_sa_mad_get_payload_ptr(sad_mad); - CL_ASSERT(p_rcvd_mad->attr_id == IB_MAD_ATTR_SWITCH_INFO_RECORD); + CL_ASSERT(sad_mad->attr_id == IB_MAD_ATTR_SWITCH_INFO_RECORD); /* we only support SubnAdmGet and SubnAdmGetTable methods */ - if ((p_rcvd_mad->method != IB_MAD_METHOD_GET) && - (p_rcvd_mad->method != IB_MAD_METHOD_GETTABLE)) { - osm_log(p_rcv->p_log, OSM_LOG_ERROR, + if ((sad_mad->method != IB_MAD_METHOD_GET) && + (sad_mad->method != IB_MAD_METHOD_GETTABLE)) { + osm_log(sa->p_log, OSM_LOG_ERROR, "osm_sir_rcv_process: ERR 5305: " "Unsupported Method (%s)\n", - ib_get_sa_method_str(p_rcvd_mad->method)); - osm_sa_send_error(p_rcv->p_resp, p_madw, + ib_get_sa_method_str(sad_mad->method)); + osm_sa_send_error(sa, p_madw, IB_MAD_STATUS_UNSUP_METHOD_ATTR); goto Exit; } /* update the requester physical port. 
*/ - p_req_physp = osm_get_physp_by_mad_addr(p_rcv->p_log, - p_rcv->p_subn, + p_req_physp = osm_get_physp_by_mad_addr(sa->p_log, + sa->p_subn, osm_madw_get_mad_addr_ptr (p_madw)); if (p_req_physp == NULL) { - osm_log(p_rcv->p_log, OSM_LOG_ERROR, + osm_log(sa->p_log, OSM_LOG_ERROR, "osm_sir_rcv_process: ERR 5304: " "Cannot find requester physical port\n"); goto Exit; } - if (osm_log_is_active(p_rcv->p_log, OSM_LOG_DEBUG)) - osm_dump_switch_info_record(p_rcv->p_log, p_rcvd_rec, + if (osm_log_is_active(sa->p_log, OSM_LOG_DEBUG)) + osm_dump_switch_info_record(sa->p_log, p_rcvd_rec, OSM_LOG_DEBUG); cl_qlist_init(&rec_list); context.p_rcvd_rec = p_rcvd_rec; context.p_list = &rec_list; - context.comp_mask = p_rcvd_mad->comp_mask; - context.p_rcv = p_rcv; + context.comp_mask = sad_mad->comp_mask; + context.sa = sa; context.p_req_physp = p_req_physp; - cl_plock_acquire(p_rcv->p_lock); + cl_plock_acquire(sa->p_lock); /* Go over all switches */ - cl_qmap_apply_func(&p_rcv->p_subn->sw_guid_tbl, + cl_qmap_apply_func(&sa->p_subn->sw_guid_tbl, __osm_sir_rcv_by_comp_mask, &context); - cl_plock_release(p_rcv->p_lock); + cl_plock_release(sa->p_lock); num_rec = cl_qlist_count(&rec_list); @@ -370,12 +328,12 @@ void osm_sir_rcv_process(IN void *ctx, IN void *data) * C15-0.1.30: * If we do a SubnAdmGet and got more than one record it is an error ! */ - if ((p_rcvd_mad->method == IB_MAD_METHOD_GET) && (num_rec > 1)) { - osm_log(p_rcv->p_log, OSM_LOG_ERROR, + if ((sad_mad->method == IB_MAD_METHOD_GET) && (num_rec > 1)) { + osm_log(sa->p_log, OSM_LOG_ERROR, "osm_sir_rcv_process: ERR 5303: " "Got more than one record for SubnAdmGet (%u)\n", num_rec); - osm_sa_send_error(p_rcv->p_resp, p_madw, + osm_sa_send_error(sa, p_madw, IB_SA_MAD_STATUS_TOO_MANY_RECORDS); /* need to set the mem free ... 
*/ @@ -396,7 +354,7 @@ void osm_sir_rcv_process(IN void *ctx, IN void *data) (MAD_BLOCK_SIZE - IB_SA_MAD_HDR_SIZE) / sizeof(ib_switch_info_record_t); if (trim_num_rec < num_rec) { - osm_log(p_rcv->p_log, OSM_LOG_VERBOSE, + osm_log(sa->p_log, OSM_LOG_VERBOSE, "osm_sir_rcv_process: " "Number of records:%u trimmed to:%u to fit in one MAD\n", num_rec, trim_num_rec); @@ -404,11 +362,11 @@ void osm_sir_rcv_process(IN void *ctx, IN void *data) } #endif - osm_log(p_rcv->p_log, OSM_LOG_DEBUG, + osm_log(sa->p_log, OSM_LOG_DEBUG, "osm_sir_rcv_process: " "Returning %u records\n", num_rec); - if ((p_rcvd_mad->method == IB_MAD_METHOD_GET) && (num_rec == 0)) { - osm_sa_send_error(p_rcv->p_resp, p_madw, + if ((sad_mad->method == IB_MAD_METHOD_GET) && (num_rec == 0)) { + osm_sa_send_error(sa, p_madw, IB_SA_MAD_STATUS_NO_RECORDS); goto Exit; } @@ -416,14 +374,14 @@ void osm_sir_rcv_process(IN void *ctx, IN void *data) /* * Get a MAD to reply. Address of Mad is in the received mad_wrapper */ - p_resp_madw = osm_mad_pool_get(p_rcv->p_mad_pool, + p_resp_madw = osm_mad_pool_get(sa->p_mad_pool, p_madw->h_bind, num_rec * sizeof(ib_switch_info_record_t) + IB_SA_MAD_HDR_SIZE, &p_madw->mad_addr); if (!p_resp_madw) { - osm_log(p_rcv->p_log, OSM_LOG_ERROR, + osm_log(sa->p_log, OSM_LOG_ERROR, "osm_sir_rcv_process: ERR 5306: " "osm_mad_pool_get failed\n"); @@ -433,7 +391,7 @@ void osm_sir_rcv_process(IN void *ctx, IN void *data) free(p_rec_item); } - osm_sa_send_error(p_rcv->p_resp, p_madw, + osm_sa_send_error(sa, p_madw, IB_SA_MAD_STATUS_NO_RESOURCES); goto Exit; } @@ -446,7 +404,7 @@ void osm_sir_rcv_process(IN void *ctx, IN void *data) Then copy all records from the list into the response payload. 
*/ - memcpy(p_resp_sa_mad, p_rcvd_mad, IB_SA_MAD_HDR_SIZE); + memcpy(p_resp_sa_mad, sad_mad, IB_SA_MAD_HDR_SIZE); p_resp_sa_mad->method |= IB_MAD_METHOD_RESP_MASK; /* C15-0.1.5 - always return SM_Key = 0 (table 185 p 884) */ p_resp_sa_mad->sm_key = 0; @@ -484,9 +442,9 @@ void osm_sir_rcv_process(IN void *ctx, IN void *data) CL_ASSERT(cl_is_qlist_empty(&rec_list)); status = osm_sa_vendor_send(p_resp_madw->h_bind, p_resp_madw, FALSE, - p_rcv->p_subn); + sa->p_subn); if (status != IB_SUCCESS) { - osm_log(p_rcv->p_log, OSM_LOG_ERROR, + osm_log(sa->p_log, OSM_LOG_ERROR, "osm_sir_rcv_process: ERR 5307: " "osm_sa_vendor_send status = %s\n", ib_get_err_str(status)); @@ -494,5 +452,5 @@ void osm_sir_rcv_process(IN void *ctx, IN void *data) } Exit: - OSM_LOG_EXIT(p_rcv->p_log); + OSM_LOG_EXIT(sa->p_log); } diff --git a/opensm/opensm/osm_sa_vlarb_record.c b/opensm/opensm/osm_sa_vlarb_record.c index a538a0b..51bc517 100644 --- a/opensm/opensm/osm_sa_vlarb_record.c +++ b/opensm/opensm/osm_sa_vlarb_record.c @@ -55,10 +55,9 @@ #include #include #include -#include +#include #include #include -#include #include #include #include @@ -73,52 +72,14 @@ typedef struct _osm_vl_arb_search_ctxt { ib_net64_t comp_mask; uint8_t block_num; cl_qlist_t *p_list; - osm_vlarb_rec_rcv_t *p_rcv; + osm_sa_t *sa; const osm_physp_t *p_req_physp; } osm_vl_arb_search_ctxt_t; /********************************************************************** **********************************************************************/ -void osm_vlarb_rec_rcv_construct(IN osm_vlarb_rec_rcv_t * const p_rcv) -{ - memset(p_rcv, 0, sizeof(*p_rcv)); -} - -/********************************************************************** - **********************************************************************/ -void osm_vlarb_rec_rcv_destroy(IN osm_vlarb_rec_rcv_t * const p_rcv) -{ - OSM_LOG_ENTER(p_rcv->p_log, osm_vlarb_rec_rcv_destroy); - OSM_LOG_EXIT(p_rcv->p_log); -} - 
-/********************************************************************** - **********************************************************************/ -ib_api_status_t -osm_vlarb_rec_rcv_init(IN osm_vlarb_rec_rcv_t * const p_rcv, - IN osm_sa_resp_t * const p_resp, - IN osm_mad_pool_t * const p_mad_pool, - IN osm_subn_t * const p_subn, - IN osm_log_t * const p_log, IN cl_plock_t * const p_lock) -{ - OSM_LOG_ENTER(p_log, osm_vlarb_rec_rcv_init); - - osm_vlarb_rec_rcv_construct(p_rcv); - - p_rcv->p_log = p_log; - p_rcv->p_subn = p_subn; - p_rcv->p_lock = p_lock; - p_rcv->p_resp = p_resp; - p_rcv->p_mad_pool = p_mad_pool; - - OSM_LOG_EXIT(p_log); - return IB_SUCCESS; -} - -/********************************************************************** - **********************************************************************/ static void -__osm_sa_vl_arb_create(IN osm_vlarb_rec_rcv_t * const p_rcv, +__osm_sa_vl_arb_create(IN osm_sa_t * sa, IN osm_physp_t * const p_physp, IN osm_vl_arb_search_ctxt_t * const p_ctxt, IN uint8_t block) @@ -127,11 +88,11 @@ __osm_sa_vl_arb_create(IN osm_vlarb_rec_rcv_t * const p_rcv, uint16_t lid; ib_api_status_t status = IB_SUCCESS; - OSM_LOG_ENTER(p_rcv->p_log, __osm_sa_vl_arb_create); + OSM_LOG_ENTER(sa->p_log, __osm_sa_vl_arb_create); p_rec_item = malloc(sizeof(*p_rec_item)); if (p_rec_item == NULL) { - osm_log(p_rcv->p_log, OSM_LOG_ERROR, + osm_log(sa->p_log, OSM_LOG_ERROR, "__osm_sa_vl_arb_create: ERR 2A02: " "rec_item alloc failed\n"); status = IB_INSUFFICIENT_RESOURCES; @@ -143,8 +104,8 @@ __osm_sa_vl_arb_create(IN osm_vlarb_rec_rcv_t * const p_rcv, else lid = osm_node_get_base_lid(p_physp->p_node, 0); - if (osm_log_is_active(p_rcv->p_log, OSM_LOG_DEBUG)) - osm_log(p_rcv->p_log, OSM_LOG_DEBUG, + if (osm_log_is_active(sa->p_log, OSM_LOG_DEBUG)) + osm_log(sa->p_log, OSM_LOG_DEBUG, "__osm_sa_vl_arb_create: " "New VLArbitration for: port 0x%016" PRIx64 ", lid 0x%X, port 0x%X Block:%u\n", @@ -161,36 +122,36 @@ __osm_sa_vl_arb_create(IN 
osm_vlarb_rec_rcv_t * const p_rcv, cl_qlist_insert_tail(p_ctxt->p_list, &p_rec_item->list_item); Exit: - OSM_LOG_EXIT(p_rcv->p_log); + OSM_LOG_EXIT(sa->p_log); } /********************************************************************** **********************************************************************/ static void -__osm_sa_vl_arb_check_physp(IN osm_vlarb_rec_rcv_t * const p_rcv, +__osm_sa_vl_arb_check_physp(IN osm_sa_t * sa, IN osm_physp_t * const p_physp, osm_vl_arb_search_ctxt_t * const p_ctxt) { ib_net64_t comp_mask = p_ctxt->comp_mask; uint8_t block; - OSM_LOG_ENTER(p_rcv->p_log, __osm_sa_vl_arb_check_physp); + OSM_LOG_ENTER(sa->p_log, __osm_sa_vl_arb_check_physp); /* we got here with the phys port - all that's left is to get the right block */ for (block = 1; block <= 4; block++) { if (!(comp_mask & IB_VLA_COMPMASK_BLOCK) || block == p_ctxt->block_num) { - __osm_sa_vl_arb_create(p_rcv, p_physp, p_ctxt, block); + __osm_sa_vl_arb_create(sa, p_physp, p_ctxt, block); } } - OSM_LOG_EXIT(p_rcv->p_log); + OSM_LOG_EXIT(sa->p_log); } /********************************************************************** **********************************************************************/ static void -__osm_sa_vl_arb_by_comp_mask(IN osm_vlarb_rec_rcv_t * const p_rcv, +__osm_sa_vl_arb_by_comp_mask(IN osm_sa_t * sa, IN const osm_port_t * const p_port, osm_vl_arb_search_ctxt_t * const p_ctxt) { @@ -201,7 +162,7 @@ __osm_sa_vl_arb_by_comp_mask(IN osm_vlarb_rec_rcv_t * const p_rcv, uint8_t num_ports; const osm_physp_t *p_req_physp; - OSM_LOG_ENTER(p_rcv->p_log, __osm_sa_vl_arb_by_comp_mask); + OSM_LOG_ENTER(sa->p_log, __osm_sa_vl_arb_by_comp_mask); p_rcvd_rec = p_ctxt->p_rcvd_rec; comp_mask = p_ctxt->comp_mask; @@ -213,7 +174,7 @@ __osm_sa_vl_arb_by_comp_mask(IN osm_vlarb_rec_rcv_t * const p_rcv, if (p_port->p_node->node_info.node_type != IB_NODE_TYPE_SWITCH) { /* we put it in the comp mask and port num */ port_num = p_port->p_physp->port_num; - osm_log(p_rcv->p_log, OSM_LOG_DEBUG, + 
osm_log(sa->p_log, OSM_LOG_DEBUG, "__osm_sa_vl_arb_by_comp_mask: " "Using Physical Default Port Number: 0x%X (for End Node)\n", port_num); @@ -227,12 +188,12 @@ __osm_sa_vl_arb_by_comp_mask(IN osm_vlarb_rec_rcv_t * const p_rcv, /* check that the p_physp is valid, and that the requester and the p_physp share a pkey. */ if (osm_physp_is_valid(p_physp) && - osm_physp_share_pkey(p_rcv->p_log, p_req_physp, + osm_physp_share_pkey(sa->p_log, p_req_physp, p_physp)) - __osm_sa_vl_arb_check_physp(p_rcv, p_physp, + __osm_sa_vl_arb_check_physp(sa, p_physp, p_ctxt); } else { - osm_log(p_rcv->p_log, OSM_LOG_ERROR, + osm_log(sa->p_log, OSM_LOG_ERROR, "__osm_sa_vl_arb_by_comp_mask: ERR 2A03: " "Given Physical Port Number: 0x%X is out of range should be < 0x%X\n", port_num, @@ -250,14 +211,14 @@ __osm_sa_vl_arb_by_comp_mask(IN osm_vlarb_rec_rcv_t * const p_rcv, /* if the requester and the p_physp don't share a pkey - continue */ if (!osm_physp_share_pkey - (p_rcv->p_log, p_req_physp, p_physp)) + (sa->p_log, p_req_physp, p_physp)) continue; - __osm_sa_vl_arb_check_physp(p_rcv, p_physp, p_ctxt); + __osm_sa_vl_arb_check_physp(sa, p_physp, p_ctxt); } } Exit: - OSM_LOG_EXIT(p_rcv->p_log); + OSM_LOG_EXIT(sa->p_log); } /********************************************************************** @@ -270,16 +231,16 @@ __osm_sa_vl_arb_by_comp_mask_cb(IN cl_map_item_t * const p_map_item, osm_vl_arb_search_ctxt_t *const p_ctxt = (osm_vl_arb_search_ctxt_t *) context; - __osm_sa_vl_arb_by_comp_mask(p_ctxt->p_rcv, p_port, p_ctxt); + __osm_sa_vl_arb_by_comp_mask(p_ctxt->sa, p_port, p_ctxt); } /********************************************************************** **********************************************************************/ void osm_vlarb_rec_rcv_process(IN void *ctx, IN void *data) { - osm_vlarb_rec_rcv_t *p_rcv = ctx; + osm_sa_t *sa = ctx; osm_madw_t *p_madw = data; - const ib_sa_mad_t *p_rcvd_mad; + const ib_sa_mad_t *sad_mad; const ib_vl_arb_table_record_t *p_rcvd_rec; const 
cl_ptr_vector_t *p_tbl; const osm_port_t *p_port = NULL; @@ -299,55 +260,55 @@ void osm_vlarb_rec_rcv_process(IN void *ctx, IN void *data) ib_net64_t comp_mask; osm_physp_t *p_req_physp; - CL_ASSERT(p_rcv); + CL_ASSERT(sa); - OSM_LOG_ENTER(p_rcv->p_log, osm_vlarb_rec_rcv_process); + OSM_LOG_ENTER(sa->p_log, osm_vlarb_rec_rcv_process); CL_ASSERT(p_madw); - p_rcvd_mad = osm_madw_get_sa_mad_ptr(p_madw); + sad_mad = osm_madw_get_sa_mad_ptr(p_madw); p_rcvd_rec = - (ib_vl_arb_table_record_t *) ib_sa_mad_get_payload_ptr(p_rcvd_mad); - comp_mask = p_rcvd_mad->comp_mask; + (ib_vl_arb_table_record_t *) ib_sa_mad_get_payload_ptr(sad_mad); + comp_mask = sad_mad->comp_mask; - CL_ASSERT(p_rcvd_mad->attr_id == IB_MAD_ATTR_VLARB_RECORD); + CL_ASSERT(sad_mad->attr_id == IB_MAD_ATTR_VLARB_RECORD); /* we only support SubnAdmGet and SubnAdmGetTable methods */ - if ((p_rcvd_mad->method != IB_MAD_METHOD_GET) && - (p_rcvd_mad->method != IB_MAD_METHOD_GETTABLE)) { - osm_log(p_rcv->p_log, OSM_LOG_ERROR, + if ((sad_mad->method != IB_MAD_METHOD_GET) && + (sad_mad->method != IB_MAD_METHOD_GETTABLE)) { + osm_log(sa->p_log, OSM_LOG_ERROR, "osm_vlarb_rec_rcv_process: ERR 2A05: " "Unsupported Method (%s)\n", - ib_get_sa_method_str(p_rcvd_mad->method)); - osm_sa_send_error(p_rcv->p_resp, p_madw, + ib_get_sa_method_str(sad_mad->method)); + osm_sa_send_error(sa, p_madw, IB_MAD_STATUS_UNSUP_METHOD_ATTR); goto Exit; } /* update the requester physical port. 
*/ - p_req_physp = osm_get_physp_by_mad_addr(p_rcv->p_log, - p_rcv->p_subn, + p_req_physp = osm_get_physp_by_mad_addr(sa->p_log, + sa->p_subn, osm_madw_get_mad_addr_ptr (p_madw)); if (p_req_physp == NULL) { - osm_log(p_rcv->p_log, OSM_LOG_ERROR, + osm_log(sa->p_log, OSM_LOG_ERROR, "osm_vlarb_rec_rcv_process: ERR 2A04: " "Cannot find requester physical port\n"); goto Exit; } - p_vl_arb = (ib_vl_arb_table_t *) ib_sa_mad_get_payload_ptr(p_rcvd_mad); + p_vl_arb = (ib_vl_arb_table_t *) ib_sa_mad_get_payload_ptr(sad_mad); cl_qlist_init(&rec_list); context.p_rcvd_rec = p_rcvd_rec; context.p_list = &rec_list; - context.comp_mask = p_rcvd_mad->comp_mask; - context.p_rcv = p_rcv; + context.comp_mask = sad_mad->comp_mask; + context.sa = sa; context.block_num = p_rcvd_rec->block_num; context.p_req_physp = p_req_physp; - osm_log(p_rcv->p_log, OSM_LOG_DEBUG, + osm_log(sa->p_log, OSM_LOG_DEBUG, "osm_vlarb_rec_rcv_process: " "Got Query Lid:0x%04X(%02X), Port:0x%02X(%02X), Block:0x%02X(%02X)\n", cl_ntoh16(p_rcvd_rec->lid), @@ -356,7 +317,7 @@ void osm_vlarb_rec_rcv_process(IN void *ctx, IN void *data) p_rcvd_rec->block_num, (comp_mask & IB_VLA_COMPMASK_BLOCK) != 0); - cl_plock_acquire(p_rcv->p_lock); + cl_plock_acquire(sa->p_lock); /* If the user specified a LID, it obviously narrows our @@ -364,16 +325,16 @@ void osm_vlarb_rec_rcv_process(IN void *ctx, IN void *data) */ if (comp_mask & IB_VLA_COMPMASK_LID) { - p_tbl = &p_rcv->p_subn->port_lid_tbl; + p_tbl = &sa->p_subn->port_lid_tbl; CL_ASSERT(cl_ptr_vector_get_size(p_tbl) < 0x10000); status = - osm_get_port_by_base_lid(p_rcv->p_subn, p_rcvd_rec->lid, + osm_get_port_by_base_lid(sa->p_subn, p_rcvd_rec->lid, &p_port); if ((status != IB_SUCCESS) || (p_port == NULL)) { status = IB_NOT_FOUND; - osm_log(p_rcv->p_log, OSM_LOG_ERROR, + osm_log(sa->p_log, OSM_LOG_ERROR, "osm_vlarb_rec_rcv_process: ERR 2A09: " "No port found with LID 0x%x\n", cl_ntoh16(p_rcvd_rec->lid)); @@ -384,15 +345,15 @@ void osm_vlarb_rec_rcv_process(IN void *ctx, IN 
void *data) /* if we got a unique port - no need for a port search */ if (p_port) /* this does the loop on all the port phys ports */ - __osm_sa_vl_arb_by_comp_mask(p_rcv, p_port, &context); + __osm_sa_vl_arb_by_comp_mask(sa, p_port, &context); else { - cl_qmap_apply_func(&p_rcv->p_subn->port_guid_tbl, + cl_qmap_apply_func(&sa->p_subn->port_guid_tbl, __osm_sa_vl_arb_by_comp_mask_cb, &context); } } - cl_plock_release(p_rcv->p_lock); + cl_plock_release(sa->p_lock); num_rec = cl_qlist_count(&rec_list); @@ -400,18 +361,18 @@ void osm_vlarb_rec_rcv_process(IN void *ctx, IN void *data) * C15-0.1.30: * If we do a SubnAdmGet and got more than one record it is an error ! */ - if (p_rcvd_mad->method == IB_MAD_METHOD_GET) { + if (sad_mad->method == IB_MAD_METHOD_GET) { if (num_rec == 0) { - osm_sa_send_error(p_rcv->p_resp, p_madw, + osm_sa_send_error(sa, p_madw, IB_SA_MAD_STATUS_NO_RECORDS); goto Exit; } if (num_rec > 1) { - osm_log(p_rcv->p_log, OSM_LOG_ERROR, + osm_log(sa->p_log, OSM_LOG_ERROR, "osm_vlarb_rec_rcv_process: ERR 2A08: " "Got more than one record for SubnAdmGet (%u)\n", num_rec); - osm_sa_send_error(p_rcv->p_resp, p_madw, + osm_sa_send_error(sa, p_madw, IB_SA_MAD_STATUS_TOO_MANY_RECORDS); /* need to set the mem free ... 
*/ @@ -434,7 +395,7 @@ void osm_vlarb_rec_rcv_process(IN void *ctx, IN void *data) (MAD_BLOCK_SIZE - IB_SA_MAD_HDR_SIZE) / sizeof(ib_vl_arb_table_record_t); if (trim_num_rec < num_rec) { - osm_log(p_rcv->p_log, OSM_LOG_VERBOSE, + osm_log(sa->p_log, OSM_LOG_VERBOSE, "osm_vlarb_rec_rcv_process: " "Number of records:%u trimmed to:%u to fit in one MAD\n", num_rec, trim_num_rec); @@ -442,12 +403,12 @@ void osm_vlarb_rec_rcv_process(IN void *ctx, IN void *data) } #endif - osm_log(p_rcv->p_log, OSM_LOG_DEBUG, + osm_log(sa->p_log, OSM_LOG_DEBUG, "osm_vlarb_rec_rcv_process: " "Returning %u records\n", num_rec); - if ((p_rcvd_mad->method == IB_MAD_METHOD_GET) && (num_rec == 0)) { - osm_sa_send_error(p_rcv->p_resp, p_madw, + if ((sad_mad->method == IB_MAD_METHOD_GET) && (num_rec == 0)) { + osm_sa_send_error(sa, p_madw, IB_SA_MAD_STATUS_NO_RECORDS); goto Exit; } @@ -455,14 +416,14 @@ void osm_vlarb_rec_rcv_process(IN void *ctx, IN void *data) /* * Get a MAD to reply. Address of Mad is in the received mad_wrapper */ - p_resp_madw = osm_mad_pool_get(p_rcv->p_mad_pool, + p_resp_madw = osm_mad_pool_get(sa->p_mad_pool, p_madw->h_bind, num_rec * sizeof(ib_vl_arb_table_record_t) + IB_SA_MAD_HDR_SIZE, &p_madw->mad_addr); if (!p_resp_madw) { - osm_log(p_rcv->p_log, OSM_LOG_ERROR, + osm_log(sa->p_log, OSM_LOG_ERROR, "osm_vlarb_rec_rcv_process: ERR 2A06: " "osm_mad_pool_get failed\n"); @@ -472,7 +433,7 @@ void osm_vlarb_rec_rcv_process(IN void *ctx, IN void *data) free(p_rec_item); } - osm_sa_send_error(p_rcv->p_resp, p_madw, + osm_sa_send_error(sa, p_madw, IB_SA_MAD_STATUS_NO_RESOURCES); goto Exit; } @@ -485,7 +446,7 @@ void osm_vlarb_rec_rcv_process(IN void *ctx, IN void *data) Then copy all records from the list into the response payload. 
*/ - memcpy(p_resp_sa_mad, p_rcvd_mad, IB_SA_MAD_HDR_SIZE); + memcpy(p_resp_sa_mad, sad_mad, IB_SA_MAD_HDR_SIZE); p_resp_sa_mad->method |= IB_MAD_METHOD_RESP_MASK; /* C15-0.1.5 - always return SM_Key = 0 (table 185 p 884) */ p_resp_sa_mad->sm_key = 0; @@ -525,9 +486,9 @@ void osm_vlarb_rec_rcv_process(IN void *ctx, IN void *data) CL_ASSERT(cl_is_qlist_empty(&rec_list)); status = osm_sa_vendor_send(p_resp_madw->h_bind, p_resp_madw, FALSE, - p_rcv->p_subn); + sa->p_subn); if (status != IB_SUCCESS) { - osm_log(p_rcv->p_log, OSM_LOG_ERROR, + osm_log(sa->p_log, OSM_LOG_ERROR, "osm_vlarb_rec_rcv_process: ERR 2A07: " "osm_sa_vendor_send status = %s\n", ib_get_err_str(status)); @@ -535,5 +496,5 @@ void osm_vlarb_rec_rcv_process(IN void *ctx, IN void *data) } Exit: - OSM_LOG_EXIT(p_rcv->p_log); + OSM_LOG_EXIT(sa->p_log); }
--
1.5.3.4.206.g58ba4

From sashak at voltaire.com Thu Jan 3 02:01:15 2008
From: sashak at voltaire.com (Sasha Khapyorsky)
Date: Thu, 3 Jan 2008 10:01:15 +0000
Subject: [ofa-general] [PATCH 4/4] opensm: remove not used osm_sa_*.h header files
In-Reply-To: <11993544753970-git-send-email-sashak@voltaire.com>
References: <11993544753970-git-send-email-sashak@voltaire.com>
Message-ID: <11993544763750-git-send-email-sashak@voltaire.com>

Remove not used anymore osm_sa_*.h header files.

Signed-off-by: Sasha Khapyorsky
---
 opensm/include/Makefile.am                      |   18 --
 opensm/include/opensm/osm_sa_class_port_info.h  |  252 -------------------
 opensm/include/opensm/osm_sa_guidinfo_record.h  |  259 -------------------
 opensm/include/opensm/osm_sa_informinfo.h       |  275 --------------------
 opensm/include/opensm/osm_sa_lft_record.h       |  266 --------------------
 opensm/include/opensm/osm_sa_link_record.h      |  258 -------------------
 opensm/include/opensm/osm_sa_mcmember_record.h  |  305 -----------------------
 opensm/include/opensm/osm_sa_mft_record.h       |  265 --------------------
 opensm/include/opensm/osm_sa_multipath_record.h |  254 -------------------
 opensm/include/opensm/osm_sa_node_record.h      |  258 -------------------
 opensm/include/opensm/osm_sa_path_record.h      |  255 -------------------
 opensm/include/opensm/osm_sa_pkey_record.h      |  248 ------------------
 opensm/include/opensm/osm_sa_portinfo_record.h  |  261 -------------------
 opensm/include/opensm/osm_sa_response.h         |  212 ----------------
 opensm/include/opensm/osm_sa_service_record.h   |  274 --------------------
 opensm/include/opensm/osm_sa_slvl_record.h      |  261 -------------------
 opensm/include/opensm/osm_sa_sminfo_record.h    |  252 -------------------
 opensm/include/opensm/osm_sa_sw_info_record.h   |  265 --------------------
 opensm/include/opensm/osm_sa_vlarb_record.h     |  262 -------------------
 19 files changed, 0 insertions(+), 4700 deletions(-)
 delete mode 100644 opensm/include/opensm/osm_sa_class_port_info.h
 delete mode 100644 opensm/include/opensm/osm_sa_guidinfo_record.h
 delete mode 100644 opensm/include/opensm/osm_sa_informinfo.h
 delete mode 100644 opensm/include/opensm/osm_sa_lft_record.h
 delete mode 100644 opensm/include/opensm/osm_sa_link_record.h
 delete mode 100644 opensm/include/opensm/osm_sa_mcmember_record.h
 delete mode 100644 opensm/include/opensm/osm_sa_mft_record.h
 delete mode 100644 opensm/include/opensm/osm_sa_multipath_record.h
 delete mode 100644 opensm/include/opensm/osm_sa_node_record.h
 delete mode 100644
opensm/include/opensm/osm_sa_path_record.h delete mode 100644 opensm/include/opensm/osm_sa_pkey_record.h delete mode 100644 opensm/include/opensm/osm_sa_portinfo_record.h delete mode 100644 opensm/include/opensm/osm_sa_response.h delete mode 100644 opensm/include/opensm/osm_sa_service_record.h delete mode 100644 opensm/include/opensm/osm_sa_slvl_record.h delete mode 100644 opensm/include/opensm/osm_sa_sminfo_record.h delete mode 100644 opensm/include/opensm/osm_sa_sw_info_record.h delete mode 100644 opensm/include/opensm/osm_sa_vlarb_record.h diff --git a/opensm/include/Makefile.am b/opensm/include/Makefile.am index cdb83c9..117087f 100644 --- a/opensm/include/Makefile.am +++ b/opensm/include/Makefile.am @@ -4,45 +4,32 @@ SUBDIRS = . nobase_pkginclude_HEADERS = iba/ib_types.h iba/ib_cm_types.h EXTRA_DIST = \ - $(srcdir)/opensm/osm_sa_path_record.h \ $(srcdir)/opensm/osm_lid_mgr.h \ $(srcdir)/opensm/osm_port.h \ $(srcdir)/opensm/osm_sm_state_mgr.h \ $(srcdir)/opensm/osm_state_mgr.h \ $(srcdir)/opensm/osm_rand_fwd_tbl.h \ - $(srcdir)/opensm/osm_sa_vlarb_record.h \ $(srcdir)/opensm/osm_madw.h \ $(srcdir)/opensm/osm_subnet.h \ $(srcdir)/opensm/osm_sweep_fail_ctrl.h \ - $(srcdir)/opensm/osm_sa_lft_record.h \ - $(srcdir)/opensm/osm_sa_mft_record.h \ $(srcdir)/opensm/osm_resp.h \ $(srcdir)/opensm/osm_partition.h \ $(srcdir)/opensm/osm_helper.h \ - $(srcdir)/opensm/osm_sa_portinfo_record.h \ - $(srcdir)/opensm/osm_sa_guidinfo_record.h \ - $(srcdir)/opensm/osm_sa_multipath_record.h \ - $(srcdir)/opensm/osm_sa_service_record.h \ - $(srcdir)/opensm/osm_sa_response.h \ $(srcdir)/opensm/osm_node.h \ $(srcdir)/opensm/osm_console.h \ $(srcdir)/opensm/osm_req.h \ $(srcdir)/opensm/osm_mcm_info.h \ - $(srcdir)/opensm/osm_sa_pkey_record.h \ $(srcdir)/opensm/osm_inform.h \ $(srcdir)/opensm/osm_path.h \ $(srcdir)/opensm/osm_service.h \ $(srcdir)/opensm/osm_switch.h \ $(srcdir)/opensm/osm_router.h \ $(srcdir)/opensm/osm_prefix_route.h \ - $(srcdir)/opensm/osm_sa_slvl_record.h \ 
$(srcdir)/opensm/osm_opensm.h \ $(srcdir)/opensm/osm_sa.h \ $(srcdir)/opensm/osm_port_profile.h \ $(srcdir)/opensm/osm_multicast.h \ - $(srcdir)/opensm/osm_sa_class_port_info.h \ $(srcdir)/opensm/osm_base.h \ - $(srcdir)/opensm/osm_sa_sminfo_record.h \ $(srcdir)/opensm/osm_mcast_mgr.h \ $(srcdir)/opensm/osm_errors.h \ $(srcdir)/opensm/osm_event_plugin.h \ @@ -51,18 +38,15 @@ EXTRA_DIST = \ $(srcdir)/opensm/osm_lin_fwd_tbl.h \ $(srcdir)/opensm/osm_ucast_mgr.h \ $(srcdir)/opensm/osm_db.h \ - $(srcdir)/opensm/osm_sa_informinfo.h \ $(srcdir)/opensm/osm_mad_pool.h \ $(srcdir)/opensm/osm_remote_sm.h \ $(srcdir)/opensm/osm_link_mgr.h \ $(srcdir)/opensm/osm_msgdef.h \ - $(srcdir)/opensm/osm_sa_node_record.h \ $(srcdir)/opensm/st.h \ $(srcdir)/opensm/osm_mcast_tbl.h \ $(srcdir)/opensm/osm_pkey.h \ $(srcdir)/opensm/osm_pkey_mgr.h \ $(srcdir)/opensm/osm_sa_mad_ctrl.h \ - $(srcdir)/opensm/osm_sa_link_record.h \ $(srcdir)/opensm/osm_mcm_port.h \ $(srcdir)/opensm/osm_log.h \ $(srcdir)/opensm/osm_fwd_tbl.h \ @@ -70,8 +54,6 @@ EXTRA_DIST = \ $(srcdir)/opensm/osm_sm_mad_ctrl.h \ $(srcdir)/opensm/osm_attrib_req.h \ $(srcdir)/opensm/osm_stats.h \ - $(srcdir)/opensm/osm_sa_mcmember_record.h \ - $(srcdir)/opensm/osm_sa_sw_info_record.h \ $(srcdir)/opensm/osm_vl15intf.h \ $(srcdir)/opensm/osm_drop_mgr.h \ $(srcdir)/opensm/osm_perfmgr.h \ diff --git a/opensm/include/opensm/osm_sa_class_port_info.h b/opensm/include/opensm/osm_sa_class_port_info.h deleted file mode 100644 index 52b3c9e..0000000 --- a/opensm/include/opensm/osm_sa_class_port_info.h +++ /dev/null @@ -1,252 +0,0 @@ -/* - * Copyright (c) 2004-2007 Voltaire, Inc. All rights reserved. - * Copyright (c) 2002-2005 Mellanox Technologies LTD. All rights reserved. - * Copyright (c) 1996-2003 Intel Corporation. All rights reserved. - * - * This software is available to you under a choice of one of two - * licenses. 
You may choose to be licensed under the terms of the GNU - * General Public License (GPL) Version 2, available from the file - * COPYING in the main directory of this source tree, or the - * OpenIB.org BSD license below: - * - * Redistribution and use in source and binary forms, with or - * without modification, are permitted provided that the following - * conditions are met: - * - * - Redistributions of source code must retain the above - * copyright notice, this list of conditions and the following - * disclaimer. - * - * - Redistributions in binary form must reproduce the above - * copyright notice, this list of conditions and the following - * disclaimer in the documentation and/or other materials - * provided with the distribution. - * - * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, - * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF - * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND - * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS - * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN - * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN - * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE - * SOFTWARE. - * - */ - -/* - * Abstract: - * Declaration of osm_cpi_rcv_t. - * This object represents the ClassPortInfo Receiver object. - * This object is part of the OpenSM family of objects. 
- * - * Environment: - * Linux User Mode - * - * $Revision: 1.2 $ - */ - -#ifndef _OSM_CPI_H_ -#define _OSM_CPI_H_ - -#include -#include -#include -#include -#include -#include - -#ifdef __cplusplus -# define BEGIN_C_DECLS extern "C" { -# define END_C_DECLS } -#else /* !__cplusplus */ -# define BEGIN_C_DECLS -# define END_C_DECLS -#endif /* __cplusplus */ - -BEGIN_C_DECLS -/****h* OpenSM/ClassPort Info Receiver -* NAME -* ClassPort Info Receiver -* -* DESCRIPTION -* The ClassPort Info Receiver object encapsulates the information -* needed to receive the ClassPortInfo request from a node. -* -* The ClassPort Info Receiver object is thread safe. -* -* This object should be treated as opaque and should be -* manipulated only through the provided functions. -* -* AUTHOR -* Eitan Zahavi, Mellanox -* -*********/ -/****s* OpenSM: ClassPort Info Receiver/osm_cpi_rcv_t -* NAME -* osm_cpi_rcv_t -* -* DESCRIPTION -* ClassPort Info Receiver structure. -* -* This object should be treated as opaque and should -* be manipulated only through the provided functions. -* -* SYNOPSIS -*/ -typedef struct _osm_cpi_rcv { - osm_subn_t *p_subn; - osm_sa_resp_t *p_resp; - osm_mad_pool_t *p_mad_pool; - osm_log_t *p_log; - cl_plock_t *p_lock; -} osm_cpi_rcv_t; -/* -* FIELDS -* p_subn -* Pointer to the Subnet object for this subnet. -* -* p_gen_req_ctrl -* Pointer to the generic request controller. -* -* p_log -* Pointer to the log object. -* -* p_lock -* Pointer to the serializing lock. -* -* SEE ALSO -* ClassPort Info Receiver object -*********/ - -/****f* OpenSM: ClassPort Info Receiver/osm_cpi_rcv_construct -* NAME -* osm_cpi_rcv_construct -* -* DESCRIPTION -* This function constructs a ClassPort Info Receiver object. -* -* SYNOPSIS -*/ -void osm_cpi_rcv_construct(IN osm_cpi_rcv_t * const p_rcv); -/* -* PARAMETERS -* p_rcv -* [in] Pointer to a ClassPort Info Receiver object to construct. -* -* RETURN VALUE -* This function does not return a value. 
-* -* NOTES -* Allows calling osm_cpi_rcv_init, osm_cpi_rcv_destroy -* -* Calling osm_cpi_rcv_construct is a prerequisite to calling any other -* method except osm_cpi_rcv_init. -* -* SEE ALSO -* ClassPort Info Receiver object, osm_cpi_rcv_init, osm_cpi_rcv_destroy -*********/ - -/****f* OpenSM: ClassPort Info Receiver/osm_cpi_rcv_destroy -* NAME -* osm_cpi_rcv_destroy -* -* DESCRIPTION -* The osm_cpi_rcv_destroy function destroys the object, releasing -* all resources. -* -* SYNOPSIS -*/ -void osm_cpi_rcv_destroy(IN osm_cpi_rcv_t * const p_rcv); -/* -* PARAMETERS -* p_rcv -* [in] Pointer to the object to destroy. -* -* RETURN VALUE -* This function does not return a value. -* -* NOTES -* Performs any necessary cleanup of the specified -* ClassPort Info Receiver object. -* Further operations should not be attempted on the destroyed object. -* This function should only be called after a call to -* osm_cpi_rcv_construct or osm_cpi_rcv_init. -* -* SEE ALSO -* ClassPort Info Receiver object, osm_cpi_rcv_construct, -* osm_cpi_rcv_init -*********/ - -/****f* OpenSM: ClassPort Info Receiver/osm_cpi_rcv_init -* NAME -* osm_cpi_rcv_init -* -* DESCRIPTION -* The osm_cpi_rcv_init function initializes a -* ClassPort Info Receiver object for use. -* -* SYNOPSIS -*/ -ib_api_status_t -osm_cpi_rcv_init(IN osm_cpi_rcv_t * const p_rcv, - IN osm_sa_resp_t * const p_resp, - IN osm_mad_pool_t * const p_mad_pool, - IN osm_subn_t * const p_subn, - IN osm_log_t * const p_log, IN cl_plock_t * const p_lock); -/* -* PARAMETERS -* p_rcv -* [in] Pointer to an osm_cpi_rcv_t object to initialize. -* -* p_subn -* [in] Pointer to the Subnet object for this subnet. -* -* p_log -* [in] Pointer to the log object. -* -* p_lock -* [in] Pointer to the OpenSM serializing lock. -* -* RETURN VALUES -* IB_SUCCESS if the ClassPort Info Receiver object was initialized -* successfully. -* -* NOTES -* Allows calling other ClassPort Info Receiver methods. 
-* -* SEE ALSO -* ClassPort Info Receiver object, osm_cpi_rcv_construct, -* osm_cpi_rcv_destroy -*********/ - -/****f* OpenSM: ClassPort Info Receiver/osm_cpi_rcv_process -* NAME -* osm_cpi_rcv_process -* -* DESCRIPTION -* Process the ClassPortInfo request. -* -* SYNOPSIS -*/ -void osm_cpi_rcv_process(IN void *context, IN void *data); -/* -* PARAMETERS -* context -* [in] Pointer to an osm_cpi_rcv_t object. -* -* data -* [in] Pointer to the MAD Wrapper containing the MAD -* that contains the ClassPortInfo attribute. -* -* RETURN VALUES -* IB_SUCCESS if the ClassPortInfo processing was successful. -* -* NOTES -* This function processes a ClassPortInfo attribute. -* -* SEE ALSO -* ClassPort Info Receiver, ClassPort Info Response Controller -*********/ - -END_C_DECLS -#endif /* _OSM_CPI_H_ */ diff --git a/opensm/include/opensm/osm_sa_guidinfo_record.h b/opensm/include/opensm/osm_sa_guidinfo_record.h deleted file mode 100644 index c074b7b..0000000 --- a/opensm/include/opensm/osm_sa_guidinfo_record.h +++ /dev/null @@ -1,259 +0,0 @@ -/* - * Copyright (c) 2006-2007 Voltaire, Inc. All rights reserved. - * Copyright (c) 2002-2005 Mellanox Technologies LTD. All rights reserved. - * Copyright (c) 1996-2003 Intel Corporation. All rights reserved. - * - * This software is available to you under a choice of one of two - * licenses. You may choose to be licensed under the terms of the GNU - * General Public License (GPL) Version 2, available from the file - * COPYING in the main directory of this source tree, or the - * OpenIB.org BSD license below: - * - * Redistribution and use in source and binary forms, with or - * without modification, are permitted provided that the following - * conditions are met: - * - * - Redistributions of source code must retain the above - * copyright notice, this list of conditions and the following - * disclaimer. 
- * - * - Redistributions in binary form must reproduce the above - * copyright notice, this list of conditions and the following - * disclaimer in the documentation and/or other materials - * provided with the distribution. - * - * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, - * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF - * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND - * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS - * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN - * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN - * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE - * SOFTWARE. - * - */ - -/* - * Abstract: - * Declaration of osm_gir_rcv_t. - * This object represents the GUIDInfo Record Receiver object. - * This object is part of the OpenSM family of objects. - * - * Environment: - * Linux User Mode - * - */ - -#ifndef _OSM_GIR_RCV_H_ -#define _OSM_GIR_RCV_H_ - -#include -#include -#include -#include -#include -#include -#include - -#ifdef __cplusplus -# define BEGIN_C_DECLS extern "C" { -# define END_C_DECLS } -#else /* !__cplusplus */ -# define BEGIN_C_DECLS -# define END_C_DECLS -#endif /* __cplusplus */ - -BEGIN_C_DECLS -/****h* OpenSM/GUIDInfo Record Receiver -* NAME -* GUIDInfo Record Receiver -* -* DESCRIPTION -* The GUIDInfo Record Receiver object encapsulates the information -* needed to receive the GUIDInfoRecord attribute from a node. -* -* The GUIDInfo Record Receiver object is thread safe. -* -* This object should be treated as opaque and should be -* manipulated only through the provided functions. -* -* AUTHOR -* Hal Rosenstock, Voltaire -* -*********/ -/****s* OpenSM: GUIDInfo Record Receiver/osm_gir_rcv_t -* NAME -* osm_gir_rcv_t -* -* DESCRIPTION -* GUIDInfo Record Receiver structure. -* -* This object should be treated as opaque and should -* be manipulated only through the provided functions. 
-* -* SYNOPSIS -*/ -typedef struct _osm_gir_rcv { - osm_subn_t *p_subn; - osm_sa_resp_t *p_resp; - osm_mad_pool_t *p_mad_pool; - osm_log_t *p_log; - cl_plock_t *p_lock; -} osm_gir_rcv_t; -/* -* FIELDS -* p_subn -* Pointer to the Subnet object for this subnet. -* -* p_resp -* Pointer to the SA responder. -* -* p_mad_pool -* Pointer to the mad pool. -* -* p_log -* Pointer to the log object. -* -* p_lock -* Pointer to the serializing lock. -* -* SEE ALSO -* -*********/ - -/****f* OpenSM: GUIDInfo Record Receiver/osm_gir_rcv_construct -* NAME -* osm_gir_rcv_construct -* -* DESCRIPTION -* This function constructs a GUIDInfo Record Receiver object. -* -* SYNOPSIS -*/ -void osm_gir_rcv_construct(IN osm_gir_rcv_t * const p_rcv); -/* -* PARAMETERS -* p_rcv -* [in] Pointer to a GUIDInfo Record Receiver object to construct. -* -* RETURN VALUE -* This function does not return a value. -* -* NOTES -* Allows calling osm_gir_rcv_init, osm_gir_rcv_destroy -* -* Calling osm_gir_rcv_construct is a prerequisite to calling any other -* method except osm_gir_rcv_init. -* -* SEE ALSO -* GUIDInfo Record Receiver object, osm_gir_rcv_init, -* osm_gir_rcv_destroy -*********/ - -/****f* OpenSM: GUIDInfo Record Receiver/osm_gir_rcv_destroy -* NAME -* osm_gir_rcv_destroy -* -* DESCRIPTION -* The osm_gir_rcv_destroy function destroys the object, releasing -* all resources. -* -* SYNOPSIS -*/ -void osm_gir_rcv_destroy(IN osm_gir_rcv_t * const p_rcv); -/* -* PARAMETERS -* p_rcv -* [in] Pointer to the object to destroy. -* -* RETURN VALUE -* This function does not return a value. -* -* NOTES -* Performs any necessary cleanup of the specified -* GUIDInfo Record Receiver object. -* Further operations should not be attempted on the destroyed object. -* This function should only be called after a call to -* osm_gir_rcv_construct or osm_gir_rcv_init. 
-* -* SEE ALSO -* GUIDInfo Record Receiver object, osm_gir_rcv_construct, -* osm_gir_rcv_init -*********/ - -/****f* OpenSM: GUIDInfo Record Receiver/osm_gir_rcv_init -* NAME -* osm_gir_rcv_init -* -* DESCRIPTION -* The osm_gir_rcv_init function initializes a -* GUIDInfo Record Receiver object for use. -* -* SYNOPSIS -*/ -ib_api_status_t -osm_gir_rcv_init(IN osm_gir_rcv_t * const p_rcv, - IN osm_sa_resp_t * const p_resp, - IN osm_mad_pool_t * const p_mad_pool, - IN osm_subn_t * const p_subn, - IN osm_log_t * const p_log, IN cl_plock_t * const p_lock); -/* -* PARAMETERS -* p_rcv -* [in] Pointer to an osm_gir_rcv_t object to initialize. -* -* p_req -* [in] Pointer to an osm_req_t object. -* -* p_subn -* [in] Pointer to the Subnet object for this subnet. -* -* p_log -* [in] Pointer to the log object. -* -* p_lock -* [in] Pointer to the OpenSM serializing lock. -* -* RETURN VALUES -* CL_SUCCESS if the GUIDInfo Record Receiver object was initialized -* successfully. -* -* NOTES -* Allows calling other GUIDInfo Record Receiver methods. -* -* SEE ALSO -* GUIDInfo Record Receiver object, osm_gir_rcv_construct, -* osm_gir_rcv_destroy -*********/ - -/****f* OpenSM: GUIDInfo Record Receiver/osm_gir_rcv_process -* NAME -* osm_gir_rcv_process -* -* DESCRIPTION -* Process the GUIDInfoRecord attribute. -* -* SYNOPSIS -*/ -void osm_gir_rcv_process(IN void *context, IN void *data); -/* -* PARAMETERS -* context -* [in] Pointer to an osm_gir_rcv_t object. -* -* data -* [in] Pointer to the MAD Wrapper containing the MAD -* that contains the node's GUIDInfoRecord attribute. -* -* RETURN VALUES -* CL_SUCCESS if the GUIDInfoRecord processing was successful. -* -* NOTES -* This function processes a GUIDInfoRecord attribute. 
-* -* SEE ALSO -* GUIDInfo Record Receiver, GUIDInfo Record Response Controller -*********/ - -END_C_DECLS -#endif /* _OSM_GIR_RCV_H_ */ diff --git a/opensm/include/opensm/osm_sa_informinfo.h b/opensm/include/opensm/osm_sa_informinfo.h deleted file mode 100644 index 2a4b4ba..0000000 --- a/opensm/include/opensm/osm_sa_informinfo.h +++ /dev/null @@ -1,275 +0,0 @@ -/* - * Copyright (c) 2004-2007 Voltaire, Inc. All rights reserved. - * Copyright (c) 2002-2005 Mellanox Technologies LTD. All rights reserved. - * Copyright (c) 1996-2003 Intel Corporation. All rights reserved. - * - * This software is available to you under a choice of one of two - * licenses. You may choose to be licensed under the terms of the GNU - * General Public License (GPL) Version 2, available from the file - * COPYING in the main directory of this source tree, or the - * OpenIB.org BSD license below: - * - * Redistribution and use in source and binary forms, with or - * without modification, are permitted provided that the following - * conditions are met: - * - * - Redistributions of source code must retain the above - * copyright notice, this list of conditions and the following - * disclaimer. - * - * - Redistributions in binary form must reproduce the above - * copyright notice, this list of conditions and the following - * disclaimer in the documentation and/or other materials - * provided with the distribution. - * - * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, - * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF - * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND - * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS - * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN - * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN - * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE - * SOFTWARE. - * - */ - -/* - * Abstract: - * Declaration of osm_infr_rcv_t. 
- * This object represents the InformInfoRecord Receiver object. - * attribute from a node. - * This object is part of the OpenSM family of objects. - * - * Environment: - * Linux User Mode - * - * $Revision: 1.3 $ - */ - -#ifndef _OSM_SA_INFR_H_ -#define _OSM_SA_INFR_H_ - -#include -#include -#include -#include -#include -#include -#include - -#ifdef __cplusplus -# define BEGIN_C_DECLS extern "C" { -# define END_C_DECLS } -#else /* !__cplusplus */ -# define BEGIN_C_DECLS -# define END_C_DECLS -#endif /* __cplusplus */ - -BEGIN_C_DECLS -/****h* OpenSM/InformInfo Receiver -* NAME -* InformInfo Receiver -* -* DESCRIPTION -* The InformInfo Receiver object encapsulates the information -* needed to receive the InformInfo request from a node. -* -* The InformInfo Receiver object is thread safe. -* -* This object should be treated as opaque and should be -* manipulated only through the provided functions. -* -* AUTHOR -* Eitan Zahavi, Mellanox -* -*********/ -/****s* OpenSM: InformInfo Receiver/osm_infr_rcv_t -* NAME -* osm_infr_rcv_t -* -* DESCRIPTION -* InformInfo Receiver structure. -* -* This object should be treated as opaque and should -* be manipulated only through the provided functions. -* -* SYNOPSIS -*/ -typedef struct _osm_infr_rcv { - osm_subn_t *p_subn; - osm_sa_resp_t *p_resp; - osm_mad_pool_t *p_mad_pool; - osm_log_t *p_log; - cl_plock_t *p_lock; -} osm_infr_rcv_t; -/* -* FIELDS -* p_subn -* Pointer to the Subnet object for this subnet. -* -* p_resp -* Pointer to the osm_sa_resp_t object. -* -* p_log -* Pointer to the log object. -* -* p_lock -* Pointer to the serializing lock. -* -* SEE ALSO -* InformInfo Receiver object -*********/ - -/****f* OpenSM: InformInfo Receiver/osm_infr_rcv_construct -* NAME -* osm_infr_rcv_construct -* -* DESCRIPTION -* This function constructs a InformInfo Receiver object. 
-* -* SYNOPSIS -*/ -void osm_infr_rcv_construct(IN osm_infr_rcv_t * const p_rcv); -/* -* PARAMETERS -* p_rcv -* [in] Pointer to a InformInfo Receiver object to construct. -* -* RETURN VALUE -* This function does not return a value. -* -* NOTES -* Allows calling osm_infr_rcv_init, osm_infr_rcv_destroy -* -* Calling osm_infr_rcv_construct is a prerequisite to calling any other -* method except osm_infr_rcv_init. -* -* SEE ALSO -* InformInfo Receiver object, osm_infr_rcv_init, osm_infr_rcv_destroy -*********/ - -/****f* OpenSM: InformInfo Receiver/osm_infr_rcv_destroy -* NAME -* osm_infr_rcv_destroy -* -* DESCRIPTION -* The osm_infr_rcv_destroy function destroys the object, releasing -* all resources. -* -* SYNOPSIS -*/ -void osm_infr_rcv_destroy(IN osm_infr_rcv_t * const p_rcv); -/* -* PARAMETERS -* p_rcv -* [in] Pointer to the object to destroy. -* -* RETURN VALUE -* This function does not return a value. -* -* NOTES -* Performs any necessary cleanup of the specified -* InformInfo Receiver object. -* Further operations should not be attempted on the destroyed object. -* This function should only be called after a call to -* osm_infr_rcv_construct or osm_infr_rcv_init. -* -* SEE ALSO -* InformInfo Receiver object, osm_infr_rcv_construct, -* osm_infr_rcv_init -*********/ - -/****f* OpenSM: InformInfo Receiver/osm_infr_rcv_init -* NAME -* osm_infr_rcv_init -* -* DESCRIPTION -* The osm_infr_rcv_init function initializes a -* InformInfo Receiver object for use. -* -* SYNOPSIS -*/ -ib_api_status_t -osm_infr_rcv_init(IN osm_infr_rcv_t * const p_rcv, - IN osm_sa_resp_t * const p_resp, - IN osm_mad_pool_t * const p_mad_pool, - IN osm_subn_t * const p_subn, - IN osm_log_t * const p_log, IN cl_plock_t * const p_lock); -/* -* PARAMETERS -* p_rcv -* [in] Pointer to an osm_infr_rcv_t object to initialize. -* -* p_subn -* [in] Pointer to the Subnet object for this subnet. -* -* p_log -* [in] Pointer to the log object. -* -* p_lock -* [in] Pointer to the OpenSM serializing lock. 
-* -* RETURN VALUES -* IB_SUCCESS if the InformInfo Receiver object was initialized -* successfully. -* -* NOTES -* Allows calling other InformInfo Receiver methods. -* -* SEE ALSO -* InformInfo Receiver object, osm_infr_rcv_construct, -* osm_infr_rcv_destroy -*********/ - -/****f* OpenSM: InformInfo Receiver/osm_infr_rcv_process -* NAME -* osm_infr_rcv_process -* -* DESCRIPTION -* Process the InformInfo request. -* -* SYNOPSIS -*/ -void osm_infr_rcv_process(IN void *context, IN void *data); -/* -* PARAMETERS -* context -* [in] Pointer to an osm_infr_rcv_t object. -* -* data -* [in] Pointer to the MAD Wrapper containing the MAD -* that contains the node's InformInfo attribute. -* NOTES -* This function processes a InformInfo attribute. -* -* SEE ALSO -* InformInfo Receiver -*********/ - -/****f* OpenSM: InformInfo Record Receiver/osm_infir_rcv_process -* NAME -* osm_infir_rcv_process -* -* DESCRIPTION -* Process the InformInfo Record request. -* -* SYNOPSIS -*/ -void osm_infir_rcv_process(IN void *context, IN void *data); -/* -* PARAMETERS -* context -* [in] Pointer to an osm_infr_rcv_t object. -* -* data -* [in] Pointer to the MAD Wrapper containing the MAD -* that contains the node's InformInfo Record attribute. -* NOTES -* This function processes a InformInfo Record attribute. -* -* SEE ALSO -* InformInfo Receiver -*********/ - -END_C_DECLS -#endif /* _OSM_SA_INFR_H_ */ diff --git a/opensm/include/opensm/osm_sa_lft_record.h b/opensm/include/opensm/osm_sa_lft_record.h deleted file mode 100644 index 8470490..0000000 --- a/opensm/include/opensm/osm_sa_lft_record.h +++ /dev/null @@ -1,266 +0,0 @@ -/* - * Copyright (c) 2004-2007 Voltaire, Inc. All rights reserved. - * Copyright (c) 2002-2005 Mellanox Technologies LTD. All rights reserved. - * Copyright (c) 1996-2003 Intel Corporation. All rights reserved. - * - * This software is available to you under a choice of one of two - * licenses. 
You may choose to be licensed under the terms of the GNU - * General Public License (GPL) Version 2, available from the file - * COPYING in the main directory of this source tree, or the - * OpenIB.org BSD license below: - * - * Redistribution and use in source and binary forms, with or - * without modification, are permitted provided that the following - * conditions are met: - * - * - Redistributions of source code must retain the above - * copyright notice, this list of conditions and the following - * disclaimer. - * - * - Redistributions in binary form must reproduce the above - * copyright notice, this list of conditions and the following - * disclaimer in the documentation and/or other materials - * provided with the distribution. - * - * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, - * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF - * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND - * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS - * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN - * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN - * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE - * SOFTWARE. - * - */ - -/* - * Abstract: - * Declaration of osm_lftr_rcv_t. - * This object represents the LinearForwardingTable Receiver object. - * attribute from a switch node. - * This object is part of the OpenSM family of objects. 
- * - * Environment: - * Linux User Mode - * - * $Revision: 1.4 $ - */ - -#ifndef _OSM_LFTR_H_ -#define _OSM_LFTR_H_ - -#include -#include -#include -#include -#include -#include -#include - -#ifdef __cplusplus -# define BEGIN_C_DECLS extern "C" { -# define END_C_DECLS } -#else /* !__cplusplus */ -# define BEGIN_C_DECLS -# define END_C_DECLS -#endif /* __cplusplus */ - -BEGIN_C_DECLS -/****h* OpenSM/Linear Forwarding Table Receiver -* NAME -* Linear Forwarding Table Receiver -* -* DESCRIPTION -* The Linear Forwarding Table Receiver object encapsulates the information -* needed to receive the LinearForwardingTable attribute from a switch node. -* -* The Linear Forwarding Table Receiver object is thread safe. -* -* This object should be treated as opaque and should be -* manipulated only through the provided functions. -* -* AUTHOR -* Eitan Zahavi, Mellanox Technologies LTD -* -*********/ -/****s* OpenSM: Linear Forwarding Table Receiver/osm_lftr_rcv_t -* NAME -* osm_lftr_rcv_t -* -* DESCRIPTION -* Linear Forwarding Table Receiver structure. -* -* This object should be treated as opaque and should -* be manipulated only through the provided functions. -* -* SYNOPSIS -*/ -typedef struct _osm_lft { - osm_subn_t *p_subn; - osm_stats_t *p_stats; - osm_sa_resp_t *p_resp; - osm_mad_pool_t *p_mad_pool; - osm_log_t *p_log; - cl_plock_t *p_lock; -} osm_lftr_rcv_t; -/* -* FIELDS -* p_subn -* Pointer to the Subnet object for this subnet. -* -* p_stats -* Pointer to the statistics. -* -* p_resp -* Pointer to the SA responder. -* -* p_mad_pool -* Pointer to the mad pool. -* -* p_log -* Pointer to the log object. -* -* p_lock -* Pointer to the serializing lock. -* -* SEE ALSO -* Linear Forwarding Table Receiver object -*********/ - -/****f* OpenSM: Linear Forwarding Table Receiver/osm_lftr_rcv_construct -* NAME -* osm_lftr_rcv_construct -* -* DESCRIPTION -* This function constructs a Linear Forwarding Table Receiver object. 
-* -* SYNOPSIS -*/ -void osm_lftr_rcv_construct(IN osm_lftr_rcv_t * const p_ctrl); -/* -* PARAMETERS -* p_ctrl -* [in] Pointer to a Linear Forwarding Table Receiver object to construct. -* -* RETURN VALUE -* This function does not return a value. -* -* NOTES -* Allows calling osm_lftr_rcv_init, osm_lftr_rcv_destroy -* -* Calling osm_lftr_rcv_construct is a prerequisite to calling any other -* method except osm_lftr_rcv_init. -* -* SEE ALSO -* Linear Forwarding Table Receiver object, osm_lftr_rcv_init, -* osm_lftr_rcv_destroy -*********/ - -/****f* OpenSM: Linear Forwarding Table Receiver/osm_lftr_rcv_destroy -* NAME -* osm_lftr_rcv_destroy -* -* DESCRIPTION -* The osm_lftr_rcv_destroy function destroys the object, releasing -* all resources. -* -* SYNOPSIS -*/ -void osm_lftr_rcv_destroy(IN osm_lftr_rcv_t * const p_ctrl); -/* -* PARAMETERS -* p_ctrl -* [in] Pointer to the object to destroy. -* -* RETURN VALUE -* This function does not return a value. -* -* NOTES -* Performs any necessary cleanup of the specified -* Linear Forwarding Table Receiver object. -* Further operations should not be attempted on the destroyed object. -* This function should only be called after a call to -* osm_lftr_rcv_construct or osm_lftr_rcv_init. -* -* SEE ALSO -* Linear Forwarding Table Receiver object, osm_lftr_rcv_construct, -* osm_lftr_rcv_init -*********/ - -/****f* OpenSM: Linear Forwarding Table Receiver/osm_lftr_rcv_init -* NAME -* osm_lftr_rcv_init -* -* DESCRIPTION -* The osm_lftr_rcv_init function initializes a -* Linear Forwarding Table Receiver object for use. -* -* SYNOPSIS -*/ -ib_api_status_t osm_lftr_rcv_init(IN osm_lftr_rcv_t * const p_rcv, - IN osm_sa_resp_t * const p_resp, - IN osm_mad_pool_t * const p_mad_pool, - IN osm_subn_t * const p_subn, - IN osm_log_t * const p_log, - IN cl_plock_t * const p_lock); -/* -* PARAMETERS -* p_rcv -* [in] Pointer to an osm_lftr_rcv_t object to initialize. -* -* p_req -* [in] Pointer to an osm_req_t object. 
-* -* p_subn -* [in] Pointer to the Subnet object for this subnet. -* -* p_log -* [in] Pointer to the log object. -* -* p_lock -* [in] Pointer to the OpenSM serializing lock. -* -* RETURN VALUES -* CL_SUCCESS if the Linear Forwarding Table Receiver object was initialized -* successfully. -* -* NOTES -* Allows calling other Linear Forwarding Table Receiver methods. -* -* SEE ALSO -* Linear Forwarding Table Receiver object, osm_lftr_rcv_construct, -* osm_lftr_rcv_destroy -*********/ - -/****f* OpenSM: Linear Forwarding Table Receiver/osm_lftr_rcv_process -* NAME -* osm_lftr_rcv_process -* -* DESCRIPTION -* Process the LinearForwardingTable attribute. -* -* SYNOPSIS -*/ -void osm_lftr_rcv_process(IN void *context, IN void *data); -/* -* PARAMETERS -* context -* [in] Pointer to an osm_lftr_rcv_t object. -* -* data -* [in] Pointer to the MAD Wrapper containing the MAD -* that contains the switch node's LinearForwardingTable attribute. -* -* RETURN VALUES -* CL_SUCCESS if the LinearForwardingTable processing was successful. -* -* NOTES -* This function processes a LinearForwardingTable attribute. -* -* SEE ALSO -* Linear Forwarding Table Receiver, Linear Forwarding Table Response -* Controller -*********/ - -END_C_DECLS -#endif /* _OSM_LFTR_H_ */ diff --git a/opensm/include/opensm/osm_sa_link_record.h b/opensm/include/opensm/osm_sa_link_record.h deleted file mode 100644 index d09eb69..0000000 --- a/opensm/include/opensm/osm_sa_link_record.h +++ /dev/null @@ -1,258 +0,0 @@ -/* - * Copyright (c) 2004-2007 Voltaire, Inc. All rights reserved. - * Copyright (c) 2002-2005 Mellanox Technologies LTD. All rights reserved. - * Copyright (c) 1996-2003 Intel Corporation. All rights reserved. - * - * This software is available to you under a choice of one of two - * licenses. 
You may choose to be licensed under the terms of the GNU - * General Public License (GPL) Version 2, available from the file - * COPYING in the main directory of this source tree, or the - * OpenIB.org BSD license below: - * - * Redistribution and use in source and binary forms, with or - * without modification, are permitted provided that the following - * conditions are met: - * - * - Redistributions of source code must retain the above - * copyright notice, this list of conditions and the following - * disclaimer. - * - * - Redistributions in binary form must reproduce the above - * copyright notice, this list of conditions and the following - * disclaimer in the documentation and/or other materials - * provided with the distribution. - * - * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, - * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF - * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND - * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS - * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN - * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN - * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE - * SOFTWARE. - * - */ - -/* - * Abstract: - * Declaration of osm_lr_rcv_t. - * This object represents the Link Record Receiver object. - * attribute from a node. - * This object is part of the OpenSM family of objects. 
- * - * Environment: - * Linux User Mode - * - * $Revision: 1.4 $ - */ - -#ifndef _OSM_LR_RCV_H_ -#define _OSM_LR_RCV_H_ - -#include -#include -#include -#include -#include -#include -#include - -#ifdef __cplusplus -# define BEGIN_C_DECLS extern "C" { -# define END_C_DECLS } -#else /* !__cplusplus */ -# define BEGIN_C_DECLS -# define END_C_DECLS -#endif /* __cplusplus */ - -BEGIN_C_DECLS -/****h* OpenSM/Link Record Receiver -* NAME -* Link Record Receiver -* -* DESCRIPTION -* The Link Record Receiver object encapsulates the information -* needed to receive the Link Record attribute from a node. -* -* The Link Record Receiver object is thread safe. -* -* This object should be treated as opaque and should be -* manipulated only through the provided functions. -* -* AUTHOR -* Ranjit Pandit, Intel -* -*********/ -/****s* OpenSM: Link Record Receiver/osm_lr_rcv_t -* NAME -* osm_lr_rcv_t -* -* DESCRIPTION -* Link Record Receiver structure. -* -* This object should be treated as opaque and should -* be manipulated only through the provided functions. -* -* SYNOPSIS -*/ -typedef struct _osm_lr_rcv { - osm_subn_t *p_subn; - osm_sa_resp_t *p_resp; - osm_mad_pool_t *p_mad_pool; - osm_log_t *p_log; - cl_plock_t *p_lock; -} osm_lr_rcv_t; -/* -* FIELDS -* p_subn -* Pointer to the Subnet object for this subnet. -* -* p_resp -* Pointer to the SA responder. -* -* p_mad_pool -* Pointer to the mad pool. -* -* p_log -* Pointer to the log object. -* -* p_lock -* Pointer to the serializing lock. -* -* SEE ALSO -*********/ - -/****f* OpenSM: Link Record Receiver/osm_lr_rcv_construct -* NAME -* osm_lr_rcv_construct -* -* DESCRIPTION -* This function constructs a Link Record Receiver object. -* -* SYNOPSIS -*/ -void osm_lr_rcv_construct(IN osm_lr_rcv_t * const p_rcv); -/* -* PARAMETERS -* p_rcv -* [in] Pointer to a Link Record Receiver object to construct. -* -* RETURN VALUE -* This function does not return a value. 
-* -* NOTES -* Allows calling osm_lr_rcv_init, osm_lr_rcv_destroy -* -* Calling osm_lr_rcv_construct is a prerequisite to calling any other -* method except osm_lr_rcv_init. -* -* SEE ALSO -* Link Record Receiver object, osm_lr_rcv_init, osm_lr_rcv_destroy -*********/ - -/****f* OpenSM: Link Record Receiver/osm_lr_rcv_destroy -* NAME -* osm_lr_rcv_destroy -* -* DESCRIPTION -* The osm_lr_rcv_destroy function destroys the object, releasing -* all resources. -* -* SYNOPSIS -*/ -void osm_lr_rcv_destroy(IN osm_lr_rcv_t * const p_rcv); -/* -* PARAMETERS -* p_rcv -* [in] Pointer to the object to destroy. -* -* RETURN VALUE -* This function does not return a value. -* -* NOTES -* Performs any necessary cleanup of the specified -* Link Record Receiver object. -* Further operations should not be attempted on the destroyed object. -* This function should only be called after a call to -* osm_lr_rcv_construct or osm_lr_rcv_init. -* -* SEE ALSO -* Link Record Receiver object, osm_lr_rcv_construct, -* osm_lr_rcv_init -*********/ - -/****f* OpenSM: Link Record Receiver/osm_lr_rcv_init -* NAME -* osm_lr_rcv_init -* -* DESCRIPTION -* The osm_lr_rcv_init function initializes a -* Link Record Receiver object for use. -* -* SYNOPSIS -*/ -ib_api_status_t -osm_lr_rcv_init(IN osm_lr_rcv_t * const p_rcv, - IN osm_sa_resp_t * const p_resp, - IN osm_mad_pool_t * const p_mad_pool, - IN osm_subn_t * const p_subn, - IN osm_log_t * const p_log, IN cl_plock_t * const p_lock); -/* -* PARAMETERS -* p_rcv -* [in] Pointer to an osm_lr_rcv_t object to initialize. -* -* p_resp -* [in] Pointer to the SA Responder object. -* -* p_mad_pool -* [in] Pointer to the mad pool. -* -* p_subn -* [in] Pointer to the Subnet object for this subnet. -* -* p_log -* [in] Pointer to the log object. -* -* p_lock -* [in] Pointer to the OpenSM serializing lock. -* -* RETURN VALUES -* IB_SUCCESS if the Link Record Receiver object was initialized -* successfully. 
-* -* NOTES -* Allows calling other Link Record Receiver methods. -* -* SEE ALSO -* Link Record Receiver object, osm_lr_rcv_construct, osm_lr_rcv_destroy -*********/ - -/****f* OpenSM: Link Record Receiver/osm_lr_rcv_process -* NAME -* osm_lr_rcv_process -* -* DESCRIPTION -* Process the Link Record attribute. -* -* SYNOPSIS -*/ -void osm_lr_rcv_process(IN void *context, IN void *data); -/* -* PARAMETERS -* context -* [in] Pointer to an osm_lr_rcv_t object. -* -* data -* [in] Pointer to the MAD Wrapper containing the MAD -* that contains the node's Link Record attribute. -* -* NOTES -* This function processes a Link Record attribute. -* -* SEE ALSO -* Link Record Receiver, Link Record Response Controller -*********/ - -END_C_DECLS -#endif /* _OSM_LR_RCV_H_ */ diff --git a/opensm/include/opensm/osm_sa_mcmember_record.h b/opensm/include/opensm/osm_sa_mcmember_record.h deleted file mode 100644 index 09db580..0000000 --- a/opensm/include/opensm/osm_sa_mcmember_record.h +++ /dev/null @@ -1,305 +0,0 @@ -/* - * Copyright (c) 2004-2007 Voltaire, Inc. All rights reserved. - * Copyright (c) 2002-2005 Mellanox Technologies LTD. All rights reserved. - * Copyright (c) 1996-2003 Intel Corporation. All rights reserved. - * - * This software is available to you under a choice of one of two - * licenses. You may choose to be licensed under the terms of the GNU - * General Public License (GPL) Version 2, available from the file - * COPYING in the main directory of this source tree, or the - * OpenIB.org BSD license below: - * - * Redistribution and use in source and binary forms, with or - * without modification, are permitted provided that the following - * conditions are met: - * - * - Redistributions of source code must retain the above - * copyright notice, this list of conditions and the following - * disclaimer. 
- * - * - Redistributions in binary form must reproduce the above - * copyright notice, this list of conditions and the following - * disclaimer in the documentation and/or other materials - * provided with the distribution. - * - * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, - * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF - * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND - * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS - * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN - * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN - * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE - * SOFTWARE. - * - */ - -/* - * Abstract: - * Declaration of osm_mcmr_recv_t. - * This object represents the MCMemberRecord Receiver object. - * attribute from a node. - * This object is part of the OpenSM family of objects. - * - * Environment: - * Linux User Mode - * - * $Revision: 1.7 $ - */ - -#ifndef _OSM_MCMR_H_ -#define _OSM_MCMR_H_ - -#include -#include -#include -#include -#include -#include -#include -#include - -#ifdef __cplusplus -# define BEGIN_C_DECLS extern "C" { -# define END_C_DECLS } -#else /* !__cplusplus */ -# define BEGIN_C_DECLS -# define END_C_DECLS -#endif /* __cplusplus */ - -BEGIN_C_DECLS -/****h* OpenSM/MCMember Receiver -* NAME -* MCMember Receiver -* -* DESCRIPTION -* The MCMember Receiver object encapsulates the information -* needed to receive the MCMemberRecord attribute from a node. -* -* The MCMember Receiver object is thread safe. -* -* This object should be treated as opaque and should be -* manipulated only through the provided functions. -* -* AUTHOR -* Anil Keshavamurthy, Intel -* -*********/ -/****s* OpenSM: MCMember Receiver/osm_mcmr_recv_t -* NAME -* osm_mcmr_recv_t -* -* DESCRIPTION -* MCMember Receiver structure. -* -* This object should be treated as opaque and should -* be manipulated only through the provided functions. 
-* -* SYNOPSIS -*/ -typedef struct _osm_mcmr { - osm_subn_t *p_subn; - osm_sm_t *p_sm; - osm_sa_resp_t *p_resp; - osm_mad_pool_t *p_mad_pool; - osm_log_t *p_log; - cl_plock_t *p_lock; -} osm_mcmr_recv_t; - -/* -* FIELDS -* p_subn -* Pointer to the Subnet object for this subnet. -* -* p_gen_req_ctrl -* Pointer to the generic request controller. -* -* p_log -* Pointer to the log object. -* -* p_lock -* Pointer to the serializing lock. -* -* SEE ALSO -* MCMember Receiver object -*********/ - -/****f* OpenSM: MCMember Receiver/osm_mcmr_rcv_construct -* NAME -* osm_mcmr_rcv_construct -* -* DESCRIPTION -* This function constructs a MCMember Receiver object. -* -* SYNOPSIS -*/ -void osm_mcmr_rcv_construct(IN osm_mcmr_recv_t * const p_ctrl); -/* -* PARAMETERS -* p_ctrl -* [in] Pointer to a MCMember Receiver object to construct. -* -* RETURN VALUE -* This function does not return a value. -* -* NOTES -* Allows calling osm_mcmr_rcv_init, osm_mcmr_rcv_destroy -* -* Calling osm_mcmr_rcv_construct is a prerequisite to calling any other -* method except osm_mcmr_init. -* -* SEE ALSO -* MCMember Receiver object, osm_mcmr_init, -* osm_mcmr_rcv_destroy -*********/ - -/****f* OpenSM: MCMember Receiver/osm_mcmr_rcv_destroy -* NAME -* osm_mcmr_rcv_destroy -* -* DESCRIPTION -* The osm_mcmr_rcv_destroy function destroys the object, releasing -* all resources. -* -* SYNOPSIS -*/ -void osm_mcmr_rcv_destroy(IN osm_mcmr_recv_t * const p_ctrl); -/* -* PARAMETERS -* p_ctrl -* [in] Pointer to the object to destroy. -* -* RETURN VALUE -* This function does not return a value. -* -* NOTES -* Performs any necessary cleanup of the specified -* MCMember Receiver object. -* Further operations should not be attempted on the destroyed object. -* This function should only be called after a call to -* osm_mcmr_rcv_construct or osm_mcmr_init. 
-* -* SEE ALSO -* MCMember Receiver object, osm_mcmr_rcv_construct, -* osm_mcmr_init -*********/ - -/****f* OpenSM: MCMember Receiver/osm_mcmr_rcv_init -* NAME -* osm_mcmr_init -* -* DESCRIPTION -* The osm_mcmr_init function initializes a -* MCMember Receiver object for use. -* -* SYNOPSIS -*/ -ib_api_status_t osm_mcmr_rcv_init(IN osm_sm_t * const p_sm, - IN osm_mcmr_recv_t * const p_ctrl, - IN osm_sa_resp_t * const p_resp, - IN osm_mad_pool_t * const p_mad_pool, - IN osm_subn_t * const p_subn, - IN osm_log_t * const p_log, - IN cl_plock_t * const p_lock); -/* -* PARAMETERS -* p_sm -* [in] pointer to osm_sm_t object -* p_ctrl -* [in] Pointer to an osm_mcmr_recv_t object to initialize. -* -* p_req -* [in] Pointer to an osm_req_t object. -* -* p_subn -* [in] Pointer to the Subnet object for this subnet. -* -* p_log -* [in] Pointer to the log object. -* -* p_lock -* [in] Pointer to the OpenSM serializing lock. -* -* RETURN VALUES -* CL_SUCCESS if the MCMember Receiver object was initialized -* successfully. -* -* NOTES -* Allows calling other MCMember Receiver methods. -* -* SEE ALSO -* MCMember Receiver object, osm_mcmr_rcv_construct, -* osm_mcmr_rcv_destroy -*********/ - -/****f* OpenSM: MCMember Receiver/osm_mcmr_rcv_process -* NAME -* osm_mcmr_rcv_process -* -* DESCRIPTION -* Process the MCMemberRecord attribute. -* -* SYNOPSIS -*/ -void osm_mcmr_rcv_process(IN void *context, IN void *data); -/* -* PARAMETERS -* context -* [in] Pointer to an osm_mcmr_recv_t object. -* -* data -* [in] Pointer to the MAD Wrapper containing the MAD -* that contains the node's MCMemberRecord attribute. -* -* RETURN VALUES -* CL_SUCCESS if the MCMemberRecord processing was successful. -* -* NOTES -* This function processes a MCMemberRecord attribute. 
-* -* SEE ALSO -* MCMember Receiver, MCMember Response Controller -*********/ - -/****f* OpenSM: MC Member Record Receiver/osm_mcmr_rcv_create_new_mgrp -* NAME -* osm_mcmr_rcv_create_new_mgrp -* -* DESCRIPTION -* Create new Multicast group -* -* SYNOPSIS -*/ - -ib_api_status_t -osm_mcmr_rcv_create_new_mgrp(IN osm_mcmr_recv_t * const p_mcmr, - IN uint64_t comp_mask, - IN const ib_member_rec_t * - const p_recvd_mcmember_rec, - IN const osm_physp_t * const p_req_physp, - OUT osm_mgrp_t ** pp_mgrp); -/* -* PARAMETERS -* p_mcmr -* [in] Pointer to an osm_mcmr_recv_t object. -* p_recvd_mcmember_rec -* [in] Received Multicast member record -* -* p_req_physp -* [in] The requesting osm_physp_t object. -* NULL if the creation is without a requesting port (e.g - ipoib known mcgroups) -* -* pp_mgrp -* [out] pointer the osm_mgrp_t object -* -* RETURN VALUES -* IB_SUCCESS, IB_ERROR -* -* NOTES -* -* -* SEE ALSO -* -*********/ - -END_C_DECLS -#endif /* _OSM_MCMR_H_ */ diff --git a/opensm/include/opensm/osm_sa_mft_record.h b/opensm/include/opensm/osm_sa_mft_record.h deleted file mode 100644 index 09b922d..0000000 --- a/opensm/include/opensm/osm_sa_mft_record.h +++ /dev/null @@ -1,265 +0,0 @@ -/* - * Copyright (c) 2004-2007 Voltaire, Inc. All rights reserved. - * Copyright (c) 2002-2005 Mellanox Technologies LTD. All rights reserved. - * Copyright (c) 1996-2003 Intel Corporation. All rights reserved. - * - * This software is available to you under a choice of one of two - * licenses. 
You may choose to be licensed under the terms of the GNU - * General Public License (GPL) Version 2, available from the file - * COPYING in the main directory of this source tree, or the - * OpenIB.org BSD license below: - * - * Redistribution and use in source and binary forms, with or - * without modification, are permitted provided that the following - * conditions are met: - * - * - Redistributions of source code must retain the above - * copyright notice, this list of conditions and the following - * disclaimer. - * - * - Redistributions in binary form must reproduce the above - * copyright notice, this list of conditions and the following - * disclaimer in the documentation and/or other materials - * provided with the distribution. - * - * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, - * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF - * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND - * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS - * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN - * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN - * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE - * SOFTWARE. - * - */ - -/* - * Abstract: - * Declaration of osm_mftr_rcv_t. - * This object represents the MulticastForwardingTable Receiver object. - * attribute from a switch node. - * This object is part of the OpenSM family of objects. 
- * - * Environment: - * Linux User Mode - * - */ - -#ifndef _OSM_MFTR_H_ -#define _OSM_MFTR_H_ - -#include -#include -#include -#include -#include -#include -#include - -#ifdef __cplusplus -# define BEGIN_C_DECLS extern "C" { -# define END_C_DECLS } -#else /* !__cplusplus */ -# define BEGIN_C_DECLS -# define END_C_DECLS -#endif /* __cplusplus */ - -BEGIN_C_DECLS -/****h* OpenSM/Multicast Forwarding Table Receiver -* NAME -* Multicast Forwarding Table Receiver -* -* DESCRIPTION -* The Multicast Forwarding Table Receiver object encapsulates the information -* needed to receive the MulticastForwardingTable attribute from a switch node. -* -* The Multicast Forwarding Table Receiver object is thread safe. -* -* This object should be treated as opaque and should be -* manipulated only through the provided functions. -* -* AUTHOR -* Hal Rosenstock, Voltaire -* -*********/ -/****s* OpenSM: Multicast Forwarding Table Receiver/osm_mftr_rcv_t -* NAME -* osm_mftr_rcv_t -* -* DESCRIPTION -* Multicast Forwarding Table Receiver structure. -* -* This object should be treated as opaque and should -* be manipulated only through the provided functions. -* -* SYNOPSIS -*/ -typedef struct _osm_mft { - osm_subn_t *p_subn; - osm_stats_t *p_stats; - osm_sa_resp_t *p_resp; - osm_mad_pool_t *p_mad_pool; - osm_log_t *p_log; - cl_plock_t *p_lock; -} osm_mftr_rcv_t; -/* -* FIELDS -* p_subn -* Pointer to the Subnet object for this subnet. -* -* p_stats -* Pointer to the statistics. -* -* p_resp -* Pointer to the SA responder. -* -* p_mad_pool -* Pointer to the mad pool. -* -* p_log -* Pointer to the log object. -* -* p_lock -* Pointer to the serializing lock. -* -* SEE ALSO -* Multicast Forwarding Table Receiver object -*********/ - -/****f* OpenSM: Multicast Forwarding Table Receiver/osm_mftr_rcv_construct -* NAME -* osm_mftr_rcv_construct -* -* DESCRIPTION -* This function constructs a Multicast Forwarding Table Receiver object. 
-* -* SYNOPSIS -*/ -void osm_mftr_rcv_construct(IN osm_mftr_rcv_t * const p_ctrl); -/* -* PARAMETERS -* p_ctrl -* [in] Pointer to a Multicast Forwarding Table Receiver object to construct. -* -* RETURN VALUE -* This function does not return a value. -* -* NOTES -* Allows calling osm_mftr_rcv_init, osm_mftr_rcv_destroy -* -* Calling osm_mftr_rcv_construct is a prerequisite to calling any other -* method except osm_mftr_rcv_init. -* -* SEE ALSO -* Multicast Forwarding Table Receiver object, osm_mftr_rcv_init, -* osm_mftr_rcv_destroy -*********/ - -/****f* OpenSM: Multicast Forwarding Table Receiver/osm_mftr_rcv_destroy -* NAME -* osm_mftr_rcv_destroy -* -* DESCRIPTION -* The osm_mftr_rcv_destroy function destroys the object, releasing -* all resources. -* -* SYNOPSIS -*/ -void osm_mftr_rcv_destroy(IN osm_mftr_rcv_t * const p_ctrl); -/* -* PARAMETERS -* p_ctrl -* [in] Pointer to the object to destroy. -* -* RETURN VALUE -* This function does not return a value. -* -* NOTES -* Performs any necessary cleanup of the specified -* Multicast Forwarding Table Receiver object. -* Further operations should not be attempted on the destroyed object. -* This function should only be called after a call to -* osm_mftr_rcv_construct or osm_mftr_rcv_init. -* -* SEE ALSO -* Multicast Forwarding Table Receiver object, osm_mftr_rcv_construct, -* osm_mftr_rcv_init -*********/ - -/****f* OpenSM: Multicast Forwarding Table Receiver/osm_mftr_rcv_init -* NAME -* osm_mftr_rcv_init -* -* DESCRIPTION -* The osm_mftr_rcv_init function initializes a -* Multicast Forwarding Table Receiver object for use. -* -* SYNOPSIS -*/ -ib_api_status_t osm_mftr_rcv_init(IN osm_mftr_rcv_t * const p_rcv, - IN osm_sa_resp_t * const p_resp, - IN osm_mad_pool_t * const p_mad_pool, - IN osm_subn_t * const p_subn, - IN osm_log_t * const p_log, - IN cl_plock_t * const p_lock); -/* -* PARAMETERS -* p_rcv -* [in] Pointer to an osm_mftr_rcv_t object to initialize. -* -* p_req -* [in] Pointer to an osm_req_t object. 
-* -* p_subn -* [in] Pointer to the Subnet object for this subnet. -* -* p_log -* [in] Pointer to the log object. -* -* p_lock -* [in] Pointer to the OpenSM serializing lock. -* -* RETURN VALUES -* CL_SUCCESS if the Multicast Forwarding Table Receiver object was initialized -* successfully. -* -* NOTES -* Allows calling other Multicast Forwarding Table Receiver methods. -* -* SEE ALSO -* Multicast Forwarding Table Receiver object, osm_mftr_rcv_construct, -* osm_mftr_rcv_destroy -*********/ - -/****f* OpenSM: Multicast Forwarding Table Receiver/osm_mftr_rcv_process -* NAME -* osm_mftr_rcv_process -* -* DESCRIPTION -* Process the MulticastForwardingTable attribute. -* -* SYNOPSIS -*/ -void osm_mftr_rcv_process(IN void *context, IN void *data); -/* -* PARAMETERS -* context -* [in] Pointer to an osm_mftr_rcv_t object. -* -* data -* [in] Pointer to the MAD Wrapper containing the MAD -* that contains the switch node's MulticastForwardingTable attribute. -* -* RETURN VALUES -* CL_SUCCESS if the MulticastForwardingTable processing was successful. -* -* NOTES -* This function processes a MulticastForwardingTable attribute. -* -* SEE ALSO -* Multicast Forwarding Table Receiver, Multicast Forwarding Table Response -* Controller -*********/ - -END_C_DECLS -#endif /* _OSM_MFTR_H_ */ diff --git a/opensm/include/opensm/osm_sa_multipath_record.h b/opensm/include/opensm/osm_sa_multipath_record.h deleted file mode 100644 index afd407d..0000000 --- a/opensm/include/opensm/osm_sa_multipath_record.h +++ /dev/null @@ -1,254 +0,0 @@ -/* - * Copyright (c) 2006-2007 Voltaire, Inc. All rights reserved. - * Copyright (c) 2002-2005 Mellanox Technologies LTD. All rights reserved. - * Copyright (c) 1996-2003 Intel Corporation. All rights reserved. - * - * This software is available to you under a choice of one of two - * licenses. 
You may choose to be licensed under the terms of the GNU - * General Public License (GPL) Version 2, available from the file - * COPYING in the main directory of this source tree, or the - * OpenIB.org BSD license below: - * - * Redistribution and use in source and binary forms, with or - * without modification, are permitted provided that the following - * conditions are met: - * - * - Redistributions of source code must retain the above - * copyright notice, this list of conditions and the following - * disclaimer. - * - * - Redistributions in binary form must reproduce the above - * copyright notice, this list of conditions and the following - * disclaimer in the documentation and/or other materials - * provided with the distribution. - * - * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, - * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF - * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND - * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS - * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN - * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN - * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE - * SOFTWARE. - * - */ - -/* - * Abstract: - * Declaration of osm_mpr_rcv_t. - * This object represents the MultiPathRecord Receiver object. - * attribute from a node. - * This object is part of the OpenSM family of objects. 
- * - * Environment: - * Linux User Mode - * - */ - -#ifndef _OSM_MPR_RCV_H_ -#define _OSM_MPR_RCV_H_ - -#include -#include -#include -#include -#include -#include -#include -#include - -#ifdef __cplusplus -# define BEGIN_C_DECLS extern "C" { -# define END_C_DECLS } -#else /* !__cplusplus */ -# define BEGIN_C_DECLS -# define END_C_DECLS -#endif /* __cplusplus */ - -BEGIN_C_DECLS -/****h* OpenSM/MultiPath Record Receiver -* NAME -* MultiPath Record Receiver -* -* DESCRIPTION -* The MultiPath Record Receiver object encapsulates the information -* needed to receive the PathRecord request from a node. -* -* The MultiPath Record Receiver object is thread safe. -* -* This object should be treated as opaque and should be -* manipulated only through the provided functions. -* -* AUTHOR -* Hal Rosenstock, Voltaire -* -*********/ -/****s* OpenSM: MultiPath Record Receiver/osm_mpr_rcv_t -* NAME -* osm_mpr_rcv_t -* -* DESCRIPTION -* MultiPath Record Receiver structure. -* -* This object should be treated as opaque and should -* be manipulated only through the provided functions. -* -* SYNOPSIS -*/ -typedef struct _osm_mpr_rcv { - osm_subn_t *p_subn; - osm_sa_resp_t *p_resp; - osm_mad_pool_t *p_mad_pool; - osm_log_t *p_log; - cl_plock_t *p_lock; -} osm_mpr_rcv_t; -/* -* FIELDS -* p_subn -* Pointer to the Subnet object for this subnet. -* -* p_gen_req_ctrl -* Pointer to the generic request controller. -* -* p_log -* Pointer to the log object. -* -* p_lock -* Pointer to the serializing lock. -* -* SEE ALSO -* MultiPath Record Receiver object -*********/ - -/****f* OpenSM: MultiPath Record Receiver/osm_mpr_rcv_construct -* NAME -* osm_mpr_rcv_construct -* -* DESCRIPTION -* This function constructs a MultiPath Record Receiver object. -* -* SYNOPSIS -*/ -void osm_mpr_rcv_construct(IN osm_mpr_rcv_t * const p_rcv); -/* -* PARAMETERS -* p_rcv -* [in] Pointer to a MultiPath Record Receiver object to construct. -* -* RETURN VALUE -* This function does not return a value. 
-* -* NOTES -* Allows calling osm_mpr_rcv_init, osm_mpr_rcv_destroy -* -* Calling osm_mpr_rcv_construct is a prerequisite to calling any other -* method except osm_mpr_rcv_init. -* -* SEE ALSO -* MultiPath Record Receiver object, osm_mpr_rcv_init, osm_mpr_rcv_destroy -*********/ - -/****f* OpenSM: MultiPath Record Receiver/osm_mpr_rcv_destroy -* NAME -* osm_mpr_rcv_destroy -* -* DESCRIPTION -* The osm_mpr_rcv_destroy function destroys the object, releasing -* all resources. -* -* SYNOPSIS -*/ -void osm_mpr_rcv_destroy(IN osm_mpr_rcv_t * const p_rcv); -/* -* PARAMETERS -* p_rcv -* [in] Pointer to the object to destroy. -* -* RETURN VALUE -* This function does not return a value. -* -* NOTES -* Performs any necessary cleanup of the specified -* MultiPath Record Receiver object. -* Further operations should not be attempted on the destroyed object. -* This function should only be called after a call to -* osm_mpr_rcv_construct or osm_mpr_rcv_init. -* -* SEE ALSO -* MultiPath Record Receiver object, osm_mpr_rcv_construct, -* osm_mpr_rcv_init -*********/ - -/****f* OpenSM: MultiPath Record Receiver/osm_mpr_rcv_init -* NAME -* osm_mpr_rcv_init -* -* DESCRIPTION -* The osm_mpr_rcv_init function initializes a -* MultiPath Record Receiver object for use. -* -* SYNOPSIS -*/ -ib_api_status_t -osm_mpr_rcv_init(IN osm_mpr_rcv_t * const p_rcv, - IN osm_sa_resp_t * const p_resp, - IN osm_mad_pool_t * const p_mad_pool, - IN osm_subn_t * const p_subn, - IN osm_log_t * const p_log, IN cl_plock_t * const p_lock); -/* -* PARAMETERS -* p_rcv -* [in] Pointer to an osm_mpr_rcv_t object to initialize. -* -* p_subn -* [in] Pointer to the Subnet object for this subnet. -* -* p_log -* [in] Pointer to the log object. -* -* p_lock -* [in] Pointer to the OpenSM serializing lock. -* -* RETURN VALUES -* IB_SUCCESS if the MultiPath Record Receiver object was initialized -* successfully. -* -* NOTES -* Allows calling other MultiPath Record Receiver methods. 
-* -* SEE ALSO -* MultiPath Record Receiver object, osm_mpr_rcv_construct, -* osm_mpr_rcv_destroy -*********/ - -/****f* OpenSM: MultiPath Record Receiver/osm_mpr_rcv_process -* NAME -* osm_mpr_rcv_process -* -* DESCRIPTION -* Process the MultiPathRecord request. -* -* SYNOPSIS -*/ -void osm_mpr_rcv_process(IN void *context, IN void *data); -/* -* PARAMETERS -* context -* [in] Pointer to an osm_mpr_rcv_t object. -* -* data -* [in] Pointer to the MAD Wrapper containing the MAD -* that contains the node's MultiPathRecord attribute. -* -* RETURN VALUES -* IB_SUCCESS if the MultiPathRecord processing was successful. -* -* NOTES -* This function processes a MultiPathRecord attribute. -* -* SEE ALSO -* MultiPath Record Receiver, Node Info Response Controller -*********/ - -END_C_DECLS -#endif /* _OSM_MPR_RCV_H_ */ diff --git a/opensm/include/opensm/osm_sa_node_record.h b/opensm/include/opensm/osm_sa_node_record.h deleted file mode 100644 index 8f385f8..0000000 --- a/opensm/include/opensm/osm_sa_node_record.h +++ /dev/null @@ -1,258 +0,0 @@ -/* - * Copyright (c) 2004-2007 Voltaire, Inc. All rights reserved. - * Copyright (c) 2002-2005 Mellanox Technologies LTD. All rights reserved. - * Copyright (c) 1996-2003 Intel Corporation. All rights reserved. - * - * This software is available to you under a choice of one of two - * licenses. You may choose to be licensed under the terms of the GNU - * General Public License (GPL) Version 2, available from the file - * COPYING in the main directory of this source tree, or the - * OpenIB.org BSD license below: - * - * Redistribution and use in source and binary forms, with or - * without modification, are permitted provided that the following - * conditions are met: - * - * - Redistributions of source code must retain the above - * copyright notice, this list of conditions and the following - * disclaimer. 
- * - * - Redistributions in binary form must reproduce the above - * copyright notice, this list of conditions and the following - * disclaimer in the documentation and/or other materials - * provided with the distribution. - * - * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, - * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF - * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND - * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS - * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN - * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN - * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE - * SOFTWARE. - * - */ - -/* - * Abstract: - * Declaration of osm_nr_rcv_t. - * This object represents the NodeRecord Receiver object. - * attribute from a node. - * This object is part of the OpenSM family of objects. - * - * Environment: - * Linux User Mode - * - * $Revision: 1.4 $ - */ - -#ifndef _OSM_NR_H_ -#define _OSM_NR_H_ - -#include -#include -#include -#include -#include -#include - -#ifdef __cplusplus -# define BEGIN_C_DECLS extern "C" { -# define END_C_DECLS } -#else /* !__cplusplus */ -# define BEGIN_C_DECLS -# define END_C_DECLS -#endif /* __cplusplus */ - -BEGIN_C_DECLS -/****h* OpenSM/Node Record Receiver -* NAME -* Node Record Receiver -* -* DESCRIPTION -* The Node Record Receiver object encapsulates the information -* needed to receive the NodeRecord attribute from a node. -* -* The Node record Receiver object is thread safe. -* -* This object should be treated as opaque and should be -* manipulated only through the provided functions. -* -* AUTHOR -* Anil S Keshavamurthy, Intel -* -*********/ -/****s* OpenSM: Node Record Receiver/osm_nr_rcv_t -* NAME -* osm_nr_rcv_t -* -* DESCRIPTION -* Node Record Receiver structure. -* -* This object should be treated as opaque and should -* be manipulated only through the provided functions. 
-* -* SYNOPSIS -*/ -typedef struct _osm_nr_recv { - osm_subn_t *p_subn; - osm_sa_resp_t *p_resp; - osm_mad_pool_t *p_mad_pool; - osm_log_t *p_log; - cl_plock_t *p_lock; -} osm_nr_rcv_t; -/* -* FIELDS -* p_subn -* Pointer to the Subnet object for this subnet. -* -* p_resp -* Pointer to the SA responder. -* -* p_mad_pool -* Pointer to the mad pool. -* -* p_log -* Pointer to the log object. -* -* p_lock -* Pointer to the serializing lock. -* -* SEE ALSO -* -*********/ - -/****f* OpenSM: Node Record Receiver/osm_nr_rcv_construct -* NAME -* osm_nr_rcv_construct -* -* DESCRIPTION -* This function constructs a Node Record Receiver object. -* -* SYNOPSIS -*/ -void osm_nr_rcv_construct(IN osm_nr_rcv_t * const p_rcv); -/* -* PARAMETERS -* p_rcv -* [in] Pointer to a Node Record Receiver object to construct. -* -* RETURN VALUE -* This function does not return a value. -* -* NOTES -* Allows calling osm_nr_rcv_init, osm_nr_rcv_destroy -* -* Calling osm_nr_rcv_construct is a prerequisite to calling any other -* method except osm_nr_rcv_init. -* -* SEE ALSO -* Node Record Receiver object, osm_nr_rcv_init, osm_lr_rcv_destroy -*********/ - -/****f* OpenSM: Node Record Receiver/osm_nr_rcv_destroy -* NAME -* osm_nr_rcv_destroy -* -* DESCRIPTION -* The osm_nr_rcv_destroy function destroys the object, releasing -* all resources. -* -* SYNOPSIS -*/ -void osm_nr_rcv_destroy(IN osm_nr_rcv_t * const p_rcv); -/* -* PARAMETERS -* p_rcv -* [in] Pointer to the object to destroy. -* -* RETURN VALUE -* This function does not return a value. -* -* NOTES -* Performs any necessary cleanup of the specified -* Node Record Receiver object. -* Further operations should not be attempted on the destroyed object. -* This function should only be called after a call to -* osm_nr_rcv_construct or osm_nr_rcv_init. 
-* -* SEE ALSO -* Node Record Receiver object, osm_nr_rcv_construct, -* osm_nr_rcv_init -*********/ - -/****f* OpenSM: Node Record Receiver/osm_nr_rcv_init -* NAME -* osm_nr_rcv_init -* -* DESCRIPTION -* The osm_nr_rcv_init function initializes a -* Node Record Receiver object for use. -* -* SYNOPSIS -*/ -ib_api_status_t osm_nr_rcv_init(IN osm_nr_rcv_t * const p_rcv, - IN osm_sa_resp_t * const p_resp, - IN osm_mad_pool_t * const p_mad_pool, - IN osm_subn_t * const p_subn, - IN osm_log_t * const p_log, - IN cl_plock_t * const p_lock); -/* -* PARAMETERS -* p_rcv -* [in] Pointer to an osm_nr_rcv_t object to initialize. -* -* p_resp -* [in] Pointer to the SA Responder object. -* -* p_mad_pool -* [in] Pointer to the mad pool. -* -* p_subn -* [in] Pointer to the Subnet object for this subnet. -* -* p_log -* [in] Pointer to the log object. -* -* p_lock -* [in] Pointer to the OpenSM serializing lock. -* -* RETURN VALUES -* IB_SUCCESS if the Node Record Receiver object was initialized -* successfully. -* -* NOTES -* Allows calling other Link Record Receiver methods. -* -* SEE ALSO -* Node Record Receiver object, osm_nr_rcv_construct, osm_nr_rcv_destroy -*********/ - -/****f* OpenSM: Node Record Receiver/osm_nr_rcv_process -* NAME -* osm_nr_rcv_process -* -* DESCRIPTION -* Process the NodeRecord attribute. -* -* SYNOPSIS -*/ -void osm_nr_rcv_process(IN void *context, IN void *data); -/* -* PARAMETERS -* context -* [in] Pointer to an osm_nr_rcv_t object. -* -* data -* [in] Pointer to the MAD Wrapper containing the MAD -* that contains the node's NodeRecord attribute. -* -* NOTES -* This function processes a NodeRecord attribute. 
-* -* SEE ALSO -* Node Record Receiver, Node Record Response Controller -*********/ - -END_C_DECLS -#endif /* _OSM_NR_H_ */ diff --git a/opensm/include/opensm/osm_sa_path_record.h b/opensm/include/opensm/osm_sa_path_record.h deleted file mode 100644 index 76d24fc..0000000 --- a/opensm/include/opensm/osm_sa_path_record.h +++ /dev/null @@ -1,255 +0,0 @@ -/* - * Copyright (c) 2004-2007 Voltaire, Inc. All rights reserved. - * Copyright (c) 2002-2005 Mellanox Technologies LTD. All rights reserved. - * Copyright (c) 1996-2003 Intel Corporation. All rights reserved. - * - * This software is available to you under a choice of one of two - * licenses. You may choose to be licensed under the terms of the GNU - * General Public License (GPL) Version 2, available from the file - * COPYING in the main directory of this source tree, or the - * OpenIB.org BSD license below: - * - * Redistribution and use in source and binary forms, with or - * without modification, are permitted provided that the following - * conditions are met: - * - * - Redistributions of source code must retain the above - * copyright notice, this list of conditions and the following - * disclaimer. - * - * - Redistributions in binary form must reproduce the above - * copyright notice, this list of conditions and the following - * disclaimer in the documentation and/or other materials - * provided with the distribution. - * - * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, - * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF - * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND - * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS - * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN - * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN - * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE - * SOFTWARE. - * - */ - -/* - * Abstract: - * Declaration of osm_pr_rcv_t. 
- * This object represents the PathRecord Receiver object. - * attribute from a node. - * This object is part of the OpenSM family of objects. - * - * Environment: - * Linux User Mode - * - * $Revision: 1.4 $ - */ - -#ifndef _OSM_PR_H_ -#define _OSM_PR_H_ - -#include -#include -#include -#include -#include -#include -#include - -#ifdef __cplusplus -# define BEGIN_C_DECLS extern "C" { -# define END_C_DECLS } -#else /* !__cplusplus */ -# define BEGIN_C_DECLS -# define END_C_DECLS -#endif /* __cplusplus */ - -BEGIN_C_DECLS -/****h* OpenSM/Path Record Receiver -* NAME -* Path Record Receiver -* -* DESCRIPTION -* The Path Record Receiver object encapsulates the information -* needed to receive the PathRecord request from a node. -* -* The Path Record Receiver object is thread safe. -* -* This object should be treated as opaque and should be -* manipulated only through the provided functions. -* -* AUTHOR -* Ranjit Pandit, Intel -* Steve King, Intel -* -*********/ -/****s* OpenSM: Path Record Receiver/osm_pr_rcv_t -* NAME -* osm_pr_rcv_t -* -* DESCRIPTION -* Path Record Receiver structure. -* -* This object should be treated as opaque and should -* be manipulated only through the provided functions. -* -* SYNOPSIS -*/ -typedef struct _osm_pr_rcv { - osm_subn_t *p_subn; - osm_sa_resp_t *p_resp; - osm_mad_pool_t *p_mad_pool; - osm_log_t *p_log; - cl_plock_t *p_lock; -} osm_pr_rcv_t; -/* -* FIELDS -* p_subn -* Pointer to the Subnet object for this subnet. -* -* p_gen_req_ctrl -* Pointer to the generic request controller. -* -* p_log -* Pointer to the log object. -* -* p_lock -* Pointer to the serializing lock. -* -* SEE ALSO -* Path Record Receiver object -*********/ - -/****f* OpenSM: Path Record Receiver/osm_pr_rcv_construct -* NAME -* osm_pr_rcv_construct -* -* DESCRIPTION -* This function constructs a Path Record Receiver object. 
-* -* SYNOPSIS -*/ -void osm_pr_rcv_construct(IN osm_pr_rcv_t * const p_rcv); -/* -* PARAMETERS -* p_rcv -* [in] Pointer to a Path Record Receiver object to construct. -* -* RETURN VALUE -* This function does not return a value. -* -* NOTES -* Allows calling osm_pr_rcv_init, osm_pr_rcv_destroy -* -* Calling osm_pr_rcv_construct is a prerequisite to calling any other -* method except osm_pr_rcv_init. -* -* SEE ALSO -* Path Record Receiver object, osm_pr_rcv_init, osm_pr_rcv_destroy -*********/ - -/****f* OpenSM: Path Record Receiver/osm_pr_rcv_destroy -* NAME -* osm_pr_rcv_destroy -* -* DESCRIPTION -* The osm_pr_rcv_destroy function destroys the object, releasing -* all resources. -* -* SYNOPSIS -*/ -void osm_pr_rcv_destroy(IN osm_pr_rcv_t * const p_rcv); -/* -* PARAMETERS -* p_rcv -* [in] Pointer to the object to destroy. -* -* RETURN VALUE -* This function does not return a value. -* -* NOTES -* Performs any necessary cleanup of the specified -* Path Record Receiver object. -* Further operations should not be attempted on the destroyed object. -* This function should only be called after a call to -* osm_pr_rcv_construct or osm_pr_rcv_init. -* -* SEE ALSO -* Path Record Receiver object, osm_pr_rcv_construct, -* osm_pr_rcv_init -*********/ - -/****f* OpenSM: Path Record Receiver/osm_pr_rcv_init -* NAME -* osm_pr_rcv_init -* -* DESCRIPTION -* The osm_pr_rcv_init function initializes a -* Path Record Receiver object for use. -* -* SYNOPSIS -*/ -ib_api_status_t -osm_pr_rcv_init(IN osm_pr_rcv_t * const p_rcv, - IN osm_sa_resp_t * const p_resp, - IN osm_mad_pool_t * const p_mad_pool, - IN osm_subn_t * const p_subn, - IN osm_log_t * const p_log, IN cl_plock_t * const p_lock); -/* -* PARAMETERS -* p_rcv -* [in] Pointer to an osm_pr_rcv_t object to initialize. -* -* p_subn -* [in] Pointer to the Subnet object for this subnet. -* -* p_log -* [in] Pointer to the log object. -* -* p_lock -* [in] Pointer to the OpenSM serializing lock. 
-* -* RETURN VALUES -* IB_SUCCESS if the Path Record Receiver object was initialized -* successfully. -* -* NOTES -* Allows calling other Path Record Receiver methods. -* -* SEE ALSO -* Path Record Receiver object, osm_pr_rcv_construct, -* osm_pr_rcv_destroy -*********/ - -/****f* OpenSM: Path Record Receiver/osm_pr_rcv_process -* NAME -* osm_pr_rcv_process -* -* DESCRIPTION -* Process the PathRecord request. -* -* SYNOPSIS -*/ -void osm_pr_rcv_process(IN void *context, IN void *data); -/* -* PARAMETERS -* context -* [in] Pointer to an osm_pr_rcv_t object. -* -* data -* [in] Pointer to the MAD Wrapper containing the MAD -* that contains the node's PathRecord attribute. -* -* RETURN VALUES -* IB_SUCCESS if the PathRecord processing was successful. -* -* NOTES -* This function processes a PathRecord attribute. -* -* SEE ALSO -* Path Record Receiver, Path Record Response Controller -*********/ - -END_C_DECLS -#endif /* _OSM_PR_H_ */ diff --git a/opensm/include/opensm/osm_sa_pkey_record.h b/opensm/include/opensm/osm_sa_pkey_record.h deleted file mode 100644 index b2f43f0..0000000 --- a/opensm/include/opensm/osm_sa_pkey_record.h +++ /dev/null @@ -1,248 +0,0 @@ -/* - * Copyright (c) 2004-2007 Voltaire, Inc. All rights reserved. - * Copyright (c) 2002-2005 Mellanox Technologies LTD. All rights reserved. - * Copyright (c) 1996-2003 Intel Corporation. All rights reserved. - * - * This software is available to you under a choice of one of two - * licenses. You may choose to be licensed under the terms of the GNU - * General Public License (GPL) Version 2, available from the file - * COPYING in the main directory of this source tree, or the - * OpenIB.org BSD license below: - * - * Redistribution and use in source and binary forms, with or - * without modification, are permitted provided that the following - * conditions are met: - * - * - Redistributions of source code must retain the above - * copyright notice, this list of conditions and the following - * disclaimer. 
- * - * - Redistributions in binary form must reproduce the above - * copyright notice, this list of conditions and the following - * disclaimer in the documentation and/or other materials - * provided with the distribution. - * - * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, - * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF - * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND - * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS - * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN - * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN - * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE - * SOFTWARE. - * - */ - -#ifndef _OSM_PKEY_REC_RCV_H_ -#define _OSM_PKEY_REC_RCV_H_ - -#include -#include -#include -#include -#include -#include -#include - -#ifdef __cplusplus -# define BEGIN_C_DECLS extern "C" { -# define END_C_DECLS } -#else /* !__cplusplus */ -# define BEGIN_C_DECLS -# define END_C_DECLS -#endif /* __cplusplus */ - -BEGIN_C_DECLS -/****h* OpenSM/P_Key Record Receiver -* NAME -* P_Key Record Receiver -* -* DESCRIPTION -* The P_Key Record Receiver object encapsulates the information -* needed to handle P_Key Record query from a SA. -* -* The P_Key Record Receiver object is thread safe. -* -* This object should be treated as opaque and should be -* manipulated only through the provided functions. -* -* AUTHOR -* Yael Kalka, Mellanox -* -*********/ -/****s* OpenSM: P_Key Record Receiver/osm_pkey_rec_rcv_t -* NAME -* osm_pkey_rec_rcv_t -* -* DESCRIPTION -* P_Key Record Receiver structure. -* -* This object should be treated as opaque and should -* be manipulated only through the provided functions. 
-* -* SYNOPSIS -*/ -typedef struct _osm_pkey_rec_rcv { - osm_subn_t *p_subn; - osm_sa_resp_t *p_resp; - osm_mad_pool_t *p_mad_pool; - osm_log_t *p_log; - cl_plock_t *p_lock; -} osm_pkey_rec_rcv_t; -/* -* FIELDS -* p_subn -* Pointer to the Subnet object for this subnet. -* -* p_resp -* Pointer to the SA responder. -* -* p_mad_pool -* Pointer to the mad pool. -* -* p_log -* Pointer to the log object. -* -* p_lock -* Pointer to the serializing lock. -* -* SEE ALSO -* -*********/ - -/****f* OpenSM: P_Key Record Receiver/osm_vlarb_rec_rcv_construct -* NAME -* osm_pkey_rec_rcv_construct -* -* DESCRIPTION -* This function constructs a P_Key Record Receiver object. -* -* SYNOPSIS -*/ -void osm_pkey_rec_rcv_construct(IN osm_pkey_rec_rcv_t * const p_rcv); -/* -* PARAMETERS -* p_rcv -* [in] Pointer to a P_Key Record Receiver object to construct. -* -* RETURN VALUE -* This function does not return a value. -* -* NOTES -* Allows calling osm_pkey_rec_rcv_init, osm_pkey_rec_rcv_destroy -* -* Calling osm_pkey_rec_rcv_construct is a prerequisite to calling any other -* method except osm_pkey_rec_rcv_init. -* -* SEE ALSO -* P_Key Record Receiver object, osm_pkey_rec_rcv_init, -* osm_pkey_rec_rcv_destroy -*********/ - -/****f* OpenSM: P_Key Record Receiver/osm_pkey_rec_rcv_destroy -* NAME -* osm_pkey_rec_rcv_destroy -* -* DESCRIPTION -* The osm_pkey_rec_rcv_destroy function destroys the object, releasing -* all resources. -* -* SYNOPSIS -*/ -void osm_pkey_rec_rcv_destroy(IN osm_pkey_rec_rcv_t * const p_rcv); -/* -* PARAMETERS -* p_rcv -* [in] Pointer to the object to destroy. -* -* RETURN VALUE -* This function does not return a value. -* -* NOTES -* Performs any necessary cleanup of the specified -* P_Key Record Receiver object. -* Further operations should not be attempted on the destroyed object. -* This function should only be called after a call to -* osm_pkey_rec_rcv_construct or osm_pkey_rec_rcv_init. 
-* -* SEE ALSO -* P_Key Record Receiver object, osm_pkey_rec_rcv_construct, -* osm_pkey_rec_rcv_init -*********/ - -/****f* OpenSM: P_Key Record Receiver/osm_pkey_rec_rcv_init -* NAME -* osm_pkey_rec_rcv_init -* -* DESCRIPTION -* The osm_pkey_rec_rcv_init function initializes a -* P_Key Record Receiver object for use. -* -* SYNOPSIS -*/ -ib_api_status_t -osm_pkey_rec_rcv_init(IN osm_pkey_rec_rcv_t * const p_rcv, - IN osm_sa_resp_t * const p_resp, - IN osm_mad_pool_t * const p_mad_pool, - IN osm_subn_t * const p_subn, - IN osm_log_t * const p_log, IN cl_plock_t * const p_lock); -/* -* PARAMETERS -* p_rcv -* [in] Pointer to an osm_pkey_rec_rcv_t object to initialize. -* -* p_req -* [in] Pointer to an osm_req_t object. -* -* p_subn -* [in] Pointer to the Subnet object for this subnet. -* -* p_log -* [in] Pointer to the log object. -* -* p_lock -* [in] Pointer to the OpenSM serializing lock. -* -* RETURN VALUES -* CL_SUCCESS if the P_Key Record Receiver object was initialized -* successfully. -* -* NOTES -* Allows calling other P_Key Record Receiver methods. -* -* SEE ALSO -* P_Key Record Receiver object, osm_pkey_rec_rcv_construct, -* osm_pkey_rec_rcv_destroy -*********/ - -/****f* OpenSM: P_Key Record Receiver/osm_pkey_rec_rcv_process -* NAME -* osm_pkey_rec_rcv_process -* -* DESCRIPTION -* Process the P_Key Table Query . -* -* SYNOPSIS -*/ -void osm_pkey_rec_rcv_process(IN void *context, IN void *data); -/* -* PARAMETERS -* context -* [in] Pointer to an osm_pkey_rec_rcv_t object. -* -* data -* [in] Pointer to the MAD Wrapper containing the MAD -* that contains the P_Key Record Query attribute. -* -* RETURN VALUES -* CL_SUCCESS if the Query processing was successful. -* -* NOTES -* This function processes a SA P_Key Record attribute. 
-* -* SEE ALSO -* P_Key Record Receiver, P_Key Record Response Controller -*********/ - -END_C_DECLS -#endif /* _OSM_PKEY_REC_RCV_H_ */ diff --git a/opensm/include/opensm/osm_sa_portinfo_record.h b/opensm/include/opensm/osm_sa_portinfo_record.h deleted file mode 100644 index a818f25..0000000 --- a/opensm/include/opensm/osm_sa_portinfo_record.h +++ /dev/null @@ -1,261 +0,0 @@ -/* - * Copyright (c) 2004-2007 Voltaire, Inc. All rights reserved. - * Copyright (c) 2002-2005 Mellanox Technologies LTD. All rights reserved. - * Copyright (c) 1996-2003 Intel Corporation. All rights reserved. - * - * This software is available to you under a choice of one of two - * licenses. You may choose to be licensed under the terms of the GNU - * General Public License (GPL) Version 2, available from the file - * COPYING in the main directory of this source tree, or the - * OpenIB.org BSD license below: - * - * Redistribution and use in source and binary forms, with or - * without modification, are permitted provided that the following - * conditions are met: - * - * - Redistributions of source code must retain the above - * copyright notice, this list of conditions and the following - * disclaimer. - * - * - Redistributions in binary form must reproduce the above - * copyright notice, this list of conditions and the following - * disclaimer in the documentation and/or other materials - * provided with the distribution. - * - * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, - * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF - * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND - * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS - * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN - * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN - * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE - * SOFTWARE. - * - */ - -/* - * Abstract: - * Declaration of osm_pir_rcv_t. 
- * This object represents the PortInfo Record Receiver object. - * attribute from a node. - * This object is part of the OpenSM family of objects. - * - * Environment: - * Linux User Mode - * - * $Revision: 1.4 $ - */ - -#ifndef _OSM_PIR_RCV_H_ -#define _OSM_PIR_RCV_H_ - -#include -#include -#include -#include -#include -#include -#include - -#ifdef __cplusplus -# define BEGIN_C_DECLS extern "C" { -# define END_C_DECLS } -#else /* !__cplusplus */ -# define BEGIN_C_DECLS -# define END_C_DECLS -#endif /* __cplusplus */ - -BEGIN_C_DECLS -/****h* OpenSM/PortInfo Record Receiver -* NAME -* PortInfo Record Receiver -* -* DESCRIPTION -* The PortInfo Record Receiver object encapsulates the information -* needed to receive the PortInfoRecord attribute from a node. -* -* The PortInfo Record Receiver object is thread safe. -* -* This object should be treated as opaque and should be -* manipulated only through the provided functions. -* -* AUTHOR -* Ranjit Pandit, Intel -* -*********/ -/****s* OpenSM: PortInfo Record Receiver/osm_pir_rcv_t -* NAME -* osm_pir_rcv_t -* -* DESCRIPTION -* PortInfo Record Receiver structure. -* -* This object should be treated as opaque and should -* be manipulated only through the provided functions. -* -* SYNOPSIS -*/ -typedef struct _osm_pir_rcv { - osm_subn_t *p_subn; - osm_sa_resp_t *p_resp; - osm_mad_pool_t *p_mad_pool; - osm_log_t *p_log; - cl_plock_t *p_lock; -} osm_pir_rcv_t; -/* -* FIELDS -* p_subn -* Pointer to the Subnet object for this subnet. -* -* p_resp -* Pointer to the SA responder. -* -* p_mad_pool -* Pointer to the mad pool. -* -* p_log -* Pointer to the log object. -* -* p_lock -* Pointer to the serializing lock. -* -* SEE ALSO -* -*********/ - -/****f* OpenSM: PortInfo Record Receiver/osm_pir_rcv_construct -* NAME -* osm_pir_rcv_construct -* -* DESCRIPTION -* This function constructs a PortInfo Record Receiver object. 
-* -* SYNOPSIS -*/ -void osm_pir_rcv_construct(IN osm_pir_rcv_t * const p_rcv); -/* -* PARAMETERS -* p_rcv -* [in] Pointer to a PortInfo Record Receiver object to construct. -* -* RETURN VALUE -* This function does not return a value. -* -* NOTES -* Allows calling osm_pir_rcv_init, osm_pir_rcv_destroy -* -* Calling osm_pir_rcv_construct is a prerequisite to calling any other -* method except osm_pir_rcv_init. -* -* SEE ALSO -* PortInfo Record Receiver object, osm_pir_rcv_init, -* osm_pir_rcv_destroy -*********/ - -/****f* OpenSM: PortInfo Record Receiver/osm_pir_rcv_destroy -* NAME -* osm_pir_rcv_destroy -* -* DESCRIPTION -* The osm_pir_rcv_destroy function destroys the object, releasing -* all resources. -* -* SYNOPSIS -*/ -void osm_pir_rcv_destroy(IN osm_pir_rcv_t * const p_rcv); -/* -* PARAMETERS -* p_rcv -* [in] Pointer to the object to destroy. -* -* RETURN VALUE -* This function does not return a value. -* -* NOTES -* Performs any necessary cleanup of the specified -* PortInfo Record Receiver object. -* Further operations should not be attempted on the destroyed object. -* This function should only be called after a call to -* osm_pir_rcv_construct or osm_pir_rcv_init. -* -* SEE ALSO -* PortInfo Record Receiver object, osm_pir_rcv_construct, -* osm_pir_rcv_init -*********/ - -/****f* OpenSM: PortInfo Record Receiver/osm_pir_rcv_init -* NAME -* osm_pir_rcv_init -* -* DESCRIPTION -* The osm_pir_rcv_init function initializes a -* PortInfo Record Receiver object for use. -* -* SYNOPSIS -*/ -ib_api_status_t -osm_pir_rcv_init(IN osm_pir_rcv_t * const p_rcv, - IN osm_sa_resp_t * const p_resp, - IN osm_mad_pool_t * const p_mad_pool, - IN osm_subn_t * const p_subn, - IN osm_log_t * const p_log, IN cl_plock_t * const p_lock); -/* -* PARAMETERS -* p_rcv -* [in] Pointer to an osm_pir_rcv_t object to initialize. -* -* p_req -* [in] Pointer to an osm_req_t object. -* -* p_subn -* [in] Pointer to the Subnet object for this subnet. 
-* -* p_log -* [in] Pointer to the log object. -* -* p_lock -* [in] Pointer to the OpenSM serializing lock. -* -* RETURN VALUES -* CL_SUCCESS if the PortInfo Record Receiver object was initialized -* successfully. -* -* NOTES -* Allows calling other PortInfo Record Receiver methods. -* -* SEE ALSO -* PortInfo Record Receiver object, osm_pir_rcv_construct, -* osm_pir_rcv_destroy -*********/ - -/****f* OpenSM: PortInfo Record Receiver/osm_pir_rcv_process -* NAME -* osm_pir_rcv_process -* -* DESCRIPTION -* Process the PortInfoRecord attribute. -* -* SYNOPSIS -*/ -void osm_pir_rcv_process(IN void *context, IN void *data); -/* -* PARAMETERS -* context -* [in] Pointer to an osm_pir_rcv_t object. -* -* data -* [in] Pointer to the MAD Wrapper containing the MAD -* that contains the node's PortInfoRecord attribute. -* -* RETURN VALUES -* CL_SUCCESS if the PortInfoRecord processing was successful. -* -* NOTES -* This function processes a PortInfoRecord attribute. -* -* SEE ALSO -* PortInfo Record Receiver, PortInfo Record Response Controller -*********/ - -END_C_DECLS -#endif /* _OSM_PIR_RCV_H_ */ diff --git a/opensm/include/opensm/osm_sa_response.h b/opensm/include/opensm/osm_sa_response.h deleted file mode 100644 index 53c4f95..0000000 --- a/opensm/include/opensm/osm_sa_response.h +++ /dev/null @@ -1,212 +0,0 @@ -/* - * Copyright (c) 2004, 2005 Voltaire, Inc. All rights reserved. - * Copyright (c) 2002-2005 Mellanox Technologies LTD. All rights reserved. - * Copyright (c) 1996-2003 Intel Corporation. All rights reserved. - * - * This software is available to you under a choice of one of two - * licenses. 
You may choose to be licensed under the terms of the GNU - * General Public License (GPL) Version 2, available from the file - * COPYING in the main directory of this source tree, or the - * OpenIB.org BSD license below: - * - * Redistribution and use in source and binary forms, with or - * without modification, are permitted provided that the following - * conditions are met: - * - * - Redistributions of source code must retain the above - * copyright notice, this list of conditions and the following - * disclaimer. - * - * - Redistributions in binary form must reproduce the above - * copyright notice, this list of conditions and the following - * disclaimer in the documentation and/or other materials - * provided with the distribution. - * - * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, - * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF - * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND - * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS - * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN - * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN - * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE - * SOFTWARE. - * - */ - -/* - * Abstract: - * Declaration of osm_sa_resp_t. - * This object represents an object that responds to SA queries. - * This object is part of the OpenSM family of objects. - * - * Environment: - * Linux User Mode - * - * $Revision: 1.4 $ - */ - -#ifndef _OSM_SA_RESP_H_ -#define _OSM_SA_RESP_H_ - -#include -#include -#include -#include -#include - -#ifdef __cplusplus -# define BEGIN_C_DECLS extern "C" { -# define END_C_DECLS } -#else /* !__cplusplus */ -# define BEGIN_C_DECLS -# define END_C_DECLS -#endif /* __cplusplus */ - -BEGIN_C_DECLS -/****h* OpenSM/SA Response -* NAME -* SA Response -* -* DESCRIPTION -* The SA Response object encapsulates the information -* needed to respond to an SA query. 
-* -* The SA Response object is thread safe. -* -* This object should be treated as opaque and should be -* manipulated only through the provided functions. -* -* AUTHOR -* Ranjit Pandit, Intel -* Steve King, Intel -* -*********/ -/****s* OpenSM: SA Response/osm_sa_resp_t -* NAME -* osm_sa_resp_t -* -* DESCRIPTION -* SA Response structure. -* -* This object should be treated as opaque and should -* be manipulated only through the provided functions. -* -* SYNOPSIS -*/ -typedef struct _osm_sa_resp { - osm_mad_pool_t *p_pool; - osm_subn_t *p_subn; - osm_log_t *p_log; -} osm_sa_resp_t; -/* -* FIELDS -* p_pool -* Pointer to the MAD pool. -* -* SEE ALSO -* SA Response object -*********/ - -/****f* OpenSM: SA Response/osm_sa_resp_construct -* NAME -* osm_sa_resp_construct -* -* DESCRIPTION -* This function constructs a SA Response object. -* -* SYNOPSIS -*/ -void osm_sa_resp_construct(IN osm_sa_resp_t * const p_resp); -/* -* PARAMETERS -* p_resp -* [in] Pointer to a SA Response object to construct. -* -* RETURN VALUE -* This function does not return a value. -* -* NOTES -* Allows calling osm_sa_resp_init, and osm_sa_resp_destroy. -* -* Calling osm_sa_resp_construct is a prerequisite to calling any other -* method except osm_sa_resp_init. -* -* SEE ALSO -* SA Response object, osm_sa_resp_init, -* osm_sa_resp_destroy -*********/ - -/****f* OpenSM: SA Response/osm_sa_resp_destroy -* NAME -* osm_sa_resp_destroy -* -* DESCRIPTION -* The osm_sa_resp_destroy function destroys the object, releasing -* all resources. -* -* SYNOPSIS -*/ -void osm_sa_resp_destroy(IN osm_sa_resp_t * const p_resp); -/* -* PARAMETERS -* p_resp -* [in] Pointer to the object to destroy. -* -* RETURN VALUE -* This function does not return a value. -* -* NOTES -* Performs any necessary cleanup of the specified -* SA Response object. -* Further operations should not be attempted on the destroyed object. -* This function should only be called after a call to -* osm_sa_resp_construct or osm_sa_resp_init. 
-* -* SEE ALSO -* SA Response object, osm_sa_resp_construct, -* osm_sa_resp_init -*********/ - -/****f* OpenSM: SA Response/osm_sa_resp_init -* NAME -* osm_sa_resp_init -* -* DESCRIPTION -* The osm_sa_resp_init function initializes a -* SA Response object for use. -* -* SYNOPSIS -*/ -ib_api_status_t -osm_sa_resp_init(IN osm_sa_resp_t * const p_resp, - IN osm_mad_pool_t * const p_pool, - IN osm_subn_t * const p_subn, - IN osm_log_t * const p_log); -/* -* PARAMETERS -* p_resp -* [in] Pointer to an osm_sa_resp_t object to initialize. -* -* p_mad_pool -* [in] Pointer to the MAD pool. -* -* p_subn -* [in] Pointer to Subnet object for this subnet. -* -* p_log -* [in] Pointer to the log object. -* -* RETURN VALUES -* IB_SUCCESS if the SA Response object was initialized -* successfully. -* -* NOTES -* Allows calling other SA Response methods. -* -* SEE ALSO -* SA Response object, osm_sa_resp_construct, -* osm_sa_resp_destroy -*********/ - -END_C_DECLS -#endif /* _OSM_SA_RESP_H_ */ diff --git a/opensm/include/opensm/osm_sa_service_record.h b/opensm/include/opensm/osm_sa_service_record.h deleted file mode 100644 index 63bcc46..0000000 --- a/opensm/include/opensm/osm_sa_service_record.h +++ /dev/null @@ -1,274 +0,0 @@ -/* - * Copyright (c) 2004-2007 Voltaire, Inc. All rights reserved. - * Copyright (c) 2002-2005 Mellanox Technologies LTD. All rights reserved. - * Copyright (c) 1996-2003 Intel Corporation. All rights reserved. - * - * This software is available to you under a choice of one of two - * licenses. 
You may choose to be licensed under the terms of the GNU - * General Public License (GPL) Version 2, available from the file - * COPYING in the main directory of this source tree, or the - * OpenIB.org BSD license below: - * - * Redistribution and use in source and binary forms, with or - * without modification, are permitted provided that the following - * conditions are met: - * - * - Redistributions of source code must retain the above - * copyright notice, this list of conditions and the following - * disclaimer. - * - * - Redistributions in binary form must reproduce the above - * copyright notice, this list of conditions and the following - * disclaimer in the documentation and/or other materials - * provided with the distribution. - * - * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, - * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF - * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND - * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS - * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN - * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN - * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE - * SOFTWARE. - * - */ - -/* - * Abstract: - * Declaration of osm_sr_rcv_t. - * This object represents the ServiceRecord Receiver object. - * attribute from a node. - * This object is part of the OpenSM family of objects. 
- * - * Environment: - * Linux User Mode - * - * $Revision: 1.4 $ - */ - -#ifndef _OSM_SR_H_ -#define _OSM_SR_H_ - -#include -#include -#include -#include -#include -#include -#include - -#ifdef __cplusplus -# define BEGIN_C_DECLS extern "C" { -# define END_C_DECLS } -#else /* !__cplusplus */ -# define BEGIN_C_DECLS -# define END_C_DECLS -#endif /* __cplusplus */ - -BEGIN_C_DECLS -/****h* OpenSM/Service Record Receiver -* NAME -* Service Record Receiver -* -* DESCRIPTION -* The Service Record Receiver object encapsulates the information -* needed to receive the ServiceRecord request from a node. -* -* The Service Record Receiver object is thread safe. -* -* This object should be treated as opaque and should be -* manipulated only through the provided functions. -* -* AUTHOR -* Anil S Keshavamurthy -* -*********/ -/****s* OpenSM: Service Record Receiver/osm_sr_rcv_t -* NAME -* osm_sr_rcv_t -* -* DESCRIPTION -* Service Record Receiver structure. -* -* This object should be treated as opaque and should -* be manipulated only through the provided functions. -* -* SYNOPSIS -*/ -typedef struct _osm_sr_rcv { - osm_subn_t *p_subn; - osm_sa_resp_t *p_resp; - osm_mad_pool_t *p_mad_pool; - osm_log_t *p_log; - cl_plock_t *p_lock; -} osm_sr_rcv_t; -/* -* FIELDS -* p_subn -* Pointer to the Subnet object for this subnet. -* -* p_resp -* Pointer to the osm_sa_resp_t object. -* -* p_log -* Pointer to the log object. -* -* p_lock -* Pointer to the serializing lock. -* -* SEE ALSO -* Service Record Receiver object -*********/ - -/****f* OpenSM: Service Record Receiver/osm_sr_rcv_construct -* NAME -* osm_sr_rcv_construct -* -* DESCRIPTION -* This function constructs a Service Record Receiver object. -* -* SYNOPSIS -*/ -void osm_sr_rcv_construct(IN osm_sr_rcv_t * const p_rcv); -/* -* PARAMETERS -* p_rcv -* [in] Pointer to a Service Record Receiver object to construct. -* -* RETURN VALUE -* This function does not return a value. 
-* -* NOTES -* Allows calling osm_sr_rcv_init, osm_sr_rcv_destroy -* -* Calling osm_sr_rcv_construct is a prerequisite to calling any other -* method except osm_sr_rcv_init. -* -* SEE ALSO -* Service Record Receiver object, osm_sr_rcv_init, osm_sr_rcv_destroy -*********/ - -/****f* OpenSM: Service Record Receiver/osm_sr_rcv_destroy -* NAME -* osm_sr_rcv_destroy -* -* DESCRIPTION -* The osm_sr_rcv_destroy function destroys the object, releasing -* all resources. -* -* SYNOPSIS -*/ -void osm_sr_rcv_destroy(IN osm_sr_rcv_t * const p_rcv); -/* -* PARAMETERS -* p_rcv -* [in] Pointer to the object to destroy. -* -* RETURN VALUE -* This function does not return a value. -* -* NOTES -* Performs any necessary cleanup of the specified -* Service Record Receiver object. -* Further operations should not be attempted on the destroyed object. -* This function should only be called after a call to -* osm_sr_rcv_construct or osm_sr_rcv_init. -* -* SEE ALSO -* Service Record Receiver object, osm_sr_rcv_construct, -* osm_sr_rcv_init -*********/ - -/****f* OpenSM: Service Record Receiver/osm_sr_rcv_init -* NAME -* osm_sr_rcv_init -* -* DESCRIPTION -* The osm_sr_rcv_init function initializes a -* Service Record Receiver object for use. -* -* SYNOPSIS -*/ -ib_api_status_t -osm_sr_rcv_init(IN osm_sr_rcv_t * const p_rcv, - IN osm_sa_resp_t * const p_resp, - IN osm_mad_pool_t * const p_mad_pool, - IN osm_subn_t * const p_subn, - IN osm_log_t * const p_log, IN cl_plock_t * const p_lock); -/* -* PARAMETERS -* p_rcv -* [in] Pointer to an osm_sr_rcv_t object to initialize. -* -* p_subn -* [in] Pointer to the Subnet object for this subnet. -* -* p_log -* [in] Pointer to the log object. -* -* p_lock -* [in] Pointer to the OpenSM serializing lock. -* -* RETURN VALUES -* IB_SUCCESS if the Service Record Receiver object was initialized -* successfully. -* -* NOTES -* Allows calling other Service Record Receiver methods. 
-* -* SEE ALSO -* Service Record Receiver object, osm_sr_rcv_construct, -* osm_sr_rcv_destroy -*********/ - -/****f* OpenSM: Service Record Receiver/osm_sr_rcv_process -* NAME -* osm_sr_rcv_process -* -* DESCRIPTION -* Process the ServiceRecord request. -* -* SYNOPSIS -*/ -void osm_sr_rcv_process(IN void *context, IN void *data); -/* -* PARAMETERS -* context -* [in] Pointer to an osm_sr_rcv_t object. -* -* data -* [in] Pointer to the MAD Wrapper containing the MAD -* that contains the node's ServiceRecord attribute. -* NOTES -* This function processes a ServiceRecord attribute. -* -* SEE ALSO -* Service Record Receiver -*********/ - -/****f* OpenSM: Service Record Receiver/osm_sr_rcv_lease_cb -* NAME -* osm_sr_rcv_lease_cb -* -* DESCRIPTION -* Timer Callback function which is executed to check the lease period -* expiration -* -* SYNOPSIS -*/ - -void osm_sr_rcv_lease_cb(IN void *context); -/* -* PARAMETERS -* context -* [in] Pointer to osm_sa_db_t object. -* -* NOTES -* This function processes a ServiceRecord attribute. -* -* SEE ALSO -* Service Record Receiver -*********/ - -END_C_DECLS -#endif /* _OSM_SR_H_ */ diff --git a/opensm/include/opensm/osm_sa_slvl_record.h b/opensm/include/opensm/osm_sa_slvl_record.h deleted file mode 100644 index 518a0f1..0000000 --- a/opensm/include/opensm/osm_sa_slvl_record.h +++ /dev/null @@ -1,261 +0,0 @@ -/* - * Copyright (c) 2004-2007 Voltaire, Inc. All rights reserved. - * Copyright (c) 2002-2005 Mellanox Technologies LTD. All rights reserved. - * Copyright (c) 1996-2003 Intel Corporation. All rights reserved. - * - * This software is available to you under a choice of one of two - * licenses. 
You may choose to be licensed under the terms of the GNU - * General Public License (GPL) Version 2, available from the file - * COPYING in the main directory of this source tree, or the - * OpenIB.org BSD license below: - * - * Redistribution and use in source and binary forms, with or - * without modification, are permitted provided that the following - * conditions are met: - * - * - Redistributions of source code must retain the above - * copyright notice, this list of conditions and the following - * disclaimer. - * - * - Redistributions in binary form must reproduce the above - * copyright notice, this list of conditions and the following - * disclaimer in the documentation and/or other materials - * provided with the distribution. - * - * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, - * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF - * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND - * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS - * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN - * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN - * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE - * SOFTWARE. - * - */ - -/* - * Abstract: - * Declaration of osm_slvl_rec_rcv_t. - * This object represents the SLtoVL Mapping Table Receiver object. - * attribute from a SA query. - * This object is part of the OpenSM family of objects. 
- * - * Environment: - * Linux User Mode - * - * $Revision: 1.3 $ - */ - -#ifndef _OSM_SLVL_REC_RCV_H_ -#define _OSM_SLVL_REC_RCV_H_ - -#include -#include -#include -#include -#include -#include -#include - -#ifdef __cplusplus -# define BEGIN_C_DECLS extern "C" { -# define END_C_DECLS } -#else /* !__cplusplus */ -# define BEGIN_C_DECLS -# define END_C_DECLS -#endif /* __cplusplus */ - -BEGIN_C_DECLS -/****h* OpenSM/SLtoVL Mapping Record Receiver -* NAME -* SLtoVL Mapping Record Receiver -* -* DESCRIPTION -* The SLtoVL Mapping Record Receiver object encapsulates the information -* needed to handle SLtoVL Mapping Record query from a SA. -* -* The SLtoVL Mapping Record Receiver object is thread safe. -* -* This object should be treated as opaque and should be -* manipulated only through the provided functions. -* -* AUTHOR -* Eitan Zahavi, Mellanox -* -*********/ -/****s* OpenSM: SLtoVL Mapping Record Receiver/osm_slvl_rec_rcv_t -* NAME -* osm_slvl_rec_rcv_t -* -* DESCRIPTION -* SLtoVL Mapping Record Receiver structure. -* -* This object should be treated as opaque and should -* be manipulated only through the provided functions. -* -* SYNOPSIS -*/ -typedef struct _osm_slvl_rec_rcv { - osm_subn_t *p_subn; - osm_sa_resp_t *p_resp; - osm_mad_pool_t *p_mad_pool; - osm_log_t *p_log; - cl_plock_t *p_lock; -} osm_slvl_rec_rcv_t; -/* -* FIELDS -* p_subn -* Pointer to the Subnet object for this subnet. -* -* p_resp -* Pointer to the SA responder. -* -* p_mad_pool -* Pointer to the mad pool. -* -* p_log -* Pointer to the log object. -* -* p_lock -* Pointer to the serializing lock. -* -* SEE ALSO -* -*********/ - -/****f* OpenSM: SLtoVL Mapping Record Receiver/osm_slvl_rec_rcv_construct -* NAME -* osm_slvl_rec_rcv_construct -* -* DESCRIPTION -* This function constructs a SLtoVL Mapping Record Receiver object. 
-* -* SYNOPSIS -*/ -void osm_slvl_rec_rcv_construct(IN osm_slvl_rec_rcv_t * const p_rcv); -/* -* PARAMETERS -* p_rcv -* [in] Pointer to a SLtoVL Mapping Record Receiver object to construct. -* -* RETURN VALUE -* This function does not return a value. -* -* NOTES -* Allows calling osm_slvl_rec_rcv_init, osm_slvl_rec_rcv_destroy -* -* Calling osm_slvl_rec_rcv_construct is a prerequisite to calling any other -* method except osm_slvl_rec_rcv_init. -* -* SEE ALSO -* SLtoVL Mapping Record Receiver object, osm_slvl_rec_rcv_init, -* osm_slvl_rec_rcv_destroy -*********/ - -/****f* OpenSM: SLtoVL Mapping Record Receiver/osm_slvl_rec_rcv_destroy -* NAME -* osm_slvl_rec_rcv_destroy -* -* DESCRIPTION -* The osm_slvl_rec_rcv_destroy function destroys the object, releasing -* all resources. -* -* SYNOPSIS -*/ -void osm_slvl_rec_rcv_destroy(IN osm_slvl_rec_rcv_t * const p_rcv); -/* -* PARAMETERS -* p_rcv -* [in] Pointer to the object to destroy. -* -* RETURN VALUE -* This function does not return a value. -* -* NOTES -* Performs any necessary cleanup of the specified -* SLtoVL Mapping Record Receiver object. -* Further operations should not be attempted on the destroyed object. -* This function should only be called after a call to -* osm_slvl_rec_rcv_construct or osm_slvl_rec_rcv_init. -* -* SEE ALSO -* SLtoVL Mapping Record Receiver object, osm_slvl_rec_rcv_construct, -* osm_slvl_rec_rcv_init -*********/ - -/****f* OpenSM: SLtoVL Mapping Record Receiver/osm_slvl_rec_rcv_init -* NAME -* osm_slvl_rec_rcv_init -* -* DESCRIPTION -* The osm_slvl_rec_rcv_init function initializes a -* SLtoVL Mapping Record Receiver object for use. -* -* SYNOPSIS -*/ -ib_api_status_t -osm_slvl_rec_rcv_init(IN osm_slvl_rec_rcv_t * const p_rcv, - IN osm_sa_resp_t * const p_resp, - IN osm_mad_pool_t * const p_mad_pool, - IN osm_subn_t * const p_subn, - IN osm_log_t * const p_log, IN cl_plock_t * const p_lock); -/* -* PARAMETERS -* p_rcv -* [in] Pointer to an osm_slvl_rec_rcv_t object to initialize. 
-* -* p_req -* [in] Pointer to an osm_req_t object. -* -* p_subn -* [in] Pointer to the Subnet object for this subnet. -* -* p_log -* [in] Pointer to the log object. -* -* p_lock -* [in] Pointer to the OpenSM serializing lock. -* -* RETURN VALUES -* CL_SUCCESS if the SLtoVL Mapping Record Receiver object was initialized -* successfully. -* -* NOTES -* Allows calling other SLtoVL Mapping Record Receiver methods. -* -* SEE ALSO -* SLtoVL Mapping Record Receiver object, osm_slvl_rec_rcv_construct, -* osm_slvl_rec_rcv_destroy -*********/ - -/****f* OpenSM: SLtoVL Mapping Record Receiver/osm_slvl_rec_rcv_process -* NAME -* osm_slvl_rec_rcv_process -* -* DESCRIPTION -* Process the SLtoVL Map Table Query . -* -* SYNOPSIS -*/ -void osm_slvl_rec_rcv_process(IN void *context, IN void *data); -/* -* PARAMETERS -* context -* [in] Pointer to an osm_slvl_rec_rcv_t object. -* -* data -* [in] Pointer to the MAD Wrapper containing the MAD -* that contains the SLtoVL Map Record Query attribute. -* -* RETURN VALUES -* CL_SUCCESS if the Query processing was successful. -* -* NOTES -* This function processes a SA SLtoVL Map Record attribute. -* -* SEE ALSO -* SLtoVL Mapping Record Receiver, SLtoVL Mapping Record Response Controller -*********/ - -END_C_DECLS -#endif /* _OSM_SLVL_REC_RCV_H_ */ diff --git a/opensm/include/opensm/osm_sa_sminfo_record.h b/opensm/include/opensm/osm_sa_sminfo_record.h deleted file mode 100644 index f4fd1ff..0000000 --- a/opensm/include/opensm/osm_sa_sminfo_record.h +++ /dev/null @@ -1,252 +0,0 @@ -/* - * Copyright (c) 2004-2007 Voltaire, Inc. All rights reserved. - * Copyright (c) 2002-2005 Mellanox Technologies LTD. All rights reserved. - * Copyright (c) 1996-2003 Intel Corporation. All rights reserved. - * - * This software is available to you under a choice of one of two - * licenses. 
You may choose to be licensed under the terms of the GNU - * General Public License (GPL) Version 2, available from the file - * COPYING in the main directory of this source tree, or the - * OpenIB.org BSD license below: - * - * Redistribution and use in source and binary forms, with or - * without modification, are permitted provided that the following - * conditions are met: - * - * - Redistributions of source code must retain the above - * copyright notice, this list of conditions and the following - * disclaimer. - * - * - Redistributions in binary form must reproduce the above - * copyright notice, this list of conditions and the following - * disclaimer in the documentation and/or other materials - * provided with the distribution. - * - * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, - * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF - * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND - * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS - * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN - * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN - * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE - * SOFTWARE. - * - */ - -/* - * Abstract: - * Declaration of osm_smir_rcv_t. - * This object represents the SMInfo Receiver object. - * attribute from a node. - * This object is part of the OpenSM family of objects. 
- * - * Environment: - * Linux User Mode - * - * $Revision: 1.4 $ - */ - -#ifndef _OSM_SMIR_H_ -#define _OSM_SMIR_H_ - -#include -#include -#include -#include -#include -#include -#include - -#ifdef __cplusplus -# define BEGIN_C_DECLS extern "C" { -# define END_C_DECLS } -#else /* !__cplusplus */ -# define BEGIN_C_DECLS -# define END_C_DECLS -#endif /* __cplusplus */ - -BEGIN_C_DECLS -/****h* OpenSM/SM Info Receiver -* NAME -* SM Info Receiver -* -* DESCRIPTION -* The SM Info Receiver object encapsulates the information -* needed to receive the SMInfoRecord attribute from a node. -* -* The SM Info Receiver object is thread safe. -* -* This object should be treated as opaque and should be -* manipulated only through the provided functions. -* -* AUTHOR -* Ranjit Pandit, Intel -* -*********/ -/****s* OpenSM: SM Info Receiver/osm_smir_rcv_t -* NAME -* osm_smir_rcv_t -* -* DESCRIPTION -* SM Info Receiver structure. -* -* This object should be treated as opaque and should -* be manipulated only through the provided functions. -* -* SYNOPSIS -*/ -typedef struct _osm_smir { - osm_subn_t *p_subn; - osm_stats_t *p_stats; - osm_sa_resp_t *p_resp; - osm_mad_pool_t *p_mad_pool; - osm_log_t *p_log; - cl_plock_t *p_lock; -} osm_smir_rcv_t; -/* -* FIELDS -* p_subn -* Pointer to the Subnet object for this subnet. -* -* SEE ALSO -* SM Info Receiver object -*********/ - -/****f* OpenSM: SM Info Receiver/osm_smir_rcv_construct -* NAME -* osm_smir_rcv_construct -* -* DESCRIPTION -* This function constructs a SM Info Receiver object. -* -* SYNOPSIS -*/ -void osm_smir_rcv_construct(IN osm_smir_rcv_t * const p_ctrl); -/* -* PARAMETERS -* p_ctrl -* [in] Pointer to a SM Info Receiver object to construct. -* -* RETURN VALUE -* This function does not return a value. -* -* NOTES -* Allows calling osm_smir_rcv_init, osm_smir_rcv_destroy -* -* Calling osm_smir_rcv_construct is a prerequisite to calling any other -* method except osm_smir_rcv_init. 
-* -* SEE ALSO -* SM Info Receiver object, osm_smir_rcv_init, osm_smir_rcv_destroy -*********/ - -/****f* OpenSM: SM Info Receiver/osm_smir_rcv_destroy -* NAME -* osm_smir_rcv_destroy -* -* DESCRIPTION -* The osm_smir_rcv_destroy function destroys the object, releasing -* all resources. -* -* SYNOPSIS -*/ -void osm_smir_rcv_destroy(IN osm_smir_rcv_t * const p_ctrl); -/* -* PARAMETERS -* p_ctrl -* [in] Pointer to the object to destroy. -* -* RETURN VALUE -* This function does not return a value. -* -* NOTES -* Performs any necessary cleanup of the specified -* SM Info Receiver object. -* Further operations should not be attempted on the destroyed object. -* This function should only be called after a call to -* osm_smir_rcv_construct or osm_smir_rcv_init. -* -* SEE ALSO -* SM Info Receiver object, osm_smir_rcv_construct, -* osm_smir_rcv_init -*********/ - -/****f* OpenSM: SM Info Receiver/osm_smir_rcv_init -* NAME -* osm_smir_rcv_init -* -* DESCRIPTION -* The osm_smir_rcv_init function initializes a -* SM Info Receiver object for use. -* -* SYNOPSIS -*/ -ib_api_status_t osm_smir_rcv_init(IN osm_smir_rcv_t * const p_ctrl, - IN osm_sa_resp_t * const p_resp, - IN osm_mad_pool_t * const p_mad_pool, - IN osm_subn_t * const p_subn, - IN osm_stats_t * const p_stats, - IN osm_log_t * const p_log, - IN cl_plock_t * const p_lock); -/* -* PARAMETERS -* p_ctrl -* [in] Pointer to an osm_smir_rcv_t object to initialize. -* -* p_req -* [in] Pointer to an osm_req_t object. -* -* p_subn -* [in] Pointer to the Subnet object for this subnet. -* -* p_stats -* [in] Pointer to the Statistics object for this subnet. -* -* p_log -* [in] Pointer to the log object. -* -* p_lock -* [in] Pointer to the OpenSM serializing lock. -* -* RETURN VALUES -* CL_SUCCESS if the SM Info Receiver object was initialized -* successfully. -* -* NOTES -* Allows calling other SM Info Receiver methods. 
-* -* SEE ALSO -* SM Info Receiver object, osm_smir_rcv_construct, osm_smir_rcv_destroy -*********/ - -/****f* OpenSM: SM Info Receiver/osm_smir_rcv_process -* NAME -* osm_smir_rcv_process -* -* DESCRIPTION -* Process the SMInfoRecord attribute. -* -* SYNOPSIS -*/ -void osm_smir_rcv_process(IN void *context, IN void *data); -/* -* PARAMETERS -* context -* [in] Pointer to an osm_smir_rcv_t object. -* -* data -* [in] Pointer to the MAD Wrapper containing the MAD -* that contains the node's SMInfoRecord attribute. -* -* RETURN VALUES -* CL_SUCCESS if the SMInfoRecord processing was successful. -* -* NOTES -* This function processes a SMInfoRecord attribute. -* -* SEE ALSO -* SM Info Receiver, SM Info Response Controller -*********/ - -END_C_DECLS -#endif /* _OSM_SMIR_H_ */ diff --git a/opensm/include/opensm/osm_sa_sw_info_record.h b/opensm/include/opensm/osm_sa_sw_info_record.h deleted file mode 100644 index df6f842..0000000 --- a/opensm/include/opensm/osm_sa_sw_info_record.h +++ /dev/null @@ -1,265 +0,0 @@ -/* - * Copyright (c) 2006-2007 Voltaire, Inc. All rights reserved. - * Copyright (c) 2002-2005 Mellanox Technologies LTD. All rights reserved. - * Copyright (c) 1996-2003 Intel Corporation. All rights reserved. - * - * This software is available to you under a choice of one of two - * licenses. You may choose to be licensed under the terms of the GNU - * General Public License (GPL) Version 2, available from the file - * COPYING in the main directory of this source tree, or the - * OpenIB.org BSD license below: - * - * Redistribution and use in source and binary forms, with or - * without modification, are permitted provided that the following - * conditions are met: - * - * - Redistributions of source code must retain the above - * copyright notice, this list of conditions and the following - * disclaimer. 
- * - * - Redistributions in binary form must reproduce the above - * copyright notice, this list of conditions and the following - * disclaimer in the documentation and/or other materials - * provided with the distribution. - * - * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, - * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF - * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND - * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS - * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN - * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN - * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE - * SOFTWARE. - * - */ - -/* - * Abstract: - * Declaration of osm_sir_rcv_t. - * This object represents the SwitchInfo Receiver object. - * attribute from a switch node. - * This object is part of the OpenSM family of objects. - * - * Environment: - * Linux User Mode - * - */ - -#ifndef _OSM_SIR_RCV_H_ -#define _OSM_SIR_RCV_H_ - -#include -#include -#include -#include -#include -#include -#include -#include - -#ifdef __cplusplus -# define BEGIN_C_DECLS extern "C" { -# define END_C_DECLS } -#else /* !__cplusplus */ -# define BEGIN_C_DECLS -# define END_C_DECLS -#endif /* __cplusplus */ - -BEGIN_C_DECLS -/****h* OpenSM/Switch Info Receiver -* NAME -* Switch Info Receiver -* -* DESCRIPTION -* The Switch Info Receiver object encapsulates the information -* needed to receive the SwitchInfo attribute from a switch node. -* -* The Switch Info Receiver object is thread safe. -* -* This object should be treated as opaque and should be -* manipulated only through the provided functions. -* -* AUTHOR -* Hal Rosenstock, Voltaire -* -*********/ -/****s* OpenSM: Switch Info Receiver/osm_sir_rcv_t -* NAME -* osm_sir_rcv_t -* -* DESCRIPTION -* Switch Info Receiver structure. -* -* This object should be treated as opaque and should -* be manipulated only through the provided functions. 
-* -* SYNOPSIS -*/ -typedef struct _osm_sir_rcv { - osm_subn_t *p_subn; - osm_sa_resp_t *p_resp; - osm_mad_pool_t *p_mad_pool; - osm_log_t *p_log; - osm_req_t *p_req; - osm_state_mgr_t *p_state_mgr; - cl_plock_t *p_lock; -} osm_sir_rcv_t; -/* -* FIELDS -* p_subn -* Pointer to the Subnet object for this subnet. -* -* p_log -* Pointer to the log object. -* -* p_req -* Pointer to the Request object. -* -* p_state_mgr -* Pointer to the State Manager object. -* -* p_lock -* Pointer to the serializing lock. -* -* SEE ALSO -* Switch Info Receiver object -*********/ - -/****f* OpenSM: Switch Info Receiver/osm_sir_rcv_construct -* NAME -* osm_sir_rcv_construct -* -* DESCRIPTION -* This function constructs a Switch Info Receiver object. -* -* SYNOPSIS -*/ -void osm_sir_rcv_construct(IN osm_sir_rcv_t * const p_ctrl); -/* -* PARAMETERS -* p_ctrl -* [in] Pointer to a Switch Info Receiver object to construct. -* -* RETURN VALUE -* This function does not return a value. -* -* NOTES -* Allows calling osm_sir_rcv_destroy. -* -* Calling osm_sir_rcv_construct is a prerequisite to calling any other -* method except osm_sir_rcv_init. -* -* SEE ALSO -* Switch Info Receiver object, osm_sir_rcv_init, osm_sir_rcv_destroy -*********/ - -/****f* OpenSM: Switch Info Receiver/osm_sir_rcv_destroy -* NAME -* osm_sir_rcv_destroy -* -* DESCRIPTION -* The osm_sir_rcv_destroy function destroys the object, releasing -* all resources. -* -* SYNOPSIS -*/ -void osm_sir_rcv_destroy(IN osm_sir_rcv_t * const p_ctrl); -/* -* PARAMETERS -* p_ctrl -* [in] Pointer to the object to destroy. -* -* RETURN VALUE -* This function does not return a value. -* -* NOTES -* Performs any necessary cleanup of the specified -* Switch Info Receiver object. -* Further operations should not be attempted on the destroyed object. -* This function should only be called after a call to -* osm_sir_rcv_construct or osm_sir_rcv_init. 
-* -* SEE ALSO -* Switch Info Receiver object, osm_sir_rcv_construct, -* osm_sir_rcv_init -*********/ - -/****f* OpenSM: Switch Info Receiver/osm_sir_rcv_init -* NAME -* osm_sir_rcv_init -* -* DESCRIPTION -* The osm_sir_rcv_init function initializes a -* Switch Info Receiver object for use. -* -* SYNOPSIS -*/ -ib_api_status_t osm_sir_rcv_init(IN osm_sir_rcv_t * const p_rcv, - IN osm_sa_resp_t * const p_resp, - IN osm_mad_pool_t * const p_mad_pool, - IN osm_subn_t * const p_subn, - IN osm_log_t * const p_log, - IN cl_plock_t * const p_lock); -/* -* PARAMETERS -* p_rcv -* [in] Pointer to an osm_sir_rcv_t object to initialize. -* -* p_resp -* [in] Pointer to the SA Responder object. -* -* p_mad_pool -* [in] Pointer to the mad pool. -* -* p_subn -* [in] Pointer to the Subnet object for this subnet. -* -* p_log -* [in] Pointer to the log object. -* -* p_lock -* [in] Pointer to the OpenSM serializing lock. -* -* RETURN VALUES -* IB_SUCCESS if the Switch Info Receiver object was initialized -* successfully. -* -* NOTES -* Allows calling other Switch Info Receiver methods. -* -* SEE ALSO -* Switch Info Receiver object, osm_sir_rcv_construct, -* osm_sir_rcv_destroy -*********/ - -/****f* OpenSM: Switch Info Receiver/osm_sir_rcv_process -* NAME -* osm_sir_rcv_process -* -* DESCRIPTION -* Process the SwitchInfo attribute. -* -* SYNOPSIS -*/ -void osm_sir_rcv_process(IN void *context, IN void *data); -/* -* PARAMETERS -* context -* [in] Pointer to an osm_sir_rcv_t object. -* -* data -* [in] Pointer to the MAD Wrapper containing the MAD -* that contains the node's SwitchInfo attribute. -* -* RETURN VALUES -* CL_SUCCESS if the SwitchInfo processing was successful. -* -* NOTES -* This function processes a SwitchInfo attribute. 
-* -* SEE ALSO -* Switch Info Receiver, Switch Info Response Controller -*********/ - -END_C_DECLS -#endif /* _OSM_SIR_RCV_H_ */ diff --git a/opensm/include/opensm/osm_sa_vlarb_record.h b/opensm/include/opensm/osm_sa_vlarb_record.h deleted file mode 100644 index 1ed8554..0000000 --- a/opensm/include/opensm/osm_sa_vlarb_record.h +++ /dev/null @@ -1,262 +0,0 @@ -/* - * Copyright (c) 2004-2007 Voltaire, Inc. All rights reserved. - * Copyright (c) 2002-2005 Mellanox Technologies LTD. All rights reserved. - * Copyright (c) 1996-2003 Intel Corporation. All rights reserved. - * - * This software is available to you under a choice of one of two - * licenses. You may choose to be licensed under the terms of the GNU - * General Public License (GPL) Version 2, available from the file - * COPYING in the main directory of this source tree, or the - * OpenIB.org BSD license below: - * - * Redistribution and use in source and binary forms, with or - * without modification, are permitted provided that the following - * conditions are met: - * - * - Redistributions of source code must retain the above - * copyright notice, this list of conditions and the following - * disclaimer. - * - * - Redistributions in binary form must reproduce the above - * copyright notice, this list of conditions and the following - * disclaimer in the documentation and/or other materials - * provided with the distribution. - * - * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, - * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF - * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND - * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS - * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN - * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN - * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE - * SOFTWARE. - * - */ - -/* - * Abstract: - * Declaration of osm_vlarb_rec_rcv_t. 
- * This object represents the VLArbitration Record Receiver object. - * attribute from a node. - * This object is part of the OpenSM family of objects. - * - * Environment: - * Linux User Mode - * - * $Revision: 1.3 $ - */ - -#ifndef _OSM_VLARB_REC_RCV_H_ -#define _OSM_VLARB_REC_RCV_H_ - -#include -#include -#include -#include -#include -#include -#include - -#ifdef __cplusplus -# define BEGIN_C_DECLS extern "C" { -# define END_C_DECLS } -#else /* !__cplusplus */ -# define BEGIN_C_DECLS -# define END_C_DECLS -#endif /* __cplusplus */ - -BEGIN_C_DECLS -/****h* OpenSM/VLArbitration Record Receiver -* NAME -* VLArbitration Record Receiver -* -* DESCRIPTION -* The VLArbitration Record Receiver object encapsulates the information -* needed to handle VL Arbitration Record query from a SA. -* -* The VLArbitration Record Receiver object is thread safe. -* -* This object should be treated as opaque and should be -* manipulated only through the provided functions. -* -* AUTHOR -* Eitan Zahavi, Mellanox -* -*********/ -/****s* OpenSM: VLArbitration Record Receiver/osm_vlarb_rec_rcv_t -* NAME -* osm_vlarb_rec_rcv_t -* -* DESCRIPTION -* VLArbitration Record Receiver structure. -* -* This object should be treated as opaque and should -* be manipulated only through the provided functions. -* -* SYNOPSIS -*/ -typedef struct _osm_vlarb_rec_rcv { - osm_subn_t *p_subn; - osm_sa_resp_t *p_resp; - osm_mad_pool_t *p_mad_pool; - osm_log_t *p_log; - cl_plock_t *p_lock; -} osm_vlarb_rec_rcv_t; -/* -* FIELDS -* p_subn -* Pointer to the Subnet object for this subnet. -* -* p_resp -* Pointer to the SA responder. -* -* p_mad_pool -* Pointer to the mad pool. -* -* p_log -* Pointer to the log object. -* -* p_lock -* Pointer to the serializing lock. -* -* SEE ALSO -* -*********/ - -/****f* OpenSM: VLArbitration Record Receiver/osm_vlarb_rec_rcv_construct -* NAME -* osm_vlarb_rec_rcv_construct -* -* DESCRIPTION -* This function constructs a VLArbitration Record Receiver object. 
-* -* SYNOPSIS -*/ -void osm_vlarb_rec_rcv_construct(IN osm_vlarb_rec_rcv_t * const p_rcv); -/* -* PARAMETERS -* p_rcv -* [in] Pointer to a VLArbitration Record Receiver object to construct. -* -* RETURN VALUE -* This function does not return a value. -* -* NOTES -* Allows calling osm_vlarb_rec_rcv_init, osm_vlarb_rec_rcv_destroy -* -* Calling osm_vlarb_rec_rcv_construct is a prerequisite to calling any other -* method except osm_vlarb_rec_rcv_init. -* -* SEE ALSO -* VLArbitration Record Receiver object, osm_vlarb_rec_rcv_init, -* osm_vlarb_rec_rcv_destroy -*********/ - -/****f* OpenSM: VLArbitration Record Receiver/osm_vlarb_rec_rcv_destroy -* NAME -* osm_vlarb_rec_rcv_destroy -* -* DESCRIPTION -* The osm_vlarb_rec_rcv_destroy function destroys the object, releasing -* all resources. -* -* SYNOPSIS -*/ -void osm_vlarb_rec_rcv_destroy(IN osm_vlarb_rec_rcv_t * const p_rcv); -/* -* PARAMETERS -* p_rcv -* [in] Pointer to the object to destroy. -* -* RETURN VALUE -* This function does not return a value. -* -* NOTES -* Performs any necessary cleanup of the specified -* VLArbitration Record Receiver object. -* Further operations should not be attempted on the destroyed object. -* This function should only be called after a call to -* osm_vlarb_rec_rcv_construct or osm_vlarb_rec_rcv_init. -* -* SEE ALSO -* VLArbitration Record Receiver object, osm_vlarb_rec_rcv_construct, -* osm_vlarb_rec_rcv_init -*********/ - -/****f* OpenSM: VLArbitration Record Receiver/osm_vlarb_rec_rcv_init -* NAME -* osm_vlarb_rec_rcv_init -* -* DESCRIPTION -* The osm_vlarb_rec_rcv_init function initializes a -* VLArbitration Record Receiver object for use. 
-* -* SYNOPSIS -*/ -ib_api_status_t -osm_vlarb_rec_rcv_init(IN osm_vlarb_rec_rcv_t * const p_rcv, - IN osm_sa_resp_t * const p_resp, - IN osm_mad_pool_t * const p_mad_pool, - IN osm_subn_t * const p_subn, - IN osm_log_t * const p_log, - IN cl_plock_t * const p_lock); -/* -* PARAMETERS -* p_rcv -* [in] Pointer to an osm_vlarb_rec_rcv_t object to initialize. -* -* p_req -* [in] Pointer to an osm_req_t object. -* -* p_subn -* [in] Pointer to the Subnet object for this subnet. -* -* p_log -* [in] Pointer to the log object. -* -* p_lock -* [in] Pointer to the OpenSM serializing lock. -* -* RETURN VALUES -* CL_SUCCESS if the VLArbitration Record Receiver object was initialized -* successfully. -* -* NOTES -* Allows calling other VLArbitration Record Receiver methods. -* -* SEE ALSO -* VLArbitration Record Receiver object, osm_vlarb_rec_rcv_construct, -* osm_vlarb_rec_rcv_destroy -*********/ - -/****f* OpenSM: VLArbitration Record Receiver/osm_vlarb_rec_rcv_process -* NAME -* osm_vlarb_rec_rcv_process -* -* DESCRIPTION -* Process the VL Arbitration Table Query . -* -* SYNOPSIS -*/ -void osm_vlarb_rec_rcv_process(IN void *context, IN void *data); -/* -* PARAMETERS -* context -* [in] Pointer to an osm_vlarb_rec_rcv_t object. -* -* data -* [in] Pointer to the MAD Wrapper containing the MAD -* that contains the VL Arbitration Record Query attribute. -* -* RETURN VALUES -* CL_SUCCESS if the Query processing was successful. -* -* NOTES -* This function processes a SA VL Arbitration Record attribute. -* -* SEE ALSO -* VLArbitration Record Receiver, VLArbitration Record Response Controller -*********/ - -END_C_DECLS -#endif /* _OSM_VLARB_REC_RCV_H_ */ -- 1.5.3.4.206.g58ba4
From vlad at lists.openfabrics.org Thu Jan 3 03:06:28 2008 From: vlad at lists.openfabrics.org (Vladimir Sokolovsky Mellanox) Date: Thu, 3 Jan 2008 03:06:28 -0800 (PST) Subject: [ofa-general] ofa_1_3_kernel 20080103-0200 daily build status Message-ID: <20080103110628.0B459E60090@openfabrics.org> This email was generated automatically, please do not reply git_url: git://git.openfabrics.org/ofed_1_3/linux-2.6.git git_branch: ofed_kernel Common build parameters: --with-ipoib-mod --with-sdp-mod --with-srp-mod --with-user_mad-mod --with-user_access-mod --with-mthca-mod --with-mlx4-mod --with-core-mod --with-addr_trans-mod --with-rds-mod --with-cxgb3-mod --with-nes-mod Passed: Passed on i686 with 2.6.15-23-server Passed on i686 with linux-2.6.22 Passed on i686 with linux-2.6.21.1 Passed on i686 with linux-2.6.18 Passed on i686 with linux-2.6.17 Passed on i686 with linux-2.6.16 Passed on i686 with linux-2.6.19 Passed on i686 with linux-2.6.13 Passed on i686 with linux-2.6.14 Passed on i686 with linux-2.6.12 Passed on i686 with linux-2.6.15 Passed on x86_64 with linux-2.6.21.1 Passed on x86_64 with linux-2.6.20 Passed on ppc64 with linux-2.6.12 Passed on powerpc with linux-2.6.12 Passed on x86_64 with linux-2.6.19 Passed on ia64 with linux-2.6.19 Passed on ia64 with linux-2.6.18 Passed
on ppc64 with linux-2.6.19 Passed on x86_64 with linux-2.6.16 Passed on x86_64 with linux-2.6.16.43-0.3-smp Passed on ppc64 with linux-2.6.14 Passed on ppc64 with linux-2.6.16 Passed on x86_64 with linux-2.6.18 Passed on ia64 with linux-2.6.17 Passed on ppc64 with linux-2.6.17 Passed on ppc64 with linux-2.6.15 Passed on ia64 with linux-2.6.12 Passed on x86_64 with linux-2.6.17 Passed on powerpc with linux-2.6.13 Passed on x86_64 with linux-2.6.12 Passed on ia64 with linux-2.6.13 Passed on ia64 with linux-2.6.14 Passed on ppc64 with linux-2.6.18 Passed on powerpc with linux-2.6.14 Passed on ia64 with linux-2.6.15 Passed on ia64 with linux-2.6.23 Passed on ia64 with linux-2.6.16 Passed on powerpc with linux-2.6.15 Passed on x86_64 with linux-2.6.13 Passed on x86_64 with linux-2.6.9-42.ELsmp Passed on ppc64 with linux-2.6.13 Passed on x86_64 with linux-2.6.14 Passed on x86_64 with linux-2.6.16.21-0.8-smp Passed on x86_64 with linux-2.6.22 Passed on ia64 with linux-2.6.22 Passed on x86_64 with linux-2.6.9-55.ELsmp Passed on x86_64 with linux-2.6.15 Passed on x86_64 with linux-2.6.22.5-31-default Passed on ia64 with linux-2.6.16.21-0.8-default Passed on ppc64 with linux-2.6.18-8.el5 Passed on x86_64 with linux-2.6.18-1.2798.fc6 Passed on ia64 with linux-2.6.21.1 Passed on x86_64 with linux-2.6.18-8.el5 Passed on x86_64 with linux-2.6.18-53.el5 Failed: From hrosenstock at xsigo.com Thu Jan 3 03:53:57 2008 From: hrosenstock at xsigo.com (Hal Rosenstock) Date: Thu, 03 Jan 2008 03:53:57 -0800 Subject: [ofa-general] Re: [PATCH] ib/ipoib: Reduce comparison size in data path In-Reply-To: <1199346598.21275.393.camel@mtls03> References: <1199097884.21275.242.camel@mtls03> <1199346598.21275.393.camel@mtls03> Message-ID: <1199361237.23289.630.camel@hrosenstock-ws.xsigo.com> On Thu, 2008-01-03 at 09:49 +0200, Eli Cohen wrote: > On Wed, 2008-01-02 at 10:16 -0800, Roland Dreier wrote: > > > In the majority of cases, if the neighbour will change, it will > > > be reflected in the guid 
part of the GID (bytes 8-15). If the GID > > > prefix will change as well (bytes 0-7) it will be because the master > > > SM has changed, in which case we will get an SM change event resulting > > > in all paths flushed. > > > > Is it guaranteed that an active SM can't change a GID prefix? > I know opensm has a fixed, hard coded subnet prefix. OpenSM's subnet prefix is determined via a config file (or defaults if not overridden there). There has been discussion about supporting reconfiguration without OpenSM restart. (One can also envision a restart scenario here where the subnet prefix is changed). Also, OpenSM is not the only SM out there and IMO ideally we would/should rely on only what the IBA architecture requires and not on current implementation. -- Hal > > Especially if we're using a GID at an index != 0? > I think ipoib uses only the GID from index 0, doesn't it? > > In other words, is > > this change definitely 100 percent safe? > It looks safe to me but I wanted to hear other opinions. > > > > Also I assume this change is coming from performance tuning. For > > patches like this it is always helpful to include hard data like "this > > gives a speedup of X on test Y on system Z." > > > > Thanks... > Indeed I am working on performance and on the branch I am working on, > which is different from the main branch, it does make a slight difference. > I am trying to improve the throughput of small (up to 128 bytes) UDP > messages where I am CPU bound so everything counts. But I believe that > if it is correct then we should use it even if the improvement is not > outstanding.
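A minimal sketch of the trade-off under discussion, assuming a GID is 16 raw bytes with the subnet prefix in bytes 0-7 and the port GUID in bytes 8-15, as described above. The type and function names are hypothetical and are not taken from the ipoib patch.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a 16-byte GID:
 * bytes 0-7 hold the subnet prefix, bytes 8-15 hold the port GUID. */
struct gid_sketch {
	uint8_t raw[16];
};

/* Full comparison: all 16 bytes must match. */
static int gid_equal_full(const struct gid_sketch *a,
			  const struct gid_sketch *b)
{
	return memcmp(a->raw, b->raw, sizeof(a->raw)) == 0;
}

/* Reduced comparison: only the GUID half (bytes 8-15) is checked,
 * so a change confined to the prefix goes unnoticed. */
static int gid_equal_guid_only(const struct gid_sketch *a,
			       const struct gid_sketch *b)
{
	return memcmp(a->raw + 8, b->raw + 8, 8) == 0;
}
```

Two GIDs that differ only in the prefix compare equal under the reduced check, which is why it matters whether a prefix can change without the paths being flushed.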
> > > _______________________________________________ > general mailing list > general at lists.openfabrics.org > http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general > > To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general From hrosenstock at xsigo.com Thu Jan 3 04:03:59 2008 From: hrosenstock at xsigo.com (Hal Rosenstock) Date: Thu, 03 Jan 2008 04:03:59 -0800 Subject: [ofa-general] [PATCH][TRIVIAL] opensm/osm_subnet.(h c): Cosmetic changes to some options descriptions Message-ID: <1199361839.23289.636.camel@hrosenstock-ws.xsigo.com> opensm/osm_subnet.(h c): Cosmetic changes to some options descriptions for better clarity Signed-off-by: Hal Rosenstock diff --git a/opensm/include/opensm/osm_subnet.h b/opensm/include/opensm/osm_subnet.h index cf52b49..d693875 100644 --- a/opensm/include/opensm/osm_subnet.h +++ b/opensm/include/opensm/osm_subnet.h @@ -306,6 +306,13 @@ typedef struct _osm_subn_opt { * The number of seconds between subnet sweeps. A value of 0 * disables sweeping. * +* max_wire_smps +* The maximum number of SMPs sent in parallel. Default is 4. +* +* transaction_timeout +* The maximum time in milliseconds allowed for a transaction +* to complete. Default is 200. +* * sm_priority * The priority of this SM as specified by the user. This * value is made available in the SMInfo attribute. 
diff --git a/opensm/opensm/osm_subnet.c b/opensm/opensm/osm_subnet.c index f9eb714..0103940 100644 --- a/opensm/opensm/osm_subnet.c +++ b/opensm/opensm/osm_subnet.c @@ -1582,9 +1582,9 @@ ib_api_status_t osm_subn_write_conf_file(IN osm_subn_opt_t * const p_opts) fprintf(opts_file, "#\n# TIMING AND THREADING OPTIONS\n#\n" - "# Number of MADs sent in parallel\n" + "# Maximum number of SMPs sent in parallel\n" "max_wire_smps %u\n\n" - "# The time taken to a transaction to finish in [msec]\n" + "# The maximum time in [msec] allowed for a transaction to complete\n" "transaction_timeout %u\n\n" "# Maximal time in [msec] a message can stay in the incoming message queue.\n" "# If there is more than one message in the queue and the last message\n" From eli at mellanox.co.il Thu Jan 3 04:14:35 2008 From: eli at mellanox.co.il (Eli Cohen) Date: Thu, 03 Jan 2008 14:14:35 +0200 Subject: [ofa-general] Re: [PATCH] ib/ipoib: Reduce comparison size in data path In-Reply-To: <1199361237.23289.630.camel@hrosenstock-ws.xsigo.com> References: <1199097884.21275.242.camel@mtls03> <1199346598.21275.393.camel@mtls03> <1199361237.23289.630.camel@hrosenstock-ws.xsigo.com> Message-ID: <1199362475.21275.413.camel@mtls03> On Thu, 2008-01-03 at 03:53 -0800, Hal Rosenstock wrote: > OpenSM's subnet prefix is determined via a config file (or defaults if > not overridden there). There has been discussion about supporting > reconfiguration without OpenSM restart. (One can also envision a restart > scenario here where the subnet prefix is changed). Won't such a possible event require the SM to send an SM change event? If it does then ipoib can handle this. > Also, OpenSM is not > the only SM out there and IMO ideally we would/should rely on only what > the IBA architecture requires and not on current implementation. Agree but again, wouldn't such a change in configuration require an SM change event? 
From hrosenstock at xsigo.com Thu Jan 3 04:20:57 2008 From: hrosenstock at xsigo.com (Hal Rosenstock) Date: Thu, 03 Jan 2008 04:20:57 -0800 Subject: [ofa-general] [PATCH] infiniband-diags/man/ibnetdiscover.8: Add newly added ports report option Message-ID: <1199362857.23289.642.camel@hrosenstock-ws.xsigo.com> infiniband-diags/man/ibnetdiscover.8: Add newly added ports report option Signed-off-by: Hal Rosenstock diff --git a/infiniband-diags/man/ibnetdiscover.8 b/infiniband-diags/man/ibnetdiscover.8 index 9099cf3..383efe1 100644 --- a/infiniband-diags/man/ibnetdiscover.8 +++ b/infiniband-diags/man/ibnetdiscover.8 @@ -1,11 +1,11 @@ -.TH IBNETDISCOVER 8 "June 5, 2007" "OpenIB" "OpenIB Diagnostics" +.TH IBNETDISCOVER 8 "January 3, 2008" "OpenIB" "OpenIB Diagnostics" .SH NAME ibnetdiscover \- discover InfiniBand topology .SH SYNOPSIS .B ibnetdiscover -[\-d(ebug)] [\-e(rr_show)] [\-v(erbose)] [\-s(how)] [\-l(ist)] [\-g(rouping)] [\-H(ca_list)] [\-S(witch_list)] [\-R(outer_list)] [\-C ca_name] [\-P ca_port] [\-t(imeout) timeout_ms] [\-V(ersion)] [\--node-name-map ] [\-h(elp)] [] +[\-d(ebug)] [\-e(rr_show)] [\-v(erbose)] [\-s(how)] [\-l(ist)] [\-g(rouping)] [\-H(ca_list)] [\-S(witch_list)] [\-R(outer_list)] [\-C ca_name] [\-P ca_port] [\-t(imeout) timeout_ms] [\-V(ersion)] [\--node-name-map ] [\-p(orts)] [\-h(elp)] [] .SH DESCRIPTION .PP @@ -42,6 +42,11 @@ Show more information \fB\-\-node\-name\-map\fR Specify a node name map. The node name map file maps GUIDs to more user friendly names. See file format below. +.TP +\fB\-p\fR, \fB\-\-ports\fR +Obtain a ports report which is a +list of connected ports with relevant information (like LID, portnum, +GUID, width, speed, and NodeDescription). 
.SH COMMON OPTIONS From sashak at voltaire.com Thu Jan 3 04:44:41 2008 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Thu, 3 Jan 2008 12:44:41 +0000 Subject: [ofa-general] Re: [PATCH][TRIVIAL] opensm/osm_subnet.(h c): Cosmetic changes to some options descriptions In-Reply-To: <1199361839.23289.636.camel@hrosenstock-ws.xsigo.com> References: <1199361839.23289.636.camel@hrosenstock-ws.xsigo.com> Message-ID: <20080103124441.GH19494@sashak.voltaire.com> On 04:03 Thu 03 Jan , Hal Rosenstock wrote: > opensm/osm_subnet.(h c): Cosmetic changes to some options descriptions > for better clarity > > Signed-off-by: Hal Rosenstock Applied. Thanks. Sasha From sashak at voltaire.com Thu Jan 3 04:45:41 2008 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Thu, 3 Jan 2008 12:45:41 +0000 Subject: [ofa-general] Re: [PATCH] infiniband-diags/man/ibnetdiscover.8: Add newly added ports report option In-Reply-To: <1199362857.23289.642.camel@hrosenstock-ws.xsigo.com> References: <1199362857.23289.642.camel@hrosenstock-ws.xsigo.com> Message-ID: <20080103124541.GI19494@sashak.voltaire.com> On 04:20 Thu 03 Jan , Hal Rosenstock wrote: > infiniband-diags/man/ibnetdiscover.8: Add newly added ports report > option > > Signed-off-by: Hal Rosenstock Applied. Thanks. Sasha From hrosenstock at xsigo.com Thu Jan 3 04:54:35 2008 From: hrosenstock at xsigo.com (Hal Rosenstock) Date: Thu, 03 Jan 2008 04:54:35 -0800 Subject: [ofa-general] Re: [PATCH] ib/ipoib: Reduce comparison size in data path In-Reply-To: <1199362475.21275.413.camel@mtls03> References: <1199097884.21275.242.camel@mtls03> <1199346598.21275.393.camel@mtls03> <1199361237.23289.630.camel@hrosenstock-ws.xsigo.com> <1199362475.21275.413.camel@mtls03> Message-ID: <1199364875.23289.671.camel@hrosenstock-ws.xsigo.com> On Thu, 2008-01-03 at 14:14 +0200, Eli Cohen wrote: > On Thu, 2008-01-03 at 03:53 -0800, Hal Rosenstock wrote: > > > OpenSM's subnet prefix is determined via a config file (or defaults if > > not overridden there). 
There has been discussion about supporting > > reconfiguration without OpenSM restart. (One can also envision a restart > > scenario here where the subnet prefix is changed). > Won't such a possible event require the SM to send an SM change event? I don't think there is such an SM event. Isn't SM change some locally (end node/port) fabricated event ? Does it rely on SM LID in PortInfo being changed or something else ? If that is the case, then SM LID wouldn't change. The hammer that OpenSM currently uses in these sorts of scenarios is client reregister but this is an optional feature and not all other SM's use this AFAIK. Maybe for this case, they would need to. > If it does then ipoib can handle this. > > > Also, OpenSM is not > > the only SM out there and IMO ideally we would/should rely on only what > > the IBA architecture requires and not on current implementation. > Agree but again, wouldn't such a change in configuration require an SM > change event?
From eli at mellanox.co.il Thu Jan 3 06:50:22 2008 From: eli at mellanox.co.il (Eli Cohen) Date: Thu, 03 Jan 2008 16:50:22 +0200 Subject: [ofa-general] [PATCH] IB/core - ib_wr_opcode change to add IB_WR_LSO breaks ib_ipath In-Reply-To: <477CA0B0.5070106@voltaire.com> References: <1199312334.4280.33.camel@brick.pathscale.com> <477CA0B0.5070106@voltaire.com> Message-ID: <1199371822.21749.2.camel@mtls03> OK, I'll push this to ofed. On Thu, 2008-01-03 at 10:45 +0200, Or Gerlitz wrote: > Ralph Campbell wrote: > > The ib_ipath driver depends on /usr/include/infiniband/verbs.h > > enum ibv_wr_opcode matching the kernel's ib_verbs.h > > enum ib_wr_opcode. The recent change to add IB_WR_LSO breaks this. > > > > Now, you may argue that the kernel should not depend on this equivalence > > but you would then need to define IBV_WR_RDMA_WRITE, etc. in some > > kernel header file and do a table look up to map from user to > > kernel opcode values. Since I don't see any other code which depends > > on the value of IB_WR_LSO, I think the following patch is the right > > fix. > > > > This should be applied to 2.6.24 and 2.6.25. > > SO in a way, putting IB_WR_LSO where it was broke the ABI for libibpath. > > Eli, > > Can you please apply this change to the patch set which you consider as > candidate for upstream merging? > > Or.
From kliteyn at dev.mellanox.co.il Thu Jan 3 07:02:07 2008 From: kliteyn at dev.mellanox.co.il (Yevgeny Kliteynik) Date: Thu, 03 Jan 2008 17:02:07 +0200 Subject: [ofa-general] Re: CMA can't establish connection with QoS on In-Reply-To: <4761315A.1070306@dev.mellanox.co.il> References: <47600070.8050008@dev.mellanox.co.il> <000001c83ced$2149f5b0$ff0da8c0@amr.corp.intel.com> <47605620.3070105@dev.mellanox.co.il> <47608BE4.7020209@ichips.intel.com> <4761315A.1070306@dev.mellanox.co.il> Message-ID: <477CF8EF.5010307@dev.mellanox.co.il> Hi Sean, Did you get a chance to look at this issue? https://bugs.openfabrics.org/show_bug.cgi?id=821 -- Yevgeny Yevgeny Kliteynik wrote: > Sean Hefty wrote: >>> Not sure if it helps, but "ibv_rc_pingpong -l " works. >> >> I'll try this tomorrow. >> >> With QoS enabled (using the QoS file provided in bug 821), I get a >> ROUTE_ERROR on the client side, but I don't see a system hang. I'm >> running 2.6.24-rc3 with patches in my for-roland branch. >> >> I do see that opensm hangs when I try to kill it, but using 'kill' >> forces it to exit. > > Sean, > > Sorry, but I forgot to add an important detail in the instructions to > reproduce this problem. Your FW has to be QoS-enabled. > > Use the latest ConnectX FW release 2_3_000.
If you don't have it, you > can get it from the Mellanox site: > > http://www.mellanox.com/support/firmware_download.php > > When you burn it, you need to enable the QoS by adding the following > line in the [hca] section of the .ini file: > > sx_vlarb_en = true > > -- Yevgeny > >> - Sean >> From changquing.tang at hp.com Thu Jan 3 07:49:17 2008 From: changquing.tang at hp.com (Tang, Changqing) Date: Thu, 3 Jan 2008 15:49:17 +0000 Subject: [ofa-general] [RFC] XRC -- make receiving XRC QP independent of any one user process In-Reply-To: <6C2C79E72C305246B504CBA17B5500C903039083@mtlexch01.mtl.com> References: <6C2C79E72C305246B504CBA17B5500C903039083@mtlexch01.mtl.com> Message-ID: Thanks for the comment. Another issue occurred to me after thinking about the interface more. Rank A is the sender; ranks B and C are two ranks on a remote node. At first, B creates the receiving QP, makes the connection to A, and registers the QP number for receiving. A gets the receiving QP number from B. After some communication between A and B, B decides to close the connection and unregisters the QP number. Then A and C want to talk, so A tells C the receiving QP number, and C tries to register the QP number. I wonder whether, by the time C tries to register the QP number, the receiving QP has already been destroyed by the kernel, since when B unregisters the QP number the reference count becomes zero and the kernel will clean it up. Am I right ?
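The scenario above can be modelled with a small reference-counting sketch. This is a toy model, not the proposed kernel code: `rcv_qp_model`, `model_alloc`, `model_register` and `model_unregister` are hypothetical names. The only behaviour taken from the quoted proposal is that the creator is registered at allocation time, that a process is recorded at most once, that the QP is destroyed when the last registered process unregisters, and that registering against a QP that no longer exists fails with -EINVAL. The error returned when unregistering a process that was never registered, and the fixed-size PID table, are assumptions.

```c
#include <errno.h>
#include <stddef.h>

#define MODEL_MAX_PIDS 8

/* Toy model of one receive-side XRC QP and its registered processes. */
struct rcv_qp_model {
	int alive;			/* 0 once the QP has been destroyed */
	int pids[MODEL_MAX_PIDS];	/* registered process ids */
	size_t npids;
};

/* Creation implicitly registers the creating process. */
static void model_alloc(struct rcv_qp_model *qp, int creator_pid)
{
	qp->alive = 1;
	qp->pids[0] = creator_pid;
	qp->npids = 1;
}

/* Registering twice succeeds but keeps a single entry per process. */
static int model_register(struct rcv_qp_model *qp, int pid)
{
	size_t i;

	if (!qp->alive)
		return -EINVAL;		/* no such QP any more */
	for (i = 0; i < qp->npids; i++)
		if (qp->pids[i] == pid)
			return 0;
	if (qp->npids == MODEL_MAX_PIDS)
		return -ENOMEM;		/* assumption: table is full */
	qp->pids[qp->npids++] = pid;
	return 0;
}

/* The QP is destroyed when the last registered process leaves. */
static int model_unregister(struct rcv_qp_model *qp, int pid)
{
	size_t i;

	if (!qp->alive)
		return -EINVAL;
	for (i = 0; i < qp->npids; i++) {
		if (qp->pids[i] == pid) {
			qp->pids[i] = qp->pids[--qp->npids];
			if (qp->npids == 0)
				qp->alive = 0;
			return 0;
		}
	}
	return -EINVAL;			/* assumption: pid was not registered */
}
```

Under these assumptions, B's unregister leaves no registered process, so a later register by C fails. For the QP to survive, C would have to register before B unregisters.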
--CQ > -----Original Message----- > From: Ishai Rabinovitz [mailto:ishai at mellanox.co.il] > Sent: Thursday, January 03, 2008 2:59 AM > To: panda at cse.ohio-state.edu; Tang, Changqing; Jack > Morgenstein; Pavel Shamis > Cc: Gleb Natapov; Roland Dreier; general at lists.openfabrics.org > Subject: RE: [ofa-general] [RFC] XRC -- make receiving XRC QP > independent of any one user process > > Please see my comments (prefix [Ishai]) > > -----Original Message----- > From: Tang, Changqing [mailto:changquing.tang at hp.com] > Sent: Wed 02 January 2008 17:27 > To: Jack Morgenstein; Pavel Shamis > Cc: Ishai Rabinovitz; Gleb Natapov; Roland Dreier; > general at lists.openfabrics.org > Subject: RE: [ofa-general] [RFC] XRC -- make receiving XRC QP > independent of any one user process > > > This interface is OK for me. > > Now, every rank on a node who wants to receive message from > the same remote rank must know the same receiving QP number, > and register for receiving using this QP number. > > If rank B does not register (receiving QP has been created by > another rank A on the node), and sender know B's SRQ number, > if sender sends a message to B, can B still receive this > message ? (I hope, no register, no receive) > > [Ishai] I guess that from the MPI layer perspective, the > sender can not know B's SRQ number until it asks B to give it > to him. So B can register to this QP before sending the SRQ number. > > I hope to know the opinion from other MPI team, or other XRC user. > > [Ishai] We already discussed this issue with Open MPI IB > group, and it looks fine to them. I'm sending this mail to > Prof. Panda, so he can comment on it as well.
> > --CQ > > > > > -----Original Message----- > > From: Jack Morgenstein [mailto:jackm at dev.mellanox.co.il] > > Sent: Monday, December 31, 2007 5:40 AM > > To: pasha at mellanox.co.il > > Cc: ishai at mellanox.co.il; Gleb Natapov; Roland Dreier; Tang, > > Changqing; general at lists.openfabrics.org > > Subject: Re: [ofa-general] [RFC] XRC -- make receiving XRC QP > > independent of any one user process > > > > > Tang, Changqing wrote: > > > > If I have a MPI server processes on a node, many > > other MPI > > > > client processes will dynamically connect/disconnect with the > > > > server. The server use same XRC domain. > > > > > > > > Will this cause accumulating the "kernel" QP for such > > > > application ? we want the server to run 365 days a year. > > > > > > > > I have some question about the scenario above. Did you > > call for the > > > > mpi disconnect on the both ends (server/client) before > the client > > > > exit (did we must to do it?) > > > > > > Yes, both ends will call disconnect. But for us, > > MPI_Comm_disconnect() > > > call is not a collective call, it is just a local operation. > > > > > > --CQ > > > > > Possible solution (internal review as yet): > > > > Each user process registers with the XRC QP: > > a. each process registers ONCE. If it registers multiple times, > > there is no reference increment -- > > rather the registration succeeds, but only one PID entry is > > kept per QP. > > b. Can have cleanup in the event of a process dying suddenly. > > c. QP cannot be destroyed while there are any user > processes still > > registered with it. > > > > libibverbs API is as follows: > > > > ============================================================== > > ======================== > > /** > > * ibv_xrc_rcv_qp_alloc - creates an XRC QP for serving as a > > receive-side only QP, > > * and moves the created qp through the RESET->INIT and > > INIT->RTR transitions. > > * (The RTR->RTS transition is not needed, since this QP > > does no sending). 
> > * The sending XRC QP uses this QP as destination, while > > specifying an XRC SRQ > > * for actually receiving the transmissions and > > generating all completions on the > > * receiving side. > > * > > * This QP is created in kernel space, and persists > > until the last process registered > > * for the QP calls ibv_xrc_rcv_qp_unregister() (at > > which time the QP is destroyed). > > * > > * @pd: protection domain to use. At lower layer, this provides > > access to userspace obj > > * @xrc_domain: xrc domain to use for the QP. > > * @attr: modify-qp attributes needed to bring the QP to RTR. > > * @attr_mask: bitmap indicating which attributes are > provided in the > > attr struct. > > * used for validity checking. > > * @xrc_rcv_qpn: qp_num of created QP (if success). To be passed to > > the remote node (sender). > > * The remote node will use xrc_rcv_qpn in > > ibv_post_send when sending to > > * XRC SRQ's on this host in the same xrc domain. > > * > > * RETURNS: success (0), or a (negative) error value. > > * > > * NOTE: this verb also registers the calling user-process > with the QP > > at its creation time > > * (implicit call to ibv_xrc_rcv_qp_register), to avoid > > race conditions. > > * The creating process will need to call > > ibv_xrc_qp_unregister() for the QP to release it from > > * this process. > > */ > > > > int ibv_xrc_rcv_qp_alloc(struct ibv_pd *pd, > > struct ibv_xrc_domain *xrc_domain, > > struct ibv_qp_attr *attr, > > enum ibv_qp_attr_mask attr_mask, > > uint32_t *xrc_rcv_qpn); > > > > > ===================================================================== > > > > /** > > * ibv_xrc_rcv_qp_register: registers a user process with an XRC QP > > which serves as > > * a receive-side only QP. > > * > > * @xrc_domain: xrc domain the QP belongs to (for verification). > > * @xrc_qp_num: The (24 bit) number of the XRC QP. > > * > > * RETURNS: success (0), > > * or error (-EINVAL), if: > > * 1. There is no such QP_num allocated. > > * 2. 
The QP is allocated, but is not an receive XRC QP > > * 3. The XRC QP does not belong to the given domain. > > */ > > int ibv_xrc_rcv_qp_register(struct ibv_xrc_domain *xrc_domain, > > uint32_t xrc_qp_num); > > > > > ===================================================================== > > /** > > * ibv_xrc_rcv_qp_unregister: detaches a user process from > an XRC QP > > serving as > > * a receive-side only QP. If as a result, there are > > no remaining userspace processes > > * registered for this XRC QP, it is destroyed. > > * > > * @xrc_domain: xrc domain the QP belongs to (for verification). > > * @xrc_qp_num: The (24 bit) number of the XRC QP. > > * > > * RETURNS: success (0), > > * or error (-EINVAL), if: > > * 1. There is no such QP_num allocated. > > * 2. The QP is allocated, but is not an XRC QP > > * 3. The XRC QP does not belong to the given domain. > > * NOTE: I don't see any reason to return a special code if > the QP is > > destroyed -- the unregister simply > > * succeeds. > > */ > > int ibv_xrc_rcv_qp_unregister(struct ibv_xrc_domain *xrc_domain, > > uint32_t xrc_qp_num); > > ============================================================== > > =============================== > > > > Usage: > > > > 1. Sender creates an XRC QP (sending QP) 2. Sender sends some > > receiving process on a remote node (say R1) a request to provide an > > XRC QP and XRC SRQ for > > receiving messages (the request includes the sending QP number). > > 3. R1 calls ibv_xrc_rcv_qp_alloc() to create a receiving XRC QP in > > kernel space, and move > > that QP up to RTR state. This function also registers process R1 > > with the XRC QP. > > 4. R1 calls ibv_create_xrc_srq() to create an SRQ for > receive messages > > via the just created XRC QP. > > 5. R1 responds to request, providing the XRC qp number, and XRC SRQ > > number to be used in communication. > > 6. Sender then may wish to communicate with another > receiving process > > on the remote host (say R2). 
> > it sends a request to R2 containing the remote XRC QP number > > (obtained from R1) > > which it will use to send messages. > > 7. R2 creates an XRC SRQ (if one does not already exist for the > > domain), and also > > calls ibv_xrc_rcv_qp_register() to register the process > R2 with the > > XRC QP created by R1. > > 8. If R1 no longer needs to communicate with the sender, it calls > > ibv_xrc_rcv_qp_unregister() for the QP. > > The QP will not yet be destroyed, since R2 is still > registered with > > it. > > 9. If R2 no longer needs to communicate with the sender, it calls > > ibv_xrc_rcv_qp_unregister() for the QP. > > At this point, the QP is destroyed, since no processes remain > > registered with it. > > > > NOTES: > > 1. The problem of the QP being destroyed and quickly > re-allocated does > > not exist -- the upper bits of the > > QP number are incremented at each allocation (except for the MSB > > which is always 1 for XRC QPs). Thus, > > even if the same QP is re-allocated, its QP number > (stored in the > > QP object) will be different than > > expected (unless it is re-destroyed/re-allocated several hundred > > times). > > > > 2. With this model, we do not need a heartbeat: if a > receiving process > > dies, all XRC QPs it has registered for will > > be unregistered as part of process cleanup in kernel space. > > > > - Jack > > > > > From ishai at mellanox.co.il Thu Jan 3 07:55:07 2008 From: ishai at mellanox.co.il (Ishai Rabinovitz) Date: Thu, 3 Jan 2008 17:55:07 +0200 Subject: [ofa-general] [RFC] XRC -- make receiving XRC QP independent of any one user process In-Reply-To: Message-ID: <6C2C79E72C305246B504CBA17B5500C90308DAB0@mtlexch01.mtl.com> CQ, You are right. And there is no race, because register and deregister are locked in the kernel using the same spin lock. So in the MPI implementation, when C finds out that the QP is no longer valid, it should send a reject back to A, and then A asks C to also open a new QP.
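The scenario above (B unregisters, the QP is destroyed, C then tries to register the old QP number) can be modelled with a small user-space sketch in C. This is only a toy model under stated assumptions, not the kernel or libibverbs implementation: every toy_* name is hypothetical, a single QP slot stands in for the QP table, the placement of the generation bits inside the 24-bit QP number is a guess, and the -EINVAL returned for a process that was never registered and the -ENOMEM/-EBUSY cases are choices made for the sketch rather than part of the proposal.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define TOY_MAX_PIDS 8
#define TOY_XRC_MSB  0x800000u      /* MSB of the 24-bit QPN, always set for XRC QPs */

/* One receive-side QP slot. All names in this sketch are hypothetical. */
struct toy_xrc_rcv_qp {
    int      in_use;
    uint32_t qpn;                   /* 24-bit QP number, including generation bits */
    uint32_t generation;            /* bumped on every allocation of this slot */
    int      pids[TOY_MAX_PIDS];    /* registered processes, one entry per PID */
    int      npids;
};

static struct toy_xrc_rcv_qp slot;  /* a single slot is enough for the sketch */

static int toy_find_pid(int pid)
{
    for (int i = 0; i < slot.npids; i++)
        if (slot.pids[i] == pid)
            return i;
    return -1;
}

/* Register a process with an existing QP; registering twice keeps one entry. */
int toy_register(int pid, uint32_t qpn)
{
    if (!slot.in_use || slot.qpn != qpn)
        return -EINVAL;             /* no such QP, or a stale QP number */
    if (toy_find_pid(pid) >= 0)
        return 0;                   /* already registered: succeed, no new entry */
    if (slot.npids == TOY_MAX_PIDS)
        return -ENOMEM;             /* assumption of this sketch */
    slot.pids[slot.npids++] = pid;
    return 0;
}

/* Allocate the QP and implicitly register the creating process. */
int toy_alloc(int pid, uint32_t *qpn)
{
    if (slot.in_use)
        return -EBUSY;              /* assumption: only one slot is modelled */
    slot.in_use = 1;
    slot.generation++;
    /* Assumed layout: slot index 0 in bits 0..7, generation in bits 8..22. */
    slot.qpn = TOY_XRC_MSB | ((slot.generation << 8) & 0x7fff00u);
    slot.npids = 0;
    *qpn = slot.qpn;
    return toy_register(pid, slot.qpn);
}

/* Detach a process; the QP is destroyed when the last process leaves. */
int toy_unregister(int pid, uint32_t qpn)
{
    int i;

    if (!slot.in_use || slot.qpn != qpn)
        return -EINVAL;
    i = toy_find_pid(pid);
    if (i < 0)
        return -EINVAL;             /* assumption: not registered is an error */
    slot.pids[i] = slot.pids[--slot.npids];
    if (slot.npids == 0)
        slot.in_use = 0;            /* last process gone: QP is destroyed */
    return 0;
}

/* Walk through the B/C scenario from the thread. */
void toy_demo(void)
{
    const int rank_b = 100, rank_c = 200;
    uint32_t qpn_b, qpn_c;

    assert(toy_alloc(rank_b, &qpn_b) == 0);
    assert(toy_register(rank_b, qpn_b) == 0 && slot.npids == 1);
    assert(toy_unregister(rank_b, qpn_b) == 0);       /* QP destroyed here */
    assert(toy_register(rank_c, qpn_b) == -EINVAL);   /* C sees a stale number */
    assert(toy_alloc(rank_c, &qpn_c) == 0);           /* C opens a new QP */
    assert(qpn_c != qpn_b);                           /* generation bits differ */
}
```

In this model C's failed register is the point at which the MPI layer would send the reject back to A. A real implementation would also need the locking mentioned above, which the single-threaded sketch leaves out.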
Ishai > -----Original Message----- > From: Tang, Changqing [mailto:changquing.tang at hp.com] > Sent: Thursday, January 03, 2008 5:49 PM > To: Ishai Rabinovitz; panda at cse.ohio-state.edu; Jack > Morgenstein; Pavel Shamis > Cc: Gleb Natapov; Roland Dreier; general at lists.openfabrics.org > Subject: RE: [ofa-general] [RFC] XRC -- make receiving XRC QP > independent of any one user process > > > Thanks for the comment. > > Another issue I have after thinking about the interface more. > > Rank A is the sender, rank B and C are two ranks on a remote > node. At first, B creates the receiving QP and make > connection to A and register the QP number for receiving. And > A gets the receiving QP number from B. After some > communication between A and B, B decides to close the > connection, and unregister the QP number. Then A and C want > to talk, so A tell C the receiving QP number, C tries to > register the QP number. > > I wonder at the time when C tries to register the QP number, > the receiving QP has been destroyed by the kernel, since when > B unregister the QP number, the reference count becomes zero, > and kernel will cleanup it. > > Am I right ?
> > > --CQ > > > > > -----Original Message----- > > From: Ishai Rabinovitz [mailto:ishai at mellanox.co.il] > > Sent: Thursday, January 03, 2008 2:59 AM > > To: panda at cse.ohio-state.edu; Tang, Changqing; Jack > Morgenstein; Pavel > > Shamis > > Cc: Gleb Natapov; Roland Dreier; general at lists.openfabrics.org > > Subject: RE: [ofa-general] [RFC] XRC -- make receiving XRC QP > > independent of any one user process > > > > Please see my comments (prefix [Ishai]) > > > > -----Original Message----- > > From: Tang, Changqing [mailto:changquing.tang at hp.com] > > Sent: Wednesday, January 02, 2008 5:27 PM > > To: Jack Morgenstein; Pavel Shamis > > Cc: Ishai Rabinovitz; Gleb Natapov; Roland Dreier; > > general at lists.openfabrics.org > > Subject: RE: [ofa-general] [RFC] XRC -- make receiving XRC QP > > independent of any one user process > > > > > > This interface is OK for me. > > > > Now, every rank on a node who wants to receive message from > the same > > remote rank must know the same receiving QP number, and > register for > > receiving using this QP number. > > > > If rank B does not register (receiving QP has been created > by another > > rank A on the node), and sender know B's SRQ number, if > sender sends a > > message to B, can B still receive this > > message ? (I hope, no register, no receive) > > > > [Ishai] I guess that from the MPI layer perspective, the sender can > > not know B's SRQ number until it ask B to give it to him. So B can > > register to this QP before sending the SRQ number. > > > > I hope to know the opinion from other MPI team, or other XRC user. > > > > [Ishai] We already discussed these issues with Open MPI IB > group, and > > it looks fine to them. I'm sending this mail to Prof. > Panda, so he can > > comment on it as well.
> > > > --CQ > > > > > > > > > -----Original Message----- > > > From: Jack Morgenstein [mailto:jackm at dev.mellanox.co.il] > > > Sent: Monday, December 31, 2007 5:40 AM > > > To: pasha at mellanox.co.il > > > Cc: ishai at mellanox.co.il; Gleb Natapov; Roland Dreier; Tang, > > > Changqing; general at lists.openfabrics.org > > > Subject: Re: [ofa-general] [RFC] XRC -- make receiving XRC QP > > > independent of any one user process > > > > > > > Tang, Changqing wrote: > > > > > If I have a MPI server processes on a node, many > > > other MPI > > > > > client processes will dynamically connect/disconnect with the > > > > > server. The server use same XRC domain. > > > > > > > > > > Will this cause accumulating the "kernel" QP for such > > > > > application ? we want the server to run 365 days a year. > > > > > > > > > > I have some question about the scenario above. Did you > > > call for the > > > > > mpi disconnect on the both ends (server/client) before > > the client > > > > > exit (did we must to do it?) > > > > > > > > Yes, both ends will call disconnect. But for us, > > > MPI_Comm_disconnect() > > > > call is not a collective call, it is just a local operation. > > > > > > > > --CQ > > > > > > > Possible solution (internal review as yet): > > > > > > Each user process registers with the XRC QP: > > > a. each process registers ONCE. If it registers > multiple times, > > > there is no reference increment -- > > > rather the registration succeeds, but only one PID > entry is > > > kept per QP. > > > b. Can have cleanup in the event of a process dying suddenly. > > > c. QP cannot be destroyed while there are any user > > processes still > > > registered with it. 
> > > > > > libibverbs API is as follows: > > > > > > ============================================================== > > > ======================== > > > /** > > > * ibv_xrc_rcv_qp_alloc - creates an XRC QP for serving as a > > > receive-side only QP, > > > * and moves the created qp through the RESET->INIT and > > > INIT->RTR transitions. > > > * (The RTR->RTS transition is not needed, since this QP > > > does no sending). > > > * The sending XRC QP uses this QP as destination, while > > > specifying an XRC SRQ > > > * for actually receiving the transmissions and > > > generating all completions on the > > > * receiving side. > > > * > > > * This QP is created in kernel space, and persists > > > until the last process registered > > > * for the QP calls ibv_xrc_rcv_qp_unregister() (at > > > which time the QP is destroyed). > > > * > > > * @pd: protection domain to use. At lower layer, this provides > > > access to userspace obj > > > * @xrc_domain: xrc domain to use for the QP. > > > * @attr: modify-qp attributes needed to bring the QP to RTR. > > > * @attr_mask: bitmap indicating which attributes are > > provided in the > > > attr struct. > > > * used for validity checking. > > > * @xrc_rcv_qpn: qp_num of created QP (if success). To be > passed to > > > the remote node (sender). > > > * The remote node will use xrc_rcv_qpn in > > > ibv_post_send when sending to > > > * XRC SRQ's on this host in the same xrc domain. > > > * > > > * RETURNS: success (0), or a (negative) error value. > > > * > > > * NOTE: this verb also registers the calling user-process > > with the QP > > > at its creation time > > > * (implicit call to ibv_xrc_rcv_qp_register), to avoid > > > race conditions. > > > * The creating process will need to call > > > ibv_xrc_qp_unregister() for the QP to release it from > > > * this process. 
> > > */ > > > > > > int ibv_xrc_rcv_qp_alloc(struct ibv_pd *pd, > > > struct ibv_xrc_domain *xrc_domain, > > > struct ibv_qp_attr *attr, > > > enum ibv_qp_attr_mask attr_mask, > > > uint32_t *xrc_rcv_qpn); > > > > > > > > > ===================================================================== > > > > > > /** > > > * ibv_xrc_rcv_qp_register: registers a user process with > an XRC QP > > > which serves as > > > * a receive-side only QP. > > > * > > > * @xrc_domain: xrc domain the QP belongs to (for verification). > > > * @xrc_qp_num: The (24 bit) number of the XRC QP. > > > * > > > * RETURNS: success (0), > > > * or error (-EINVAL), if: > > > * 1. There is no such QP_num allocated. > > > * 2. The QP is allocated, but is not an receive XRC QP > > > * 3. The XRC QP does not belong to the given domain. > > > */ > > > int ibv_xrc_rcv_qp_register(struct ibv_xrc_domain *xrc_domain, > > > uint32_t xrc_qp_num); > > > > > > > > > ===================================================================== > > > /** > > > * ibv_xrc_rcv_qp_unregister: detaches a user process from > > an XRC QP > > > serving as > > > * a receive-side only QP. If as a result, there are > > > no remaining userspace processes > > > * registered for this XRC QP, it is destroyed. > > > * > > > * @xrc_domain: xrc domain the QP belongs to (for verification). > > > * @xrc_qp_num: The (24 bit) number of the XRC QP. > > > * > > > * RETURNS: success (0), > > > * or error (-EINVAL), if: > > > * 1. There is no such QP_num allocated. > > > * 2. The QP is allocated, but is not an XRC QP > > > * 3. The XRC QP does not belong to the given domain. > > > * NOTE: I don't see any reason to return a special code if > > the QP is > > > destroyed -- the unregister simply > > > * succeeds. 
> > > */ > > > int ibv_xrc_rcv_qp_unregister(struct ibv_xrc_domain *xrc_domain, > > > uint32_t xrc_qp_num); > > > ============================================================== > > > =============================== > > > > > > Usage: > > > > > > 1. Sender creates an XRC QP (sending QP) 2. Sender sends some > > > receiving process on a remote node (say R1) a request to > provide an > > > XRC QP and XRC SRQ for > > > receiving messages (the request includes the sending > QP number). > > > 3. R1 calls ibv_xrc_rcv_qp_alloc() to create a receiving > XRC QP in > > > kernel space, and move > > > that QP up to RTR state. This function also registers > process R1 > > > with the XRC QP. > > > 4. R1 calls ibv_create_xrc_srq() to create an SRQ for > > receive messages > > > via the just created XRC QP. > > > 5. R1 responds to request, providing the XRC qp number, > and XRC SRQ > > > number to be used in communication. > > > 6. Sender then may wish to communicate with another > > receiving process > > > on the remote host (say R2). > > > it sends a request to R2 containing the remote XRC QP number > > > (obtained from R1) > > > which it will use to send messages. > > > 7. R2 creates an XRC SRQ (if one does not already exist for the > > > domain), and also > > > calls ibv_xrc_rcv_qp_register() to register the process > > R2 with the > > > XRC QP created by R1. > > > 8. If R1 no longer needs to communicate with the sender, it calls > > > ibv_xrc_rcv_qp_unregister() for the QP. > > > The QP will not yet be destroyed, since R2 is still > > registered with > > > it. > > > 9. If R2 no longer needs to communicate with the sender, it calls > > > ibv_xrc_rcv_qp_unregister() for the QP. > > > At this point, the QP is destroyed, since no processes remain > > > registered with it. > > > > > > NOTES: > > > 1. 
The problem of the QP being destroyed and quickly > > re-allocated does > > > not exist -- the upper bits of the > > > QP number are incremented at each allocation (except > for the MSB > > > which is always 1 for XRC QPs). Thus, > > > even if the same QP is re-allocated, its QP number > > (stored in the > > > QP object) will be different than > > > expected (unless it is re-destroyed/re-allocated > several hundred > > > times). > > > > > > 2. With this model, we do not need a heartbeat: if a > > receiving process > > > dies, all XRC QPs it has registered for will > > > be unregistered as part of process cleanup in kernel space. > > > > > > - Jack > > > > > > > > > From mshefty at ichips.intel.com Thu Jan 3 09:15:57 2008 From: mshefty at ichips.intel.com (Sean Hefty) Date: Thu, 03 Jan 2008 09:15:57 -0800 Subject: [ofa-general] Re: CMA can't establish connection with QoS on In-Reply-To: <477CF8EF.5010307@dev.mellanox.co.il> References: <47600070.8050008@dev.mellanox.co.il> <000001c83ced$2149f5b0$ff0da8c0@amr.corp.intel.com> <47605620.3070105@dev.mellanox.co.il> <47608BE4.7020209@ichips.intel.com> <4761315A.1070306@dev.mellanox.co.il> <477CF8EF.5010307@dev.mellanox.co.il> Message-ID: <477D184D.8020300@ichips.intel.com> > Did you get a chance to look at this issue? > > https://bugs.openfabrics.org/show_bug.cgi?id=821 I had to build a couple of systems up to test this, but I wasn't able to reproduce the error before going on vacation. I kept getting a path record query error reported by the rdma_cm, so something must be off with my configuration. I'll continue to look at it. 
- Sean From ggrundstrom at NetEffect.com Thu Jan 3 09:28:36 2008 From: ggrundstrom at NetEffect.com (Glenn Grundstrom) Date: Thu, 3 Jan 2008 11:28:36 -0600 Subject: [ofa-general] RE: [PATCH 4 of 5] libnes: zero context struct at allocation time (prep for additional context ops) In-Reply-To: <200712171019.36846.jackm@dev.mellanox.co.il> References: <200712171019.36846.jackm@dev.mellanox.co.il> Message-ID: <5E701717F2B2ED4EA60F87C8AA57B7CC07B92D12@venom2> > The ibv_context structure will be getting additional ops, > to be added at the end of the structure (and not as part of > the existing ibv_context_ops structure). > > Reason: ibv_context_ops is declared directly as a member of > ibv_context, > and not as a pointer. Binaries compiled with previous > libibverbs versions > will not be backwards compatible if we add new operations to > ibv_context_ops, > since fields following the ops structure will move. > > To enable adding new operations at the end of the existing > ibv_context struct, > all driver libraries MUST zero their context structure at > allocation time, so > that new ops will be NULL by default. > > Signed-off-by: Jack Morgenstein Applied. Thanks, Glenn. From changquing.tang at hp.com Thu Jan 3 09:50:14 2008 From: changquing.tang at hp.com (Tang, Changqing) Date: Thu, 3 Jan 2008 17:50:14 +0000 Subject: [ofa-general] [RFC] XRC -- make receiving XRC QP independent of any one user process In-Reply-To: <6C2C79E72C305246B504CBA17B5500C90308DAB0@mtlexch01.mtl.com> References: <6C2C79E72C305246B504CBA17B5500C90308DAB0@mtlexch01.mtl.com> Message-ID: OK, thanks for the clarification. When can we test the code via OFED ?
--CQ > -----Original Message----- > From: Ishai Rabinovitz [mailto:ishai at mellanox.co.il] > Sent: Thursday, January 03, 2008 9:55 AM > To: Tang, Changqing; panda at cse.ohio-state.edu; Jack > Morgenstein; Pavel Shamis > Cc: Gleb Natapov; Roland Dreier; general at lists.openfabrics.org > Subject: RE: [ofa-general] [RFC] XRC -- make receiving XRC QP > independent of any one user process > > CQ, You are right. > > And there is no race because the register and deregister are > locked in the kernel using the same spin lock. > > So in the MPI implementation, when C finds out that the QP is > no longer valid, he should send a reject back to A, and then > A ask C to open also a new QP. > > Ishai > > > -----Original Message----- > > From: Tang, Changqing [mailto:changquing.tang at hp.com] > > Sent: Thursday, January 03, 2008 5:49 PM > > To: Ishai Rabinovitz; panda at cse.ohio-state.edu; Jack Morgenstein; > > Pavel Shamis > > Cc: Gleb Natapov; Roland Dreier; general at lists.openfabrics.org > > Subject: RE: [ofa-general] [RFC] XRC -- make receiving XRC QP > > independent of any one user process > > > > > > Thanks for the comment. > > > > Another issue I have after thinking about the interface more. > > > > Rank A is the sender, rank B and C are two ranks on a > remote node. At > > first, B creates the receiving QP and make connection to A and > > register the QP number for receiving. And A gets the receiving QP > > number from B. After some communication between A and B, B > decides to > > close the connection, and unregister the QP number. Then A > and C want > > to talk, so A tell C the receiving QP number, C tries to > register the > > QP number. > > > > I wonder at the time when C tries to register the QP number, the > > receiving QP has been destroyed by the kernel, since when B > unregister > > the QP number, the reference count becomes zero, and kernel will > > cleanup it. > > > > Am I right ?
> > > > > > --CQ > > > > > > > > > -----Original Message----- > > > From: Ishai Rabinovitz [mailto:ishai at mellanox.co.il] > > > Sent: Thursday, January 03, 2008 2:59 AM > > > To: panda at cse.ohio-state.edu; Tang, Changqing; Jack > > Morgenstein; Pavel > > > Shamis > > > Cc: Gleb Natapov; Roland Dreier; general at lists.openfabrics.org > > > Subject: RE: [ofa-general] [RFC] XRC -- make receiving XRC QP > > > independent of any one user process > > > > > > Please see my comments (prefix [Ishai]) > > > > > > -----Original Message----- > > > From: Tang, Changqing [mailto:changquing.tang at hp.com] > > > Sent: Wednesday, January 02, 2008 5:27 PM > > > To: Jack Morgenstein; Pavel Shamis > > > Cc: Ishai Rabinovitz; Gleb Natapov; Roland Dreier; > > > general at lists.openfabrics.org > > > Subject: RE: [ofa-general] [RFC] XRC -- make receiving XRC QP > > > independent of any one user process > > > > > > > > > This interface is OK for me. > > > > > > Now, every rank on a node who wants to receive message from > > the same > > > remote rank must know the same receiving QP number, and > > register for > > > receiving using this QP number. > > > > > > If rank B does not register (receiving QP has been created > > by another > > > rank A on the node), and sender know B's SRQ number, if > > sender sends a > > > message to B, can B still receive this > > > message ? (I hope, no register, no receive) > > > > > > [Ishai] I guess that from the MPI layer perspective, the > sender can > > > not know B's SRQ number until it ask B to give it to him. > So B can > > > register to this QP before sending the SRQ number. > > > > > > I hope to know the opinion from other MPI team, or other XRC user. > > > > > > [Ishai] We already discussed these issues with Open MPI IB > > group, and > > > it looks fine to them. I'm sending this mail to Prof. > > Panda, so he can > > > comment on it as well.
> > > > > > --CQ > > > > > > > > > > > > > -----Original Message----- > > > > From: Jack Morgenstein [mailto:jackm at dev.mellanox.co.il] > > > > Sent: Monday, December 31, 2007 5:40 AM > > > > To: pasha at mellanox.co.il > > > > Cc: ishai at mellanox.co.il; Gleb Natapov; Roland Dreier; Tang, > > > > Changqing; general at lists.openfabrics.org > > > > Subject: Re: [ofa-general] [RFC] XRC -- make receiving XRC QP > > > > independent of any one user process > > > > > > > > > Tang, Changqing wrote: > > > > > > If I have a MPI server processes on a node, many > > > > other MPI > > > > > > client processes will dynamically > connect/disconnect with the > > > > > > server. The server use same XRC domain. > > > > > > > > > > > > Will this cause accumulating the "kernel" > QP for such > > > > > > application ? we want the server to run 365 days a year. > > > > > > > > > > > > I have some question about the scenario above. Did you > > > > call for the > > > > > > mpi disconnect on the both ends (server/client) before > > > the client > > > > > > exit (did we must to do it?) > > > > > > > > > > Yes, both ends will call disconnect. But for us, > > > > MPI_Comm_disconnect() > > > > > call is not a collective call, it is just a local operation. > > > > > > > > > > --CQ > > > > > > > > > Possible solution (internal review as yet): > > > > > > > > Each user process registers with the XRC QP: > > > > a. each process registers ONCE. If it registers > > multiple times, > > > > there is no reference increment -- > > > > rather the registration succeeds, but only one PID > > entry is > > > > kept per QP. > > > > b. Can have cleanup in the event of a process dying > suddenly. > > > > c. QP cannot be destroyed while there are any user > > > processes still > > > > registered with it. 
> > > > > > > > libibverbs API is as follows: > > > > > > > > ============================================================== > > > > ======================== > > > > /** > > > > * ibv_xrc_rcv_qp_alloc - creates an XRC QP for serving as a > > > > receive-side only QP, > > > > * and moves the created qp through the RESET->INIT and > > > > INIT->RTR transitions. > > > > * (The RTR->RTS transition is not needed, since this QP > > > > does no sending). > > > > * The sending XRC QP uses this QP as destination, while > > > > specifying an XRC SRQ > > > > * for actually receiving the transmissions and > > > > generating all completions on the > > > > * receiving side. > > > > * > > > > * This QP is created in kernel space, and persists > > > > until the last process registered > > > > * for the QP calls ibv_xrc_rcv_qp_unregister() (at > > > > which time the QP is destroyed). > > > > * > > > > * @pd: protection domain to use. At lower layer, this > provides > > > > access to userspace obj > > > > * @xrc_domain: xrc domain to use for the QP. > > > > * @attr: modify-qp attributes needed to bring the QP to RTR. > > > > * @attr_mask: bitmap indicating which attributes are > > > provided in the > > > > attr struct. > > > > * used for validity checking. > > > > * @xrc_rcv_qpn: qp_num of created QP (if success). To be > > passed to > > > > the remote node (sender). > > > > * The remote node will use xrc_rcv_qpn in > > > > ibv_post_send when sending to > > > > * XRC SRQ's on this host in the same xrc domain. > > > > * > > > > * RETURNS: success (0), or a (negative) error value. > > > > * > > > > * NOTE: this verb also registers the calling user-process > > > with the QP > > > > at its creation time > > > > * (implicit call to ibv_xrc_rcv_qp_register), to avoid > > > > race conditions. > > > > * The creating process will need to call > > > > ibv_xrc_qp_unregister() for the QP to release it from > > > > * this process. 
> > > > */ > > > > > > > > int ibv_xrc_rcv_qp_alloc(struct ibv_pd *pd, > > > > struct ibv_xrc_domain *xrc_domain, > > > > struct ibv_qp_attr *attr, > > > > enum ibv_qp_attr_mask attr_mask, > > > > uint32_t *xrc_rcv_qpn); > > > > > > > > > > > > > > ===================================================================== > > > > > > > > /** > > > > * ibv_xrc_rcv_qp_register: registers a user process with > > an XRC QP > > > > which serves as > > > > * a receive-side only QP. > > > > * > > > > * @xrc_domain: xrc domain the QP belongs to (for verification). > > > > * @xrc_qp_num: The (24 bit) number of the XRC QP. > > > > * > > > > * RETURNS: success (0), > > > > * or error (-EINVAL), if: > > > > * 1. There is no such QP_num allocated. > > > > * 2. The QP is allocated, but is not an > receive XRC QP > > > > * 3. The XRC QP does not belong to the given domain. > > > > */ > > > > int ibv_xrc_rcv_qp_register(struct ibv_xrc_domain *xrc_domain, > > > > uint32_t xrc_qp_num); > > > > > > > > > > > > > > ===================================================================== > > > > /** > > > > * ibv_xrc_rcv_qp_unregister: detaches a user process from > > > an XRC QP > > > > serving as > > > > * a receive-side only QP. If as a result, there are > > > > no remaining userspace processes > > > > * registered for this XRC QP, it is destroyed. > > > > * > > > > * @xrc_domain: xrc domain the QP belongs to (for verification). > > > > * @xrc_qp_num: The (24 bit) number of the XRC QP. > > > > * > > > > * RETURNS: success (0), > > > > * or error (-EINVAL), if: > > > > * 1. There is no such QP_num allocated. > > > > * 2. The QP is allocated, but is not an XRC QP > > > > * 3. The XRC QP does not belong to the given domain. > > > > * NOTE: I don't see any reason to return a special code if > > > the QP is > > > > destroyed -- the unregister simply > > > > * succeeds. 
> > > > */ > > > > int ibv_xrc_rcv_qp_unregister(struct ibv_xrc_domain > *xrc_domain, > > > > uint32_t xrc_qp_num); > > > > ============================================================== > > > > =============================== > > > > > > > > Usage: > > > > > > > > 1. Sender creates an XRC QP (sending QP) 2. Sender sends some > > > > receiving process on a remote node (say R1) a request to > > provide an > > > > XRC QP and XRC SRQ for > > > > receiving messages (the request includes the sending > > QP number). > > > > 3. R1 calls ibv_xrc_rcv_qp_alloc() to create a receiving > > XRC QP in > > > > kernel space, and move > > > > that QP up to RTR state. This function also registers > > process R1 > > > > with the XRC QP. > > > > 4. R1 calls ibv_create_xrc_srq() to create an SRQ for > > > receive messages > > > > via the just created XRC QP. > > > > 5. R1 responds to request, providing the XRC qp number, > > and XRC SRQ > > > > number to be used in communication. > > > > 6. Sender then may wish to communicate with another > > > receiving process > > > > on the remote host (say R2). > > > > it sends a request to R2 containing the remote XRC QP number > > > > (obtained from R1) > > > > which it will use to send messages. > > > > 7. R2 creates an XRC SRQ (if one does not already exist for the > > > > domain), and also > > > > calls ibv_xrc_rcv_qp_register() to register the process > > > R2 with the > > > > XRC QP created by R1. > > > > 8. If R1 no longer needs to communicate with the > sender, it calls > > > > ibv_xrc_rcv_qp_unregister() for the QP. > > > > The QP will not yet be destroyed, since R2 is still > > > registered with > > > > it. > > > > 9. If R2 no longer needs to communicate with the > sender, it calls > > > > ibv_xrc_rcv_qp_unregister() for the QP. > > > > At this point, the QP is destroyed, since no > processes remain > > > > registered with it. > > > > > > > > NOTES: > > > > 1. 
The problem of the QP being destroyed and quickly > > > re-allocated does > > > > not exist -- the upper bits of the > > > > QP number are incremented at each allocation (except > > for the MSB > > > > which is always 1 for XRC QPs). Thus, > > > > even if the same QP is re-allocated, its QP number > > > (stored in the > > > > QP object) will be different than > > > > expected (unless it is re-destroyed/re-allocated > > several hundred > > > > times). > > > > > > > > 2. With this model, we do not need a heartbeat: if a > > > receiving process > > > > dies, all XRC QPs it has registered for will > > > > be unregistered as part of process cleanup in kernel space. > > > > > > > > - Jack > > > > > > > > > > > > > > From rdreier at cisco.com Thu Jan 3 10:29:51 2008 From: rdreier at cisco.com (Roland Dreier) Date: Thu, 03 Jan 2008 10:29:51 -0800 Subject: [ofa-general] [GIT PULL] please pull infiniband.git for-linus Message-ID: Linus, please pull from master.kernel.org:/pub/scm/linux/kernel/git/roland/infiniband.git for-linus This tree is also available from kernel.org mirrors at: git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband.git for-linus This will pull one fix for an oops caused by reloading the ib_srp module: David Dillow (1): IB/srp: Fix list corruption/oops on module reload drivers/infiniband/ulp/srp/ib_srp.c | 1 + 1 files changed, 1 insertions(+), 0 deletions(-) diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c index 950228f..77e8b90 100644 --- a/drivers/infiniband/ulp/srp/ib_srp.c +++ b/drivers/infiniband/ulp/srp/ib_srp.c @@ -2054,6 +2054,7 @@ static void srp_remove_one(struct ib_device *device) list_for_each_entry_safe(target, tmp_target, &host->target_list, list) { scsi_remove_host(target->scsi_host); + srp_remove_host(target->scsi_host); srp_disconnect_target(target); ib_destroy_cm_id(target->cm_id); srp_free_target_ib(target);
From sweitzen at cisco.com Thu Jan 3 10:38:56 2008 From: sweitzen at cisco.com (Scott Weitzenkamp (sweitzen)) Date: Thu, 3 Jan 2008 10:38:56 -0800 Subject: [ofa-general] RE: can you please add a new product to OpenFabrics Linux? In-Reply-To: <476A60EB.2020100@dev.mellanox.co.il> References: <476A60EB.2020100@dev.mellanox.co.il> Message-ID: Done. Scott Weitzenkamp SQA and Release Manager Server Virtualization Business Unit Cisco Systems > -----Original Message----- > From: Dotan Barak [mailto:dotanb at dev.mellanox.co.il] > Sent: Thursday, December 20, 2007 4:33 AM > To: Scott Weitzenkamp (sweitzen); openib-general > Subject: can you please add a new product to OpenFabrics Linux? > > The product "mstflint" is missing.
> > The owner of this product is orenk.at.dev.mellanox.co.il > > > thanks > Dotan > From dillowda at ornl.gov Thu Jan 3 10:39:50 2008 From: dillowda at ornl.gov (David Dillow) Date: Thu, 03 Jan 2008 13:39:50 -0500 Subject: [ofa-general] Re: [GIT PULL] please pull infiniband.git for-linus In-Reply-To: References: Message-ID: <1199385590.7561.11.camel@lap75545.ornl.gov> On Thu, 2008-01-03 at 10:29 -0800, Roland Dreier wrote: > Linus, please pull from > > master.kernel.org:/pub/scm/linux/kernel/git/roland/infiniband.git for-linus > > This tree is also available from kernel.org mirrors at: > > git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband.git for-linus > > This will pull one fix for an oops caused by reloading the ib_srp module: > > David Dillow (1): > IB/srp: Fix list corruption/oops on module reload If we've got time before 2.6.24 final, I'd wait on this a bit. ib_srp:srp_remove_work() has them reversed as well, and I'm currently tracking down why it oopses when the srp_remove_host() happens before the scsi_remove_host(), which is the documented call sequence. This "fixes" the oops I see on unload, but I'm not sure it is a correct fix. From rdreier at cisco.com Thu Jan 3 10:56:54 2008 From: rdreier at cisco.com (Roland Dreier) Date: Thu, 03 Jan 2008 10:56:54 -0800 Subject: [ofa-general] Re: [GIT PULL] please pull infiniband.git for-linus In-Reply-To: <1199385590.7561.11.camel@lap75545.ornl.gov> (David Dillow's message of "Thu, 03 Jan 2008 13:39:50 -0500") References: <1199385590.7561.11.camel@lap75545.ornl.gov> Message-ID: > If we've got time before 2.6.24 final, I'd wait on this a bit. > ib_srp:srp_remove_work() has them reversed as well, and I'm currently > tracking down why it oopses when the srp_remove_host() happens before > the scsi_remove_host(), which is the documented call sequence. I think the best thing to do is to merge this (assuming that Linus gets to it), since it looks quite safe and definitely fixes a crash.
Then if we get to the root cause we can change the order of the calls if it turns out a different fix is required. - R. From arlin.r.davis at intel.com Thu Jan 3 11:55:32 2008 From: arlin.r.davis at intel.com (Arlin Davis) Date: Thu, 3 Jan 2008 11:55:32 -0800 Subject: [ofa-general] [Patch 1 of 4] uDAPL v2: OFW changes for IBAL cm, init/fini, dat.conf, and dlerror Message-ID: <000001c84e42$994e6940$9f97070a@amr.corp.intel.com> Windows specific - Add dapl_ep fields ibal_cm_handle, recv_disc, sent_disc for IBAL provider Support for direct object on CQ INIT and FINI changes setup dat.conf default path, fix sr parsing Common code - Add Stan as contributor O/S independent dat_os_library_error() Signed-off by: Stan Smith Signed-off by: Arlin Davis diff --git a/AUTHORS b/AUTHORS index 7c609ba..b2c6a58 100644 --- a/AUTHORS +++ b/AUTHORS @@ -12,5 +12,6 @@ DAPL project: Gil Rubin Steve Sears Randy Smith + Stan Smith Anthony Topper Steve Wise diff --git a/dapl/common/dapl_adapter_util.h b/dapl/common/dapl_adapter_util.h index 0483839..37fb77b 100755 --- a/dapl/common/dapl_adapter_util.h +++ b/dapl/common/dapl_adapter_util.h @@ -290,6 +290,8 @@ dapls_cqe_to_event_extension( #include "dapl_dummy_dto.h" #elif OPENIB #include "dapl_ib_dto.h" +#else +#include "dapl_ibal_dto.h" #endif diff --git a/dapl/common/dapl_ep_util.c b/dapl/common/dapl_ep_util.c index 4518a2b..cf90d46 100644 --- a/dapl/common/dapl_ep_util.c +++ b/dapl/common/dapl_ep_util.c @@ -238,6 +238,14 @@ dapl_ep_dealloc ( dapl_os_free ( ep_ptr->cxn_timer, sizeof ( DAPL_OS_TIMER ) ); } +#if defined(_WIN32) || defined(_WIN64) + if ( ep_ptr->ibal_cm_handle ) + { + dapl_os_free ( ep_ptr->ibal_cm_handle, + sizeof ( *ep_ptr->ibal_cm_handle ) ); + ep_ptr->ibal_cm_handle = NULL; + } +#endif dapl_os_free (ep_ptr, sizeof (DAPL_EP) + sizeof (DAT_SOCK_ADDR) ); } diff --git a/dapl/common/dapl_sp_util.c b/dapl/common/dapl_sp_util.c index 1ca1204..1e2ca14 100644 --- a/dapl/common/dapl_sp_util.c +++ b/dapl/common/dapl_sp_util.c @@ 
-39,6 +39,7 @@ #include "dapl_sp_util.h" #include "dapl_cr_util.h" + /* * Local definitions */ @@ -97,6 +98,9 @@ dapls_sp_alloc ( dapl_llist_init_entry (&sp_ptr->header.ia_list_entry); dapl_os_lock_init (&sp_ptr->header.lock); +#if defined(_WIN32) || defined(_WIN64) + dapl_os_wait_object_init( &sp_ptr->wait_object ); +#endif /* * Initialize the Body (set to NULL above) */ @@ -129,8 +133,11 @@ dapls_sp_free_sp ( sp_ptr->header.magic == DAPL_MAGIC_RSP); dapl_os_assert (dapl_llist_is_empty (&sp_ptr->cr_list_head)); +#if defined(_WIN32) || defined(_WIN64) + dapl_os_wait_object_destroy( &sp_ptr->wait_object ); +#endif dapl_os_lock (&sp_ptr->header.lock); - sp_ptr->header.magic = DAPL_MAGIC_INVALID; /* reset magic to prevent reuse */ + sp_ptr->header.magic = DAPL_MAGIC_INVALID;/* reset magic to prevent reuse */ dapl_os_unlock (&sp_ptr->header.lock); dapl_os_free (sp_ptr, sizeof (DAPL_SP)); } diff --git a/dapl/include/dapl.h b/dapl/include/dapl.h index ade101b..d6c1a8c 100755 --- a/dapl/include/dapl.h +++ b/dapl/include/dapl.h @@ -64,6 +64,8 @@ #include "dapl_dummy_util.h" #elif OPENIB #include "dapl_ib_util.h" +#else /* windows - IBAL and/or IBAL+Sock_CM */ +#include "dapl_ibal_util.h" #endif /********************************************************************* @@ -448,19 +450,24 @@ struct dapl_ep DAPL_ATOMIC req_count; DAPL_ATOMIC recv_count; - DAPL_COOKIE_BUFFER req_buffer; - DAPL_COOKIE_BUFFER recv_buffer; + DAPL_COOKIE_BUFFER req_buffer; + DAPL_COOKIE_BUFFER recv_buffer; - ib_data_segment_t *recv_iov; + ib_data_segment_t *recv_iov; DAT_COUNT recv_iov_num; - ib_data_segment_t *send_iov; + ib_data_segment_t *send_iov; DAT_COUNT send_iov_num; #ifdef DAPL_DBG_IO_TRC int ibt_dumped; struct io_buf_track *ibt_base; DAPL_RING_BUFFER ibt_queue; #endif /* DAPL_DBG_IO_TRC */ +#if defined(_WIN32) || defined(_WIN64) + DAT_BOOLEAN recv_discreq; + DAT_BOOLEAN sent_discreq; + dp_ib_cm_handle_t ibal_cm_handle; +#endif }; /* DAPL_SRQ maps to DAT_SRQ_HANDLE */ diff --git 
a/dapl/udapl/dapl_init.c b/dapl/udapl/dapl_init.c index cdd90d8..2c45956 100644 --- a/dapl/udapl/dapl_init.c +++ b/dapl/udapl/dapl_init.c @@ -184,6 +184,13 @@ DAT_PROVIDER_INIT_FUNC_NAME ( provider = NULL; hca_ptr = NULL; +#if defined(_WIN32) || defined(_WIN64) + /* initialize DAPL library here as when called from DLL context in DLLmain() + * the IB (ibal) call hangs. + */ + dapl_init(); +#endif + dat_status = dapl_provider_list_insert(provider_info->ia_name, &provider); if ( DAT_SUCCESS != dat_status ) { @@ -289,6 +296,13 @@ DAT_PROVIDER_FINI_FUNC_NAME ( dapl_hca_free (provider->extension); (void) dapl_provider_list_remove(provider_info->ia_name); + +#if defined(_WIN32) || defined(_WIN64) + /* cleanup DAPL library - relocated here from OSD DLL context as the IBAL + * calls hung in the DLL context? + */ + dapl_fini(); +#endif } diff --git a/dat/common/dat_sr.c b/dat/common/dat_sr.c index ee50375..76991a8 100755 --- a/dat/common/dat_sr.c +++ b/dat/common/dat_sr.c @@ -388,7 +388,7 @@ dat_sr_provider_open ( fncptr = dat_os_library_sym(data->lib_handle, "dapl_extensions"); - if ((dlerror() != NULL) || (fncptr == NULL)) + if ((dat_os_library_error() != NULL) || (fncptr == NULL)) { dat_os_dbg_print(DAT_OS_DBG_TYPE_SR, "DAT Registry: WARNING: library %s, " diff --git a/dat/udat/udat_sr_parser.c b/dat/udat/udat_sr_parser.c index 64c4114..84b5b9d 100644 --- a/dat/udat/udat_sr_parser.c +++ b/dat/udat/udat_sr_parser.c @@ -49,7 +49,11 @@ *********************************************************************/ #define DAT_SR_CONF_ENV "DAT_OVERRIDE" +#if defined(_WIN32) || defined(_WIN64) +#define DAT_SR_CONF_DEFAULT "C:\\DAT\\dat.conf" +#else #define DAT_SR_CONF_DEFAULT "/etc/dat.conf" +#endif #define DAT_SR_TOKEN_THREADSAFE "threadsafe" #define DAT_SR_TOKEN_NONTHREADSAFE "nonthreadsafe" @@ -1474,7 +1478,7 @@ dat_sr_read_quoted_str ( } else { - token->value[j] = c; + token->value[j] = (char)c; j++; is_prev_char_backslash = DAT_FALSE; @@ -1521,5 +1525,5 @@ dat_sr_read_comment ( } 
while ( (DAT_SR_CHAR_NEWLINE != c) && (EOF != c) ); /* put back the newline */ - dat_os_fputc (file, c); + dat_os_ungetc (file, c); } From arlin.r.davis at intel.com Thu Jan 3 11:55:38 2008 From: arlin.r.davis at intel.com (Arlin Davis) Date: Thu, 3 Jan 2008 11:55:38 -0800 Subject: [ofa-general] [Patch 2 of 4] uDAPL v2: OFW changes - missing DAT_API defs Message-ID: <000101c84e42$9cc4a440$9f97070a@amr.corp.intel.com> Common code - Add DAT_API definitions for dat_redirection.h, udat_redirection.h Signed-off by: Stan Smith Signed-off by: Arlin Davis diff --git a/dat/include/dat/dat_redirection.h b/dat/include/dat/dat_redirection.h index 52f1770..ea61eff 100755 --- a/dat/include/dat/dat_redirection.h +++ b/dat/include/dat/dat_redirection.h @@ -479,13 +479,13 @@ typedef struct dat_provider DAT_PROVIDER; * FUNCTION PROTOTYPES ****************************************************************/ -typedef DAT_RETURN (*DAT_IA_OPEN_FUNC) ( +typedef DAT_RETURN (DAT_API *DAT_IA_OPEN_FUNC) ( IN const DAT_NAME_PTR, /* provider */ IN DAT_COUNT, /* asynch_evd_min_qlen */ INOUT DAT_EVD_HANDLE *, /* asynch_evd_handle */ OUT DAT_IA_HANDLE *); /* ia_handle */ -typedef DAT_RETURN (*DAT_IA_OPENV_FUNC) ( +typedef DAT_RETURN (DAT_API *DAT_IA_OPENV_FUNC) ( IN const DAT_NAME_PTR, /* provider */ IN DAT_COUNT, /* asynch_evd_min_qlen */ INOUT DAT_EVD_HANDLE *, /* asynch_evd_handle */ @@ -494,11 +494,11 @@ typedef DAT_RETURN (*DAT_IA_OPENV_FUNC) ( IN DAT_UINT32, /* dat_minor_version number */ IN DAT_BOOLEAN); /* dat_thread_safety */ -typedef DAT_RETURN (*DAT_IA_CLOSE_FUNC) ( +typedef DAT_RETURN (DAT_API *DAT_IA_CLOSE_FUNC) ( IN DAT_IA_HANDLE, /* ia_handle */ IN DAT_CLOSE_FLAGS ); /* close_flags */ -typedef DAT_RETURN (*DAT_IA_QUERY_FUNC) ( +typedef DAT_RETURN (DAT_API *DAT_IA_QUERY_FUNC) ( IN DAT_IA_HANDLE, /* ia_handle */ OUT DAT_EVD_HANDLE *, /* async_evd_handle */ IN DAT_IA_ATTR_MASK, /* ia_attr_mask */ @@ -508,32 +508,32 @@ typedef DAT_RETURN (*DAT_IA_QUERY_FUNC) ( /* helper functions */ 
-typedef DAT_RETURN (*DAT_SET_CONSUMER_CONTEXT_FUNC) ( +typedef DAT_RETURN (DAT_API *DAT_SET_CONSUMER_CONTEXT_FUNC) ( IN DAT_HANDLE, /* dat_handle */ IN DAT_CONTEXT); /* context */ -typedef DAT_RETURN (*DAT_GET_CONSUMER_CONTEXT_FUNC) ( +typedef DAT_RETURN (DAT_API *DAT_GET_CONSUMER_CONTEXT_FUNC) ( IN DAT_HANDLE, /* dat_handle */ OUT DAT_CONTEXT * ); /* context */ -typedef DAT_RETURN (*DAT_GET_HANDLE_TYPE_FUNC) ( +typedef DAT_RETURN (DAT_API *DAT_GET_HANDLE_TYPE_FUNC) ( IN DAT_HANDLE, /* dat_handle */ OUT DAT_HANDLE_TYPE * ); /* dat_handle_type */ /* CR functions */ -typedef DAT_RETURN (*DAT_CR_QUERY_FUNC) ( +typedef DAT_RETURN (DAT_API *DAT_CR_QUERY_FUNC) ( IN DAT_CR_HANDLE, /* cr_handle */ IN DAT_CR_PARAM_MASK, /* cr_param_mask */ OUT DAT_CR_PARAM * ); /* cr_param */ -typedef DAT_RETURN (*DAT_CR_ACCEPT_FUNC) ( +typedef DAT_RETURN (DAT_API *DAT_CR_ACCEPT_FUNC) ( IN DAT_CR_HANDLE, /* cr_handle */ IN DAT_EP_HANDLE, /* ep_handle */ IN DAT_COUNT, /* private_data_size */ IN const DAT_PVOID ); /* private_data */ -typedef DAT_RETURN (*DAT_CR_REJECT_FUNC) ( +typedef DAT_RETURN (DAT_API *DAT_CR_REJECT_FUNC) ( IN DAT_CR_HANDLE, /* cr_handle */ IN DAT_COUNT, /* private_data_size */ IN const DAT_PVOID ); /* private_data */ @@ -542,35 +542,35 @@ typedef DAT_RETURN (*DAT_CR_REJECT_FUNC) ( * For DAT-1.0 it was only defined for uDAPL. 
*/ -typedef DAT_RETURN (*DAT_CR_HANDOFF_FUNC) ( +typedef DAT_RETURN (DAT_API *DAT_CR_HANDOFF_FUNC) ( IN DAT_CR_HANDLE, /* cr_handle */ IN DAT_CONN_QUAL); /* handoff */ /* EVD functions */ -typedef DAT_RETURN (*DAT_EVD_RESIZE_FUNC) ( +typedef DAT_RETURN (DAT_API *DAT_EVD_RESIZE_FUNC) ( IN DAT_EVD_HANDLE, /* evd_handle */ IN DAT_COUNT ); /* evd_min_qlen */ -typedef DAT_RETURN (*DAT_EVD_POST_SE_FUNC) ( +typedef DAT_RETURN (DAT_API *DAT_EVD_POST_SE_FUNC) ( IN DAT_EVD_HANDLE, /* evd_handle */ IN const DAT_EVENT * ); /* event */ -typedef DAT_RETURN (*DAT_EVD_DEQUEUE_FUNC) ( +typedef DAT_RETURN (DAT_API *DAT_EVD_DEQUEUE_FUNC) ( IN DAT_EVD_HANDLE, /* evd_handle */ OUT DAT_EVENT * ); /* event */ -typedef DAT_RETURN (*DAT_EVD_FREE_FUNC) ( +typedef DAT_RETURN (DAT_API *DAT_EVD_FREE_FUNC) ( IN DAT_EVD_HANDLE ); /* evd_handle */ -typedef DAT_RETURN (*DAT_EVD_QUERY_FUNC) ( +typedef DAT_RETURN (DAT_API *DAT_EVD_QUERY_FUNC) ( IN DAT_EVD_HANDLE, /* evd_handle */ IN DAT_EVD_PARAM_MASK, /* evd_param_mask */ OUT DAT_EVD_PARAM * ); /* evd_param */ /* EP functions */ -typedef DAT_RETURN (*DAT_EP_CREATE_FUNC) ( +typedef DAT_RETURN (DAT_API *DAT_EP_CREATE_FUNC) ( IN DAT_IA_HANDLE, /* ia_handle */ IN DAT_PZ_HANDLE, /* pz_handle */ IN DAT_EVD_HANDLE, /* recv_completion_evd_handle */ @@ -579,7 +579,7 @@ typedef DAT_RETURN (*DAT_EP_CREATE_FUNC) ( IN const DAT_EP_ATTR *, /* ep_attributes */ OUT DAT_EP_HANDLE * ); /* ep_handle */ -typedef DAT_RETURN (*DAT_EP_CREATE_WITH_SRQ_FUNC) ( +typedef DAT_RETURN (DAT_API *DAT_EP_CREATE_WITH_SRQ_FUNC) ( IN DAT_IA_HANDLE, /* ia_handle */ IN DAT_PZ_HANDLE, /* pz_handle */ IN DAT_EVD_HANDLE, /* recv_completion_evd_handle */ @@ -589,17 +589,17 @@ typedef DAT_RETURN (*DAT_EP_CREATE_WITH_SRQ_FUNC) ( IN const DAT_EP_ATTR *, /* ep_attributes */ OUT DAT_EP_HANDLE * ); /* ep_handle */ -typedef DAT_RETURN (*DAT_EP_QUERY_FUNC) ( +typedef DAT_RETURN (DAT_API *DAT_EP_QUERY_FUNC) ( IN DAT_EP_HANDLE, /* ep_handle */ IN DAT_EP_PARAM_MASK, /* ep_param_mask */ OUT 
DAT_EP_PARAM * ); /* ep_param */ -typedef DAT_RETURN (*DAT_EP_MODIFY_FUNC) ( +typedef DAT_RETURN (DAT_API *DAT_EP_MODIFY_FUNC) ( IN DAT_EP_HANDLE, /* ep_handle */ IN DAT_EP_PARAM_MASK, /* ep_param_mask */ IN const DAT_EP_PARAM * ); /* ep_param */ -typedef DAT_RETURN (*DAT_EP_CONNECT_FUNC) ( +typedef DAT_RETURN (DAT_API *DAT_EP_CONNECT_FUNC) ( IN DAT_EP_HANDLE, /* ep_handle */ IN DAT_IA_ADDRESS_PTR, /* remote_ia_address */ IN DAT_CONN_QUAL, /* remote_conn_qual */ @@ -609,14 +609,14 @@ typedef DAT_RETURN (*DAT_EP_CONNECT_FUNC) ( IN DAT_QOS, /* quality_of_service */ IN DAT_CONNECT_FLAGS ); /* connect_flags */ -typedef DAT_RETURN (*DAT_EP_COMMON_CONNECT_FUNC) ( +typedef DAT_RETURN (DAT_API *DAT_EP_COMMON_CONNECT_FUNC) ( IN DAT_EP_HANDLE, /* ep_handle */ IN DAT_IA_ADDRESS_PTR, /* remote_ia_address */ IN DAT_TIMEOUT, /* timeout */ IN DAT_COUNT, /* private_data_size */ IN const DAT_PVOID ); /* private_data */ -typedef DAT_RETURN (*DAT_EP_DUP_CONNECT_FUNC) ( +typedef DAT_RETURN (DAT_API *DAT_EP_DUP_CONNECT_FUNC) ( IN DAT_EP_HANDLE, /* ep_handle */ IN DAT_EP_HANDLE, /* ep_dup_handle */ IN DAT_TIMEOUT, /* timeout */ @@ -624,18 +624,18 @@ typedef DAT_RETURN (*DAT_EP_DUP_CONNECT_FUNC) ( IN const DAT_PVOID, /* private_data */ IN DAT_QOS); /* quality_of_service */ -typedef DAT_RETURN (*DAT_EP_DISCONNECT_FUNC) ( +typedef DAT_RETURN (DAT_API *DAT_EP_DISCONNECT_FUNC) ( IN DAT_EP_HANDLE, /* ep_handle */ IN DAT_CLOSE_FLAGS ); /* close_flags */ -typedef DAT_RETURN (*DAT_EP_POST_SEND_FUNC) ( +typedef DAT_RETURN (DAT_API *DAT_EP_POST_SEND_FUNC) ( IN DAT_EP_HANDLE, /* ep_handle */ IN DAT_COUNT, /* num_segments */ IN DAT_LMR_TRIPLET *, /* local_iov */ IN DAT_DTO_COOKIE, /* user_cookie */ IN DAT_COMPLETION_FLAGS ); /* completion_flags */ -typedef DAT_RETURN (*DAT_EP_POST_SEND_WITH_INVALIDATE_FUNC) ( +typedef DAT_RETURN (DAT_API *DAT_EP_POST_SEND_WITH_INVALIDATE_FUNC) ( IN DAT_EP_HANDLE, /* ep_handle */ IN DAT_COUNT, /* num_segments */ IN DAT_LMR_TRIPLET *, /* local_iov */ @@ -644,14 
+644,14 @@ typedef DAT_RETURN (*DAT_EP_POST_SEND_WITH_INVALIDATE_FUNC) ( IN DAT_BOOLEAN, /* invalidate_flag */ IN DAT_RMR_CONTEXT ); /* RMR context */ -typedef DAT_RETURN (*DAT_EP_POST_RECV_FUNC) ( +typedef DAT_RETURN (DAT_API *DAT_EP_POST_RECV_FUNC) ( IN DAT_EP_HANDLE, /* ep_handle */ IN DAT_COUNT, /* num_segments */ IN DAT_LMR_TRIPLET *, /* local_iov */ IN DAT_DTO_COOKIE, /* user_cookie */ IN DAT_COMPLETION_FLAGS ); /* completion_flags */ -typedef DAT_RETURN (*DAT_EP_POST_RDMA_READ_FUNC) ( +typedef DAT_RETURN (DAT_API *DAT_EP_POST_RDMA_READ_FUNC) ( IN DAT_EP_HANDLE, /* ep_handle */ IN DAT_COUNT, /* num_segments */ IN DAT_LMR_TRIPLET *, /* local_iov */ @@ -659,14 +659,14 @@ typedef DAT_RETURN (*DAT_EP_POST_RDMA_READ_FUNC) ( IN const DAT_RMR_TRIPLET *,/* remote_iov */ IN DAT_COMPLETION_FLAGS ); /* completion_flags */ -typedef DAT_RETURN (*DAT_EP_POST_RDMA_READ_TO_RMR_FUNC) ( +typedef DAT_RETURN (DAT_API *DAT_EP_POST_RDMA_READ_TO_RMR_FUNC) ( IN DAT_EP_HANDLE, /* ep_handle */ IN const DAT_RMR_TRIPLET *, /* local_iov */ IN DAT_DTO_COOKIE, /* user_cookie */ IN const DAT_RMR_TRIPLET *,/* remote_iov */ IN DAT_COMPLETION_FLAGS ); /* completion_flags */ -typedef DAT_RETURN (*DAT_EP_POST_RDMA_WRITE_FUNC) ( +typedef DAT_RETURN (DAT_API *DAT_EP_POST_RDMA_WRITE_FUNC) ( IN DAT_EP_HANDLE, /* ep_handle */ IN DAT_COUNT, /* num_segments */ IN DAT_LMR_TRIPLET *, /* local_iov */ @@ -674,59 +674,59 @@ typedef DAT_RETURN (*DAT_EP_POST_RDMA_WRITE_FUNC) ( IN const DAT_RMR_TRIPLET *,/* remote_iov */ IN DAT_COMPLETION_FLAGS ); /* completion_flags */ -typedef DAT_RETURN (*DAT_EP_GET_STATUS_FUNC) ( +typedef DAT_RETURN (DAT_API *DAT_EP_GET_STATUS_FUNC) ( IN DAT_EP_HANDLE, /* ep_handle */ OUT DAT_EP_STATE *, /* ep_state */ OUT DAT_BOOLEAN *, /* recv_idle */ OUT DAT_BOOLEAN * ); /* request_idle */ -typedef DAT_RETURN (*DAT_EP_FREE_FUNC) ( +typedef DAT_RETURN (DAT_API *DAT_EP_FREE_FUNC) ( IN DAT_EP_HANDLE); /* ep_handle */ -typedef DAT_RETURN (*DAT_EP_RESET_FUNC) ( +typedef DAT_RETURN (DAT_API 
*DAT_EP_RESET_FUNC) ( IN DAT_EP_HANDLE); /* ep_handle */ -typedef DAT_RETURN (*DAT_EP_RECV_QUERY_FUNC) ( +typedef DAT_RETURN (DAT_API *DAT_EP_RECV_QUERY_FUNC) ( IN DAT_EP_HANDLE, /* ep_handle */ OUT DAT_COUNT *, /* nbufs_allocated */ OUT DAT_COUNT *); /* bufs_alloc_span */ -typedef DAT_RETURN (*DAT_EP_SET_WATERMARK_FUNC) ( +typedef DAT_RETURN (DAT_API *DAT_EP_SET_WATERMARK_FUNC) ( IN DAT_EP_HANDLE, /* ep_handle */ IN DAT_COUNT, /* ep_soft_high_watermark*/ IN DAT_COUNT ); /* ep_hard_high_watermark*/ /* LMR functions */ -typedef DAT_RETURN (*DAT_LMR_FREE_FUNC) ( +typedef DAT_RETURN (DAT_API *DAT_LMR_FREE_FUNC) ( IN DAT_LMR_HANDLE ); /* lmr_handle */ -typedef DAT_RETURN (*DAT_LMR_SYNC_RDMA_READ_FUNC) ( +typedef DAT_RETURN (DAT_API *DAT_LMR_SYNC_RDMA_READ_FUNC) ( IN DAT_IA_HANDLE, /* ia_handle */ IN const DAT_LMR_TRIPLET *,/* local segments */ IN DAT_VLEN ); /* num_segments */ -typedef DAT_RETURN (*DAT_LMR_SYNC_RDMA_WRITE_FUNC) ( +typedef DAT_RETURN (DAT_API *DAT_LMR_SYNC_RDMA_WRITE_FUNC) ( IN DAT_IA_HANDLE, /* ia_handle */ IN const DAT_LMR_TRIPLET *, /* local_segments */ IN DAT_VLEN ); /* num_segments */ /* RMR functions */ -typedef DAT_RETURN (*DAT_RMR_CREATE_FUNC) ( +typedef DAT_RETURN (DAT_API *DAT_RMR_CREATE_FUNC) ( IN DAT_PZ_HANDLE, /* pz_handle */ OUT DAT_RMR_HANDLE *); /* rmr_handle */ -typedef DAT_RETURN (*DAT_RMR_CREATE_FOR_EP_FUNC) ( +typedef DAT_RETURN (DAT_API *DAT_RMR_CREATE_FOR_EP_FUNC) ( IN DAT_PZ_HANDLE, /* pz_handle */ OUT DAT_RMR_HANDLE *); /* rmr_handle */ -typedef DAT_RETURN (*DAT_RMR_QUERY_FUNC) ( +typedef DAT_RETURN (DAT_API *DAT_RMR_QUERY_FUNC) ( IN DAT_RMR_HANDLE, /* rmr_handle */ IN DAT_RMR_PARAM_MASK, /* rmr_param_mask */ OUT DAT_RMR_PARAM *); /* rmr_param */ -typedef DAT_RETURN (*DAT_RMR_BIND_FUNC) ( +typedef DAT_RETURN (DAT_API *DAT_RMR_BIND_FUNC) ( IN DAT_RMR_HANDLE, /* rmr_handle */ IN DAT_LMR_HANDLE, /* lmr_handle */ IN const DAT_LMR_TRIPLET *,/* lmr_triplet */ @@ -737,119 +737,119 @@ typedef DAT_RETURN (*DAT_RMR_BIND_FUNC) ( IN 
DAT_COMPLETION_FLAGS, /* completion_flags */ OUT DAT_RMR_CONTEXT * ); /* context */ -typedef DAT_RETURN (*DAT_RMR_FREE_FUNC) ( +typedef DAT_RETURN (DAT_API *DAT_RMR_FREE_FUNC) ( IN DAT_RMR_HANDLE); /* rmr_handle */ /* PSP functions */ -typedef DAT_RETURN (*DAT_PSP_CREATE_FUNC) ( +typedef DAT_RETURN (DAT_API *DAT_PSP_CREATE_FUNC) ( IN DAT_IA_HANDLE, /* ia_handle */ IN DAT_CONN_QUAL, /* conn_qual */ IN DAT_EVD_HANDLE, /* evd_handle */ IN DAT_PSP_FLAGS, /* psp_flags */ OUT DAT_PSP_HANDLE * ); /* psp_handle */ -typedef DAT_RETURN (*DAT_PSP_CREATE_ANY_FUNC) ( +typedef DAT_RETURN (DAT_API *DAT_PSP_CREATE_ANY_FUNC) ( IN DAT_IA_HANDLE, /* ia_handle */ OUT DAT_CONN_QUAL *, /* conn_qual */ IN DAT_EVD_HANDLE, /* evd_handle */ IN DAT_PSP_FLAGS, /* psp_flags */ OUT DAT_PSP_HANDLE * ); /* psp_handle */ -typedef DAT_RETURN (*DAT_PSP_QUERY_FUNC) ( +typedef DAT_RETURN (DAT_API *DAT_PSP_QUERY_FUNC) ( IN DAT_PSP_HANDLE, /* psp_handle */ IN DAT_PSP_PARAM_MASK, /* psp_param_mask */ OUT DAT_PSP_PARAM * ); /* psp_param */ -typedef DAT_RETURN (*DAT_PSP_FREE_FUNC) ( +typedef DAT_RETURN (DAT_API *DAT_PSP_FREE_FUNC) ( IN DAT_PSP_HANDLE ); /* psp_handle */ /* RSP functions */ -typedef DAT_RETURN (*DAT_RSP_CREATE_FUNC) ( +typedef DAT_RETURN (DAT_API *DAT_RSP_CREATE_FUNC) ( IN DAT_IA_HANDLE, /* ia_handle */ IN DAT_CONN_QUAL, /* conn_qual */ IN DAT_EP_HANDLE, /* ep_handle */ IN DAT_EVD_HANDLE, /* evd_handle */ OUT DAT_RSP_HANDLE * ); /* rsp_handle */ -typedef DAT_RETURN (*DAT_RSP_QUERY_FUNC) ( +typedef DAT_RETURN (DAT_API *DAT_RSP_QUERY_FUNC) ( IN DAT_RSP_HANDLE, /* rsp_handle */ IN DAT_RSP_PARAM_MASK, /* rsp_param_mask */ OUT DAT_RSP_PARAM * ); /* rsp_param */ -typedef DAT_RETURN (*DAT_RSP_FREE_FUNC) ( +typedef DAT_RETURN (DAT_API *DAT_RSP_FREE_FUNC) ( IN DAT_RSP_HANDLE ); /* rsp_handle */ /* CSP functions functions - DAT 2.0 */ -typedef DAT_RETURN (*DAT_CSP_CREATE_FUNC) ( +typedef DAT_RETURN (DAT_API *DAT_CSP_CREATE_FUNC) ( IN DAT_IA_HANDLE, /* ia_handle */ IN DAT_COMM *, /* communicator */ IN 
DAT_IA_ADDRESS_PTR, /* address */ IN DAT_EVD_HANDLE, /* evd_handle */ OUT DAT_CSP_HANDLE * ); /* csp_handle */ -typedef DAT_RETURN (*DAT_CSP_QUERY_FUNC) ( +typedef DAT_RETURN (DAT_API *DAT_CSP_QUERY_FUNC) ( IN DAT_CSP_HANDLE, /* csp_handle */ IN DAT_CSP_PARAM_MASK, /* csp_param_mask */ OUT DAT_CSP_PARAM * ); /* csp_param */ -typedef DAT_RETURN (*DAT_CSP_FREE_FUNC) ( +typedef DAT_RETURN (DAT_API *DAT_CSP_FREE_FUNC) ( IN DAT_CSP_HANDLE ); /* csp_handle */ /* PZ functions */ -typedef DAT_RETURN (*DAT_PZ_CREATE_FUNC) ( +typedef DAT_RETURN (DAT_API *DAT_PZ_CREATE_FUNC) ( IN DAT_IA_HANDLE, /* ia_handle */ OUT DAT_PZ_HANDLE * ); /* pz_handle */ -typedef DAT_RETURN (*DAT_PZ_QUERY_FUNC) ( +typedef DAT_RETURN (DAT_API *DAT_PZ_QUERY_FUNC) ( IN DAT_PZ_HANDLE, /* pz_handle */ IN DAT_PZ_PARAM_MASK, /* pz_param_mask */ OUT DAT_PZ_PARAM *); /* pz_param */ -typedef DAT_RETURN (*DAT_PZ_FREE_FUNC) ( +typedef DAT_RETURN (DAT_API *DAT_PZ_FREE_FUNC) ( IN DAT_PZ_HANDLE ); /* pz_handle */ /* SRQ functions */ -typedef DAT_RETURN (*DAT_SRQ_CREATE_FUNC) ( +typedef DAT_RETURN (DAT_API *DAT_SRQ_CREATE_FUNC) ( IN DAT_IA_HANDLE, /* ia_handle */ IN DAT_PZ_HANDLE, /* pz_handle */ IN DAT_SRQ_ATTR *, /* srq_attributes */ OUT DAT_SRQ_HANDLE *); /* srq_handle */ -typedef DAT_RETURN (*DAT_SRQ_SET_LW_FUNC) ( +typedef DAT_RETURN (DAT_API *DAT_SRQ_SET_LW_FUNC) ( IN DAT_SRQ_HANDLE, /* srq_handle */ IN DAT_COUNT ); /* srq_low_watermark */ -typedef DAT_RETURN (*DAT_SRQ_FREE_FUNC) ( +typedef DAT_RETURN (DAT_API *DAT_SRQ_FREE_FUNC) ( IN DAT_SRQ_HANDLE); /* srq_handle */ -typedef DAT_RETURN (*DAT_SRQ_QUERY_FUNC) ( +typedef DAT_RETURN (DAT_API *DAT_SRQ_QUERY_FUNC) ( IN DAT_SRQ_HANDLE, /* srq_handle */ IN DAT_SRQ_PARAM_MASK, /* srq_param_mask */ OUT DAT_SRQ_PARAM *); /* srq_param */ -typedef DAT_RETURN (*DAT_SRQ_RESIZE_FUNC) ( +typedef DAT_RETURN (DAT_API *DAT_SRQ_RESIZE_FUNC) ( IN DAT_SRQ_HANDLE, /* srq_handle */ IN DAT_COUNT ); /* srq_queue_length */ -typedef DAT_RETURN (*DAT_SRQ_POST_RECV_FUNC) ( +typedef 
DAT_RETURN (DAT_API *DAT_SRQ_POST_RECV_FUNC) ( IN DAT_SRQ_HANDLE, /* srq_handle */ IN DAT_COUNT, /* num_segments */ IN DAT_LMR_TRIPLET *, /* local_iov */ IN DAT_DTO_COOKIE ); /* user_cookie */ -typedef DAT_RETURN (*DAT_IA_HA_RELATED_FUNC) ( +typedef DAT_RETURN (DAT_API *DAT_IA_HA_RELATED_FUNC) ( IN DAT_IA_HANDLE, /* ia_handle */ IN const DAT_NAME_PTR, /* provider */ OUT DAT_BOOLEAN *); /* answer */ #ifdef DAT_EXTENSIONS #include -typedef DAT_RETURN (*DAT_HANDLE_EXTENDEDOP_FUNC)( +typedef DAT_RETURN (DAT_API *DAT_HANDLE_EXTENDEDOP_FUNC)( IN DAT_HANDLE, /* handle */ IN DAT_EXTENDED_OP, /* extended op */ IN va_list); /* argument list */ diff --git a/dat/include/dat/udat_redirection.h b/dat/include/dat/udat_redirection.h index 4a7b11e..d73f9bd 100755 --- a/dat/include/dat/udat_redirection.h +++ b/dat/include/dat/udat_redirection.h @@ -154,7 +154,7 @@ * ****************************************************************/ -typedef DAT_RETURN (*DAT_LMR_CREATE_FUNC) ( +typedef DAT_RETURN (DAT_API *DAT_LMR_CREATE_FUNC) ( IN DAT_IA_HANDLE, /* ia_handle */ IN DAT_MEM_TYPE, /* mem_type */ IN DAT_REGION_DESCRIPTION, /* region_description */ @@ -168,72 +168,72 @@ typedef DAT_RETURN (*DAT_LMR_CREATE_FUNC) ( OUT DAT_VLEN *, /* registered_length */ OUT DAT_VADDR * ); /* registered_address */ -typedef DAT_RETURN (*DAT_LMR_QUERY_FUNC) ( +typedef DAT_RETURN (DAT_API *DAT_LMR_QUERY_FUNC) ( IN DAT_LMR_HANDLE, /* lmr_handle */ IN DAT_LMR_PARAM_MASK, /* lmr_param_mask */ OUT DAT_LMR_PARAM *); /* lmr_param */ /* Event functions */ -typedef DAT_RETURN (*DAT_EVD_CREATE_FUNC) ( +typedef DAT_RETURN (DAT_API *DAT_EVD_CREATE_FUNC) ( IN DAT_IA_HANDLE, /* ia_handle */ IN DAT_COUNT, /* evd_min_qlen */ IN DAT_CNO_HANDLE, /* cno_handle */ IN DAT_EVD_FLAGS, /* evd_flags */ OUT DAT_EVD_HANDLE * ); /* evd_handle */ -typedef DAT_RETURN (*DAT_EVD_MODIFY_CNO_FUNC) ( +typedef DAT_RETURN (DAT_API *DAT_EVD_MODIFY_CNO_FUNC) ( IN DAT_EVD_HANDLE, /* evd_handle */ IN DAT_CNO_HANDLE); /* cno_handle */ -typedef 
DAT_RETURN (*DAT_CNO_CREATE_FUNC) ( +typedef DAT_RETURN (DAT_API *DAT_CNO_CREATE_FUNC) ( IN DAT_IA_HANDLE, /* ia_handle */ IN DAT_OS_WAIT_PROXY_AGENT,/* agent */ OUT DAT_CNO_HANDLE *); /* cno_handle */ -typedef DAT_RETURN (*DAT_CNO_FD_CREATE_FUNC) ( +typedef DAT_RETURN (DAT_API *DAT_CNO_FD_CREATE_FUNC) ( IN DAT_IA_HANDLE, /* ia_handle */ OUT DAT_FD *, /* file_descriptor */ OUT DAT_CNO_HANDLE *); /* cno_handle */ -typedef DAT_RETURN (*DAT_CNO_TRIGGER_FUNC) ( +typedef DAT_RETURN (DAT_API *DAT_CNO_TRIGGER_FUNC) ( IN DAT_CNO_HANDLE, /* cno_handle */ OUT DAT_EVD_HANDLE *); /* trigger */ -typedef DAT_RETURN (*DAT_CNO_MODIFY_AGENT_FUNC) ( +typedef DAT_RETURN (DAT_API *DAT_CNO_MODIFY_AGENT_FUNC) ( IN DAT_CNO_HANDLE, /* cno_handle */ IN DAT_OS_WAIT_PROXY_AGENT);/* agent */ -typedef DAT_RETURN (*DAT_CNO_QUERY_FUNC) ( +typedef DAT_RETURN (DAT_API *DAT_CNO_QUERY_FUNC) ( IN DAT_CNO_HANDLE, /* cno_handle */ IN DAT_CNO_PARAM_MASK, /* cno_param_mask */ OUT DAT_CNO_PARAM * ); /* cno_param */ -typedef DAT_RETURN (*DAT_CNO_FREE_FUNC) ( +typedef DAT_RETURN (DAT_API *DAT_CNO_FREE_FUNC) ( IN DAT_CNO_HANDLE); /* cno_handle */ -typedef DAT_RETURN (*DAT_CNO_WAIT_FUNC) ( +typedef DAT_RETURN (DAT_API *DAT_CNO_WAIT_FUNC) ( IN DAT_CNO_HANDLE, /* cno_handle */ IN DAT_TIMEOUT, /* timeout */ OUT DAT_EVD_HANDLE *); /* evd_handle */ -typedef DAT_RETURN (*DAT_EVD_ENABLE_FUNC) ( +typedef DAT_RETURN (DAT_API *DAT_EVD_ENABLE_FUNC) ( IN DAT_EVD_HANDLE); /* evd_handle */ -typedef DAT_RETURN (*DAT_EVD_WAIT_FUNC) ( +typedef DAT_RETURN (DAT_API *DAT_EVD_WAIT_FUNC) ( IN DAT_EVD_HANDLE, /* evd_handle */ IN DAT_TIMEOUT, /* Timeout */ IN DAT_COUNT, /* Threshold */ OUT DAT_EVENT *, /* event */ OUT DAT_COUNT * ); /* N more events */ -typedef DAT_RETURN (*DAT_EVD_DISABLE_FUNC) ( +typedef DAT_RETURN (DAT_API *DAT_EVD_DISABLE_FUNC) ( IN DAT_EVD_HANDLE); /* evd_handle */ -typedef DAT_RETURN (*DAT_EVD_SET_UNWAITABLE_FUNC) ( +typedef DAT_RETURN (DAT_API *DAT_EVD_SET_UNWAITABLE_FUNC) ( IN DAT_EVD_HANDLE); /* evd_handle 
*/ -typedef DAT_RETURN (*DAT_EVD_CLEAR_UNWAITABLE_FUNC) ( +typedef DAT_RETURN (DAT_API *DAT_EVD_CLEAR_UNWAITABLE_FUNC) ( IN DAT_EVD_HANDLE); /* evd_handle */ @@ -333,20 +333,20 @@ struct dat_provider DAT_SRQ_SET_LW_FUNC srq_set_lw_func; /* DAT 2.0 functions */ - DAT_CSP_CREATE_FUNC csp_create_func; - DAT_CSP_QUERY_FUNC csp_query_func; - DAT_CSP_FREE_FUNC csp_free_func; + DAT_CSP_CREATE_FUNC csp_create_func; + DAT_CSP_QUERY_FUNC csp_query_func; + DAT_CSP_FREE_FUNC csp_free_func; - DAT_EP_COMMON_CONNECT_FUNC ep_common_connect_func; + DAT_EP_COMMON_CONNECT_FUNC ep_common_connect_func; - DAT_RMR_CREATE_FOR_EP_FUNC rmr_create_for_ep_func; - DAT_EP_POST_SEND_WITH_INVALIDATE_FUNC ep_post_send_with_invalidate_func; - DAT_EP_POST_RDMA_READ_TO_RMR_FUNC ep_post_rdma_read_to_rmr_func; + DAT_RMR_CREATE_FOR_EP_FUNC rmr_create_for_ep_func; + DAT_EP_POST_SEND_WITH_INVALIDATE_FUNC ep_post_send_with_invalidate_func; + DAT_EP_POST_RDMA_READ_TO_RMR_FUNC ep_post_rdma_read_to_rmr_func; - DAT_CNO_FD_CREATE_FUNC cno_fd_create_func; - DAT_CNO_TRIGGER_FUNC cno_trigger_func; + DAT_CNO_FD_CREATE_FUNC cno_fd_create_func; + DAT_CNO_TRIGGER_FUNC cno_trigger_func; - DAT_IA_HA_RELATED_FUNC ia_ha_related_func; + DAT_IA_HA_RELATED_FUNC ia_ha_related_func; #ifdef DAT_EXTENSIONS DAT_HANDLE_EXTENDEDOP_FUNC handle_extendedop_func; From arlin.r.davis at intel.com Thu Jan 3 11:55:42 2008 From: arlin.r.davis at intel.com (Arlin Davis) Date: Thu, 3 Jan 2008 11:55:42 -0800 Subject: [ofa-general] [Patch 3 of 4] uDAPL v2: OFW changes - extension debug, build issues, ptr check macro Message-ID: <000201c84e42$9f35a440$9f97070a@amr.corp.intel.com> Common code - Missing DAT_API defs casting to fix build issues bitmaps for extension debug DAPL_BAD_PTR macro Signed-off by: Stan Smith Signed-off by: Arlin Davis diff --git a/dapl/common/dapl_ep_connect.c b/dapl/common/dapl_ep_connect.c index 0998edc..12d391f 100755 --- a/dapl/common/dapl_ep_connect.c +++ b/dapl/common/dapl_ep_connect.c @@ -260,7 +260,7 @@ 
dapl_ep_connect ( max_req_pdata_size = dapls_ib_private_data_size (NULL, DAPL_PDATA_CONN_REQ); - if (private_data_size + req_hdr_size > max_req_pdata_size) + if (private_data_size + req_hdr_size > (DAT_COUNT)max_req_pdata_size) { dapl_os_unlock ( &ep_ptr->header.lock ); dat_status = DAT_ERROR (DAT_INVALID_PARAMETER, DAT_INVALID_ARG5); diff --git a/dapl/common/dapl_ep_create.c b/dapl/common/dapl_ep_create.c index 0362ac3..9e6bd9c 100644 --- a/dapl/common/dapl_ep_create.c +++ b/dapl/common/dapl_ep_create.c @@ -64,7 +64,7 @@ * DAT_INVALID_ATTRIBUTE * DAT_MODEL_NOT_SUPPORTED */ -DAT_RETURN +DAT_RETURN DAT_API dapl_ep_create ( IN DAT_IA_HANDLE ia_handle, IN DAT_PZ_HANDLE pz_handle, @@ -148,7 +148,7 @@ dapl_ep_create ( dat_status = DAT_ERROR (DAT_INVALID_PARAMETER, DAT_INVALID_ARG7); goto bail; } - if ((unsigned long)ep_attr & 3) + if (DAPL_BAD_PTR(ep_attr)) { dat_status = DAT_ERROR (DAT_INVALID_PARAMETER, DAT_INVALID_ARG6); goto bail; diff --git a/dapl/common/dapl_ep_create_with_srq.c b/dapl/common/dapl_ep_create_with_srq.c index b62f53b..e288670 100644 --- a/dapl/common/dapl_ep_create_with_srq.c +++ b/dapl/common/dapl_ep_create_with_srq.c @@ -164,7 +164,8 @@ dapl_ep_create_with_srq ( dat_status = DAT_ERROR (DAT_INVALID_PARAMETER, DAT_INVALID_ARG7); goto bail; } - if ((unsigned long)ep_attr & 3) + + if ( DAPL_BAD_PTR(ep_attr) ) { dat_status = DAT_ERROR (DAT_INVALID_PARAMETER, DAT_INVALID_ARG6); goto bail; diff --git a/dapl/common/dapl_get_consumer_context.c b/dapl/common/dapl_get_consumer_context.c index 142b57b..c143d59 100644 --- a/dapl/common/dapl_get_consumer_context.c +++ b/dapl/common/dapl_get_consumer_context.c @@ -67,7 +67,7 @@ dapl_get_consumer_context ( header = (DAPL_HEADER *)dat_handle; if ( ((header) == NULL) || - ((unsigned long)(header) & 3) || + DAPL_BAD_PTR(header) || (header->magic != DAPL_MAGIC_IA && header->magic != DAPL_MAGIC_EVD && header->magic != DAPL_MAGIC_EP && @@ -81,7 +81,7 @@ dapl_get_consumer_context ( dat_status = DAT_ERROR 
(DAT_INVALID_HANDLE,0); goto bail; } - if ( context == NULL || ((unsigned long)(header) & 3) ) + if ( context == NULL || DAPL_BAD_PTR(header) ) { dat_status = DAT_ERROR (DAT_INVALID_PARAMETER,DAT_INVALID_ARG2); goto bail; diff --git a/dapl/common/dapl_get_handle_type.c b/dapl/common/dapl_get_handle_type.c index 156d758..c970b77 100644 --- a/dapl/common/dapl_get_handle_type.c +++ b/dapl/common/dapl_get_handle_type.c @@ -68,7 +68,7 @@ dapl_get_handle_type ( header = (DAPL_HEADER *)dat_handle; if ( ((header) == NULL) || - ((unsigned long)(header) & 3) || + DAPL_BAD_PTR(header) || (header->magic != DAPL_MAGIC_IA && header->magic != DAPL_MAGIC_EVD && header->magic != DAPL_MAGIC_EP && diff --git a/dapl/common/dapl_ring_buffer_util.c b/dapl/common/dapl_ring_buffer_util.c index 7484234..730f5df 100644 --- a/dapl/common/dapl_ring_buffer_util.c +++ b/dapl/common/dapl_ring_buffer_util.c @@ -342,7 +342,7 @@ dapls_rbuf_adjust ( pos = dapl_os_atomic_read (&rbuf->head); while ( pos != dapl_os_atomic_read (&rbuf->tail) ) { - rbuf->base[pos] = rbuf->base[pos] + offset; + rbuf->base[pos] = (void*)((char*)rbuf->base[pos] + offset); pos = (pos + 1) & rbuf->lim; /* verify in range */ } } diff --git a/dapl/common/dapl_set_consumer_context.c b/dapl/common/dapl_set_consumer_context.c index e43be33..2043b57 100644 --- a/dapl/common/dapl_set_consumer_context.c +++ b/dapl/common/dapl_set_consumer_context.c @@ -68,7 +68,7 @@ dapl_set_consumer_context ( header = (DAPL_HEADER *)dat_handle; if ( ((header) == NULL) || - ((unsigned long) (header) & 3) || + DAPL_BAD_PTR(header) || (header->magic != DAPL_MAGIC_IA && header->magic != DAPL_MAGIC_EVD && header->magic != DAPL_MAGIC_EP && diff --git a/dapl/common/dapl_srq_create.c b/dapl/common/dapl_srq_create.c index 66e9d0e..b03bbd6 100644 --- a/dapl/common/dapl_srq_create.c +++ b/dapl/common/dapl_srq_create.c @@ -115,7 +115,7 @@ dapl_srq_create ( dat_status = DAT_ERROR (DAT_INVALID_PARAMETER, DAT_INVALID_ARG4); goto bail; } - if ((unsigned 
long)srq_attr & 3) + if (DAPL_BAD_PTR(srq_attr)) { dat_status = DAT_ERROR (DAT_INVALID_PARAMETER, DAT_INVALID_ARG3); goto bail; diff --git a/dapl/include/dapl.h b/dapl/include/dapl.h index ade101b..6375d29 100755 --- a/dapl/include/dapl.h +++ b/dapl/include/dapl.h @@ -161,15 +161,26 @@ typedef enum dapl_qp_state * * *********************************************************************/ +#if defined (sun) || defined(__sun) || defined(_sun_) || defined (__solaris__) +#define DAPL_BAD_PTR(a) ((unsigned long)(a) & 3) +#elif defined(__linux__) +#define DAPL_BAD_PTR(a) ((unsigned long)(a) & 3) +#elif defined(_WIN64) +#define DAPL_BAD_PTR(a) ((unsigned long)((DAT_UINT64)(a)) & 3) +#elif defined(_WIN32) +#define DAPL_BAD_PTR(a) ((unsigned long)((DAT_UINT64)(a)) & 3) +#endif + /* * Simple macro to verify a handle is bad. Conditions: * - pointer is NULL * - pointer is not word aligned * - pointer's magic number is wrong */ + #define DAPL_BAD_HANDLE(h, magicNum) ( \ ((h) == NULL) || \ - ((unsigned long)(h) & 3) || \ + DAPL_BAD_PTR(h) || \ (((DAPL_HEADER *)(h))->magic != (magicNum))) #define DAPL_MIN(a, b) ((a < b) ? (a) : (b)) diff --git a/dapl/include/dapl_debug.h b/dapl/include/dapl_debug.h index a3bdcf6..76db8fd 100644 --- a/dapl/include/dapl_debug.h +++ b/dapl/include/dapl_debug.h @@ -85,7 +85,16 @@ extern DAPL_DBG_DEST g_dapl_dbg_dest; extern void dapl_internal_dbg_log ( DAPL_DBG_TYPE type, const char *fmt, ...); #else /* !DAPL_DBG */ + +#if defined(_WIN32) || defined(_WIN64) +/* sigh - no support for (...) in macros. Compiler should optimize this away */ +static __inline void dapl_dbg_log ( DAPL_DBG_TYPE type, const char *fmt, ...) +{ +} +#else #define dapl_dbg_log(...) 
+#endif + #endif /* !DAPL_DBG */ /* @@ -112,7 +121,8 @@ extern void dapl_internal_dbg_log ( DAPL_DBG_TYPE type, const char *fmt, ...); #define DCNT_EVD_DEQUEUE_NOT_FOUND 18 #define DCNT_TIMER_SET 19 #define DCNT_TIMER_CANCEL 20 -#define DCNT_NUM_COUNTERS 21 +#define DCNT_EXTENSION 21 +#define DCNT_NUM_COUNTERS 22 #define DCNT_ALL_COUNTERS DCNT_NUM_COUNTERS #if defined(DAPL_COUNTERS) From arlin.r.davis at intel.com Thu Jan 3 11:55:46 2008 From: arlin.r.davis at intel.com (Arlin Davis) Date: Thu, 3 Jan 2008 11:55:46 -0800 Subject: [ofa-general] [Patch 4 of 4] uDAPL v2: OFW changes - return status checking in evd processing, add function dapl_event_str, IB extension build Message-ID: <000301c84e42$a14dfcf0$9f97070a@amr.corp.intel.com> Windows specific - IBAL support in evd_create Build IB extensions by default Common code - check return status, evd_free, evd_wait add dapl_event_str function definitions for dat_os_library_error, dat_os_ungetc Signed-off by: Stan Smith Signed-off by: Arlin Davis diff --git a/dapl/common/dapl_evd_free.c b/dapl/common/dapl_evd_free.c index 407dbc8..cacb84a 100755 --- a/dapl/common/dapl_evd_free.c +++ b/dapl/common/dapl_evd_free.c @@ -125,9 +125,11 @@ DAT_RETURN DAT_API dapl_evd_free ( #endif /* defined(__KDAPL__) */ bail: - dapl_dbg_log (DAPL_DBG_TYPE_RTN, - "dapl_evd_free () returns 0x%x\n", - dat_status); + if ( dat_status ) + { + dapl_dbg_log (DAPL_DBG_TYPE_RTN, + "dapl_evd_free () returns 0x%x\n", dat_status); + } return dat_status; } diff --git a/dapl/common/dapl_evd_util.c b/dapl/common/dapl_evd_util.c index 1f55ea3..c2888f3 100755 --- a/dapl/common/dapl_evd_util.c +++ b/dapl/common/dapl_evd_util.c @@ -53,6 +53,52 @@ DAT_RETURN dapli_evd_event_alloc ( IN DAT_COUNT qlen); +char *dapl_event_str( IN DAT_EVENT_NUMBER event_num ) +{ +#if defined(DAPL_DBG) + struct dat_event_str { char *str; DAT_EVENT_NUMBER num;}; + static struct dat_event_str events[] = { + {"DAT_DTO_COMPLETION_EVENT", DAT_DTO_COMPLETION_EVENT}, + 
{"DAT_RMR_BIND_COMPLETION_EVENT", DAT_RMR_BIND_COMPLETION_EVENT}, + {"DAT_CONNECTION_REQUEST_EVENT", DAT_CONNECTION_REQUEST_EVENT}, + {"DAT_CONNECTION_EVENT_ESTABLISHED", DAT_CONNECTION_EVENT_ESTABLISHED}, + {"DAT_CONNECTION_EVENT_PEER_REJECTED", DAT_CONNECTION_EVENT_PEER_REJECTED}, + {"DAT_CONNECTION_EVENT_NON_PEER_REJECTED", DAT_CONNECTION_EVENT_NON_PEER_REJECTED}, + {"DAT_CONNECTION_EVENT_ACCEPT_COMPLETION_ERROR", DAT_CONNECTION_EVENT_ACCEPT_COMPLETION_ERROR}, + {"DAT_CONNECTION_EVENT_DISCONNECTED", DAT_CONNECTION_EVENT_DISCONNECTED}, + {"DAT_CONNECTION_EVENT_BROKEN", DAT_CONNECTION_EVENT_BROKEN}, + {"DAT_CONNECTION_EVENT_TIMED_OUT", DAT_CONNECTION_EVENT_TIMED_OUT}, + {"DAT_CONNECTION_EVENT_UNREACHABLE", DAT_CONNECTION_EVENT_UNREACHABLE}, + {"DAT_ASYNC_ERROR_EVD_OVERFLOW", DAT_ASYNC_ERROR_EVD_OVERFLOW}, + {"DAT_ASYNC_ERROR_IA_CATASTROPHIC", DAT_ASYNC_ERROR_IA_CATASTROPHIC}, + {"DAT_ASYNC_ERROR_EP_BROKEN", DAT_ASYNC_ERROR_EP_BROKEN}, + {"DAT_ASYNC_ERROR_TIMED_OUT", DAT_ASYNC_ERROR_TIMED_OUT}, + {"DAT_ASYNC_ERROR_PROVIDER_INTERNAL_ERROR", DAT_ASYNC_ERROR_PROVIDER_INTERNAL_ERROR}, + {"DAT_HA_DOWN_TO_1", DAT_HA_DOWN_TO_1}, + {"DAT_HA_UP_TO_MULTI_PATH", DAT_HA_UP_TO_MULTI_PATH}, + {"DAT_SOFTWARE_EVENT", DAT_SOFTWARE_EVENT}, +#ifdef DAT_EXTENSIONS + {"DAT_EXTENSION_EVENT", DAT_EXTENSION_EVENT}, + {"DAT_IB_EXTENSION_RANGE_BASE", DAT_IB_EXTENSION_RANGE_BASE}, + {"DAT_IW_EXTENSION_RANGE_BASE", DAT_IW_EXTENSION_RANGE_BASE}, +#endif /* DAT_EXTENSIONS */ + {NULL,0}, + }; + int i; + + for(i=0; events[i].str; i++) + { + if (events[i].num == event_num) + return events[i].str; + } + return "Unknown DAT event?"; +#else + static char str[16]; + sprintf(str,"%x",event_num); + return str; +#endif +} + /* * dapls_evd_internal_create * @@ -122,7 +168,15 @@ dapls_evd_internal_create ( | DAT_EVD_CONNECTION_FLAG | DAT_EVD_CR_FLAG) ) ) { - +#if defined(_VENDOR_IBAL_) + /* + * The creation of CQ required a PD (PZ) associated with it and + * we do not have a PD here; therefore, the 
work-around is that we + * will postpone the creation of the cq till the creation of QP which + * this cq will associate with. + */ + evd_ptr->ib_cq_handle = IB_INVALID_HANDLE; +#else dat_status = dapls_ib_cq_alloc (ia_ptr, evd_ptr, &cq_len); @@ -153,6 +207,7 @@ dapls_evd_internal_create ( goto bail; } +#endif /* _VENDOR_IBAL_ */ } /* We now have an accurate count of events, so allocate them into diff --git a/dapl/common/dapl_ia_query.c b/dapl/common/dapl_ia_query.c index 7596daf..593f356 100755 --- a/dapl/common/dapl_ia_query.c +++ b/dapl/common/dapl_ia_query.c @@ -222,9 +222,10 @@ dapl_ia_query ( } bail: - dapl_dbg_log (DAPL_DBG_TYPE_RTN, - "dapl_ia_query () returns 0x%x\n", - dat_status); + if (dat_status != DAT_SUCCESS) { + dapl_dbg_log (DAPL_DBG_TYPE_RTN, + "dapl_ia_query () returns 0x%x\n", dat_status); + } return dat_status; } diff --git a/dapl/udapl/dapl_evd_wait.c b/dapl/udapl/dapl_evd_wait.c index e4e5b37..7cfece7 100644 --- a/dapl/udapl/dapl_evd_wait.c +++ b/dapl/udapl/dapl_evd_wait.c @@ -273,9 +273,9 @@ DAT_RETURN DAT_API dapl_evd_wait ( *nmore = dapls_rbuf_count(&evd_ptr->pending_event_queue); bail: - dapl_dbg_log (DAPL_DBG_TYPE_RTN, - "dapl_evd_wait () returns 0x%x\n", - dat_status); - + if ( dat_status ) { + dapl_dbg_log (DAPL_DBG_TYPE_RTN, + "dapl_evd_wait () returns 0x%x\n", dat_status); + } return dat_status; } diff --git a/dat/include/dat/udat_config.h b/dat/include/dat/udat_config.h index 674f579..a720376 100644 --- a/dat/include/dat/udat_config.h +++ b/dat/include/dat/udat_config.h @@ -79,4 +79,11 @@ #define DAT_THREADSAFE DAT_TRUE #endif /* DAT_THREADSAFE */ +/* + * Enable DAT Extensions + */ +#ifndef DAT_EXTENSIONS +#define DAT_EXTENSIONS 1 +#endif + #endif /* _UDAT_CONFIG_H_ */ diff --git a/dat/udat/linux/dat_osd.h b/dat/udat/linux/dat_osd.h index c2ecc16..4a96ab5 100644 --- a/dat/udat/linux/dat_osd.h +++ b/dat/udat/linux/dat_osd.h @@ -114,6 +114,7 @@ dat_os_dbg_print ( *********************************************************************/ 
#define DAT_ERROR(Type, SubType) ((DAT_RETURN)(DAT_CLASS_ERROR | Type | SubType)) +#define dat_os_library_error() dlerror() typedef size_t DAT_OS_SIZE; typedef void * DAT_OS_LIBRARY_HANDLE; @@ -296,6 +297,16 @@ dat_os_fgetc ( return fgetc (file); } +/* dat_os_ungetc() returns EOF on error or char 'c'. + * Push char 'c' back into specified stream for subsequent read. + */ +STATIC INLINE int +dat_os_ungetc ( + DAT_OS_FILE *file, int c) +{ + return ungetc(c, file); +} + /* dat_os_fgetc() returns EOF on error or end of file. */ STATIC INLINE int dat_os_fputc ( From dillowda at ornl.gov Thu Jan 3 12:09:17 2008 From: dillowda at ornl.gov (David Dillow) Date: Thu, 03 Jan 2008 15:09:17 -0500 Subject: [ofa-general] Re: list corruption on ib_srp load in v2.6.24-rc5 In-Reply-To: <20080103173330T.tomof@acm.org> References: <20071223014407L.tomof@acm.org> <1198689251.25003.2.camel@lap75545.ornl.gov> <20080103173330T.tomof@acm.org> Message-ID: <1199390957.7561.33.camel@lap75545.ornl.gov> On Thu, 2008-01-03 at 17:30 +0900, FUJITA Tomonori wrote: > On Wed, 02 Jan 2008 09:51:38 -0800 > Roland Dreier wrote: > > > > > Can you try this? > > > > > > That patched oopsed in scsi_remove_host(), but reversing the order has > > > survived over 500 insert/probe/remove cycles. > > > > > > Tested-by: David Dillow > > > --- > > > diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c > > > index 950228f..77e8b90 100644 > > > --- a/drivers/infiniband/ulp/srp/ib_srp.c > > > +++ b/drivers/infiniband/ulp/srp/ib_srp.c > > > @@ -2054,6 +2054,7 @@ static void srp_remove_one(struct ib_device *device) > > > list_for_each_entry_safe(target, tmp_target, > > > &host->target_list, list) { > > > scsi_remove_host(target->scsi_host); > > > + srp_remove_host(target->scsi_host); > > > srp_disconnect_target(target); > > > > Where do we stand on this? What is the right place to put the > > srp_remove_host? Is there a bug somewhere else? 
> > {sas|fc}_remove_host is called before scsi_remove_host. And in > srp_remove_work(), we call srp_remove_host and then > scsi_remove_host. ibmvscsi also calls them in that order. > > I thought that I messed up something in srp_transport_class. But I > can't figure out what's wrong. The above patch works and is unlikely > to lead to critical problems so I'm fine with it for now. I added some debugging printk's -- the first word is the function name: printk(KERN_DEBUG "ib_srp:srp_remove_one %p %p\n", target, target->scsi_host); printk(KERN_DEBUG "srp_rport_del %p %p %p %s\n", shost, rport, dev, dev->kobj.k_name); printk(KERN_DEBUG "transport_remove_dev %p %d\n", dev, atomic_read(&dev->kobj.kref.refcount)); printk(KERN_DEBUG "transport_remove_classdev %p\n", dev); printk(KERN_DEBUG "scsi_target_reap_usercontext %p %p %p\n", shost, starget, &starget->dev); And the dmesg output: ib_srp:srp_remove_one ffff810845498450 ffff810845498000 srp_rport_del ffff810845498000 ffff8108450d6000 ffff8108450d6000 port-3:1 transport_remove_dev ffff8108450d6000 4 transport_remove_classdev ffff8108450d6000 srp_rport_del done srp_rport_del ffff810845498000 ffff810845123028 ffff810845123028 target3:0:0 transport_remove_dev ffff810845123028 9 srp_rport_del done transport_remove_dev ffff81084557f920 6 sd 0:0:0:0: [sda] Synchronizing SCSI cache sd 0:0:0:0: [sda] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK,SUGGEST_OK transport_remove_dev ffff8108454f6920 6 sd 0:0:0:1: [sdb] Synchronizing SCSI cache sd 0:0:0:1: [sdb] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK,SUGGEST_OK scsi_target_reap_usercontext ffff810845498000 ffff810845123000 ffff810845123028 transport_remove_dev ffff810845123028 2 It looks like srp_rport_del() is getting called for a device object it doesn't own -- target3:0:0. And when scsi_remove_host() goes to remove it, it is already gone. 
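The failure mode described above can be modelled in plain C outside the kernel. This is only an illustrative sketch under stated assumptions: `struct child`, `for_each_child()`, `rport_del()` and both callbacks are hypothetical stand-ins, not the SCSI transport code. It contrasts a per-child callback that treats every child as a remote port with one that checks the child's type first.

```c
#include <stddef.h>

/* Hypothetical stand-ins; these are not the kernel's structures. */
enum child_type { CHILD_RPORT, CHILD_TARGET };

struct child {
    enum child_type type;
    const char *name;
    int removed;            /* number of times removal was attempted */
};

/* Removal routine that is only meaningful for rport children. */
static void rport_del(struct child *c)
{
    c->removed++;
}

/* Unfiltered callback: assumes every child is an rport. */
static int del_unfiltered(struct child *c)
{
    rport_del(c);
    return 0;
}

/* Filtered callback: leaves children of any other type alone. */
static int del_filtered(struct child *c)
{
    if (c->type == CHILD_RPORT)
        rport_del(c);
    return 0;
}

/* Apply fn to each child in turn, as a parent-device walk would. */
static void for_each_child(struct child *kids, size_t n,
                           int (*fn)(struct child *))
{
    for (size_t i = 0; i < n; i++)
        fn(&kids[i]);
}
```

With one "port-3:1" child of type CHILD_RPORT and one "target3:0:0" child of type CHILD_TARGET, the unfiltered walk touches both, while the filtered walk leaves the target for its real owner to remove.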
Adding if (strncmp(dev->kobj.k_name, "port-", 5)) return; to the top of srp_rport_del() fixes the oops, so that seems to confirm my hypothesis. When scsi_remove_host() is called before srp_remove_host(), it removes that "target3:0:0" entry, and all is happy, so the fix already posted should be fine for 2.6.24, though we may want to fix up srp_remove_work() as well -- I've not looked at it to see if it would have the same problem. As for a better fix, I'm not sure. I'll go out on a limb and bet the other users of srp_remove_host() may have the same issue. Dave From dillowda at ornl.gov Thu Jan 3 12:13:07 2008 From: dillowda at ornl.gov (David Dillow) Date: Thu, 03 Jan 2008 15:13:07 -0500 Subject: [ofa-general] Re: [GIT PULL] please pull infiniband.git for-linus In-Reply-To: References: <1199385590.7561.11.camel@lap75545.ornl.gov> Message-ID: <1199391187.7561.38.camel@lap75545.ornl.gov> On Thu, 2008-01-03 at 10:56 -0800, Roland Dreier wrote: > > If we've got time before 2.6.24 final, I'd wait on this a bit. > > ib_srp:srp_remove_work() has them reversed as well, and I'm currently > > tracking down why it oopses when the srp_remove_host() happens before > > the scsi_remove_host(), which is the documented call sequence. > > I think the best thing to do is to merge this (assuming that Linus > gets to it), since it looks quite safe and definitely fixes a crash. > Then if we get to the root cause we can change the order of the calls > if it turns out a different fix is required. I've made progress on the root cause (posted in another thread), but we need to fix ib_srp.c:srp_remove_work() as well, as I think it will have the same issue. It will only be hit if we cannot reconnect to the target, so it probably doesn't see a lot of use. I'll send a new patch to you for just the ib_srp.c side.
Dave From dillowda at ornl.gov Thu Jan 3 12:20:09 2008 From: dillowda at ornl.gov (David Dillow) Date: Thu, 03 Jan 2008 15:20:09 -0500 Subject: [ofa-general] Re: [GIT PULL] please pull infiniband.git for-linus In-Reply-To: <1199391187.7561.38.camel@lap75545.ornl.gov> References: <1199385590.7561.11.camel@lap75545.ornl.gov> <1199391187.7561.38.camel@lap75545.ornl.gov> Message-ID: <1199391609.7561.46.camel@lap75545.ornl.gov> Subject: IB/srp: Fix list corruption/oops on module reload ib_srp doesn't clean up the transport attributes properly when unloading, so it leaves references around to free'd memory. The srp_remove_host() cannot go before the scsi_remove_host() call as the documented call sequence suggests, as it will cause an oops when the SRP transport code will free up an object that is not its own. So, temporarily reorder the calls in srp_remove_work() to avoid the problem as well until the transport code can be fixed. Signed-off-by: David Dillow --- On Thu, 2008-01-03 at 15:13 -0500, David Dillow wrote: > I'll send a new patch to you for just the ib_srp.c side. Should've just sent it with the last email. 
diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c index 950228f..6e7e3c8 100644 --- a/drivers/infiniband/ulp/srp/ib_srp.c +++ b/drivers/infiniband/ulp/srp/ib_srp.c @@ -423,8 +423,8 @@ static void srp_remove_work(struct work_struct *work) list_del(&target->list); spin_unlock(&target->srp_host->target_lock); - srp_remove_host(target->scsi_host); scsi_remove_host(target->scsi_host); + srp_remove_host(target->scsi_host); ib_destroy_cm_id(target->cm_id); srp_free_target_ib(target); scsi_host_put(target->scsi_host); @@ -2054,6 +2054,7 @@ static void srp_remove_one(struct ib_device *device) list_for_each_entry_safe(target, tmp_target, &host->target_list, list) { scsi_remove_host(target->scsi_host); + srp_remove_host(target->scsi_host); srp_disconnect_target(target); ib_destroy_cm_id(target->cm_id); srp_free_target_ib(target); From dave at thedillows.org Thu Jan 3 12:51:25 2008 From: dave at thedillows.org (David Dillow) Date: Thu, 03 Jan 2008 15:51:25 -0500 Subject: [ofa-general] Re: list corruption on ib_srp load in v2.6.24-rc5 In-Reply-To: <1199390957.7561.33.camel@lap75545.ornl.gov> References: <20071223014407L.tomof@acm.org> <1198689251.25003.2.camel@lap75545.ornl.gov> <20080103173330T.tomof@acm.org> <1199390957.7561.33.camel@lap75545.ornl.gov> Message-ID: <1199393485.7561.51.camel@lap75545.ornl.gov> On Thu, 2008-01-03 at 15:09 -0500, David Dillow wrote: > As for a better fix, I'm not sure. Here's a better way than the strncmp. If this meets everyone's approval, then I can roll up a proper commit. 
diff --git a/drivers/scsi/scsi_transport_srp.c b/drivers/scsi/scsi_transport_srp.c index 44a340b..65c584d 100644 --- a/drivers/scsi/scsi_transport_srp.c +++ b/drivers/scsi/scsi_transport_srp.c @@ -265,7 +265,8 @@ EXPORT_SYMBOL_GPL(srp_rport_del); static int do_srp_rport_del(struct device *dev, void *data) { - srp_rport_del(dev_to_rport(dev)); + if (scsi_is_srp_rport(dev)) + srp_rport_del(dev_to_rport(dev)); return 0; } From rdreier at cisco.com Thu Jan 3 13:33:14 2008 From: rdreier at cisco.com (Roland Dreier) Date: Thu, 03 Jan 2008 13:33:14 -0800 Subject: [ofa-general] Re: list corruption on ib_srp load in v2.6.24-rc5 In-Reply-To: <1199393485.7561.51.camel@lap75545.ornl.gov> (David Dillow's message of "Thu, 03 Jan 2008 15:51:25 -0500") References: <20071223014407L.tomof@acm.org> <1198689251.25003.2.camel@lap75545.ornl.gov> <20080103173330T.tomof@acm.org> <1199390957.7561.33.camel@lap75545.ornl.gov> <1199393485.7561.51.camel@lap75545.ornl.gov> Message-ID: > + if (scsi_is_srp_rport(dev)) > + srp_rport_del(dev_to_rport(dev)); This has the ring of truth to me as the right fix, although I certainly don't know the details of this code... Fujita-san? - R. From rdreier at cisco.com Thu Jan 3 13:34:24 2008 From: rdreier at cisco.com (Roland Dreier) Date: Thu, 03 Jan 2008 13:34:24 -0800 Subject: [ofa-general] Re: [GIT PULL] please pull infiniband.git for-linus In-Reply-To: <1199391609.7561.46.camel@lap75545.ornl.gov> (David Dillow's message of "Thu, 03 Jan 2008 15:20:09 -0500") References: <1199385590.7561.11.camel@lap75545.ornl.gov> <1199391187.7561.38.camel@lap75545.ornl.gov> <1199391609.7561.46.camel@lap75545.ornl.gov> Message-ID: > @@ -423,8 +423,8 @@ static void srp_remove_work(struct work_struct *work) > list_del(&target->list); > spin_unlock(&target->srp_host->target_lock); > > - srp_remove_host(target->scsi_host); > scsi_remove_host(target->scsi_host); > + srp_remove_host(target->scsi_host); Thanks... 
I just confirmed (by crashing my system) that either this change or the fix to the srp transport class is needed too. I think we have time before 2.6.24 final to get the right fix in, so I'll wait until tomorrow before asking Linus to pull this. - R. From rdreier at cisco.com Thu Jan 3 13:52:00 2008 From: rdreier at cisco.com (Roland Dreier) Date: Thu, 03 Jan 2008 13:52:00 -0800 Subject: [ofa-general] Re: [PATCH] [2.6.25] MAINTAINTERS: update email address In-Reply-To: <000c01c84d7a$1cc23e60$3c98070a@amr.corp.intel.com> (Sean Hefty's message of "Wed, 2 Jan 2008 12:00:24 -0800") References: <000c01c84d7a$1cc23e60$3c98070a@amr.corp.intel.com> Message-ID: thanks... I think I'll sneak this into 2.6.24, since I have another SRP fix still to come. From pradeeps at linux.vnet.ibm.com Thu Jan 3 14:18:20 2008 From: pradeeps at linux.vnet.ibm.com (Pradeep Satyanarayana) Date: Thu, 03 Jan 2008 14:18:20 -0800 Subject: [ofa-general] Re: [PATCH] IPOIB/CM Enable SRQ support on HCAs with less than 16 SG entries In-Reply-To: <476C2B47.5060507@linux.vnet.ibm.com> References: <476C2B47.5060507@linux.vnet.ibm.com> Message-ID: <477D5F2C.2020703@linux.vnet.ibm.com> Roland, When you get a chance can you please look over this patch and let me know if it needs any further updates? I incorporated your earlier suggestions, and would like this one to be merged into for-2.6.25 branch. http://lists.openfabrics.org/pipermail/general/2007-December/044298.html Pradeep From hnguyen at linux.vnet.ibm.com Thu Jan 3 14:43:44 2008 From: hnguyen at linux.vnet.ibm.com (Hoang-Nam Nguyen) Date: Thu, 3 Jan 2008 23:43:44 +0100 Subject: [ofa-general] Re: [PATCH] IB/ehca: Forward event client-reregister-required to registered clients In-Reply-To: <200712201506.34253.hnguyen@linux.vnet.ibm.com> References: <200712201506.34253.hnguyen@linux.vnet.ibm.com> Message-ID: <200801032343.44345.hnguyen@linux.vnet.ibm.com> Hi Roland, Just want to make sure you've seen this patch and if it looks ok for you.
Thanks Nam On Thursday 20 December 2007 15:06, Hoang-Nam Nguyen wrote: > This patch allows ehca to forward event client-reregister-required to > registered clients. Such one event is generated by the switch eg. after > its reboot. > > Signed-off-by: Hoang-Nam Nguyen > --- > drivers/infiniband/hw/ehca/ehca_irq.c | 12 ++++++++++++ > 1 files changed, 12 insertions(+), 0 deletions(-) From riel at redhat.com Thu Jan 3 15:11:27 2008 From: riel at redhat.com (Rik van Riel) Date: Thu, 3 Jan 2008 18:11:27 -0500 Subject: [ofa-general] Re: [GIT PULL] please pull infiniband.git for-linus In-Reply-To: <1199391609.7561.46.camel@lap75545.ornl.gov> References: <1199385590.7561.11.camel@lap75545.ornl.gov> <1199391187.7561.38.camel@lap75545.ornl.gov> <1199391609.7561.46.camel@lap75545.ornl.gov> Message-ID: <20080103181127.054dd917@cuia.boston.redhat.com> On Thu, 03 Jan 2008 15:20:09 -0500 David Dillow wrote: > diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c > index 950228f..6e7e3c8 100644 > --- a/drivers/infiniband/ulp/srp/ib_srp.c > +++ b/drivers/infiniband/ulp/srp/ib_srp.c > @@ -423,8 +423,8 @@ static void srp_remove_work(struct work_struct *work) > list_del(&target->list); > spin_unlock(&target->srp_host->target_lock); > > - srp_remove_host(target->scsi_host); > scsi_remove_host(target->scsi_host); > + srp_remove_host(target->scsi_host); > ib_destroy_cm_id(target->cm_id); > srp_free_target_ib(target); > scsi_host_put(target->scsi_host); These last two look suspicious. Are you freeing target before freeing target->scsi_host or does the code simply not do what it looks like it's doing? 
:) (no, I haven't looked at the IB code - I'm probably wrong) -- All Rights Reversed From dillowda at ornl.gov Thu Jan 3 15:18:34 2008 From: dillowda at ornl.gov (Dave Dillow) Date: Thu, 03 Jan 2008 18:18:34 -0500 Subject: [ofa-general] Re: [GIT PULL] please pull infiniband.git for-linus In-Reply-To: <20080103181127.054dd917@cuia.boston.redhat.com> References: <1199385590.7561.11.camel@lap75545.ornl.gov> <1199391187.7561.38.camel@lap75545.ornl.gov> <1199391609.7561.46.camel@lap75545.ornl.gov> <20080103181127.054dd917@cuia.boston.redhat.com> Message-ID: <1199402314.6854.0.camel@obelisk.thedillows.org> On Thu, 2008-01-03 at 18:11 -0500, Rik van Riel wrote: > On Thu, 03 Jan 2008 15:20:09 -0500 > David Dillow wrote: > > > diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c > > index 950228f..6e7e3c8 100644 > > --- a/drivers/infiniband/ulp/srp/ib_srp.c > > +++ b/drivers/infiniband/ulp/srp/ib_srp.c > > @@ -423,8 +423,8 @@ static void srp_remove_work(struct work_struct *work) > > list_del(&target->list); > > spin_unlock(&target->srp_host->target_lock); > > > > - srp_remove_host(target->scsi_host); > > scsi_remove_host(target->scsi_host); > > + srp_remove_host(target->scsi_host); > > ib_destroy_cm_id(target->cm_id); > > srp_free_target_ib(target); > > scsi_host_put(target->scsi_host); > > These last two look suspicious. Are you freeing target before > freeing target->scsi_host or does the code simply not do what > it looks like it's doing? :) > > (no, I haven't looked at the IB code - I'm probably wrong) srp_free_target_ib() just frees the buffers for the target, and scsi_host_put() does the actual cleanup once the refcount drops to zero. 
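The put-on-last-reference behaviour described in this reply can be sketched in a few lines of standalone C. This is an illustration only: `struct host`, `host_get()` and `host_put()` are hypothetical stand-ins and do not reflect how the SCSI midlayer implements scsi_host_put().

```c
/* Hypothetical stand-in for a reference-counted host object. */
struct host {
    int refcount;
    int released;           /* set to 1 once the release step has run */
};

/* Take an additional reference. */
static void host_get(struct host *h)
{
    h->refcount++;
}

/* Drop one reference; run the release step only on the last put. */
static void host_put(struct host *h)
{
    if (--h->refcount == 0)
        h->released = 1;
}
```

Under this model, freeing per-target buffers before the final put is harmless as long as nothing reachable through the remaining references still uses those buffers.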
Dave From swise at opengridcomputing.com Thu Jan 3 15:42:52 2008 From: swise at opengridcomputing.com (Steve Wise) Date: Thu, 03 Jan 2008 17:42:52 -0600 Subject: [ofa-general] [GIT PULL ofed-1.2.5 / ofed-1.3] - libcxgb3-1.1.2 release Message-ID: <477D72FC.2000800@opengridcomputing.com> Vlad, Please pull version 1.1.2 of libcxgb3 for ofed-1.2.5 and ofed-1.3. This release fixes a segfault that can happen when running rdma apps over chelsio's device on 32b platforms and distros (bug 680). Pull from: git://git.openfabrics.org/~swise/libcxgb3 ofed_1_2_5 and git://git.openfabrics.org/~swise/libcxgb3 ofed_1_3 Thanks, Steve. From sean.hefty at intel.com Thu Jan 3 15:54:23 2008 From: sean.hefty at intel.com (Sean Hefty) Date: Thu, 3 Jan 2008 15:54:23 -0800 Subject: [ofa-general] [PATCH] librdmacm: override default responder_resources with user value Message-ID: <000001c84e63$f71c1240$ff0da8c0@amr.corp.intel.com> By default, the responder_resources parameter is set to that received in a connection request. The passive side may override this value when accepting the connection. Use the value provided by the passive side when transitioning the QP to RTR state, rather than the value given in the connect request. Without this change, the RTR transition may fail if the passive side supports fewer responder_resources than that requested in the request. Signed-off-by: Sean Hefty --- This fixes an issue seen by uDAPL. The kernel rdma_cm will need a similar fix. I will wait a couple of weeks and release yet another version of librdmacm (1.0.6) for OFED 1.3 that includes this fix. 
src/cma.c | 9 ++++++--- 1 files changed, 6 insertions(+), 3 deletions(-) diff --git a/src/cma.c b/src/cma.c index 00ea394..751ca9d 100644 --- a/src/cma.c +++ b/src/cma.c @@ -611,7 +611,8 @@ static int rdma_init_qp_attr(struct rdma_cm_id *id, struct ibv_qp_attr *qp_attr, return 0; } -static int ucma_modify_qp_rtr(struct rdma_cm_id *id) +static int ucma_modify_qp_rtr(struct rdma_cm_id *id, + struct rdma_conn_param *conn_param) { struct ibv_qp_attr qp_attr; int qp_attr_mask, ret; @@ -634,6 +635,8 @@ static int ucma_modify_qp_rtr(struct rdma_cm_id *id) if (ret) return ret; + if (conn_param) + qp_attr.max_dest_rd_atomic = conn_param->responder_resources; return ibv_modify_qp(id->qp, &qp_attr, qp_attr_mask); } @@ -911,7 +914,7 @@ int rdma_accept(struct rdma_cm_id *id, struct rdma_conn_param *conn_param) return ret; if (!ucma_is_ud_ps(id->ps)) { - ret = ucma_modify_qp_rtr(id); + ret = ucma_modify_qp_rtr(id, conn_param); if (ret) return ret; } @@ -1193,7 +1196,7 @@ static int ucma_process_conn_resp(struct cma_id_private *id_priv) void *msg; int ret, size; - ret = ucma_modify_qp_rtr(&id_priv->id); + ret = ucma_modify_qp_rtr(&id_priv->id, NULL); if (ret) goto err; From sean.hefty at intel.com Thu Jan 3 16:34:47 2008 From: sean.hefty at intel.com (Sean Hefty) Date: Thu, 3 Jan 2008 16:34:47 -0800 Subject: [ofa-general] setting iWarp IRD and ORD In-Reply-To: <000001c84e63$f71c1240$ff0da8c0@amr.corp.intel.com> References: <000001c84e63$f71c1240$ff0da8c0@amr.corp.intel.com> Message-ID: <000101c84e69$9c4e2a00$ff0da8c0@amr.corp.intel.com> >-static int ucma_modify_qp_rtr(struct rdma_cm_id *id) >+static int ucma_modify_qp_rtr(struct rdma_cm_id *id, >+ struct rdma_conn_param *conn_param) > { > struct ibv_qp_attr qp_attr; > int qp_attr_mask, ret; >@@ -634,6 +635,8 @@ static int ucma_modify_qp_rtr(struct rdma_cm_id *id) > if (ret) > return ret; > >+ if (conn_param) >+ qp_attr.max_dest_rd_atomic = conn_param->responder_resources; > return ibv_modify_qp(id->qp, &qp_attr, qp_attr_mask); 
Can one of the iWarp providers explain how IRD and ORD get set for both userspace and the kernel? - Sean From swise at opengridcomputing.com Thu Jan 3 16:39:09 2008 From: swise at opengridcomputing.com (Steve Wise) Date: Thu, 03 Jan 2008 18:39:09 -0600 Subject: [ofa-general] Re: setting iWarp IRD and ORD In-Reply-To: <000101c84e69$9c4e2a00$ff0da8c0@amr.corp.intel.com> References: <000001c84e63$f71c1240$ff0da8c0@amr.corp.intel.com> <000101c84e69$9c4e2a00$ff0da8c0@amr.corp.intel.com> Message-ID: <477D802D.2050205@opengridcomputing.com> Sean Hefty wrote: >> -static int ucma_modify_qp_rtr(struct rdma_cm_id *id) >> +static int ucma_modify_qp_rtr(struct rdma_cm_id *id, >> + struct rdma_conn_param *conn_param) >> { >> struct ibv_qp_attr qp_attr; >> int qp_attr_mask, ret; >> @@ -634,6 +635,8 @@ static int ucma_modify_qp_rtr(struct rdma_cm_id *id) >> if (ret) >> return ret; >> >> + if (conn_param) >> + qp_attr.max_dest_rd_atomic = conn_param->responder_resources; >> return ibv_modify_qp(id->qp, &qp_attr, qp_attr_mask); > > Can one of the iWarp providers explain how IRD and ORD get set for both > userspace and the kernel? > They're set by the application via the rdma_conn_param on the active side, and they get set to the device max values by default on the passive side. There is currently no matching of these by the iwcm or any of the driver code: its up to the ULP/application to negotiate this via private data or some other way. Is that what you're asking? At some point IBM posted a patch to add fields to the private data to allow negotiating this as part of iwarp connection setup... Steve. 
From fujita.tomonori at lab.ntt.co.jp Thu Jan 3 16:47:22 2008 From: fujita.tomonori at lab.ntt.co.jp (FUJITA Tomonori) Date: Fri, 04 Jan 2008 09:47:22 +0900 Subject: [ofa-general] Re: list corruption on ib_srp load in v2.6.24-rc5 In-Reply-To: <1199393485.7561.51.camel@lap75545.ornl.gov> References: <20080103173330T.tomof@acm.org> <1199390957.7561.33.camel@lap75545.ornl.gov> <1199393485.7561.51.camel@lap75545.ornl.gov> Message-ID: <20080104094722F.fujita.tomonori@lab.ntt.co.jp> On Thu, 03 Jan 2008 15:51:25 -0500 David Dillow wrote: > > On Thu, 2008-01-03 at 15:09 -0500, David Dillow wrote: > > As for a better fix, I'm not sure. > > Here's a better way than the strncmp. If this meets everyone's approval, > then I can roll up a proper commit. Thanks! I really appreciate it. I think that this is the root problem and the patch fixes it in the right way. Please send this patch to linux-scsi at vger.kernel.org and a patch to move srp_remove_host before scsi_remove_host in srp_remove_one to Roland.
Acked-by: FUJITA Tomonori > diff --git a/drivers/scsi/scsi_transport_srp.c b/drivers/scsi/scsi_transport_srp.c > index 44a340b..65c584d 100644 > --- a/drivers/scsi/scsi_transport_srp.c > +++ b/drivers/scsi/scsi_transport_srp.c > @@ -265,7 +265,8 @@ EXPORT_SYMBOL_GPL(srp_rport_del); > > static int do_srp_rport_del(struct device *dev, void *data) > { > - srp_rport_del(dev_to_rport(dev)); > + if (scsi_is_srp_rport(dev)) > + srp_rport_del(dev_to_rport(dev)); > return 0; > } > > From sean.hefty at intel.com Thu Jan 3 16:50:21 2008 From: sean.hefty at intel.com (Sean Hefty) Date: Thu, 3 Jan 2008 16:50:21 -0800 Subject: [ofa-general] RE: setting iWarp IRD and ORD In-Reply-To: <477D802D.2050205@opengridcomputing.com> References: <000001c84e63$f71c1240$ff0da8c0@amr.corp.intel.com> <000101c84e69$9c4e2a00$ff0da8c0@amr.corp.intel.com> <477D802D.2050205@opengridcomputing.com> Message-ID: <000201c84e6b$c8d77610$ff0da8c0@amr.corp.intel.com> >They're set by the application via the rdma_conn_param on the active >side, and they get set to the device max values by default on the >passive side. There is currently no matching of these by the iwcm or >any of the driver code: its up to the ULP/application to negotiate this >via private data or some other way. > >Is that what you're asking? In part - I don't see that IRD or ORD are set through the modify QP calls in the rdma_cm. I'm guessing that the values are passed to the driver, and are used by the driver when transitioning to RTS? Basically, I'm trying to determine if the fix to the librdmacm affects iWarp at all (I don't believe that it does), and if a similar fix would also work in the kernel. I'm pretty sure that it will for IB, but I'm not sure what the behavior would be for iWarp. 
- Sean From swise at opengridcomputing.com Thu Jan 3 16:52:16 2008 From: swise at opengridcomputing.com (Steve Wise) Date: Thu, 03 Jan 2008 18:52:16 -0600 Subject: [ofa-general] Re: setting iWarp IRD and ORD In-Reply-To: <000201c84e6b$c8d77610$ff0da8c0@amr.corp.intel.com> References: <000001c84e63$f71c1240$ff0da8c0@amr.corp.intel.com> <000101c84e69$9c4e2a00$ff0da8c0@amr.corp.intel.com> <477D802D.2050205@opengridcomputing.com> <000201c84e6b$c8d77610$ff0da8c0@amr.corp.intel.com> Message-ID: <477D8340.4060606@opengridcomputing.com> Sean Hefty wrote: >> They're set by the application via the rdma_conn_param on the active >> side, and they get set to the device max values by default on the >> passive side. There is currently no matching of these by the iwcm or >> any of the driver code: its up to the ULP/application to negotiate this >> via private data or some other way. >> >> Is that what you're asking? > > In part - I don't see that IRD or ORD are set through the modify QP calls in the > rdma_cm. I'm guessing that the values are passed to the driver, and are used by > the driver when transitioning to RTS? > > Basically, I'm trying to determine if the fix to the librdmacm affects iWarp at > all (I don't believe that it does), and if a similar fix would also work in the > kernel. I'm pretty sure that it will for IB, but I'm not sure what the behavior > would be for iWarp. > > - Sean What exactly are you fixing? 
From sean.hefty at intel.com Thu Jan 3 17:04:26 2008 From: sean.hefty at intel.com (Sean Hefty) Date: Thu, 3 Jan 2008 17:04:26 -0800 Subject: [ofa-general] RE: setting iWarp IRD and ORD In-Reply-To: <477D8340.4060606@opengridcomputing.com> References: <000001c84e63$f71c1240$ff0da8c0@amr.corp.intel.com> <000101c84e69$9c4e2a00$ff0da8c0@amr.corp.intel.com> <477D802D.2050205@opengridcomputing.com> <000201c84e6b$c8d77610$ff0da8c0@amr.corp.intel.com> <477D8340.4060606@opengridcomputing.com> Message-ID: <000301c84e6d$c0904020$ff0da8c0@amr.corp.intel.com> >What exactly are you fixing? It may help to look at the patch for the librdmacm that I just posted. The problem that was seen was that uDAPL set IRD/ORD to 16 in the connection request, but the passive side could only support 4. The rdma_cm was supposed to use the lower value (provided by the user when calling rdma_accept) when transitioning the QP, but instead used the value from the request. Since iWarp connect request doesn't carry the IRD/ORD and always uses the value provided by the user through rdma_connect or rdma_accept, it doesn't sound like it would hit this problem. I just don't quite follow why iwcm_init_qp_rts_attr() doesn't set the QP attribute mask for IRD/ORD, or how the QP is programmed with these values. 
- Sean From kliteyn at mellanox.co.il Thu Jan 3 17:37:34 2008 From: kliteyn at mellanox.co.il (kliteyn at mellanox.co.il) Date: 4 Jan 2008 03:37:34 +0200 Subject: [ofa-general] nightly osm_sim report 2008-01-04:normal completion Message-ID: OSM Simulation Regression Summary [Generated mail - please do NOT reply] OpenSM binary date = 2008-01-03 OpenSM git rev = Thu_Jan_3_04:20:57_2008 [d1470d92223f94fb3f60d5a6f549ed91e1d9d627] ibutils git rev = Mon_Dec_24_10:42:01_2007 [675bec82306d6920555dd0b5e2f664983e27e60f] Total=400 Pass=398 Fail=2 Pass: 30 Stability IS1-16.topo 30 Pkey IS1-16.topo 30 OsmTest IS1-16.topo 30 OsmStress IS1-16.topo 30 Multicast IS1-16.topo 30 LidMgr IS1-16.topo 10 Stability IS3-128.topo 10 Pkey IS3-128.topo 10 OsmTest IS3-loop.topo 10 OsmTest IS3-128.topo 10 OsmStress IS3-128.topo 10 Multicast IS3-loop.topo 10 Multicast IS3-128.topo 10 FatTree merge-roots-4-ary-2-tree.topo 10 FatTree merge-root-4-ary-3-tree.topo 10 FatTree gnu-stallion-64.topo 10 FatTree blend-4-ary-2-tree.topo 10 FatTree RhinoDDR.topo 10 FatTree FullGnu.topo 10 FatTree 4-ary-2-tree.topo 10 FatTree 2-ary-4-tree.topo 10 FatTree 12-node-spaced.topo 10 FTreeFail 4-ary-2-tree-missing-sw-link.topo 10 FTreeFail 4-ary-2-tree-links-at-same-rank-2.topo 10 FTreeFail 4-ary-2-tree-links-at-same-rank-1.topo 10 FTreeFail 4-ary-2-tree-diff-num-pgroups.topo 9 Stability IS3-loop.topo 9 LidMgr IS3-128.topo Failures: 1 Stability IS3-loop.topo 1 LidMgr IS3-128.topo From dillowda at ornl.gov Thu Jan 3 18:34:49 2008 From: dillowda at ornl.gov (Dave Dillow) Date: Thu, 03 Jan 2008 21:34:49 -0500 Subject: [ofa-general] [2.6.24-rc BUGFIX] SRP transport: only remove our own entries In-Reply-To: <20080104094722F.fujita.tomonori@lab.ntt.co.jp> References: <20080103173330T.tomof@acm.org> <1199390957.7561.33.camel@lap75545.ornl.gov> <1199393485.7561.51.camel@lap75545.ornl.gov> <20080104094722F.fujita.tomonori@lab.ntt.co.jp> Message-ID: <1199414089.3636.7.camel@obelisk.thedillows.org> The SCSI SRP transport 
class currently iterates over all children devices of the host that is being removed in srp_remove_host(). However, not all of those children were created by the SRP transport, and removing them will cause corruption and an oops when their creator tries to remove them. Signed-off-by: David Dillow Acked-by: FUJITA Tomonori --- On Fri, 2008-01-04 at 09:47 +0900, FUJITA Tomonori wrote: > On Thu, 03 Jan 2008 15:51:25 -0500 > I think that this is the root problem and the patch fixes it in the > right way. Please send this patch to linux-scsi at vger.kernel.org and a > patch to move srp_remove_host before scsi_remove_host in > srp_remove_one to Roland. > > Acked-by: FUJITA Tomonori diff --git a/drivers/scsi/scsi_transport_srp.c b/drivers/scsi/scsi_transport_srp.c index 44a340b..65c584d 100644 --- a/drivers/scsi/scsi_transport_srp.c +++ b/drivers/scsi/scsi_transport_srp.c @@ -265,7 +265,8 @@ EXPORT_SYMBOL_GPL(srp_rport_del); static int do_srp_rport_del(struct device *dev, void *data) { - srp_rport_del(dev_to_rport(dev)); + if (scsi_is_srp_rport(dev)) + srp_rport_del(dev_to_rport(dev)); return 0; } From dillowda at ornl.gov Thu Jan 3 18:39:19 2008 From: dillowda at ornl.gov (Dave Dillow) Date: Thu, 03 Jan 2008 21:39:19 -0500 Subject: [ofa-general] [2.6.24-rc BUGFIX] IB/srp: release transport when removing host In-Reply-To: <20080104094722F.fujita.tomonori@lab.ntt.co.jp> References: <20080103173330T.tomof@acm.org> <1199390957.7561.33.camel@lap75545.ornl.gov> <1199393485.7561.51.camel@lap75545.ornl.gov> <20080104094722F.fujita.tomonori@lab.ntt.co.jp> Message-ID: <1199414359.3636.13.camel@obelisk.thedillows.org> When removing the ib_srp module, srp_remove_one() does not release the SRP transport class when it is releasing the SCSI host. This leads to dangling references to kfree()'d memory, and an eventual oops. 
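The ownership check added to do_srp_rport_del() in this thread can be illustrated with a small userspace sketch. This is only an assumption-laden model, not kernel code: struct device, enum dev_type, is_srp_rport(), for_each_child() and remove_own_children() are hypothetical stand-ins for the real device model, scsi_is_srp_rport() and device_for_each_child().

```c
#include <stddef.h>

/* Hypothetical device kinds; only DEV_SRP_RPORT children belong to the transport. */
enum dev_type { DEV_SCSI_TARGET, DEV_SRP_RPORT };

struct device {
    enum dev_type type;
    int removed;            /* set to 1 once the transport has torn it down */
};

/* Stand-in for a scsi_is_srp_rport()-style ownership test. */
static int is_srp_rport(const struct device *dev)
{
    return dev->type == DEV_SRP_RPORT;
}

/* Callback run for every child; skips devices created by someone else. */
static int do_rport_del(struct device *dev, void *data)
{
    int *count = data;

    if (is_srp_rport(dev)) {
        dev->removed = 1;
        (*count)++;
    }
    return 0;
}

/* Stand-in for a device_for_each_child()-style walk; stops on non-zero. */
static int for_each_child(struct device *children, size_t n, void *data,
                          int (*fn)(struct device *, void *))
{
    for (size_t i = 0; i < n; i++) {
        int ret = fn(&children[i], data);
        if (ret)
            return ret;
    }
    return 0;
}

/* Removes only the transport's own children; returns how many were removed. */
static int remove_own_children(struct device *children, size_t n)
{
    int count = 0;

    for_each_child(children, n, &count, do_rport_del);
    return count;
}
```

Without the is_srp_rport() test the callback would also mark children it never created, which is the double-removal situation described in the patch.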
Signed-off-by: David Dillow --- On Fri, 2008-01-04 at 09:47 +0900, FUJITA Tomonori wrote: > I think that this is the root problem and the patch fixes it in the > right way. Please send this patch to linux-scsi at vger.kernel.org and a > patch to move srp_remove_host before scsi_remove_host in > srp_remove_one to Roland. > > Acked-by: FUJITA Tomonori Not sure if your Acked-by was for this one as well, so I left it off. diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c index 950228f..bdb6f85 100644 --- a/drivers/infiniband/ulp/srp/ib_srp.c +++ b/drivers/infiniband/ulp/srp/ib_srp.c @@ -2053,6 +2053,7 @@ static void srp_remove_one(struct ib_device *device) list_for_each_entry_safe(target, tmp_target, &host->target_list, list) { + srp_remove_host(target->scsi_host); scsi_remove_host(target->scsi_host); srp_disconnect_target(target); ib_destroy_cm_id(target->cm_id); From swise at opengridcomputing.com Thu Jan 3 18:41:34 2008 From: swise at opengridcomputing.com (Steve Wise) Date: Thu, 03 Jan 2008 20:41:34 -0600 Subject: [ofa-general] Re: setting iWarp IRD and ORD In-Reply-To: <000301c84e6d$c0904020$ff0da8c0@amr.corp.intel.com> References: <000001c84e63$f71c1240$ff0da8c0@amr.corp.intel.com> <000101c84e69$9c4e2a00$ff0da8c0@amr.corp.intel.com> <477D802D.2050205@opengridcomputing.com> <000201c84e6b$c8d77610$ff0da8c0@amr.corp.intel.com> <477D8340.4060606@opengridcomputing.com> <000301c84e6d$c0904020$ff0da8c0@amr.corp.intel.com> Message-ID: <477D9CDE.5010704@opengridcomputing.com> An HTML attachment was scrubbed... 
URL: From fujita.tomonori at lab.ntt.co.jp Thu Jan 3 18:54:17 2008 From: fujita.tomonori at lab.ntt.co.jp (FUJITA Tomonori) Date: Fri, 04 Jan 2008 11:54:17 +0900 Subject: [ofa-general] Re: [2.6.24-rc BUGFIX] SRP transport: only remove our own entries In-Reply-To: <1199414089.3636.7.camel@obelisk.thedillows.org> References: <1199393485.7561.51.camel@lap75545.ornl.gov> <20080104094722F.fujita.tomonori@lab.ntt.co.jp> <1199414089.3636.7.camel@obelisk.thedillows.org> Message-ID: <20080104115417Q.fujita.tomonori@lab.ntt.co.jp> On Thu, 03 Jan 2008 21:34:49 -0500 Dave Dillow wrote: > The SCSI SRP transport class currently iterates over all children > devices of the host that is being removed in srp_remove_host(). However, > not all of those children were created by the SRP transport, and > removing them will cause corruption and an oops when their creator tries > to remove them. > > Signed-off-by: David Dillow > Acked-by: FUJITA Tomonori > --- Thanks! James, please put this patch into scsi-rc-fixes. > On Fri, 2008-01-04 at 09:47 +0900, FUJITA Tomonori wrote: > > On Thu, 03 Jan 2008 15:51:25 -0500 > > I think that this is the root problem and the patch fixes it in the > > right way. Please send this patch to linux-scsi at vger.kernel.org and a > > patch to move srp_remove_host before scsi_remove_host in > > srp_remove_one to Roland. 
> > > > Acked-by: FUJITA Tomonori > > diff --git a/drivers/scsi/scsi_transport_srp.c b/drivers/scsi/scsi_transport_srp.c > index 44a340b..65c584d 100644 > --- a/drivers/scsi/scsi_transport_srp.c > +++ b/drivers/scsi/scsi_transport_srp.c > @@ -265,7 +265,8 @@ EXPORT_SYMBOL_GPL(srp_rport_del); > > static int do_srp_rport_del(struct device *dev, void *data) > { > - srp_rport_del(dev_to_rport(dev)); > + if (scsi_is_srp_rport(dev)) > + srp_rport_del(dev_to_rport(dev)); > return 0; > } > > > - > To unsubscribe from this list: send the line "unsubscribe linux-scsi" in > the body of a message to majordomo at vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html From fujita.tomonori at lab.ntt.co.jp Thu Jan 3 18:54:27 2008 From: fujita.tomonori at lab.ntt.co.jp (FUJITA Tomonori) Date: Fri, 04 Jan 2008 11:54:27 +0900 Subject: [ofa-general] Re: [2.6.24-rc BUGFIX] IB/srp: release transport when removing host In-Reply-To: <1199414359.3636.13.camel@obelisk.thedillows.org> References: <1199393485.7561.51.camel@lap75545.ornl.gov> <20080104094722F.fujita.tomonori@lab.ntt.co.jp> <1199414359.3636.13.camel@obelisk.thedillows.org> Message-ID: <20080104115427F.fujita.tomonori@lab.ntt.co.jp> On Thu, 03 Jan 2008 21:39:19 -0500 Dave Dillow wrote: > When removing the ib_srp module, srp_remove_one() does not release the > SRP transport class when it is releasing the SCSI host. This leads to > dangling references to kfree()'d memory, and an eventual oops. > > Signed-off-by: David Dillow Thanks again! Linus has already merged your previous patch: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=b0e47c8b79154772a436f25bf7646733e1d6194c So please resend a patch to move srp_remove_host before scsi_remove_host instead of adding srp_remove_host. > --- > On Fri, 2008-01-04 at 09:47 +0900, FUJITA Tomonori wrote: > > I think that this is the root problem and the patch fixes it in the > > right way. 
Please send this patch to linux-scsi at vger.kernel.org and a > > patch to move srp_remove_host before scsi_remove_host in > > srp_remove_one to Roland. > > > > Acked-by: FUJITA Tomonori > > Not sure if your Acked-by was for this one as well, so I left it off. Acked-by: FUJITA Tomonori > diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c > index 950228f..bdb6f85 100644 > --- a/drivers/infiniband/ulp/srp/ib_srp.c > +++ b/drivers/infiniband/ulp/srp/ib_srp.c > @@ -2053,6 +2053,7 @@ static void srp_remove_one(struct ib_device *device) > > list_for_each_entry_safe(target, tmp_target, > &host->target_list, list) { > + srp_remove_host(target->scsi_host); > scsi_remove_host(target->scsi_host); > srp_disconnect_target(target); > ib_destroy_cm_id(target->cm_id); > From rdreier at cisco.com Thu Jan 3 19:14:15 2008 From: rdreier at cisco.com (Roland Dreier) Date: Thu, 03 Jan 2008 19:14:15 -0800 Subject: [ofa-general] Re: [PATCH] libmlx4: typo in mlx4_poll_one() In-Reply-To: <200712241354.01590.jackm@dev.mellanox.co.il> (Jack Morgenstein's message of "Mon, 24 Dec 2007 13:54:01 +0200") References: <200712241354.01590.jackm@dev.mellanox.co.il> Message-ID: thanks, applied. From dillowda at ornl.gov Thu Jan 3 19:35:41 2008 From: dillowda at ornl.gov (Dave Dillow) Date: Thu, 03 Jan 2008 22:35:41 -0500 Subject: [ofa-general] [2.6.24-rc minor bugfix] IB/srp: release transport before removing host In-Reply-To: <20080104115427F.fujita.tomonori@lab.ntt.co.jp> References: <1199393485.7561.51.camel@lap75545.ornl.gov> <20080104094722F.fujita.tomonori@lab.ntt.co.jp> <1199414359.3636.13.camel@obelisk.thedillows.org> <20080104115427F.fujita.tomonori@lab.ntt.co.jp> Message-ID: <1199417741.3636.18.camel@obelisk.thedillows.org> The documented call sequence for removing a host is to call the transport xxx_remove_host() prior to scsi_remove_host(). The SRP transport used to crash when that order was followed, but as it is now fixed, use the documented order. 
Signed-off-by: David Dillow Acked-by: FUJITA Tomonori --- On Fri, 2008-01-04 at 11:54 +0900, FUJITA Tomonori wrote: > Linus has already merged your previous patch: [snip] > So please resend a patch to move srp_remove_host before > scsi_remove_host instead of adding srp_remove_host. [snip] > > Not sure if your Acked-by was for this one as well, so I left it off. > > Acked-by: FUJITA Tomonori diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c index 77e8b90..bdb6f85 100644 --- a/drivers/infiniband/ulp/srp/ib_srp.c +++ b/drivers/infiniband/ulp/srp/ib_srp.c @@ -2053,8 +2053,8 @@ static void srp_remove_one(struct ib_device *device) list_for_each_entry_safe(target, tmp_target, &host->target_list, list) { - scsi_remove_host(target->scsi_host); srp_remove_host(target->scsi_host); + scsi_remove_host(target->scsi_host); srp_disconnect_target(target); ib_destroy_cm_id(target->cm_id); srp_free_target_ib(target); From rdreier at cisco.com Thu Jan 3 19:39:46 2008 From: rdreier at cisco.com (Roland Dreier) Date: Thu, 03 Jan 2008 19:39:46 -0800 Subject: [ofa-general] Re: [2.6 patch] mthca: the scheduled MSI support removal In-Reply-To: <20080101134710.GH2360@does.not.exist> (Adrian Bunk's message of "Tue, 1 Jan 2008 15:47:10 +0200") References: <20080101134710.GH2360@does.not.exist> Message-ID: thanks for keeping on top of the schedule, applied for 2.6.25. From rdreier at cisco.com Thu Jan 3 19:51:28 2008 From: rdreier at cisco.com (Roland Dreier) Date: Thu, 03 Jan 2008 19:51:28 -0800 Subject: [ofa-general] [PATCH] IB/mlx4: Micro-optimize mlx4_ib_poll_one() Message-ID: Rather than byte-swapping cqe->g_mlpath_rqpn each time we extract a field from it, byte-swap it once into a temporary variable. This results in smaller, better code -- eg, on 32-bit x86: add/remove: 0/0 grow/shrink: 0/1 up/down: 0/-5 (-5) function old new delta mlx4_ib_poll_cq 1188 1183 -5 Signed-off-by: Roland Dreier --- I've queued this for 2.6.25... 
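The idea behind this change, converting the big-endian word once and then masking the host-order copy, can be sketched in userspace C. This is a rough illustration under stated assumptions: struct wc_fields, WC_GRH_FLAG and decode_g_mlpath_rqpn() are made-up names, and ntohl() stands in for the kernel's be32_to_cpu(). The bit layout (24-bit source QP, 7 path bits, GRH flag in bit 31) is taken from the masks in the patch.

```c
#include <stdint.h>
#include <arpa/inet.h>   /* ntohl(): big-endian (network order) to host order */

/* Hypothetical flag value for illustration; not the kernel's IB_WC_GRH. */
#define WC_GRH_FLAG 0x1u

struct wc_fields {
    uint32_t src_qp;          /* low 24 bits */
    uint8_t  dlid_path_bits;  /* bits 24..30 */
    uint32_t flags;           /* WC_GRH_FLAG if bit 31 is set */
};

/* Convert the big-endian word once, then extract every field from the copy. */
static struct wc_fields decode_g_mlpath_rqpn(uint32_t be_word)
{
    uint32_t w = ntohl(be_word);
    struct wc_fields f;

    f.src_qp = w & 0xffffff;
    f.dlid_path_bits = (w >> 24) & 0x7f;
    f.flags = (w & 0x80000000u) ? WC_GRH_FLAG : 0;
    return f;
}
```

Keeping the converted value in a local avoids repeating the swap for each field, which is where the code-size saving quoted above comes from.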
drivers/infiniband/hw/mlx4/cq.c | 9 +++++---- 1 files changed, 5 insertions(+), 4 deletions(-) diff --git a/drivers/infiniband/hw/mlx4/cq.c b/drivers/infiniband/hw/mlx4/cq.c index 8bf44da..24d9475 100644 --- a/drivers/infiniband/hw/mlx4/cq.c +++ b/drivers/infiniband/hw/mlx4/cq.c @@ -313,6 +313,7 @@ static int mlx4_ib_poll_one(struct mlx4_ib_cq *cq, struct mlx4_ib_srq *srq; int is_send; int is_error; + u32 g_mlpath_rqpn; u16 wqe_ctr; cqe = next_cqe_sw(cq); @@ -426,10 +427,10 @@ static int mlx4_ib_poll_one(struct mlx4_ib_cq *cq, wc->slid = be16_to_cpu(cqe->rlid); wc->sl = cqe->sl >> 4; - wc->src_qp = be32_to_cpu(cqe->g_mlpath_rqpn) & 0xffffff; - wc->dlid_path_bits = (be32_to_cpu(cqe->g_mlpath_rqpn) >> 24) & 0x7f; - wc->wc_flags |= be32_to_cpu(cqe->g_mlpath_rqpn) & 0x80000000 ? - IB_WC_GRH : 0; + g_mlpath_rqpn = be32_to_cpu(cqe->g_mlpath_rqpn); + wc->src_qp = g_mlpath_rqpn & 0xffffff; + wc->dlid_path_bits = (g_mlpath_rqpn >> 24) & 0x7f; + wc->wc_flags |= g_mlpath_rqpn & 0x80000000 ? IB_WC_GRH : 0; wc->pkey_index = be32_to_cpu(cqe->immed_rss_invalid) >> 16; } -- 1.5.3.5 From rajouri.jammu at gmail.com Thu Jan 3 20:20:29 2008 From: rajouri.jammu at gmail.com (Rajouri Jammu) Date: Thu, 3 Jan 2008 20:20:29 -0800 Subject: [ofa-general] wc.opcode valid when completion error Message-ID: <3307cdf90801032020k3b9cc6b6u516dad4bf1d60c2d@mail.gmail.com> I have a basic question: Is the Work Completion Opcode valid when a completion error is returned by ib_poll_cq() ? Thanks much. -------------- next part -------------- An HTML attachment was scrubbed... URL:
From rdreier at cisco.com Thu Jan 3 20:56:22 2008 From: rdreier at cisco.com (Roland Dreier) Date: Thu, 03 Jan 2008 20:56:22 -0800 Subject: [ofa-general] wc.opcode valid when completion error In-Reply-To: <3307cdf90801032020k3b9cc6b6u516dad4bf1d60c2d@mail.gmail.com> (Rajouri Jammu's message of "Thu, 3 Jan 2008 20:20:29 -0800") References: <3307cdf90801032020k3b9cc6b6u516dad4bf1d60c2d@mail.gmail.com> Message-ID: > Is the Work Completion Opcode valid when > a completion error is returned by ib_poll_cq() ? No -- see the description of the "Poll for completion" verb in chapter 11 of the IB spec. The only fields that are always valid are the work request ID and the completion status. - R. From rdreier at cisco.com Thu Jan 3 21:06:41 2008 From: rdreier at cisco.com (Roland Dreier) Date: Thu, 03 Jan 2008 21:06:41 -0800 Subject: [ofa-general] [PATCH] IB/ehca: Forward event client-reregister-required to registered clients In-Reply-To: <200712201506.34253.hnguyen@linux.vnet.ibm.com> (Hoang-Nam Nguyen's message of "Thu, 20 Dec 2007 15:06:33 +0100") References: <200712201506.34253.hnguyen@linux.vnet.ibm.com> Message-ID: thanks, applied.
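The answer above on the wc.opcode question (only the work request ID and the completion status are always valid) suggests a handling pattern along the following lines. This is a minimal sketch with a simplified, hypothetical completion structure; struct work_completion, the enums and account_completion() are illustrative names and not the real struct ib_wc or verbs API.

```c
#include <stdint.h>

/* Hypothetical, simplified work completion; not the real struct ib_wc. */
enum wc_status { WC_SUCCESS = 0, WC_FLUSH_ERR = 1 };
enum wc_opcode { WC_SEND = 0, WC_RECV = 1 };

struct work_completion {
    uint64_t wr_id;          /* always valid */
    enum wc_status status;   /* always valid */
    enum wc_opcode opcode;   /* only meaningful when status is WC_SUCCESS */
};

struct wc_counts {
    unsigned sends;
    unsigned recvs;
    unsigned errors;
    uint64_t last_failed_wr_id;
};

/* Look at status first; on error use only wr_id and never read opcode. */
static void account_completion(const struct work_completion *wc,
                               struct wc_counts *c)
{
    if (wc->status != WC_SUCCESS) {
        c->errors++;
        c->last_failed_wr_id = wc->wr_id;
        return;
    }
    if (wc->opcode == WC_SEND)
        c->sends++;
    else
        c->recvs++;
}
```

An application that needs to know which kind of operation failed would typically look that up from its own bookkeeping keyed by wr_id, rather than from the opcode field of the failed completion.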
From rdreier at cisco.com Thu Jan 3 21:10:57 2008 From: rdreier at cisco.com (Roland Dreier) Date: Thu, 03 Jan 2008 21:10:57 -0800 Subject: [ofa-general] Bogus Receive Completions In-Reply-To: (Roland Dreier's message of "Tue, 04 Dec 2007 22:20:19 -0800") References: <475607AA.301@dls.net> Message-ID: > I added a little debugging patch to src/cq.c in libmthca, and I found > that when the failure happened, the CQE had a WQE address that was out > of sequence -- the RQ has size 0x200 with 0x20 byte WQEs, and the CQEs > had WQE address 0x100 then WQE address 0x0; or address 0x0 then 0x140; > or even 0x80 twice in a row. > > Mellanox: can you take this test case and see if it is indeed a > firmware issue? I could believe that there is a bug in libmthca's > mthca_tavor_post_recv() function too... Hi Tziporet -- any update about this issue (bad WQE address in CQE on non-mem-free HCAs)? Thanks, Roland
From vlad at lists.openfabrics.org Fri Jan 4 03:05:10 2008 From: vlad at lists.openfabrics.org (Vladimir Sokolovsky Mellanox) Date: Fri, 4 Jan 2008 03:05:10 -0800 (PST) Subject: [ofa-general] ofa_1_3_kernel 20080104-0200 daily build status Message-ID: <20080104110510.E0FB5E6020A@openfabrics.org> This email was generated automatically, please do not reply git_url: git://git.openfabrics.org/ofed_1_3/linux-2.6.git git_branch: ofed_kernel Common build parameters: --with-ipoib-mod --with-sdp-mod --with-srp-mod --with-user_mad-mod --with-user_access-mod --with-mthca-mod --with-mlx4-mod --with-core-mod --with-addr_trans-mod --with-rds-mod --with-cxgb3-mod --with-nes-mod Passed: Passed on i686 with 2.6.15-23-server Passed on i686 with linux-2.6.22 Passed on i686 with linux-2.6.18 Passed on i686 with linux-2.6.21.1 Passed on i686 with linux-2.6.19 Passed on i686 with linux-2.6.16 Passed on i686 with linux-2.6.17 Passed on i686 with linux-2.6.15 Passed on i686 with linux-2.6.14 Passed on i686 with linux-2.6.13 Passed on i686 with linux-2.6.12 Passed on x86_64 with linux-2.6.20 Passed on ppc64 with linux-2.6.18 Passed on x86_64 with linux-2.6.18 Passed on x86_64 with linux-2.6.16 Passed on ppc64 with linux-2.6.15 Passed on powerpc with linux-2.6.13 Passed on powerpc with linux-2.6.15 Passed on x86_64 with linux-2.6.19 Passed on ia64 with linux-2.6.17 Passed on ia64 with linux-2.6.19 Passed on ia64 with linux-2.6.14 Passed on ppc64 with linux-2.6.19 Passed on ia64 with linux-2.6.18 Passed on powerpc with linux-2.6.12 Passed on ppc64 with linux-2.6.18-8.el5 Passed on ppc64 with linux-2.6.12 Passed on ppc64 with linux-2.6.16 Passed on x86_64 with linux-2.6.13 Passed on ppc64 with linux-2.6.17 Passed on x86_64 with linux-2.6.17 Passed on ppc64 with linux-2.6.14 Passed on powerpc with linux-2.6.14 Passed on ia64 with linux-2.6.12 Passed on ia64 with linux-2.6.13 Passed on ia64 with linux-2.6.15 Passed on x86_64 with linux-2.6.16.43-0.3-smp Passed on ia64 with linux-2.6.16 Passed 
on x86_64 with linux-2.6.12 Passed on x86_64 with linux-2.6.14 Passed on x86_64 with linux-2.6.9-42.ELsmp Passed on x86_64 with linux-2.6.22 Passed on x86_64 with linux-2.6.9-55.ELsmp Passed on x86_64 with linux-2.6.15 Passed on ia64 with linux-2.6.22 Passed on x86_64 with linux-2.6.22.5-31-default Passed on ia64 with linux-2.6.23 Passed on x86_64 with linux-2.6.21.1 Passed on ppc64 with linux-2.6.13 Passed on ia64 with linux-2.6.21.1 Passed on x86_64 with linux-2.6.16.21-0.8-smp Passed on x86_64 with linux-2.6.18-1.2798.fc6 Passed on ia64 with linux-2.6.16.21-0.8-default Passed on x86_64 with linux-2.6.18-53.el5 Passed on x86_64 with linux-2.6.18-8.el5 Failed:
From Arkady.Kanevsky at netapp.com Fri Jan 4 06:09:17 2008 From: Arkady.Kanevsky at netapp.com (Kanevsky, Arkady) Date: Fri, 4 Jan 2008 09:09:17 -0500 Subject: [ofa-general] Re: setting iWarp IRD and ORD In-Reply-To: <477D9CDE.5010704@opengridcomputing.com> References: <000001c84e63$f71c1240$ff0da8c0@amr.corp.intel.com><000101c84e69$9c4e2a00$ff0da8c0@amr.corp.intel.com><477D802D.2050205@opengridcomputing.com><000201c84e6b$c8d77610$ff0da8c0@amr.corp.intel.com><477D8340.4060606@opengridcomputing.com><000301c84e6d$c0904020$ff0da8c0@amr.corp.intel.com> <477D9CDE.5010704@opengridcomputing.com> Message-ID: And what happens when RNIC on two sides have a different upper limits? Specifically, if requestor asks for ORD which is bigger than responder can handle? Is it user responsibility to pass ORD request out of bound to responder and if responder can not satisfy it then reject the request? Thanks, Arkady Kanevsky email: arkady at netapp.com Network Appliance Inc. phone: 781-768-5395 1601 Trapelo Rd. - Suite 16. Fax: 781-895-1195 Waltham, MA 02451 central phone: 781-768-5300 ________________________________ From: Steve Wise [mailto:swise at opengridcomputing.com] Sent: Thursday, January 03, 2008 9:42 PM To: Sean Hefty Cc: OpenFabrics General Subject: [ofa-general] Re: setting iWarp IRD and ORD Sean Hefty wrote: What exactly are you fixing? It may help to look at the patch for the librdmacm that I just posted. The problem that was seen was that uDAPL set IRD/ORD to 16 in the connection request, but the passive side could only support 4. The rdma_cm was supposed to use the lower value (provided by the user when calling rdma_accept) when transitioning the QP, but instead used the value from the request.
Since iWarp connect request doesn't carry the IRD/ORD and always uses the value provided by the user through rdma_connect or rdma_accept, it doesn't sound like it would hit this problem. I just don't quite follow why iwcm_init_qp_rts_attr() doesn't set the QP attribute mask for IRD/ORD, or how the QP is programmed with these values. Perhaps this is a bug. But the chelsio driver just saves off the ord/ird from the connection parameters and then programs the qp with these values when the qp is associated with the connection (just after the connection goes into rdma mode)... - Sean -------------- next part -------------- An HTML attachment was scrubbed... URL:
From swise at opengridcomputing.com Fri Jan 4 08:16:32 2008 From: swise at opengridcomputing.com (Steve Wise) Date: Fri, 04 Jan 2008 10:16:32 -0600 Subject: [ofa-general] Re: setting iWarp IRD and ORD In-Reply-To: References: <000001c84e63$f71c1240$ff0da8c0@amr.corp.intel.com><000101c84e69$9c4e2a00$ff0da8c0@amr.corp.intel.com><477D802D.2050205@opengridcomputing.com><000201c84e6b$c8d77610$ff0da8c0@amr.corp.intel.com><477D8340.4060606@opengridcomputing.com><000301c84e6d$c0904020$ff0da8c0@amr.corp.intel.com> <477D9CDE.5010704@opengridcomputing.com> Message-ID: <477E5BE0.1020106@opengridcomputing.com> Kanevsky, Arkady wrote: > And what happens when RNIC on two sides have a different upper limits? > Specifically, if requestor asks for ORD which is bigger than responder > can handle? > Is it user responsibility to pass ORD request out of bound to responder and > if responder can not satisfy it then reject the request? > Thanks, > Yes. > > Arkady Kanevsky email: arkady at netapp.com > > > Network Appliance Inc. phone: 781-768-5395 > > 1601 Trapelo Rd. - Suite 16. Fax: 781-895-1195 > > Waltham, MA 02451 central phone: 781-768-5300 > > > > ------------------------------------------------------------------------ > *From:* Steve Wise [mailto:swise at opengridcomputing.com] > *Sent:* Thursday, January 03, 2008 9:42 PM > *To:* Sean Hefty > *Cc:* OpenFabrics General > *Subject:* [ofa-general] Re: setting iWarp IRD and ORD > > > > Sean Hefty wrote: >>> What exactly are you fixing? >>> >> >> It may help to look at the patch for the librdmacm that I just posted. The >> problem that was seen was that uDAPL set IRD/ORD to 16 in the connection >> request, but the passive side could only support 4. The rdma_cm was supposed to >> use the lower value (provided by the user when calling rdma_accept) when >> transitioning the QP, but instead used the value from the request.
>> >> Since iWarp connect request doesn't carry the IRD/ORD and always uses the value >> provided by the user through rdma_connect or rdma_accept, it doesn't sound like >> it would hit this problem. I just don't quite follow why >> iwcm_init_qp_rts_attr() doesn't set the QP attribute mask for IRD/ORD, or how >> the QP is programmed with these values. >> >> > Perhaps this is a bug. But the chelsio driver just saves off the > ord/ird from the connection parameters and then programs the qp with > these values when the qp is associated with the connection (just > after the connection goes into rdma mode)... > >> - Sean >> > >
From caitlin.bestler at neterion.com Fri Jan 4 09:40:11 2008 From: caitlin.bestler at neterion.com (Caitlin Bestler) Date: Fri, 4 Jan 2008 09:40:11 -0800 Subject: [ofa-general] Re: setting iWarp IRD and ORD In-Reply-To: <477E5BE0.1020106@opengridcomputing.com> References: <000001c84e63$f71c1240$ff0da8c0@amr.corp.intel.com> <000101c84e69$9c4e2a00$ff0da8c0@amr.corp.intel.com> <477D802D.2050205@opengridcomputing.com> <000201c84e6b$c8d77610$ff0da8c0@amr.corp.intel.com> <477D8340.4060606@opengridcomputing.com> <000301c84e6d$c0904020$ff0da8c0@amr.corp.intel.com> <477D9CDE.5010704@opengridcomputing.com> <477E5BE0.1020106@opengridcomputing.com> Message-ID: <469958e00801040940v69a1cbd7s22677f2b51aca460@mail.gmail.com> On Jan 4, 2008 8:16 AM, Steve Wise wrote: > Kanevsky, Arkady wrote: > > And what happens when RNIC on two sides have a different upper limits? > > Specifically, if requestor asks for ORD which is bigger than responder > > can handle? > > Is it user responsibility to pass ORD request out of bound to responder and > > if responder can not satisfy it then reject the request? > > Thanks, > > > > Yes. > The key phrase here is that a transport neutral uDAPL application currently must communicate the *same* IRD/ORD information both between the ULP peers and from each ULP peer to the transport provider. There are two benefits to this approach: It minimizes the work of the transport provider. It optimizes connection setup for those applications that do not need to dynamically negotiate the IRD/ORD. This is especially true if the out-of-band communication can in fact be a protocol specification rather than a dynamic exchange. There is of course one major drawback to the approach -- it is a complex requirement that is difficult for application designers to remember.
So this is ultimately a question for application developers. Which is better, flexibility or uniformity? If there is a consensus on uniformity, then I believe OFA *could* standardize an IRD/ORD convention within the private data and that such a convention would become an industry-wide standard. IT-API already proposed such an encoding. DAPL never really opposed that encoding, but had previously decided to avoid creating new wire protocols to ensure interoperability with non-DAPL applications. Realistically, OFA applications have virtually no need to interoperate with non-OFA applications. So a decision could be made to standardize IRD/ORD setup such that it could be done automatically by the iWARP CM. But such a decision limits flexibility and steals a few bytes from the available Private Data. So giving the application developers a chance to express any reservations should be the next step. That would include anyone who has a scenario where they need to interoperate with an iWARP application NOT using OFA.
From sean.hefty at intel.com Fri Jan 4 10:47:12 2008 From: sean.hefty at intel.com (Sean Hefty) Date: Fri, 4 Jan 2008 10:47:12 -0800 Subject: [ofa-general] [PATCH] [2.6.25] rdma/cm: override default responder_resources with user value Message-ID: <000101c84f02$37ba1510$a937170a@amr.corp.intel.com> By default, the responder_resources parameter is set to that received in a connection request. The passive side may override this value when accepting the connection. Use the value provided by the passive side when transitioning the QP to RTR state, rather than the value given in the connect request. Without this change, the RTR transition may fail if the passive side supports fewer responder_resources than that in the request. For code consistency and to protect against QP destruction, restructure overriding initiator_depth to match how responder_resources is set. Signed-off-by: Sean Hefty --- This change is based off of problems seen with uDAPL. Although the fix for uDAPL is entirely in userspace (where the modify QP is done), the kernel rdma_cm has a similar problem.
The problem only occurs if the passive side CAs supports a smaller number of outstanding RDMA operations than that requested by the active side. This should be okay to queue for 2.6.25. I don't see anything upstream that is affected by this change. drivers/infiniband/core/cma.c | 39 ++++++++++++++++++--------------------- 1 files changed, 18 insertions(+), 21 deletions(-) diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c index 656d6df..b3917fe 100644 --- a/drivers/infiniband/core/cma.c +++ b/drivers/infiniband/core/cma.c @@ -488,7 +488,8 @@ void rdma_destroy_qp(struct rdma_cm_id *id) } EXPORT_SYMBOL(rdma_destroy_qp); -static int cma_modify_qp_rtr(struct rdma_id_private *id_priv) +static int cma_modify_qp_rtr(struct rdma_id_private *id_priv, + struct rdma_conn_param *conn_param) { struct ib_qp_attr qp_attr; int qp_attr_mask, ret; @@ -514,13 +515,16 @@ static int cma_modify_qp_rtr(struct rdma_id_private *id_priv) if (ret) goto out; + if (conn_param) + qp_attr.max_dest_rd_atomic = conn_param->responder_resources; ret = ib_modify_qp(id_priv->id.qp, &qp_attr, qp_attr_mask); out: mutex_unlock(&id_priv->qp_mutex); return ret; } -static int cma_modify_qp_rts(struct rdma_id_private *id_priv) +static int cma_modify_qp_rts(struct rdma_id_private *id_priv, + struct rdma_conn_param *conn_param) { struct ib_qp_attr qp_attr; int qp_attr_mask, ret; @@ -536,6 +540,8 @@ static int cma_modify_qp_rts(struct rdma_id_private *id_priv) if (ret) goto out; + if (conn_param) + qp_attr.max_rd_atomic = conn_param->initiator_depth; ret = ib_modify_qp(id_priv->id.qp, &qp_attr, qp_attr_mask); out: mutex_unlock(&id_priv->qp_mutex); @@ -866,11 +872,11 @@ static int cma_rep_recv(struct rdma_id_private *id_priv) { int ret; - ret = cma_modify_qp_rtr(id_priv); + ret = cma_modify_qp_rtr(id_priv, NULL); if (ret) goto reject; - ret = cma_modify_qp_rts(id_priv); + ret = cma_modify_qp_rts(id_priv, NULL); if (ret) goto reject; @@ -2274,7 +2280,7 @@ static int cma_connect_iw(struct 
rdma_id_private *id_priv, sin = (struct sockaddr_in*) &id_priv->id.route.addr.dst_addr; cm_id->remote_addr = *sin; - ret = cma_modify_qp_rtr(id_priv); + ret = cma_modify_qp_rtr(id_priv, conn_param); if (ret) goto out; @@ -2340,22 +2346,13 @@ static int cma_accept_ib(struct rdma_id_private *id_priv, struct ib_qp_attr qp_attr; int qp_attr_mask, ret; - if (id_priv->id.qp) { - ret = cma_modify_qp_rtr(id_priv); - if (ret) - goto out; - - qp_attr.qp_state = IB_QPS_RTS; - ret = ib_cm_init_qp_attr(id_priv->cm_id.ib, &qp_attr, - &qp_attr_mask); - if (ret) - goto out; + ret = cma_modify_qp_rtr(id_priv, conn_param); + if (ret) + goto out; - qp_attr.max_rd_atomic = conn_param->initiator_depth; - ret = ib_modify_qp(id_priv->id.qp, &qp_attr, qp_attr_mask); - if (ret) - goto out; - } + ret = cma_modify_qp_rts(id_priv, conn_param); + if (ret) + goto out; memset(&rep, 0, sizeof rep); rep.qp_num = id_priv->qp_num; @@ -2380,7 +2377,7 @@ static int cma_accept_iw(struct rdma_id_private *id_priv, struct iw_cm_conn_param iw_param; int ret; - ret = cma_modify_qp_rtr(id_priv); + ret = cma_modify_qp_rtr(id_priv, conn_param); if (ret) return ret;
From YJia at tmriusa.com Fri Jan 4 12:43:57 2008 From: YJia at tmriusa.com (Yicheng Jia) Date: Fri, 4 Jan 2008 14:43:57 -0600 Subject: [ofa-general] synchronize commands issued to MTHCA In-Reply-To: Message-ID: > The mmiowb() is definitely necessary, because without it then commands > were getting messed up on large Altix systems. I'm using Duo-core Xeon and I just grep the source of "mmiowb()" in kernel 2.6.23 include/asm-x86_64 /io.h and found that this function does nothing on x86_64 platform, is it true? Thanks! Yicheng Roland Dreier 01/02/2008 02:52 PM To Yicheng Jia cc general at lists.openfabrics.org, Jack Morgenstein Subject Re: [ofa-general] synchronize commands issued to MTHCA > Could you tell me what's the difference between "wmb()" and "mmiowb()". I > notice that ofa-1.3 has added "mmiowb()" at the end of mthca_cmd_post, > since "wmb()" is already called at the end of cmd_post, is "mmiowb()" > really necessary? wmb() orders writes from the same CPU -- it prevents highly out-of-order architectures from making writes visible in an order different from program order.
mmiowb() orders MMIO writes between different CPUs, and prevents systems (such as SGI Altix) where the CPU fabric may reorder writes before they reach the IO bus. The mmiowb() is definitely necessary, because without it then commands were getting messed up on large Altix systems. - R. _____________________________________________________________________________ Scanned by IBM Email Security Management Services powered by MessageLabs. For more information please visit http://www.ers.ibm.com _____________________________________________________________________________ -------------- next part -------------- An HTML attachment was scrubbed... URL: From rdreier at cisco.com Fri Jan 4 12:47:19 2008 From: rdreier at cisco.com (Roland Dreier) Date: Fri, 04 Jan 2008 12:47:19 -0800 Subject: [ofa-general] synchronize commands issued to MTHCA In-Reply-To: (Yicheng Jia's message of "Fri, 4 Jan 2008 14:43:57 -0600") References: Message-ID: > I'm using Duo-core Xeon and I just grep the source of "mmiowb()" in kernel > 2.6.23 include/asm-x86_64 /io.h and found that this function does nothing > on x86_64 platform, is it true? Yes -- this is why I kept referring to large SGI Altix systems. - R. From akepner at sgi.com Fri Jan 4 12:49:48 2008 From: akepner at sgi.com (akepner at sgi.com) Date: Fri, 4 Jan 2008 12:49:48 -0800 Subject: [ofa-general] synchronize commands issued to MTHCA In-Reply-To: References: Message-ID: <20080104204948.GT23661@sgi.com> On Fri, Jan 04, 2008 at 02:43:57PM -0600, Yicheng Jia wrote: > > The mmiowb() is definitely necessary, because without it then commands > > were getting messed up on large Altix systems. > > I'm using Duo-core Xeon and I just grep the source of "mmiowb()" in kernel > 2.6.23 include/asm-x86_64 /io.h and found that this function does nothing > on x86_64 platform, is it true? > Yes. It's a no-op for most architectures. 
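The ordering described in this thread can be sketched in userspace C. This is only an illustration under stated assumptions, not the mthca code: `struct fake_hca`, `fake_hca_init()` and `fake_post_cmd()` are hypothetical names, the "registers" are ordinary volatile fields rather than real MMIO, `wmb()` is replaced by a full C11 fence (stronger than the kernel primitive needs to be), and `mmiowb()` is defined as a no-op, which is what the thread says it is on x86_64.

```c
#include <stdatomic.h>
#include <stdint.h>

/*
 * Userspace stand-ins for the kernel primitives (assumptions of this sketch):
 *  - wmb() becomes a full fence;
 *  - mmiowb() is a no-op, as reported for x86_64 in the thread;
 *  - the "registers" are plain volatile fields, not real MMIO.
 */
#define wmb()    atomic_thread_fence(memory_order_seq_cst)
#define mmiowb() do { } while (0)

struct fake_hca {
    atomic_flag lock;           /* serializes command posting */
    volatile uint32_t param;    /* hypothetical command parameter register */
    volatile uint32_t doorbell; /* hypothetical "go" register */
};

void fake_hca_init(struct fake_hca *hca)
{
    atomic_flag_clear(&hca->lock);
    hca->param = 0;
    hca->doorbell = 0;
}

void fake_post_cmd(struct fake_hca *hca, uint32_t param, uint32_t opcode)
{
    /* Spin until the posting lock is free. */
    while (atomic_flag_test_and_set_explicit(&hca->lock, memory_order_acquire))
        ;

    hca->param = param;
    wmb();      /* same CPU: the parameter is ordered before the doorbell */
    hca->doorbell = opcode;
    mmiowb();   /* across CPUs: this CPU's MMIO writes should reach the
                 * device before those of the next lock holder */

    atomic_flag_clear_explicit(&hca->lock, memory_order_release);
}
```

The point of the placement is that `wmb()` sits between two writes issued by one CPU, while `mmiowb()` sits just before the lock is dropped, so that a second CPU taking the lock cannot have its writes arrive at the device first on fabrics that may reorder them.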
-- Arthur From rdreier at cisco.com Fri Jan 4 12:56:25 2008 From: rdreier at cisco.com (Roland Dreier) Date: Fri, 04 Jan 2008 12:56:25 -0800 Subject: [ofa-general] Re: [PATCH 5/5] IB/ipath - Changes for fields moving from devdata to portdata. In-Reply-To: <20071222222459.17599.72860.stgit@eng-46.internal.keyresearch.com> (Arthur Jones's message of "Sat, 22 Dec 2007 14:25:00 -0800") References: <20071222222434.17599.95501.stgit@eng-46.internal.keyresearch.com> <20071222222459.17599.72860.stgit@eng-46.internal.keyresearch.com> Message-ID: thanks, applied 1-5. BTW, it's very helpful that you're feeding me patches in smaller batches. The only further request I would make is that (if it fits with your workflow) when sending a batch of patches, you update your git tree so that it is based on top of either my tree or Linus's tree. As things are now, your tree has a bunch of extra commits that are not in other trees. - R. From jim at mellanox.com Fri Jan 4 12:59:38 2008 From: jim at mellanox.com (Jim Mott) Date: Fri, 4 Jan 2008 12:59:38 -0800 Subject: [ofa-general] [PATCH 1/1] SDP - Bug294: SDP connect() only allows AF_INET (2), not AF_INET_SDP (27) Message-ID: Signed-off-by: Jim Mott --- Index: ofed_1_3/drivers/infiniband/ulp/sdp/sdp_main.c =================================================================== --- ofed_1_3.orig/drivers/infiniband/ulp/sdp/sdp_main.c 2008-01-04 12:21:51.000000000 -0800 +++ ofed_1_3/drivers/infiniband/ulp/sdp/sdp_main.c 2008-01-04 12:25:06.000000000 -0800 @@ -593,7 +593,7 @@ static int sdp_connect(struct sock *sk, if (addr_len < sizeof(struct sockaddr_in)) return -EINVAL; - if (uaddr->sa_family != AF_INET) + if (uaddr->sa_family != AF_INET && uaddr->sa_family != AF_INET_SDP) return -EAFNOSUPPORT; if (!ssk->id) { From jim at mellanox.com Fri Jan 4 12:59:46 2008 From: jim at mellanox.com (Jim Mott) Date: Fri, 4 Jan 2008 12:59:46 -0800 Subject: [ofa-general] [PATCH 1/1] SDP - Bug829: poll() always returns POLLOUT on non-blocking socket Message-ID: 
Signed-off-by: Jim Mott --- Index: ofa_1_3_dev_kernel/drivers/infiniband/ulp/sdp/sdp_main.c =================================================================== --- ofa_1_3_dev_kernel.orig/drivers/infiniband/ulp/sdp/sdp_main.c 2008-01-04 14:11:55.000000000 -0600 +++ ofa_1_3_dev_kernel/drivers/infiniband/ulp/sdp/sdp_main.c 2008-01-04 14:12:02.000000000 -0600 @@ -2013,17 +2013,27 @@ static inline unsigned int sdp_listen_po static unsigned int sdp_poll(struct file *file, struct socket *socket, struct poll_table_struct *wait) { - int mask; + unsigned int mask; + struct sock *sk = socket->sk; + struct sdp_sock *ssk = sdp_sk(sk); + sdp_dbg_data(socket->sk, "%s\n", __func__); mask = datagram_poll(file, socket, wait); + + /* + * Adjust for memory in later kernels + */ + if (!sk_stream_memory_free(sk) || !slots_free(ssk)) + mask &= ~(POLLOUT | POLLWRNORM | POLLWRBAND); + /* TODO: Slightly ugly: it would be nicer if there was function * like datagram_poll that didn't include poll_wait, * then we could reverse the order. */ - if (socket->sk->sk_state == TCP_LISTEN) - return sdp_listen_poll(socket->sk); + if (sk->sk_state == TCP_LISTEN) + return sdp_listen_poll(sk); - if (sdp_sk(socket->sk)->urg_data & TCP_URG_VALID) + if (ssk->urg_data & TCP_URG_VALID) mask |= POLLPRI; return mask; } From jim at mellanox.com Fri Jan 4 12:59:53 2008 From: jim at mellanox.com (Jim Mott) Date: Fri, 4 Jan 2008 12:59:53 -0800 Subject: [ofa-general] [PATCH 1/1] SDP - Bug837: executing netperf with TCP_CORK enabled never ends Message-ID: Signed-off-by: Jim Mott --- Index: ofa_1_3_dev_kernel/drivers/infiniband/ulp/sdp/sdp_main.c =================================================================== --- ofa_1_3_dev_kernel.orig/drivers/infiniband/ulp/sdp/sdp_main.c 2008-01-04 14:12:02.000000000 -0600 +++ ofa_1_3_dev_kernel/drivers/infiniband/ulp/sdp/sdp_main.c 2008-01-04 14:12:11.000000000 -0600 @@ -880,6 +880,15 @@ static void sdp_shutdown(struct sock *sk else return; + /* + * Just turn off CORK here. 
+ * We could check for socket shutting down in main data path, + * but this costs no extra cycles there. + */ + ssk->nonagle &= ~TCP_NAGLE_CORK; + if (ssk->nonagle & TCP_NAGLE_OFF) + ssk->nonagle |= TCP_NAGLE_PUSH; + sdp_post_sends(ssk, 0); } From rdreier at cisco.com Fri Jan 4 14:05:29 2008 From: rdreier at cisco.com (Roland Dreier) Date: Fri, 04 Jan 2008 14:05:29 -0800 Subject: [ofa-general] [RFC PATCH] IPoIB: improve IPv4/IPv6 to IB mcast mapping functions In-Reply-To: <20071210203841.GJ30090@obsidianresearch.com> (Rolf Manderscheid's message of "Mon, 10 Dec 2007 13:38:41 -0700") References: <20071210203544.GI30090@obsidianresearch.com> <20071210203841.GJ30090@obsidianresearch.com> Message-ID: Any objection to merging the following for 2.6.25? [Rolf -- I think it makes more sense to delete the overwriting of the P_Key in ipoib_multicast.c in this patch rather than later in the series; do you agree?] Thanks, Roland From: Rolf Manderscheid An IPoIB subnet on an IB fabric that spans multiple IB subnets can't use link-local scope in multicast GIDs. The existing routines that map IP/IPv6 multicast addresses into IB link-level addresses hard-code the scope to link-local, and they also leave the partition key field uninitialised. This patch adds a parameter (the link-level broadcast address) to the mapping routines, allowing them to initialise both the scope and the P_Key appropriately, and fixes up the call sites. The next step will be to add a way to configure the scope for an IPoIB interface. 
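As a rough illustration of the mapping described above, here is a minimal userspace sketch of the IPv4 case. It assumes a 20-byte IPoIB hardware address whose scope sits in the low nibble of byte 5 and whose P_Key sits in bytes 8 and 9, as in the patch; the function name `ipv4_to_ib_mcast()` is hypothetical, the IPv4 group address is taken in host byte order for simplicity, and the placement of the low 28 address bits in the last four bytes follows RFC 4391 rather than anything shown in this mail.

```c
#include <stdint.h>
#include <string.h>

#define IPOIB_HW_ADDR_LEN 20    /* 4 bytes of flags/QPN + 16-byte MGID */

/*
 * Hypothetical userspace version of the IPv4 mapping. The scope and
 * P_Key are copied from the interface's link-level broadcast address
 * instead of being hard-coded or left as zero.
 */
void ipv4_to_ib_mcast(uint32_t group_host_order,
                      const unsigned char *broadcast,
                      unsigned char *buf)
{
    unsigned char scope = broadcast[5] & 0x0f;

    memset(buf, 0, IPOIB_HW_ADDR_LEN);  /* byte 0 reserved, bytes 10-15 zero */
    buf[1] = 0xff;                      /* multicast QPN */
    buf[2] = 0xff;
    buf[3] = 0xff;
    buf[4] = 0xff;                      /* MGID prefix */
    buf[5] = 0x10 | scope;              /* scope from broadcast address */
    buf[6] = 0x40;                      /* IPv4 signature */
    buf[7] = 0x1b;
    buf[8] = broadcast[8];              /* P_Key from broadcast address */
    buf[9] = broadcast[9];
    buf[16] = (group_host_order >> 24) & 0x0f;  /* low 28 bits of the group */
    buf[17] = (group_host_order >> 16) & 0xff;
    buf[18] = (group_host_order >> 8) & 0xff;
    buf[19] = group_host_order & 0xff;
}
```

With a broadcast address carrying scope 0x5 and P_Key 0x8001, the group 224.0.0.1 would map to an address whose byte 5 is 0x15 and whose bytes 8 and 9 are 0x80 and 0x01, where the old helper always produced 0x12 and a zero P_Key.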
Signed-off-by: Rolf Manderscheid Signed-off-by: Roland Dreier --- drivers/infiniband/core/cma.c | 4 +--- drivers/infiniband/ulp/ipoib/ipoib_multicast.c | 4 ---- include/net/if_inet6.h | 11 +++++++---- include/net/ip.h | 10 ++++++---- net/ipv4/arp.c | 2 +- net/ipv6/ndisc.c | 2 +- 6 files changed, 16 insertions(+), 17 deletions(-) diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c index 312ec74..982836e 100644 --- a/drivers/infiniband/core/cma.c +++ b/drivers/infiniband/core/cma.c @@ -2610,11 +2610,9 @@ static void cma_set_mgid(struct rdma_id_private *id_priv, /* IPv6 address is an SA assigned MGID. */ memcpy(mgid, &sin6->sin6_addr, sizeof *mgid); } else { - ip_ib_mc_map(sin->sin_addr.s_addr, mc_map); + ip_ib_mc_map(sin->sin_addr.s_addr, dev_addr->broadcast, mc_map); if (id_priv->id.ps == RDMA_PS_UDP) mc_map[7] = 0x01; /* Use RDMA CM signature */ - mc_map[8] = ib_addr_get_pkey(dev_addr) >> 8; - mc_map[9] = (unsigned char) ib_addr_get_pkey(dev_addr); *mgid = *(union ib_gid *) (mc_map + 4); } } diff --git a/drivers/infiniband/ulp/ipoib/ipoib_multicast.c b/drivers/infiniband/ulp/ipoib/ipoib_multicast.c index 858ada1..2628339 100644 --- a/drivers/infiniband/ulp/ipoib/ipoib_multicast.c +++ b/drivers/infiniband/ulp/ipoib/ipoib_multicast.c @@ -788,10 +788,6 @@ void ipoib_mcast_restart_task(struct work_struct *work) memcpy(mgid.raw, mclist->dmi_addr + 4, sizeof mgid); - /* Add in the P_Key */ - mgid.raw[4] = (priv->pkey >> 8) & 0xff; - mgid.raw[5] = priv->pkey & 0xff; - mcast = __ipoib_mcast_find(dev, &mgid); if (!mcast || test_bit(IPOIB_MCAST_FLAG_SENDONLY, &mcast->flags)) { struct ipoib_mcast *nmcast; diff --git a/include/net/if_inet6.h b/include/net/if_inet6.h index 448eccb..b24508a 100644 --- a/include/net/if_inet6.h +++ b/include/net/if_inet6.h @@ -269,18 +269,21 @@ static inline void ipv6_arcnet_mc_map(const struct in6_addr *addr, char *buf) buf[0] = 0x00; } -static inline void ipv6_ib_mc_map(struct in6_addr *addr, char *buf) +static inline void 
ipv6_ib_mc_map(const struct in6_addr *addr, + const unsigned char *broadcast, char *buf) { + unsigned char scope = broadcast[5] & 0xF; + buf[0] = 0; /* Reserved */ buf[1] = 0xff; /* Multicast QPN */ buf[2] = 0xff; buf[3] = 0xff; buf[4] = 0xff; - buf[5] = 0x12; /* link local scope */ + buf[5] = 0x10 | scope; /* scope from broadcast address */ buf[6] = 0x60; /* IPv6 signature */ buf[7] = 0x1b; - buf[8] = 0; /* P_Key */ - buf[9] = 0; + buf[8] = broadcast[8]; /* P_Key */ + buf[9] = broadcast[9]; memcpy(buf + 10, addr->s6_addr + 6, 10); } #endif diff --git a/include/net/ip.h b/include/net/ip.h index 840dd91..50c8889 100644 --- a/include/net/ip.h +++ b/include/net/ip.h @@ -266,20 +266,22 @@ static inline void ip_eth_mc_map(__be32 naddr, char *buf) * Leave P_Key as 0 to be filled in by driver. */ -static inline void ip_ib_mc_map(__be32 naddr, char *buf) +static inline void ip_ib_mc_map(__be32 naddr, const unsigned char *broadcast, char *buf) { __u32 addr; + unsigned char scope = broadcast[5] & 0xF; + buf[0] = 0; /* Reserved */ buf[1] = 0xff; /* Multicast QPN */ buf[2] = 0xff; buf[3] = 0xff; addr = ntohl(naddr); buf[4] = 0xff; - buf[5] = 0x12; /* link local scope */ + buf[5] = 0x10 | scope; /* scope from broadcast address */ buf[6] = 0x40; /* IPv4 signature */ buf[7] = 0x1b; - buf[8] = 0; /* P_Key */ - buf[9] = 0; + buf[8] = broadcast[8]; /* P_Key */ + buf[9] = broadcast[9]; buf[10] = 0; buf[11] = 0; buf[12] = 0; diff --git a/net/ipv4/arp.c b/net/ipv4/arp.c index 08174a2..54a76b8 100644 --- a/net/ipv4/arp.c +++ b/net/ipv4/arp.c @@ -211,7 +211,7 @@ int arp_mc_map(__be32 addr, u8 *haddr, struct net_device *dev, int dir) ip_tr_mc_map(addr, haddr); return 0; case ARPHRD_INFINIBAND: - ip_ib_mc_map(addr, haddr); + ip_ib_mc_map(addr, dev->broadcast, haddr); return 0; default: if (dir) { diff --git a/net/ipv6/ndisc.c b/net/ipv6/ndisc.c index 777ed73..85947ea 100644 --- a/net/ipv6/ndisc.c +++ b/net/ipv6/ndisc.c @@ -337,7 +337,7 @@ int ndisc_mc_map(struct in6_addr *addr, char *buf, 
struct net_device *dev, int d ipv6_arcnet_mc_map(addr, buf); return 0; case ARPHRD_INFINIBAND: - ipv6_ib_mc_map(addr, buf); + ipv6_ib_mc_map(addr, dev->broadcast, buf); return 0; default: if (dir) { -- 1.5.4.rc2 From rdreier at cisco.com Fri Jan 4 14:09:19 2008 From: rdreier at cisco.com (Roland Dreier) Date: Fri, 04 Jan 2008 14:09:19 -0800 Subject: [ofa-general] Re: [PATCH 1/3] IB/srp: respect target credit limit In-Reply-To: <1198102123.5649.32.camel@lap75545.ornl.gov> (David Dillow's message of "Wed, 19 Dec 2007 17:08:43 -0500") References: <1198102123.5649.32.camel@lap75545.ornl.gov> Message-ID: looks great, applied. From rdreier at cisco.com Fri Jan 4 14:16:11 2008 From: rdreier at cisco.com (Roland Dreier) Date: Fri, 04 Jan 2008 14:16:11 -0800 Subject: [ofa-general] [PATCH 3/3] IB/srp: use scatter gather chaining In-Reply-To: <1198102155.5649.37.camel@lap75545.ornl.gov> (David Dillow's message of "Wed, 19 Dec 2007 17:09:15 -0500") References: <1198102155.5649.37.camel@lap75545.ornl.gov> Message-ID: > I looked through the DMA paths on the hardware drivers to ensure they > could take advantage of the SG chaining, and it seems that every one > except iPath uses the system's DMA routines, which have been converted > to handle chaining. iPath looks like it should be OK, but I have no way > to test it. I'd like to test this on ipath just to be sure. Can you send me a simple way to generate big IO requests? Thanks, Roland From dillowda at ornl.gov Fri Jan 4 14:28:00 2008 From: dillowda at ornl.gov (Dillow, David A.)
Date: Fri, 04 Jan 2008 17:28:00 -0500 Subject: [ofa-general] [PATCH 3/3] IB/srp: use scatter gather chaining References: <1198102155.5649.37.camel@lap75545.ornl.gov> Message-ID: <537C6C0940C6C143AA46A88946B854170AD87CFA@ORNLEXCHANGE.ornl.gov>
From Roland Dreier: > I looked through the DMA paths on the hardware drivers to ensure they > could take advantage of the SG chaining, and it seems that every one > except iPath uses the system's DMA routines, which have been converted > to handle chaining. iPath looks like it should be OK, but I have no way > to test it.
set srp_sg_tablesize=256
Add SRP target with max_sects=8192 or more
echo 4096 > /sys/block/sda/queue/max_sectors_kb
echo noop > /sys/block/sda/queue/scheduler
dd if=/dev/zero of=/dev/sda oflag=direct bs=4096k count=10
dd if=/dev/sda iflag=direct of=/dev/sda bs=4096k count=10
You may not get 4MB I/Os because of memory fragmentation, but you should see 1MB or better. You can use a real file as a data source/sink to verify against corruption. Dave
From panda at cse.ohio-state.edu Fri Jan 4 14:31:53 2008 From: panda at cse.ohio-state.edu (Dhabaleswar Panda) Date: Fri, 4 Jan 2008 17:31:53 -0500 (EST) Subject: [ofa-general] Re: [mvapich-discuss] Troubles building/installing OFEM 1.2 on Fedora Core 4 64-bit In-Reply-To: <009c01c84f14$b00a89c0$101f9d40$@held@staarinc.com> Message-ID: Ben - Sorry to know that you are experiencing problems in building/installing OFED 1.2 on Fedora Core 4 64 bit system. FYI, the latest released version of OFED 1.2 is OFED 1.2.5.4. Regarding your rpm build errors, I am forwarding your note to `ewg' and `general' lists of Open Fabrics. More experienced users on these two lists can give you prompt feedbacks and guidance on the basic OFED installation issues. Thanks, DK On Fri, 4 Jan 2008, Ben Held wrote: > We are seeing a failure during the install process (out of rpmbuild) on a > Fedora Core 4 64-bit system. The tail of the log is here: > > > > Hunk #1 succeeded at 456 (offset 156 lines).
> > Hunk #2 succeeded at 569 (offset 75 lines). > > Hunk #3 succeeded at 672 (offset 157 lines). > > Hunk #4 succeeded at 1444 (offset 281 lines). > > Hunk #5 succeeded at 1340 (offset 157 lines). > > Hunk #6 succeeded at 1791 with fuzz 1 (offset 498 lines). > > > /var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2/kernel_patches/backport/2.6.11_FC4/use > r_mad_3935_to_2_6_11_FC4.patch > > patching file drivers/infiniband/core/user_mad.c > > patch: **** malformed patch at line 12: @@ -827,13 +952,13 @@ static int > ib_umad_init_port(struct ib_d > > > > Failed to apply patch: > /var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2/kernel_patches/backport/2.6.11_FC4/use > r_mad_3935_to_2_6_11_FC4.patch > > error: Bad exit status from /var/tmp/rpm-tmp.88475 (%install) > > > > > > RPM build errors: > > user vlad does not exist - using root > > group vlad does not exist - using root > > user vlad does not exist - using root > > group vlad does not exist - using root > > Bad exit status from /var/tmp/rpm-tmp.88475 (%install) > > ERROR: Failed executing "rpmbuild --rebuild --define '_topdir > /var/tmp/OFEDRPM' --define '_prefix /usr' --define 'build_root > /var/tmp/OFED' --defi > > ne 'configure_options --with-cxgb3-mod --with-ipoib-mod --with-mthca-mod > --with-sdp-mod --with-srp-mod --with-core-mod --with-user_mad-mod --with- > > user_access-mod --with-addr_trans-mod --with-rds-mod ' --define 'KVERSION > 2.6.11-1.1369_FC4smp' --define 'KSRC /lib/modules/2.6.11-1.1369_FC4smp/b > > uild' --define 'build_kernel_ib 1' --define 'build_kernel_ib_devel 1' > --define 'NETWORK_CONF_DIR /etc/sysconfig/network-scripts' --define 'modprob > > e_update 1' --define 'include_ipoib_conf 1' > /usr/etc/OFED-1.2-rc5/SRPMS/ofa_kernel-1.2-rc5.src.rpm" > > > > > > Any ideas? > > > > Regards, > > > > Ben Held > Simulation Technology & Applied Research, Inc. > 11520 N. 
Port Washington Rd., Suite 201 > Mequon, WI 53092 > P: 1.262.240.0291 x101 > F: 1.262.240.0294 > E: ben.held at staarinc.com > http://www.staarinc.com > > > > > > From rvm at obsidianresearch.com Fri Jan 4 15:59:53 2008 From: rvm at obsidianresearch.com (Rolf Manderscheid) Date: Fri, 04 Jan 2008 16:59:53 -0700 Subject: [ofa-general] Re: [RFC PATCH] IPoIB: improve IPv4/IPv6 to IB mcast mapping functions In-Reply-To: References: <20071210203544.GI30090@obsidianresearch.com> <20071210203841.GJ30090@obsidianresearch.com> Message-ID: <477EC879.7010501@obsidianresearch.com> Roland Dreier wrote: > Any objection to merging the following for 2.6.25? > > [Rolf -- I think it makes more sense to delete the overwriting of the > P_Key in ipoib_multicast.c in this patch rather than later in the > series; do you agree?] > Yes, the first patch is the logical place for it. The only reason I put it in the second patch was to keep the first patch as small and simple as possible. Rolf
From kliteyn at mellanox.co.il Fri Jan 4 17:22:21 2008 From: kliteyn at mellanox.co.il (kliteyn at mellanox.co.il) Date: 5 Jan 2008 03:22:21 +0200 Subject: [ofa-general] nightly osm_sim report 2008-01-05:normal completion Message-ID: OSM Simulation Regression Summary [Generated mail - please do NOT reply] OpenSM binary date = 2008-01-04 OpenSM git rev = Thu_Jan_3_04:20:57_2008 [d1470d92223f94fb3f60d5a6f549ed91e1d9d627] ibutils git rev = Mon_Dec_24_10:42:01_2007 [675bec82306d6920555dd0b5e2f664983e27e60f] Total=400 Pass=399 Fail=1 Pass: 30 Stability IS1-16.topo 30 Pkey IS1-16.topo 30 OsmTest IS1-16.topo 30 OsmStress IS1-16.topo 30 Multicast IS1-16.topo 30 LidMgr IS1-16.topo 10 Stability IS3-loop.topo 10 Stability IS3-128.topo 10 Pkey IS3-128.topo 10 OsmTest IS3-loop.topo 10 OsmTest IS3-128.topo 10 OsmStress IS3-128.topo 10 Multicast IS3-loop.topo 10 Multicast IS3-128.topo 10 FatTree merge-roots-4-ary-2-tree.topo 10 FatTree merge-root-4-ary-3-tree.topo 10 FatTree gnu-stallion-64.topo 10 FatTree blend-4-ary-2-tree.topo 10 FatTree RhinoDDR.topo 10 FatTree FullGnu.topo 10 FatTree 4-ary-2-tree.topo 10 FatTree 2-ary-4-tree.topo 10 FatTree 12-node-spaced.topo 10 FTreeFail 4-ary-2-tree-missing-sw-link.topo 10 FTreeFail 4-ary-2-tree-links-at-same-rank-2.topo 10 FTreeFail 4-ary-2-tree-links-at-same-rank-1.topo 10 FTreeFail 4-ary-2-tree-diff-num-pgroups.topo 9 LidMgr IS3-128.topo Failures: 1 LidMgr IS3-128.topo
From arthur.jones at qlogic.com Fri Jan 4 18:09:29 2008 From: arthur.jones at qlogic.com (Arthur Jones) Date: Fri, 4 Jan 2008 18:09:29 -0800 Subject: [ofa-general] Re: [PATCH 5/5] IB/ipath - Changes for fields moving from devdata to portdata. In-Reply-To: References: <20071222222434.17599.95501.stgit@eng-46.internal.keyresearch.com> <20071222222459.17599.72860.stgit@eng-46.internal.keyresearch.com> Message-ID: <20080105020929.GA10519@eng-46.pathscale.com> hi roland, ... On Fri, Jan 04, 2008 at 12:56:25PM -0800, Roland Dreier wrote: > [...] The only further request I would make is that (if it fits > with your workflow) when sending a batch of patches, you update your > git tree so that it is based on top of either my tree or Linus's tree. no problem, i'll base our external git repo on your tree from now on... arthur
From vlad at lists.openfabrics.org Sat Jan 5 03:08:36 2008 From: vlad at lists.openfabrics.org (Vladimir Sokolovsky Mellanox) Date: Sat, 5 Jan 2008 03:08:36 -0800 (PST) Subject: [ofa-general] ofa_1_3_kernel 20080105-0200 daily build status Message-ID: <20080105110837.0003DE601CB@openfabrics.org> This email was generated automatically, please do not reply git_url: git://git.openfabrics.org/ofed_1_3/linux-2.6.git git_branch: ofed_kernel Common build parameters: --with-ipoib-mod --with-sdp-mod --with-srp-mod --with-user_mad-mod --with-user_access-mod --with-mthca-mod --with-mlx4-mod --with-core-mod --with-addr_trans-mod --with-rds-mod --with-cxgb3-mod --with-nes-mod Passed: Passed on i686 with 2.6.15-23-server Passed on i686 with linux-2.6.21.1 Passed on i686 with linux-2.6.22 Passed on i686 with linux-2.6.18 Passed on i686 with linux-2.6.19 Passed on i686 with linux-2.6.16 Passed on i686 with linux-2.6.17 Passed on i686 with linux-2.6.14 Passed on i686 with linux-2.6.13 Passed on i686 with linux-2.6.15 Passed on i686 with linux-2.6.12 Passed on x86_64 with linux-2.6.18
Passed on x86_64 with linux-2.6.19 Passed on ia64 with linux-2.6.18 Passed on ia64 with linux-2.6.22 Passed on x86_64 with linux-2.6.16 Passed on x86_64 with linux-2.6.18-1.2798.fc6 Passed on x86_64 with linux-2.6.17 Passed on x86_64 with linux-2.6.20 Passed on ppc64 with linux-2.6.16 Passed on ia64 with linux-2.6.21.1 Passed on powerpc with linux-2.6.12 Passed on ia64 with linux-2.6.19 Passed on ppc64 with linux-2.6.15 Passed on powerpc with linux-2.6.13 Passed on ppc64 with linux-2.6.14 Passed on ppc64 with linux-2.6.19 Passed on powerpc with linux-2.6.15 Passed on x86_64 with linux-2.6.22.5-31-default Passed on ia64 with linux-2.6.13 Passed on ia64 with linux-2.6.17 Passed on x86_64 with linux-2.6.12 Passed on ia64 with linux-2.6.15 Passed on ia64 with linux-2.6.14 Passed on ia64 with linux-2.6.16 Passed on ppc64 with linux-2.6.12 Passed on ia64 with linux-2.6.12 Passed on ppc64 with linux-2.6.17 Passed on powerpc with linux-2.6.14 Passed on x86_64 with linux-2.6.9-42.ELsmp Passed on x86_64 with linux-2.6.21.1 Passed on ppc64 with linux-2.6.18 Passed on x86_64 with linux-2.6.13 Passed on x86_64 with linux-2.6.15 Passed on x86_64 with linux-2.6.14 Passed on ppc64 with linux-2.6.13 Passed on ia64 with linux-2.6.23 Passed on x86_64 with linux-2.6.16.43-0.3-smp Passed on x86_64 with linux-2.6.22 Passed on x86_64 with linux-2.6.16.21-0.8-smp Passed on x86_64 with linux-2.6.18-8.el5 Passed on x86_64 with linux-2.6.9-55.ELsmp Passed on x86_64 with linux-2.6.18-53.el5 Passed on ia64 with linux-2.6.16.21-0.8-default Passed on ppc64 with linux-2.6.18-8.el5 Failed: From kliteyn at mellanox.co.il Sat Jan 5 17:10:37 2008 From: kliteyn at mellanox.co.il (kliteyn at mellanox.co.il) Date: 6 Jan 2008 03:10:37 +0200 Subject: [ofa-general] nightly osm_sim report 2008-01-06:normal completion Message-ID: OSM Simulation Regression Summary [Generated mail - please do NOT reply] OpenSM binary date = 2008-01-05 OpenSM git rev = Thu_Jan_3_04:20:57_2008 
[d1470d92223f94fb3f60d5a6f549ed91e1d9d627] ibutils git rev = Mon_Dec_24_10:42:01_2007 [675bec82306d6920555dd0b5e2f664983e27e60f] Total=400 Pass=400 Fail=0 Pass: 30 Stability IS1-16.topo 30 Pkey IS1-16.topo 30 OsmTest IS1-16.topo 30 OsmStress IS1-16.topo 30 Multicast IS1-16.topo 30 LidMgr IS1-16.topo 10 Stability IS3-loop.topo 10 Stability IS3-128.topo 10 Pkey IS3-128.topo 10 OsmTest IS3-loop.topo 10 OsmTest IS3-128.topo 10 OsmStress IS3-128.topo 10 Multicast IS3-loop.topo 10 Multicast IS3-128.topo 10 LidMgr IS3-128.topo 10 FatTree merge-roots-4-ary-2-tree.topo 10 FatTree merge-root-4-ary-3-tree.topo 10 FatTree gnu-stallion-64.topo 10 FatTree blend-4-ary-2-tree.topo 10 FatTree RhinoDDR.topo 10 FatTree FullGnu.topo 10 FatTree 4-ary-2-tree.topo 10 FatTree 2-ary-4-tree.topo 10 FatTree 12-node-spaced.topo 10 FTreeFail 4-ary-2-tree-missing-sw-link.topo 10 FTreeFail 4-ary-2-tree-links-at-same-rank-2.topo 10 FTreeFail 4-ary-2-tree-links-at-same-rank-1.topo 10 FTreeFail 4-ary-2-tree-diff-num-pgroups.topo Failures: From jackm at dev.mellanox.co.il Sat Jan 5 23:05:34 2008 From: jackm at dev.mellanox.co.il (Jack Morgenstein) Date: Sun, 6 Jan 2008 09:05:34 +0200 Subject: [ofa-general] [RFC] XRC -- make receiving XRC QP independent of any one user process In-Reply-To: References: <6C2C79E72C305246B504CBA17B5500C903039083@mtlexch01.mtl.com> Message-ID: <200801060905.35505.jackm@dev.mellanox.co.il> On Thursday 03 January 2008 17:49, Tang, Changqing wrote: > Another issue I have after thinking about the interface more. > > Rank A is the sender, rank B and C are two ranks on a remote node. At first, B creates the > receiving QP and make connection to A and register the QP number for receiving. And A gets > the receiving QP number from B. After some communication between A and B, B decides to close > the connection, and unregister the QP number. Then A and C want to talk, so A tell C the > receiving QP number, C tries to register the QP number.
> > I wonder at the time when C tries to register the QP number, the receiving QP has been > destroyed by the kernel, since when B unregister the QP number, the reference count becomes > zero, and kernel will cleanup it. > > Am I right ? Yes. However, C will get an error when it tries to register with that QP, and can return the error to A. - Jack From vlad at mellanox.co.il Sun Jan 6 00:15:08 2008 From: vlad at mellanox.co.il (Vladimir Sokolovsky) Date: Sun, 6 Jan 2008 10:15:08 +0200 Subject: [ofa-general] [PATCH] LNXI Fixed ofed_scripts configure to selectsp4 patches for SLES9 kernel with minor versions equalor greater than 305 In-Reply-To: <477BD9DB.7050201@lnxi.com> Message-ID: <6C2C79E72C305246B504CBA17B5500C90168D709@mtlexch01.mtl.com> Applied to the ofed_1_2_c branch. Regards, Vladimir > -----Original Message----- > From: David B. Anderson [mailto:danderson at lnxi.com] > Sent: Wednesday, January 02, 2008 8:37 PM > To: Vladimir Sokolovsky > Cc: moshek at voltaire.com; general at lists.openfabrics.org > Subject: Re: [ofa-general] [PATCH] LNXI Fixed ofed_scripts > configure to selectsp4 patches for SLES9 kernel with minor > versions equalor greater than 305 > > > Hi Vladimir, > > I sent the patches to the list with git-send-email but I just checked > the mail logs and I'm > getting a timeout from lists.openfabrics.org for those > messages. So here they are directly :). > > Sorry for the delay. > > David > > Vladimir Sokolovsky wrote: > > git://git.openfabrics.org/ofed_1_2/linux-2.6.git > > - its my git tree, > > > > David can't commit his patches to this tree (he does not have > > permissions)... > > So, probably he have a clone of my tree somewhere. 
> > > > Regards, > > Vladimir > > > > Moshe Kazir wrote: > >> He wrote -> > >> > >>> commit 3db835ee0edb792b120ba10c8066e3d4409de2d7 > >>> > >>> git://git.openfabrics.org/ofed_1_2/linux-2.6.git > >> > >> Moshe ____________________________________________________________ > >> Moshe Katzir | +972-9971-8639 (o) | +972-52-860-6042 (m) > >> > >> Voltaire - The Grid Backbone > >> > >> www.voltaire.com > >> > >> > >> -----Original Message----- > >> From: general-bounces at lists.openfabrics.org > >> [mailto:general-bounces at lists.openfabrics.org] On Behalf > Of Vladimir > >> Sokolovsky > >> Sent: Sunday, December 30, 2007 9:37 AM > >> To: David B. Anderson > >> Cc: general at lists.openfabrics.org > >> Subject: Re: [ofa-general] [PATCH] LNXI Fixed ofed_scripts > configure > >> to selectsp4 patches for SLES9 kernel with minor versions equalor > >> greater than 305 > >> > >> Hi David, > >> Where can I get your patches? > >> > >> Regards, > >> Vladimir > >> > >> David B. Anderson wrote: > >>> Hi Vladimir, > >>> > >>> The four patches named below are what I'm using to get the OFED > >>> 1.2.5 kernel to build for SLES9 SP4. > >>> > >>> commit 3db835ee0edb792b120ba10c8066e3d4409de2d7 > >>> > >>> git://git.openfabrics.org/ofed_1_2/linux-2.6.git > >>> > >>> > >>> The patches are: > >>> > >>> [PATCH 1/4] LNXI changed ofed_scripts configure to select sp4 > >>> patches > >>> > >>> [PATCH 2/4] LNXI created backport patch addr_8802_to_2_6_5-7_308 > >>> > >>> [PATCH 3/4] LNXI fixed backport/2.6.5_sles9_sp4/rds_to_2_6_9.patch > >>> > >>> [PATCH 4/4] LNXI fixed > backport/2.6.5_sles9_sp4/cxg3_to_2_6_20.patch > >>> > >>> > >>> I've tested these on my cluster. > >>> > >>> Note: I changed your patch to the ofed_scripts/configure > script, so > >>> that even if the SLES9 > >>> > >>> kernel is greater than 309 it will not revert to using > SP3 patches. > >>> > >>> > >>> David > >>> > >>> > >>> Vladimir Sokolovsky wrote: > >>>> David B. 
Anderson wrote: > >>>>> I've all of these patches plus the following patch > >>>>> > >>>>> > kernel_patches/backport/2.6.5_sles9_sp4/cxgb3_remove_eeh.patch > >>>>> > >>>>> My current git repo is > >>>>> git://git.openfabrics.org/ofed_1_2/linux-2.6.git > >>>>> commit: 6974c285e6fb06264f570f9cf919865bab66c9e6 > >>>>> > >>>>> My patch that I posted before fixes the kernel > configure script so > >>>>> that it applies 2.6.5_sles9_sp4 patches for the SP4 > release kernel of > >>>>> 2.6.5-7.308 and above. The configure patch from > >>>>> FED_1.2.5_sles9_sp4_configure.diff has 2.6.5-7.305* as > the only valid > >>>>> SP4 kernel which is incorrect. I get the same compiler error as > >> before. > >>>>> > >>>>> > >>>>> Moshe Kazir wrote: > >>>>>> See patches in the attached message. > >>>>>> > >>>>>> It was applied by Vlad. > >>>>>> > >>>>>> Moshe > >>>>>> > >>>>>> ____________________________________________________________ > >>>>>> Moshe Katzir | +972-9971-8639 (o) | > +972-52-860-6042 (m) > >>>>>> > >>>>>> Voltaire - The Grid Backbone > >>>>>> > >>>>>> www.voltaire.com > >>>>>> > >>>>>> > >>>>>> > >>>>>> -----Original Message----- > >>>>>> From: general-bounces at lists.openfabrics.org > >>>>>> [mailto:general-bounces at lists.openfabrics.org] On > Behalf Of David > >> B. > >>>>>> Anderson > >>>>>> Sent: Saturday, December 15, 2007 3:31 AM > >>>>>> To: general at lists.openfabrics.org; vlad at mellanox.co.il > >>>>>> Subject: [ofa-general] [PATCH] LNXI Fixed ofed_scripts > configure > >>>>>> to > >> > >>>>>> select sp4 patches for SLES9 kernel with minor > versions equal or > >>>>>> greater than 305 > >>>>>> > >>>>>> > >>>>>> Hi, > >>>>>> > >>>>>> I've created the following patch for OFED 1.2.5.4 to have the > >>>>>> kernel for > >>>>>> > >>>>>> SLES9 SP4 recognized (2.6.5-7.308). > >>>>>> > >>>>>> Even with the patch I then had two back port patches not apply > >>>>>> cleanly (cxg3_to_2_6_20.patch, rds_to_2_6_9.patch). 
I hand > >>>>>> patched them but now I'm getting the following compiler errors: > >>>>>> > >>>>>> In file included from > >>>>>> /usr/src/linux-2.6.5-7.308/include/linux/module.h:10, > >>>>>> from > >>>>>> > /home/danderson/rpmbuild/BUILD/ofa_kernel/obj/default/kernel_addo > >>>>>> ns > >>>>>> /back > >>>>>> > >>>>>> port/2.6.5_sles9_sp4/include/linux/module.h:4, > >>>>>> from > >>>>>> /usr/src/linux-2.6.5-7.308/include/linux/device.h:21, > >>>>>> from > >>>>>> > /home/danderson/rpmbuild/BUILD/ofa_kernel/obj/default/kernel_addo > >>>>>> ns > >>>>>> /back > >>>>>> > >>>>>> port/2.6.5_sles9_sp4/include/linux/device.h:4, > >>>>>> from > >>>>>> /usr/src/linux-2.6.5-7.308/include/linux/netdevice.h:38, > >>>>>> from > >>>>>> > /home/danderson/rpmbuild/BUILD/ofa_kernel/obj/default/kernel_addo > >>>>>> ns > >>>>>> /back > >>>>>> > >>>>>> port/2.6.5_sles9_sp4/include/linux/netdevice.h:4, > >>>>>> from > >>>>>> > /home/danderson/rpmbuild/BUILD/ofa_kernel/obj/default/drivers/inf > >>>>>> in > >>>>>> iband > >>>>>> > >>>>>> /core/addr.c:32: > >>>>>> > /home/danderson/rpmbuild/BUILD/ofa_kernel/obj/default/kernel_addo > >>>>>> ns > >>>>>> /back > >>>>>> > >>>>>> port/2.6.5_sles9_sp4/include/linux/sched.h:8: warning: static > >>>>>> declaration for `wait_for_completion_timeout' follows > non-static > >>>>>> > /home/danderson/rpmbuild/BUILD/ofa_kernel/obj/default/drivers/infin > >>>>>> iband > >>>>>> > >>>>>> /core/addr.c:67: warning: initialization from incompatible > >>>>>> pointer type > >>>>>> > /home/danderson/rpmbuild/BUILD/ofa_kernel/obj/default/drivers/infin > >>>>>> iband > >>>>>> > >>>>>> /core/addr.c: In function `addr_resolve_remote': > >>>>>> > /home/danderson/rpmbuild/BUILD/ofa_kernel/obj/default/drivers/inf > >>>>>> in > >>>>>> iband > >>>>>> > >>>>>> /core/addr.c:192: error: structure has no member named `idev' > >>>>>> > /home/danderson/rpmbuild/BUILD/ofa_kernel/obj/default/drivers/inf > >>>>>> in > >>>>>> iband > >>>>>> > >>>>>> /core/addr.c:193: error: structure 
has no member named `idev' > >>>>>> > /home/danderson/rpmbuild/BUILD/ofa_kernel/obj/default/drivers/inf > >>>>>> in > >>>>>> iband > >>>>>> > >>>>>> /core/addr.c:197: error: structure has no member named `idev' > >>>>>> > /home/danderson/rpmbuild/BUILD/ofa_kernel/obj/default/drivers/inf > >>>>>> in > >>>>>> iband > >>>>>> > >>>>>> /core/addr.c: At top level: > >>>>>> > /home/danderson/rpmbuild/BUILD/ofa_kernel/obj/default/kernel_addo > >>>>>> ns > >>>>>> /back > >>>>>> > >>>>>> port/2.6.5_sles9_sp4/include/linux/device.h:48: warning: > >>>>>> `class_create' defined but not used > >>>>>> > /home/danderson/rpmbuild/BUILD/ofa_kernel/obj/default/kernel_addons > >>>>>> /back > >>>>>> > >>>>>> port/2.6.5_sles9_sp4/include/linux/device.h:82: warning: > >>>>>> `class_destroy' defined but not used > >>>>>> > /home/danderson/rpmbuild/BUILD/ofa_kernel/obj/default/kernel_addons > >>>>>> /back > >>>>>> > >>>>>> port/2.6.5_sles9_sp4/include/linux/device.h:108: warning: > >>>>>> `class_device_create' defined but not used > >>>>>> make[6]: *** > >>>>>> > [/home/danderson/rpmbuild/BUILD/ofa_kernel/obj/default/drivers/infi > >>>>>> niban > >>>>>> > >>>>>> d/core/addr.o] Error 1 > >>>>>> make[5]: *** > >>>>>> > [/home/danderson/rpmbuild/BUILD/ofa_kernel/obj/default/drivers/in > >>>>>> fi > >>>>>> niban > >>>>>> > >>>>>> d/core] Error 2 > >>>>>> make[4]: *** > >>>>>> > [/home/danderson/rpmbuild/BUILD/ofa_kernel/obj/default/drivers/in > >>>>>> fi > >>>>>> niban > >>>>>> > >>>>>> d] Error 2 > >>>>>> make[3]: *** > >>>>>> [_module_/home/danderson/rpmbuild/BUILD/ofa_kernel/obj/default] > >>>>>> Error 2 > >>>>>> make[2]: *** [modules] Error 2 > >>>>>> make[1]: *** [modules] Error 2 > >>>>>> make[1]: Leaving directory > >>>>>> `/usr/src/linux-2.6.5-7.308-obj/x86_64/default' > >>>>>> make: *** [kernel] Error 2 > >>>>>> > >>>>>> Does anyone have OFED 1.2.5.4 building for SLES 9 SP4? 
> >>>>>> > >>>>>> Thanks > >>>>>> > >>>>>> > >>>>>> > ----------------------------------------------------------------- > >>>>>> -- > >>>>>> ----- > >>>>>> > >>>>>> > >>>>>> Subject: > >>>>>> [ofa-general] OFED-1.2.5 backport patches for SLES9 SP4 > >>>>>> From: > >>>>>> "Moshe Kazir" > >>>>>> Date: > >>>>>> Sun, 25 Nov 2007 09:59:26 +0200 > >>>>>> To: > >>>>>> "Vladimir Sokolovsky" , > >>>>>> > >>>>>> > >>>>>> To: > >>>>>> "Vladimir Sokolovsky" , > >>>>>> > >>>>>> > >>>>>> > >>>>>> The attached files do the work. > >>>>>> > >>>>>> OFED_1.2.5_sles9_sp4_configure.diff include the changes in the > >>>>>> configure file. > >>>>>> OFED_1.2.5_sles9_sp4_backport.diff include the changes required in > >>>>>> the kernel_patche and kernel_addons directories. > >>>>>> > >>>>>> Moshe > >>>>>> ____________________________________________________________ > >>>>>> Moshe Katzir | +972-9971-8639 (o) | +972-52-860-6042 (m) > >>>>>> > >>>>>> Voltaire - The Grid Backbone > >>>>>> > >>>>>> www.voltaire.com > >>>>>> > >>>> Hi David, > >>>> Please try the latest OFED-1.2.5.4-20071219-0824.tgz build on your > >>>> SLES9SP4. > >>>> > >>>> http://www.openfabrics.org/builds/connectx/OFED-1.2.5.4-20071219-0824.tgz >>>> >>>> >>>> Thanks, >>>> Vladimir >>>> >>> >> >> _______________________________________________ >> general mailing list >> general at lists.openfabrics.org >> http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general >> >> To unsubscribe, please visit >> http://openib.org/mailman/listinfo/openib-general >> > -- David B. Anderson Linux Networx Sr.
Software Engineer Email: danderson at lnxi.com Phone: (801) 649-1311 From vlad at dev.mellanox.co.il Sun Jan 6 00:58:58 2008 From: vlad at dev.mellanox.co.il (Vladimir Sokolovsky) Date: Sun, 06 Jan 2008 10:58:58 +0200 Subject: [ofa-general] Re: [GIT PULL ofed-1.2.5 / ofed-1.3] - libcxgb3-1.1.2 release In-Reply-To: <477D72FC.2000800@opengridcomputing.com> References: <477D72FC.2000800@opengridcomputing.com> Message-ID: <47809852.8010404@dev.mellanox.co.il> Steve Wise wrote: > Vlad, > > Please pull version 1.1.2 of libcxgb3 for ofed-1.2.5 and ofed-1.3. > This release fixes a segfault that can happen when running rdma apps > over chelsio's device on 32b platforms and distros (bug 680). > > Pull from: > > git://git.openfabrics.org/~swise/libcxgb3 ofed_1_2_5 > > and > > git://git.openfabrics.org/~swise/libcxgb3 ofed_1_3 > > > Thanks, > > Steve. > Done, Regards, Vladimir From vlad at lists.openfabrics.org Sun Jan 6 02:51:07 2008 From: vlad at lists.openfabrics.org (Vladimir Sokolovsky Mellanox) Date: Sun, 6 Jan 2008 02:51:07 -0800 (PST) Subject: [ofa-general] ofa_1_3_kernel 20080106-0200 daily build status Message-ID: <20080106105107.A9CCAE28A54@openfabrics.org> This email was generated automatically, please do not reply git_url: git://git.openfabrics.org/ofed_1_3/linux-2.6.git git_branch: ofed_kernel Common build parameters: --with-ipoib-mod --with-sdp-mod --with-srp-mod --with-user_mad-mod --with-user_access-mod --with-mthca-mod --with-mlx4-mod --with-core-mod --with-addr_trans-mod --with-rds-mod --with-cxgb3-mod --with-nes-mod Passed: Passed on i686 with 2.6.15-23-server Passed on i686 with linux-2.6.22 Passed on i686 with linux-2.6.17 Passed on i686 with linux-2.6.21.1 Passed on i686 with linux-2.6.18 Passed on i686 with linux-2.6.13 Passed on i686 with linux-2.6.16 Passed on i686 with linux-2.6.14 Passed on i686 with linux-2.6.15 Passed on i686 with linux-2.6.12 Passed on i686 with linux-2.6.19 Passed on x86_64 with linux-2.6.20 Passed on x86_64 with linux-2.6.18 
Passed on x86_64 with linux-2.6.16 Passed on ppc64 with linux-2.6.14 Passed on ppc64 with linux-2.6.17 Passed on ia64 with linux-2.6.21.1 Failed: Build failed on x86_64 with linux-2.6.17 Log: compilation terminated. /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.17_x86_64_check/net/rds/cong.o: No space left on device {standard input}: Assembler messages: {standard input}:29875: FATAL: Can't write /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.17_x86_64_check/net/rds/cong.o: Invalid argument make[3]: *** [/home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.17_x86_64_check/net/rds/cong.o] Error 1 make[2]: *** [/home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.17_x86_64_check/net/rds] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.17_x86_64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/x86_64/linux-2.6.17' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- Build failed on x86_64 with linux-2.6.12 Log: -Iinclude \ \ -D__KERNEL__ \ -include include/linux/autoconf.h \ -incmake[3]: *** [/home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.12_x86_64_check/drivers/net/cxgb3/mc5.o] Error 1 make[2]: *** [/home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.12_x86_64_check/drivers/net/cxgb3] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.12_x86_64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/x86_64/linux-2.6.12' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- Build failed on ia64 with linux-2.6.13 Build failed on powerpc with linux-2.6.14 Log: \ -Iarch/ppc -Iarch/ppc/include -Wall -Wundef -Wstrict-prototypes -Wno-trigraphs -fno-strict-aliasing -fno-common -ffreestanding -O2 -fomit-frame-pointer -g -Iarch/ppc -msoft-float -pipe -ffixed-r2 -mmultiple -mstring -Wa,-maltivec -Wdeclaration-after-statement -Wno-pointer-sign 
-DMODULE -DKBUILD_BASENAME=ib_rdma -DKBUILD_MODNAME=rds -c -o /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.14_powerpc_check/net/rds/.tmp_ib_rdma.o /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.14_powerpc_check/net/rds/ib_rdma.c /home/vlad/cross/powerpc/bin/powerpc-linux-ld -m elf32ppc -r -o /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.14_powerpc_check/net/rds/rds.o /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.14_powerpc_check/net/rds/af_rds.o /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.14_powerpc_check/net/rds/bind.o /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.14_powerpc_check/net/rds/cong.o /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.14_powerpc_check/net/rds/connection.o /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.14_powerpc_check/net/rds/info.o /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.14_powerpc_check/net/rds/message.o /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.14_powerpc_check/net/rds/recv.o /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.14_powerpc_check/net/rds/send.o /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.14_powerpc_check/net/rds/stats.o /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.14_powerpc_check/net/rds/sysctl.o /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.14_powerpc_check/net/rds/threads.o /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.14_powerpc_check/net/rds/transport.o /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.14_powerpc_check/net/rds/loop.o /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.14_powerpc_check/net/rds/page.o /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.14_powerpc_check/net/rds/rdma.o /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.14_powerpc_check/net/rds/tcp.o /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.14_powerpc_check/net/rds/tcp_connect.o /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.14_powerpc_check/net/rds/tcp_listen.o
/home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.14_powerpc_check/net/rds/tcp_send.o /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.14_powerpc_check/net/rds/tcp_stats.o /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.14_powerpc_check/net/rds/tcp_recv.o /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.14_powerpc_check/net/rds/ib.o /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.14_powerpc_check/net/rds/ib_cm.o /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.14_powerpc_check/net/rds/ib_recv.o /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.14_powerpc_check/net/rds/ib_ring.o /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.14_powerpc_check/net/rds/ib_send.o /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.14_powerpc_check/net/rds/ib_stats.o /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.14_powerpc_check/net/rds/ib_sysctl.o /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.14_powerpc_check/net/rds/ib_rdma.o /home/vlad/cross/powerpc/bin/powerpc-linux-ld: final link failed: No space left on device make[3]: *** [/home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.14_powerpc_check/net/rds/rds.o] Error 1 make[2]: *** [/home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.14_powerpc_check/net/rds] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.14_powerpc_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/powerpc/linux-2.6.14' make: *** [kernel] Error 2 Log: ---------------------------------------------------------------------------------- from /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.13_ia64_check/drivers/net/cxgb3/cxgb3_offload.c:34: include/asm/mmu_context.h:67: warning: type qualifiers ignored on function return type /home/vlad/cross/ia64/bin/ia64-linux-ld -r -T /home/vlad/kernel.org/ia64/linux-2.6.13/arch/ia64/module.lds -o /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.13_ia64_check/drivers/net/cxgb3/cxgb3.o
/home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.13_ia64_check/drivers/net/cxgb3/cxgb3_main.o /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.13_ia64_check/drivers/net/cxgb3/ael1002.o /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.13_ia64_check/drivers/net/cxgb3/vsc8211.o /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.13_ia64_check/drivers/net/cxgb3/t3_hw.o /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.13_ia64_check/drivers/net/cxgb3/mc5.o /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.13_ia64_check/drivers/net/cxgb3/xgmac.o /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.13_ia64_check/drivers/net/cxgb3/sge.o /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.13_ia64_check/drivers/net/cxgb3/l2t.o /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.13_ia64_check/drivers/net/cxgb3/cxgb3_offload.o /home/vlad/cross/ia64/bin/ia64-linux-ld: final link failed: No space left on device make[3]: *** [/home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.13_ia64_check/drivers/net/cxgb3/cxgb3.o] Error 1 make[2]: *** [/home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.13_ia64_check/drivers/net/cxgb3] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.13_ia64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/ia64/linux-2.6.13' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- Build failed on x86_64 with linux-2.6.18-8.el5 Log: {standard input}: Assembler messages: {standard input}:31966: FATAL: Can't write /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.18-8.el5_x86_64_check/drivers/net/mlx4/.tmp_mr.o: Invalid argument /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.18-8.el5_x86_64_check/drivers/net/mlx4/mr.c:636: fatal error: closing dependency file /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.18-8.el5_x86_64_check/drivers/net/mlx4/.mr.o.d: No space left on device compilation terminated.
make[3]: *** [/home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.18-8.el5_x86_64_check/drivers/net/mlx4/mr.o] Error 1 make[2]: *** [/home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.18-8.el5_x86_64_check/drivers/net/mlx4] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.18-8.el5_x86_64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/x86_64/linux-2.6.18-8.el5' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- Build failed on x86_64 with linux-2.6.16.21-0.8-smp Log: -I/home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.16.21-0.8-smp_x86_64_check/drivers/infiniband/ulp/srpt \ -I/home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.16.21-0.8-smp_x86_64_check/drivers/net/cxgb3 \ -Iinclude \ \ -Wall -Wundef -Wstrict-prototypes -Wno-trigraphs -Werror-implicit-function-declaration -fno-strict-aliasing -fno-common -ffreestanding -Os -fomit-frame-pointer -g -m64 -mno-red-zone -mcmodel=kernel -pipe -fno-reorder-blocks -Wno-sign-compare -fno-asynchronous-unwind-tables -funit-at-a-time -mno-sse -mno-mmx -mno-sse2 -mno-3dnow -Wdeclaration-after-statement -Wno-pointer-sign -DMODULE -D"KBUILD_STR(s)=#s" -D"KBUILD_BASENAME=KBUILD_STR(mthca_allocator)" -D"KBUILD_MODNAME=KBUILD_STR(ib_mthca)" -c -o /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.16.21-0.8-smp_x86_64_check/drivers/infiniband/hw/mthca/.tmp_mthca_allocator.o /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.16.21-0.8-smp_x86_64_check/drivers/infiniband/hw/mthca/mthca_allocator.c /home/vlad/cross/x86make[2]: *** [/home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.16.21-0.8-smp_x86_64_check/drivers/infiniband] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.16.21-0.8-smp_x86_64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/x86_64/linux-2.6.16.21-0.8-smp' make: *** [kernel] Error 2 
---------------------------------------------------------------------------------- Build failed on ia64 with linux-2.6.23 Log: -I/home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.23_ia64_check/drivers/infiniband/debug \ -I/usr/local/include/scst \ -I/home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.23_ia64_check/drivers/infiniband/ulp/srpt \ -I/home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.23_ia64_check/drivers/net/cxgb3 \ -Iinclude \ \ -DHAVE_WORKING_TEXT_ALIGN -DHAVE_MODEL_SMALL_ATTRIBUTE -DHAVE_SERIALIZE_DIRECTIVE -Wall -Wundef -Wstrict-prototypes -Wno-trigraphs -fno-strict-aliasing -fno-common -Werror-implicit-function-declaration -Os -pipe -ffixed-r13 -mfixed-range=f12-f15,f32-f127 -falign-functions=32 -frename-registers -fno-optimize-sibling-calls -fomit-frmake[1]: *** [_module_/home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.23_ia64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/ia64/linux-2.6.23' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- Build failed on ia64 with linux-2.6.18 Log: -Iinclude \ \ -DHAVE_WORKING_TEXT_ALIGN -DHAVE_MODEL_SMALL_ATTRIBUTE -DHAVE_SERIALIZE_DIRECTIVE -Wall -Wundef -Wstrict-prototypes -Wno-trigraphs -fno-strict-aliasing -fno-common -Os -pipe -ffixed-r13 -mfixed-range=f12-f15,f32-f127 -falign-functions=32 -frename-registers -fno-optimize-sibling-calls -fomit-frame-pointer -g -fno-stack-protector -Wdeclaration-after-statement -Wno-pointer-sign -D"KBUILD_STR(s)=#s" -D"KBUILD_BASENAME=KBUILD_STR(ib_mthca.mod)" -D"KBUILD_MODNAME=KBUILD_STR(ib_mthca)" -DMODULE -c -o /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.18_ia64_check/drivers/infiniband/hw/mthca/ib_mthca.mod.o /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.18_ia64_check/drivers/infiniband/hw/mthca/ib_mthca.mod.c /home/vlad/cross/ia64/bin/ia64-linux-ld -r -T /home/vlad/kernel.org/ia64/linux-2.6.18/arch/ia64/module.lds -o 
/home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.18_ia64_check/drivers/infiniband/hw/mthca/ib_mthca.ko /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.18_ia64_check/drivers/infiniband/hw/mthca/ib_mthca.o /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.18_ia64_check/drivers/infiniband/hw/mthca/ib_mthca.mod.o /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.18_ia64_check/drivers/infiniband/hw/mthca/ib_mthca.ko: final close failed: No space left on device make[2]: *** [/home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.18_ia64_check/drivers/infiniband/hw/mthca/ib_mthca.ko] Error 1 make[1]: *** [modules] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/ia64/linux-2.6.18' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- Build failed on ia64 with linux-2.6.16 Log: -I/home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.16_ia64_check/drivers/net/cxgb3 \ -Iinclude \ \ -DHAVE_WORKING_TEXT_ALIGN -DHAVE_MODEL_SMALL_ATTRIBUTE -DHAVE_SERIALIZE_DIRECTIVE -Wall -Wundef -Wstrict-prototypes -Wno-trigraphs -fno-strict-aliasing -fno-common -ffreestanding -Os -fomit-frame-pointer -g -pipe -ffixed-r13 -mfixed-range=f12-f15,f32-f127 -falign-functions=32 -frename-registers -fno-optimize-sibling-calls -Wdeclaration-after-statement -Wno-pointer-sign -DMODULE -D"KBUILD_STR(s)=#s" -D"KBUILD_BASENAME=KBUILD_STR(ib_stats)" -D"KBUILD_MODNAME=KBUILD_STR(rds)" -c -o /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.16_ia64_check/net/rds/.tmp_ib_stats.o /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.16_ia64_check/net/rds/ib_stats.c /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.16_ia64_check/net/rds/.tmp_ib_stats.o: No space left on device {standard input}: Assembler messages: {standard input}:26079: FATAL: Can't write /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_limake[1]: *** [_module_/home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.16_ia64_check] Error 2 make[1]: Leaving 
directory `/home/vlad/kernel.org/ia64/linux-2.6.16' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- Build failed on ia64 with linux-2.6.15 Log: -Iinclude \ \ -DHAVE_WORKING_TEXT_ALIGN -DHAVE_MODEL_SMALL_ATTRIBUTE -DHAVE_SERIALIZE_DIRECTIVE -Wall -Wundef -Wstrict-prototypes -Wno-trigraphs -fno-strict-aliasing -fno-common -ffreestanding -Os -fomit-frame-pointer -g -pipe -ffixed-r13 -mfixed-range=f12-f15,f32-f127 -falign-functions=32 -frename-registers -fno-optimize-sibling-calls -Wdeclaration-after-statement -Wno-pointer-sign -DMODULE -DKBUILD_BASENAME=cxgb3_offload -DKBUILD_MODNAME=cxgb3 -c -o /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.15_ia64_check/drivers/net/cxgb3/.tmp_cxgb3_offload.o /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.15_ia64_check/drivers/net/cxgb3/cxgb3_offload.c /home/vlad/cross/ia64/bin/ia64-linux-ld: final link failed: No space left on device make[3]: *** [/home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.15_ia64_check/drivers/net/cxgb3/cxgb3_offload.o] Error 1 make[2]: *** [/home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.15_ia64_check/drivers/net/cxgb3] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.15_ia64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/ia64/linux-2.6.15' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- Build failed on ia64 with linux-2.6.17 Log: compilation terminated. 
/home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.17_ia64_check/net/rds/.tmp_tcp_send.o: No space left on device {standard input}: Assembler messages: {standard input}:30857: FATAL: Can't write /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.17_ia64_check/net/rds/.tmp_tcp_send.o: Invalid argument make[3]: *** [/home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.17_ia64_check/net/rds/tcp_send.o] Error 1 make[2]: *** [/home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.17_ia64_check/net/rds] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.17_ia64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/ia64/linux-2.6.17' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- Build failed on ia64 with linux-2.6.12 Log: from /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.12_ia64_check/drivers/net/mlx4/cq.c:41: include/asm/mmu_context.h:67: warning: type qualifiers ignored on function return type /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.12_ia64_check/kernel_addons/backport/2.6.12/include/linux/device.h:44: warning: 'class_create' defined but not used /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.12_ia64_check/kernel_addons/backport/2.6.12/include/linux/device.h:78: warning: 'class_destroy' defined but not used /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.12_ia64_check/kernel_addons/backport/2.6.12/include/linux/device.h:104: warning: 'class_device_create' defined but not used /home/vlad/tmp/ofa_1_3_kernel-2008010make[2]: *** [/home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.12_ia64_check/drivers/net/mlx4] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.12_ia64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/ia64/linux-2.6.12' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- Build failed on x86_64 with 
linux-2.6.16.43-0.3-smp Log: -Iinclude \ \ -Wall -Wundef -Wstrict-prototypes -Wno-trigraphs -Werror-implicit-function-declaration -fno-strict-aliasing -fno-common -ffreestanding -Os -m64 -mno-red-zone -mcmodel=kernel -pipe -fno-reorder-blocks -Wno-sign-compare -fno-asynchronous-unwind-tables -funit-at-a-time -mno-sse -mno-mmx -mno-sse2 -mno-3dnow -fomit-frame-pointer -g -fno-stack-protector -Wdeclaration-after-statement -Wno-pointer-sign -DMODULE -D"KBUILD_STR(s)=#s" -D"KBUILD_BASENAME=KBUILD_STR(iser_memory)" -D"KBUILD_MODNAME=KBUILD_STR(ib_iser)" -c -o /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.16.43-0.3-smp_x86_64_check/drivers/infiniband/ulp/iser/.tmp_iser_memory.o /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.16.43-0.3-smp_x86_64_check/drivers/infiniband/ulp/iser/iser_memory.c /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.16.43-0.3-smp_x86_64_check/drivers/infiniband/ulp/iser/iser_memory.c:488: fatal error: closing dependency file /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.16.43-0.3-smp_x86_64_check/drivers/infiniband/ulp/iser/.iser_memory.o.d: No space left on device compilation terminated. 
/home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.16.43-0.3-smp_x86_make[2]: *** [/home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.16.43-0.3-smp_x86_64_check/drivers/infiniband] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.16.43-0.3-smp_x86_64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/x86_64/linux-2.6.16.43-0.3-smp' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- Build failed on powerpc with linux-2.6.15 Log: {standard input}: Assembler messages: {standard input}:26660: FATAL: Can't write /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.15_powerpc_check/drivers/net/cxgb3/.tmp_vsc8211.o: Invalid argument /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.15_powerpc_check/drivers/net/cxgb3/vsc8211.c:228: fatal error: closing dependency file /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.15_powerpc_check/drivers/net/cxgb3/.vsc8211.o.d: No space left on device compilation terminated. 
make[3]: *** [/home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.15_powerpc_check/drivers/net/cxgb3/vsc8211.o] Error 1 make[2]: *** [/home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.15_powerpc_check/drivers/net/cxgb3] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.15_powerpc_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/powerpc/linux-2.6.15' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- Build failed on x86_64 with linux-2.6.9-42.ELsmp Log: /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.9-42.ELsmp_x86_64_check/drivers/infiniband/ulp/ipoib/.tmp_ipoib_fs.o: No space left on device {standard input}: Assembler messages: {standard input}:29245: FATAL: Can't write /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.9-42.ELsmp_x86_64_check/drivers/infiniband/ulp/ipoib/.tmp_ipoib_fs.o: Invalid argument make[4]: *** [/home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.9-42.ELsmp_x86_64_check/drivers/infiniband/ulp/ipoib/ipoib_fs.o] Error 1 make[3]: *** [/home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.9-42.ELsmp_x86_64_check/drivers/infiniband/ulp/ipoib] Error 2 make[2]: *** [/home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.9-42.ELsmp_x86_64_check/drivers/infiniband] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.9-42.ELsmp_x86_64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/x86_64/linux-2.6.9-42.ELsmp' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- Build failed on powerpc with linux-2.6.13 Log: compilation terminated. 
/home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.13_powerpc_check/net/rds/.tmp_bind.o: No space left on device {standard input}: Assembler messages: {standard input}:26070: FATAL: Can't write /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.13_powerpc_check/net/rds/.tmp_bind.o: Invalid argument make[3]: *** [/home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.13_powerpc_check/net/rds/bind.o] Error 1 make[2]: *** [/home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.13_powerpc_check/net/rds] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.13_powerpc_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/powerpc/linux-2.6.13' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- Build failed on ia64 with linux-2.6.14 Log: compilation terminated. /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.14_ia64_check/net/rds/.tmp_tcp_listen.o: No space left on device {standard input}: Assembler messages: {standard input}:30915: FATAL: Can't write /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.14_ia64_check/net/rds/.tmp_tcp_listen.o: Invalid argument make[3]: *** [/home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.14_ia64_check/net/rds/tcp_listen.o] Error 1 make[2]: *** [/home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.14_ia64_check/net/rds] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.14_ia64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/ia64/linux-2.6.14' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- Build failed on powerpc with linux-2.6.12 Log: -Iinclude \ \ -Iarch/ppc -Wall -Wstrict-prototypes -Wno-trigraphs -fno-strict-aliasing -fno-common -ffreestanding -O2 -fomit-frame-pointer -g -Iarch/ppc -msoft-float -pipe -ffixed-r2 -mmultiple -mstring -Wa,-maltivec -Wdeclaration-after-statement -Wno-pointer-sign 
-DKBUILD_BASENAME=ib_umad -DKBUILD_MODNAME=ib_umad -DMODULE -c -o /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.12_powerpc_check/drivers/infiniband/core/ib_umad.mod.o /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.12_powerpc_check/drivers/infiniband/core/ib_umad.mod.c /home/vlad/cross/powerpc/bin/powerpc-linux-ld -m elf32ppc -r -o /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.12_powerpc_check/drivers/infiniband/core/ib_umad.ko /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.12_powerpc_check/drivers/infiniband/core/ib_umad.o /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.12_powerpc_check/drivers/infiniband/core/ib_umad.mod.o /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.12_powerpc_check/drivers/infiniband/core/ib_umad.ko: final close failed: No space left on device make[2]: *** [/home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.12_powerpc_check/drivers/infiniband/core/ib_umad.ko] Error 1 make[1]: *** [modules] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/powerpc/linux-2.6.12' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- Build failed on ia64 with linux-2.6.16.21-0.8-default Log: -I/home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.16.21-0.8-default_ia64_check/drivers/infiniband/ulp/srpt \ -I/home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.16.21-0.8-default_ia64_check/drivers/net/cxgb3 \ -Iinclude \ \ -DHAVE_WORKING_TEXT_ALIGN -DHAVE_MODEL_SMALL_ATTRIBUTE -DHAVE_SERIALIZE_DIRECTIVE -Wall -Wundef -Wstrict-prototypes -Wno-trigraphs -Werror-implicit-function-declaration -fno-strict-aliasing -fno-common -ffreestanding -Os -fomit-frame-pointer -g -pipe -ffixed-r13 -mfixed-range=f12-f15,f32-f127 -falign-functions=32 -frename-registers -fno-optimize-sibling-calls -Wdeclaration-after-statement -Wno-pointer-sign -DMODULE -D"KBUILD_STR(s)=#s" -D"KBUILD_BASENAME=Kmake[3]: *** 
[/home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.16.21-0.8-default_ia64_check/drivers/infiniband/ulp/ipoib] Error 2 make[2]: *** [/home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.16.21-0.8-default_ia64_check/drivers/infiniband] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.16.21-0.8-default_ia64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/ia64/linux-2.6.16.21-0.8-default' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- Build failed on ppc64 with linux-2.6.18-8.el5 Log: /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.18-8.el5_ppc64_check/drivers/infiniband/hw/nes/.tmp_nes.o: No space left on device {standard input}: Assembler messages: {standard input}:49995: FATAL: Can't write /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.18-8.el5_ppc64_check/drivers/infiniband/hw/nes/.tmp_nes.o: Invalid argument make[4]: *** [/home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.18-8.el5_ppc64_check/drivers/infiniband/hw/nes/nes.o] Error 1 make[3]: *** [/home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.18-8.el5_ppc64_check/drivers/infiniband/hw/nes] Error 2 make[2]: *** [/home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.18-8.el5_ppc64_check/drivers/infiniband] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.18-8.el5_ppc64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/ppc64/linux-2.6.18-8.el5' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- Build failed on x86_64 with linux-2.6.21.1 Log: /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.21.1_x86_64_check/drivers/infiniband/hw/nes/nes_cm.c:3087: warning: assignment from incompatible pointer type /home/vlad/cross/x86_64/bin/x86_64-linux-ld -m elf_x86_64 -r -o /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.21.1_x86_64_check/drivers/infiniband/hw/nes/iw_nes.o 
/home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.21.1_x86_64_check/drivers/infiniband/hw/nes/nes.o /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.21.1_x86_64_check/drivers/infiniband/hw/nes/nes_hw.o /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.21.1_x86_64_check/drivers/infiniband/hw/nes/nes_nic.o /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.21.1_x86_64_check/drivers/infiniband/hw/nes/nes_utils.o /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.21.1_x86_64_check/drivers/infiniband/hw/nes/nes_verbs.o /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.21.1_x86_64_check/drivers/infiniband/hw/nes/nes_cm.o /home/vlad/cross/x86_64/bin/x86_64-linux-ld: final link failed: No space left on device make[4]: *** [/home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.21.1_x86_64_check/drivers/infiniband/hw/nes/iw_nes.o] Error 1 make[3]: *** [/home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.21.1_x86_64_check/drivers/infiniband/hw/nes] Error 2 make[2]: *** [/home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.21.1_x86_64_check/drivers/infiniband] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.21.1_x86_64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/x86_64/linux-2.6.21.1' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- Build failed on ppc64 with linux-2.6.19 Log: -I/usr/local/include/scst \ -I/home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.19_ppc64_check/drivers/infiniband/ulp/srpt \ -I/home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.19_ppc64_check/drivers/net/cxgb3 \ -Iinclude \ \ -Wall -Wundef -Wstrict-prototypes -Wno-trigraphs -fno-strict-aliasing -fno-common -Os -msoft-float -pipe -mminimal-toc -mtraceback=none -mcall-aixdesc -mtune=power4 -mno-altivec -funit-at-a-time -mstring -Wa,-maltivec -fomit-frame-pointer -g -fno-stack-protector -Wdeclaration-after-statement -Wno-pointer-sign -DMODULE 
-D"KBUILD_STR(s)=#s" -D"KBUILD_BASENAME=KBUILD_STR(info)" -D"KBUILD_MODNAME=KBUILD_STR(rds)" -c -o /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.19_ppc64_check/net/rds/.tmp_info.o /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.19_ppc64_check/net/rds/info.c /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_make[1]: *** [_module_/home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.19_ppc64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/ppc64/linux-2.6.19' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- Build failed on ppc64 with linux-2.6.18 Log: \ -Wall -Wundef -Wstrict-prototypes -Wno-trigraphs -fno-strict-aliasing -fno-common -Os -msoft-float -pipe -mminimal-toc -mtraceback=none -mcall-aixdesc -mtune=power4 -mno-altivec -funit-at-a-time -mstring -Wa,-maltivec -fomit-frame-pointer -g -fno-stack-protector -Wdeclaration-after-statement -Wno-pointer-sign -DMODULE -D"KBUILD_STR(s)=#s" -D"KBUILD_BASENAME=KBUILD_STR(xrcd)" -D"KBUILD_MODNAME=KBUILD_STR(mlx4_core)" -c -o /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.18_ppc64_check/drivers/net/mlx4/.tmp_xrcd.o /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.18_ppc64_check/drivers/net/mlx4/xrcd.c /home/vlad/cross/ppc64/bin/ppc64-linux-ld -m elf64ppc -r -o /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.18_ppc64_check/drivers/net/mlx4/mlx4_core.o /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.18_ppc64_check/drivers/net/mlx4/alloc.o /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.18_ppc64_check/drivers/net/mlx4/catas.o /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.18_ppc64_check/drivers/net/mlx4/cmd.o /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.18_ppc64_check/drivers/net/mlx4/cq.o /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.18_ppc64_check/drivers/net/mlx4/eq.o /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.18_ppc64_check/drivers/net/mlx4/fw.o 
/home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.18_ppc64_check/drivers/net/mlx4/icm.o /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.18_ppc64_check/drivers/net/mlx4/intf.o /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.18_ppc64_check/drivers/net/mlx4/main.o /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.18_ppc64_check/drivers/net/mlx4/mcg.o /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.18_ppc64_check/drivers/net/mlx4/mr.o /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.18_ppc64_check/drivers/net/mlx4/pd.o /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.18_ppc64_check/drivers/net/mlx4/profile.o /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.18_ppc64_check/drivers/net/mlx4/qp.o /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.18_ppc64_check/drivers/net/mlx4/reset.o /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.18_ppc64_check/drivers/net/mlx4/srq.o /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.18_ppc64_check/drivers/net/mlx4/xrcd.o /home/vlad/cross/ppc64/bin/ppc64-linux-ld: final link failed: No space left on device make[3]: *** [/home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.18_ppc64_check/drivers/net/mlx4/mlx4_core.o] Error 1 make[2]: *** [/home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.18_ppc64_check/drivers/net/mlx4] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.18_ppc64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/ppc64/linux-2.6.18' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- Build failed on ia64 with linux-2.6.19 Log: -I/home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.19_ia64_check/drivers/infiniband/ulp/srpt \ -I/home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.19_ia64_check/drivers/net/cxgb3 \ -Iinclude \ \ -DHAVE_WORKING_TEXT_ALIGN -DHAVE_MODEL_SMALL_ATTRIBUTE -DHAVE_SERIALIZE_DIRECTIVE -Wall -Wundef -Wstrict-prototypes -Wno-trigraphs 
-fno-strict-aliasing -fno-common -Os -pipe -ffixed-r13 -mfixed-range=f12-f15,f32-f127 -falign-functions=32 -frename-registers -fno-optimize-sibling-calls -fomit-frame-pointer -g -fno-stack-protector -Wdeclaration-after-statement -Wno-pointer-sign -DMODULE -D"KBUILD_STR(s)=#s" -D"KBUILD_BASENAME=KBUILD_STR(ib_rdma)" -D"KBUILD_MODNAME=KBUILD_STR(rds)" -c -o /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.19_ia64_check/net/rds/.tmp_ib_rdma.o /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.19_ia64_check/net/rds/ib_rdma.c /home/vlad/cross/ia64/bin/ia64-linux-ld -r -T /home/vlad/kernel.org/ia64/linux-2.6.19/arch/ia64/module.lds -o /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.19_ia64_check/net/rds/rds.o /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.19_ia64_check/net/rds/af_rds.o /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.19_ia64_check/net/rds/bind.o /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.19_ia64_check/net/rds/cong.o /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.19_ia64_check/net/rds/connection.o /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.19_ia64_check/net/rds/info.o /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.19_ia64_check/net/rds/message.o /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.19_ia64_check/net/rds/recv.o /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.19_ia64_check/net/rds/send.o /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.19_ia64_check/net/rds/stats.o /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.19_ia64_check/net/rds/sysctl.o /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.19_ia64_check/net/rds/threads.o /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.19_ia64_check/net/rds/transport.o /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.19_ia64_check/net/rds/loop.o /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.19_ia64_check/net/rds/page.o /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.19_ia64_check/net/rds/rdma.o 
/home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.19_ia64_check/net/rds/tcp.o /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.19_ia64_check/net/rds/tcp_connect.o /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.19_ia64_check/net/rds/tcp_listen.o /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.19_ia64_check/net/rds/tcp_send.o /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.19_ia64_check/net/rds/tcp_stats.o /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.19_ia64_check/net/rds/tcp_recv.o /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.19_ia64_check/net/rds/ib.o /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.19_ia64_check/net/rds/ib_cm.o /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.19_ia64_check/net/rds/ib_recv.o /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.19_ia64_check/net/rds/ib_ring.o /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.19_ia64_check/net/rds/ib_send.o /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.19_ia64_check/net/rds/ib_stats.o /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.19_ia64_check/net/rds/ib_sysctl.o /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.19_ia64_check/net/rds/ib_rdma.o /home/vlad/cross/ia64/bin/ia64-linux-ld: final link failed: No space left on device make[3]: *** [/home/vlad/tmp/ofa_1_3_kernel-20080106-0200make[1]: Leaving directory `/home/vlad/kernel.org/ia64/linux-2.6.19' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- Build failed on ia64 with linux-2.6.22 Log: /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.22_ia64_check/drivers/infiniband/hw/mlx4/.tmp_mad.o: No space left on device {standard input}: Assembler messages: {standard input}:30989: FATAL: Can't write /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.22_ia64_check/drivers/infiniband/hw/mlx4/.tmp_mad.o: Invalid argument make[4]: *** 
[/home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.22_ia64_check/drivers/infiniband/hw/mlx4/mad.o] Error 1 make[3]: *** [/home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.22_ia64_check/drivers/infiniband/hw/mlx4] Error 2 make[2]: *** [/home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.22_ia64_check/drivers/infiniband] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.22_ia64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/ia64/linux-2.6.22' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- Build failed on x86_64 with linux-2.6.18-53.el5 Log: compilation terminated. /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.18-53.el5_x86_64_check/drivers/net/mlx4/.tmp_fw.o: No space left on device {standard input}: Assembler messages: {standard input}:34000: FATAL: Can't write /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.18-53.el5_x86_64_check/drivers/net/mlx4/.tmp_fw.o: Invalid argument make[3]: *** [/home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.18-53.el5_x86_64_check/drivers/net/mlx4/fw.o] Error 1 make[2]: *** [/home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.18-53.el5_x86_64_check/drivers/net/mlx4] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.18-53.el5_x86_64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/x86_64/linux-2.6.18-53.el5' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- Build failed on x86_64 with linux-2.6.13 Log: from /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.13_x86_64_check/drivers/net/mlx4/mlx4_core.mod.c:1: include/asm/apic.h: In function 'apic_write_atomic': include/asm/apic.h:47: warning: value computed is not used /home/vlad/cross/x86_64/bin/x86_64-linux-ld -m elf_x86_64 -r -o /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.13_x86_64_check/drivers/net/mlx4/mlx4_core.ko 
/home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.13_x86_64_check/drivers/net/mlx4/mlx4_core.o /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.13_x86_64_check/drivers/net/mlx4/mlx4_core.mod.o /home/vlad/cross/x86_64/bin/x86_64-linux-ld: final link failed: No space left on device make[2]: *** [/home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.13_x86_64_check/drivers/net/mlx4/mlx4_core.ko] Error 1 make[1]: *** [modules] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/x86_64/linux-2.6.13' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- Build failed on ppc64 with linux-2.6.15 Log: Build failed on x86_64 with linux-2.6.15 Log: -Iinclude \ \ -Wall -Wundef -Wstrict-prototypes -Wno-trigraphs -fno-strict-aliasing -fno-common -ffreestanding -Os -fomit-frame-pointer -g -msoft-float -pipe -mminimal-toc -mtraceback=none -mcall-aixdesc -mtune=power4 -mno-altivec -funit-at-a-time -mstring -Wa,-maltivec -Wdeclaration-after-statement -Wno-pointer-sign -DKBUILD_BASENAME=rds -DKBUILD_MODNAME=rds -DMODULE -c -o /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.15_ppc64_check/net/rds/rds.mod.o /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.15_ppc64_check/net/rds/rds.mod.c /home/vlad/cross/ppc64/bin/ppc64-linux-ld -m elf64ppc -r -o /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.15_ppc64_check/net/rds/rds.ko /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.15_ppc64_check/net/rds/rds.o /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.15_ppc64_check/net/rds/rds.mod.o /home/vlad/cross/ppc64/bin/ppc64-linux-ld: final link failed: No space left on device make[2]: *** [/home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.15_ppc64_check/net/rds/rds.ko] Error 1 make[1]: *** [modules] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/ppc64/linux-2.6.15' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- 
/home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.15_x86_64_check/drivers/infiniband/hw/mthca/mthca_cq.o: No space left on device {standard input}: Assembler messages: {standard input}:38307: FATAL: Can't write /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.15_x86_64_check/drivers/infiniband/hw/mthca/mthca_cq.o: Invalid argument make[4]: *** [/home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.15_x86_64_check/drivers/infiniband/hw/mthca/mthca_cq.o] Error 1 make[3]: *** [/home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.15_x86_64_check/drivers/infiniband/hw/mthca] Error 2 make[2]: *** [/home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.15_x86_64_check/drivers/infiniband] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.15_x86_64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/x86_64/linux-2.6.15' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- Build failed on x86_64 with linux-2.6.18-1.2798.fc6 Log: -Wall -Wundef -Wstrict-prototypes -Wno-trigraphs -fno-strict-aliasing -fno-common -Wstrict-prototypes -Wundef -Werror-implicit-function-declaration -Os -m64 -mno-red-zone -mcmodel=kernel -pipe -fno-reorder-blocks -Wno-sign-compare -funit-at-a-time -mno-sse -mno-mmx -mno-sse2 -mno-3dnow -fomit-frame-pointer -fasynchronous-unwind-tables -g -fno-stack-protector -Wdeclaration-after-statement -Wno-pointer-sign -DMODULE -D"KBUILD_STR(s)=#s" -D"KBUILD_BASENAME=KBUILD_STR(ipoib_fs)" -D"KBUILD_MODNAME=KBUILD_STR(ib_ipoib)" -c -o /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.18-1.2798.fc6_x86_64_check/drivers/infiniband/ulp/ipoib/.tmp_ipoib_fs.o /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.18-1.2798.fc6_x86_64_check/drivers/infiniband/ulp/ipoib/ipoib_fs.c /home/vlad/cross/x86_64/bin/x86_64-linux-ld -m elf_x86_64 -r -o 
/home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.18-1.2798.fc6_x86_64_check/drivers/infiniband/ulp/ipoib/ib_ipoib.o /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.18-1.2798.fc6_x86_64_check/drivers/infiniband/ulp/ipoib/ipoib_main.o /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.18-1.2798.fc6_x86_64_check/drivers/infiniband/ulp/ipoib/ipoib_ib.o /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.18-1.2798.fc6_x86_64_check/drivers/infiniband/ulp/ipoib/ipoib_multicast.o /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.18-1.2798.fc6_x86_64_check/drivers/infiniband/ulp/ipoib/ipoib_verbs.o /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.18-1.2798.fc6_x86_64_check/drivers/infiniband/ulp/ipoib/ipoib_vlan.o /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.18-1.2798.fc6_x86_64_check/drivers/infiniband/ulp/ipoib/ipoib_etool.o /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.18-1.2798.fc6_x86_64_check/drivers/infiniband/ulp/ipoib/ipoib_cm.o /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.18-1.2798.fc6_x86_64_check/drivers/infiniband/ulp/ipoib/ipoib_fs.o /home/vlad/cross/x86_64/bin/x86_64-linux-ld: final link failed: No space left on device make[4]: *** [/home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.18-1.2798.fc6_x86_64_check/drivers/infiniband/ulp/ipoib/ib_ipoib.o] Error 1 make[3]: *** [/home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.18-1.2798.fc6_x86_64_check/drivers/infiniband/ulp/ipoib] Error 2 make[2]: *** [/home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.18-1.2798.fc6_x86_64_check/drivers/infiniband] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.18-1.2798.fc6_x86_64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/x86_64/linux-2.6.18-1.2798.fc6' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- Build failed on x86_64 with linux-2.6.9-55.ELsmp Log: 
/home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.9-55.ELsmp_x86_64_check/drivers/infiniband/hw/mlx4/.tmp_wc.o: No space left on device {standard input}: Assembler messages: {standard input}:18473: FATAL: Can't write /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.9-55.ELsmp_x86_64_check/drivers/infiniband/hw/mlx4/.tmp_wc.o: Invalid argument make[4]: *** [/home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.9-55.ELsmp_x86_64_check/drivers/infiniband/hw/mlx4/wc.o] Error 1 make[3]: *** [/home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.9-55.ELsmp_x86_64_check/drivers/infiniband/hw/mlx4] Error 2 make[2]: *** [/home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.9-55.ELsmp_x86_64_check/drivers/infiniband] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.9-55.ELsmp_x86_64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/x86_64/linux-2.6.9-55.ELsmp' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- Build failed on ppc64 with linux-2.6.13 Log: compilation terminated. 
/home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.13_ppc64_check/drivers/net/cxgb3/.tmp_vsc8211.o: No space left on device {standard input}: Assembler messages: {standard input}:29210: FATAL: Can't write /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.13_ppc64_check/drivers/net/cxgb3/.tmp_vsc8211.o: Invalid argument make[3]: *** [/home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.13_ppc64_check/drivers/net/cxgb3/vsc8211.o] Error 1 make[2]: *** [/home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.13_ppc64_check/drivers/net/cxgb3] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.13_ppc64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/ppc64/linux-2.6.13' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- Build failed on ppc64 with linux-2.6.16 Log: -Iinclude \ \ -Wall -Wundef -Wstrict-prototypes -Wno-trigraphs -fno-strict-aliasing -fno-common -ffreestanding -Os -fomit-frame-pointer -g -msoft-float -pipe -mminimal-toc -mtraceback=none -mcall-aixdesc -mtune=power4 -mno-altivec -funit-at-a-time -mstring -Wa,-maltivec -Wdeclaration-after-statement -Wno-pointer-sign -D"KBUILD_STR(s)=#s" -D"KBUILD_BASENAME=KBUILD_STR(ib_cm)" -D"KBUILD_MODNAME=KBUILD_STR(ib_cm)" -DMODULE -c -o /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.16_ppc64_check/drivers/infiniband/core/ib_cm.mod.o /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.16_ppc64_check/drivers/infiniband/core/ib_cm.mod.c /home/vlad/cross/ppc64/bin/ppc64-linux-ld -m elf64ppc -r -o /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.16_ppc64_check/drivers/infiniband/core/ib_cm.ko /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.16_ppc64_check/drivers/infiniband/core/ib_cm.o /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.16_ppc64_check/drivers/infiniband/core/ib_cm.mod.o /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.16_ppc64_check/drivers/infiniband/core/ib_cm.ko: final 
close failed: No space left on device make[2]: *** [/home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.16_ppc64_check/drivers/infiniband/core/ib_cm.ko] Error 1 make[1]: *** [modules] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/ppc64/linux-2.6.16' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- Build failed on x86_64 with linux-2.6.19 Log: -Iinclude \ \ -Wall -Wundef -Wstrict-prototypes -Wno-trigraphs -fno-strict-aliasing -fno-common -Os -m64 -mno-red-zone -mcmodel=kernel -pipe -fno-reorder-blocks -Wno-sign-compare -funit-at-a-time -mno-sse -mno-mmx -mno-sse2 -mno-3dnow -maccumulate-outgoing-args -DCONFIG_AS_CFI=1 -DCONFIG_AS_CFI_SIGNAL_FRAME=1 -fomit-frame-pointer -fasynchronous-unwind-tables -g -fno-stack-protector -Wdeclaration-after-statement -Wno-pointer-sign -D"KBUILD_STR(s)=#s" -D"KBUILD_BASENAME=KBUILD_STR(ib_core.mod)" -D"KBUILD_MODNAME=KBUILD_STR(ib_core)" -DMODULE -c -o /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.19_x86_64_check/drivers/infiniband/core/ib_core.mod.o /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.19_x86_64_check/drivers/infiniband/core/ib_core.mod.c /home/vlad/cross/x86_64/bin/x86_64-linux-ld -m elf_x86_64 -r -o /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.19_x86_64_check/drivers/infiniband/core/ib_core.ko /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.19_x86_64_check/drivers/infiniband/core/ib_core.o /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.19_x86_64_check/drivers/infiniband/core/ib_core.mod.o /home/vlad/cross/x86_64/bin/x86_64-linux-ld: final link failed: No space left on device make[2]: *** [/home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.19_x86_64_check/drivers/infiniband/core/ib_core.ko] Error 1 make[1]: *** [modules] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/x86_64/linux-2.6.19' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- 
Build failed on x86_64 with linux-2.6.22.5-31-default Log: compilation terminated. /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.22.5-31-default_x86_64_check/drivers/net/cxgb3/.tmp_xgmac.o: No space left on device {standard input}: Assembler messages: {standard input}:42098: FATAL: Can't write /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.22.5-31-default_x86_64_check/drivers/net/cxgb3/.tmp_xgmac.o: Invalid argument make[3]: *** [/home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.22.5-31-default_x86_64_check/drivers/net/cxgb3/xgmac.o] Error 1 make[2]: *** [/home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.22.5-31-default_x86_64_check/drivers/net/cxgb3] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.22.5-31-default_x86_64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/x86_64/linux-2.6.22.5-31-default' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- Build failed on ppc64 with linux-2.6.12 Log: /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.12_ppc64_check/drivers/infiniband/core/ib_addr.mod.c:1: fatal error: closing dependency file /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.12_ppc64_check/drivers/infiniband/core/.ib_addr.mod.o.d: No space left on device compilation terminated. 
/home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.12_ppc64_check/drivers/infiniband/core/ib_addr.mod.o: No space left on device {standard input}: Assembler messages: {standard input}:50: FATAL: Can't write /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.12_ppc64_check/drivers/infiniband/core/ib_addr.mod.o: Invalid argument make[2]: *** [/home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.12_ppc64_check/drivers/infiniband/core/ib_addr.mod.o] Error 1 make[1]: *** [modules] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/ppc64/linux-2.6.12' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- Build failed on x86_64 with linux-2.6.14 Log: /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.14_x86_64_check/drivers/infiniband/hw/mthca/mthca_qp.c: At top level: /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.14_x86_64_check/drivers/infiniband/hw/mthca/mthca_qp.c:2342: fatal error: closing dependency file /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.14_x86_64_check/drivers/infiniband/hw/mthca/.mthca_qp.o.d: No space left on device compilation terminated. 
make[4]: *** [/home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.14_x86_64_check/drivers/infiniband/hw/mthca/mthca_qp.o] Error 1 make[3]: *** [/home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.14_x86_64_check/drivers/infiniband/hw/mthca] Error 2 make[2]: *** [/home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.14_x86_64_check/drivers/infiniband] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.14_x86_64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/x86_64/linux-2.6.14' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- Build failed on x86_64 with linux-2.6.22 Log: {standard input}: Assembler messages: {standard input}:30256: FATAL: Can't write /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.22_x86_64_check/net/rds/tcp_listen.o: Invalid argument /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.22_x86_64_check/net/rds/tcp_listen.c:204: fatal error: closing dependency file /home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.22_x86_64_check/net/rds/.tcp_listen.o.d: No space left on device compilation terminated. 
make[3]: *** [/home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.22_x86_64_check/net/rds/tcp_listen.o] Error 1 make[2]: *** [/home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.22_x86_64_check/net/rds] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_3_kernel-20080106-0200_linux-2.6.22_x86_64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/x86_64/linux-2.6.22' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- From tziporet at dev.mellanox.co.il Sun Jan 6 03:29:20 2008 From: tziporet at dev.mellanox.co.il (Tziporet Koren) Date: Sun, 06 Jan 2008 13:29:20 +0200 Subject: [ofa-general] Re: [ewg] OFED 1.3 timeline In-Reply-To: <477C3E22.7090006@sgi.com> References: <4693BF47.8070700@mellanox.co.il> <477C3E22.7090006@sgi.com> Message-ID: <4780BB90.2090803@mellanox.co.il> Edward Mascarenhas wrote: > Hi, What is the expected timeline for the remaining RCs and GA of OFED 1.3? > > Thanks, > Edward > > > > We will close it in the meeting tomorrow, but I think RC2 will be delayed for next week since the enhanced XRC API must be closed before this In general this data can always be found at our Wiki: https://wiki.openfabrics.org/tiki-index.php?page=OFED+1.3+release+plan+and+features Tziporet
From tziporet at dev.mellanox.co.il Sun Jan 6 03:34:07 2008 From: tziporet at dev.mellanox.co.il (Tziporet Koren) Date: Sun, 06 Jan 2008 13:34:07 +0200 Subject: [ofa-general] Re: [ewg] Re: [mvapich-discuss] Troubles building/installing OFEM 1.2 on Fedora Core 4 64-bit In-Reply-To: References: Message-ID: <4780BCAF.2020806@mellanox.co.il> Dhabaleswar Panda wrote: > Ben - Sorry to know that you are experiencing problems in > building/installing OFED 1.2 on Fedora Core 4 64 bit system. > > FYI, the latest released version of OFED 1.2 is OFED 1.2.5.4. > > Regarding your rpm build errors, I am forwarding your note to `ewg' and > `general' lists of Open Fabrics. More experienced users on these two > lists can give you prompt feedbacks and guidance on the basic OFED > installation issues. 
> > We do not support Fedora Core 4 with OFED 1.2 and 1.2.5 I suggest you move to Fedora Core 6 Tziporet From tziporet at dev.mellanox.co.il Sun Jan 6 03:58:03 2008 From: tziporet at dev.mellanox.co.il (Tziporet Koren) Date: Sun, 06 Jan 2008 13:58:03 +0200 Subject: [ofa-general] Bogus Receive Completions In-Reply-To: References: <475607AA.301@dls.net> Message-ID: <4780C24B.8080505@mellanox.co.il> Roland Dreier wrote: > > I added a little debugging patch to src/cq.c in libmthca, and I found > > that when the failure happened, the CQE had a WQE address that was out > > of sequence -- the RQ has size 0x200 with 0x20 byte WQEs, and the CQEs > > had WQE address 0x100 then WQE address 0x0; or address 0x0 then 0x140; > > or even 0x80 twice in a row. > > > > Mellanox: can you take this test case and see if it is indeed a > > firmware issue? I could believe that there is a bug in libmthca's > > mthca_tavor_post_recv() function too... > > Hi Tziporet -- any update about this issue (bad WQE address in CQE on > non-mem-free HCAs)? > > > We will work on this soon; sorry we had other urgent issues before :-( Tziporet
From sashak at voltaire.com Sun Jan 6 07:43:28 2008 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Sun, 6 Jan 2008 15:43:28 +0000 Subject: [ofa-general] [PATCH RFC] opensm/ib_types.h: remove ifdef WIN conditions Message-ID: <20080106154328.GA26304@sashak.voltaire.com> It was stated couple of times that in windows another instance of ib_types.h file is used. If so we don't need to keep those 'ifdef WIN' conditions here. Also this removes empty __ptr64 macro. Signed-off-by: Sasha Khapyorsky --- opensm/include/iba/ib_types.h | 36 ++++++++++-------------------------- 1 files changed, 10 insertions(+), 26 deletions(-) diff --git a/opensm/include/iba/ib_types.h b/opensm/include/iba/ib_types.h index 672184b..a438d8a 100644 --- a/opensm/include/iba/ib_types.h +++ b/opensm/include/iba/ib_types.h @@ -49,20 +49,9 @@ #endif /* __cplusplus */ BEGIN_C_DECLS -#if defined( WIN32 ) || defined( _WIN64 ) -#if defined( EXPORT_AL_SYMBOLS ) -#define OSM_EXPORT __declspec(dllexport) -#else -#define OSM_EXPORT __declspec(dllimport) -#endif -#define OSM_API __stdcall -#define OSM_CDECL __cdecl -#else #define OSM_EXPORT extern #define OSM_API #define OSM_CDECL -#define __ptr64 -#endif /****h* IBA Base/Constants * NAME * Constants @@ -8241,22 +8230,21 @@ typedef struct _ib_ioc_info { /* * The following definitions are shared between the Access Layer and VPD */ -typedef struct _ib_ca *__ptr64 ib_ca_handle_t; -typedef struct _ib_pd *__ptr64 ib_pd_handle_t; -typedef struct _ib_rdd *__ptr64 ib_rdd_handle_t; -typedef struct _ib_mr *__ptr64 ib_mr_handle_t; -typedef struct _ib_mw *__ptr64 ib_mw_handle_t; -typedef struct _ib_qp *__ptr64 ib_qp_handle_t; -typedef struct _ib_eec *__ptr64 ib_eec_handle_t; -typedef struct _ib_cq *__ptr64 ib_cq_handle_t; -typedef struct _ib_av *__ptr64 ib_av_handle_t; -typedef struct _ib_mcast *__ptr64 ib_mcast_handle_t; +typedef struct _ib_ca * ib_ca_handle_t; +typedef struct _ib_pd * ib_pd_handle_t; +typedef struct _ib_rdd * ib_rdd_handle_t; +typedef struct 
_ib_mr * ib_mr_handle_t; +typedef struct _ib_mw * ib_mw_handle_t; +typedef struct _ib_qp * ib_qp_handle_t; +typedef struct _ib_eec * ib_eec_handle_t; +typedef struct _ib_cq * ib_cq_handle_t; +typedef struct _ib_av * ib_av_handle_t; +typedef struct _ib_mcast * ib_mcast_handle_t; /* Currently for windows branch, use the extended version of ib special verbs struct in order to be compliant with Infinicon ib_types; later we'll change it to support OpenSM ib_types.h */ -#ifndef WIN32 /****d* Access Layer/ib_api_status_t * NAME * ib_api_status_t @@ -10710,8 +10698,4 @@ typedef struct _ib_ci_op { *****/ END_C_DECLS -#endif /* ndef WIN32 */ -#if defined( __WIN__ ) -#include -#endif #endif /* __IB_TYPES_H__ */ -- 1.5.3.4.206.g58ba4 From sashak at voltaire.com Sun Jan 6 07:49:19 2008 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Sun, 6 Jan 2008 15:49:19 +0000 Subject: [ofa-general] [PATCH] opensm: unify SM managers initializers Message-ID: <20080106154919.GB26304@sashak.voltaire.com> This unifies SM managers initializers, now instead of bunch of parameters only reference to SM object is passed to the initializers. Signed-off-by: Sasha Khapyorsky --- This patch is for master only yet. 
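A minimal sketch of the pattern described above, for readers skimming the diff: each manager's initializer takes one pointer to the owning SM object and copies the pointers it needs from it, instead of receiving every dependency as a separate argument. All names here (demo_sm, demo_mgr, demo_mgr_init and the demo_* member types) are hypothetical stand-ins chosen for illustration; they are not the OpenSM types, and the real field set is the one shown in the patch itself.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are NOT the real OpenSM types. */
struct demo_log  { int level; };
struct demo_subn { int port_count; };
struct demo_lock { int held; };

/* Owner object that groups the shared resources in one place. */
struct demo_sm {
	struct demo_log *p_log;
	struct demo_subn *p_subn;
	struct demo_lock *p_lock;
};

/* A manager keeps a back-pointer to its owner plus cached pointers. */
struct demo_mgr {
	struct demo_sm *sm;
	struct demo_log *p_log;
	struct demo_subn *p_subn;
	struct demo_lock *p_lock;
};

/* Old style: every dependency is a separate parameter. */
int demo_mgr_init_old(struct demo_mgr *p_mgr, struct demo_subn *p_subn,
		      struct demo_log *p_log, struct demo_lock *p_lock)
{
	assert(p_mgr != NULL);
	p_mgr->sm = NULL;	/* no owner is known in this style */
	p_mgr->p_log = p_log;
	p_mgr->p_subn = p_subn;
	p_mgr->p_lock = p_lock;
	return 0;
}

/* Unified style: only the owner is passed; members are derived from it. */
int demo_mgr_init(struct demo_mgr *p_mgr, struct demo_sm *sm)
{
	assert(p_mgr != NULL);
	assert(sm != NULL);
	p_mgr->sm = sm;
	p_mgr->p_log = sm->p_log;
	p_mgr->p_subn = sm->p_subn;
	p_mgr->p_lock = sm->p_lock;
	return 0;
}
```

With this shape, adding a new shared resource to the owner does not require touching every initializer prototype and call site; a manager that needs it reads it through its sm pointer.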
opensm/include/opensm/osm_drop_mgr.h | 31 ++++------- opensm/include/opensm/osm_lid_mgr.h | 27 +++------- opensm/include/opensm/osm_link_mgr.h | 23 +++----- opensm/include/opensm/osm_mcast_mgr.h | 23 +++----- opensm/include/opensm/osm_sm_state_mgr.h | 22 ++++---- opensm/include/opensm/osm_state_mgr.h | 74 ++++---------------------- opensm/include/opensm/osm_sweep_fail_ctrl.h | 24 +-------- opensm/include/opensm/osm_ucast_mgr.h | 23 +++----- opensm/opensm/osm_drop_mgr.c | 17 +++--- opensm/opensm/osm_lid_mgr.c | 28 ++++------- opensm/opensm/osm_link_mgr.c | 21 +++----- opensm/opensm/osm_mcast_mgr.c | 21 +++----- opensm/opensm/osm_sm.c | 40 +++------------ opensm/opensm/osm_sm_state_mgr.c | 21 +++---- opensm/opensm/osm_state_mgr.c | 57 ++++++-------------- opensm/opensm/osm_sweep_fail_ctrl.c | 18 +++---- opensm/opensm/osm_ucast_mgr.c | 21 +++----- 17 files changed, 150 insertions(+), 341 deletions(-) diff --git a/opensm/include/opensm/osm_drop_mgr.h b/opensm/include/opensm/osm_drop_mgr.h index 0f929e0..c9d881c 100644 --- a/opensm/include/opensm/osm_drop_mgr.h +++ b/opensm/include/opensm/osm_drop_mgr.h @@ -81,6 +81,7 @@ BEGIN_C_DECLS * Steve King, Intel * *********/ +struct osm_sm; /****s* OpenSM: Drop Manager/osm_drop_mgr_t * NAME * osm_drop_mgr_t @@ -94,6 +95,7 @@ BEGIN_C_DECLS * SYNOPSIS */ typedef struct _osm_drop_mgr { + struct osm_sm *sm; osm_subn_t *p_subn; osm_log_t *p_log; osm_req_t *p_req; @@ -102,6 +104,9 @@ typedef struct _osm_drop_mgr { } osm_drop_mgr_t; /* * FIELDS +* sm +* Pointer to the SM object. +* * p_subn * Pointer to the Subnet object for this subnet. 
* @@ -188,38 +193,24 @@ void osm_drop_mgr_destroy(IN osm_drop_mgr_t * const p_mgr); * * SYNOPSIS */ -ib_api_status_t osm_drop_mgr_init(IN osm_drop_mgr_t * const p_mgr, - IN osm_subn_t * const p_subn, - IN osm_log_t * const p_log, - IN osm_req_t * const p_req, - IN cl_plock_t * const p_lock); +ib_api_status_t +osm_drop_mgr_init(IN osm_drop_mgr_t * const p_mgr, struct osm_sm * sm); /* * PARAMETERS * p_mgr * [in] Pointer to an osm_drop_mgr_t object to initialize. * -* p_subn -* [in] Pointer to the Subnet object for this subnet. -* -* p_log -* [in] Pointer to the log object. -* -* p_req -* [in] Pointer to an osm_req_t object. -* -* p_lock -* [in] Pointer to the OpenSM serializing lock. +* sm +* [in] Pointer to the SM object. * * RETURN VALUES -* IB_SUCCESS if the Drop Manager object was initialized -* successfully. +* IB_SUCCESS if the Drop Manager object was initialized successfully. * * NOTES * Allows calling other Drop Manager methods. * * SEE ALSO -* Drop Manager object, osm_drop_mgr_construct, -* osm_drop_mgr_destroy +* Drop Manager object, osm_drop_mgr_construct, osm_drop_mgr_destroy *********/ /****f* OpenSM: Drop Manager/osm_drop_mgr_process diff --git a/opensm/include/opensm/osm_lid_mgr.h b/opensm/include/opensm/osm_lid_mgr.h index 9a7d0e3..d76cf56 100644 --- a/opensm/include/opensm/osm_lid_mgr.h +++ b/opensm/include/opensm/osm_lid_mgr.h @@ -83,6 +83,7 @@ BEGIN_C_DECLS * Steve King, Intel * *********/ +struct osm_sm; /****s* OpenSM: LID Manager/osm_lid_mgr_t * NAME * osm_lid_mgr_t @@ -96,6 +97,7 @@ BEGIN_C_DECLS * SYNOPSIS */ typedef struct _osm_lid_mgr { + struct osm_sm *sm; osm_subn_t *p_subn; osm_db_t *p_db; osm_req_t *p_req; @@ -108,6 +110,9 @@ typedef struct _osm_lid_mgr { } osm_lid_mgr_t; /* * FIELDS +* sm +* Pointer to the SM object. +* * p_subn * Pointer to the Subnet object for this subnet. 
* @@ -214,30 +219,14 @@ void osm_lid_mgr_destroy(IN osm_lid_mgr_t * const p_mgr); * SYNOPSIS */ ib_api_status_t -osm_lid_mgr_init(IN osm_lid_mgr_t * const p_mgr, - IN osm_req_t * const p_req, - IN osm_subn_t * const p_subn, - IN osm_db_t * const p_db, - IN osm_log_t * const p_log, IN cl_plock_t * const p_lock); +osm_lid_mgr_init(IN osm_lid_mgr_t * const p_mgr, IN struct osm_sm * sm); /* * PARAMETERS * p_mgr * [in] Pointer to an osm_lid_mgr_t object to initialize. * -* p_req -* [in] Pointer to the attribute Requester object. -* -* p_subn -* [in] Pointer to the Subnet object for this subnet. -* -* p_db -* [in] Pointer to the database object. -* -* p_log -* [in] Pointer to the log object. -* -* p_lock -* [in] Pointer to the OpenSM serializing lock. +* sm +* [in] Pointer to the SM object for this subnet. * * RETURN VALUES * CL_SUCCESS if the LID Manager object was initialized diff --git a/opensm/include/opensm/osm_link_mgr.h b/opensm/include/opensm/osm_link_mgr.h index 11a7352..c9cd796 100644 --- a/opensm/include/opensm/osm_link_mgr.h +++ b/opensm/include/opensm/osm_link_mgr.h @@ -81,6 +81,7 @@ BEGIN_C_DECLS * Steve King, Intel * *********/ +struct osm_sm; /****s* OpenSM: Link Manager/osm_link_mgr_t * NAME * osm_link_mgr_t @@ -94,6 +95,7 @@ BEGIN_C_DECLS * SYNOPSIS */ typedef struct _osm_link_mgr { + struct osm_sm *sm; osm_subn_t *p_subn; osm_req_t *p_req; osm_log_t *p_log; @@ -101,6 +103,9 @@ typedef struct _osm_link_mgr { } osm_link_mgr_t; /* * FIELDS +* sm +* Pointer to the SM object. +* * p_subn * Pointer to the Subnet object for this subnet. 
* @@ -188,26 +193,14 @@ void osm_link_mgr_destroy(IN osm_link_mgr_t * const p_mgr); * SYNOPSIS */ ib_api_status_t -osm_link_mgr_init(IN osm_link_mgr_t * const p_mgr, - IN osm_req_t * const p_req, - IN osm_subn_t * const p_subn, - IN osm_log_t * const p_log, IN cl_plock_t * const p_lock); +osm_link_mgr_init(IN osm_link_mgr_t * const p_mgr, IN struct osm_sm * sm); /* * PARAMETERS * p_mgr * [in] Pointer to an osm_link_mgr_t object to initialize. * -* p_req -* [in] Pointer to the attribute Requester object. -* -* p_subn -* [in] Pointer to the Subnet object for this subnet. -* -* p_log -* [in] Pointer to the log object. -* -* p_lock -* [in] Pointer to the OpenSM serializing lock. +* sm +* [in] Pointer to the SM object. * * RETURN VALUES * IB_SUCCESS if the Link Manager object was initialized diff --git a/opensm/include/opensm/osm_mcast_mgr.h b/opensm/include/opensm/osm_mcast_mgr.h index 47b67ed..08e4b7a 100644 --- a/opensm/include/opensm/osm_mcast_mgr.h +++ b/opensm/include/opensm/osm_mcast_mgr.h @@ -83,6 +83,7 @@ BEGIN_C_DECLS * Steve King, Intel * *********/ +struct osm_sm; /****s* OpenSM: Multicast Manager/osm_mcast_mgr_t * NAME * osm_mcast_mgr_t @@ -96,6 +97,7 @@ BEGIN_C_DECLS * SYNOPSIS */ typedef struct _osm_mcast_mgr { + struct osm_sm *sm; osm_subn_t *p_subn; osm_req_t *p_req; osm_log_t *p_log; @@ -103,6 +105,9 @@ typedef struct _osm_mcast_mgr { } osm_mcast_mgr_t; /* * FIELDS +* sm +* Pointer to the SM object. +* * p_subn * Pointer to the Subnet object for this subnet. * @@ -190,26 +195,14 @@ void osm_mcast_mgr_destroy(IN osm_mcast_mgr_t * const p_mgr); * SYNOPSIS */ ib_api_status_t -osm_mcast_mgr_init(IN osm_mcast_mgr_t * const p_mgr, - IN osm_req_t * const p_req, - IN osm_subn_t * const p_subn, - IN osm_log_t * const p_log, IN cl_plock_t * const p_lock); +osm_mcast_mgr_init(IN osm_mcast_mgr_t * const p_mgr, struct osm_sm * sm); /* * PARAMETERS * p_mgr * [in] Pointer to an osm_mcast_mgr_t object to initialize. 
* -* p_req -* [in] Pointer to the attribute Requester object. -* -* p_subn -* [in] Pointer to the Subnet object for this subnet. -* -* p_log -* [in] Pointer to the log object. -* -* p_lock -* [in] Pointer to the OpenSM serializing lock. +* sm +* [in] Pointer to the SM object. * * RETURN VALUES * IB_SUCCESS if the Multicast Manager object was initialized diff --git a/opensm/include/opensm/osm_sm_state_mgr.h b/opensm/include/opensm/osm_sm_state_mgr.h index b8ee174..db05e2f 100644 --- a/opensm/include/opensm/osm_sm_state_mgr.h +++ b/opensm/include/opensm/osm_sm_state_mgr.h @@ -87,6 +87,7 @@ BEGIN_C_DECLS * Yael Kalka, Mellanox * *********/ +struct osm_sm; /****s* OpenSM: SM State Manager/osm_sm_state_mgr_t * NAME * osm_sm_state_mgr_t @@ -100,6 +101,7 @@ BEGIN_C_DECLS * SYNOPSIS */ typedef struct _osm_sm_state_mgr { + struct osm_sm *sm; cl_spinlock_t state_lock; cl_timer_t polling_timer; uint32_t retry_number; @@ -112,6 +114,9 @@ typedef struct _osm_sm_state_mgr { /* * FIELDS +* sm +* Pointer to the SM object. +* * state_lock * Spinlock guarding the state and processes. * @@ -164,8 +169,8 @@ void osm_sm_state_mgr_construct(IN osm_sm_state_mgr_t * const p_sm_mgr); * NOTES * Allows osm_sm_state_mgr_destroy * -* Calling osm_sm_state_mgr_construct is a prerequisite to calling any other -* method except osm_sm_state_mgr_init. +* Calling osm_sm_state_mgr_construct is a prerequisite to calling any +* other method except osm_sm_state_mgr_init. * * SEE ALSO * SM State Manager object, osm_sm_state_mgr_init, @@ -215,21 +220,14 @@ void osm_sm_state_mgr_destroy(IN osm_sm_state_mgr_t * const p_sm_mgr); */ ib_api_status_t osm_sm_state_mgr_init(IN osm_sm_state_mgr_t * const p_sm_mgr, - IN osm_subn_t * const p_subn, - IN osm_req_t * const p_req, IN osm_log_t * const p_log); + struct osm_sm * sm); /* * PARAMETERS * p_sm_mgr * [in] Pointer to an osm_sm_state_mgr_t object to initialize. * -* p_subn -* [in] Pointer to the Subnet object for this subnet. 
-* -* p_req -* [in] Pointer to an osm_req_t object. -* -* p_log -* [in] Pointer to the log object. +* sm +* [in] Pointer to the SM object. * * RETURN VALUES * IB_SUCCESS if the SM State Manager object was initialized diff --git a/opensm/include/opensm/osm_state_mgr.h b/opensm/include/opensm/osm_state_mgr.h index f51593a..968f233 100644 --- a/opensm/include/opensm/osm_state_mgr.h +++ b/opensm/include/opensm/osm_state_mgr.h @@ -85,6 +85,7 @@ BEGIN_C_DECLS * Steve King, Intel * *********/ +struct osm_sm; /****s* OpenSM: State Manager/osm_state_mgr_t * NAME * osm_state_mgr_t @@ -98,6 +99,7 @@ BEGIN_C_DECLS * SYNOPSIS */ typedef struct _osm_state_mgr { + struct osm_sm *sm; osm_subn_t *p_subn; osm_log_t *p_log; osm_lid_mgr_t *p_lid_mgr; @@ -115,6 +117,9 @@ typedef struct _osm_state_mgr { } osm_state_mgr_t; /* * FIELDS +* sm +* Pointer to the SM object. +* * p_subn * Pointer to the Subnet object for this subnet. * @@ -139,11 +144,11 @@ typedef struct _osm_state_mgr { * p_req * Pointer to the Requester object sending SMPs. * -* p_stats -* Pointer to the OpenSM statistics block. +* p_stats +* Pointer to the OpenSM statistics block. * -* p_sm_state_mgr -* Pointer to the SM state mgr object. +* p_sm_state_mgr +* Pointer to the SM state mgr object. * * p_mad_ctrl * Pointer to the SM's MAD Controller object. @@ -157,15 +162,6 @@ typedef struct _osm_state_mgr { * state * State of the SM. * -* state_step_mode -* Controls the mode of progressing to next stage: -* OSM_STATE_STEP_CONTINUOUS - normal automatic progress mode -* OSM_STATE_STEP_TAKE_ONE - do one step and stop -* OSM_STATE_STEP_BREAK - stop before taking next step -* -* next_stage_signal -* Stores the signal to be provided when running the next stage. 
-* * SEE ALSO * State Manager object *********/ @@ -241,60 +237,14 @@ void osm_state_mgr_destroy(IN osm_state_mgr_t * const p_mgr); * SYNOPSIS */ ib_api_status_t -osm_state_mgr_init(IN osm_state_mgr_t * const p_mgr, - IN osm_subn_t * const p_subn, - IN osm_lid_mgr_t * const p_lid_mgr, - IN osm_ucast_mgr_t * const p_ucast_mgr, - IN osm_mcast_mgr_t * const p_mcast_mgr, - IN osm_link_mgr_t * const p_link_mgr, - IN osm_drop_mgr_t * const p_drop_mgr, - IN osm_req_t * const p_req, - IN osm_stats_t * const p_stats, - IN struct _osm_sm_state_mgr *const p_sm_state_mgr, - IN const osm_sm_mad_ctrl_t * const p_mad_ctrl, - IN cl_plock_t * const p_lock, - IN cl_event_t * const p_subnet_up_event, - IN osm_log_t * const p_log); +osm_state_mgr_init(IN osm_state_mgr_t * const p_mgr, struct osm_sm * sm); /* * PARAMETERS * p_mgr * [in] Pointer to an osm_state_mgr_t object to initialize. * -* p_subn -* [in] Pointer to the Subnet object for this subnet. -* -* p_lid_mgr -* [in] Pointer to the LID Manager object. -* -* p_ucast_mgr -* [in] Pointer to the Unicast Manager object. -* -* p_mcast_mgr -* [in] Pointer to the Multicast Manager object. -* -* p_link_mgr -* [in] Pointer to the Link Manager object. -* -* p_drop_mgr -* [in] Pointer to the Drop Manager object. -* -* p_req -* [in] Pointer to the Request Controller object. -* -* p_stats -* [in] Pointer to the OpenSM statistics block. -* -* p_sm_state_mgr -* [in] Pointer to the SM state mgr object. -* -* p_mad_ctrl -* [in] Pointer to the SM's mad controller. -* -* p_subnet_up_event -* [in] Pointer to the event to set if/when the subnet comes up. -* -* p_log -* [in] Pointer to the log object. +* sm +* [in] Pointer to the SM object. 
* * RETURN VALUES * IB_SUCCESS if the State Manager object was initialized diff --git a/opensm/include/opensm/osm_sweep_fail_ctrl.h b/opensm/include/opensm/osm_sweep_fail_ctrl.h index 2fca6eb..28ae7a6 100644 --- a/opensm/include/opensm/osm_sweep_fail_ctrl.h +++ b/opensm/include/opensm/osm_sweep_fail_ctrl.h @@ -96,22 +96,13 @@ struct osm_sm; */ typedef struct _osm_sweep_fail_ctrl { struct osm_sm *sm; - osm_log_t *p_log; - cl_dispatcher_t *p_disp; cl_disp_reg_handle_t h_disp; - } osm_sweep_fail_ctrl_t; /* * FIELDS * sm * Pointer to the sm object. * -* p_log -* Pointer to the log object. -* -* p_disp -* Pointer to the Dispatcher. -* * h_disp * Handle returned from dispatcher registration. * @@ -193,25 +184,14 @@ void osm_sweep_fail_ctrl_destroy(IN osm_sweep_fail_ctrl_t * const p_ctrl); */ ib_api_status_t osm_sweep_fail_ctrl_init(IN osm_sweep_fail_ctrl_t * const p_ctrl, - IN osm_log_t * const p_log, - IN struct osm_sm * const sm, - IN cl_dispatcher_t * const p_disp); + IN struct osm_sm * sm); /* * PARAMETERS * p_ctrl * [in] Pointer to an osm_sweep_fail_ctrl_t object to initialize. * -* p_rcv -* [in] Pointer to an osm_sweep_fail_t object. -* -* p_log -* [in] Pointer to the log object. -* * sm -* [in] Pointer to the sm object. -* -* p_disp -* [in] Pointer to the OpenSM central Dispatcher. +* [in] Pointer to the SM object. 
* * RETURN VALUES * CL_SUCCESS if the Sweep Fail Controller object was initialized diff --git a/opensm/include/opensm/osm_ucast_mgr.h b/opensm/include/opensm/osm_ucast_mgr.h index 88d8cca..1868eae 100644 --- a/opensm/include/opensm/osm_ucast_mgr.h +++ b/opensm/include/opensm/osm_ucast_mgr.h @@ -83,6 +83,7 @@ BEGIN_C_DECLS * Steve King, Intel * *********/ +struct osm_sm; /****s* OpenSM: Unicast Manager/osm_ucast_mgr_t * NAME * osm_ucast_mgr_t @@ -96,6 +97,7 @@ BEGIN_C_DECLS * SYNOPSIS */ typedef struct _osm_ucast_mgr { + struct osm_sm *sm; osm_subn_t *p_subn; osm_req_t *p_req; osm_log_t *p_log; @@ -107,6 +109,9 @@ typedef struct _osm_ucast_mgr { } osm_ucast_mgr_t; /* * FIELDS +* sm +* Pointer to the SM object. +* * p_subn * Pointer to the Subnet object for this subnet. * @@ -210,26 +215,14 @@ void osm_ucast_mgr_destroy(IN osm_ucast_mgr_t * const p_mgr); * SYNOPSIS */ ib_api_status_t -osm_ucast_mgr_init(IN osm_ucast_mgr_t * const p_mgr, - IN osm_req_t * const p_req, - IN osm_subn_t * const p_subn, - IN osm_log_t * const p_log, IN cl_plock_t * const p_lock); +osm_ucast_mgr_init(IN osm_ucast_mgr_t * const p_mgr, IN struct osm_sm * sm); /* * PARAMETERS * p_mgr * [in] Pointer to an osm_ucast_mgr_t object to initialize. * -* p_req -* [in] Pointer to the attribute Requester object. -* -* p_subn -* [in] Pointer to the Subnet object for this subnet. -* -* p_log -* [in] Pointer to the log object. -* -* p_lock -* [in] Pointer to the OpenSM serializing lock. +* sm +* [in] Pointer to the SM object. 
* * RETURN VALUES * IB_SUCCESS if the Unicast Manager object was initialized diff --git a/opensm/opensm/osm_drop_mgr.c b/opensm/opensm/osm_drop_mgr.c index 7ace399..202b33c 100644 --- a/opensm/opensm/osm_drop_mgr.c +++ b/opensm/opensm/osm_drop_mgr.c @@ -57,6 +57,7 @@ #include #include #include +#include #include #include #include @@ -88,21 +89,19 @@ void osm_drop_mgr_destroy(IN osm_drop_mgr_t * const p_mgr) /********************************************************************** **********************************************************************/ ib_api_status_t -osm_drop_mgr_init(IN osm_drop_mgr_t * const p_mgr, - IN osm_subn_t * const p_subn, - IN osm_log_t * const p_log, - IN osm_req_t * const p_req, IN cl_plock_t * const p_lock) +osm_drop_mgr_init(IN osm_drop_mgr_t * const p_mgr, IN osm_sm_t * sm) { ib_api_status_t status = IB_SUCCESS; - OSM_LOG_ENTER(p_log, osm_drop_mgr_init); + OSM_LOG_ENTER(sm->p_log, osm_drop_mgr_init); osm_drop_mgr_construct(p_mgr); - p_mgr->p_log = p_log; - p_mgr->p_subn = p_subn; - p_mgr->p_lock = p_lock; - p_mgr->p_req = p_req; + p_mgr->sm = sm; + p_mgr->p_log = sm->p_log; + p_mgr->p_subn = sm->p_subn; + p_mgr->p_lock = sm->p_lock; + p_mgr->p_req = &sm->req; OSM_LOG_EXIT(p_mgr->p_log); return (status); diff --git a/opensm/opensm/osm_lid_mgr.c b/opensm/opensm/osm_lid_mgr.c index 30e5713..3194f42 100644 --- a/opensm/opensm/osm_lid_mgr.c +++ b/opensm/opensm/osm_lid_mgr.c @@ -91,6 +91,7 @@ #include #include #include +#include #include #include #include @@ -98,7 +99,6 @@ #include #include #include -#include /********************************************************************** lid range item of qlist @@ -241,28 +241,20 @@ static void __osm_lid_mgr_validate_db(IN osm_lid_mgr_t * p_mgr) /********************************************************************** **********************************************************************/ ib_api_status_t -osm_lid_mgr_init(IN osm_lid_mgr_t * const p_mgr, - IN osm_req_t * const p_req, - IN osm_subn_t * 
const p_subn, - IN osm_db_t * const p_db, - IN osm_log_t * const p_log, IN cl_plock_t * const p_lock) +osm_lid_mgr_init(IN osm_lid_mgr_t * const p_mgr, IN osm_sm_t *sm) { ib_api_status_t status = IB_SUCCESS; - OSM_LOG_ENTER(p_log, osm_lid_mgr_init); - - CL_ASSERT(p_req); - CL_ASSERT(p_subn); - CL_ASSERT(p_lock); - CL_ASSERT(p_db); + OSM_LOG_ENTER(sm->p_log, osm_lid_mgr_init); osm_lid_mgr_construct(p_mgr); - p_mgr->p_log = p_log; - p_mgr->p_subn = p_subn; - p_mgr->p_db = p_db; - p_mgr->p_lock = p_lock; - p_mgr->p_req = p_req; + p_mgr->sm = sm; + p_mgr->p_log = sm->p_log; + p_mgr->p_subn = sm->p_subn; + p_mgr->p_db = sm->p_db; + p_mgr->p_lock = sm->p_lock; + p_mgr->p_req = &sm->req; /* we initialize and restore the db domain of guid to lid map */ p_mgr->p_g2l = osm_db_domain_init(p_mgr->p_db, "/guid2lid"); @@ -280,7 +272,7 @@ osm_lid_mgr_init(IN osm_lid_mgr_t * const p_mgr, /* we use the stored guid to lid table if not forced to reassign */ if (!p_mgr->p_subn->opt.reassign_lids) { if (osm_db_restore(p_mgr->p_g2l)) { - if (p_subn->opt.exit_on_fatal) { + if (p_mgr->p_subn->opt.exit_on_fatal) { osm_log(p_mgr->p_log, OSM_LOG_SYS, "FATAL: Error restoring Guid-to-Lid persistent database\n"); status = IB_ERROR; diff --git a/opensm/opensm/osm_link_mgr.c b/opensm/opensm/osm_link_mgr.c index b96b741..d5e0956 100644 --- a/opensm/opensm/osm_link_mgr.c +++ b/opensm/opensm/osm_link_mgr.c @@ -52,6 +52,7 @@ #include #include #include +#include #include #include #include @@ -76,25 +77,19 @@ void osm_link_mgr_destroy(IN osm_link_mgr_t * const p_mgr) /********************************************************************** **********************************************************************/ ib_api_status_t -osm_link_mgr_init(IN osm_link_mgr_t * const p_mgr, - IN osm_req_t * const p_req, - IN osm_subn_t * const p_subn, - IN osm_log_t * const p_log, IN cl_plock_t * const p_lock) +osm_link_mgr_init(IN osm_link_mgr_t * const p_mgr, IN osm_sm_t * sm) { ib_api_status_t status = IB_SUCCESS; 
- OSM_LOG_ENTER(p_log, osm_link_mgr_init); - - CL_ASSERT(p_req); - CL_ASSERT(p_subn); - CL_ASSERT(p_lock); + OSM_LOG_ENTER(sm->p_log, osm_link_mgr_init); osm_link_mgr_construct(p_mgr); - p_mgr->p_log = p_log; - p_mgr->p_subn = p_subn; - p_mgr->p_lock = p_lock; - p_mgr->p_req = p_req; + p_mgr->sm = sm; + p_mgr->p_log = sm->p_log; + p_mgr->p_subn = sm->p_subn; + p_mgr->p_lock = sm->p_lock; + p_mgr->p_req = &sm->req; OSM_LOG_EXIT(p_mgr->p_log); return (status); diff --git a/opensm/opensm/osm_mcast_mgr.c b/opensm/opensm/osm_mcast_mgr.c index f51a45a..3bdbd31 100644 --- a/opensm/opensm/osm_mcast_mgr.c +++ b/opensm/opensm/osm_mcast_mgr.c @@ -54,6 +54,7 @@ #include #include #include +#include #include #include #include @@ -374,25 +375,19 @@ void osm_mcast_mgr_destroy(IN osm_mcast_mgr_t * const p_mgr) /********************************************************************** **********************************************************************/ ib_api_status_t -osm_mcast_mgr_init(IN osm_mcast_mgr_t * const p_mgr, - IN osm_req_t * const p_req, - IN osm_subn_t * const p_subn, - IN osm_log_t * const p_log, IN cl_plock_t * const p_lock) +osm_mcast_mgr_init(IN osm_mcast_mgr_t * const p_mgr, IN osm_sm_t * sm) { ib_api_status_t status = IB_SUCCESS; - OSM_LOG_ENTER(p_log, osm_mcast_mgr_init); - - CL_ASSERT(p_req); - CL_ASSERT(p_subn); - CL_ASSERT(p_lock); + OSM_LOG_ENTER(sm->p_log, osm_mcast_mgr_init); osm_mcast_mgr_construct(p_mgr); - p_mgr->p_log = p_log; - p_mgr->p_subn = p_subn; - p_mgr->p_lock = p_lock; - p_mgr->p_req = p_req; + p_mgr->sm = sm; + p_mgr->p_log = sm->p_log; + p_mgr->p_subn = sm->p_subn; + p_mgr->p_lock = sm->p_lock; + p_mgr->p_req = &sm->req; OSM_LOG_EXIT(p_mgr->p_log); return (status); diff --git a/opensm/opensm/osm_sm.c b/opensm/opensm/osm_sm.c index b60a615..af8c569 100644 --- a/opensm/opensm/osm_sm.c +++ b/opensm/opensm/osm_sm.c @@ -315,59 +315,35 @@ osm_sm_init(IN osm_sm_t * const p_sm, if (status != IB_SUCCESS) goto Exit; - status = 
osm_lid_mgr_init(&p_sm->lid_mgr, - &p_sm->req, - p_sm->p_subn, - p_sm->p_db, p_sm->p_log, p_sm->p_lock); + status = osm_lid_mgr_init(&p_sm->lid_mgr, p_sm); if (status != IB_SUCCESS) goto Exit; - status = osm_ucast_mgr_init(&p_sm->ucast_mgr, - &p_sm->req, - p_sm->p_subn, p_sm->p_log, p_sm->p_lock); + status = osm_ucast_mgr_init(&p_sm->ucast_mgr, p_sm); if (status != IB_SUCCESS) goto Exit; - status = osm_link_mgr_init(&p_sm->link_mgr, - &p_sm->req, - p_sm->p_subn, p_sm->p_log, p_sm->p_lock); + status = osm_link_mgr_init(&p_sm->link_mgr, p_sm); if (status != IB_SUCCESS) goto Exit; - status = osm_state_mgr_init(&p_sm->state_mgr, - p_sm->p_subn, - &p_sm->lid_mgr, - &p_sm->ucast_mgr, - &p_sm->mcast_mgr, - &p_sm->link_mgr, - &p_sm->drop_mgr, - &p_sm->req, - p_stats, - &p_sm->sm_state_mgr, - &p_sm->mad_ctrl, - p_sm->p_lock, - &p_sm->subnet_up_event, p_sm->p_log); + status = osm_state_mgr_init(&p_sm->state_mgr, p_sm); if (status != IB_SUCCESS) goto Exit; - status = osm_drop_mgr_init(&p_sm->drop_mgr, - p_sm->p_subn, - p_sm->p_log, &p_sm->req, p_sm->p_lock); + status = osm_drop_mgr_init(&p_sm->drop_mgr, p_sm); if (status != IB_SUCCESS) goto Exit; - status = osm_sweep_fail_ctrl_init(&p_sm->sweep_fail_ctrl, - p_log, p_sm, p_disp); + status = osm_sweep_fail_ctrl_init(&p_sm->sweep_fail_ctrl, p_sm); if (status != IB_SUCCESS) goto Exit; - status = osm_sm_state_mgr_init(&p_sm->sm_state_mgr, - p_sm->p_subn, &p_sm->req, p_sm->p_log); + status = osm_sm_state_mgr_init(&p_sm->sm_state_mgr, p_sm); if (status != IB_SUCCESS) goto Exit; - status = osm_mcast_mgr_init(&p_sm->mcast_mgr, - &p_sm->req, p_subn, p_log, p_lock); + status = osm_mcast_mgr_init(&p_sm->mcast_mgr, p_sm); if (status != IB_SUCCESS) goto Exit; diff --git a/opensm/opensm/osm_sm_state_mgr.c b/opensm/opensm/osm_sm_state_mgr.c index c42611c..52aa199 100644 --- a/opensm/opensm/osm_sm_state_mgr.c +++ b/opensm/opensm/osm_sm_state_mgr.c @@ -49,10 +49,11 @@ #endif /* HAVE_CONFIG_H */ #include +#include #include #include #include 
-#include +#include #include #include #include @@ -392,24 +393,20 @@ void osm_sm_state_mgr_destroy(IN osm_sm_state_mgr_t * const p_sm_mgr) /********************************************************************** **********************************************************************/ ib_api_status_t -osm_sm_state_mgr_init(IN osm_sm_state_mgr_t * const p_sm_mgr, - IN osm_subn_t * const p_subn, - IN osm_req_t * const p_req, IN osm_log_t * const p_log) +osm_sm_state_mgr_init(IN osm_sm_state_mgr_t * const p_sm_mgr, IN osm_sm_t * sm) { cl_status_t status; - OSM_LOG_ENTER(p_log, osm_sm_state_mgr_init); - - CL_ASSERT(p_subn); - CL_ASSERT(p_req); + OSM_LOG_ENTER(sm->p_log, osm_sm_state_mgr_init); osm_sm_state_mgr_construct(p_sm_mgr); - p_sm_mgr->p_log = p_log; - p_sm_mgr->p_req = p_req; - p_sm_mgr->p_subn = p_subn; + p_sm_mgr->sm = sm; + p_sm_mgr->p_log = sm->p_log; + p_sm_mgr->p_req = &sm->req; + p_sm_mgr->p_subn = sm->p_subn; - if (p_subn->opt.sm_inactive) { + if (p_sm_mgr->p_subn->opt.sm_inactive) { /* init the state of the SM to not active */ p_sm_mgr->p_subn->sm_state = IB_SMINFO_STATE_NOTACTIVE; __osm_sm_state_mgr_notactive_msg(p_sm_mgr); diff --git a/opensm/opensm/osm_state_mgr.c b/opensm/opensm/osm_state_mgr.c index 4b7235f..5c196e3 100644 --- a/opensm/opensm/osm_state_mgr.c +++ b/opensm/opensm/osm_state_mgr.c @@ -55,6 +55,7 @@ #include #include #include +#include #include #include #include @@ -93,51 +94,27 @@ void osm_state_mgr_destroy(IN osm_state_mgr_t * const p_mgr) /********************************************************************** **********************************************************************/ ib_api_status_t -osm_state_mgr_init(IN osm_state_mgr_t * const p_mgr, - IN osm_subn_t * const p_subn, - IN osm_lid_mgr_t * const p_lid_mgr, - IN osm_ucast_mgr_t * const p_ucast_mgr, - IN osm_mcast_mgr_t * const p_mcast_mgr, - IN osm_link_mgr_t * const p_link_mgr, - IN osm_drop_mgr_t * const p_drop_mgr, - IN osm_req_t * const p_req, - IN osm_stats_t * const 
p_stats, - IN osm_sm_state_mgr_t * const p_sm_state_mgr, - IN const osm_sm_mad_ctrl_t * const p_mad_ctrl, - IN cl_plock_t * const p_lock, - IN cl_event_t * const p_subnet_up_event, - IN osm_log_t * const p_log) +osm_state_mgr_init(IN osm_state_mgr_t * const p_mgr, IN osm_sm_t * sm) { - OSM_LOG_ENTER(p_log, osm_state_mgr_init); - - CL_ASSERT(p_subn); - CL_ASSERT(p_lid_mgr); - CL_ASSERT(p_ucast_mgr); - CL_ASSERT(p_mcast_mgr); - CL_ASSERT(p_link_mgr); - CL_ASSERT(p_drop_mgr); - CL_ASSERT(p_req); - CL_ASSERT(p_stats); - CL_ASSERT(p_sm_state_mgr); - CL_ASSERT(p_mad_ctrl); - CL_ASSERT(p_lock); + OSM_LOG_ENTER(sm->p_log, osm_state_mgr_init); osm_state_mgr_construct(p_mgr); - p_mgr->p_log = p_log; - p_mgr->p_subn = p_subn; - p_mgr->p_lid_mgr = p_lid_mgr; - p_mgr->p_ucast_mgr = p_ucast_mgr; - p_mgr->p_mcast_mgr = p_mcast_mgr; - p_mgr->p_link_mgr = p_link_mgr; - p_mgr->p_drop_mgr = p_drop_mgr; - p_mgr->p_mad_ctrl = p_mad_ctrl; - p_mgr->p_req = p_req; - p_mgr->p_stats = p_stats; - p_mgr->p_sm_state_mgr = p_sm_state_mgr; + p_mgr->sm = sm; + p_mgr->p_log = sm->p_log; + p_mgr->p_subn = sm->p_subn; + p_mgr->p_lid_mgr = &sm->lid_mgr; + p_mgr->p_ucast_mgr = &sm->ucast_mgr; + p_mgr->p_mcast_mgr = &sm->mcast_mgr; + p_mgr->p_link_mgr = &sm->link_mgr; + p_mgr->p_drop_mgr = &sm->drop_mgr; + p_mgr->p_mad_ctrl = &sm->mad_ctrl; + p_mgr->p_req = &sm->req; + p_mgr->p_stats = &sm->p_subn->p_osm->stats; + p_mgr->p_sm_state_mgr = &sm->sm_state_mgr; p_mgr->state = OSM_SM_STATE_IDLE; - p_mgr->p_lock = p_lock; - p_mgr->p_subnet_up_event = p_subnet_up_event; + p_mgr->p_lock = sm->p_lock; + p_mgr->p_subnet_up_event = &sm->subnet_up_event; OSM_LOG_EXIT(p_mgr->p_log); return IB_SUCCESS; diff --git a/opensm/opensm/osm_sweep_fail_ctrl.c b/opensm/opensm/osm_sweep_fail_ctrl.c index b46573d..92b3165 100644 --- a/opensm/opensm/osm_sweep_fail_ctrl.c +++ b/opensm/opensm/osm_sweep_fail_ctrl.c @@ -59,7 +59,7 @@ static void __osm_sweep_fail_ctrl_disp_callback(IN void *context, { osm_sweep_fail_ctrl_t *const 
p_ctrl = (osm_sweep_fail_ctrl_t *) context; - OSM_LOG_ENTER(p_ctrl->p_log, __osm_sweep_fail_ctrl_disp_callback); + OSM_LOG_ENTER(p_ctrl->sm->p_log, __osm_sweep_fail_ctrl_disp_callback); UNUSED_PARAM(p_data); /* @@ -67,7 +67,7 @@ static void __osm_sweep_fail_ctrl_disp_callback(IN void *context, */ osm_sm_signal(p_ctrl->sm, OSM_SIGNAL_LIGHT_SWEEP_FAIL); - OSM_LOG_EXIT(p_ctrl->p_log); + OSM_LOG_EXIT(p_ctrl->sm->p_log); } /********************************************************************** @@ -90,26 +90,22 @@ void osm_sweep_fail_ctrl_destroy(IN osm_sweep_fail_ctrl_t * const p_ctrl) **********************************************************************/ ib_api_status_t osm_sweep_fail_ctrl_init(IN osm_sweep_fail_ctrl_t * const p_ctrl, - IN osm_log_t * const p_log, - IN osm_sm_t * const sm, - IN cl_dispatcher_t * const p_disp) + IN osm_sm_t * const sm) { ib_api_status_t status = IB_SUCCESS; - OSM_LOG_ENTER(p_log, osm_sweep_fail_ctrl_init); + OSM_LOG_ENTER(sm->p_log, osm_sweep_fail_ctrl_init); osm_sweep_fail_ctrl_construct(p_ctrl); - p_ctrl->p_log = p_log; - p_ctrl->p_disp = p_disp; p_ctrl->sm = sm; - p_ctrl->h_disp = cl_disp_register(p_disp, + p_ctrl->h_disp = cl_disp_register(sm->p_disp, OSM_MSG_LIGHT_SWEEP_FAIL, __osm_sweep_fail_ctrl_disp_callback, p_ctrl); if (p_ctrl->h_disp == CL_DISP_INVALID_HANDLE) { - osm_log(p_log, OSM_LOG_ERROR, + osm_log(sm->p_log, OSM_LOG_ERROR, "osm_sweep_fail_ctrl_init: ERR 3501: " "Dispatcher registration failed\n"); status = IB_INSUFFICIENT_RESOURCES; @@ -117,6 +113,6 @@ osm_sweep_fail_ctrl_init(IN osm_sweep_fail_ctrl_t * const p_ctrl, } Exit: - OSM_LOG_EXIT(p_log); + OSM_LOG_EXIT(sm->p_log); return (status); } diff --git a/opensm/opensm/osm_ucast_mgr.c b/opensm/opensm/osm_ucast_mgr.c index 1841219..d7c045e 100644 --- a/opensm/opensm/osm_ucast_mgr.c +++ b/opensm/opensm/osm_ucast_mgr.c @@ -55,6 +55,7 @@ #include #include #include +#include #include #include #include @@ -86,25 +87,19 @@ void osm_ucast_mgr_destroy(IN osm_ucast_mgr_t * const 
p_mgr) /********************************************************************** **********************************************************************/ ib_api_status_t -osm_ucast_mgr_init(IN osm_ucast_mgr_t * const p_mgr, - IN osm_req_t * const p_req, - IN osm_subn_t * const p_subn, - IN osm_log_t * const p_log, IN cl_plock_t * const p_lock) +osm_ucast_mgr_init(IN osm_ucast_mgr_t * const p_mgr, IN osm_sm_t * sm) { ib_api_status_t status = IB_SUCCESS; - OSM_LOG_ENTER(p_log, osm_ucast_mgr_init); - - CL_ASSERT(p_req); - CL_ASSERT(p_subn); - CL_ASSERT(p_lock); + OSM_LOG_ENTER(sm->p_log, osm_ucast_mgr_init); osm_ucast_mgr_construct(p_mgr); - p_mgr->p_log = p_log; - p_mgr->p_subn = p_subn; - p_mgr->p_lock = p_lock; - p_mgr->p_req = p_req; + p_mgr->sm = sm; + p_mgr->p_log = sm->p_log; + p_mgr->p_subn = sm->p_subn; + p_mgr->p_lock = sm->p_lock; + p_mgr->p_req = &sm->req; p_mgr->lft_buf = malloc(IB_LID_UCAST_END_HO + 1); if (!p_mgr->lft_buf) -- 1.5.3.4.206.g58ba4 From sashak at voltaire.com Sun Jan 6 07:52:22 2008 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Sun, 6 Jan 2008 15:52:22 +0000 Subject: [ofa-general] [PATCH] opensm: cleanup dummy SM req and resp objects In-Reply-To: <20080106154919.GB26304@sashak.voltaire.com> References: <20080106154919.GB26304@sashak.voltaire.com> Message-ID: <20080106155222.GC26304@sashak.voltaire.com> Cleanup dummy SM req and resp objects, eliminate data duplications. 
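To illustrate the intent of this cleanup, here is a minimal, self-contained sketch; it is not code from the patch. All demo_* names are hypothetical stand-ins, not OpenSM types or functions. It assumes only what the hunks in this series show: each manager keeps a single back-pointer to the SM, and the request helper takes the SM itself instead of a separately cached requester object.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the SM object; not the real osm_sm_t. */
struct demo_sm {
	unsigned trans_id;	/* transaction counter owned by the SM */
	int sent;		/* number of requests issued so far */
};

/* Hypothetical manager: one back-pointer, no duplicated p_req field. */
struct demo_mgr {
	struct demo_sm *sm;
};

/* Request helper that takes the SM directly (assumed shape, for
 * illustration only). Returns 0 as a stand-in for IB_SUCCESS. */
static int demo_req_set(struct demo_sm *sm, unsigned attr_id)
{
	assert(sm != NULL);
	(void)attr_id;		/* a real helper would build a MAD here */
	sm->trans_id++;
	sm->sent++;
	return 0;
}

/* Init only records the SM; everything else is reached through it. */
static void demo_mgr_init(struct demo_mgr *mgr, struct demo_sm *sm)
{
	assert(mgr != NULL && sm != NULL);
	mgr->sm = sm;
}

/* A manager operation forwards its SM pointer to the request helper. */
static int demo_mgr_send(struct demo_mgr *mgr, unsigned attr_id)
{
	return demo_req_set(mgr->sm, attr_id);
}
```

With this shape a caller initializes the manager once with the SM pointer and then issues requests through it; nothing needs to be kept in sync between the manager and a separate requester object.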
Signed-off-by: Sasha Khapyorsky --- opensm/include/opensm/osm_drop_mgr.h | 5 - opensm/include/opensm/osm_lid_mgr.h | 5 - opensm/include/opensm/osm_link_mgr.h | 5 - opensm/include/opensm/osm_mcast_mgr.h | 5 - opensm/include/opensm/osm_perfmgr.h | 1 - opensm/include/opensm/osm_pkey.h | 2 +- opensm/include/opensm/osm_req.h | 107 ---------------------- opensm/include/opensm/osm_resp.h | 37 -------- opensm/include/opensm/osm_sm.h | 143 +++++++++++++++++++++++++++-- opensm/include/opensm/osm_sm_state_mgr.h | 4 - opensm/include/opensm/osm_state_mgr.h | 4 - opensm/include/opensm/osm_ucast_mgr.h | 5 - opensm/opensm/osm_drop_mgr.c | 1 - opensm/opensm/osm_lid_mgr.c | 3 +- opensm/opensm/osm_link_mgr.c | 3 +- opensm/opensm/osm_mcast_mgr.c | 5 +- opensm/opensm/osm_node_info_rcv.c | 18 +--- opensm/opensm/osm_perfmgr.c | 8 +- opensm/opensm/osm_pkey_mgr.c | 38 ++++----- opensm/opensm/osm_port_info_rcv.c | 14 +-- opensm/opensm/osm_qos.c | 32 ++++---- opensm/opensm/osm_req.c | 86 +++++------------- opensm/opensm/osm_resp.c | 64 +++----------- opensm/opensm/osm_sm.c | 14 --- opensm/opensm/osm_sm_state_mgr.c | 3 +- opensm/opensm/osm_sminfo_rcv.c | 8 +- opensm/opensm/osm_state_mgr.c | 17 ++--- opensm/opensm/osm_sw_info_rcv.c | 4 +- opensm/opensm/osm_trap_rcv.c | 5 +- opensm/opensm/osm_ucast_mgr.c | 8 +-- opensm/opensm/osm_vl_arb_rcv.c | 1 - 31 files changed, 234 insertions(+), 421 deletions(-) diff --git a/opensm/include/opensm/osm_drop_mgr.h b/opensm/include/opensm/osm_drop_mgr.h index c9d881c..758fe60 100644 --- a/opensm/include/opensm/osm_drop_mgr.h +++ b/opensm/include/opensm/osm_drop_mgr.h @@ -52,7 +52,6 @@ #include #include #include -#include #include #ifdef __cplusplus @@ -98,7 +97,6 @@ typedef struct _osm_drop_mgr { struct osm_sm *sm; osm_subn_t *p_subn; osm_log_t *p_log; - osm_req_t *p_req; cl_plock_t *p_lock; } osm_drop_mgr_t; @@ -113,9 +111,6 @@ typedef struct _osm_drop_mgr { * p_log * Pointer to the log object. * -* p_req -* Pointer to the Request object. 
-* * p_lock * Pointer to the serializing lock. * diff --git a/opensm/include/opensm/osm_lid_mgr.h b/opensm/include/opensm/osm_lid_mgr.h index d76cf56..15b230f 100644 --- a/opensm/include/opensm/osm_lid_mgr.h +++ b/opensm/include/opensm/osm_lid_mgr.h @@ -51,7 +51,6 @@ #include #include #include -#include #include #include #include @@ -100,7 +99,6 @@ typedef struct _osm_lid_mgr { struct osm_sm *sm; osm_subn_t *p_subn; osm_db_t *p_db; - osm_req_t *p_req; osm_log_t *p_log; cl_plock_t *p_lock; boolean_t send_set_reqs; @@ -119,9 +117,6 @@ typedef struct _osm_lid_mgr { * p_db * Pointer to the database (persistency) object * -* p_req -* Pointer to the Requester object sending SMPs. -* * p_log * Pointer to the log object. * diff --git a/opensm/include/opensm/osm_link_mgr.h b/opensm/include/opensm/osm_link_mgr.h index c9cd796..3c799a6 100644 --- a/opensm/include/opensm/osm_link_mgr.h +++ b/opensm/include/opensm/osm_link_mgr.h @@ -51,7 +51,6 @@ #include #include #include -#include #include #include @@ -97,7 +96,6 @@ struct osm_sm; typedef struct _osm_link_mgr { struct osm_sm *sm; osm_subn_t *p_subn; - osm_req_t *p_req; osm_log_t *p_log; cl_plock_t *p_lock; } osm_link_mgr_t; @@ -109,9 +107,6 @@ typedef struct _osm_link_mgr { * p_subn * Pointer to the Subnet object for this subnet. * -* p_req -* Pointer to the Requester object sending SMPs. -* * p_log * Pointer to the log object. * diff --git a/opensm/include/opensm/osm_mcast_mgr.h b/opensm/include/opensm/osm_mcast_mgr.h index 08e4b7a..5f89c0b 100644 --- a/opensm/include/opensm/osm_mcast_mgr.h +++ b/opensm/include/opensm/osm_mcast_mgr.h @@ -51,7 +51,6 @@ #include #include #include -#include #include #include #include @@ -99,7 +98,6 @@ struct osm_sm; typedef struct _osm_mcast_mgr { struct osm_sm *sm; osm_subn_t *p_subn; - osm_req_t *p_req; osm_log_t *p_log; cl_plock_t *p_lock; } osm_mcast_mgr_t; @@ -111,9 +109,6 @@ typedef struct _osm_mcast_mgr { * p_subn * Pointer to the Subnet object for this subnet. 
* -* p_req -* Pointer to the Requester object sending SMPs. -* * p_log * Pointer to the log object. * diff --git a/opensm/include/opensm/osm_perfmgr.h b/opensm/include/opensm/osm_perfmgr.h index 4bd05f5..9152f5f 100644 --- a/opensm/include/opensm/osm_perfmgr.h +++ b/opensm/include/opensm/osm_perfmgr.h @@ -46,7 +46,6 @@ #include #include #include -#include #include #include #include diff --git a/opensm/include/opensm/osm_pkey.h b/opensm/include/opensm/osm_pkey.h index 0dce001..c1cdcc6 100644 --- a/opensm/include/opensm/osm_pkey.h +++ b/opensm/include/opensm/osm_pkey.h @@ -38,10 +38,10 @@ #include #include +#include #include #include #include -#include #ifdef __cplusplus # define BEGIN_C_DECLS extern "C" { diff --git a/opensm/include/opensm/osm_req.h b/opensm/include/opensm/osm_req.h index 1d6b26e..6a32f70 100644 --- a/opensm/include/opensm/osm_req.h +++ b/opensm/include/opensm/osm_req.h @@ -233,112 +233,5 @@ osm_req_init(IN osm_req_t * const p_req, * osm_req_destroy *********/ -/****f* OpenSM: Generic Requester/osm_req_get -* NAME -* osm_req_get -* -* DESCRIPTION -* Starts the process to transmit a directed route request for -* the attribute. -* -* SYNOPSIS -*/ -ib_api_status_t -osm_req_get(IN const osm_req_t * const p_req, - IN const osm_dr_path_t * const p_path, - IN const uint16_t attr_id, - IN const uint32_t attr_mod, - IN const cl_disp_msgid_t err_msg, - IN const osm_madw_context_t * const p_context); -/* -* PARAMETERS -* p_req -* [in] Pointer to an osm_req_t object. -* -* p_path -* [in] Pointer to the directed route path to the node -* from which to retrieve the attribute. -* -* attr_id -* [in] Attribute ID to request. -* -* attr_mod -* [in] Attribute modifier for this request. -* -* err_msg -* [in] Message id with which to post this MAD if an error occurs. -* -* p_context -* [in] Mad wrapper context structure to be copied into the wrapper -* context, and thus visible to the recipient of the response. 
-* -* RETURN VALUES -* IB_SUCCESS if the request was successful. -* -* NOTES -* This function asynchronously requests the specified attribute. -* The response from the node will be routed through the Dispatcher -* to the appropriate receive controller object. -* -* SEE ALSO -* Generic Requester -*********/ -/****f* OpenSM: Generic Requester/osm_req_set -* NAME -* osm_req_set -* -* DESCRIPTION -* Starts the process to transmit a directed route Set() request. -* -* SYNOPSIS -*/ -ib_api_status_t -osm_req_set(IN const osm_req_t * const p_req, - IN const osm_dr_path_t * const p_path, - IN const uint8_t * const p_payload, - IN const size_t payload_size, - IN const uint16_t attr_id, - IN const uint32_t attr_mod, - IN const cl_disp_msgid_t err_msg, - IN const osm_madw_context_t * const p_context); -/* -* PARAMETERS -* p_req -* [in] Pointer to an osm_req_t object. -* -* p_path -* [in] Pointer to the directed route path of the recipient. -* -* p_payload -* [in] Pointer to the SMP payload to send. -* -* payload_size -* [in] The size of the payload to be copied to the SMP data field. -* -* attr_id -* [in] Attribute ID to request. -* -* attr_mod -* [in] Attribute modifier for this request. -* -* err_msg -* [in] Message id with which to post this MAD if an error occurs. -* -* p_context -* [in] Mad wrapper context structure to be copied into the wrapper -* context, and thus visible to the recipient of the response. -* -* RETURN VALUES -* IB_SUCCESS if the request was successful. -* -* NOTES -* This function asynchronously requests the specified attribute. -* The response from the node will be routed through the Dispatcher -* to the appropriate receive controller object. 
-* -* SEE ALSO -* Generic Requester -*********/ - END_C_DECLS #endif /* _OSM_REQ_H_ */ diff --git a/opensm/include/opensm/osm_resp.h b/opensm/include/opensm/osm_resp.h index 4299f98..115d227 100644 --- a/opensm/include/opensm/osm_resp.h +++ b/opensm/include/opensm/osm_resp.h @@ -225,42 +225,5 @@ osm_resp_init(IN osm_resp_t * const p_resp, * osm_resp_destroy *********/ -/****f* OpenSM: Generic Responder/osm_resp_send -* NAME -* osm_resp_send -* -* DESCRIPTION -* Starts the process to transmit a directed route response. -* -* SYNOPSIS -*/ -ib_api_status_t -osm_resp_send(IN const osm_resp_t * const p_resp, - IN const osm_madw_t * const p_req_madw, - IN const ib_net16_t status, IN const uint8_t * const p_payload); -/* -* PARAMETERS -* p_resp -* [in] Pointer to an osm_resp_t object. -* -* p_madw -* [in] Pointer to the MAD Wrapper object for the requesting MAD -* to which this response is generated. -* -* status -* [in] Status for this response. -* -* p_payload -* [in] Pointer to the payload of the response MAD. -* -* RETURN VALUES -* IB_SUCCESS if the response was successful. -* -* NOTES -* -* SEE ALSO -* Generic Responder -*********/ - END_C_DECLS #endif /* _OSM_RESP_H_ */ diff --git a/opensm/include/opensm/osm_sm.h b/opensm/include/opensm/osm_sm.h index f68d59e..e0b3d01 100644 --- a/opensm/include/opensm/osm_sm.h +++ b/opensm/include/opensm/osm_sm.h @@ -59,8 +59,6 @@ #include #include #include -#include -#include #include #include #include @@ -133,8 +131,6 @@ typedef struct osm_sm { atomic32_t sm_trans_id; cl_spinlock_t mgrp_lock; cl_qlist_t mgrp_list; - osm_req_t req; - osm_resp_t resp; osm_sm_mad_ctrl_t mad_ctrl; osm_lid_mgr_t lid_mgr; osm_ucast_mgr_t ucast_mgr; @@ -176,12 +172,6 @@ typedef struct osm_sm { * p_vl15 * Pointer to the VL15 interface. * -* req -* Generic MAD attribute requester. -* -* resp -* MAD attribute responder. -* * mad_ctrl * MAD Controller. 
* @@ -413,6 +403,139 @@ osm_sm_bind(IN osm_sm_t * const p_sm, IN const ib_net64_t port_guid); * SEE ALSO *********/ +/****f* OpenSM: SM/osm_req_get +* NAME +* osm_req_get +* +* DESCRIPTION +* Starts the process to transmit a directed route request for +* the attribute. +* +* SYNOPSIS +*/ +ib_api_status_t +osm_req_get(IN osm_sm_t * sm, + IN const osm_dr_path_t * const p_path, + IN const uint16_t attr_id, + IN const uint32_t attr_mod, + IN const cl_disp_msgid_t err_msg, + IN const osm_madw_context_t * const p_context); +/* +* PARAMETERS +* sm +* [in] Pointer to an osm_sm_t object. +* +* p_path +* [in] Pointer to the directed route path to the node +* from which to retrieve the attribute. +* +* attr_id +* [in] Attribute ID to request. +* +* attr_mod +* [in] Attribute modifier for this request. +* +* err_msg +* [in] Message id with which to post this MAD if an error occurs. +* +* p_context +* [in] Mad wrapper context structure to be copied into the wrapper +* context, and thus visible to the recipient of the response. +* +* RETURN VALUES +* IB_SUCCESS if the request was successful. +* +* NOTES +* This function asynchronously requests the specified attribute. +* The response from the node will be routed through the Dispatcher +* to the appropriate receive controller object. +*********/ +/****f* OpenSM: SM/osm_req_set +* NAME +* osm_req_set +* +* DESCRIPTION +* Starts the process to transmit a directed route Set() request. +* +* SYNOPSIS +*/ +ib_api_status_t +osm_req_set(IN osm_sm_t * sm, + IN const osm_dr_path_t * const p_path, + IN const uint8_t * const p_payload, + IN const size_t payload_size, + IN const uint16_t attr_id, + IN const uint32_t attr_mod, + IN const cl_disp_msgid_t err_msg, + IN const osm_madw_context_t * const p_context); +/* +* PARAMETERS +* sm +* [in] Pointer to an osm_sm_t object. +* +* p_path +* [in] Pointer to the directed route path of the recipient. +* +* p_payload +* [in] Pointer to the SMP payload to send. 
+* +* payload_size +* [in] The size of the payload to be copied to the SMP data field. +* +* attr_id +* [in] Attribute ID to request. +* +* attr_mod +* [in] Attribute modifier for this request. +* +* err_msg +* [in] Message id with which to post this MAD if an error occurs. +* +* p_context +* [in] Mad wrapper context structure to be copied into the wrapper +* context, and thus visible to the recipient of the response. +* +* RETURN VALUES +* IB_SUCCESS if the request was successful. +* +* NOTES +* This function asynchronously requests the specified attribute. +* The response from the node will be routed through the Dispatcher +* to the appropriate receive controller object. +*********/ +/****f* OpenSM: SM/osm_resp_send +* NAME +* osm_resp_send +* +* DESCRIPTION +* Starts the process to transmit a directed route response. +* +* SYNOPSIS +*/ +ib_api_status_t +osm_resp_send(IN osm_sm_t * sm, + IN const osm_madw_t * const p_req_madw, + IN const ib_net16_t status, IN const uint8_t * const p_payload); +/* +* PARAMETERS +* sm +* [in] Pointer to an osm_sm_t object. +* +* p_req_madw +* [in] Pointer to the MAD Wrapper object for the requesting MAD +* to which this response is generated. +* +* status +* [in] Status for this response. +* +* p_payload +* [in] Pointer to the payload of the response MAD. +* +* RETURN VALUES +* IB_SUCCESS if the response was successful. 
+* +*********/ + /****f* OpenSM: SM/osm_sm_mcgrp_join * NAME * osm_sm_mcgrp_join diff --git a/opensm/include/opensm/osm_sm_state_mgr.h b/opensm/include/opensm/osm_sm_state_mgr.h index db05e2f..3007554 100644 --- a/opensm/include/opensm/osm_sm_state_mgr.h +++ b/opensm/include/opensm/osm_sm_state_mgr.h @@ -107,7 +107,6 @@ typedef struct _osm_sm_state_mgr { uint32_t retry_number; ib_net64_t master_guid; osm_subn_t *p_subn; - osm_req_t *p_req; osm_log_t *p_log; osm_remote_sm_t *p_polling_sm; } osm_sm_state_mgr_t; @@ -133,9 +132,6 @@ typedef struct _osm_sm_state_mgr { * p_subn * Pointer to the Subnet object for this subnet. * -* p_req -* Pointer to the generic attribute request object. -* * p_log * Pointer to the log object. * diff --git a/opensm/include/opensm/osm_state_mgr.h b/opensm/include/opensm/osm_state_mgr.h index 968f233..f3886ec 100644 --- a/opensm/include/opensm/osm_state_mgr.h +++ b/opensm/include/opensm/osm_state_mgr.h @@ -107,7 +107,6 @@ typedef struct _osm_state_mgr { osm_mcast_mgr_t *p_mcast_mgr; osm_link_mgr_t *p_link_mgr; osm_drop_mgr_t *p_drop_mgr; - osm_req_t *p_req; osm_stats_t *p_stats; struct _osm_sm_state_mgr *p_sm_state_mgr; const osm_sm_mad_ctrl_t *p_mad_ctrl; @@ -141,9 +140,6 @@ typedef struct _osm_state_mgr { * p_drop_mgr * Pointer to the Drop Manager object. * -* p_req -* Pointer to the Requester object sending SMPs. -* * p_stats * Pointer to the OpenSM statistics block. * diff --git a/opensm/include/opensm/osm_ucast_mgr.h b/opensm/include/opensm/osm_ucast_mgr.h index 1868eae..2acab49 100644 --- a/opensm/include/opensm/osm_ucast_mgr.h +++ b/opensm/include/opensm/osm_ucast_mgr.h @@ -51,7 +51,6 @@ #include #include #include -#include #include #include #include @@ -99,7 +98,6 @@ struct osm_sm; typedef struct _osm_ucast_mgr { struct osm_sm *sm; osm_subn_t *p_subn; - osm_req_t *p_req; osm_log_t *p_log; cl_plock_t *p_lock; boolean_t is_dor; @@ -115,9 +113,6 @@ typedef struct _osm_ucast_mgr { * p_subn * Pointer to the Subnet object for this subnet. 
* -* p_req -* Pointer to the Requester object sending SMPs. -* * p_log * Pointer to the log object. * diff --git a/opensm/opensm/osm_drop_mgr.c b/opensm/opensm/osm_drop_mgr.c index 202b33c..39ceaa1 100644 --- a/opensm/opensm/osm_drop_mgr.c +++ b/opensm/opensm/osm_drop_mgr.c @@ -101,7 +101,6 @@ osm_drop_mgr_init(IN osm_drop_mgr_t * const p_mgr, IN osm_sm_t * sm) p_mgr->p_log = sm->p_log; p_mgr->p_subn = sm->p_subn; p_mgr->p_lock = sm->p_lock; - p_mgr->p_req = &sm->req; OSM_LOG_EXIT(p_mgr->p_log); return (status); diff --git a/opensm/opensm/osm_lid_mgr.c b/opensm/opensm/osm_lid_mgr.c index 3194f42..f248676 100644 --- a/opensm/opensm/osm_lid_mgr.c +++ b/opensm/opensm/osm_lid_mgr.c @@ -254,7 +254,6 @@ osm_lid_mgr_init(IN osm_lid_mgr_t * const p_mgr, IN osm_sm_t *sm) p_mgr->p_subn = sm->p_subn; p_mgr->p_db = sm->p_db; p_mgr->p_lock = sm->p_lock; - p_mgr->p_req = &sm->req; /* we initialize and restore the db domain of guid to lid map */ p_mgr->p_g2l = osm_db_domain_init(p_mgr->p_db, "/guid2lid"); @@ -1149,7 +1148,7 @@ __osm_lid_mgr_set_physp_pi(IN osm_lid_mgr_t * const p_mgr, if (send_set) { p_mgr->send_set_reqs = TRUE; - status = osm_req_set(p_mgr->p_req, + status = osm_req_set(p_mgr->sm, osm_physp_get_dr_path_ptr(p_physp), payload, sizeof(payload), diff --git a/opensm/opensm/osm_link_mgr.c b/opensm/opensm/osm_link_mgr.c index d5e0956..3d38362 100644 --- a/opensm/opensm/osm_link_mgr.c +++ b/opensm/opensm/osm_link_mgr.c @@ -89,7 +89,6 @@ osm_link_mgr_init(IN osm_link_mgr_t * const p_mgr, IN osm_sm_t * sm) p_mgr->p_log = sm->p_log; p_mgr->p_subn = sm->p_subn; p_mgr->p_lock = sm->p_lock; - p_mgr->p_req = &sm->req; OSM_LOG_EXIT(p_mgr->p_log); return (status); @@ -371,7 +370,7 @@ __osm_link_mgr_set_physp_pi(IN osm_link_mgr_t * const p_mgr, send_set = TRUE; if (send_set) - status = osm_req_set(p_mgr->p_req, + status = osm_req_set(p_mgr->sm, osm_physp_get_dr_path_ptr(p_physp), payload, sizeof(payload), diff --git a/opensm/opensm/osm_mcast_mgr.c b/opensm/opensm/osm_mcast_mgr.c 
index 3bdbd31..be220c5 100644
--- a/opensm/opensm/osm_mcast_mgr.c
+++ b/opensm/opensm/osm_mcast_mgr.c
@@ -387,7 +387,6 @@ osm_mcast_mgr_init(IN osm_mcast_mgr_t * const p_mgr, IN osm_sm_t * sm)
  p_mgr->p_log = sm->p_log;
  p_mgr->p_subn = sm->p_subn;
  p_mgr->p_lock = sm->p_lock;
- p_mgr->p_req = &sm->req;
  OSM_LOG_EXIT(p_mgr->p_log);
  return (status);
@@ -447,9 +446,7 @@ __osm_mcast_mgr_set_tbl(IN osm_mcast_mgr_t * const p_mgr,
  block_id_ho = block_num + (position << 28);
- status = osm_req_set(p_mgr->p_req,
- p_path,
- (void *)block,
+ status = osm_req_set(p_mgr->sm, p_path, (void *)block,
  sizeof(block),
  IB_MAD_ATTR_MCAST_FWD_TBL,
  cl_hton32(block_id_ho),
diff --git a/opensm/opensm/osm_node_info_rcv.c b/opensm/opensm/osm_node_info_rcv.c
index b84788a..50287dc 100644
--- a/opensm/opensm/osm_node_info_rcv.c
+++ b/opensm/opensm/osm_node_info_rcv.c
@@ -55,7 +55,6 @@
 #include
 #include
 #include
-#include
 #include
 #include
 #include
@@ -116,9 +115,7 @@ static void requery_dup_node_info(IN osm_sm_t * sm,
  context.ni_context.dup_port_num = p_physp->port_num;
  context.ni_context.dup_count = count;
- status = osm_req_get(&sm->req,
- &path,
- IB_MAD_ATTR_NODE_INFO,
+ status = osm_req_get(sm, &path, IB_MAD_ATTR_NODE_INFO,
  0, CL_DISP_MSGID_NONE, &context);
  if (status != IB_SUCCESS)
@@ -313,8 +310,7 @@ __osm_ni_rcv_process_new_node(IN osm_sm_t * sm,
  context.pi_context.light_sweep = FALSE;
  context.pi_context.active_transition = FALSE;
- status = osm_req_get(&sm->req,
- osm_physp_get_dr_path_ptr(p_physp),
+ status = osm_req_get(sm, osm_physp_get_dr_path_ptr(p_physp),
  IB_MAD_ATTR_PORT_INFO, cl_hton32(port_num),
  CL_DISP_MSGID_NONE, &context);
  if (status != IB_SUCCESS)
@@ -367,8 +363,7 @@ __osm_ni_rcv_get_node_desc(IN osm_sm_t * sm,
  context.nd_context.node_guid = osm_node_get_node_guid(p_node);
- status = osm_req_get(&sm->req,
- osm_physp_get_dr_path_ptr(p_physp),
+ status = osm_req_get(sm, osm_physp_get_dr_path_ptr(p_physp),
  IB_MAD_ATTR_NODE_DESC,
  0, CL_DISP_MSGID_NONE, &context);
  if (status != IB_SUCCESS)
@@ -509,8 +504,7 @@ __osm_ni_rcv_process_existing_ca_or_router(IN osm_sm_t * sm,
  context.pi_context.update_master_sm_base_lid = FALSE;
  context.pi_context.light_sweep = FALSE;
- status = osm_req_get(&sm->req,
- osm_physp_get_dr_path_ptr(p_physp),
+ status = osm_req_get(sm, osm_physp_get_dr_path_ptr(p_physp),
  IB_MAD_ATTR_PORT_INFO, cl_hton32(port_num),
  CL_DISP_MSGID_NONE, &context);
@@ -552,9 +546,7 @@ __osm_ni_rcv_process_switch(IN osm_sm_t * sm,
  context.si_context.light_sweep = FALSE;
  /* Request a SwitchInfo attribute */
- status = osm_req_get(&sm->req,
- &dr_path,
- IB_MAD_ATTR_SWITCH_INFO,
+ status = osm_req_get(sm, &dr_path, IB_MAD_ATTR_SWITCH_INFO,
  0, CL_DISP_MSGID_NONE, &context);
  if (status != IB_SUCCESS)
  /* continue despite error */
diff --git a/opensm/opensm/osm_perfmgr.c b/opensm/opensm/osm_perfmgr.c
index a7c0abc..76ef080 100644
--- a/opensm/opensm/osm_perfmgr.c
+++ b/opensm/opensm/osm_perfmgr.c
@@ -612,8 +612,7 @@ static int sweep_hop_1(osm_sm_t * sm)
  path_array[1] = port_num;
  osm_dr_path_init(&hop_1_path, h_bind, 1, path_array);
- status = osm_req_get(&sm->req,
- &hop_1_path,
+ status = osm_req_get(sm, &hop_1_path,
  IB_MAD_ATTR_NODE_INFO, 0,
  CL_DISP_MSGID_NONE, &context);
@@ -647,7 +646,7 @@ static int sweep_hop_1(osm_sm_t * sm)
  path_array[1] = port_num;
  osm_dr_path_init(&hop_1_path, h_bind, 1, path_array);
- status = osm_req_get(&sm->req, &hop_1_path,
+ status = osm_req_get(sm, &hop_1_path,
  IB_MAD_ATTR_NODE_INFO, 0,
  CL_DISP_MSGID_NONE, &context);
@@ -708,8 +707,7 @@ static int sweep_hop_0(osm_sm_t * const sm)
  }
  osm_dr_path_init(&dr_path, h_bind, 0, path_array);
- status = osm_req_get(&sm->req,
- &dr_path, IB_MAD_ATTR_NODE_INFO, 0,
+ status = osm_req_get(sm, &dr_path, IB_MAD_ATTR_NODE_INFO, 0,
  CL_DISP_MSGID_NONE, NULL);
  if (status != IB_SUCCESS)
diff --git a/opensm/opensm/osm_pkey_mgr.c b/opensm/opensm/osm_pkey_mgr.c
index 58eed04..e098d9b 100644
--- a/opensm/opensm/osm_pkey_mgr.c
+++ b/opensm/opensm/osm_pkey_mgr.c
@@ -87,7 +87,7 @@ pkey_mgr_get_physp_max_blocks(IN const osm_subn_t * p_subn,
  */
 static void
 pkey_mgr_process_physical_port(IN osm_log_t * p_log,
- IN const osm_req_t * p_req,
+ IN osm_sm_t * sm,
  IN const ib_net16_t pkey,
  IN osm_physp_t * p_physp)
 {
@@ -149,8 +149,7 @@ pkey_mgr_process_physical_port(IN osm_log_t * p_log,
 /**********************************************************************
  **********************************************************************/
 static void
-pkey_mgr_process_partition_table(osm_log_t * p_log,
- const osm_req_t * p_req,
+pkey_mgr_process_partition_table(osm_log_t * p_log, osm_sm_t * sm,
  const osm_prtn_t * p_prtn,
  const boolean_t full)
 {
@@ -169,7 +168,7 @@ pkey_mgr_process_partition_table(osm_log_t * p_log,
  i_next = cl_map_next(i);
  p_physp = cl_map_obj(i);
  if (p_physp && osm_physp_is_valid(p_physp))
- pkey_mgr_process_physical_port(p_log, p_req, pkey,
+ pkey_mgr_process_physical_port(p_log, sm, pkey,
  p_physp);
  }
 }
@@ -177,7 +176,7 @@ pkey_mgr_process_partition_table(osm_log_t * p_log,
 /**********************************************************************
  **********************************************************************/
 static ib_api_status_t
-pkey_mgr_update_pkey_entry(IN const osm_req_t * p_req,
+pkey_mgr_update_pkey_entry(IN osm_sm_t * sm,
  IN const osm_physp_t * p_physp,
  IN const ib_pkey_table_t * block,
  IN const uint16_t block_index)
@@ -192,7 +191,7 @@ pkey_mgr_update_pkey_entry(IN const osm_req_t * p_req,
  attr_mod = block_index;
  if (osm_node_get_type(p_node) == IB_NODE_TYPE_SWITCH)
  attr_mod |= osm_physp_get_port_num(p_physp) << 16;
- return osm_req_set(p_req, osm_physp_get_dr_path_ptr(p_physp),
+ return osm_req_set(sm, osm_physp_get_dr_path_ptr(p_physp),
  (uint8_t *) block, sizeof(*block),
  IB_MAD_ATTR_P_KEY_TABLE,
  cl_hton32(attr_mod), CL_DISP_MSGID_NONE, &context);
@@ -201,8 +200,7 @@ pkey_mgr_update_pkey_entry(IN const osm_req_t * p_req,
 /**********************************************************************
  **********************************************************************/
 static boolean_t
-pkey_mgr_enforce_partition(IN osm_log_t * p_log,
- IN const osm_req_t * p_req,
+pkey_mgr_enforce_partition(IN osm_log_t * p_log, osm_sm_t * sm,
  IN osm_physp_t * p_physp, IN const boolean_t enforce)
 {
  osm_madw_context_t context;
@@ -242,7 +240,7 @@ pkey_mgr_enforce_partition(IN osm_log_t * p_log,
  context.pi_context.light_sweep = FALSE;
  context.pi_context.active_transition = FALSE;
- status = osm_req_set(p_req, osm_physp_get_dr_path_ptr(p_physp),
+ status = osm_req_set(sm, osm_physp_get_dr_path_ptr(p_physp),
  payload, sizeof(payload),
  IB_MAD_ATTR_PORT_INFO,
  cl_hton32(osm_physp_get_port_num(p_physp)),
@@ -270,8 +268,7 @@ pkey_mgr_enforce_partition(IN osm_log_t * p_log,
 /**********************************************************************
  **********************************************************************/
-static boolean_t pkey_mgr_update_port(osm_log_t * p_log,
- osm_req_t * p_req,
+static boolean_t pkey_mgr_update_port(osm_log_t * p_log, osm_sm_t * sm,
  const osm_port_t * const p_port)
 {
  osm_physp_t *p_physp;
@@ -300,7 +297,7 @@ static boolean_t pkey_mgr_update_port(osm_log_t * p_log,
  p_pkey_tbl = osm_physp_get_mod_pkey_tbl(p_physp);
  num_of_blocks = osm_pkey_tbl_get_num_blocks(p_pkey_tbl);
  max_num_of_blocks =
- pkey_mgr_get_physp_max_blocks(p_req->p_subn, p_physp);
+ pkey_mgr_get_physp_max_blocks(sm->p_subn, p_physp);
  if (p_pkey_tbl->max_blocks > max_num_of_blocks) {
  osm_log(p_log, OSM_LOG_INFO,
  "pkey_mgr_update_port: "
@@ -379,7 +376,7 @@ static boolean_t pkey_mgr_update_port(osm_log_t * p_log,
  continue;
  status =
- pkey_mgr_update_pkey_entry(p_req, p_physp, new_block,
+ pkey_mgr_update_pkey_entry(sm, p_physp, new_block,
  block_index);
  if (status == IB_SUCCESS) {
  osm_log(p_log, OSM_LOG_DEBUG,
@@ -407,8 +404,7 @@ static boolean_t pkey_mgr_update_port(osm_log_t * p_log,
 /**********************************************************************
  **********************************************************************/
 static boolean_t
-pkey_mgr_update_peer_port(osm_log_t * p_log,
- const osm_req_t * p_req,
+pkey_mgr_update_peer_port(osm_log_t * p_log, osm_sm_t * sm,
  const osm_subn_t * p_subn,
  const osm_port_t * const p_port, boolean_t enforce)
 {
@@ -452,7 +448,7 @@ pkey_mgr_update_peer_port(osm_log_t * p_log,
  enforce = FALSE;
  }
- if (pkey_mgr_enforce_partition(p_log, p_req, peer, enforce))
+ if (pkey_mgr_enforce_partition(p_log, sm, peer, enforce))
  port_info_set = TRUE;
  if (enforce == FALSE)
@@ -470,7 +466,7 @@ pkey_mgr_update_peer_port(osm_log_t * p_log,
  if (!peer_block
  || memcmp(peer_block, block, sizeof(*peer_block))) {
  status =
- pkey_mgr_update_pkey_entry(p_req, peer, block,
+ pkey_mgr_update_pkey_entry(sm, peer, block,
  block_index);
  if (status == IB_SUCCESS)
  ret_val = TRUE;
@@ -529,9 +525,9 @@ osm_signal_t osm_pkey_mgr_process(IN osm_opensm_t * p_osm)
  while (p_next != cl_qmap_end(p_tbl)) {
  p_prtn = (osm_prtn_t *) p_next;
  p_next = cl_qmap_next(p_next);
- pkey_mgr_process_partition_table(&p_osm->log, &p_osm->sm.req,
+ pkey_mgr_process_partition_table(&p_osm->log, &p_osm->sm,
  p_prtn, FALSE);
- pkey_mgr_process_partition_table(&p_osm->log, &p_osm->sm.req,
+ pkey_mgr_process_partition_table(&p_osm->log, &p_osm->sm,
  p_prtn, TRUE);
  }
@@ -541,10 +537,10 @@ osm_signal_t osm_pkey_mgr_process(IN osm_opensm_t * p_osm)
  while (p_next != cl_qmap_end(p_tbl)) {
  p_port = (osm_port_t *) p_next;
  p_next = cl_qmap_next(p_next);
- if (pkey_mgr_update_port(&p_osm->log, &p_osm->sm.req, p_port))
+ if (pkey_mgr_update_port(&p_osm->log, &p_osm->sm, p_port))
  signal = OSM_SIGNAL_DONE_PENDING;
  if ((osm_node_get_type(p_port->p_node) != IB_NODE_TYPE_SWITCH)
- && pkey_mgr_update_peer_port(&p_osm->log, &p_osm->sm.req,
+ && pkey_mgr_update_peer_port(&p_osm->log, &p_osm->sm,
  &p_osm->subn, p_port,
  !p_osm->subn.opt.
  no_partition_enforcement))
diff --git a/opensm/opensm/osm_port_info_rcv.c b/opensm/opensm/osm_port_info_rcv.c
index 3775665..8cc33c5 100644
--- a/opensm/opensm/osm_port_info_rcv.c
+++ b/opensm/opensm/osm_port_info_rcv.c
@@ -55,7 +55,6 @@
 #include
 #include
 #include
-#include
 #include
 #include
 #include
@@ -191,7 +190,7 @@ __osm_pi_rcv_process_endport(IN osm_sm_t * sm,
  memset(&context, 0, sizeof(context));
  context.smi_context.set_method = FALSE;
  context.smi_context.port_guid = port_guid;
- status = osm_req_get(&sm->req,
+ status = osm_req_get(sm,
  osm_physp_get_dr_path_ptr
  (p_physp),
  IB_MAD_ATTR_SM_INFO, 0,
@@ -295,7 +294,7 @@ __osm_pi_rcv_process_switch_port(IN osm_sm_t * sm,
  context.ni_context.port_num =
  osm_physp_get_port_num(p_physp);
- status = osm_req_get(&sm->req,
+ status = osm_req_get(sm,
  &path,
  IB_MAD_ATTR_NODE_INFO,
  0,
@@ -373,8 +372,7 @@ __osm_pi_rcv_process_ca_or_router_port(IN osm_sm_t * sm,
 /**********************************************************************
  **********************************************************************/
 static void get_pkey_table(IN osm_log_t * p_log,
- IN osm_req_t * p_req,
- IN osm_subn_t * const p_subn,
+ IN osm_sm_t * sm,
  IN osm_node_t * const p_node,
  IN osm_physp_t * const p_physp)
 {
@@ -426,9 +424,7 @@ static void get_pkey_table(IN osm_log_t * p_log,
  attr_mod_ho = block_num;
  else
  attr_mod_ho = block_num | (port_num << 16);
- status = osm_req_get(p_req,
- &path,
- IB_MAD_ATTR_P_KEY_TABLE,
+ status = osm_req_get(sm, &path, IB_MAD_ATTR_P_KEY_TABLE,
  cl_hton32(attr_mod_ho),
  CL_DISP_MSGID_NONE, &context);
@@ -454,7 +450,7 @@ __osm_pi_rcv_get_pkey_slvl_vla_tables(IN osm_sm_t * sm,
 {
  OSM_LOG_ENTER(sm->p_log, __osm_pi_rcv_get_pkey_slvl_vla_tables);
- get_pkey_table(sm->p_log, &sm->req, sm->p_subn, p_node, p_physp);
+ get_pkey_table(sm->p_log, sm, p_node, p_physp);
  OSM_LOG_EXIT(sm->p_log);
 }
diff --git a/opensm/opensm/osm_qos.c b/opensm/opensm/osm_qos.c
index 1c1e1f1..c437028 100644
--- a/opensm/opensm/osm_qos.c
+++ b/opensm/opensm/osm_qos.c
@@ -68,7 +68,7 @@ static void qos_build_config(struct qos_config *cfg,
 /*
  * QoS primitives
  */
-static ib_api_status_t vlarb_update_table_block(osm_req_t * p_req,
+static ib_api_status_t vlarb_update_table_block(osm_sm_t * sm,
  osm_physp_t * p,
  uint8_t port_num,
  unsigned force_update,
@@ -100,13 +100,13 @@ static ib_api_status_t vlarb_update_table_block(osm_req_t * p_req,
  context.vla_context.set_method = TRUE;
  attr_mod = ((block_num + 1) << 16) | port_num;
- return osm_req_set(p_req, osm_physp_get_dr_path_ptr(p),
+ return osm_req_set(sm, osm_physp_get_dr_path_ptr(p),
  (uint8_t *) & block, sizeof(block),
  IB_MAD_ATTR_VL_ARBITRATION,
  cl_hton32(attr_mod), CL_DISP_MSGID_NONE, &context);
 }
-static ib_api_status_t vlarb_update(osm_req_t * p_req,
+static ib_api_status_t vlarb_update(osm_sm_t * sm,
  osm_physp_t * p, uint8_t port_num,
  unsigned force_update,
  const struct qos_config *qcfg)
@@ -118,7 +118,7 @@ static ib_api_status_t vlarb_update(osm_req_t * p_req,
  if (p_pi->vl_arb_low_cap > 0) {
  len = p_pi->vl_arb_low_cap < IB_NUM_VL_ARB_ELEMENTS_IN_BLOCK ?
  p_pi->vl_arb_low_cap : IB_NUM_VL_ARB_ELEMENTS_IN_BLOCK;
- if ((status = vlarb_update_table_block(p_req, p, port_num,
+ if ((status = vlarb_update_table_block(sm, p, port_num,
  force_update,
  &qcfg->vlarb_low[0],
  len, 0)) != IB_SUCCESS)
@@ -126,7 +126,7 @@ static ib_api_status_t vlarb_update(osm_req_t * p_req,
  }
  if (p_pi->vl_arb_low_cap > IB_NUM_VL_ARB_ELEMENTS_IN_BLOCK) {
  len = p_pi->vl_arb_low_cap % IB_NUM_VL_ARB_ELEMENTS_IN_BLOCK;
- if ((status = vlarb_update_table_block(p_req, p, port_num,
+ if ((status = vlarb_update_table_block(sm, p, port_num,
  force_update,
  &qcfg->vlarb_low[1],
  len, 1)) != IB_SUCCESS)
@@ -135,7 +135,7 @@ static ib_api_status_t vlarb_update(osm_req_t * p_req,
  if (p_pi->vl_arb_high_cap > 0) {
  len = p_pi->vl_arb_high_cap < IB_NUM_VL_ARB_ELEMENTS_IN_BLOCK ?
  p_pi->vl_arb_high_cap : IB_NUM_VL_ARB_ELEMENTS_IN_BLOCK;
- if ((status = vlarb_update_table_block(p_req, p, port_num,
+ if ((status = vlarb_update_table_block(sm, p, port_num,
  force_update,
  &qcfg->vlarb_high[0],
  len, 2)) != IB_SUCCESS)
@@ -143,7 +143,7 @@ static ib_api_status_t vlarb_update(osm_req_t * p_req,
  }
  if (p_pi->vl_arb_high_cap > IB_NUM_VL_ARB_ELEMENTS_IN_BLOCK) {
  len = p_pi->vl_arb_high_cap % IB_NUM_VL_ARB_ELEMENTS_IN_BLOCK;
- if ((status = vlarb_update_table_block(p_req, p, port_num,
+ if ((status = vlarb_update_table_block(sm, p, port_num,
  force_update,
  &qcfg->vlarb_high[1],
  len, 3)) != IB_SUCCESS)
@@ -153,7 +153,7 @@ static ib_api_status_t vlarb_update(osm_req_t * p_req,
  return status;
 }
-static ib_api_status_t sl2vl_update_table(osm_req_t * p_req,
+static ib_api_status_t sl2vl_update_table(osm_sm_t * sm,
  osm_physp_t * p,
  uint8_t in_port, uint8_t out_port,
  unsigned force_update,
@@ -187,13 +187,13 @@ static ib_api_status_t sl2vl_update_table(osm_req_t * p_req,
  context.slvl_context.port_guid = osm_physp_get_port_guid(p);
  context.slvl_context.set_method = TRUE;
  attr_mod = in_port << 8 | out_port;
- return osm_req_set(p_req, osm_physp_get_dr_path_ptr(p),
+ return osm_req_set(sm, osm_physp_get_dr_path_ptr(p),
  (uint8_t *) & tbl, sizeof(tbl),
  IB_MAD_ATTR_SLVL_TABLE, cl_hton32(attr_mod),
  CL_DISP_MSGID_NONE, &context);
 }
-static ib_api_status_t sl2vl_update(osm_req_t * p_req, osm_port_t * p_port,
+static ib_api_status_t sl2vl_update(osm_sm_t * sm, osm_port_t * p_port,
  osm_physp_t * p, uint8_t port_num,
  unsigned force_update,
  const struct qos_config *qcfg)
@@ -220,7 +220,7 @@ static ib_api_status_t sl2vl_update(osm_req_t * p_req, osm_port_t * p_port,
  for (i = 0; i < num_ports; i++) {
  status =
- sl2vl_update_table(p_req, p, i, port_num,
+ sl2vl_update_table(sm, p, i, port_num,
  force_update, &qcfg->sl2vl);
  if (status != IB_SUCCESS)
  return status;
@@ -229,7 +229,7 @@ static ib_api_status_t sl2vl_update(osm_req_t * p_req, osm_port_t * p_port,
  return IB_SUCCESS;
 }
-static ib_api_status_t qos_physp_setup(osm_log_t * p_log, osm_req_t * p_req,
+static ib_api_status_t qos_physp_setup(osm_log_t * p_log, osm_sm_t * sm,
  osm_port_t * p_port,
  osm_physp_t * p, uint8_t port_num,
  unsigned force_update,
@@ -243,7 +243,7 @@ static ib_api_status_t qos_physp_setup(osm_log_t * p_log, osm_req_t * p_req,
  p->vl_high_limit = qcfg->vl_high_limit;
  /* setup VLArbitration */
- status = vlarb_update(p_req, p, port_num, force_update, qcfg);
+ status = vlarb_update(sm, p, port_num, force_update, qcfg);
  if (status != IB_SUCCESS) {
  osm_log(p_log, OSM_LOG_ERROR,
  "qos_physp_setup: ERR 6202 : "
@@ -254,7 +254,7 @@ static ib_api_status_t qos_physp_setup(osm_log_t * p_log, osm_req_t * p_req,
  }
  /* setup SL2VL tables */
- status = sl2vl_update(p_req, p_port, p, port_num, force_update, qcfg);
+ status = sl2vl_update(sm, p_port, p, port_num, force_update, qcfg);
  if (status != IB_SUCCESS) {
  osm_log(p_log, OSM_LOG_ERROR,
  "qos_physp_setup: ERR 6203 : "
@@ -316,7 +316,7 @@ osm_signal_t osm_qos_setup(osm_opensm_t * p_osm)
  force_update = p_physp->need_update ||
  p_osm->subn.need_update;
  status =
- qos_physp_setup(&p_osm->log, &p_osm->sm.req,
+ qos_physp_setup(&p_osm->log, &p_osm->sm,
  p_port, p_physp, i, force_update,
  &swe_config);
  }
@@ -336,7 +336,7 @@ osm_signal_t osm_qos_setup(osm_opensm_t * p_osm)
  continue;
  force_update = p_physp->need_update || p_osm->subn.need_update;
- status = qos_physp_setup(&p_osm->log, &p_osm->sm.req,
+ status = qos_physp_setup(&p_osm->log, &p_osm->sm,
  p_port, p_physp, 0, force_update, cfg);
  }
diff --git a/opensm/opensm/osm_req.c b/opensm/opensm/osm_req.c
index ed1d19c..0524ce2 100644
--- a/opensm/opensm/osm_req.c
+++ b/opensm/opensm/osm_req.c
@@ -52,7 +52,6 @@
 #include
 #include
 #include
-#include
 #include
 #include
 #include
@@ -63,51 +62,10 @@
 #include
 /**********************************************************************
- **********************************************************************/
-void osm_req_construct(IN osm_req_t * const p_req)
-{
- CL_ASSERT(p_req);
-
- memset(p_req, 0, sizeof(*p_req));
-}
-
-/**********************************************************************
- **********************************************************************/
-void osm_req_destroy(IN osm_req_t * const p_req)
-{
- CL_ASSERT(p_req);
-}
-
-/**********************************************************************
- **********************************************************************/
-ib_api_status_t
-osm_req_init(IN osm_req_t * const p_req,
- IN osm_mad_pool_t * const p_pool,
- IN osm_vl15_t * const p_vl15,
- IN osm_subn_t * const p_subn,
- IN osm_log_t * const p_log, IN atomic32_t * const p_sm_trans_id)
-{
- ib_api_status_t status = IB_SUCCESS;
-
- OSM_LOG_ENTER(p_log, osm_req_init);
-
- osm_req_construct(p_req);
- p_req->p_log = p_log;
-
- p_req->p_pool = p_pool;
- p_req->p_vl15 = p_vl15;
- p_req->p_subn = p_subn;
- p_req->p_sm_trans_id = p_sm_trans_id;
-
- OSM_LOG_EXIT(p_log);
- return (status);
-}
-
-/**********************************************************************
  The plock MAY or MAY NOT be held before calling this function.
  **********************************************************************/
 ib_api_status_t
-osm_req_get(IN const osm_req_t * const p_req,
+osm_req_get(IN osm_sm_t * sm,
  IN const osm_dr_path_t * const p_path,
  IN const uint16_t attr_id,
  IN const uint32_t attr_mod,
@@ -118,9 +76,9 @@ osm_req_get(IN const osm_req_t * const p_req,
  ib_api_status_t status = IB_SUCCESS;
  ib_net64_t tid;
- CL_ASSERT(p_req);
+ CL_ASSERT(sm);
- OSM_LOG_ENTER(p_req->p_log, osm_req_get);
+ OSM_LOG_ENTER(sm->p_log, osm_req_get);
  CL_ASSERT(p_path);
  CL_ASSERT(attr_id);
@@ -131,20 +89,20 @@ osm_req_get(IN const osm_req_t * const p_req,
  /*
  p_context may be NULL.
  */
- p_madw = osm_mad_pool_get(p_req->p_pool,
+ p_madw = osm_mad_pool_get(sm->p_mad_pool,
  p_path->h_bind, MAD_BLOCK_SIZE, NULL);
  if (p_madw == NULL) {
- osm_log(p_req->p_log, OSM_LOG_ERROR,
+ osm_log(sm->p_log, OSM_LOG_ERROR,
  "osm_req_get: ERR 1101: " "Unable to acquire MAD\n");
  status = IB_INSUFFICIENT_RESOURCES;
  goto Exit;
  }
- tid = cl_hton64((uint64_t) cl_atomic_inc(p_req->p_sm_trans_id));
+ tid = cl_hton64((uint64_t) cl_atomic_inc(&sm->sm_trans_id));
- if (osm_log_is_active(p_req->p_log, OSM_LOG_DEBUG)) {
- osm_log(p_req->p_log, OSM_LOG_DEBUG,
+ if (osm_log_is_active(sm->p_log, OSM_LOG_DEBUG)) {
+ osm_log(sm->p_log, OSM_LOG_DEBUG,
  "osm_req_get: "
  "Getting %s (0x%X), modifier 0x%X, TID 0x%" PRIx64 "\n",
  ib_get_sm_attr_str(attr_id),
@@ -158,7 +116,7 @@ osm_req_get(IN const osm_req_t * const p_req,
  attr_id,
  attr_mod,
  p_path->hop_count,
- p_req->p_subn->opt.m_key,
+ sm->p_subn->opt.m_key,
  p_path->path, IB_LID_PERMISSIVE, IB_LID_PERMISSIVE);
  p_madw->mad_addr.dest_lid = IB_LID_PERMISSIVE;
@@ -175,10 +133,10 @@ osm_req_get(IN const osm_req_t * const p_req,
  if (p_context)
  p_madw->context = *p_context;
- osm_vl15_post(p_req->p_vl15, p_madw);
+ osm_vl15_post(sm->p_vl15, p_madw);
  Exit:
- OSM_LOG_EXIT(p_req->p_log);
+ OSM_LOG_EXIT(sm->p_log);
  return (status);
 }
@@ -186,7 +144,7 @@ osm_req_get(IN const osm_req_t * const p_req,
  The plock MAY or MAY NOT be held before calling this function.
  **********************************************************************/
 ib_api_status_t
-osm_req_set(IN const osm_req_t * const p_req,
+osm_req_set(IN osm_sm_t * sm,
  IN const osm_dr_path_t * const p_path,
  IN const uint8_t * const p_payload,
  IN const size_t payload_size,
@@ -199,9 +157,9 @@ osm_req_set(IN const osm_req_t * const p_req,
  ib_api_status_t status = IB_SUCCESS;
  ib_net64_t tid;
- CL_ASSERT(p_req);
+ CL_ASSERT(sm);
- OSM_LOG_ENTER(p_req->p_log, osm_req_set);
+ OSM_LOG_ENTER(sm->p_log, osm_req_set);
  CL_ASSERT(p_path);
  CL_ASSERT(attr_id);
@@ -213,20 +171,20 @@ osm_req_set(IN const osm_req_t * const p_req,
  /*
  p_context may be NULL.
  */
- p_madw = osm_mad_pool_get(p_req->p_pool,
+ p_madw = osm_mad_pool_get(sm->p_mad_pool,
  p_path->h_bind, MAD_BLOCK_SIZE, NULL);
  if (p_madw == NULL) {
- osm_log(p_req->p_log, OSM_LOG_ERROR,
+ osm_log(sm->p_log, OSM_LOG_ERROR,
  "osm_req_set: ERR 1102: " "Unable to acquire MAD\n");
  status = IB_INSUFFICIENT_RESOURCES;
  goto Exit;
  }
- tid = cl_hton64((uint64_t) cl_atomic_inc(p_req->p_sm_trans_id));
+ tid = cl_hton64((uint64_t) cl_atomic_inc(&sm->sm_trans_id));
- if (osm_log_is_active(p_req->p_log, OSM_LOG_DEBUG)) {
- osm_log(p_req->p_log, OSM_LOG_DEBUG,
+ if (osm_log_is_active(sm->p_log, OSM_LOG_DEBUG)) {
+ osm_log(sm->p_log, OSM_LOG_DEBUG,
  "osm_req_set: "
  "Setting %s (0x%X), modifier 0x%X, TID 0x%" PRIx64 "\n",
  ib_get_sm_attr_str(attr_id),
@@ -240,7 +198,7 @@ osm_req_set(IN const osm_req_t * const p_req,
  attr_id,
  attr_mod,
  p_path->hop_count,
- p_req->p_subn->opt.m_key,
+ sm->p_subn->opt.m_key,
  p_path->path, IB_LID_PERMISSIVE, IB_LID_PERMISSIVE);
  p_madw->mad_addr.dest_lid = IB_LID_PERMISSIVE;
@@ -259,9 +217,9 @@ osm_req_set(IN const osm_req_t * const p_req,
  memcpy(osm_madw_get_smp_ptr(p_madw)->data, p_payload, payload_size);
- osm_vl15_post(p_req->p_vl15, p_madw);
+ osm_vl15_post(sm->p_vl15, p_madw);
  Exit:
- OSM_LOG_EXIT(p_req->p_log);
+ OSM_LOG_EXIT(sm->p_log);
  return (status);
 }
diff --git a/opensm/opensm/osm_resp.c b/opensm/opensm/osm_resp.c
index e5beb45..285559a 100644
--- a/opensm/opensm/osm_resp.c
+++ b/opensm/opensm/osm_resp.c
@@ -52,7 +52,6 @@
 #include
 #include
 #include
-#include
 #include
 #include
 #include
@@ -64,51 +63,14 @@
 /**********************************************************************
  **********************************************************************/
-void osm_resp_construct(IN osm_resp_t * const p_resp)
-{
- memset(p_resp, 0, sizeof(*p_resp));
-}
-
-/**********************************************************************
- **********************************************************************/
-void osm_resp_destroy(IN osm_resp_t * const p_resp)
-{
- CL_ASSERT(p_resp);
-}
-
-/**********************************************************************
- **********************************************************************/
-ib_api_status_t
-osm_resp_init(IN osm_resp_t * const p_resp,
- IN osm_mad_pool_t * const p_pool,
- IN osm_vl15_t * const p_vl15,
- IN osm_subn_t * const p_subn, IN osm_log_t * const p_log)
-{
- ib_api_status_t status = IB_SUCCESS;
-
- OSM_LOG_ENTER(p_log, osm_resp_init);
-
- osm_resp_construct(p_resp);
-
- p_resp->p_log = p_log;
- p_resp->p_pool = p_pool;
- p_resp->p_vl15 = p_vl15;
- p_resp->p_subn = p_subn;
-
- OSM_LOG_EXIT(p_log);
- return (status);
-}
-
-/**********************************************************************
- **********************************************************************/
 static void
-osm_resp_make_resp_smp(IN const osm_resp_t * const p_resp,
+osm_resp_make_resp_smp(IN osm_sm_t * sm,
  IN const ib_smp_t * const p_src_smp,
  IN const ib_net16_t status,
  IN const uint8_t * const p_payload,
  OUT ib_smp_t * const p_dest_smp)
 {
- OSM_LOG_ENTER(p_resp->p_log, osm_resp_make_resp_smp);
+ OSM_LOG_ENTER(sm->p_log, osm_resp_make_resp_smp);
  CL_ASSERT(p_dest_smp);
  CL_ASSERT(p_src_smp);
@@ -123,7 +85,7 @@ osm_resp_make_resp_smp(IN const osm_resp_t * const p_resp,
  p_dest_smp->method = IB_MAD_METHOD_TRAP_REPRESS;
  p_dest_smp->status = 0;
  } else {
- osm_log(p_resp->p_log, OSM_LOG_ERROR,
+ osm_log(sm->p_log, OSM_LOG_ERROR,
  "osm_resp_make_resp_smp: ERR 1302: "
  "src smp method unsupported 0x%X\n", p_src_smp->method);
  goto Exit;
@@ -137,13 +99,13 @@ osm_resp_make_resp_smp(IN const osm_resp_t * const p_resp,
  memcpy(&p_dest_smp->data, p_payload, IB_SMP_DATA_SIZE);
  Exit:
- OSM_LOG_EXIT(p_resp->p_log);
+ OSM_LOG_EXIT(sm->p_log);
 }
 /**********************************************************************
  **********************************************************************/
 ib_api_status_t
-osm_resp_send(IN const osm_resp_t * const p_resp,
+osm_resp_send(IN osm_sm_t * sm,
  IN const osm_madw_t * const p_req_madw,
  IN const ib_net16_t mad_status,
  IN const uint8_t * const p_payload)
@@ -153,7 +115,7 @@ osm_resp_send(IN const osm_resp_t * const p_resp,
  osm_madw_t *p_madw;
  ib_api_status_t status = IB_SUCCESS;
- OSM_LOG_ENTER(p_resp->p_log, osm_resp_send);
+ OSM_LOG_ENTER(sm->p_log, osm_resp_send);
  CL_ASSERT(p_req_madw);
  CL_ASSERT(p_payload);
@@ -162,12 +124,12 @@ osm_resp_send(IN const osm_resp_t * const p_resp,
  if (osm_exit_flag)
  goto Exit;
- p_madw = osm_mad_pool_get(p_resp->p_pool,
+ p_madw = osm_mad_pool_get(sm->p_mad_pool,
  osm_madw_get_bind_handle(p_req_madw),
  MAD_BLOCK_SIZE, NULL);
  if (p_madw == NULL) {
- osm_log(p_resp->p_log, OSM_LOG_ERROR,
+ osm_log(sm->p_log, OSM_LOG_ERROR,
  "osm_resp_send: ERR 1301: " "Unable to acquire MAD\n");
  status = IB_INSUFFICIENT_RESOURCES;
  goto Exit;
@@ -179,7 +141,7 @@ osm_resp_send(IN const osm_resp_t * const p_resp,
  */
  p_smp = osm_madw_get_smp_ptr(p_madw);
  p_req_smp = osm_madw_get_smp_ptr(p_req_madw);
- osm_resp_make_resp_smp(p_resp, p_req_smp, mad_status, p_payload, p_smp);
+ osm_resp_make_resp_smp(sm, p_req_smp, mad_status, p_payload, p_smp);
  p_madw->mad_addr.dest_lid =
  p_req_madw->mad_addr.addr_type.smi.source_lid;
  p_madw->mad_addr.addr_type.smi.source_lid =
@@ -188,8 +150,8 @@ osm_resp_send(IN const osm_resp_t * const p_resp,
  p_madw->resp_expected = FALSE;
  p_madw->fail_msg = CL_DISP_MSGID_NONE;
- if (osm_log_is_active(p_resp->p_log, OSM_LOG_DEBUG)) {
- osm_log(p_resp->p_log, OSM_LOG_DEBUG,
+ if (osm_log_is_active(sm->p_log, OSM_LOG_DEBUG)) {
+ osm_log(sm->p_log, OSM_LOG_DEBUG,
  "osm_resp_send: "
  "Responding to %s (0x%X)"
  "\n\t\t\t\tattribute modifier 0x%X, TID 0x%" PRIx64
@@ -198,9 +160,9 @@ osm_resp_send(IN const osm_resp_t * const p_resp,
  cl_ntoh64(p_smp->trans_id));
  }
- osm_vl15_post(p_resp->p_vl15, p_madw);
+ osm_vl15_post(sm->p_vl15, p_madw);
  Exit:
- OSM_LOG_EXIT(p_resp->p_log);
+ OSM_LOG_EXIT(sm->p_log);
  return (status);
 }
diff --git a/opensm/opensm/osm_sm.c b/opensm/opensm/osm_sm.c
index af8c569..f2d259d 100644
--- a/opensm/opensm/osm_sm.c
+++ b/opensm/opensm/osm_sm.c
@@ -158,8 +158,6 @@ void osm_sm_construct(IN osm_sm_t * const p_sm)
  cl_event_wheel_construct(&p_sm->trap_aging_tracker);
  cl_thread_construct(&p_sm->sweeper);
  cl_spinlock_construct(&p_sm->mgrp_lock);
- osm_req_construct(&p_sm->req);
- osm_resp_construct(&p_sm->resp);
  osm_sm_mad_ctrl_construct(&p_sm->mad_ctrl);
  osm_lid_mgr_construct(&p_sm->lid_mgr);
  osm_ucast_mgr_construct(&p_sm->ucast_mgr);
@@ -224,8 +222,6 @@ void osm_sm_shutdown(IN osm_sm_t * const p_sm)
 void osm_sm_destroy(IN osm_sm_t * const p_sm)
 {
  OSM_LOG_ENTER(p_sm->p_log, osm_sm_destroy);
- osm_req_destroy(&p_sm->req);
- osm_resp_destroy(&p_sm->resp);
  osm_lid_mgr_destroy(&p_sm->lid_mgr);
  osm_ucast_mgr_destroy(&p_sm->ucast_mgr);
  osm_link_mgr_destroy(&p_sm->link_mgr);
@@ -301,16 +297,6 @@ osm_sm_init(IN osm_sm_t * const p_sm,
  if (status != IB_SUCCESS)
  goto Exit;
- status = osm_req_init(&p_sm->req,
- p_mad_pool,
- p_vl15, p_subn, p_log, &p_sm->sm_trans_id);
- if (status != IB_SUCCESS)
- goto Exit;
-
- status = osm_resp_init(&p_sm->resp, p_mad_pool, p_vl15, p_subn, p_log);
- if (status != IB_SUCCESS)
- goto Exit;
-
  status = cl_event_wheel_init(&p_sm->trap_aging_tracker);
  if (status != IB_SUCCESS)
  goto Exit;
diff --git a/opensm/opensm/osm_sm_state_mgr.c b/opensm/opensm/osm_sm_state_mgr.c
index 52aa199..8cd3276 100644
--- a/opensm/opensm/osm_sm_state_mgr.c
+++ b/opensm/opensm/osm_sm_state_mgr.c
@@ -238,7 +238,7 @@ __osm_sm_state_mgr_send_master_sm_info_req(IN osm_sm_state_mgr_t * p_sm_mgr)
  context.smi_context.port_guid = p_port->guid;
  context.smi_context.set_method = FALSE;
- status = osm_req_get(p_sm_mgr->p_req,
+ status = osm_req_get(p_sm_mgr->sm,
  osm_physp_get_dr_path_ptr(p_port->p_physp),
  IB_MAD_ATTR_SM_INFO, 0,
  CL_DISP_MSGID_NONE, &context);
@@ -403,7 +403,6 @@ osm_sm_state_mgr_init(IN osm_sm_state_mgr_t * const p_sm_mgr, IN osm_sm_t * sm)
  p_sm_mgr->sm = sm;
  p_sm_mgr->p_log = sm->p_log;
- p_sm_mgr->p_req = &sm->req;
  p_sm_mgr->p_subn = sm->p_subn;
  if (p_sm_mgr->p_subn->opt.sm_inactive) {
diff --git a/opensm/opensm/osm_sminfo_rcv.c b/opensm/opensm/osm_sminfo_rcv.c
index b150edd..63cc393 100644
--- a/opensm/opensm/osm_sminfo_rcv.c
+++ b/opensm/opensm/osm_sminfo_rcv.c
@@ -128,7 +128,7 @@ __osm_sminfo_rcv_process_get_request(IN osm_sm_t * sm,
  p_smi->sm_key = 0;
  }
- status = osm_resp_send(&sm->resp, p_madw, 0, payload);
+ status = osm_resp_send(sm, p_madw, 0, payload);
  if (status != IB_SUCCESS) {
  osm_log(sm->p_log, OSM_LOG_ERROR,
  "__osm_sminfo_rcv_process_get_request: ERR 2F02: "
@@ -241,7 +241,7 @@ __osm_sminfo_rcv_process_set_request(IN osm_sm_t * sm,
  osm_get_sm_mgr_state_str(ib_sminfo_get_state
  (sm_smi)));
  /* send a response with error code */
- status = osm_resp_send(&sm->resp, p_madw, 7, payload);
+ status = osm_resp_send(sm, p_madw, 7, payload);
  if (status != IB_SUCCESS)
  osm_log(sm->p_log, OSM_LOG_ERROR,
  "__osm_sminfo_rcv_process_set_request: ERR 2F05: "
@@ -291,7 +291,7 @@ __osm_sminfo_rcv_process_set_request(IN osm_sm_t * sm,
  osm_get_sm_mgr_state_str(ib_sminfo_get_state
  (sm_smi)));
  /* send a response with error code */
- status = osm_resp_send(&sm->resp, p_madw, 7, payload);
+ status = osm_resp_send(sm, p_madw, 7, payload);
  if (status != IB_SUCCESS)
  osm_log(sm->p_log, OSM_LOG_ERROR,
  "__osm_sminfo_rcv_process_set_request: ERR 2F08: "
@@ -302,7 +302,7 @@ __osm_sminfo_rcv_process_set_request(IN osm_sm_t * sm,
  }
  /* the SubnSet(SMInfo) command is ok. Send a response. */
- status = osm_resp_send(&sm->resp, p_madw, 0, payload);
+ status = osm_resp_send(sm, p_madw, 0, payload);
  if (status != IB_SUCCESS)
  osm_log(sm->p_log, OSM_LOG_ERROR,
  "__osm_sminfo_rcv_process_set_request: ERR 2F09: "
diff --git a/opensm/opensm/osm_state_mgr.c b/opensm/opensm/osm_state_mgr.c
index 5c196e3..e4130cc 100644
--- a/opensm/opensm/osm_state_mgr.c
+++ b/opensm/opensm/osm_state_mgr.c
@@ -109,7 +109,6 @@ osm_state_mgr_init(IN osm_state_mgr_t * const p_mgr, IN osm_sm_t * sm)
  p_mgr->p_link_mgr = &sm->link_mgr;
  p_mgr->p_drop_mgr = &sm->drop_mgr;
  p_mgr->p_mad_ctrl = &sm->mad_ctrl;
- p_mgr->p_req = &sm->req;
  p_mgr->p_stats = &sm->p_subn->p_osm->stats;
  p_mgr->p_sm_state_mgr = &sm->sm_state_mgr;
  p_mgr->state = OSM_SM_STATE_IDLE;
@@ -486,9 +485,7 @@ static void __osm_state_mgr_get_sw_info(IN cl_map_item_t * const p_object,
  mad_context.si_context.set_method = FALSE;
  mad_context.si_context.light_sweep = TRUE;
- status = osm_req_get(p_mgr->p_req,
- p_dr_path,
- IB_MAD_ATTR_SWITCH_INFO, 0,
+ status = osm_req_get(p_mgr->sm, p_dr_path, IB_MAD_ATTR_SWITCH_INFO, 0,
  OSM_MSG_LIGHT_SWEEP_FAIL, &mad_context);
  if (status != IB_SUCCESS) {
@@ -532,8 +529,7 @@ __osm_state_mgr_get_remote_port_info(IN osm_state_mgr_t * const p_mgr,
  /* note that with some negative logic - if the query failed it means that
  * there is no point in going to heavy sweep */
- status = osm_req_get(p_mgr->p_req,
- &rem_node_dr_path,
+ status = osm_req_get(p_mgr->sm, &rem_node_dr_path,
  IB_MAD_ATTR_PORT_INFO, 0,
  CL_DISP_MSGID_NONE, &mad_context);
@@ -595,7 +591,7 @@ static ib_api_status_t __osm_state_mgr_sweep_hop_0(IN osm_state_mgr_t *
  CL_PLOCK_RELEASE(p_mgr->p_lock);
  osm_dr_path_init(&dr_path, h_bind, 0, path_array);
- status = osm_req_get(p_mgr->p_req,
+ status = osm_req_get(p_mgr->sm,
  &dr_path,
  IB_MAD_ATTR_NODE_INFO, 0,
  CL_DISP_MSGID_NONE, NULL);
@@ -813,8 +809,7 @@ static ib_api_status_t __osm_state_mgr_sweep_hop_1(IN osm_state_mgr_t *
  path_array[1] = port_num;
  osm_dr_path_init(&hop_1_path, h_bind, 1, path_array);
- status = osm_req_get(p_mgr->p_req,
- &hop_1_path,
+ status = osm_req_get(p_mgr->sm, &hop_1_path,
  IB_MAD_ATTR_NODE_INFO, 0,
  CL_DISP_MSGID_NONE, &context);
@@ -849,7 +844,7 @@ static ib_api_status_t __osm_state_mgr_sweep_hop_1(IN osm_state_mgr_t *
  osm_dr_path_init(&hop_1_path, h_bind, 1, path_array);
  status =
- osm_req_get(p_mgr->p_req, &hop_1_path,
+ osm_req_get(p_mgr->sm, &hop_1_path,
  IB_MAD_ATTR_NODE_INFO, 0,
  CL_DISP_MSGID_NONE, &context);
@@ -1110,7 +1105,7 @@ __osm_state_mgr_send_handover(IN osm_state_mgr_t * const p_mgr,
  p_smi->sm_key = 0;
  }
- status = osm_req_set(p_mgr->p_req,
+ status = osm_req_set(p_mgr->sm,
  osm_physp_get_dr_path_ptr(p_port->p_physp),
  payload, sizeof(payload),
  IB_MAD_ATTR_SM_INFO, IB_SMINFO_ATTR_MOD_HANDOVER,
diff --git a/opensm/opensm/osm_sw_info_rcv.c b/opensm/opensm/osm_sw_info_rcv.c
index d9bd21b..962f6c7 100644
--- a/opensm/opensm/osm_sw_info_rcv.c
+++ b/opensm/opensm/osm_sw_info_rcv.c
@@ -111,9 +111,7 @@ __osm_si_rcv_get_port_info(IN osm_sm_t * sm,
  p_smp->hop_count, p_smp->initial_path);
  for (port_num = 0; port_num < num_ports; port_num++) {
- status = osm_req_get(&sm->req,
- &dr_path,
- IB_MAD_ATTR_PORT_INFO,
+ status = osm_req_get(sm, &dr_path, IB_MAD_ATTR_PORT_INFO,
  cl_hton32(port_num),
  CL_DISP_MSGID_NONE, &context);
  if (status != IB_SUCCESS) {
diff --git a/opensm/opensm/osm_trap_rcv.c b/opensm/opensm/osm_trap_rcv.c
index 196bca2..b7a8c40 100644
--- a/opensm/opensm/osm_trap_rcv.c
+++ b/opensm/opensm/osm_trap_rcv.c
@@ -393,7 +393,7 @@ __osm_trap_rcv_process_request(IN osm_sm_t * sm,
  "__osm_trap_rcv_process_request: ERR 3809: "
  "Failed to find source physical port for trap\n");
- status = osm_resp_send(&sm->resp, &tmp_madw, 0, payload);
+ status = osm_resp_send(sm, &tmp_madw, 0, payload);
  if (status != IB_SUCCESS) {
  osm_log(sm->p_log, OSM_LOG_ERROR,
  "__osm_trap_rcv_process_request: ERR 3802: "
@@ -523,8 +523,7 @@ __osm_trap_rcv_process_request(IN osm_sm_t * sm,
  active_transition = FALSE;
  status =
- osm_req_set(&sm->p_subn->
- p_osm->sm.req,
+ osm_req_set(sm,
  osm_physp_get_dr_path_ptr
  (p_physp),
  payload,
diff --git a/opensm/opensm/osm_ucast_mgr.c b/opensm/opensm/osm_ucast_mgr.c
index d7c045e..88e29e9 100644
--- a/opensm/opensm/osm_ucast_mgr.c
+++ b/opensm/opensm/osm_ucast_mgr.c
@@ -99,7 +99,6 @@ osm_ucast_mgr_init(IN osm_ucast_mgr_t * const p_mgr, IN osm_sm_t * sm)
  p_mgr->p_log = sm->p_log;
  p_mgr->p_subn = sm->p_subn;
  p_mgr->p_lock = sm->p_lock;
- p_mgr->p_req = &sm->req;
  p_mgr->lft_buf = malloc(IB_LID_UCAST_END_HO + 1);
  if (!p_mgr->lft_buf)
@@ -431,9 +430,7 @@ osm_ucast_mgr_set_fwd_table(IN osm_ucast_mgr_t * const p_mgr,
  context.si_context.node_guid = osm_node_get_node_guid(p_node);
  context.si_context.set_method = TRUE;
- status = osm_req_set(p_mgr->p_req,
- p_path,
- (uint8_t *) & si,
+ status = osm_req_set(p_mgr->sm, p_path, (uint8_t *) & si,
  sizeof(si),
  IB_MAD_ATTR_SWITCH_INFO,
  0, CL_DISP_MSGID_NONE, &context);
@@ -469,8 +466,7 @@ osm_ucast_mgr_set_fwd_table(IN osm_ucast_mgr_t * const p_mgr,
  "Writing FT block %u\n", block_id_ho);
  }
- status = osm_req_set(p_mgr->p_req,
- p_path,
+ status = osm_req_set(p_mgr->sm, p_path,
  p_mgr->lft_buf + block_id_ho * 64,
  sizeof(block),
  IB_MAD_ATTR_LIN_FWD_TBL,
diff --git a/opensm/opensm/osm_vl_arb_rcv.c b/opensm/opensm/osm_vl_arb_rcv.c
index 23b081a..a88bf70 100644
--- a/opensm/opensm/osm_vl_arb_rcv.c
+++ b/opensm/opensm/osm_vl_arb_rcv.c
@@ -53,7 +53,6 @@
 #include
 #include
 #include
-#include
 #include
 #include
 #include
--
1.5.3.4.206.g58ba4

From sashak at voltaire.com Sun Jan 6 07:53:32 2008
From: sashak at voltaire.com (Sasha Khapyorsky)
Date: Sun, 6 Jan 2008 15:53:32 +0000
Subject: [ofa-general] [PATCH] opensm: remove unused header files
In-Reply-To: <20080106155222.GC26304@sashak.voltaire.com>
References: <20080106154919.GB26304@sashak.voltaire.com> <20080106155222.GC26304@sashak.voltaire.com>
Message-ID: <20080106155332.GD26304@sashak.voltaire.com>

Remove unused header files osm_req.h and osm_resp.h.

Signed-off-by: Sasha Khapyorsky
---
 opensm/include/Makefile.am | 2 -
 opensm/include/opensm/osm_req.h | 237 --------------------------------------
 opensm/include/opensm/osm_resp.h | 229 ------------------------------------
 3 files changed, 0 insertions(+), 468 deletions(-)
 delete mode 100644 opensm/include/opensm/osm_req.h
 delete mode 100644 opensm/include/opensm/osm_resp.h

diff --git a/opensm/include/Makefile.am b/opensm/include/Makefile.am
index 117087f..45b02cd 100644
--- a/opensm/include/Makefile.am
+++ b/opensm/include/Makefile.am
@@ -12,12 +12,10 @@ EXTRA_DIST = \
  $(srcdir)/opensm/osm_madw.h \
  $(srcdir)/opensm/osm_subnet.h \
  $(srcdir)/opensm/osm_sweep_fail_ctrl.h \
- $(srcdir)/opensm/osm_resp.h \
  $(srcdir)/opensm/osm_partition.h \
  $(srcdir)/opensm/osm_helper.h \
  $(srcdir)/opensm/osm_node.h \
  $(srcdir)/opensm/osm_console.h \
- $(srcdir)/opensm/osm_req.h \
  $(srcdir)/opensm/osm_mcm_info.h \
  $(srcdir)/opensm/osm_inform.h \
  $(srcdir)/opensm/osm_path.h \
diff --git a/opensm/include/opensm/osm_req.h b/opensm/include/opensm/osm_req.h
deleted file mode 100644
index 6a32f70..0000000
--- a/opensm/include/opensm/osm_req.h
+++ /dev/null
@@ -1,237 +0,0 @@
-/*
- * Copyright (c) 2004-2007 Voltaire, Inc. All rights reserved.
- * Copyright (c) 2002-2005 Mellanox Technologies LTD. All rights reserved.
- * Copyright (c) 1996-2003 Intel Corporation. All rights reserved.
- *
- * This software is available to you under a choice of one of two
- * licenses.
You may choose to be licensed under the terms of the GNU - * General Public License (GPL) Version 2, available from the file - * COPYING in the main directory of this source tree, or the - * OpenIB.org BSD license below: - * - * Redistribution and use in source and binary forms, with or - * without modification, are permitted provided that the following - * conditions are met: - * - * - Redistributions of source code must retain the above - * copyright notice, this list of conditions and the following - * disclaimer. - * - * - Redistributions in binary form must reproduce the above - * copyright notice, this list of conditions and the following - * disclaimer in the documentation and/or other materials - * provided with the distribution. - * - * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, - * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF - * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND - * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS - * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN - * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN - * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE - * SOFTWARE. - * - */ - -/* - * Abstract: - * Declaration of osm_req_t. - * This object represents an object that generically requests - * attributes from a node. - * This object is part of the OpenSM family of objects. 
- * - * Environment: - * Linux User Mode - * - * $Revision: 1.4 $ - */ - -#ifndef _OSM_REQ_H_ -#define _OSM_REQ_H_ - -#include -#include -#include -#include -#include -#include -#include -#include -#include - -#ifdef __cplusplus -# define BEGIN_C_DECLS extern "C" { -# define END_C_DECLS } -#else /* !__cplusplus */ -# define BEGIN_C_DECLS -# define END_C_DECLS -#endif /* __cplusplus */ - -BEGIN_C_DECLS -/****h* OpenSM/Generic Requester -* NAME -* Generic Requester -* -* DESCRIPTION -* The Generic Requester object encapsulates the information -* needed to request an attribute from a node. -* -* The Generic Requester object is thread safe. -* -* This object should be treated as opaque and should be -* manipulated only through the provided functions. -* -* AUTHOR -* Steve King, Intel -* -*********/ -/****s* OpenSM: Generic Requester/osm_req_t -* NAME -* osm_req_t -* -* DESCRIPTION -* Generic Requester structure. -* -* This object should be treated as opaque and should -* be manipulated only through the provided functions. -* -* SYNOPSIS -*/ -typedef struct _osm_req { - osm_mad_pool_t *p_pool; - osm_vl15_t *p_vl15; - osm_log_t *p_log; - osm_subn_t *p_subn; - atomic32_t *p_sm_trans_id; -} osm_req_t; -/* -* FIELDS -* p_pool -* Pointer to the MAD pool. -* -* p_vl15 -* Pointer to the VL15 interface. -* -* p_log -* Pointer to the log object. -* -* p_subn -* Pointer to the subnet object. -* -* p_sm_trans_id -* Pointer to transaction ID. -* -* SEE ALSO -* Generic Requester object -*********/ - -/****f* OpenSM: Generic Requester/osm_req_construct -* NAME -* osm_req_construct -* -* DESCRIPTION -* This function constructs a Generic Requester object. -* -* SYNOPSIS -*/ -void osm_req_construct(IN osm_req_t * const p_req); -/* -* PARAMETERS -* p_req -* [in] Pointer to a Generic Requester object to construct. -* -* RETURN VALUE -* This function does not return a value. -* -* NOTES -* Allows calling osm_req_init, and osm_req_destroy. 
-* -* Calling osm_req_construct is a prerequisite to calling any other -* method except osm_req_init. -* -* SEE ALSO -* Generic Requester object, osm_req_init, -* osm_req_destroy -*********/ - -/****f* OpenSM: Generic Requester/osm_req_destroy -* NAME -* osm_req_destroy -* -* DESCRIPTION -* The osm_req_destroy function destroys the object, releasing -* all resources. -* -* SYNOPSIS -*/ -void osm_req_destroy(IN osm_req_t * const p_req); -/* -* PARAMETERS -* p_req -* [in] Pointer to the object to destroy. -* -* RETURN VALUE -* This function does not return a value. -* -* NOTES -* Performs any necessary cleanup of the specified -* Generic Requester object. -* Further operations should not be attempted on the destroyed object. -* This function should only be called after a call to -* osm_req_construct or osm_req_init. -* -* SEE ALSO -* Generic Requester object, osm_req_construct, -* osm_req_init -*********/ - -/****f* OpenSM: Generic Requester/osm_req_init -* NAME -* osm_req_init -* -* DESCRIPTION -* The osm_req_init function initializes a -* Generic Requester object for use. -* -* SYNOPSIS -*/ -ib_api_status_t -osm_req_init(IN osm_req_t * const p_req, - IN osm_mad_pool_t * const p_pool, - IN osm_vl15_t * const p_vl15, - IN osm_subn_t * const p_subn, - IN osm_log_t * const p_log, IN atomic32_t * const p_sm_trans_id); -/* -* PARAMETERS -* p_req -* [in] Pointer to an osm_req_t object to initialize. -* -* p_mad_pool -* [in] Pointer to the MAD pool. -* -* p_vl15 -* [in] Pointer to the VL15 interface. -* -* p_subn -* [in] Pointer to the subnet object. -* -* p_log -* [in] Pointer to the log object. -* -* p_sm_trans_id -* [in] Pointer to the atomic SM transaction ID. -* -* RETURN VALUES -* IB_SUCCESS if the Generic Requester object was initialized -* successfully. -* -* NOTES -* Allows calling other Generic Requester methods. 
-* -* SEE ALSO -* Generic Requester object, osm_req_construct, -* osm_req_destroy -*********/ - -END_C_DECLS -#endif /* _OSM_REQ_H_ */ diff --git a/opensm/include/opensm/osm_resp.h b/opensm/include/opensm/osm_resp.h deleted file mode 100644 index 115d227..0000000 --- a/opensm/include/opensm/osm_resp.h +++ /dev/null @@ -1,229 +0,0 @@ -/* - * Copyright (c) 2004-2007 Voltaire, Inc. All rights reserved. - * Copyright (c) 2002-2005 Mellanox Technologies LTD. All rights reserved. - * Copyright (c) 1996-2003 Intel Corporation. All rights reserved. - * - * This software is available to you under a choice of one of two - * licenses. You may choose to be licensed under the terms of the GNU - * General Public License (GPL) Version 2, available from the file - * COPYING in the main directory of this source tree, or the - * OpenIB.org BSD license below: - * - * Redistribution and use in source and binary forms, with or - * without modification, are permitted provided that the following - * conditions are met: - * - * - Redistributions of source code must retain the above - * copyright notice, this list of conditions and the following - * disclaimer. - * - * - Redistributions in binary form must reproduce the above - * copyright notice, this list of conditions and the following - * disclaimer in the documentation and/or other materials - * provided with the distribution. - * - * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, - * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF - * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND - * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS - * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN - * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN - * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE - * SOFTWARE. - * - */ - -/* - * Abstract: - * Declaration of osm_resp_t. 
- * This object represents an object that generically requests - * attributes from a node. - * This object is part of the OpenSM family of objects. - * - * Environment: - * Linux User Mode - * - * $Revision: 1.4 $ - */ - -#ifndef _OSM_RESP_H_ -#define _OSM_RESP_H_ - -#include -#include -#include -#include -#include -#include -#include -#include -#include - -#ifdef __cplusplus -# define BEGIN_C_DECLS extern "C" { -# define END_C_DECLS } -#else /* !__cplusplus */ -# define BEGIN_C_DECLS -# define END_C_DECLS -#endif /* __cplusplus */ - -BEGIN_C_DECLS -/****h* OpenSM/Generic Responder -* NAME -* Generic Responder -* -* DESCRIPTION -* The Generic Responder object encapsulates the information -* needed to respond to an attribute from a node. -* -* The Generic Responder object is thread safe. -* -* This object should be treated as opaque and should be -* manipulated only through the provided functions. -* -* AUTHOR -* Steve King, Intel -* -*********/ -/****s* OpenSM: Generic Responder/osm_resp_t -* NAME -* osm_resp_t -* -* DESCRIPTION -* Generic Responder structure. -* -* This object should be treated as opaque and should -* be manipulated only through the provided functions. -* -* SYNOPSIS -*/ -typedef struct _osm_resp { - osm_mad_pool_t *p_pool; - osm_vl15_t *p_vl15; - osm_log_t *p_log; - osm_subn_t *p_subn; -} osm_resp_t; -/* -* FIELDS -* p_pool -* Pointer to the MAD pool. -* -* p_vl15 -* Pointer to the VL15 interface. -* -* p_log -* Pointer to the log object. -* -* p_subn -* Pointer to the subnet object. -* -* SEE ALSO -* Generic Responder object -*********/ - -/****f* OpenSM: Generic Responder/osm_resp_construct -* NAME -* osm_resp_construct -* -* DESCRIPTION -* This function constructs a Generic Responder object. -* -* SYNOPSIS -*/ -void osm_resp_construct(IN osm_resp_t * const p_resp); -/* -* PARAMETERS -* p_resp -* [in] Pointer to a Generic Responder object to construct. -* -* RETURN VALUE -* This function does not return a value. 
-* -* NOTES -* Allows calling osm_resp_init, osm_resp_destroy -* -* Calling osm_resp_construct is a prerequisite to calling any other -* method except osm_resp_init. -* -* SEE ALSO -* Generic Responder object, osm_resp_init, -* osm_resp_destroy -*********/ - -/****f* OpenSM: Generic Responder/osm_resp_destroy -* NAME -* osm_resp_destroy -* -* DESCRIPTION -* The osm_resp_destroy function destroys the object, releasing -* all resources. -* -* SYNOPSIS -*/ -void osm_resp_destroy(IN osm_resp_t * const p_resp); -/* -* PARAMETERS -* p_resp -* [in] Pointer to the object to destroy. -* -* RETURN VALUE -* This function does not return a value. -* -* NOTES -* Performs any necessary cleanup of the specified -* Generic Responder object. -* Further operations should not be attempted on the destroyed object. -* This function should only be called after a call to -* osm_resp_construct or osm_resp_init. -* -* SEE ALSO -* Generic Responder object, osm_resp_construct, -* osm_resp_init -*********/ - -/****f* OpenSM: Generic Responder/osm_resp_init -* NAME -* osm_resp_init -* -* DESCRIPTION -* The osm_resp_init function initializes a -* Generic Responder object for use. -* -* SYNOPSIS -*/ -ib_api_status_t -osm_resp_init(IN osm_resp_t * const p_resp, - IN osm_mad_pool_t * const p_pool, - IN osm_vl15_t * const p_vl15, - IN osm_subn_t * const p_subn, IN osm_log_t * const p_log); -/* -* PARAMETERS -* p_resp -* [in] Pointer to an osm_resp_t object to initialize. -* -* p_mad_pool -* [in] Pointer to the MAD pool. -* -* p_vl15 -* [in] Pointer to the VL15 interface. -* -* p_subn -* [in] Pointer to the subnet object. -* -* p_log -* [in] Pointer to the log object. -* -* RETURN VALUES -* IB_SUCCESS if the Generic Responder object was initialized -* successfully. -* -* NOTES -* Allows calling other Generic Responder methods. 
-* -* SEE ALSO -* Generic Responder object, osm_resp_construct, -* osm_resp_destroy -*********/ - -END_C_DECLS -#endif /* _OSM_RESP_H_ */ -- 1.5.3.4.206.g58ba4 From lennyb at voltaire.com Sun Jan 6 08:23:54 2008 From: lennyb at voltaire.com (Lenny Verkhovsky) Date: Sun, 6 Jan 2008 18:23:54 +0200 Subject: [ofa-general] AF_INET_SDP value Message-ID: <39C75744D164D948A170E9792AF8E7CA4296C4@exil.voltaire.com> Hi, Is AF_INET_SDP equals 27 is standartized for all architectures and kernels ? Best Regards, Lenny.
From jim at mellanox.com Sun Jan 6 10:21:46 2008 From: jim at mellanox.com (Jim Mott) Date: Sun, 6 Jan 2008 10:21:46 -0800 Subject: [ofa-general] AF_INET_SDP value In-Reply-To: <39C75744D164D948A170E9792AF8E7CA4296C4@exil.voltaire.com> References: <39C75744D164D948A170E9792AF8E7CA4296C4@exil.voltaire.com> Message-ID: I do not believe so. There are some politics involved. This value is shipped as part of the user space libsdp code. Perhaps someone that knows more history on this can comment? From: general-bounces at lists.openfabrics.org [mailto:general-bounces at lists.openfabrics.org] On Behalf Of Lenny Verkhovsky Sent: Sunday, January 06, 2008 10:24 AM To: general at lists.openfabrics.org Subject: [ofa-general] AF_INET_SDP value Hi, Is AF_INET_SDP equals 27 is standartized for all architectures and kernels ? Best Regards, Lenny. From sashak at voltaire.com Sun Jan 6 11:09:12 2008 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Sun, 6 Jan 2008 19:09:12 +0000 Subject: [ofa-general] ***SPAM*** [PATCH] management/*/Makefile.am: fix ChangeLog generation rule Message-ID: <20080106190912.GE26304@sashak.voltaire.com> When an individual package is detached from the main source tree, running 'make dist' in the detached sub-tree fails because the ../gen_chlog.sh script is not found. Fix it by making ChangeLog generation in dist-hook optional.
Signed-off-by: Sasha Khapyorsky --- infiniband-diags/Makefile.am | 5 +++-- libibcommon/Makefile.am | 5 +++-- libibmad/Makefile.am | 5 +++-- libibumad/Makefile.am | 5 +++-- opensm/Makefile.am | 5 +++-- 5 files changed, 15 insertions(+), 10 deletions(-) diff --git a/infiniband-diags/Makefile.am b/infiniband-diags/Makefile.am index edff06c..e1b4d58 100644 --- a/infiniband-diags/Makefile.am +++ b/infiniband-diags/Makefile.am @@ -95,8 +95,9 @@ EXTRA_DIST = scripts include infiniband-diags.spec.in infiniband-diags.spec \ $(man_MANS) autogen.sh dist-hook: - test -x ../$(top_srcdir)/gen_chlog.sh \ - && ../$(top_srcdir)/gen_chlog.sh $(PACKAGE) > $(distdir)/ChangeLog + if [ -x $(top_srcdir)/../gen_chlog.sh ] ; then \ + $(top_srcdir)/../gen_chlog.sh $(PACKAGE) > $(distdir)/ChangeLog ; \ + fi # install this to a default location. install-data-hook: diff --git a/libibcommon/Makefile.am b/libibcommon/Makefile.am index af60035..75889f4 100644 --- a/libibcommon/Makefile.am +++ b/libibcommon/Makefile.am @@ -27,5 +27,6 @@ EXTRA_DIST = $(srcdir)/include/infiniband/common.h \ $(srcdir)/src/libibcommon.map libibcommon.ver autogen.sh dist-hook: - test -x ../$(top_srcdir)/gen_chlog.sh \ - && ../$(top_srcdir)/gen_chlog.sh $(PACKAGE) > $(distdir)/ChangeLog + if [ -x $(top_srcdir)/../gen_chlog.sh ] ; then \ + $(top_srcdir)/../gen_chlog.sh $(PACKAGE) > $(distdir)/ChangeLog ; \ + fi diff --git a/libibmad/Makefile.am b/libibmad/Makefile.am index a350d50..beae1a4 100644 --- a/libibmad/Makefile.am +++ b/libibmad/Makefile.am @@ -29,5 +29,6 @@ EXTRA_DIST = $(srcdir)/include/infiniband/mad.h libibmad.spec.in libibmad.spec \ $(srcdir)/src/libibmad.map libibmad.ver autogen.sh dist-hook: - test -x ../$(top_srcdir)/gen_chlog.sh \ - && ../$(top_srcdir)/gen_chlog.sh $(PACKAGE) > $(distdir)/ChangeLog + if [ -x $(top_srcdir)/../gen_chlog.sh ] ; then \ + $(top_srcdir)/../gen_chlog.sh $(PACKAGE) > $(distdir)/ChangeLog ; \ + fi diff --git a/libibumad/Makefile.am b/libibumad/Makefile.am index 7674654..49c8b11 
100644 --- a/libibumad/Makefile.am +++ b/libibumad/Makefile.am @@ -41,8 +41,9 @@ EXTRA_DIST = $(srcdir)/include/infiniband/umad.h \ $(man_MANS) autogen.sh dist-hook: - test -x ../$(top_srcdir)/gen_chlog.sh \ - && ../$(top_srcdir)/gen_chlog.sh $(PACKAGE) > $(distdir)/ChangeLog + if [ -x $(top_srcdir)/../gen_chlog.sh ] ; then \ + $(top_srcdir)/../gen_chlog.sh $(PACKAGE) > $(distdir)/ChangeLog ; \ + fi install-data-hook: cd $(DESTDIR)$(mandir)/man3 && \ diff --git a/opensm/Makefile.am b/opensm/Makefile.am index 70de2d7..0c817ae 100644 --- a/opensm/Makefile.am +++ b/opensm/Makefile.am @@ -28,5 +28,6 @@ various_scripts = $(wildcard scripts/*) EXTRA_DIST = autogen.sh opensm.spec $(various_scripts) $(man_MANS) dist-hook: $(EXTRA_DIST) - test -x ../$(top_srcdir)/gen_chlog.sh \ - && ../$(top_srcdir)/gen_chlog.sh $(PACKAGE) > $(distdir)/ChangeLog + if [ -x $(top_srcdir)/../gen_chlog.sh ] ; then \ + $(top_srcdir)/../gen_chlog.sh $(PACKAGE) > $(distdir)/ChangeLog ; \ + fi -- 1.5.4.rc2.38.gd6da3
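For illustration only (this is not part of the posted patch): a minimal shell sketch of why the old dist-hook rule in the ChangeLog generation patch aborts 'make dist' while the new one does not. It assumes a POSIX shell. The path in "script" is hypothetical and stands in for a gen_chlog.sh that is missing from a detached sub-tree, and "pkg" is a placeholder package name.

```shell
#!/bin/sh
# Hypothetical path: stands in for a gen_chlog.sh that does not exist
# once the package has been detached from the main source tree.
script=./no-such-dir/gen_chlog.sh

# Old rule: "test -x ... && cmd". When the script is missing, test
# fails and the whole AND-list exits non-zero, so make stops.
( test -x "$script" && "$script" pkg > /dev/null )
old_status=$?

# New rule: "if [ -x ... ] ; then cmd ; fi". An if statement whose
# condition is false and which has no else branch exits with status 0,
# so the ChangeLog step is silently skipped.
( if [ -x "$script" ] ; then "$script" pkg > /dev/null ; fi )
new_status=$?

echo "old rule exit status: $old_status"
echo "new rule exit status: $new_status"
```

Because make treats any non-zero exit status of a recipe line as an error, only the first form breaks the build when the helper script is absent.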
From sashak at voltaire.com Sun Jan 6 14:58:13 2008 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Sun, 6 Jan 2008 22:58:13 +0000 Subject: [ofa-general] [PATCH] management/gen_ver.sh script Message-ID: <20080106225813.GF26304@sashak.voltaire.com> This generates a version string which includes recent version as specified in correspondent sub project's configure.in file, plus git revision abbreviation in the case if sub-project HEAD is different from recent tag, plus "-dirty" suffix if local uncommitted changes are in the sub project tree. For example: $ ./gen_ver.sh opensm 3.1.8-5a03b64-dirty Signed-off-by: Sasha Khapyorsky --- gen_ver.sh | 38 ++++++++++++++++++++++++++++++++++++++ 1 files changed, 38 insertions(+), 0 deletions(-) create mode 100755 gen_ver.sh diff --git a/gen_ver.sh b/gen_ver.sh new file mode 100755 index 0000000..3524182 --- /dev/null +++ b/gen_ver.sh @@ -0,0 +1,38 @@ +#!/bin/sh +# +# This generates a version string which includes recent version as +# specified in correspondent sub project's configure.in file, plus +# git revision abbreviation in the case if sub-project HEAD is different +# from recent tag, plus "-dirty" suffix if local uncommitted changes are +# in the sub project tree. +# + +usage() +{ + echo "Usage: $0 " + exit 2 +} + +test -z "$1" && usage + +package=$1 + +cd `dirname $0` + +conf_file=$package/configure.in +version=`cat $conf_file | sed -ne '/AC_INIT.*'$package'.*/s/^AC_INIT.*'$package', \(.*\),.*$/\1/p'` + +git diff --quiet $package-$version..HEAD -- $package > /dev/null 2>&1 +if [ $? -eq 1 ] ; then + abbr=`git rev-parse --short --verify HEAD 2>/dev/null` + if [ ! -z "$abbr" ] ; then + version="${version}-${abbr}" + fi +fi + +git diff-index --quiet HEAD -- $package > /dev/null 2>&1 +if [ $? 
-eq 1 ] ; then + version="${version}-dirty" +fi + +echo $version -- 1.5.4.rc2.38.gd6da3 From sashak at voltaire.com Sun Jan 6 15:01:39 2008 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Sun, 6 Jan 2008 23:01:39 +0000 Subject: [ofa-general] [PATCH] opensm: update OpenSM version according to the tree state In-Reply-To: <20080106225813.GF26304@sashak.voltaire.com> References: <20080106225813.GF26304@sashak.voltaire.com> Message-ID: <20080106230139.GG26304@sashak.voltaire.com> This adds an automatic updater for the OpenSM version (in the generated osm_version.h file). It takes its result from the gen_ver.sh script; the generated version has one of the following forms: * 3.1.8 , when the tree state is equivalent to the opensm-3.1.8 release tag * 3.1.8-4449c46 , same + git commit abbreviation, when the tree differs from the release tag * 3.1.8-4449c46-dirty, same as above, but the tree also has uncommitted changes Signed-off-by: Sasha Khapyorsky --- opensm/opensm/Makefile.am | 12 ++++++++++++ 1 files changed, 12 insertions(+), 0 deletions(-) diff --git a/opensm/opensm/Makefile.am b/opensm/opensm/Makefile.am index 9511a80..a5d9515 100644 --- a/opensm/opensm/Makefile.am +++ b/opensm/opensm/Makefile.am @@ -102,6 +102,18 @@ opensminclude_HEADERS = $(srcdir)/../include/opensm/osm_base.h \ $(srcdir)/../include/opensm/osm_helper.h \ $(srcdir)/../include/opensm/osm_event_plugin.h +BUILT_SOURCES = osm_version +osm_version: + if [ -x $(top_srcdir)/../gen_ver.sh ] ; then \ + ver_file=$(srcdir)/../include/opensm/osm_version.h ; \ + osm_ver=`cat $$ver_file | sed -ne '/#define OSM_VERSION /s/^.*\"OpenSM \(.*\)\"$$/\1/p'` ; \ + ver=`$(top_srcdir)/../gen_ver.sh $$PACKAGE` ; \ + if [ $$ver != $$osm_ver ] ; then \ + cat $$ver_file | sed -e '/#define OSM_VERSION /s/\"OpenSM \(.*\)\"/\"OpenSM '$$ver'\"/' > tmp_new_version ; \ + cat tmp_new_version > $$ver_file && rm -f tmp_new_version ; \ + fi ; \ + fi + # headers are distributed as part of the include dir EXTRA_DIST = $(srcdir)/libopensm.map $(srcdir)/libopensm.ver \
$(srcdir)/ChangeLog \ -- 1.5.4.rc2.38.gd6da3 From sashak at voltaire.com Sun Jan 6 15:12:21 2008 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Sun, 6 Jan 2008 23:12:21 +0000 Subject: [ofa-general] [PATCH] opensm/Makefile.am: update configure.in version on 'make dist' In-Reply-To: <20080106225813.GF26304@sashak.voltaire.com> References: <20080106225813.GF26304@sashak.voltaire.com> Message-ID: <20080106231221.GH26304@sashak.voltaire.com> This updates the package version stored in configure.in in newly created by 'make dist' package tarball. The version is generated by gen_ver.sh in accordance with the tree state. Signed-off-by: Sasha Khapyorsky --- opensm/Makefile.am | 4 ++++ 1 files changed, 4 insertions(+), 0 deletions(-) diff --git a/opensm/Makefile.am b/opensm/Makefile.am index 0c817ae..4474493 100644 --- a/opensm/Makefile.am +++ b/opensm/Makefile.am @@ -31,3 +31,7 @@ dist-hook: $(EXTRA_DIST) if [ -x $(top_srcdir)/../gen_chlog.sh ] ; then \ $(top_srcdir)/../gen_chlog.sh $(PACKAGE) > $(distdir)/ChangeLog ; \ fi + if [ -x $(top_srcdir)/../gen_ver.sh ] ; then \ + ver=`$(top_srcdir)/../gen_ver.sh $(PACKAGE)` ; \ + sed -e '/AC_INIT/s/$(PACKAGE), .*,/$(PACKAGE), '$$ver',/' $(top_srcdir)/configure.in > $(distdir)/configure.in ; \ + fi -- 1.5.4.rc2.38.gd6da3 From sashak at voltaire.com Sun Jan 6 15:16:45 2008 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Sun, 6 Jan 2008 23:16:45 +0000 Subject: [ofa-general] [PATCH] management/*/Makefile.am: update configure.in version on 'make dist' In-Reply-To: <20080106225813.GF26304@sashak.voltaire.com> References: <20080106225813.GF26304@sashak.voltaire.com> Message-ID: <20080106231645.GI26304@sashak.voltaire.com> This is similar gen_ver.sh usage in opensm/Makefile.am. This updates the package version stored in configure.in in newly created by 'make dist' package tarball. The version is generated by gen_ver.sh in accordance with the tree state. 
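For illustration only (not part of the patch): a minimal sketch of the sed substitution that the dist-hook applies to configure.in, run here on a single sample line instead of the real file. The AC_INIT line, the maintainer address and the version value are assumptions made up for the example; in the real rule the version comes from gen_ver.sh and the package name from $(PACKAGE).

```shell
#!/bin/sh
# Assumed inputs, for illustration: package name, a version string in
# the form gen_ver.sh produces, and a hypothetical AC_INIT line.
package=opensm
ver=3.1.8-4449c46-dirty
old_line='AC_INIT(opensm, 3.1.8, maintainer@example.org)'

# Same sed expression as in the dist-hook, with the make variables
# already expanded: on lines matching AC_INIT, replace everything
# between "<package>, " and the last comma with the new version.
new_line=$(printf '%s\n' "$old_line" |
	sed -e '/AC_INIT/s/'"$package"', .*,/'"$package"', '"$ver"',/')

echo "$new_line"
# prints: AC_INIT(opensm, 3.1.8-4449c46-dirty, maintainer@example.org)
```

Note that ".*" is greedy, so the substitution relies on the version being followed by the last comma on the AC_INIT line, as it is in this sample.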
Signed-off-by: Sasha Khapyorsky --- infiniband-diags/Makefile.am | 4 ++++ libibcommon/Makefile.am | 4 ++++ libibmad/Makefile.am | 4 ++++ libibumad/Makefile.am | 4 ++++ 4 files changed, 16 insertions(+), 0 deletions(-) diff --git a/infiniband-diags/Makefile.am b/infiniband-diags/Makefile.am index e1b4d58..8e1e587 100644 --- a/infiniband-diags/Makefile.am +++ b/infiniband-diags/Makefile.am @@ -98,6 +98,10 @@ dist-hook: if [ -x $(top_srcdir)/../gen_chlog.sh ] ; then \ $(top_srcdir)/../gen_chlog.sh $(PACKAGE) > $(distdir)/ChangeLog ; \ fi + if [ -x $(top_srcdir)/../gen_ver.sh ] ; then \ + ver=`$(top_srcdir)/../gen_ver.sh $(PACKAGE)` ; \ + sed -e '/AC_INIT/s/$(PACKAGE), .*,/$(PACKAGE), '$$ver',/' $(top_srcdir)/configure.in > $(distdir)/configure.in ; \ + fi # install this to a default location. install-data-hook: diff --git a/libibcommon/Makefile.am b/libibcommon/Makefile.am index 75889f4..dd8e264 100644 --- a/libibcommon/Makefile.am +++ b/libibcommon/Makefile.am @@ -30,3 +30,7 @@ dist-hook: if [ -x $(top_srcdir)/../gen_chlog.sh ] ; then \ $(top_srcdir)/../gen_chlog.sh $(PACKAGE) > $(distdir)/ChangeLog ; \ fi + if [ -x $(top_srcdir)/../gen_ver.sh ] ; then \ + ver=`$(top_srcdir)/../gen_ver.sh $(PACKAGE)` ; \ + sed -e '/AC_INIT/s/$(PACKAGE), .*,/$(PACKAGE), '$$ver',/' $(top_srcdir)/configure.in > $(distdir)/configure.in ; \ + fi diff --git a/libibmad/Makefile.am b/libibmad/Makefile.am index beae1a4..9def2fe 100644 --- a/libibmad/Makefile.am +++ b/libibmad/Makefile.am @@ -32,3 +32,7 @@ dist-hook: if [ -x $(top_srcdir)/../gen_chlog.sh ] ; then \ $(top_srcdir)/../gen_chlog.sh $(PACKAGE) > $(distdir)/ChangeLog ; \ fi + if [ -x $(top_srcdir)/../gen_ver.sh ] ; then \ + ver=`$(top_srcdir)/../gen_ver.sh $(PACKAGE)` ; \ + sed -e '/AC_INIT/s/$(PACKAGE), .*,/$(PACKAGE), '$$ver',/' $(top_srcdir)/configure.in > $(distdir)/configure.in ; \ + fi diff --git a/libibumad/Makefile.am b/libibumad/Makefile.am index 49c8b11..5b8a69a 100644 --- a/libibumad/Makefile.am +++ 
b/libibumad/Makefile.am @@ -44,6 +44,10 @@ dist-hook: if [ -x $(top_srcdir)/../gen_chlog.sh ] ; then \ $(top_srcdir)/../gen_chlog.sh $(PACKAGE) > $(distdir)/ChangeLog ; \ fi + if [ -x $(top_srcdir)/../gen_ver.sh ] ; then \ + ver=`$(top_srcdir)/../gen_ver.sh $(PACKAGE)` ; \ + sed -e '/AC_INIT/s/$(PACKAGE), .*,/$(PACKAGE), '$$ver',/' $(top_srcdir)/configure.in > $(distdir)/configure.in ; \ + fi install-data-hook: cd $(DESTDIR)$(mandir)/man3 && \ -- 1.5.4.rc2.38.gd6da3 From kliteyn at mellanox.co.il Sun Jan 6 17:06:16 2008 From: kliteyn at mellanox.co.il (kliteyn at mellanox.co.il) Date: 7 Jan 2008 03:06:16 +0200 Subject: [ofa-general] nightly osm_sim report 2008-01-07:normal completion Message-ID: OSM Simulation Regression Summary [Generated mail - please do NOT reply] OpenSM binary date = 2008-01-06 OpenSM git rev = Sat_Dec_1_00:14:46_2007 [9caf8d66a9434fd5300631e6909a4d727cf9abf6] ibutils git rev = Mon_Dec_24_10:42:01_2007 [675bec82306d6920555dd0b5e2f664983e27e60f] Total=400 Pass=400 Fail=0 Pass: 30 Stability IS1-16.topo 30 Pkey IS1-16.topo 30 OsmTest IS1-16.topo 30 OsmStress IS1-16.topo 30 Multicast IS1-16.topo 30 LidMgr IS1-16.topo 10 Stability IS3-loop.topo 10 Stability IS3-128.topo 10 Pkey IS3-128.topo 10 OsmTest IS3-loop.topo 10 OsmTest IS3-128.topo 10 OsmStress IS3-128.topo 10 Multicast IS3-loop.topo 10 Multicast IS3-128.topo 10 LidMgr IS3-128.topo 10 FatTree merge-roots-4-ary-2-tree.topo 10 FatTree merge-root-4-ary-3-tree.topo 10 FatTree gnu-stallion-64.topo 10 FatTree blend-4-ary-2-tree.topo 10 FatTree RhinoDDR.topo 10 FatTree FullGnu.topo 10 FatTree 4-ary-2-tree.topo 10 FatTree 2-ary-4-tree.topo 10 FatTree 12-node-spaced.topo 10 FTreeFail 4-ary-2-tree-missing-sw-link.topo 10 FTreeFail 4-ary-2-tree-links-at-same-rank-2.topo 10 FTreeFail 4-ary-2-tree-links-at-same-rank-1.topo 10 FTreeFail 4-ary-2-tree-diff-num-pgroups.topo Failures: From sashak at voltaire.com Sun Jan 6 17:57:37 2008 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Mon, 7 Jan 2008 
01:57:37 +0000 Subject: [ofa-general] [PATCH] infiniband-diags: use common build version Message-ID: <20080107015737.GJ26304@sashak.voltaire.com> Use common version (as defined in configure.in) instead of per tool __BUILD_VERSION_TAG__. Signed-off-by: Sasha Khapyorsky --- infiniband-diags/configure.in | 1 + infiniband-diags/include/ibdiag_common.h | 8 +---- infiniband-diags/include/ibdiag_version.h.in | 39 ++++++++++++++++++++++++++ 3 files changed, 42 insertions(+), 6 deletions(-) create mode 100644 infiniband-diags/include/ibdiag_version.h.in diff --git a/infiniband-diags/configure.in b/infiniband-diags/configure.in index 1d5810a..1baa6cb 100644 --- a/infiniband-diags/configure.in +++ b/infiniband-diags/configure.in @@ -136,6 +136,7 @@ AC_SUBST(IBSCRIPTPATH) AC_CONFIG_FILES([\ Makefile \ infiniband-diags.spec \ + include/ibdiag_version.h \ scripts/ibcheckerrors \ scripts/ibcheckerrs \ scripts/ibchecknet \ diff --git a/infiniband-diags/include/ibdiag_common.h b/infiniband-diags/include/ibdiag_common.h index 029d80e..e8b1fab 100644 --- a/infiniband-diags/include/ibdiag_common.h +++ b/infiniband-diags/include/ibdiag_common.h @@ -52,15 +52,11 @@ extern int ibdebug; void iberror(const char *fn, char *msg, ...); -#ifdef __BUILD_VERSION_TAG__ - -#define stringify(s) to_string(s) -#define to_string(s) #s +#include static inline const char* get_build_version(void) { - return "BUILD VERSION: " stringify(__BUILD_VERSION_TAG__) " Build date: " __DATE__ " " __TIME__ ; + return "BUILD VERSION: " IBDIAG_VERSION " Build date: " __DATE__ " " __TIME__ ; } -#endif #endif /* _IBDIAG_COMMON_H_ */ diff --git a/infiniband-diags/include/ibdiag_version.h.in b/infiniband-diags/include/ibdiag_version.h.in new file mode 100644 index 0000000..62430c5 --- /dev/null +++ b/infiniband-diags/include/ibdiag_version.h.in @@ -0,0 +1,39 @@ +/* + * Copyright (c) 2008 Voltaire Inc. All rights reserved. + * + * This software is available to you under a choice of one of two + * licenses. 
You may choose to be licensed under the terms of the GNU + * General Public License (GPL) Version 2, available from the file + * COPYING in the main directory of this source tree, or the + * OpenIB.org BSD license below: + * + * Redistribution and use in source and binary forms, with or + * without modification, are permitted provided that the following + * conditions are met: + * + * - Redistributions of source code must retain the above + * copyright notice, this list of conditions and the following + * disclaimer. + * + * - Redistributions in binary form must reproduce the above + * copyright notice, this list of conditions and the following + * disclaimer in the documentation and/or other materials + * provided with the distribution. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, + * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF + * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND + * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS + * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN + * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN + * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE + * SOFTWARE. + * + */ + +#ifndef _IBDIAG_VERSION_H_ +#define _IBDIAG_VERSION_H_ + +#define IBDIAG_VERSION "@VERSION@" + +#endif /* _IBDIAG_VERSION_H_ */ -- 1.5.4.rc2.38.gd6da3 From sashak at voltaire.com Sun Jan 6 18:02:49 2008 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Mon, 7 Jan 2008 02:02:49 +0000 Subject: [ofa-general] [PATCH] infiniband-diags: remove not needed anymore __BUILD_VERSION_TAG__ In-Reply-To: <20080107015737.GJ26304@sashak.voltaire.com> References: <20080107015737.GJ26304@sashak.voltaire.com> Message-ID: <20080107020249.GK26304@sashak.voltaire.com> Remove not needed anymore __BUILD_VERSION_TAG__ macro. 
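For context, a minimal sketch of the two version-string schemes touched by this series, written as stand-alone C. The two-level stringify()/to_string() pair is the one removed from ibdiag_common.h by the previous patch; TOOL_VERSION_TAG, the value given to IBDIAG_VERSION here, and the three function names are assumptions made only for illustration (in the real tree IBDIAG_VERSION comes from "@VERSION@" in ibdiag_version.h.in).

```c
#include <assert.h>
#include <string.h>

/* Two-level stringification: the outer macro lets its argument be
 * macro-expanded before the inner macro applies '#' to it. */
#define stringify(s) to_string(s)
#define to_string(s) #s

/* Hypothetical values, for illustration only. */
#define TOOL_VERSION_TAG 1.2.5   /* old scheme: one tag per tool source file */
#define IBDIAG_VERSION "1.3.5"   /* new scheme: one string from configure.in */

/* Old scheme: the tag is a bare token sequence and must be stringified. */
static const char *old_build_version(void)
{
	return "BUILD VERSION: " stringify(TOOL_VERSION_TAG);
}

/* New scheme: the version is already a string literal, so adjacent
 * literal concatenation is enough. */
static const char *new_build_version(void)
{
	return "BUILD VERSION: " IBDIAG_VERSION;
}

/* Small self-check of both variants. */
static void check_build_versions(void)
{
	assert(strcmp(old_build_version(), "BUILD VERSION: 1.2.5") == 0);
	assert(strcmp(new_build_version(), "BUILD VERSION: 1.3.5") == 0);
	/* Using '#' directly does not expand the macro name first. */
	assert(strcmp(to_string(TOOL_VERSION_TAG), "TOOL_VERSION_TAG") == 0);
}
```

With a single string supplied by the build system, the per-tool defines and the stringify helpers are no longer needed, which is what the removal in this patch relies on.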
Signed-off-by: Sasha Khapyorsky --- infiniband-diags/src/ibaddr.c | 1 - infiniband-diags/src/ibnetdiscover.c | 1 - infiniband-diags/src/ibping.c | 1 - infiniband-diags/src/ibportstate.c | 1 - infiniband-diags/src/ibroute.c | 1 - infiniband-diags/src/ibstat.c | 1 - infiniband-diags/src/ibsysstat.c | 1 - infiniband-diags/src/ibtracert.c | 1 - infiniband-diags/src/perfquery.c | 1 - infiniband-diags/src/saquery.c | 2 -- infiniband-diags/src/sminfo.c | 1 - infiniband-diags/src/smpdump.c | 1 - infiniband-diags/src/smpquery.c | 1 - infiniband-diags/src/vendstat.c | 1 - 14 files changed, 0 insertions(+), 15 deletions(-) diff --git a/infiniband-diags/src/ibaddr.c b/infiniband-diags/src/ibaddr.c index c61b6b7..f71edf8 100644 --- a/infiniband-diags/src/ibaddr.c +++ b/infiniband-diags/src/ibaddr.c @@ -41,7 +41,6 @@ #include #include -#define __BUILD_VERSION_TAG__ 1.2 #include #include #include diff --git a/infiniband-diags/src/ibnetdiscover.c b/infiniband-diags/src/ibnetdiscover.c index 7701b02..b8d4e92 100644 --- a/infiniband-diags/src/ibnetdiscover.c +++ b/infiniband-diags/src/ibnetdiscover.c @@ -47,7 +47,6 @@ #include #include -#define __BUILD_VERSION_TAG__ 1.2.5 #include #include #include diff --git a/infiniband-diags/src/ibping.c b/infiniband-diags/src/ibping.c index ba32508..6c40e63 100644 --- a/infiniband-diags/src/ibping.c +++ b/infiniband-diags/src/ibping.c @@ -44,7 +44,6 @@ #include #include -#define __BUILD_VERSION_TAG__ 1.2 #include #include #include diff --git a/infiniband-diags/src/ibportstate.c b/infiniband-diags/src/ibportstate.c index 9ea7529..d21d8b4 100644 --- a/infiniband-diags/src/ibportstate.c +++ b/infiniband-diags/src/ibportstate.c @@ -43,7 +43,6 @@ #include #include -#define __BUILD_VERSION_TAG__ 1.2.2 #include #include #include diff --git a/infiniband-diags/src/ibroute.c b/infiniband-diags/src/ibroute.c index 664f7f5..3a6907b 100644 --- a/infiniband-diags/src/ibroute.c +++ b/infiniband-diags/src/ibroute.c @@ -46,7 +46,6 @@ #include #include -#define 
__BUILD_VERSION_TAG__ 1.2 #include #include #include diff --git a/infiniband-diags/src/ibstat.c b/infiniband-diags/src/ibstat.c index aa55d83..eda77b7 100644 --- a/infiniband-diags/src/ibstat.c +++ b/infiniband-diags/src/ibstat.c @@ -57,7 +57,6 @@ #include #include -#define __BUILD_VERSION_TAG__ 1.1 #include #include #include diff --git a/infiniband-diags/src/ibsysstat.c b/infiniband-diags/src/ibsysstat.c index 2435c87..8e00baf 100644 --- a/infiniband-diags/src/ibsysstat.c +++ b/infiniband-diags/src/ibsysstat.c @@ -43,7 +43,6 @@ #include #include -#define __BUILD_VERSION_TAG__ 1.2 #include #include #include diff --git a/infiniband-diags/src/ibtracert.c b/infiniband-diags/src/ibtracert.c index 284ae2a..eb9329c 100644 --- a/infiniband-diags/src/ibtracert.c +++ b/infiniband-diags/src/ibtracert.c @@ -46,7 +46,6 @@ #include #include -#define __BUILD_VERSION_TAG__ 1.2.1 #include #include #include diff --git a/infiniband-diags/src/perfquery.c b/infiniband-diags/src/perfquery.c index 17aafb6..ce8f342 100644 --- a/infiniband-diags/src/perfquery.c +++ b/infiniband-diags/src/perfquery.c @@ -43,7 +43,6 @@ #include #include -#define __BUILD_VERSION_TAG__ 1.2.3 #include #include #include diff --git a/infiniband-diags/src/saquery.c b/infiniband-diags/src/saquery.c index fad3d50..2017a86 100644 --- a/infiniband-diags/src/saquery.c +++ b/infiniband-diags/src/saquery.c @@ -48,8 +48,6 @@ #define _GNU_SOURCE #include -#define __BUILD_VERSION_TAG__ 1.2.4 - #include #include #include diff --git a/infiniband-diags/src/sminfo.c b/infiniband-diags/src/sminfo.c index 87f09ac..d2d192e 100644 --- a/infiniband-diags/src/sminfo.c +++ b/infiniband-diags/src/sminfo.c @@ -42,7 +42,6 @@ #include #include -#define __BUILD_VERSION_TAG__ 1.2.2 #include #include #include diff --git a/infiniband-diags/src/smpdump.c b/infiniband-diags/src/smpdump.c index c325771..2179042 100644 --- a/infiniband-diags/src/smpdump.c +++ b/infiniband-diags/src/smpdump.c @@ -56,7 +56,6 @@ #include #include -#define 
__BUILD_VERSION_TAG__ 1.1 #include #include #include diff --git a/infiniband-diags/src/smpquery.c b/infiniband-diags/src/smpquery.c index 0a91de1..7535f37 100644 --- a/infiniband-diags/src/smpquery.c +++ b/infiniband-diags/src/smpquery.c @@ -47,7 +47,6 @@ #define __STDC_FORMAT_MACROS #include -#define __BUILD_VERSION_TAG__ 1.2.2 #include #include #include diff --git a/infiniband-diags/src/vendstat.c b/infiniband-diags/src/vendstat.c index fa0206c..22a1bed 100644 --- a/infiniband-diags/src/vendstat.c +++ b/infiniband-diags/src/vendstat.c @@ -42,7 +42,6 @@ #include #include -#define __BUILD_VERSION_TAG__ 1.2.1 #include #include #include -- 1.5.4.rc2.38.gd6da3
From sweitzen at cisco.com Sun Jan 6 22:46:58 2008 From: sweitzen at cisco.com (Scott Weitzenkamp (sweitzen)) Date: Sun, 6 Jan 2008 22:46:58 -0800 Subject: [ofa-general] RE: [ewg] Not seeing any SDP performance changes inOFED 1.3 beta, and I get Oops when enabling sdp_zcopy_thresh In-Reply-To: References: <47445630.10000@dev.mellanox.co.il> <000001c8338c$206e80d0$614b8270$@rr.com> Message-ID: Jim, I am trying OFED-1.3-20071231-0600 and RHEL4 x86_64 on a dual CPU (single core each CPU) Xeon system. I do not see any performance improvement (either throughput or CPU utilization) using netperf when I set /sys/module/ib_sdp/sdp_zcopy_thresh to 16384. Can you elaborate on your HCA type, and performance improvement you see? Here's an example netperf command line when using a Cheetah DDR HCA and 1.2.917 firmware (I have also tried ConnectX and 2.3.000 firmware too): [releng at svbu-qa1850-2 ~]$ LD_PRELOAD=libsdp.so netperf241 -v2 -4 -H 192.168.1.201 -l 30 -t TCP_STREAM -c -C -- -m 65536 TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.1.201 (192.168.1.201) port 0 AF_INET : histogram : demo Recv Send Send Utilization Service Demand Socket Socket Message Elapsed Send Recv Send Recv Size Size Size Time Throughput local remote local remote bytes bytes bytes secs.
10^6bits/s % S % S us/KB us/KB 87380 16384 65536 30.01 7267.70 55.06 61.27 1.241 1.381 Alignment Offset Bytes Bytes Sends Bytes Recvs Local Remote Local Remote Xfered Per Per Send Recv Send Recv Send (avg) Recv (avg) 8 8 0 0 2.726e+10 65536.00 415942 48106.01 566648 Maximum Segment Size (bytes) -1 Scott Weitzenkamp SQA and Release Manager Server Virtualization Business Unit Cisco Systems > -----Original Message----- > From: Jim Mott [mailto:jim at mellanox.com] > Sent: Wednesday, December 12, 2007 6:29 PM > To: Scott Weitzenkamp (sweitzen) > Cc: general at lists.openfabrics.org > Subject: RE: [ofa-general] RE: [ewg] Not seeing any SDP > performance changes inOFED 1.3 beta, and I get Oops when > enabling sdp_zcopy_thresh > > I am traveling for the next 2 weeks and not able to test > anymore. That > said, I believe all outstanding problems are fixed and it is safe to > re-enable by default. My testing shows the crossover size > where bzcopy > is always a win at about 16K. The patch goes in sdp_main.c and looks > something like: > -static int sdp_zcopy_thresh = 0; > +static int sdp_zcopy_thresh = 16384; > > -----Original Message----- > From: Scott Weitzenkamp (sweitzen) [mailto:sweitzen at cisco.com] > Sent: Wednesday, December 12, 2007 5:26 PM > To: Jim Mott; ewg at lists.openfabrics.org > Cc: general at lists.openfabrics.org > Subject: RE: [ofa-general] RE: [ewg] Not seeing any SDP performance > changes inOFED 1.3 beta, and I get Oops when enabling sdp_zcopy_thresh > > Jim, when do you plan to enable bzcopy by default?
> > Scott Weitzenkamp > SQA and Release Manager > Server Virtualization Business Unit > Cisco Systems > > > > > > -----Original Message----- > > From: general-bounces at lists.openfabrics.org > > [mailto:general-bounces at lists.openfabrics.org] On Behalf Of Jim Mott > > Sent: Friday, November 30, 2007 12:04 PM > > To: ewg at lists.openfabrics.org > > Cc: general at lists.openfabrics.org > > Subject: [ofa-general] RE: [ewg] Not seeing any SDP > > performance changes inOFED 1.3 beta, and I get Oops when > > enabling sdp_zcopy_thresh > > > > Hi, > > This kernel Oops is new and I will look at it. Dotan and > > the Mellanox regression tests have been keeping me busy > > recently. There > > was a problem like this, but only in multi-threaded apps > > using a single socket or when doing cleanup after ^C. > > > > I will re-enable default bzcopy behavior once all the > > important Mellanox regression tests are passing. Until then, > > setting the > > sdp_zcopy_threah variable by hand (8192 and up should give > > better performance) and running simple tests like netperf should be > > working fine. You should not be seeing any problem here. [I > > have only tested locally with x86_64 rhat4u4, rhat5, 2.6.23.8, and > > 2.6.24-rc2. Mellanox regression tests everything and they > > have not submitted this Oops yet.] > > > > I have opened bugs in the openfabrics bugzilla for > > everything I am currently working on. It is down right now > > or I would add > > pointers. > > > > > > Here is my work list; additions or priority changes welcome: > > > > SDP OPEN ISSUES LIST (Priority order) > > ===================================== > > 1) DONE: BUG: Unload of mlx4 and ib_sdp fails while SDP active > > 11/6 [PATCH 1/1 V2] SDP - Fix reference count bug ... > > > > 2) DONE: BUG: Many data corruption failures > > 11/11 [PATCH 1/1] SDP - Fix bug where zcopy bcopy returns ... > > > > 3) DONE: Bug 793 - kernel BUG at net/core/skbuff.c:95! 
> > 11/26 [PATCH 1/1] SDP - bug793; skbuff changes ... > > > > 4) TODO: BUG: kernel oops in SDP regression > > Replicated problem by hitting ^C during a transfer. I have > > created a patch that fixes the problem, but it needs more work > > to move into production. There are some side effects I do not > > yet understand. > > This is the one I am working on now. I hope to drop it soon. > > There is a bug open tracking it. > > > > 5) TODO: BUG: libsdp returns good RC when it should fail > > > > 6) TODO: BUG: aio_test fails in SDP regression > > > > 7) TODO: Bug 779 - Lock ordering problem during accept on 1.2.5 > > After building a 2.6.23.8 kernel with lock checking enabled, I > > can not reproduce this problem. Looks like I'll need more input > > from the reporter. (Bug updated to say this). I will continue to > > code review though. > > > > 8) DONE: Bug 294 - connect does not allow AF_INET_SDP > > [fix in bugzilla dropped] > > > > 9) DONE: Backport work needed to support 2.6.24 > > > > 10) TODO: Package user space libsdp for Redhat > > This is supposed to be easy to do, but it will take me some time > > to figure out the detail. > > > > 11) DONE: BUG: Memory leak > > 11/20 [PATCH 1/1 v2] SDP - Fix a memory leak in bzcopy > > -----Original Message----- > > From: ewg-bounces at lists.openfabrics.org > > [mailto:ewg-bounces at lists.openfabrics.org] On Behalf Of Scott > > Weitzenkamp (sweitzen) > > Sent: Friday, November 30, 2007 12:37 PM > > To: Jim Mott; Scott Weitzenkamp (sweitzen); > ewg at lists.openfabrics.org > > Cc: general at lists.openfabrics.org > > Subject: [ewg] Not seeing any SDP performance changes in OFED > > 1.3 beta, and I get Oops when enabling sdp_zcopy_thresh > > > > Jim, > > > > Using netperf with TCP_STREAM and TCP_RR, I'm not seeing any > > changes in > > SDP throughput or CPU utilization comparing OFED 1.3 beta and OFED > > 1.2.5. Looks like I need to set a non-zero value in > > /sys/module/ib_sdp/sdp_zcopy_thresh? 
Do you plan to enable this by > > default soon? > > > > I tried "echo 4096 > /sys/module/ib_sdp/sdp_zcopy_thresh" > on RHEL4 and > > then tried netperf, and got an Oops. > > > > Unable to handle kernel NULL pointer deref > > erence at 0000000000000000 RIP: > > {put_page+0} > > PML4 1a3047067 PGD 1a7a6d067 PMD 0 > > Oops: 0000 [1] SMP > > CPU 0 > > Modules linked in: parport_pc lp parport autofs4 > > i2c_dev i2c_co > > re nfs lockd nfs_acl sunrpc rdma_ucm(U) rds(U) ib_sdp(U) rdma_cm(U) > > iw_cm(U) ib_ > > addr(U) mlx4_ib(U) mlx4_core(U) ds yenta_socket pcmcia_core > dm_mirror > > dm_multipa > > th dm_mod joydev button battery ac uhci_hcd ehci_hcd shpchp > > ib_mthca(U) > > ib_ipoib > > (U) ib_umad(U) ib_ucm(U) ib_uverbs(U) ib_cm(U) ib_sa(U) ib_mad(U) > > ib_core(U) md5 > > ipv6 e1000 floppy ata_piix libata sg ext3 jbd mptscsih > mptsas mptspi > > mptscsi mp > > tbase sd_mod scsi_mod > > Pid: 6802, comm: netperf241 Not tainted > > 2.6.9-55.ELlargesmp > > RIP: 0010:[] > > {put_page+0} > > RSP: 0018:00000101a7bcbbc0 EFLAGS: 00010203 > > RAX: 0000000000000000 RBX: 0000000000000001 RCX: > > 00000000000002 > > 02 > > RDX: 00000101b0b43e80 RSI: 0000000000000202 RDI: > > 00000000000000 > > 00 > > RBP: 00000101b85761c0 R08: 0000000000000000 R09: > > 00000000000000 > > 00 > > R10: 0000000000000246 R11: ffffffffa02e0e36 R12: > > 00000101a4b330 > > 80 > > R13: 00000101a7bcbd58 R14: 0000000000000000 R15: > > 00000000000100 > > 00 > > FS: 0000002a95696940(0000) > GS:ffffffff80500380(0000) > > knlGS:000 > > 0000000000000 > > CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b > > CR2: 0000000000000000 CR3: 0000000000101000 CR4: > > 00000000000006 > > e0 > > Process netperf241 (pid: 6802, threadinfo > > 00000101a7bca000, tas > > k 00000101a70df030) > > Stack: ffffffffa02e110a 0000000000000100 > > 0000000000000000 00000 > > 00000529780 > > 0001000000000246 0000000000000246 > > 000000008013feac 00000 > > 800ffffffe0 > > 0000000000000000 00000101a7bcbe88 > > Call > > 
Trace:{:ib_sdp:sdp_sendmsg+724} > > > f801478b2>{queue_delayed_work+101} > > {:ib_addr:queue_req+122} > > > 7ecb>{sock_sendmsg+271} > > {do_no_page+916} > > {au > > toremove_wake_function+0} > > {sockfd_lookup+16} > > { > > sys_sendto+195} > > {do_page_fault+577} > > > > {dnotify_parent+34} > > {vfs_read+248} > > {syst > > em_call+126} > > > > > > Code: 8b 07 48 89 fa f6 c4 80 74 3b 48 8b > 57 10 8b 02 > > 48 89 d1 > > f6 > > RIP {put_page+0} RSP > > <00000101a7bcbbc0> > > CR2: 0000000000000000 > > <0>Kernel panic - not syncing: Oops > > > > Scott Weitzenkamp > > SQA and Release Manager > > Server Virtualization Business Unit > > Cisco Systems > > _______________________________________________ > > ewg mailing list > > ewg at lists.openfabrics.org > > http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ewg > > > > _______________________________________________ > > general mailing list > > general at lists.openfabrics.org > > http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general > > > > To unsubscribe, please visit > > http://openib.org/mailman/listinfo/openib-general > > > From dotanb at dev.mellanox.co.il Sun Jan 6 08:01:25 2008 From: dotanb at dev.mellanox.co.il (Dotan Barak) Date: Sun, 6 Jan 2008 18:01:25 +0200 Subject: [ofa-general] [PATCH] libmlx4: Fix the value of the pkey_index in the completion Message-ID: <200801061801.25386.dotanb@dev.mellanox.co.il> Fix the value of the pkey_index in the completion to get a valid value for GSI QPs. Signed-off-by: Dotan Barak --- diff --git a/src/cq.c b/src/cq.c index 06ae9e2..33823c8 100644 --- a/src/cq.c +++ b/src/cq.c @@ -319,7 +319,7 @@ static int mlx4_poll_one(struct mlx4_cq *cq, wc->src_qp = g_mlpath_rqpn & 0xffffff; wc->dlid_path_bits = (g_mlpath_rqpn >> 24) & 0x7f; wc->wc_flags |= g_mlpath_rqpn & 0x80000000 ? 
IBV_WC_GRH : 0; - wc->pkey_index = ntohl(cqe->immed_rss_invalid) >> 16; + wc->pkey_index = (uint16_t)(ntohl(cqe->immed_rss_invalid) & 0x7f); } return CQ_OK; From dotanb at dev.mellanox.co.il Sun Jan 6 23:01:25 2008 From: dotanb at dev.mellanox.co.il (Dotan Barak) Date: Mon, 7 Jan 2008 09:01:25 +0200 Subject: [ofa-general] [PATCH] mlx4: Fix the value of the pkey_index in the completion Message-ID: <200801070901.26213.dotanb@dev.mellanox.co.il> Fix the value of the pkey_index in the completion to get a valid value for GSI QPs. Without this fix, incoming GSI packets on port 2 get an invalid pkey index in the completion, which prevents the MAD layer from sending back a response. Roland: can you please try to push this patch to kernel 2.6.24? Signed-off-by: Dotan Barak --- diff --git a/drivers/infiniband/hw/mlx4/cq.c b/drivers/infiniband/hw/mlx4/cq.c index 8bf44da..608de9f 100644 --- a/drivers/infiniband/hw/mlx4/cq.c +++ b/drivers/infiniband/hw/mlx4/cq.c @@ -430,7 +430,7 @@ static int mlx4_ib_poll_one(struct mlx4_ib_cq *cq, wc->dlid_path_bits = (be32_to_cpu(cqe->g_mlpath_rqpn) >> 24) & 0x7f; wc->wc_flags |= be32_to_cpu(cqe->g_mlpath_rqpn) & 0x80000000 ?
IB_WC_GRH : 0; - wc->pkey_index = be32_to_cpu(cqe->immed_rss_invalid) >> 16; + wc->pkey_index = (u16)(be32_to_cpu(cqe->immed_rss_invalid) & 0x7f); } return 0; From dotanb at dev.mellanox.co.il Sun Jan 6 23:10:29 2008 From: dotanb at dev.mellanox.co.il (Dotan Barak) Date: Mon, 7 Jan 2008 09:10:29 +0200 Subject: [ofa-general] [PATCH] mlx4: Check the return values of the functions ib_get_cached_* Message-ID: <200801070910.29604.dotanb@dev.mellanox.co.il> Added a check to the return values of ib_get_cached_gid and ib_get_cached_pkey before using the values that they set (because those functions may fail). Signed-off-by: Dotan Barak --- diff --git a/drivers/infiniband/hw/mlx4/qp.c b/drivers/infiniband/hw/mlx4/qp.c index 8cba9c5..6d0123d 100644 --- a/drivers/infiniband/hw/mlx4/qp.c +++ b/drivers/infiniband/hw/mlx4/qp.c @@ -1065,6 +1065,7 @@ static int build_mlx_header(struct mlx4_ib_sqp *sqp, struct ib_send_wr *wr, int header_size; int spc; int i; + int ret; send_size = 0; for (i = 0; i < wr->num_sge; ++i) @@ -1082,8 +1083,10 @@ static int build_mlx_header(struct mlx4_ib_sqp *sqp, struct ib_send_wr *wr, sqp->ud_header.grh.flow_label = ah->av.sl_tclass_flowlabel & cpu_to_be32(0xfffff); sqp->ud_header.grh.hop_limit = ah->av.hop_limit; - ib_get_cached_gid(ib_dev, be32_to_cpu(ah->av.port_pd) >> 24, + ret = ib_get_cached_gid(ib_dev, be32_to_cpu(ah->av.port_pd) >> 24, ah->av.gid_index, &sqp->ud_header.grh.source_gid); + if (ret) + return ret; memcpy(sqp->ud_header.grh.destination_gid.raw, ah->av.dgid, 16); } @@ -1114,9 +1117,11 @@ static int build_mlx_header(struct mlx4_ib_sqp *sqp, struct ib_send_wr *wr, sqp->ud_header.lrh.source_lid = IB_LID_PERMISSIVE; sqp->ud_header.bth.solicited_event = !!(wr->send_flags & IB_SEND_SOLICITED); if (!sqp->qp.ibqp.qp_num) - ib_get_cached_pkey(ib_dev, sqp->qp.port, sqp->pkey_index, &pkey); + ret = ib_get_cached_pkey(ib_dev, sqp->qp.port, sqp->pkey_index, &pkey); else - ib_get_cached_pkey(ib_dev, sqp->qp.port, wr->wr.ud.pkey_index, &pkey); + ret
= ib_get_cached_pkey(ib_dev, sqp->qp.port, wr->wr.ud.pkey_index, &pkey); + if (ret) + return ret; sqp->ud_header.bth.pkey = cpu_to_be16(pkey); sqp->ud_header.bth.destination_qpn = cpu_to_be32(wr->wr.ud.remote_qpn); sqp->ud_header.bth.psn = cpu_to_be32((sqp->send_psn++) & ((1 << 24) - 1)); From dotanb at dev.mellanox.co.il Sun Jan 6 23:16:04 2008 From: dotanb at dev.mellanox.co.il (Dotan Barak) Date: Mon, 7 Jan 2008 09:16:04 +0200 Subject: [ofa-general] [PATCH] mthca: Check the return values of the functions ib_get_cached_* Message-ID: <200801070916.04353.dotanb@dev.mellanox.co.il> Added a check to the return values of ib_get_cached_gid and ib_get_cached_pkey before using the values that they set (because those functions may fail). Signed-off-by: Dotan Barak --- diff --git a/drivers/infiniband/hw/mthca/mthca_av.c b/drivers/infiniband/hw/mthca/mthca_av.c index 4b111a8..cb72041 100644 --- a/drivers/infiniband/hw/mthca/mthca_av.c +++ b/drivers/infiniband/hw/mthca/mthca_av.c @@ -268,6 +268,8 @@ int mthca_ah_grh_present(struct mthca_ah *ah) int mthca_read_ah(struct mthca_dev *dev, struct mthca_ah *ah, struct ib_ud_header *header) { + int ret; + if (ah->type == MTHCA_AH_ON_HCA) return -EINVAL; @@ -280,10 +282,12 @@ int mthca_read_ah(struct mthca_dev *dev, struct mthca_ah *ah, header->grh.flow_label = ah->av->sl_tclass_flowlabel & cpu_to_be32(0xfffff); header->grh.hop_limit = ah->av->hop_limit; - ib_get_cached_gid(&dev->ib_dev, + ret = ib_get_cached_gid(&dev->ib_dev, be32_to_cpu(ah->av->port_pd) >> 24, ah->av->gid_index % dev->limits.gid_table_len, &header->grh.source_gid); + if (ret) + return ret; memcpy(header->grh.destination_gid.raw, ah->av->dgid, 16); } diff --git a/drivers/infiniband/hw/mthca/mthca_qp.c b/drivers/infiniband/hw/mthca/mthca_qp.c index 0e5461c..22970ec 100644 --- a/drivers/infiniband/hw/mthca/mthca_qp.c +++ b/drivers/infiniband/hw/mthca/mthca_qp.c @@ -1536,11 +1536,13 @@ static int build_mlx_header(struct mthca_dev *dev, struct mthca_sqp *sqp,
sqp->ud_header.lrh.source_lid = IB_LID_PERMISSIVE; sqp->ud_header.bth.solicited_event = !!(wr->send_flags & IB_SEND_SOLICITED); if (!sqp->qp.ibqp.qp_num) - ib_get_cached_pkey(&dev->ib_dev, sqp->qp.port, + err = ib_get_cached_pkey(&dev->ib_dev, sqp->qp.port, sqp->pkey_index, &pkey); else - ib_get_cached_pkey(&dev->ib_dev, sqp->qp.port, + err = ib_get_cached_pkey(&dev->ib_dev, sqp->qp.port, wr->wr.ud.pkey_index, &pkey); + if (err) + return err; sqp->ud_header.bth.pkey = cpu_to_be16(pkey); sqp->ud_header.bth.destination_qpn = cpu_to_be32(wr->wr.ud.remote_qpn); sqp->ud_header.bth.psn = cpu_to_be32((sqp->send_psn++) & ((1 << 24) - 1)); From dotanb at dev.mellanox.co.il Sun Jan 6 23:24:25 2008 From: dotanb at dev.mellanox.co.il (Dotan Barak) Date: Mon, 7 Jan 2008 09:24:25 +0200 Subject: [ofa-general] [PATCH] srp: Check the return values of the functions ib_get_cached_gid Message-ID: <200801070924.25906.dotanb@dev.mellanox.co.il> Added a check to the return value of ib_get_cached_gid before using the value that it set (because this function may fail).
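A minimal user-space sketch of the pattern these ib_get_cached_* patches apply: an output parameter is only meaningful when the lookup returns 0, so the caller has to test the return code before consuming it. The functions get_cached_pkey() and build_header_pkey(), the table length and the self-check are hypothetical stand-ins written for this note; they are not the kernel's ib_get_cached_pkey() or build_mlx_header().

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define PKEY_TABLE_LEN 4	/* hypothetical table size */

/* Hypothetical stand-in for a cached lookup: writes *pkey only on success
 * and returns a negative errno value for an out-of-range index. */
static int get_cached_pkey(const uint16_t *table, int index, uint16_t *pkey)
{
	if (index < 0 || index >= PKEY_TABLE_LEN)
		return -EINVAL;
	*pkey = table[index];
	return 0;
}

/* Caller in the style of the patched code: propagate the error instead of
 * copying an uninitialised pkey into the header. */
static int build_header_pkey(const uint16_t *table, int index,
			     uint16_t *hdr_pkey)
{
	uint16_t pkey;
	int ret;

	ret = get_cached_pkey(table, index, &pkey);
	if (ret)
		return ret;	/* pkey was never written; do not use it */

	*hdr_pkey = pkey;
	return 0;
}

/* Small self-check: one valid index, one invalid index. */
static void check_build_header_pkey(void)
{
	const uint16_t table[PKEY_TABLE_LEN] = { 0xffff, 0x8001, 0, 0 };
	uint16_t hdr_pkey = 0x1234;

	assert(build_header_pkey(table, 1, &hdr_pkey) == 0);
	assert(hdr_pkey == 0x8001);

	hdr_pkey = 0x1234;
	assert(build_header_pkey(table, PKEY_TABLE_LEN, &hdr_pkey) == -EINVAL);
	assert(hdr_pkey == 0x1234);	/* header left untouched on failure */
}
```

Without the check, the failing path would copy whatever happened to be on the stack into the header, which is the situation the patches guard against.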
Signed-off-by: Dotan Barak --- diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c index 950228f..fcefa50 100644 --- a/drivers/infiniband/ulp/srp/ib_srp.c +++ b/drivers/infiniband/ulp/srp/ib_srp.c @@ -1812,7 +1812,9 @@ static ssize_t srp_create_target(struct class_device *class_dev, if (ret) goto err; - ib_get_cached_gid(host->dev->dev, host->port, 0, &target->path.sgid); + ret = ib_get_cached_gid(host->dev->dev, host->port, 0, &target->path.sgid); + if (ret) + goto err; printk(KERN_DEBUG PFX "new target: id_ext %016llx ioc_guid %016llx pkey %04x " "service_id %016llx dgid %04x:%04x:%04x:%04x:%04x:%04x:%04x:%04x\n", From kliteyn at mellanox.co.il Sun Jan 6 23:35:20 2008 From: kliteyn at mellanox.co.il (Yevgeny Kliteynik) Date: Mon, 07 Jan 2008 09:35:20 +0200 Subject: [ofa-general] [PATCH] management/gen_ver.sh script In-Reply-To: <20080106225813.GF26304@sashak.voltaire.com> References: <20080106225813.GF26304@sashak.voltaire.com> Message-ID: <4781D638.2000508@mellanox.co.il> Sasha Khapyorsky wrote: > This generates a version string which includes recent version as > specified in correspondent sub project's configure.in file, plus > git revision abbreviation in the case if sub-project HEAD is different > from recent tag, plus "-dirty" suffix if local uncommitted changes are > in the sub project tree. For example: > > $ ./gen_ver.sh opensm > 3.1.8-5a03b64-dirty > Great, thanks. 
-- Yevgeny > Signed-off-by: Sasha Khapyorsky > --- > gen_ver.sh | 38 ++++++++++++++++++++++++++++++++++++++ > 1 files changed, 38 insertions(+), 0 deletions(-) > create mode 100755 gen_ver.sh > > diff --git a/gen_ver.sh b/gen_ver.sh > new file mode 100755 > index 0000000..3524182 > --- /dev/null > +++ b/gen_ver.sh > @@ -0,0 +1,38 @@ > +#!/bin/sh > +# > +# This generates a version string which includes recent version as > +# specified in correspondent sub project's configure.in file, plus > +# git revision abbreviation in the case if sub-project HEAD is different > +# from recent tag, plus "-dirty" suffix if local uncommitted changes are > +# in the sub project tree. > +# > + > +usage() > +{ > + echo "Usage: $0 " > + exit 2 > +} > + > +test -z "$1" && usage > + > +package=$1 > + > +cd `dirname $0` > + > +conf_file=$package/configure.in > +version=`cat $conf_file | sed -ne '/AC_INIT.*'$package'.*/s/^AC_INIT.*'$package', \(.*\),.*$/\1/p'` > + > +git diff --quiet $package-$version..HEAD -- $package > /dev/null 2>&1 > +if [ $? -eq 1 ] ; then > + abbr=`git rev-parse --short --verify HEAD 2>/dev/null` > + if [ ! -z "$abbr" ] ; then > + version="${version}-${abbr}" > + fi > +fi > + > +git diff-index --quiet HEAD -- $package > /dev/null 2>&1 > +if [ $? 
-eq 1 ] ; then > + version="${version}-dirty" > +fi > + > +echo $version > From sashak at voltaire.com Mon Jan 7 00:11:30 2008 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Mon, 7 Jan 2008 08:11:30 +0000 Subject: [ofa-general] [PATCH] management/gen_ver.sh script In-Reply-To: <4781D638.2000508@mellanox.co.il> References: <20080106225813.GF26304@sashak.voltaire.com> <4781D638.2000508@mellanox.co.il> Message-ID: <20080107081130.GM26304@sashak.voltaire.com> On 09:35 Mon 07 Jan , Yevgeny Kliteynik wrote: > > Sasha Khapyorsky wrote: > > This generates a version string which includes recent version as > > specified in correspondent sub project's configure.in file, plus > > git revision abbreviation in the case if sub-project HEAD is different > > from recent tag, plus "-dirty" suffix if local uncommitted changes are > > in the sub project tree. For example: > > > > $ ./gen_ver.sh opensm > > 3.1.8-5a03b64-dirty > > > > Great, thanks. BTW, I'm thinking yet, do we need it for OFED-1.3? Any thoughts? Sasha From kliteyn at mellanox.co.il Mon Jan 7 00:05:03 2008 From: kliteyn at mellanox.co.il (Yevgeny Kliteynik) Date: Mon, 07 Jan 2008 10:05:03 +0200 Subject: [ofa-general] [PATCH] management/gen_ver.sh script In-Reply-To: <20080107081130.GM26304@sashak.voltaire.com> References: <20080106225813.GF26304@sashak.voltaire.com> <4781D638.2000508@mellanox.co.il> <20080107081130.GM26304@sashak.voltaire.com> Message-ID: <4781DD2F.1040304@mellanox.co.il> Sasha Khapyorsky wrote: > On 09:35 Mon 07 Jan , Yevgeny Kliteynik wrote: > >> Sasha Khapyorsky wrote: >> >>> This generates a version string which includes recent version as >>> specified in correspondent sub project's configure.in file, plus >>> git revision abbreviation in the case if sub-project HEAD is different >>> from recent tag, plus "-dirty" suffix if local uncommitted changes are >>> in the sub project tree. For example: >>> >>> $ ./gen_ver.sh opensm >>> 3.1.8-5a03b64-dirty >>> >>> >> Great, thanks. 
>> > > BTW, I'm thinking yet, do we need it for OFED-1.3? Any thoughts? > Why not? It would be easier to understand what exactly is the user running when we'll get into the usual OFED 1.3.x.y... versioning mess. -- Yevgeny > Sasha > _______________________________________________ > general mailing list > general at lists.openfabrics.org > http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general > > To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general > > From sashak at voltaire.com Mon Jan 7 00:21:16 2008 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Mon, 7 Jan 2008 08:21:16 +0000 Subject: [ofa-general] [PATCH] management/gen_ver.sh script In-Reply-To: <4781DD2F.1040304@mellanox.co.il> References: <20080106225813.GF26304@sashak.voltaire.com> <4781D638.2000508@mellanox.co.il> <20080107081130.GM26304@sashak.voltaire.com> <4781DD2F.1040304@mellanox.co.il> Message-ID: <20080107082116.GN26304@sashak.voltaire.com> On 10:05 Mon 07 Jan , Yevgeny Kliteynik wrote: > > Sasha Khapyorsky wrote: > > On 09:35 Mon 07 Jan , Yevgeny Kliteynik wrote: > > > >> Sasha Khapyorsky wrote: > >> > >>> This generates a version string which includes recent version as > >>> specified in correspondent sub project's configure.in file, plus > >>> git revision abbreviation in the case if sub-project HEAD is different > >>> from recent tag, plus "-dirty" suffix if local uncommitted changes are > >>> in the sub project tree. For example: > >>> > >>> $ ./gen_ver.sh opensm > >>> 3.1.8-5a03b64-dirty > >>> > >> Great, thanks. > >> > > > > BTW, I'm thinking yet, do we need it for OFED-1.3? Any thoughts? > > > > Why not? It would be easier to understand what exactly is the user running > when we'll get into the usual OFED 1.3.x.y... versioning mess. Ok. I will apply to OFED too. 
Sasha From kliteyn at dev.mellanox.co.il Mon Jan 7 00:12:52 2008 From: kliteyn at dev.mellanox.co.il (Yevgeny Kliteynik) Date: Mon, 07 Jan 2008 10:12:52 +0200 Subject: [ofa-general] [PATCH] management/gen_ver.sh script In-Reply-To: <4781DD2F.1040304@mellanox.co.il> References: <20080106225813.GF26304@sashak.voltaire.com> <4781D638.2000508@mellanox.co.il> <20080107081130.GM26304@sashak.voltaire.com> <4781DD2F.1040304@mellanox.co.il> Message-ID: <4781DF04.8070202@dev.mellanox.co.il> Yevgeny Kliteynik wrote: > > Sasha Khapyorsky wrote: >> On 09:35 Mon 07 Jan , Yevgeny Kliteynik wrote: >> >>> Sasha Khapyorsky wrote: >>> >>>> This generates a version string which includes recent version as >>>> specified in correspondent sub project's configure.in file, plus >>>> git revision abbreviation in the case if sub-project HEAD is different >>>> from recent tag, plus "-dirty" suffix if local uncommitted changes are >>>> in the sub project tree. For example: >>>> >>>> $ ./gen_ver.sh opensm >>>> 3.1.8-5a03b64-dirty >>>> >>> Great, thanks. >>> >> >> BTW, I'm thinking yet, do we need it for OFED-1.3? Any thoughts? >> > > Why not? It would be easier to understand what exactly is the user running > when we'll get into the usual OFED 1.3.x.y... versioning mess. On second thought, for all the OFED 1.3.x.y the tag would be 1.3 anyway...
But still, don't see why not include it in 1.3 -- Yevgeny > >> Sasha >> _______________________________________________ >> general mailing list >> general at lists.openfabrics.org >> http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general >> >> To unsubscribe, please visit >> http://openib.org/mailman/listinfo/openib-general >> >> > _______________________________________________ > general mailing list > general at lists.openfabrics.org > http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general > > To unsubscribe, please visit > http://openib.org/mailman/listinfo/openib-general > From sashak at voltaire.com Mon Jan 7 00:36:22 2008 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Mon, 7 Jan 2008 08:36:22 +0000 Subject: [ofa-general] [PATCH] management/gen_ver.sh script In-Reply-To: <4781DF04.8070202@dev.mellanox.co.il> References: <20080106225813.GF26304@sashak.voltaire.com> <4781D638.2000508@mellanox.co.il> <20080107081130.GM26304@sashak.voltaire.com> <4781DD2F.1040304@mellanox.co.il> <4781DF04.8070202@dev.mellanox.co.il> Message-ID: <20080107083622.GO26304@sashak.voltaire.com> On 10:12 Mon 07 Jan , Yevgeny Kliteynik wrote: > Yevgeny Kliteynik wrote: > > Sasha Khapyorsky wrote: > >> On 09:35 Mon 07 Jan , Yevgeny Kliteynik wrote: > >> > >>> Sasha Khapyorsky wrote: > >>> > >>>> This generates a version string which includes recent version as > >>>> specified in correspondent sub project's configure.in file, plus > >>>> git revision abbreviation in the case if sub-project HEAD is different > >>>> from recent tag, plus "-dirty" suffix if local uncommitted changes are > >>>> in the sub project tree. For example: > >>>> > >>>> $ ./gen_ver.sh opensm > >>>> 3.1.8-5a03b64-dirty > >>>> > >>> Great, thanks. > >>> > >> > >> BTW, I'm thinking yet, do we need it for OFED-1.3? Any thoughts? > >> > > Why not? It would be easier to understand what exactly is the user running > > when we'll get into the usual OFED 1.3.x.y... versioning mess. 
> > On second though, for all the OFED 1.3.x.y the tag would be 1.3 anyway... git tag? Yes, the release tag is untouched, but you have exact commit abbreviation as part of version string, so 'git-show 5a03b64' will let you this. Sasha > But still, don't see why not include it in 1.3 > > -- Yevgeny > > >> Sasha > >> _______________________________________________ > >> general mailing list > >> general at lists.openfabrics.org > >> http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general > >> > >> To unsubscribe, please visit > >> http://openib.org/mailman/listinfo/openib-general > >> > >> > > _______________________________________________ > > general mailing list > > general at lists.openfabrics.org > > http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general > > To unsubscribe, please visit > > http://openib.org/mailman/listinfo/openib-general > From ogerlitz at voltaire.com Mon Jan 7 01:31:59 2008 From: ogerlitz at voltaire.com (Or Gerlitz) Date: Mon, 07 Jan 2008 11:31:59 +0200 Subject: [ofa-general] [PATCH] libmlx4: Fix the value of the pkey_index in the completion In-Reply-To: <200801061801.25386.dotanb@dev.mellanox.co.il> References: <200801061801.25386.dotanb@dev.mellanox.co.il> Message-ID: <4781F18F.1070506@voltaire.com> Dotan Barak wrote: > Fix the value of the pkey_index in the completion to get a valid value for GSI QPs. Is libmthca fine in that respect? Or. From dotanb at dev.mellanox.co.il Mon Jan 7 02:13:21 2008 From: dotanb at dev.mellanox.co.il (Dotan Barak) Date: Mon, 07 Jan 2008 12:13:21 +0200 Subject: [ofa-general] [PATCH] libmlx4: Fix the value of the pkey_index in the completion In-Reply-To: <4781F18F.1070506@voltaire.com> References: <200801061801.25386.dotanb@dev.mellanox.co.il> <4781F18F.1070506@voltaire.com> Message-ID: <4781FB41.6040204@dev.mellanox.co.il> Or Gerlitz wrote: > Dotan Barak wrote: >> Fix the value of the pkey_index in the completion to get a valid >> value for GSI QPs. > > Is libmthca fine in that respect? 
As much as i know, everything is fine with mthca/libmthca. We saw several problems only in ConnectX (because of the new low level driver). Right now, we are doing some more checks to check the mlx4_0 low level driver as well as the IB core. After that we'll check the mthca low level driver too. Dotan From vlad at lists.openfabrics.org Mon Jan 7 03:11:17 2008 From: vlad at lists.openfabrics.org (Vladimir Sokolovsky Mellanox) Date: Mon, 7 Jan 2008 03:11:17 -0800 (PST) Subject: [ofa-general] ofa_1_3_kernel 20080107-0200 daily build status Message-ID: <20080107111117.A7D46E60042@openfabrics.org> This email was generated automatically, please do not reply git_url: git://git.openfabrics.org/ofed_1_3/linux-2.6.git git_branch: ofed_kernel Common build parameters: --with-ipoib-mod --with-sdp-mod --with-srp-mod --with-user_mad-mod --with-user_access-mod --with-mthca-mod --with-mlx4-mod --with-core-mod --with-addr_trans-mod --with-rds-mod --with-cxgb3-mod --with-nes-mod Passed: Passed on i686 with 2.6.15-23-server Passed on i686 with linux-2.6.22 Passed on i686 with linux-2.6.21.1 Passed on i686 with linux-2.6.18 Passed on i686 with linux-2.6.17 Passed on i686 with linux-2.6.19 Passed on i686 with linux-2.6.14 Passed on i686 with linux-2.6.13 Passed on i686 with linux-2.6.15 Passed on i686 with linux-2.6.16 Passed on i686 with linux-2.6.12 Passed on x86_64 with linux-2.6.22.5-31-default Passed on x86_64 with linux-2.6.22 Passed on ia64 with linux-2.6.18 Passed on x86_64 with linux-2.6.16 Passed on ia64 with linux-2.6.19 Passed on x86_64 with linux-2.6.18 Passed on ia64 with linux-2.6.22 Passed on ppc64 with linux-2.6.15 Passed on ppc64 with linux-2.6.18 Passed on x86_64 with linux-2.6.20 Passed on ppc64 with linux-2.6.19 Passed on x86_64 with linux-2.6.13 Passed on powerpc with linux-2.6.14 Passed on powerpc with linux-2.6.13 Passed on ppc64 with linux-2.6.12 Passed on ia64 with linux-2.6.17 Passed on x86_64 with linux-2.6.19 Passed on ppc64 with linux-2.6.16 Passed on x86_64 
with linux-2.6.17 Passed on ppc64 with linux-2.6.14 Passed on ia64 with linux-2.6.15 Passed on x86_64 with linux-2.6.14 Passed on ia64 with linux-2.6.16 Passed on x86_64 with linux-2.6.9-42.ELsmp Passed on powerpc with linux-2.6.15 Passed on ia64 with linux-2.6.12 Passed on ia64 with linux-2.6.14 Passed on powerpc with linux-2.6.12 Passed on x86_64 with linux-2.6.12 Passed on ia64 with linux-2.6.13 Passed on x86_64 with linux-2.6.15 Passed on ppc64 with linux-2.6.13 Passed on ia64 with linux-2.6.23 Passed on ppc64 with linux-2.6.17 Passed on x86_64 with linux-2.6.21.1 Passed on x86_64 with linux-2.6.16.43-0.3-smp Passed on ppc64 with linux-2.6.18-8.el5 Passed on x86_64 with linux-2.6.18-1.2798.fc6 Passed on x86_64 with linux-2.6.9-55.ELsmp Passed on x86_64 with linux-2.6.18-8.el5 Passed on x86_64 with linux-2.6.16.21-0.8-smp Passed on x86_64 with linux-2.6.18-53.el5 Passed on ia64 with linux-2.6.16.21-0.8-default Passed on ia64 with linux-2.6.21.1 Failed: From kliteyn at dev.mellanox.co.il Mon Jan 7 04:07:39 2008 From: kliteyn at dev.mellanox.co.il (Yevgeny Kliteynik) Date: Mon, 07 Jan 2008 14:07:39 +0200 Subject: [ofa-general] [PATCH] libmlx4: Fix the value of the pkey_index in the completion In-Reply-To: <4781FB41.6040204@dev.mellanox.co.il> References: <200801061801.25386.dotanb@dev.mellanox.co.il> <4781F18F.1070506@voltaire.com> <4781FB41.6040204@dev.mellanox.co.il> Message-ID: <4782160B.1080709@dev.mellanox.co.il> Dotan Barak wrote: > Or Gerlitz wrote: >> Dotan Barak wrote: >>> Fix the value of the pkey_index in the completion to get a valid >>> value for GSI QPs. >> >> Is libmthca fine in that respect? > As much as i know, everything is fine with mthca/libmthca. > > We saw several problems only in ConnectX (because of the new low level > driver). > > Right now, we are doing some more checks to check the mlx4_0 low level > driver as well as the IB core. > After that we'll check the mthca low level driver too. 
Currently OpenSM doesn't support any non-default pkey (or any pkey at index other than 0) in sa queries. When a request is received, opensm doesn't extract the right pkey from the mad header - it replaces it with a default pkey, and when a response is sent, OpenSM always uses pkey at index 0. -- Yevgeny > Dotan > _______________________________________________ > general mailing list > general at lists.openfabrics.org > http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general > > To unsubscribe, please visit > http://openib.org/mailman/listinfo/openib-general > From dotanb at dev.mellanox.co.il Mon Jan 7 04:14:30 2008 From: dotanb at dev.mellanox.co.il (Dotan Barak) Date: Mon, 07 Jan 2008 14:14:30 +0200 Subject: [ofa-general] [PATCH] libmlx4: Fix the value of the pkey_index in the completion In-Reply-To: <4782160B.1080709@dev.mellanox.co.il> References: <200801061801.25386.dotanb@dev.mellanox.co.il> <4781F18F.1070506@voltaire.com> <4781FB41.6040204@dev.mellanox.co.il> <4782160B.1080709@dev.mellanox.co.il> Message-ID: <478217A6.80307@dev.mellanox.co.il> Yevgeny Kliteynik wrote: > Dotan Barak wrote: >> Or Gerlitz wrote: >>> Dotan Barak wrote: >>>> Fix the value of the pkey_index in the completion to get a valid >>>> value for GSI QPs. >>> >>> Is libmthca fine in that respect? >> As much as i know, everything is fine with mthca/libmthca. >> >> We saw several problems only in ConnectX (because of the new low >> level driver). >> >> Right now, we are doing some more checks to check the mlx4_0 low >> level driver as well as the IB core. >> After that we'll check the mthca low level driver too. > > Currently OpenSM doesn't support any non-default pkey > (or any pkey at index other than 0) in sa queries. > When a request is received, opensm doesn't extract the > right pkey from the mad header - it replaces it with a > default pkey, and when a response is sent, OpenSM always > uses pkey at index 0. 
> > -- Yevgeny FYI: after several testings it seems that mthca low level driver don't have this problem. Dotan From dotanb at dev.mellanox.co.il Mon Jan 7 04:49:55 2008 From: dotanb at dev.mellanox.co.il (Dotan Barak) Date: Mon, 07 Jan 2008 14:49:55 +0200 Subject: [ofa-general] [PATCH] libmlx4: Fix the value of the pkey_index in the completion In-Reply-To: <478217A6.80307@dev.mellanox.co.il> References: <200801061801.25386.dotanb@dev.mellanox.co.il> <4781F18F.1070506@voltaire.com> <4781FB41.6040204@dev.mellanox.co.il> <4782160B.1080709@dev.mellanox.co.il> <478217A6.80307@dev.mellanox.co.il> Message-ID: <47821FF3.7020705@dev.mellanox.co.il> Dotan Barak wrote: > Yevgeny Kliteynik wrote: >> Dotan Barak wrote: >>> Or Gerlitz wrote: >>>> Dotan Barak wrote: >>>>> Fix the value of the pkey_index in the completion to get a valid >>>>> value for GSI QPs. >>>> >>>> Is libmthca fine in that respect? >>> As much as i know, everything is fine with mthca/libmthca. >>> >>> We saw several problems only in ConnectX (because of the new low >>> level driver). >>> >>> Right now, we are doing some more checks to check the mlx4_0 low >>> level driver as well as the IB core. >>> After that we'll check the mthca low level driver too. >> >> Currently OpenSM doesn't support any non-default pkey >> (or any pkey at index other than 0) in sa queries. >> When a request is received, opensm doesn't extract the >> right pkey from the mad header - it replaces it with a >> default pkey, and when a response is sent, OpenSM always >> uses pkey at index 0. >> >> -- Yevgeny > FYI: after several testings it seems that mthca low level driver don't > have this problem. > > Dotan > Just to make sure that everything is clear: I checked that the mthca low level driver can extract the right pkey_index in the completion of GSI QP. The problem that Yevgeny mentioned exists in the openSM and i opened a bug on this issue. 
Dotan

From sashak at voltaire.com Sun Jan 6 18:23:21 2008
From: sashak at voltaire.com (Sasha Khapyorsky)
Date: Mon, 7 Jan 2008 02:23:21 +0000
Subject: [ofa-general] ***SPAM*** [PATCH] infiniband-diags: update version according to the tree state
In-Reply-To: <20080107015737.GJ26304@sashak.voltaire.com>
References: <20080107015737.GJ26304@sashak.voltaire.com>
Message-ID: <20080107022321.GL26304@sashak.voltaire.com>

Update the version according to the tree state. Similar to OpenSM version
update.

Signed-off-by: Sasha Khapyorsky
---
 infiniband-diags/Makefile.am |   12 ++++++++++++
 1 files changed, 12 insertions(+), 0 deletions(-)

diff --git a/infiniband-diags/Makefile.am b/infiniband-diags/Makefile.am
index 8e1e587..ca66e2d 100644
--- a/infiniband-diags/Makefile.am
+++ b/infiniband-diags/Makefile.am
@@ -91,6 +91,18 @@ man_MANS = man/ibaddr.8 man/ibcheckerrors.8 man/ibcheckerrs.8 \
 	man/ibdatacounts.8 man/ibdatacounters.8 \
 	man/ibrouters.8 man/ibprintrt.8 man/ibidsverify.8

+BUILT_SOURCES = ibdiag_version
+ibdiag_version:
+	if [ -x $(top_srcdir)/../gen_ver.sh ] ; then \
+	ver_file=$(srcdir)/include/ibdiag_version.h ; \
+	ibdiag_ver=`cat $$ver_file | sed -ne '/#define IBDIAG_VERSION /s/^.*\"\(.*\)\"$$/\1/p'` ; \
+	ver=`$(top_srcdir)/../gen_ver.sh $(PACKAGE)` ; \
+	if [ $$ver != $$ibdiag_ver ] ; then \
+	cat $$ver_file | sed -e '/#define IBDIAG_VERSION /s/\".*\"/\"'$$ver'\"/' > tmp_new_version ; \
+	cat tmp_new_version > $$ver_file && rm -f tmp_new_version ; \
+	fi ; \
+	fi
+
 EXTRA_DIST = scripts include infiniband-diags.spec.in infiniband-diags.spec \
 	$(man_MANS) autogen.sh
--
1.5.4.rc2.38.gd6da3

From hrosenstock at xsigo.com Mon Jan 7 07:02:08 2008
From: hrosenstock at xsigo.com (Hal Rosenstock)
Date: Mon, 07 Jan 2008 07:02:08 -0800
Subject: [ofa-general] [PATCH] libmlx4: Fix the value of the pkey_index in the completion
In-Reply-To: <47821FF3.7020705@dev.mellanox.co.il>
References: <200801061801.25386.dotanb@dev.mellanox.co.il> <4781F18F.1070506@voltaire.com> <4781FB41.6040204@dev.mellanox.co.il> <4782160B.1080709@dev.mellanox.co.il> <478217A6.80307@dev.mellanox.co.il> <47821FF3.7020705@dev.mellanox.co.il>
Message-ID: <1199718128.20870.100.camel@hrosenstock-ws.xsigo.com>

On Mon, 2008-01-07 at 14:49 +0200, Dotan Barak wrote:
> Dotan Barak wrote:
> > Yevgeny Kliteynik wrote:
> >> Dotan Barak wrote:
> >>> Or Gerlitz wrote:
> >>>> Dotan Barak wrote:
> >>>>> Fix the value of the pkey_index in the completion to get a valid
> >>>>> value for GSI QPs.
> >>>>
> >>>> Is libmthca fine in that respect?
> >>> As much as i know, everything is fine with mthca/libmthca.
> >>>
> >>> We saw several problems only in ConnectX (because of the new low
> >>> level driver).
> >>>
> >>> Right now, we are doing some more checks to check the mlx4_0 low
> >>> level driver as well as the IB core.
> >>> After that we'll check the mthca low level driver too.
> >>
> >> Currently OpenSM doesn't support any non-default pkey
> >> (or any pkey at index other than 0) in sa queries.
> >> When a request is received, opensm doesn't extract the > >> right pkey from the mad header - it replaces it with a > >> default pkey, and when a response is sent, OpenSM always > >> uses pkey at index 0. > >> > >> -- Yevgeny > > FYI: after several testings it seems that mthca low level driver don't > > have this problem. > > > > Dotan > > > > Just to make sure that everything is clear: I checked that the mthca low > level driver can extract > the right pkey_index in the completion of GSI QP. > > The problem that Yevgeny mentioned exists in the openSM and i opened a > bug on this issue. What's the bug number for this ? -- Hal > > Dotan From hrosenstock at xsigo.com Mon Jan 7 07:03:07 2008 From: hrosenstock at xsigo.com (Hal Rosenstock) Date: Mon, 07 Jan 2008 07:03:07 -0800 Subject: [ofa-general] [PATCH] libmlx4: Fix the value of the pkey_index in the completion In-Reply-To: <47821FF3.7020705@dev.mellanox.co.il> References: <200801061801.25386.dotanb@dev.mellanox.co.il> <4781F18F.1070506@voltaire.com> <4781FB41.6040204@dev.mellanox.co.il> <4782160B.1080709@dev.mellanox.co.il> <478217A6.80307@dev.mellanox.co.il> <47821FF3.7020705@dev.mellanox.co.il> Message-ID: <1199718187.20870.102.camel@hrosenstock-ws.xsigo.com> On Mon, 2008-01-07 at 14:49 +0200, Dotan Barak wrote: > Dotan Barak wrote: > > Yevgeny Kliteynik wrote: > >> Dotan Barak wrote: > >>> Or Gerlitz wrote: > >>>> Dotan Barak wrote: > >>>>> Fix the value of the pkey_index in the completion to get a valid > >>>>> value for GSI QPs. > >>>> > >>>> Is libmthca fine in that respect? > >>> As much as i know, everything is fine with mthca/libmthca. > >>> > >>> We saw several problems only in ConnectX (because of the new low > >>> level driver). > >>> > >>> Right now, we are doing some more checks to check the mlx4_0 low > >>> level driver as well as the IB core. > >>> After that we'll check the mthca low level driver too. 
> >> > >> Currently OpenSM doesn't support any non-default pkey > >> (or any pkey at index other than 0) in sa queries. > >> When a request is received, opensm doesn't extract the > >> right pkey from the mad header - it replaces it with a > >> default pkey, and when a response is sent, OpenSM always > >> uses pkey at index 0. > >> > >> -- Yevgeny > > FYI: after several testings it seems that mthca low level driver don't > > have this problem. > > > > Dotan > > > > Just to make sure that everything is clear: I checked that the mthca low > level driver can extract > the right pkey_index in the completion of GSI QP. > > The problem that Yevgeny mentioned exists in the openSM and i opened a > bug on this issue. Some comments on the issues raised above: 1. There has already been discussion on this list of other OpenFabrics components assuming the default PKey at index 0 (and yes, this is not mandated by IBA). 2. As to the impact of using a non default PKey (in the BTH) for querying SA, is this really used in any implementations ? It makes deployment of SM difficult (needing much more configuration). That's not to say this isn't a bug but more speaks to the severity of it. IMO it should be documented as a current limitation. [Also, note that user MAD API only supports pkey index at recent kernel and library versions.] -- Hal > Dotan From hrosenstock at xsigo.com Mon Jan 7 07:03:16 2008 From: hrosenstock at xsigo.com (Hal Rosenstock) Date: Mon, 07 Jan 2008 07:03:16 -0800 Subject: [ofa-general] Re: [PATCH RFC] opensm/ib_types.h: remove ifdef WIN conditions In-Reply-To: <20080106154328.GA26304@sashak.voltaire.com> References: <20080106154328.GA26304@sashak.voltaire.com> Message-ID: <1199718196.20870.104.camel@hrosenstock-ws.xsigo.com> Sasha, On Sun, 2008-01-06 at 15:43 +0000, Sasha Khapyorsky wrote: > It was stated couple of times that in windows another instance of > ib_types.h file is used. If so we don't need to keep those 'ifdef WIN' > conditions here. 
Also this removes empty __ptr64 macro. Shouldn't this also be sent to ofw for comments ? Also, ib_cm_types.h looks like it should be changed as well in terms of this. Since master and ofed_1_3 are no longer identical, please indicate for which branch(es) patches are intended. -- Hal > Signed-off-by: Sasha Khapyorsky > --- > opensm/include/iba/ib_types.h | 36 ++++++++++-------------------------- > 1 files changed, 10 insertions(+), 26 deletions(-) > > diff --git a/opensm/include/iba/ib_types.h b/opensm/include/iba/ib_types.h > index 672184b..a438d8a 100644 > --- a/opensm/include/iba/ib_types.h > +++ b/opensm/include/iba/ib_types.h > @@ -49,20 +49,9 @@ > #endif /* __cplusplus */ > > BEGIN_C_DECLS > -#if defined( WIN32 ) || defined( _WIN64 ) > -#if defined( EXPORT_AL_SYMBOLS ) > -#define OSM_EXPORT __declspec(dllexport) > -#else > -#define OSM_EXPORT __declspec(dllimport) > -#endif > -#define OSM_API __stdcall > -#define OSM_CDECL __cdecl > -#else > #define OSM_EXPORT extern > #define OSM_API > #define OSM_CDECL > -#define __ptr64 > -#endif > /****h* IBA Base/Constants > * NAME > * Constants > @@ -8241,22 +8230,21 @@ typedef struct _ib_ioc_info { > /* > * The following definitions are shared between the Access Layer and VPD > */ > -typedef struct _ib_ca *__ptr64 ib_ca_handle_t; > -typedef struct _ib_pd *__ptr64 ib_pd_handle_t; > -typedef struct _ib_rdd *__ptr64 ib_rdd_handle_t; > -typedef struct _ib_mr *__ptr64 ib_mr_handle_t; > -typedef struct _ib_mw *__ptr64 ib_mw_handle_t; > -typedef struct _ib_qp *__ptr64 ib_qp_handle_t; > -typedef struct _ib_eec *__ptr64 ib_eec_handle_t; > -typedef struct _ib_cq *__ptr64 ib_cq_handle_t; > -typedef struct _ib_av *__ptr64 ib_av_handle_t; > -typedef struct _ib_mcast *__ptr64 ib_mcast_handle_t; > +typedef struct _ib_ca * ib_ca_handle_t; > +typedef struct _ib_pd * ib_pd_handle_t; > +typedef struct _ib_rdd * ib_rdd_handle_t; > +typedef struct _ib_mr * ib_mr_handle_t; > +typedef struct _ib_mw * ib_mw_handle_t; > +typedef struct _ib_qp * 
ib_qp_handle_t; > +typedef struct _ib_eec * ib_eec_handle_t; > +typedef struct _ib_cq * ib_cq_handle_t; > +typedef struct _ib_av * ib_av_handle_t; > +typedef struct _ib_mcast * ib_mcast_handle_t; > > /* Currently for windows branch, use the extended version of ib special verbs struct > in order to be compliant with Infinicon ib_types; later we'll change it to support > OpenSM ib_types.h */ > > -#ifndef WIN32 > /****d* Access Layer/ib_api_status_t > * NAME > * ib_api_status_t > @@ -10710,8 +10698,4 @@ typedef struct _ib_ci_op { > *****/ > > END_C_DECLS > -#endif /* ndef WIN32 */ > -#if defined( __WIN__ ) > -#include > -#endif > #endif /* __IB_TYPES_H__ */ From hrosenstock at xsigo.com Mon Jan 7 07:03:30 2008 From: hrosenstock at xsigo.com (Hal Rosenstock) Date: Mon, 07 Jan 2008 07:03:30 -0800 Subject: [ofa-general] [PATCH 1/3] libvendor: osm_vendor_get_all_port_attr() rework In-Reply-To: <11951179291903-git-send-email-sashak@voltaire.com> References: <11951179291471-git-send-email-sashak@voltaire.com> <11951179291903-git-send-email-sashak@voltaire.com> Message-ID: <1199718210.20870.105.camel@hrosenstock-ws.xsigo.com> Sasha, On Thu, 2007-11-15 at 11:12 +0200, Sasha Khapyorsky wrote: > It fixes couple of issues with this function: > > - return only valid guids, don't return duplicated entries What entries were duplicated ? I think there may be a subtle "API" change in that the ib_port_attr_t array filled in no longer has (or properly calculates the "best" port). Not sure if it is this change or some other change which causes this. -- Hal > as well as valid number of ports > - return valid sm_lid (as on ports) > - potential local buffers overflow > - minor leaks (not released ca) > > Finally it is much simplified now. 
> > Signed-off-by: Sasha Khapyorsky > --- > opensm/libvendor/osm_vendor_ibumad.c | 100 ++++++++-------------------------- > 1 files changed, 24 insertions(+), 76 deletions(-) > > diff --git a/opensm/libvendor/osm_vendor_ibumad.c b/opensm/libvendor/osm_vendor_ibumad.c > index 1d5f359..37007cd 100644 > --- a/opensm/libvendor/osm_vendor_ibumad.c > +++ b/opensm/libvendor/osm_vendor_ibumad.c > @@ -543,18 +543,10 @@ osm_vendor_get_all_port_attr(IN osm_vendor_t * const p_vend, > IN ib_port_attr_t * const p_attr_array, > IN uint32_t * const p_num_ports) > { > - ib_net64_t portguids[*p_num_ports]; > - ib_net64_t *p_guid = portguids, *e = portguids + *p_num_ports; > umad_ca_t ca; > - int lids[*p_num_ports]; > - int linkstates[*p_num_ports]; > - int portnums[*p_num_ports]; > - int *p_lid = lids; > - int *p_linkstates = linkstates; > - int *p_portnum = portnums; > - umad_port_t def_port = { "" }; > + ib_port_attr_t *attr = p_attr_array; > + unsigned done = 0; > int r, i, j; > - int sm_lid = 0; > > OSM_LOG_ENTER(p_vend->p_log, osm_vendor_get_all_port_attr); > > @@ -568,81 +560,37 @@ osm_vendor_get_all_port_attr(IN osm_vendor_t * const p_vend, > goto Exit; > } > > - for (i = 0; p_guid < e && i < p_vend->ca_count; i++) { > + if (!p_attr_array) { > + r = IB_INSUFFICIENT_MEMORY; > + *p_num_ports = 0; > + goto Exit; > + } > + > + for (i = 0; i < p_vend->ca_count && !done; i++) { > /* > * For each CA, retrieve the port guids > */ > - if ((r = umad_get_ca_portguids(p_vend->ca_names[i], > - p_guid, e - p_guid)) < 0) { > - osm_log(p_vend->p_log, OSM_LOG_ERROR, > - "osm_vendor_get_all_port_attr: ERR 5419: " > - "Unable to get CA %s port guids (%s)\n", > - p_vend->ca_names[i], strerror(r)); > - goto Exit; > - } > - > - p_guid += r; > - > - if ((r = umad_get_ca(p_vend->ca_names[i], &ca)) == 0) { > + if (umad_get_ca(p_vend->ca_names[i], &ca) == 0) { > for (j = 0; j <= ca.numports; j++) { > - if (ca.ports[j]) { > - *p_lid = ca.ports[j]->base_lid; > - *p_linkstates = ca.ports[j]->state; > - 
*p_portnum = ca.ports[j]->portnum; > - free(ca.ports[j]); > + if (!ca.ports[j]) > + continue; > + attr->port_guid = ca.ports[j]->port_guid; > + attr->lid = ca.ports[j]->base_lid; > + attr->port_num = ca.ports[j]->portnum; > + attr->sm_lid = ca.ports[j]->sm_lid; > + attr->link_state = ca.ports[j]->state; > + attr++; > + if (attr - p_attr_array > *p_num_ports) { > + done = 1; > + break; > } > - p_lid++; > - p_linkstates++; > - p_portnum++; > } > + umad_release_ca(&ca); > } > } > > - *p_num_ports = p_guid - portguids; > - > - /* > - * If no port 0 - we are on other than switch. > - * Get a default 'best' port from the library. > - */ > - if (*p_num_ports && !portguids[0]) { > - umad_get_port(0, 0, &def_port); > - > - portguids[0] = def_port.port_guid; > - lids[0] = def_port.base_lid; > - linkstates[0] = def_port.state; > - portnums[0] = def_port.portnum; > - sm_lid = def_port.sm_lid; > - > - osm_log(p_vend->p_log, OSM_LOG_DEBUG, > - "osm_vendor_get_all_port_attr: " > - "assign CA %s port %d guid (0x%" PRIx64 > - ") as the default port\n", def_port.ca_name, > - def_port.portnum, cl_hton64(def_port.port_guid)); > - > - umad_release_port(&def_port); > - } > - > - j = 0; > - if (p_attr_array) { > - /* set the port guid, lid, and sm lid in the port attr struct */ > - for (i = 0; i < *p_num_ports; i++) { > - if (i > 0 && portguids[i] == 0) > - continue; > - p_attr_array[j].port_guid = portguids[i]; > - p_attr_array[j].lid = lids[i]; > - p_attr_array[j].port_num = portnums[i]; > - if (j == 0) > - p_attr_array[j].sm_lid = sm_lid; > - else > - p_attr_array[j].sm_lid = > - p_vend->umad_port.sm_lid; > - p_attr_array[j].link_state = linkstates[i]; > - j++; > - } > - r = 0; > - *p_num_ports = j; > - } else > - r = IB_INSUFFICIENT_MEMORY; > + *p_num_ports = attr - p_attr_array; > + r = 0; > > Exit: > OSM_LOG_EXIT(p_vend->p_log); From kliteyn at dev.mellanox.co.il Mon Jan 7 07:13:56 2008 From: kliteyn at dev.mellanox.co.il (Yevgeny Kliteynik) Date: Mon, 07 Jan 2008 17:13:56 +0200 
Subject: [ofa-general] [PATCH] opensm/osm_qos_policy.c: trivial fix in passing wrong pointer
Message-ID: <478241B4.1050209@dev.mellanox.co.il>

st_lookup() returned node in the p_node pointer and replaced the
node that was intended to be inserted into the queue, which caused
an infinite loop.
Besides, if st_lookup() does find an element in the hash, we're only
interested to know that it did - don't need the actual element.

Please apply to ofed_1_3 and master.

Signed-off-by: Yevgeny Kliteynik
---
 opensm/opensm/osm_qos_policy.c |    3 +--
 1 files changed, 1 insertions(+), 2 deletions(-)

diff --git a/opensm/opensm/osm_qos_policy.c b/opensm/opensm/osm_qos_policy.c
index 6140de0..bde1e7e 100644
--- a/opensm/opensm/osm_qos_policy.c
+++ b/opensm/opensm/osm_qos_policy.c
@@ -77,8 +77,7 @@ __build_nodebyname_hash(osm_qos_policy_t * p_qos_policy)
 	     p_node != (osm_node_t *) cl_qmap_end(p_node_guid_tbl);
 	     p_node = (osm_node_t *) cl_qmap_next(&p_node->map_item)) {
 		if (!st_lookup(p_qos_policy->p_node_hash,
-			      (st_data_t)p_node->print_desc,
-			      (st_data_t*)&p_node))
+			      (st_data_t)p_node->print_desc, NULL))
 			st_insert(p_qos_policy->p_node_hash,
 				  (st_data_t)p_node->print_desc,
 				  (st_data_t)p_node);
--
1.5.1.4

From krause at cup.hp.com Mon Jan 7 07:08:58 2008
From: krause at cup.hp.com (Michael Krause)
Date: Mon, 07 Jan 2008 07:08:58 -0800
Subject: [ofa-general] AF_INET_SDP value
In-Reply-To:
References: <39C75744D164D948A170E9792AF8E7CA4296C4@exil.voltaire.com>
Message-ID: <6.2.0.14.2.20080107070720.0233cdc0@esmail.cup.hp.com>

An HTML attachment was scrubbed...
URL:

From perkinjo at cse.ohio-state.edu Mon Jan 7 07:23:43 2008
From: perkinjo at cse.ohio-state.edu (Jonathan L. Perkins)
Date: Mon, 07 Jan 2008 10:23:43 -0500
Subject: [ofa-general] Updated MVAPICH2 1.0.1 SRPM Available
Message-ID: <478243FF.2000209@cse.ohio-state.edu>

I've uploaded a new SRPM for MVAPICH2 to the openfabrics server. This is located in ~perkinjo/ofed_1_3/ and is identified by the latest.txt file.
This is a bug fix change that should solve the problem seen when building on the PPC64 platform. -- Jonathan Perkins http://www.cse.ohio-state.edu/~perkinjo From tziporet at mellanox.co.il Mon Jan 7 07:24:57 2008 From: tziporet at mellanox.co.il (Tziporet Koren) Date: Mon, 7 Jan 2008 17:24:57 +0200 Subject: [ofa-general] Agenda for OFED meeting today (Jan 7) Message-ID: <6C2C79E72C305246B504CBA17B5500C9030F03A2@mtlexch01.mtl.com> Hi All, After a nice holidays vacation we wish to proceed with OFED release :-) Meeting Agenda: 1. Release status - report from all 2. Tasks that should be completed for RC2: * XRC - enhanced API * IPoIB performance improvements for small messages - at least some of the changes * Open MPI 1.2.5-rc2 * Qlogic new driver - done * Any other? 3. Agree on new schedule for the release: * RC2: Jan 15, 2008 * RC3: Jan 29, 2008 * RC4: Feb 12, 2008 * Release: Feb 19, 2008 4. Review critical and major bugs: 760 major eli at mellanox.co.il UDP performance on Rx is lower than Tx 761 major eli at mellanox.co.il Poor and jittery UDP performance at small messages 820 major pasha at mellanox.co.il rpm 4.4.2.2, Binary file matches Binary file, 800 major perkinjo at cse.ohio-state.edu MVAPICH2 compile error on PPC64 736 major rolandd at cisco.com IBV_WC_RETRY_EXC_ERR errors with local rdma_reads 767 major swise at opengridcomputing.com Non backport Kernels that don't build in genalloc cause compile errors for cxgb3 Tziporet Koren Software Director Mellanox Technologies mailto: tziporet at mellanox.co.il Tel +972-4-9097200, ext 380 -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From dotanb at dev.mellanox.co.il Mon Jan 7 07:37:14 2008 From: dotanb at dev.mellanox.co.il (Dotan Barak) Date: Mon, 07 Jan 2008 17:37:14 +0200 Subject: [ofa-general] [PATCH] libmlx4: Fix the value of the pkey_index in the completion In-Reply-To: <1199718128.20870.100.camel@hrosenstock-ws.xsigo.com> References: <200801061801.25386.dotanb@dev.mellanox.co.il> <4781F18F.1070506@voltaire.com> <4781FB41.6040204@dev.mellanox.co.il> <4782160B.1080709@dev.mellanox.co.il> <478217A6.80307@dev.mellanox.co.il> <47821FF3.7020705@dev.mellanox.co.il> <1199718128.20870.100.camel@hrosenstock-ws.xsigo.com> Message-ID: <4782472A.8040305@dev.mellanox.co.il> >> Just to make sure that everything is clear: I checked that the mthca low >> level driver can extract >> the right pkey_index in the completion of GSI QP. >> >> The problem that Yevgeny mentioned exists in the openSM and i opened a >> bug on this issue. >> > > What's the bug number for this ? > 845 Dotan
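The libmlx4 thread above turns on the difference between a P_Key value and the index at which it sits in a port's P_Key table. The sketch below, under stated assumptions, looks up that index for a given value. The function name find_pkey_index is hypothetical, and the directory layout (one file per index, each holding a value such as 0xffff) is assumed to match what the Linux kernel exposes under /sys/class/infiniband/<device>/ports/<port>/pkeys. The demo runs on a throwaway directory rather than real hardware, and it matches values as plain strings, ignoring the membership bit.

```shell
#!/bin/sh
# Hypothetical helper: print the table index at which a P_Key value appears,
# or return 1 if the value is not in the table.
#   $1  directory with one file per index, each holding a value like 0xffff
#   $2  P_Key value to look for, written the same way (e.g. 0x8001)
find_pkey_index() {
    for f in "$1"/*; do
        [ -f "$f" ] || continue
        # Plain string comparison: the value must be spelled exactly as stored.
        if [ "$(cat "$f")" = "$2" ]; then
            basename "$f"
            return 0
        fi
    done
    return 1
}

# Demo on a temporary directory that imitates a two-entry P_Key table.
demo=$(mktemp -d)
echo 0xffff > "$demo/0"
echo 0x8001 > "$demo/1"
find_pkey_index "$demo" 0x8001
rm -rf "$demo"
```

A consumer that always assumes index 0, as described for OpenSM in the thread, would only ever see the first entry of such a table.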
From a.capriotti at cineca.it Mon Jan 7 09:36:00 2008 From: a.capriotti at cineca.it (Andrea Capriotti) Date: Mon, 7 Jan 2008 18:36:00 +0100 (MET) Subject: [ofa-general] Issues with compilation of OFED 1.2.5.4 and RHEL 4 U6kernel 2.6.9-67.0.1.ELsmp In-Reply-To: <39C75744D164D948A170E9792AF8E7CAC5ACB2@exil.voltaire.com> References: <1199289816.10472.61.camel@debcap.cineca.it> <39C75744D164D948A170E9792AF8E7CAC5ACB2@exil.voltaire.com> Message-ID: <1199727505.4046.24.camel@debcap.cineca.it> Il giorno gio, 03/01/2008 alle 10.19 +0200, Moshe Kazir ha scritto: > That's what I send to Vlad, > > You have to use the patch files and the new install.pl It doesn't work. It exits with this error: + cd ofa_kernel-1.2.5.4 + cd /var/tmp/OFED_topdir/BUILD/ofa_kernel-1.2.5.4 + mkdir -p /var/tmp/OFED//usr/src + cp -a /var/tmp/OFED_topdir/BUILD/ofa_kernel-1.2.5.4 /var/tmp/OFED//usr/src + ./configure --prefix=/usr --kernel-version 2.6.9-67.0.1.ELsmp --kernel-sources /lib/modules/2.6.9-67.0.1.ELsmp/build --modules-dir /lib/modules/2.6.9-67.0.
1.ELsmp/updates --with-core-mod --with-user_mad-mod --with-user_access-mod --with-addr_trans-mod --with-mthca-mod --with-mlx4-mod --with-cxgb3-mod --with-nes -mod --with-ipath_inf-mod --with-ipoib-mod --with-sdp-mod --with-srp-mod --with-srp-target-mod --with-rds-mod --with-qlgc_vnic-mod --with-iser-mod Wrong parameter --with-nes-mod > Or try using OFED-1.2.5 last build from > http://www.openfabrics.org/builds/connectx OFED-1.2.5.4-20080106-0845.tgz worked like a charm, thank you very much. Can I use it in a production environment? Does it include only bug fixes and the backport for RHEL4 U6 or something else? When will OFED 1.2.5.5 be released? Best Regards -- Andrea Capriotti System Management Group - Cineca - www.cineca.it a.capriotti at cineca.it - Tel +39 051 6171890 From weiny2 at llnl.gov Mon Jan 7 10:04:35 2008 From: weiny2 at llnl.gov (Ira Weiny) Date: Mon, 7 Jan 2008 10:04:35 -0800 Subject: [ofa-general] [PATCH] management/gen_ver.sh script In-Reply-To: <20080107082116.GN26304@sashak.voltaire.com> References: <20080106225813.GF26304@sashak.voltaire.com> <4781D638.2000508@mellanox.co.il> <20080107081130.GM26304@sashak.voltaire.com> <4781DD2F.1040304@mellanox.co.il> <20080107082116.GN26304@sashak.voltaire.com> Message-ID: <20080107100435.52a51266.weiny2@llnl.gov> On Mon, 7 Jan 2008 08:21:16 +0000 Sasha Khapyorsky wrote: > On 10:05 Mon 07 Jan , Yevgeny Kliteynik wrote: > > > > Sasha Khapyorsky wrote: > > > On 09:35 Mon 07 Jan , Yevgeny Kliteynik wrote: > > > > > >> Sasha Khapyorsky wrote: > > >> > > >>> This generates a version string which includes recent version as > > >>> specified in correspondent sub project's configure.in file, plus > > >>> git revision abbreviation in the case if sub-project HEAD is different > > >>> from recent tag, plus "-dirty" suffix if local uncommitted changes are > > >>> in the sub project tree. For example: > > >>> > > >>> $ ./gen_ver.sh opensm > > >>> 3.1.8-5a03b64-dirty > > >>> > > >> Great, thanks.
> > >> > > > > > > BTW, I'm thinking yet, do we need it for OFED-1.3? Any thoughts? > > > > > > > Why not? It would be easier to understand what exactly is the user running > > when we'll get into the usual OFED 1.3.x.y... versioning mess. > > Ok. I will apply to OFED too. Yes, yes, and yes. I was going to ask if you would apply it to 1.3 branch as well. Ira > > Sasha > _______________________________________________ > general mailing list > general at lists.openfabrics.org > http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general > > To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general From jim at mellanox.com Mon Jan 7 13:36:09 2008 From: jim at mellanox.com (Jim Mott) Date: Mon, 7 Jan 2008 13:36:09 -0800 Subject: [ofa-general] AF_INET_SDP value In-Reply-To: <6.2.0.14.2.20080107070720.0233cdc0@esmail.cup.hp.com> References: <39C75744D164D948A170E9792AF8E7CA4296C4@exil.voltaire.com> <6.2.0.14.2.20080107070720.0233cdc0@esmail.cup.hp.com> Message-ID: This is indeed how SDP works on Linux. The unmodified binary runs against the libsdp shared library and the right things happen. The AF issue comes in because of a requirement (request, desire, misunderstanding, creeping feature?)
to be able to create SDP only applications that can bypass the library and run directly against SDP. These applications, for example, will fail if the target system is not running SDP where the library approach silently falls back to TCP. While I am not sure of who the non-libsdp consumer of this AF is, I am sure that there is a non-technical problem just defining a new address family. The end result is that AF_INET_SDP is not defined in any normal OS place. Maybe this is correct behavior. I could certainly argue both sides. Thanks, JIm Jim Mott Mellanox Technologies Ltd. mail: jim at mellanox.com Phone: 512-294-5481 From: Michael Krause [mailto:krause at cup.hp.com] Sent: Monday, January 07, 2008 9:09 AM To: Jim Mott; Lenny Verkhovsky; general at lists.openfabrics.org Subject: RE: [ofa-general] AF_INET_SDP value Technically, there was never a solid technical reason to require a new AF_INET_SDP value since SDP should be transparently interposed underneath a Sockets AF_INET application (the SDP port mapper protocol helps in this regard as well).  The intended reason for SDP in the first place is to enable Sockets-based applications to transparently, i.e. non-modified source and if using shared libraries, non-modified binaries, to take advantage of RDMA interconnects.   This is how it is implemented on Windows and other OS that support SDP or in Window's case, the prior incarnation called Winsocks Direct. While making a modification to the address family may seem trivial to most, the simple act of opening up the application source to any change is a major issue to many enterprise customers.   Given SDP adoption is nascent and there are competing approaches to protocol acceleration technology coming to market or being explored as well as a lot of unfortunate marketing FUD, the developers might want to think about what it would take to support SDP as originally intended by the IBTA and IETF. 
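A minimal sketch of the two behaviours discussed in this thread, for illustration only. It assumes AF_INET_SDP is 27, the value quoted in the thread; no standard system header defines it, and the same number may belong to a different address family on other kernels. The helper names sdp_is_unavailable() and open_stream_socket() are hypothetical, and the check is made only at socket creation on the local host; the actual libsdp fallback logic is not shown in the thread and is not reproduced here.

```c
#include <errno.h>
#include <sys/socket.h>

/* Assumption: value taken from this thread; not defined in system headers. */
#ifndef AF_INET_SDP
#define AF_INET_SDP 27
#endif

/* Hypothetical helper: does this errno mean the family is not available? */
static int sdp_is_unavailable(int err)
{
    return err == EAFNOSUPPORT || err == EPROTONOSUPPORT;
}

/*
 * Hypothetical helper: try an SDP stream socket first.  If allow_fallback
 * is non-zero and SDP is unavailable locally, retry with AF_INET (TCP).
 * Returns a descriptor or -1; *family reports which family was used.
 */
static int open_stream_socket(int allow_fallback, int *family)
{
    int fd = socket(AF_INET_SDP, SOCK_STREAM, 0);

    if (fd >= 0) {
        *family = AF_INET_SDP;
        return fd;
    }
    if (!allow_fallback || !sdp_is_unavailable(errno))
        return -1;      /* SDP-only behaviour: report the failure */

    fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd >= 0)
        *family = AF_INET;
    return fd;
}
```

Calling open_stream_socket(0, &family) models an SDP-only application that fails outright when SDP is missing; passing 1 models the silent fallback to TCP.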
Mike At 10:21 AM 1/6/2008, Jim Mott wrote: Content-class: urn:content-classes:message Content-Type: multipart/alternative;          boundary="----_=_NextPart_001_01C85090.FC407BBA" I do not believe so.  There are some politics involved.  This value is shipped as part of the user space libsdp code.  Perhaps someone that knows  more history on this can comment?   From: general-bounces at lists.openfabrics.org [ mailto:general-bounces at lists.openfabrics.org] On Behalf Of Lenny Verkhovsky Sent: Sunday, January 06, 2008 10:24 AM To: general at lists.openfabrics.org Subject: [ofa-general] AF_INET_SDP value   Hi,   Is AF_INET_SDP equals 27 is standartized for all architectures and kernels ?   Best Regards, Lenny.   _______________________________________________ general mailing list general at lists.openfabrics.org http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general From rdreier at cisco.com Mon Jan 7 14:05:20 2008 From: rdreier at cisco.com (Roland Dreier) Date: Mon, 07 Jan 2008 14:05:20 -0800 Subject: [ofa-general] [PATCH] libmlx4: Fix the value of the pkey_index in the completion In-Reply-To: <200801061801.25386.dotanb@dev.mellanox.co.il> (Dotan Barak's message of "Sun, 6 Jan 2008 18:01:25 +0200") References: <200801061801.25386.dotanb@dev.mellanox.co.il> Message-ID: > - wc->pkey_index = ntohl(cqe->immed_rss_invalid) >> 16; > + wc->pkey_index = (uint16_t)(ntohl(cqe->immed_rss_invalid) & 0x7f); This is pretty silly. We don't allow userspace to create QP1 anyway, so is there any point setting the pkey_index field here at all? - R. 
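To make the quoted change concrete, here is a small user-space sketch of the two extractions. It assumes, as the patch implies, that the P_Key index sits in the low 7 bits of the 32-bit immed_rss_invalid word once it has been converted to host byte order. The helper names pkey_index_old() and pkey_index_new() and the sample value are made up for illustration and are not taken from libmlx4.

```c
#include <arpa/inet.h>
#include <stdint.h>

/* Old extraction from the quoted diff: keep the upper 16 bits. */
static uint16_t pkey_index_old(uint32_t immed_rss_invalid_be)
{
    return (uint16_t)(ntohl(immed_rss_invalid_be) >> 16);
}

/* New extraction from the quoted diff: keep the low 7 bits (0..127). */
static uint16_t pkey_index_new(uint32_t immed_rss_invalid_be)
{
    return (uint16_t)(ntohl(immed_rss_invalid_be) & 0x7f);
}
```

For a word whose host-order value is 0x00010005, the old expression yields 1 and the new one yields 5. The mask also means the result can never exceed 127.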
From rdreier at cisco.com Mon Jan 7 14:11:28 2008 From: rdreier at cisco.com (Roland Dreier) Date: Mon, 07 Jan 2008 14:11:28 -0800 Subject: [ofa-general] Re: [PATCH] mlx4: Fix the value of the pkey_index in the completion In-Reply-To: <200801070901.26213.dotanb@dev.mellanox.co.il> (Dotan Barak's message of "Mon, 7 Jan 2008 09:01:25 +0200") References: <200801070901.26213.dotanb@dev.mellanox.co.il> Message-ID: > Without this fix, incoming GSI packets on port 2 gets invalid pkey index in the completion, > which prevent from the mad layer to send back a response. Does this explain the problems that someone saw when using all port 2 of a connectx setup? Why does it only affect port 2? > - wc->pkey_index = be32_to_cpu(cqe->immed_rss_invalid) >> 16; > + wc->pkey_index = (u16)(be32_to_cpu(cqe->immed_rss_invalid) & 0x7f); It seems the (u16) cast here is doubly useless, since wc->pkey_index is already u16, and we're masking with 0x7f anyway. By the way, will it always work to mask with 0x7f? Or is it possible that the P_Key table might have more than 128 entries? - R.
From dillowda at ornl.gov Mon Jan 7 15:23:41 2008 From: dillowda at ornl.gov (David Dillow) Date: Mon, 07 Jan 2008 18:23:41 -0500 Subject: [ofa-general] [PATCH v2] IB/srp: add identifying information to log messages In-Reply-To: <20071222145612.GA10085@osc.edu> References: <1198269544.9979.26.camel@lap75545.ornl.gov> <20071222145612.GA10085@osc.edu> Message-ID: <1199748221.22987.6.camel@lap75545.ornl.gov> When you have multiple targets, it gets really confusing when you try to track down who did a reset when there is no identifying information in the log message, especially when the same extension ID is mapped through two different local IB ports. So, add an identifier that can be used to track back to which local IB port/remote target pair is the one having problems. Signed-off-by: David Dillow --- On Sat, 2007-12-22 at 09:56 -0500, Pete Wyckoff wrote: > Good idea to fix these.
> > Could you use the standard dev_err(), dev_printk() and friends here > instead? dev = &target->scsi_host->shost_gendev. In fact, for > struct Scsi_host, you can do one better and use shost_printk(). I finally got back around to working on this; these apply to Linus's current tree. diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c index 77e8b90..154ebb0 100644 --- a/drivers/infiniband/ulp/srp/ib_srp.c +++ b/drivers/infiniband/ulp/srp/ib_srp.c @@ -272,7 +272,8 @@ static void srp_path_rec_completion(int status, target->status = status; if (status) - printk(KERN_ERR PFX "Got failed path rec status %d\n", status); + shost_printk(KERN_ERR, target->scsi_host, + PFX "Got failed path rec status %d\n", status); else target->path = *pathrec; complete(&target->done); @@ -303,7 +304,8 @@ static int srp_lookup_path(struct srp_target_port *target) wait_for_completion(&target->done); if (target->status < 0) - printk(KERN_WARNING PFX "Path record query failed\n"); + shost_printk(KERN_WARNING, target->scsi_host, + PFX "Path record query failed\n"); return target->status; } @@ -379,9 +381,10 @@ static int srp_send_req(struct srp_target_port *target) * the second 8 bytes to the local node GUID. 
*/ if (srp_target_is_topspin(target)) { - printk(KERN_DEBUG PFX "Topspin/Cisco initiator port ID workaround " - "activated for target GUID %016llx\n", - (unsigned long long) be64_to_cpu(target->ioc_guid)); + shost_printk(KERN_DEBUG, target->scsi_host, + PFX "Topspin/Cisco initiator port ID workaround " + "activated for target GUID %016llx\n", + (unsigned long long) be64_to_cpu(target->ioc_guid)); memset(req->priv.initiator_port_id, 0, 8); memcpy(req->priv.initiator_port_id + 8, &target->srp_host->dev->dev->node_guid, 8); @@ -400,7 +403,8 @@ static void srp_disconnect_target(struct srp_target_port *target) init_completion(&target->done); if (ib_send_cm_dreq(target->cm_id, NULL, 0)) { - printk(KERN_DEBUG PFX "Sending CM DREQ failed\n"); + shost_printk(KERN_DEBUG, target->scsi_host, + PFX "Sending CM DREQ failed\n"); return; } wait_for_completion(&target->done); @@ -568,7 +572,8 @@ static int srp_reconnect_target(struct srp_target_port *target) return ret; err: - printk(KERN_ERR PFX "reconnect failed (%d), removing target port.\n", ret); + shost_printk(KERN_ERR, target->scsi_host, + PFX "reconnect failed (%d), removing target port.\n", ret); /* * We couldn't reconnect, so kill our target port off. 
@@ -683,8 +688,9 @@ static int srp_map_data(struct scsi_cmnd *scmnd, struct srp_target_port *target, if (scmnd->sc_data_direction != DMA_FROM_DEVICE && scmnd->sc_data_direction != DMA_TO_DEVICE) { - printk(KERN_WARNING PFX "Unhandled data direction %d\n", - scmnd->sc_data_direction); + shost_printk(KERN_WARNING, target->scsi_host, + PFX "Unhandled data direction %d\n", + scmnd->sc_data_direction); return -EINVAL; } @@ -786,8 +792,9 @@ static void srp_process_rsp(struct srp_target_port *target, struct srp_rsp *rsp) } else { scmnd = req->scmnd; if (!scmnd) - printk(KERN_ERR "Null scmnd for RSP w/tag %016llx\n", - (unsigned long long) rsp->tag); + shost_printk(KERN_ERR, target->scsi_host, + "Null scmnd for RSP w/tag %016llx\n", + (unsigned long long) rsp->tag); scmnd->result = rsp->status; if (rsp->flags & SRP_RSP_FLAG_SNSVALID) { @@ -831,7 +838,8 @@ static void srp_handle_recv(struct srp_target_port *target, struct ib_wc *wc) if (0) { int i; - printk(KERN_ERR PFX "recv completion, opcode 0x%02x\n", opcode); + shost_printk(KERN_ERR, target->scsi_host, + PFX "recv completion, opcode 0x%02x\n", opcode); for (i = 0; i < wc->byte_len; ++i) { if (i % 8 == 0) @@ -852,11 +860,13 @@ static void srp_handle_recv(struct srp_target_port *target, struct ib_wc *wc) case SRP_T_LOGOUT: /* XXX Handle target logout */ - printk(KERN_WARNING PFX "Got target logout request\n"); + shost_printk(KERN_WARNING, target->scsi_host, + PFX "Got target logout request\n"); break; default: - printk(KERN_WARNING PFX "Unhandled SRP opcode 0x%02x\n", opcode); + shost_printk(KERN_WARNING, target->scsi_host, + PFX "Unhandled SRP opcode 0x%02x\n", opcode); break; } @@ -872,9 +882,10 @@ static void srp_completion(struct ib_cq *cq, void *target_ptr) ib_req_notify_cq(cq, IB_CQ_NEXT_COMP); while (ib_poll_cq(cq, 1, &wc) > 0) { if (wc.status) { - printk(KERN_ERR PFX "failed %s status %d\n", - wc.wr_id & SRP_OP_RECV ? 
"receive" : "send", - wc.status); + shost_printk(KERN_ERR, target->scsi_host, + PFX "failed %s status %d\n", + wc.wr_id & SRP_OP_RECV ? "receive" : "send", + wc.status); target->qp_in_error = 1; break; } @@ -1022,12 +1033,13 @@ static int srp_queuecommand(struct scsi_cmnd *scmnd, len = srp_map_data(scmnd, target, req); if (len < 0) { - printk(KERN_ERR PFX "Failed to map data\n"); + shost_printk(KERN_ERR, target->scsi_host, + PFX "Failed to map data\n"); goto err; } if (__srp_post_recv(target)) { - printk(KERN_ERR PFX "Recv failed\n"); + shost_printk(KERN_ERR, target->scsi_host, PFX "Recv failed\n"); goto err_unmap; } @@ -1035,7 +1047,7 @@ static int srp_queuecommand(struct scsi_cmnd *scmnd, DMA_TO_DEVICE); if (__srp_post_send(target, iu, len)) { - printk(KERN_ERR PFX "Send failed\n"); + shost_printk(KERN_ERR, target->scsi_host, PFX "Send failed\n"); goto err_unmap; } @@ -1090,6 +1102,7 @@ static void srp_cm_rej_handler(struct ib_cm_id *cm_id, struct ib_cm_event *event, struct srp_target_port *target) { + struct Scsi_Host *shost = target->scsi_host; struct ib_class_port_info *cpi; int opcode; @@ -1115,19 +1128,22 @@ static void srp_cm_rej_handler(struct ib_cm_id *cm_id, memcpy(target->path.dgid.raw, event->param.rej_rcvd.ari, 16); - printk(KERN_DEBUG PFX "Topspin/Cisco redirect to target port GID %016llx%016llx\n", - (unsigned long long) be64_to_cpu(target->path.dgid.global.subnet_prefix), - (unsigned long long) be64_to_cpu(target->path.dgid.global.interface_id)); + shost_printk(KERN_DEBUG, shost, + PFX "Topspin/Cisco redirect to target port GID %016llx%016llx\n", + (unsigned long long) be64_to_cpu(target->path.dgid.global.subnet_prefix), + (unsigned long long) be64_to_cpu(target->path.dgid.global.interface_id)); target->status = SRP_PORT_REDIRECT; } else { - printk(KERN_WARNING " REJ reason: IB_CM_REJ_PORT_REDIRECT\n"); + shost_printk(KERN_WARNING, shost, + " REJ reason: IB_CM_REJ_PORT_REDIRECT\n"); target->status = -ECONNRESET; } break; case 
IB_CM_REJ_DUPLICATE_LOCAL_COMM_ID: - printk(KERN_WARNING " REJ reason: IB_CM_REJ_DUPLICATE_LOCAL_COMM_ID\n"); + shost_printk(KERN_WARNING, shost, + " REJ reason: IB_CM_REJ_DUPLICATE_LOCAL_COMM_ID\n"); target->status = -ECONNRESET; break; @@ -1138,20 +1154,21 @@ static void srp_cm_rej_handler(struct ib_cm_id *cm_id, u32 reason = be32_to_cpu(rej->reason); if (reason == SRP_LOGIN_REJ_REQ_IT_IU_LENGTH_TOO_LARGE) - printk(KERN_WARNING PFX - "SRP_LOGIN_REJ: requested max_it_iu_len too large\n"); + shost_printk(KERN_WARNING, shost, + PFX "SRP_LOGIN_REJ: requested max_it_iu_len too large\n"); else - printk(KERN_WARNING PFX - "SRP LOGIN REJECTED, reason 0x%08x\n", reason); + shost_printk(KERN_WARNING, shost, + PFX "SRP LOGIN REJECTED, reason 0x%08x\n", reason); } else - printk(KERN_WARNING " REJ reason: IB_CM_REJ_CONSUMER_DEFINED," - " opcode 0x%02x\n", opcode); + shost_printk(KERN_WARNING, shost, + " REJ reason: IB_CM_REJ_CONSUMER_DEFINED," + " opcode 0x%02x\n", opcode); target->status = -ECONNRESET; break; default: - printk(KERN_WARNING " REJ reason 0x%x\n", - event->param.rej_rcvd.reason); + shost_printk(KERN_WARNING, shost, " REJ reason 0x%x\n", + event->param.rej_rcvd.reason); target->status = -ECONNRESET; } } @@ -1166,7 +1183,8 @@ static int srp_cm_handler(struct ib_cm_id *cm_id, struct ib_cm_event *event) switch (event->event) { case IB_CM_REQ_ERROR: - printk(KERN_DEBUG PFX "Sending CM REQ failed\n"); + shost_printk(KERN_DEBUG, target->scsi_host, + PFX "Sending CM REQ failed\n"); comp = 1; target->status = -ECONNRESET; break; @@ -1184,7 +1202,8 @@ static int srp_cm_handler(struct ib_cm_id *cm_id, struct ib_cm_event *event) target->scsi_host->can_queue = min(target->req_lim, target->scsi_host->can_queue); } else { - printk(KERN_WARNING PFX "Unhandled RSP opcode %#x\n", opcode); + shost_printk(KERN_WARNING, target->scsi_host, + PFX "Unhandled RSP opcode %#x\n", opcode); target->status = -ECONNRESET; break; } @@ -1230,20 +1249,23 @@ static int srp_cm_handler(struct 
ib_cm_id *cm_id, struct ib_cm_event *event) break; case IB_CM_REJ_RECEIVED: - printk(KERN_DEBUG PFX "REJ received\n"); + shost_printk(KERN_DEBUG, target->scsi_host, PFX "REJ received\n"); comp = 1; srp_cm_rej_handler(cm_id, event, target); break; case IB_CM_DREQ_RECEIVED: - printk(KERN_WARNING PFX "DREQ received - connection closed\n"); + shost_printk(KERN_WARNING, target->scsi_host, + PFX "DREQ received - connection closed\n"); if (ib_send_cm_drep(cm_id, NULL, 0)) - printk(KERN_ERR PFX "Sending CM DREP failed\n"); + shost_printk(KERN_ERR, target->scsi_host, + PFX "Sending CM DREP failed\n"); break; case IB_CM_TIMEWAIT_EXIT: - printk(KERN_ERR PFX "connection closed\n"); + shost_printk(KERN_ERR, target->scsi_host, + PFX "connection closed\n"); comp = 1; target->status = 0; @@ -1255,7 +1277,8 @@ static int srp_cm_handler(struct ib_cm_id *cm_id, struct ib_cm_event *event) break; default: - printk(KERN_WARNING PFX "Unhandled CM event %d\n", event->event); + shost_printk(KERN_WARNING, target->scsi_host, + PFX "Unhandled CM event %d\n", event->event); break; } @@ -1332,7 +1355,7 @@ static int srp_abort(struct scsi_cmnd *scmnd) struct srp_request *req; int ret = SUCCESS; - printk(KERN_ERR "SRP abort called\n"); + shost_printk(KERN_ERR, target->scsi_host, "SRP abort called\n"); if (target->qp_in_error) return FAILED; @@ -1362,7 +1385,7 @@ static int srp_reset_device(struct scsi_cmnd *scmnd) struct srp_target_port *target = host_to_target(scmnd->device->host); struct srp_request *req, *tmp; - printk(KERN_ERR "SRP reset_device called\n"); + shost_printk(KERN_ERR, target->scsi_host, "SRP reset_device called\n"); if (target->qp_in_error) return FAILED; @@ -1389,7 +1412,7 @@ static int srp_reset_host(struct scsi_cmnd *scmnd) struct srp_target_port *target = host_to_target(scmnd->device->host); int ret = FAILED; - printk(KERN_ERR PFX "SRP reset_host called\n"); + shost_printk(KERN_ERR, target->scsi_host, PFX "SRP reset_host called\n"); if (!srp_reconnect_target(target)) ret = 
SUCCESS; @@ -1814,8 +1837,9 @@ static ssize_t srp_create_target(struct class_device *class_dev, ib_get_cached_gid(host->dev->dev, host->port, 0, &target->path.sgid); - printk(KERN_DEBUG PFX "new target: id_ext %016llx ioc_guid %016llx pkey %04x " - "service_id %016llx dgid %04x:%04x:%04x:%04x:%04x:%04x:%04x:%04x\n", + shost_printk(KERN_DEBUG, target->scsi_host, PFX + "new target: id_ext %016llx ioc_guid %016llx pkey %04x " + "service_id %016llx dgid %04x:%04x:%04x:%04x:%04x:%04x:%04x:%04x\n", (unsigned long long) be64_to_cpu(target->id_ext), (unsigned long long) be64_to_cpu(target->ioc_guid), be16_to_cpu(target->path.pkey), @@ -1842,7 +1866,8 @@ static ssize_t srp_create_target(struct class_device *class_dev, target->qp_in_error = 0; ret = srp_connect_target(target); if (ret) { - printk(KERN_ERR PFX "Connection failed\n"); + shost_printk(KERN_ERR, target->scsi_host, + PFX "Connection failed\n"); goto err_cm_id; } From sean.hefty at intel.com Mon Jan 7 16:23:17 2008 From: sean.hefty at intel.com (Sean Hefty) Date: Mon, 7 Jan 2008 16:23:17 -0800 Subject: [ofa-general] lock dependency in ib_user_mad In-Reply-To: References: <000101c84346$7766b2d0$9b37170a@amr.corp.intel.com> Message-ID: <000401c8518c$aabcf280$a937170a@amr.corp.intel.com> I turned on lock checking and got the following possible locking dependency. (Running on 2.6.24-rc3.) I have two connectX cards connected back to back and was trying to run opensm in QoS mode. Opensm failed to bring the subnet up, and hitting ctrl-C to kill opensm resulted in it hanging. Using kill -9 resulted in this being written to /var/log/messages. 
Jan 7 12:23:35 mshefty-linux3 kernel: Jan 7 12:23:35 mshefty-linux3 kernel: ======================================================= Jan 7 12:23:35 mshefty-linux3 kernel: [ INFO: possible circular locking dependency detected ] Jan 7 12:23:35 mshefty-linux3 kernel: 2.6.24-rc3 #3 Jan 7 12:23:35 mshefty-linux3 kernel: ------------------------------------------------------- Jan 7 12:23:35 mshefty-linux3 kernel: opensm/7164 is trying to acquire lock: Jan 7 12:23:35 mshefty-linux3 kernel: (¥ ){--..}, at: [] flush_workqueue+0x0/0xa0 Jan 7 12:23:35 mshefty-linux3 kernel: Jan 7 12:23:35 mshefty-linux3 kernel: but task is already holding lock: Jan 7 12:23:35 mshefty-linux3 kernel: (&port->mutex){----}, at: [] ib_umad_close+0x2d/0x100 [ib_umad] Jan 7 12:23:35 mshefty-linux3 kernel: Jan 7 12:23:35 mshefty-linux3 kernel: which lock already depends on the new lock. Jan 7 12:23:35 mshefty-linux3 kernel: Jan 7 12:23:35 mshefty-linux3 kernel: Jan 7 12:23:35 mshefty-linux3 kernel: the existing dependency chain (in reverse order) is: Jan 7 12:23:35 mshefty-linux3 kernel: Jan 7 12:23:35 mshefty-linux3 kernel: -> #2 (&port->mutex){----}: Jan 7 12:23:35 mshefty-linux3 kernel: [] __lock_acquire+0x7a7/0x10c0 Jan 7 12:23:35 mshefty-linux3 kernel: [] lock_acquire+0x53/0x70 Jan 7 12:23:35 mshefty-linux3 kernel: [] queue_packet+0x49/0xf0 [ib_umad] Jan 7 12:23:35 mshefty-linux3 kernel: [] kmem_cache_alloc+0xa5/0xe0 Jan 7 12:23:35 mshefty-linux3 kernel: [] down_read+0x32/0x40 Jan 7 12:23:35 mshefty-linux3 kernel: [] queue_packet+0x49/0xf0 [ib_umad] Jan 7 12:23:35 mshefty-linux3 kernel: [] recv_handler+0xdb/0x170 [ib_umad] Jan 7 12:23:35 mshefty-linux3 kernel: [] trace_hardirqs_on+0xbf/0x160 Jan 7 12:23:35 mshefty-linux3 kernel: [] ib_mad_completion_handler+0x2d5/0x6f0 [ib_mad] Jan 7 12:23:35 mshefty-linux3 kernel: [] __lock_acquire+0x8e7/0x10c0 Jan 7 12:23:35 mshefty-linux3 kernel: [] ib_mad_completion_handler+0x0/0x6f0 [ib_mad] Jan 7 12:23:35 mshefty-linux3 kernel: [] 
ib_mad_completion_handler+0x0/0x6f0 [ib_mad] Jan 7 12:23:35 mshefty-linux3 kernel: [] run_workqueue+0xed/0x220 Jan 7 12:23:35 mshefty-linux3 kernel: [] trace_hardirqs_on+0xbf/0x160 Jan 7 12:23:35 mshefty-linux3 kernel: [] worker_thread+0xd3/0x140 Jan 7 12:23:35 mshefty-linux3 kernel: [] autoremove_wake_function+0x0/0x30 Jan 7 12:23:35 mshefty-linux3 kernel: [] autoremove_wake_function+0x0/0x30 Jan 7 12:23:35 mshefty-linux3 kernel: [] worker_thread+0x0/0x140 Jan 7 12:23:35 mshefty-linux3 kernel: [] kthread+0x6c/0xa0 Jan 7 12:23:35 mshefty-linux3 kernel: [] child_rip+0xa/0x12 Jan 7 12:23:35 mshefty-linux3 kernel: [] restore_args+0x0/0x30 Jan 7 12:23:35 mshefty-linux3 kernel: [] kthreadd+0x8e/0x170 Jan 7 12:23:35 mshefty-linux3 kernel: [] kthread+0x0/0xa0 Jan 7 12:23:35 mshefty-linux3 kernel: [] child_rip+0x0/0x12 Jan 7 12:23:35 mshefty-linux3 kernel: [] 0xffffffffffffffff Jan 7 12:23:35 mshefty-linux3 kernel: Jan 7 12:23:35 mshefty-linux3 kernel: -> #1 (&port_priv->work){--..}: Jan 7 12:23:35 mshefty-linux3 kernel: [] __lock_acquire+0x7a7/0x10c0 Jan 7 12:23:35 mshefty-linux3 kernel: [] ib_mad_completion_handler+0x0/0x6f0 [ib_mad] Jan 7 12:23:35 mshefty-linux3 kernel: [] lock_acquire+0x53/0x70 Jan 7 12:23:35 mshefty-linux3 kernel: [] run_workqueue+0xa2/0x220 Jan 7 12:23:35 mshefty-linux3 kernel: [] _spin_unlock_irq+0x24/0x30 Jan 7 12:23:35 mshefty-linux3 kernel: [] run_workqueue+0xe7/0x220 Jan 7 12:23:35 mshefty-linux3 kernel: [] trace_hardirqs_on+0xbf/0x160 Jan 7 12:23:35 mshefty-linux3 kernel: [] worker_thread+0xd3/0x140 Jan 7 12:23:35 mshefty-linux3 kernel: [] autoremove_wake_function+0x0/0x30 Jan 7 12:23:35 mshefty-linux3 kernel: [] autoremove_wake_function+0x0/0x30 Jan 7 12:23:35 mshefty-linux3 kernel: [] worker_thread+0x0/0x140 Jan 7 12:23:35 mshefty-linux3 kernel: [] kthread+0x6c/0xa0 Jan 7 12:23:35 mshefty-linux3 kernel: [] child_rip+0xa/0x12 Jan 7 12:23:35 mshefty-linux3 kernel: [] restore_args+0x0/0x30 Jan 7 12:23:35 mshefty-linux3 kernel: [] 
kthreadd+0x8e/0x170 Jan 7 12:23:35 mshefty-linux3 kernel: [] kthread+0x0/0xa0 Jan 7 12:23:35 mshefty-linux3 kernel: [] child_rip+0x0/0x12 Jan 7 12:23:35 mshefty-linux3 kernel: [] 0xffffffffffffffff Jan 7 12:23:35 mshefty-linux3 kernel: Jan 7 12:23:35 mshefty-linux3 kernel: -> #0 (¥ ){--..}: Jan 7 12:23:35 mshefty-linux3 kernel: [] print_stack_trace+0x6a/0x80 Jan 7 12:23:35 mshefty-linux3 kernel: [] __lock_acquire+0x612/0x10c0 Jan 7 12:23:35 mshefty-linux3 kernel: [] __lock_acquire+0x8e7/0x10c0 Jan 7 12:23:35 mshefty-linux3 kernel: [] lock_acquire+0x53/0x70 Jan 7 12:23:35 mshefty-linux3 kernel: [] flush_workqueue+0x0/0xa0 Jan 7 12:23:35 mshefty-linux3 kernel: [] _spin_unlock_irqrestore+0x55/0x70 Jan 7 12:23:35 mshefty-linux3 kernel: [] flush_workqueue+0x43/0xa0 Jan 7 12:23:35 mshefty-linux3 kernel: [] ib_unregister_mad_agent+0x297/0x460 [ib_mad] Jan 7 12:23:35 mshefty-linux3 kernel: [] ib_umad_close+0xbe/0x100 [ib_umad] Jan 7 12:23:35 mshefty-linux3 kernel: [] __fput+0x1cb/0x200 Jan 7 12:23:35 mshefty-linux3 kernel: [] filp_close+0x4b/0xa0 Jan 7 12:23:35 mshefty-linux3 kernel: [] put_files_struct+0x70/0xc0 Jan 7 12:23:35 mshefty-linux3 kernel: [] do_exit+0x1d8/0x8d0 Jan 7 12:23:35 mshefty-linux3 kernel: [] __dequeue_signal+0x27/0x1e0 Jan 7 12:23:35 mshefty-linux3 kernel: [] do_group_exit+0x30/0x90 Jan 7 12:23:35 mshefty-linux3 kernel: [] get_signal_to_deliver+0x2fe/0x4f0 Jan 7 12:23:35 mshefty-linux3 kernel: [] do_notify_resume+0xc5/0x750 Jan 7 12:23:35 mshefty-linux3 kernel: [] trace_hardirqs_on_thunk+0x35/0x3a Jan 7 12:23:35 mshefty-linux3 kernel: [] trace_hardirqs_on+0xbf/0x160 Jan 7 12:23:35 mshefty-linux3 kernel: [] sysret_signal+0x21/0x31 Jan 7 12:23:35 mshefty-linux3 kernel: [] ptregscall_common+0x67/0xb0 Jan 7 12:23:35 mshefty-linux3 kernel: [] 0xffffffffffffffff Jan 7 12:23:35 mshefty-linux3 kernel: Jan 7 12:23:35 mshefty-linux3 kernel: other info that might help us debug this: Jan 7 12:23:35 mshefty-linux3 kernel: Jan 7 12:23:35 mshefty-linux3 kernel: 1 
lock held by opensm/7164: Jan 7 12:23:35 mshefty-linux3 kernel: #0: (&port->mutex){----}, at: [] ib_umad_close+0x2d/0x100 [ib_umad] Jan 7 12:23:35 mshefty-linux3 kernel: Jan 7 12:23:35 mshefty-linux3 kernel: stack backtrace: Jan 7 12:23:35 mshefty-linux3 kernel: Jan 7 12:23:35 mshefty-linux3 kernel: Call Trace: Jan 7 12:23:35 mshefty-linux3 kernel: [] print_circular_bug_tail+0x85/0x90 Jan 7 12:23:35 mshefty-linux3 kernel: [] print_stack_trace+0x6a/0x80 Jan 7 12:23:35 mshefty-linux3 kernel: [] __lock_acquire+0x612/0x10c0 Jan 7 12:23:35 mshefty-linux3 kernel: [] __lock_acquire+0x8e7/0x10c0 Jan 7 12:23:35 mshefty-linux3 kernel: [] lock_acquire+0x53/0x70 Jan 7 12:23:35 mshefty-linux3 kernel: [] flush_workqueue+0x0/0xa0 Jan 7 12:23:35 mshefty-linux3 kernel: [] _spin_unlock_irqrestore+0x55/0x70 Jan 7 12:23:35 mshefty-linux3 kernel: [] flush_workqueue+0x43/0xa0 Jan 7 12:23:35 mshefty-linux3 kernel: [] :ib_mad:ib_unregister_mad_agent+0x297/0x460 Jan 7 12:23:35 mshefty-linux3 kernel: [] :ib_umad:ib_umad_close+0xbe/0x100 Jan 7 12:23:35 mshefty-linux3 kernel: [] __fput+0x1cb/0x200 Jan 7 12:23:35 mshefty-linux3 kernel: [] filp_close+0x4b/0xa0 Jan 7 12:23:35 mshefty-linux3 kernel: [] put_files_struct+0x70/0xc0 Jan 7 12:23:35 mshefty-linux3 kernel: [] do_exit+0x1d8/0x8d0 Jan 7 12:23:35 mshefty-linux3 kernel: [] __dequeue_signal+0x27/0x1e0 Jan 7 12:23:35 mshefty-linux3 kernel: [] do_group_exit+0x30/0x90 Jan 7 12:23:35 mshefty-linux3 kernel: [] get_signal_to_deliver+0x2fe/0x4f0 Jan 7 12:23:35 mshefty-linux3 kernel: [] do_notify_resume+0xc5/0x750 Jan 7 12:23:35 mshefty-linux3 kernel: [] trace_hardirqs_on_thunk+0x35/0x3a Jan 7 12:23:35 mshefty-linux3 kernel: [] trace_hardirqs_on+0xbf/0x160 Jan 7 12:23:35 mshefty-linux3 kernel: [] sysret_signal+0x21/0x31 Jan 7 12:23:35 mshefty-linux3 kernel: [] ptregscall_common+0x67/0xb0 Jan 7 12:23:35 mshefty-linux3 kernel: From kliteyn at mellanox.co.il Mon Jan 7 17:52:04 2008 From: kliteyn at mellanox.co.il (kliteyn at mellanox.co.il) Date: 8 Jan 
2008 03:52:04 +0200 Subject: [ofa-general] nightly osm_sim report 2008-01-08:normal completion Message-ID: OSM Simulation Regression Summary [Generated mail - please do NOT reply] OpenSM binary date = 2008-01-07 OpenSM git rev = Mon_Jan_7_04:52:47_2008 [7d7dd1e173aad973e5e8fecf3b5a9f67d02ce375] ibutils git rev = Mon_Dec_24_10:42:01_2007 [675bec82306d6920555dd0b5e2f664983e27e60f] Total=400 Pass=398 Fail=2 Pass: 30 Stability IS1-16.topo 30 Pkey IS1-16.topo 30 OsmTest IS1-16.topo 30 OsmStress IS1-16.topo 30 Multicast IS1-16.topo 30 LidMgr IS1-16.topo 10 Stability IS3-loop.topo 10 Stability IS3-128.topo 10 Pkey IS3-128.topo 10 OsmTest IS3-loop.topo 10 OsmTest IS3-128.topo 10 OsmStress IS3-128.topo 10 Multicast IS3-loop.topo 10 Multicast IS3-128.topo 10 FatTree merge-roots-4-ary-2-tree.topo 10 FatTree merge-root-4-ary-3-tree.topo 10 FatTree gnu-stallion-64.topo 10 FatTree blend-4-ary-2-tree.topo 10 FatTree RhinoDDR.topo 10 FatTree FullGnu.topo 10 FatTree 4-ary-2-tree.topo 10 FatTree 2-ary-4-tree.topo 10 FatTree 12-node-spaced.topo 10 FTreeFail 4-ary-2-tree-missing-sw-link.topo 10 FTreeFail 4-ary-2-tree-links-at-same-rank-2.topo 10 FTreeFail 4-ary-2-tree-links-at-same-rank-1.topo 10 FTreeFail 4-ary-2-tree-diff-num-pgroups.topo 8 LidMgr IS3-128.topo Failures: 2 LidMgr IS3-128.topo
From dotanb at dev.mellanox.co.il Mon Jan 7 22:47:13 2008 From: dotanb at dev.mellanox.co.il (Dotan Barak) Date: Tue, 08 Jan 2008 08:47:13 +0200 Subject: [ofa-general] [PATCH] libmlx4: Fix the value of the pkey_index in the completion In-Reply-To: References: <200801061801.25386.dotanb@dev.mellanox.co.il> Message-ID: <47831C71.9040008@dev.mellanox.co.il> Roland Dreier wrote: > > - wc->pkey_index = ntohl(cqe->immed_rss_invalid) >> 16; > > + wc->pkey_index = (uint16_t)(ntohl(cqe->immed_rss_invalid) & 0x7f); > > This is pretty silly. We don't allow userspace to create QP1 anyway, > so is there any point setting the pkey_index field here at all? > > You are absolutely right - QP1 is being used only in kernel level but i wanted to be consistent with the kernel level, so i fixed this line too. I checked this issue and every (user) low level driver library, until today, filled the value of the pkey_index and i didn't want that libmlx4 will be different ... Do you think that this code should be removed from all of the user level libraries?
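For readers following the quoted diff, the difference between the two expressions can be sketched in isolation. This is only an illustration under stated assumptions: the helper names and the `PKEY_INDEX_MASK` constant are hypothetical, the input word is made up, and the only things taken from the patch are the `>> 16` and `& 0x7f` operations applied to a big-endian 32-bit field.

```c
#include <assert.h>
#include <stdint.h>
#include <arpa/inet.h>   /* htonl(), ntohl() */

/* Hypothetical name for the 0x7f mask used in the patch (7-bit index). */
#define PKEY_INDEX_MASK 0x7fu

/* Expression removed by the patch: upper 16 bits of the host-order word. */
static uint16_t pkey_index_old(uint32_t immed_rss_invalid_be)
{
    return (uint16_t)(ntohl(immed_rss_invalid_be) >> 16);
}

/* Expression added by the patch: low 7 bits of the host-order word. */
static uint16_t pkey_index_new(uint32_t immed_rss_invalid_be)
{
    uint16_t idx = (uint16_t)(ntohl(immed_rss_invalid_be) & PKEY_INDEX_MASK);

    /* After masking, the index can never exceed 127. */
    assert(idx < 128);
    return idx;
}
```

With a made-up input word of 0x00120085 (stored big-endian), the old expression yields 0x12 and the new one yields 0x05, so the two read entirely different bits of the field.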
Dotan From dotanb at dev.mellanox.co.il Mon Jan 7 22:55:36 2008 From: dotanb at dev.mellanox.co.il (Dotan Barak) Date: Tue, 08 Jan 2008 08:55:36 +0200 Subject: [ofa-general] Re: [PATCH] mlx4: Fix the value of the pkey_index in the completion In-Reply-To: References: <200801070901.26213.dotanb@dev.mellanox.co.il> Message-ID: <47831E68.9050809@dev.mellanox.co.il> Roland Dreier wrote: > > Without this fix, incoming GSI packets on port 2 gets invalid pkey index in the completion, > > which prevent from the mad layer to send back a response. > > Does this explain the problems that someone saw when using all port 2 > of a connectx setup? > Yes. without this mask the value of the pkey_index was 0x80. In the function build_mlx_header there is a call to ib_get_cached_pkey which failed because the pkey index was out of range (i sent you a patch that check the return value). This failure caused to uninitialized pkey value to be sent in the MAD response .... > Why does it only affect port 2? > > > - wc->pkey_index = be32_to_cpu(cqe->immed_rss_invalid) >> 16; > > + wc->pkey_index = (u16)(be32_to_cpu(cqe->immed_rss_invalid) & 0x7f); > > It seems the (u16) cast here is doubly useless, since wc->pkey_index > is already u16, and we're masking with 0x7f anyway. > If you want i can send a new patch without the casting ... > By the way, will it always work to mask with 0x7f? Or is it possible > that the P_Key table might have more than 128 entries? > Yes, 0x7f should be enough for the ConnectX HCA because it supports 128 pkey entries per port. 
(this line will be written in the next version of the ConnectX PRM) Dotan From vlad at lists.openfabrics.org Tue Jan 8 03:09:44 2008 From: vlad at lists.openfabrics.org (Vladimir Sokolovsky Mellanox) Date: Tue, 8 Jan 2008 03:09:44 -0800 (PST) Subject: [ofa-general] ofa_1_3_kernel 20080108-0200 daily build status Message-ID: <20080108110944.729A1E60042@openfabrics.org> This email was generated automatically, please do not reply git_url: git://git.openfabrics.org/ofed_1_3/linux-2.6.git git_branch: ofed_kernel Common build parameters: --with-ipoib-mod --with-sdp-mod --with-srp-mod --with-user_mad-mod --with-user_access-mod --with-mthca-mod --with-mlx4-mod --with-core-mod --with-addr_trans-mod --with-rds-mod --with-cxgb3-mod --with-nes-mod Passed: Passed on i686 with 2.6.15-23-server Passed on i686 with linux-2.6.18 Passed on i686 with linux-2.6.21.1 Passed on i686 with linux-2.6.22 Passed on i686 with linux-2.6.17 Passed on i686 with linux-2.6.16 Passed on i686 with linux-2.6.14 Passed on i686 with linux-2.6.12 Passed on i686 with linux-2.6.19 Passed on i686 with linux-2.6.13 Passed on i686 with linux-2.6.15 Passed on x86_64 with linux-2.6.22.5-31-default Passed on x86_64 with linux-2.6.22 Passed on x86_64 with linux-2.6.21.1 Passed on ia64 with linux-2.6.19 Passed on ppc64 with linux-2.6.18 Passed on ia64 with linux-2.6.21.1 Passed on x86_64 with linux-2.6.19 Passed on x86_64 with linux-2.6.9-55.ELsmp Passed on x86_64 with linux-2.6.18-1.2798.fc6 Passed on x86_64 with linux-2.6.18 Passed on ia64 with linux-2.6.23 Passed on x86_64 with linux-2.6.17 Passed on ia64 with linux-2.6.18 Passed on powerpc with linux-2.6.13 Passed on x86_64 with linux-2.6.16 Passed on x86_64 with linux-2.6.20 Passed on powerpc with linux-2.6.12 Passed on ia64 with linux-2.6.12 Passed on powerpc with linux-2.6.15 Passed on ppc64 with linux-2.6.14 Passed on ia64 with linux-2.6.17 Passed on powerpc with linux-2.6.14 Passed on x86_64 with linux-2.6.9-42.ELsmp Passed on ia64 with linux-2.6.16 
Passed on ppc64 with linux-2.6.16 Passed on x86_64 with linux-2.6.13 Passed on x86_64 with linux-2.6.12 Passed on ppc64 with linux-2.6.19 Passed on ia64 with linux-2.6.13 Passed on ppc64 with linux-2.6.15 Passed on ppc64 with linux-2.6.17 Passed on ppc64 with linux-2.6.12 Passed on ia64 with linux-2.6.15 Passed on ia64 with linux-2.6.14 Passed on ppc64 with linux-2.6.13 Passed on x86_64 with linux-2.6.14 Passed on x86_64 with linux-2.6.15 Passed on x86_64 with linux-2.6.16.43-0.3-smp Passed on x86_64 with linux-2.6.18-53.el5 Passed on ppc64 with linux-2.6.18-8.el5 Passed on ia64 with linux-2.6.22 Passed on x86_64 with linux-2.6.16.21-0.8-smp Passed on ia64 with linux-2.6.16.21-0.8-default Passed on x86_64 with linux-2.6.18-8.el5 Failed:
From glebn at voltaire.com Tue Jan 8 03:59:11 2008 From: glebn at voltaire.com (Gleb Natapov) Date: Tue, 8 Jan 2008 13:59:11 +0200 Subject: [ofa-general] AF_INET_SDP value In-Reply-To: References: <39C75744D164D948A170E9792AF8E7CA4296C4@exil.voltaire.com> <6.2.0.14.2.20080107070720.0233cdc0@esmail.cup.hp.com> Message-ID: <20080108115910.GW22604@minantech.com> On Mon, Jan 07, 2008 at 01:36:09PM -0800, Jim Mott wrote: > This is indeed how SDP works on Linux. The unmodified binary runs against the libsdp shared library and the right things happen. libsdp opens AF_INET_SDP socket. If the value of AF_INET_SDP is not standard libsdp may work on some OSes and not on others. > > The AF issue comes in because of a requirement (request, desire, misunderstanding, creeping feature?) to be able to create SDP only applications that can bypass the library and run directly against SDP. These applications, for example, will fail if the target system is not running SDP where the library approach silently falls back to TCP. > > While I am not sure of who the non-libsdp consumer of this AF is, I am sure that there is a non-technical problem just defining a new address family. > > The end result is that AF_INET_SDP is not defined in any normal OS place. Maybe this is correct behavior. I could certainly argue both sides. > > > Thanks, > JIm > > Jim Mott > Mellanox Technologies Ltd. > mail: jim at mellanox.com > Phone: 512-294-5481 > > From: Michael Krause [mailto:krause at cup.hp.com] > Sent: Monday, January 07, 2008 9:09 AM > To: Jim Mott; Lenny Verkhovsky; general at lists.openfabrics.org > Subject: RE: [ofa-general] AF_INET_SDP value > > > Technically, there was never a solid technical reason to require a new AF_INET_SDP value since SDP should be transparently interposed underneath a Sockets AF_INET application (the SDP port mapper protocol helps in this regard as well).
The intended reason for SDP in the first place is to enable Sockets-based applications to transparently, i.e. non-modified source and if using shared libraries, non-modified binaries, to take advantage of RDMA interconnects.   This is how it is implemented on Windows and other OS that support SDP or in Window's case, the prior incarnation called Winsocks Direct. > > While making a modification to the address family may seem trivial to most, the simple act of opening up the application source to any change is a major issue to many enterprise customers.   Given SDP adoption is nascent and there are competing approaches to protocol acceleration technology coming to market or being explored as well as a lot of unfortunate marketing FUD, the developers might want to think about what it would take to support SDP as originally intended by the IBTA and IETF. > > Mike > > > At 10:21 AM 1/6/2008, Jim Mott wrote: > > Content-class: urn:content-classes:message > Content-Type: multipart/alternative; >          boundary="----_=_NextPart_001_01C85090.FC407BBA" > > I do not believe so.  There are some politics involved.  This value is shipped as part of the user space libsdp code.  Perhaps someone that knows  more history on this can comment? >   > From: general-bounces at lists.openfabrics.org [ mailto:general-bounces at lists.openfabrics.org] On Behalf Of Lenny Verkhovsky > Sent: Sunday, January 06, 2008 10:24 AM > To: general at lists.openfabrics.org > Subject: [ofa-general] AF_INET_SDP value >   > Hi, >   > Is AF_INET_SDP equals 27 is standartized for all architectures and kernels ? >   > Best Regards, > Lenny. 
> > _______________________________________________ > general mailing list > general at lists.openfabrics.org > http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general > > To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general > _______________________________________________ > general mailing list > general at lists.openfabrics.org > http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general > > To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general -- Gleb. From jim at mellanox.com Tue Jan 8 04:53:40 2008 From: jim at mellanox.com (Jim Mott) Date: Tue, 8 Jan 2008 04:53:40 -0800 Subject: [ofa-general] AF_INET_SDP value In-Reply-To: <20080108115910.GW22604@minantech.com> References: <39C75744D164D948A170E9792AF8E7CA4296C4@exil.voltaire.com> <6.2.0.14.2.20080107070720.0233cdc0@esmail.cup.hp.com> <20080108115910.GW22604@minantech.com> Message-ID: The value of AF_INET_SDP is defined in the shipped file sdp_socket.h.
It is used by the kernel module build, the libsdp build, and applications that decide to explicitly use SDP. Since we build the module and libsdp on the target system, things will work. The problem is that the value selected for AF_INET_SDP might someday conflict with a real standard address family definition. When this happens, something (SDP or a standard AF) will be broken. JIm -----Original Message----- From: general-bounces at lists.openfabrics.org [mailto:general-bounces at lists.openfabrics.org] On Behalf Of Gleb Natapov Sent: Tuesday, January 08, 2008 5:59 AM To: Jim Mott Cc: Lenny Verkhovsky; general at lists.openfabrics.org Subject: Re: [ofa-general] AF_INET_SDP value On Mon, Jan 07, 2008 at 01:36:09PM -0800, Jim Mott wrote: > This is indeed how SDP works on Linux. The unmodified binary runs against the libsdp shared library and the right things happen. libsdp opens AF_INET_SDP socket. If the value of AF_INET_SDP is not standard libsdp may work on some OSes and not on others. > > The AF issue comes in because of a requirement (request, desire, misunderstanding, creeping feature?) to be able to create SDP only applications that can bypass the library and run directly against SDP. These applications, for example, will fail if the target system is not running SDP where the library approach silently falls back to TCP. > > While I am not sure of who the non-libsdp consumer of this AF is, I am sure that there is a non-technical problem just defining a new address family. > > The end result is that AF_INET_SDP is not defined in any normal OS place. Maybe this is correct behavior. I could certainly argue both sides. > > > Thanks, > JIm > > Jim Mott > Mellanox Technologies Ltd. 
> mail: jim at mellanox.com > Phone: 512-294-5481 > > From: Michael Krause [mailto:krause at cup.hp.com] > Sent: Monday, January 07, 2008 9:09 AM > To: Jim Mott; Lenny Verkhovsky; general at lists.openfabrics.org > Subject: RE: [ofa-general] AF_INET_SDP value > > > Technically, there was never a solid technical reason to require a new AF_INET_SDP value since SDP should be transparently interposed underneath a Sockets AF_INET application (the SDP port mapper protocol helps in this regard as well).  The intended reason for SDP in the first place is to enable Sockets-based applications to transparently, i.e. non-modified source and if using shared libraries, non-modified binaries, to take advantage of RDMA interconnects.   This is how it is implemented on Windows and other OS that support SDP or in Window's case, the prior incarnation called Winsocks Direct. > > While making a modification to the address family may seem trivial to most, the simple act of opening up the application source to any change is a major issue to many enterprise customers.   Given SDP adoption is nascent and there are competing approaches to protocol acceleration technology coming to market or being explored as well as a lot of unfortunate marketing FUD, the developers might want to think about what it would take to support SDP as originally intended by the IBTA and IETF. > > Mike > > > At 10:21 AM 1/6/2008, Jim Mott wrote: > > Content-class: urn:content-classes:message > Content-Type: multipart/alternative; >          boundary="----_=_NextPart_001_01C85090.FC407BBA" > > I do not believe so.  There are some politics involved.  This value is shipped as part of the user space libsdp code.  Perhaps someone that knows  more history on this can comment? 
>   > From: general-bounces at lists.openfabrics.org [ mailto:general-bounces at lists.openfabrics.org] On Behalf Of Lenny Verkhovsky > Sent: Sunday, January 06, 2008 10:24 AM > To: general at lists.openfabrics.org > Subject: [ofa-general] AF_INET_SDP value >   > Hi, >   > Is AF_INET_SDP equals 27 is standartized for all architectures and kernels ? >   > Best Regards, > Lenny. >   > _______________________________________________ > general mailing list > general at lists.openfabrics.org > http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general > > To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general > _______________________________________________ > general mailing list > general at lists.openfabrics.org > http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general > > To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general -- Gleb. _______________________________________________ general mailing list general at lists.openfabrics.org http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general From kliteyn at dev.mellanox.co.il Tue Jan 8 05:33:13 2008 From: kliteyn at dev.mellanox.co.il (Yevgeny Kliteynik) Date: Tue, 08 Jan 2008 15:33:13 +0200 Subject: [ofa-general] Re: CMA can't establish connection with QoS on In-Reply-To: <477D184D.8020300@ichips.intel.com> References: <47600070.8050008@dev.mellanox.co.il> <000001c83ced$2149f5b0$ff0da8c0@amr.corp.intel.com> <47605620.3070105@dev.mellanox.co.il> <47608BE4.7020209@ichips.intel.com> <4761315A.1070306@dev.mellanox.co.il> <477CF8EF.5010307@dev.mellanox.co.il> <477D184D.8020300@ichips.intel.com> Message-ID: <47837B99.2050508@dev.mellanox.co.il> Sean, I updated the bug with the step-by-step instructions how to burn the FW and reproduce the error. I compiled this "how-to" today, so everything there is up to date. The instructions how to burn fw assume that you have mlxburn. 
If it's not the case, I can help you to save time by avoiding installation of it - just send me your ConnectX board ID and I'll prepare a QoS-enabled FW image. To get the board id run this command: "ibv_devinfo | grep board_id" -- Yevgeny Sean Hefty wrote: >> Did you get a chance to look at this issue? >> >> https://bugs.openfabrics.org/show_bug.cgi?id=821 > > I had to build a couple of systems up to test this, but I wasn't able to > reproduce the error before going on vacation. I kept getting a path > record query error reported by the rdma_cm, so something must be off > with my configuration. I'll continue to look at it. > > - Sean > From ogerlitz at voltaire.com Tue Jan 8 05:34:50 2008 From: ogerlitz at voltaire.com (Or Gerlitz) Date: Tue, 08 Jan 2008 15:34:50 +0200 Subject: [ofa-general] Re: librdmacm/man: fix-up man pages In-Reply-To: <475FD984.6080203@voltaire.com> References: <000101c81a64$3582de80$9c98070a@amr.corp.intel.com> <4726EEAC.3070105@voltaire.com> <472755C4.10600@ichips.intel.com> <47285F53.4060402@voltaire.com> <4728BF4A.1060301@ichips.intel.com> <15ddcffd0710311320v6b91b3cm3be0f7882e30ad2b@mail.gmail.com> <000001c81cb5$4ce12160$9c98070a@amr.corp.intel.com> <15ddcffd0711270435t12a18dc3waac2596b3884ac72@mail.gmail.com> <000001c8311a$176cdbe0$63248686@amr.corp.intel.com> <15ddcffd0711280307u7a89c6c2q2854b071f74d9123@mail.gmail.com> <000801c832b6$81feb850$f5d8180a@amr.corp.intel.com> <475FD984.6080203@voltaire.com> Message-ID: <47837BFA.7040402@voltaire.com> Or Gerlitz wrote: > Sean Hefty wrote: >> These have been updated and pushed upstream. Please let me know if >> you're aware of any other documentation changes. > OK, here's some feedback on the documentation (man pages) Hi Sean, I noted that some of the changes I suggested were not incorporated into the man pages as you released them with 1.0.5, so are they still in the queue or you have decided not to go for them? Or. > 1. 
for rdma_disconnect - mention that > - it applies only to connected service > - the QP is moved to the error state and following that all the posted > work requests will be flushed to the completion queue. > 2. for rdma_join_multicast > - mention that as with unicast, if source address is provided to > rdma_resolve_addr then the routing table need not be set to route this > group to an ipoib device From swelch at systemfabricworks.com Tue Jan 8 07:10:21 2008 From: swelch at systemfabricworks.com (Steve Welch) Date: Tue, 8 Jan 2008 09:10:21 -0600 Subject: [ofa-general] Verb Status Message-ID: <60F3A3B2-FBE7-41D6-89D8-0BF70591A59D@systemfabricworks.com> Hi Pramod, Things are continuing to advance and a new code drop should be in process. For the verbs: Item 2.2, 2.4, and 2.5 are code complete and under test. Item 2.6 is now 50% coded and Item 2.7 40% coded. For non-error paths I can create and ramp up a UD queue pair. I'm able to create a AH and post a UD send via OS Bypass and can poll for the completion of that send. Currently the hardware processing of the send completes in error and it looks to be a local QP problem most likely related to QP attributes; but I should be able to find that fairly easily. The good news is that the work request is successfully posted from user space, Tavor reads and processes the WQE, writes the error completion CQE to memory, and then this CQE is correctly read via the user space library. Talk to you later, Steve From bart.vanassche at gmail.com Tue Jan 8 07:21:24 2008 From: bart.vanassche at gmail.com (Bart Van Assche) Date: Tue, 8 Jan 2008 16:21:24 +0100 Subject: [ofa-general] OFED and Ubuntu Linux Message-ID: Hello, Is the OFED software stack supported on Ubuntu Linux ? 
While OFED 1.2.5.4 compiles fine on Ubuntu Linux 7.10, I got the following errors while installing the RPM's: root at vanasscb-linux:/home/vanasscb/software/OFED-1.2.5.4/RPMS# rpm --nodeps -U *.rpm /var/tmp/rpm-tmp.69098: line 7: /sbin/lspci: No such file or directory /var/tmp/rpm-tmp.69098: line 10: /sbin/lspci: No such file or directory /var/tmp/rpm-tmp.69098: line 13: /sbin/lspci: No such file or directory root at vanasscb-linux:/home/vanasscb/software/OFED-1.2.5.4/RPMS# type lspci lspci is /usr/bin/lspci Regards, Bart Van Assche. From sashak at voltaire.com Tue Jan 8 07:49:47 2008 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Tue, 8 Jan 2008 15:49:47 +0000 Subject: [ofa-general] Re: [PATCH RFC] opensm/ib_types.h: remove ifdef WIN conditions In-Reply-To: <1199718196.20870.104.camel@hrosenstock-ws.xsigo.com> References: <20080106154328.GA26304@sashak.voltaire.com> <1199718196.20870.104.camel@hrosenstock-ws.xsigo.com> Message-ID: <20080108154947.GP26304@sashak.voltaire.com> Hi Hal, On 07:03 Mon 07 Jan , Hal Rosenstock wrote: > On Sun, 2008-01-06 at 15:43 +0000, Sasha Khapyorsky wrote: > > It was stated couple of times that in windows another instance of > > ib_types.h file is used. If so we don't need to keep those 'ifdef WIN' > > conditions here. Also this removes empty __ptr64 macro. > > Shouldn't this also be sent to ofw for comments ? I will forward this to ofw for sure, but don't expect a lot of interest - if it is not used there. > Also, ib_cm_types.h looks like it should be changed as well in terms of > this. This file seems to be unused at all, likely we could just remove it. > Since master and ofed_1_3 are no longer identical, please indicate for > which branch(es) patches are intended. As usual everything is for the master if otherwise is not indicated. 
Sasha From hrosenstock at xsigo.com Tue Jan 8 07:57:31 2008 From: hrosenstock at xsigo.com (Hal Rosenstock) Date: Tue, 08 Jan 2008 07:57:31 -0800 Subject: [ofa-general] Re: [PATCH RFC] opensm/ib_types.h: remove ifdef WIN conditions In-Reply-To: <20080108154947.GP26304@sashak.voltaire.com> References: <20080106154328.GA26304@sashak.voltaire.com> <1199718196.20870.104.camel@hrosenstock-ws.xsigo.com> <20080108154947.GP26304@sashak.voltaire.com> Message-ID: <1199807851.25609.55.camel@hrosenstock-ws.xsigo.com> Sasha, On Tue, 2008-01-08 at 15:49 +0000, Sasha Khapyorsky wrote: > Hi Hal, > > On 07:03 Mon 07 Jan , Hal Rosenstock wrote: > > On Sun, 2008-01-06 at 15:43 +0000, Sasha Khapyorsky wrote: > > > It was stated couple of times that in windows another instance of > > > ib_types.h file is used. If so we don't need to keep those 'ifdef WIN' > > > conditions here. Also this removes empty __ptr64 macro. > > > > Shouldn't this also be sent to ofw for comments ? > > I will forward this to ofw for sure, but don't expect a lot of interest > - if it is not used there. I think it is a courtesy for ofw to do this in case they don't follow the general list as this is removing some of their definitions. > > Also, ib_cm_types.h looks like it should be changed as well in terms of > > this. > > This file seems to be unused at all, likely we could just remove it. This was done for an out of tree consumer a while ago (those definitions were originally in ib_types.h). There was a thread on general a while ago on this and I think a closed bug too. I can dig it out if you can't. -- Hal > > Since master and ofed_1_3 are no longer identical, please indicate for > > which branch(es) patches are intended. > > As usual everything is for the master if otherwise is not indicated. 
> Sasha From sashak at voltaire.com Tue Jan 8 08:20:36 2008 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Tue, 8 Jan 2008 16:20:36 +0000 Subject: [ofa-general] [PATCH 1/3] libvendor: osm_vendor_get_all_port_attr() rework In-Reply-To: <1199718210.20870.105.camel@hrosenstock-ws.xsigo.com> References: <11951179291471-git-send-email-sashak@voltaire.com> <11951179291903-git-send-email-sashak@voltaire.com> <1199718210.20870.105.camel@hrosenstock-ws.xsigo.com> Message-ID: <20080108162036.GR26304@sashak.voltaire.com> On 07:03 Mon 07 Jan , Hal Rosenstock wrote: > Sasha, > > On Thu, 2007-11-15 at 11:12 +0200, Sasha Khapyorsky wrote: > > It fixes couple of issues with this function: > > > > - return only valid guids, don't return duplicated entries > > What entries were duplicated ? At index zero of the array. > I think there may be a subtle "API" change in that the ib_port_attr_t > array filled in no longer has (or properly calculates the "best" port). > Not sure if it is this change or some other change which causes this. I think it is a just a fix and API is untouched there. 
Sasha From hrosenstock at xsigo.com Tue Jan 8 08:22:05 2008 From: hrosenstock at xsigo.com (Hal Rosenstock) Date: Tue, 08 Jan 2008 08:22:05 -0800 Subject: [ofa-general] [PATCH 1/3] libvendor: osm_vendor_get_all_port_attr() rework In-Reply-To: <20080108162036.GR26304@sashak.voltaire.com> References: <11951179291471-git-send-email-sashak@voltaire.com> <11951179291903-git-send-email-sashak@voltaire.com> <1199718210.20870.105.camel@hrosenstock-ws.xsigo.com> <20080108162036.GR26304@sashak.voltaire.com> Message-ID: <1199809325.25609.72.camel@hrosenstock-ws.xsigo.com> Sasha, On Tue, 2008-01-08 at 16:20 +0000, Sasha Khapyorsky wrote: > On 07:03 Mon 07 Jan , Hal Rosenstock wrote: > > Sasha, > > > > On Thu, 2007-11-15 at 11:12 +0200, Sasha Khapyorsky wrote: > > > It fixes couple of issues with this function: > > > > > > - return only valid guids, don't return duplicated entries > > > > What entries were duplicated ? > > At index zero of the array. That was the "best" port (and was intentionally duplicated). > > I think there may be a subtle "API" change in that the ib_port_attr_t > > array filled in no longer has (or properly calculates the "best" port). > > Not sure if it is this change or some other change which causes this. > > I think it is a just a fix and API is untouched there. Changing index 0 of the array is a subtle API change (and it affected consumers which use this). 
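The disagreement above (is dropping the duplicated entry a fix or an API change?) can be illustrated with a toy model. This is a sketch under loud assumptions, not the libvendor code: `struct port_attr` is a hypothetical stand-in for `ib_port_attr_t`, all function names are invented, and how the "best" port is chosen is left to the caller because the thread does not say.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for ib_port_attr_t; only the GUID is modeled. */
struct port_attr {
    uint64_t port_guid;
};

/*
 * Layout with the duplicated entry, as described in the thread:
 * slot 0 repeats the "best" port, then every port follows.
 * 'best' is an index chosen by the caller (selection rule not modeled).
 */
static size_t fill_with_default_slot(const struct port_attr *ports, size_t n,
                                     size_t best, struct port_attr *out,
                                     size_t cap)
{
    size_t count = 0;

    assert(n == 0 || best < n);
    if (n > 0 && count < cap)
        out[count++] = ports[best];
    for (size_t i = 0; i < n && count < cap; i++)
        out[count++] = ports[i];
    return count;
}

/* Layout without the duplicated entry: each port appears exactly once. */
static size_t fill_without_default_slot(const struct port_attr *ports,
                                        size_t n, struct port_attr *out,
                                        size_t cap)
{
    size_t count = 0;

    for (size_t i = 0; i < n && count < cap; i++)
        out[count++] = ports[i];
    return count;
}

/* A consumer that treats slot 0 as the default port. */
static uint64_t default_port_guid(const struct port_attr *arr, size_t count)
{
    return count > 0 ? arr[0].port_guid : 0;
}
```

A consumer written like `default_port_guid()` silently picks a different port once the duplicate disappears, which is the kind of breakage an index-0 convention can cause even when the function signature is unchanged.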
-- Hal > Sasha > _______________________________________________ > general mailing list > general at lists.openfabrics.org > http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general > > To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general From sashak at voltaire.com Tue Jan 8 08:34:49 2008 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Tue, 8 Jan 2008 16:34:49 +0000 Subject: [ofa-general] Re: [PATCH] opensm/osm_qos_policy.c: trivial fix in passing wrong pointer In-Reply-To: <478241B4.1050209@dev.mellanox.co.il> References: <478241B4.1050209@dev.mellanox.co.il> Message-ID: <20080108163448.GS26304@sashak.voltaire.com> On 17:13 Mon 07 Jan , Yevgeny Kliteynik wrote: > st_lookup() returned node in the p_node pointer and replaced > the node that was intended to be inserted into the queue, > which caused infinite loop. > Besides, if st_lookup() does finds an element in the hash, > we're only interested to know that it did - don't need the > actual element. > > Please apply to ofed_1_3 and master. > > Signed-off-by: Yevgeny Kliteynik Applied. Thanks. Sasha From tziporet at mellanox.co.il Tue Jan 8 08:24:20 2008 From: tziporet at mellanox.co.il (Tziporet Koren) Date: Tue, 8 Jan 2008 18:24:20 +0200 Subject: [ofa-general] OFED Jan-07, 2008 meeting summary on readiness toward RC2 Message-ID: <6C2C79E72C305246B504CBA17B5500C9030F0B46@mtlexch01.mtl.com> OFED Jan-07, 2008 meeting summary on readiness toward RC2 1. Release status: * In general there are no major issues - testing continue at all companies * There is a wide coverage of platform and OSes 2. Tasks that should be completed for RC2: * XRC - enhanced API - will be ready by next week * IPoIB performance improvements for small messages - at least some of the changes will be integrated * Open MPI 1.2.5-rc2 - will be ready by next week * Qlogic new driver - done 3. 
Agree on new schedule for the release: * RC2: Jan 15, 2008 * RC3: Jan 29, 2008 * RC4: Feb 12, 2008 * Release: Feb 19, 2008 If we will see that RC3 is stable enough we will try to pull-in And in any case we do not want to delay the release any more 4. Review critical and major bugs: 750 critical raisch at de.ibm.com Problem with modprobe ib_ehca with older kernel versions - probably fixed 760 major eli at mellanox.co.il UDP performance on Rx is lower than Tx - related to IPoIB above 761 major eli at mellanox.co.il Poor and jittery UDP performance at small messages - related to IPoIB above 820 major pasha at mellanox.co.il rpm 4.4.2.2, Binary file matches Binary file - patch was sent by OSU will be incorporated by Pasha 800 major perkinjo at cse.ohio-state.edu MVAPICH2 compile error on PPC64 - fixed 736 major rolandd at cisco.com IBV_WC_RETRY_EXC_ERR errors with local rdma_reads - Need Arlin to retest with new FW 767 major swise at opengridcomputing.com Non backport Kernels that don't build in genalloc compile errors for cxgb3 - not a major issue (will be in RN) *Important: All people requested to review the open bugs and update status* Tziporet From sashak at voltaire.com Tue Jan 8 08:38:41 2008 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Tue, 8 Jan 2008 16:38:41 +0000 Subject: [ofa-general] Re: [PATCH RFC] opensm/ib_types.h: remove ifdef WIN conditions In-Reply-To: <1199807851.25609.55.camel@hrosenstock-ws.xsigo.com> References: <20080106154328.GA26304@sashak.voltaire.com> <1199718196.20870.104.camel@hrosenstock-ws.xsigo.com> <20080108154947.GP26304@sashak.voltaire.com> <1199807851.25609.55.camel@hrosenstock-ws.xsigo.com> Message-ID: <20080108163841.GT26304@sashak.voltaire.com> On 07:57 Tue 08 Jan , Hal Rosenstock wrote: > > > Also, ib_cm_types.h looks like it should be changed as well in terms of > > > this. > > > > This file seems to be unused at all, likely we could just remove it. 
> > This was done for an out of tree consumer a while ago (those definitions > were originally in ib_types.h). There was a thread on general a while > ago on this and I think a closed bug too. I can dig it out if you can't. I remember at least discussion in bag tracker. So what is you opinion? Are you against removing? Sasha From hrosenstock at xsigo.com Tue Jan 8 08:32:56 2008 From: hrosenstock at xsigo.com (Hal Rosenstock) Date: Tue, 08 Jan 2008 08:32:56 -0800 Subject: [ofa-general] Re: [PATCH RFC] opensm/ib_types.h: remove ifdef WIN conditions In-Reply-To: <20080108163841.GT26304@sashak.voltaire.com> References: <20080106154328.GA26304@sashak.voltaire.com> <1199718196.20870.104.camel@hrosenstock-ws.xsigo.com> <20080108154947.GP26304@sashak.voltaire.com> <1199807851.25609.55.camel@hrosenstock-ws.xsigo.com> <20080108163841.GT26304@sashak.voltaire.com> Message-ID: <1199809976.25609.76.camel@hrosenstock-ws.xsigo.com> On Tue, 2008-01-08 at 16:38 +0000, Sasha Khapyorsky wrote: > On 07:57 Tue 08 Jan , Hal Rosenstock wrote: > > > > Also, ib_cm_types.h looks like it should be changed as well in terms of > > > > this. > > > > > > This file seems to be unused at all, likely we could just remove it. > > > > This was done for an out of tree consumer a while ago (those definitions > > were originally in ib_types.h). There was a thread on general a while > > ago on this and I think a closed bug too. I can dig it out if you can't. > > I remember at least discussion in bag tracker. So what is you opinion? I would do the same thing here as being done with ib_types.h (and remove the WIN declarations) assuming no objections on this in general (e.g. on ib_types.h). > Are you against removing? Yes. 
-- Hal > Sasha From prescott at hpc.ufl.edu Tue Jan 8 08:32:48 2008 From: prescott at hpc.ufl.edu (Craig Prescott) Date: Tue, 08 Jan 2008 11:32:48 -0500 Subject: [ofa-general] SDP and iWARP Message-ID: <4783A5B0.6040603@hpc.ufl.edu> Hi; Before the holidays, I tried a netperf SDP_STREAM test between a pair of Chelsio S310-SR iWARP cards using OFED 1.2, which crashed the netperf client host. Using OFED-1.3-20080107-0942 on both hosts the crash no longer happens, which is great, but it isn't working quite yet: [root at tebow1 ~]# LD_PRELOAD=/usr/lib64/libsdp.so /opt/netperf/bin/netperf -H a.b.c.y -L a.b.c.x -t SDP_STREAM -c -C -l 1 SDP STREAM TEST from a.b.c.x (a.b.c.x) port 0 AF_INET to a.b.c.y (a.b.c.y) port 0 AF_INET recv_response: Connection reset by peer Increasing the libsdp log level, I can see that the there is an SDP listener on the server, but the client has this: Tue Jan 8 11:24:33 2008 netperf[5055] libsdp CONNECT: connecting SDP fd:6 Tue Jan 8 11:24:43 2008 netperf[5055] libsdp Error connect: failed for SDP fd:6 with error:Network is unreachable Without libsdp preloaded, the TCP_STREAM test works perfectly (sanity). Any ideas? 
Thanks in advance, Craig From sashak at voltaire.com Tue Jan 8 09:00:41 2008 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Tue, 8 Jan 2008 17:00:41 +0000 Subject: [ofa-general] [PATCH 1/3] libvendor: osm_vendor_get_all_port_attr() rework In-Reply-To: <1199809325.25609.72.camel@hrosenstock-ws.xsigo.com> References: <11951179291471-git-send-email-sashak@voltaire.com> <11951179291903-git-send-email-sashak@voltaire.com> <1199718210.20870.105.camel@hrosenstock-ws.xsigo.com> <20080108162036.GR26304@sashak.voltaire.com> <1199809325.25609.72.camel@hrosenstock-ws.xsigo.com> Message-ID: <20080108170041.GU26304@sashak.voltaire.com> On 08:22 Tue 08 Jan , Hal Rosenstock wrote: > Sasha, > > On Tue, 2008-01-08 at 16:20 +0000, Sasha Khapyorsky wrote: > > On 07:03 Mon 07 Jan , Hal Rosenstock wrote: > > > Sasha, > > > > > > On Thu, 2007-11-15 at 11:12 +0200, Sasha Khapyorsky wrote: > > > > It fixes couple of issues with this function: > > > > > > > > - return only valid guids, don't return duplicated entries > > > > > > What entries were duplicated ? > > > > At index zero of the array. > > That was the "best" port (and was intentionally duplicated). Yes (not in general, but only for CAs) and in this sense it was "API" violation (following the name and the description of function osm_vendor_get_all_port_attr()), also seems another vendor layers never supported this (which was resulted in having four ugly #ifdef OSM_VENDOR_INTF_OPENIB in OpenSM port selection code). > > > I think there may be a subtle "API" change in that the ib_port_attr_t > > > array filled in no longer has (or properly calculates the "best" port). > > > Not sure if it is this change or some other change which causes this. > > > > I think it is a just a fix and API is untouched there. > > Changing index 0 of the array is a subtle API change (and it affected > consumers which use this). 
I checked an open source consumers available at OFA (ibutils, opensm) - nobody used this (OpenSM had '#ifdef OSM_VENDOR_INTF_OPENIB' just in order to ignore the index '0' of the array). Sasha From sashak at voltaire.com Tue Jan 8 09:04:26 2008 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Tue, 8 Jan 2008 17:04:26 +0000 Subject: [ofa-general] Re: [PATCH RFC] opensm/ib_types.h: remove ifdef WIN conditions In-Reply-To: <1199809976.25609.76.camel@hrosenstock-ws.xsigo.com> References: <20080106154328.GA26304@sashak.voltaire.com> <1199718196.20870.104.camel@hrosenstock-ws.xsigo.com> <20080108154947.GP26304@sashak.voltaire.com> <1199807851.25609.55.camel@hrosenstock-ws.xsigo.com> <20080108163841.GT26304@sashak.voltaire.com> <1199809976.25609.76.camel@hrosenstock-ws.xsigo.com> Message-ID: <20080108170426.GV26304@sashak.voltaire.com> On 08:32 Tue 08 Jan , Hal Rosenstock wrote: > On Tue, 2008-01-08 at 16:38 +0000, Sasha Khapyorsky wrote: > > On 07:57 Tue 08 Jan , Hal Rosenstock wrote: > > > > > Also, ib_cm_types.h looks like it should be changed as well in terms of > > > > > this. > > > > > > > > This file seems to be unused at all, likely we could just remove it. > > > > > > This was done for an out of tree consumer a while ago (those definitions > > > were originally in ib_types.h). There was a thread on general a while > > > ago on this and I think a closed bug too. I can dig it out if you can't. > > > > I remember at least discussion in bag tracker. So what is you opinion? > > I would do the same thing here as being done with ib_types.h (and remove > the WIN declarations) assuming no objections on this in general (e.g. on > ib_types.h). > > > Are you against removing? > > Yes. Why? Does anybody this file? 
Sasha From hrosenstock at xsigo.com Tue Jan 8 08:58:47 2008 From: hrosenstock at xsigo.com (Hal Rosenstock) Date: Tue, 08 Jan 2008 08:58:47 -0800 Subject: [ofa-general] [PATCH 1/3] libvendor: osm_vendor_get_all_port_attr() rework In-Reply-To: <20080108170041.GU26304@sashak.voltaire.com> References: <11951179291471-git-send-email-sashak@voltaire.com> <11951179291903-git-send-email-sashak@voltaire.com> <1199718210.20870.105.camel@hrosenstock-ws.xsigo.com> <20080108162036.GR26304@sashak.voltaire.com> <1199809325.25609.72.camel@hrosenstock-ws.xsigo.com> <20080108170041.GU26304@sashak.voltaire.com> Message-ID: <1199811527.25609.91.camel@hrosenstock-ws.xsigo.com> On Tue, 2008-01-08 at 17:00 +0000, Sasha Khapyorsky wrote: > On 08:22 Tue 08 Jan , Hal Rosenstock wrote: > > Sasha, > > > > On Tue, 2008-01-08 at 16:20 +0000, Sasha Khapyorsky wrote: > > > On 07:03 Mon 07 Jan , Hal Rosenstock wrote: > > > > Sasha, > > > > > > > > On Thu, 2007-11-15 at 11:12 +0200, Sasha Khapyorsky wrote: > > > > > It fixes couple of issues with this function: > > > > > > > > > > - return only valid guids, don't return duplicated entries > > > > > > > > What entries were duplicated ? > > > > > > At index zero of the array. > > > > That was the "best" port (and was intentionally duplicated). > > Yes (not in general, but only for CAs) and in this sense it was "API" > violation (following the name and the description of function > osm_vendor_get_all_port_attr()), also seems another vendor layers never > supported this (which was resulted in having four ugly > #ifdef OSM_VENDOR_INTF_OPENIB in OpenSM port selection code). > > > > > I think there may be a subtle "API" change in that the ib_port_attr_t > > > > array filled in no longer has (or properly calculates the "best" port). > > > > Not sure if it is this change or some other change which causes this. > > > > > > I think it is a just a fix and API is untouched there. 
> > > > Changing index 0 of the array is a subtle API change (and it affected > > consumers which use this). > > I checked an open source consumers available at OFA (ibutils, opensm) - > nobody used this (OpenSM had '#ifdef OSM_VENDOR_INTF_OPENIB' just in > order to ignore the index '0' of the array). Yes, but it goes further than checking in tree consumers and not everyone is paying attention all the time or running complete regressions frequently so this wasn't found until recently. This change broke autoselection on a machine running OpenSM with a combination of IB and iWARP adapters as it selected an iWARP adapter and exited. If we care about continuing to support this feature, I suppose code might be able to be added to OpenSM main.c to handle this rather than it being in a lower layer as it was before. -- Hal > Sasha > _______________________________________________ > general mailing list > general at lists.openfabrics.org > http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general > > To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general From hrosenstock at xsigo.com Tue Jan 8 09:00:36 2008 From: hrosenstock at xsigo.com (Hal Rosenstock) Date: Tue, 08 Jan 2008 09:00:36 -0800 Subject: [ofa-general] Re: [PATCH RFC] opensm/ib_types.h: remove ifdef WIN conditions In-Reply-To: <20080108170426.GV26304@sashak.voltaire.com> References: <20080106154328.GA26304@sashak.voltaire.com> <1199718196.20870.104.camel@hrosenstock-ws.xsigo.com> <20080108154947.GP26304@sashak.voltaire.com> <1199807851.25609.55.camel@hrosenstock-ws.xsigo.com> <20080108163841.GT26304@sashak.voltaire.com> <1199809976.25609.76.camel@hrosenstock-ws.xsigo.com> <20080108170426.GV26304@sashak.voltaire.com> Message-ID: <1199811636.25609.92.camel@hrosenstock-ws.xsigo.com> On Tue, 2008-01-08 at 17:04 +0000, Sasha Khapyorsky wrote: > On 08:32 Tue 08 Jan , Hal Rosenstock wrote: > > On Tue, 2008-01-08 at 16:38 +0000, Sasha Khapyorsky wrote: > > > On 07:57 Tue 08 Jan , Hal 
Rosenstock wrote: > > > > > > Also, ib_cm_types.h looks like it should be changed as well in terms of > > > > > > this. > > > > > > > > > > This file seems to be unused at all, likely we could just remove it. > > > > > > > > This was done for an out of tree consumer a while ago (those definitions > > > > were originally in ib_types.h). There was a thread on general a while > > > > ago on this and I think a closed bug too. I can dig it out if you can't. > > > > > > I remember at least discussion in bag tracker. So what is you opinion? > > > > I would do the same thing here as being done with ib_types.h (and remove > > the WIN declarations) assuming no objections on this in general (e.g. on > > ib_types.h). > > > > > Are you against removing? > > > > Yes. > > Why? Does anybody this file? I don't know for sure but the change was requested a while ago and I don't see what's so important in removing it. -- Hal > Sasha From sashak at voltaire.com Tue Jan 8 09:21:20 2008 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Tue, 8 Jan 2008 17:21:20 +0000 Subject: [ofa-general] Re: [PATCH RFC] opensm/ib_types.h: remove ifdef WIN conditions In-Reply-To: <1199811636.25609.92.camel@hrosenstock-ws.xsigo.com> References: <20080106154328.GA26304@sashak.voltaire.com> <1199718196.20870.104.camel@hrosenstock-ws.xsigo.com> <20080108154947.GP26304@sashak.voltaire.com> <1199807851.25609.55.camel@hrosenstock-ws.xsigo.com> <20080108163841.GT26304@sashak.voltaire.com> <1199809976.25609.76.camel@hrosenstock-ws.xsigo.com> <20080108170426.GV26304@sashak.voltaire.com> <1199811636.25609.92.camel@hrosenstock-ws.xsigo.com> Message-ID: <20080108172120.GW26304@sashak.voltaire.com> On 09:00 Tue 08 Jan , Hal Rosenstock wrote: > On Tue, 2008-01-08 at 17:04 +0000, Sasha Khapyorsky wrote: > > On 08:32 Tue 08 Jan , Hal Rosenstock wrote: > > > On Tue, 2008-01-08 at 16:38 +0000, Sasha Khapyorsky wrote: > > > > On 07:57 Tue 08 Jan , Hal Rosenstock wrote: > > > > > > > Also, ib_cm_types.h looks like it 
should be changed as well in terms of > > > > > > > this. > > > > > > > > > > > > This file seems to be unused at all, likely we could just remove it. > > > > > > > > > > This was done for an out of tree consumer a while ago (those definitions > > > > > were originally in ib_types.h). There was a thread on general a while > > > > > ago on this and I think a closed bug too. I can dig it out if you can't. > > > > > > > > I remember at least discussion in bag tracker. So what is you opinion? > > > > > > I would do the same thing here as being done with ib_types.h (and remove > > > the WIN declarations) assuming no objections on this in general (e.g. on > > > ib_types.h). > > > > > > > Are you against removing? > > > > > > Yes. > > > > Why? Does anybody this file? > > I don't know for sure but the change was requested a while ago As far as I remember the request was to remove CM_* definitions from ib_types.h. > and I > don't see what's so important in removing it. It is easily to remove rather than cleanup an unneeded stuff. Sasha From swise at opengridcomputing.com Tue Jan 8 09:33:41 2008 From: swise at opengridcomputing.com (Steve Wise) Date: Tue, 08 Jan 2008 11:33:41 -0600 Subject: [ofa-general] SDP and iWARP In-Reply-To: <4783A5B0.6040603@hpc.ufl.edu> References: <4783A5B0.6040603@hpc.ufl.edu> Message-ID: <4783B3F5.20600@opengridcomputing.com> Craig Prescott wrote: > > Hi; > > Before the holidays, I tried a netperf SDP_STREAM > test between a pair of Chelsio S310-SR iWARP cards > using OFED 1.2, which crashed the netperf client > host. Using OFED-1.3-20080107-0942 on both hosts > the crash no longer happens, which is great, but it > isn't working quite yet: Hey Craig, SDP currently isn't suppored over the Chelsio RNICs. No testing has been done. Thanks, Steve. 
From chas at cmf.nrl.navy.mil Tue Jan 8 09:33:33 2008 From: chas at cmf.nrl.navy.mil (chas williams - CONTRACTOR) Date: Tue, 08 Jan 2008 12:33:33 -0500 Subject: [ofa-general] lock dependency in ib_user_mad In-Reply-To: <000401c8518c$aabcf280$a937170a@amr.corp.intel.com> Message-ID: <200801081733.m08HXX3x013059@cmf.nrl.navy.mil> In message <000401c8518c$aabcf280$a937170a at amr.corp.intel.com>,"Sean Hefty" writes: >I turned on lock checking and got the following possible locking dependency. >(Running on 2.6.24-rc3.) i have seen a similar dead lock before. however, i couldn't get enough information to track it down. earlier roland wrote: >This should be fine (and comes from an earlier set of changes to fix >deadlocks): ib_umad_close() does a downgrade_write() before calling >ib_unregister_mad_agent(), so it only holds the mutex with a read >lock, which means that queue_packet() should be able to take another >read lock. > >Unless there's something that prevents one thread from taking a read >lock twice? What kernel are you seeing these problems with? i dont think you are allowed to have nested locks of any sort. from include/linux/rwsem.h: #ifdef CONFIG_DEBUG_LOCK_ALLOC /* * nested locking. NOTE: rwsems are not allowed to recurse * (which occurs if the same task tries to acquire the same * lock instance multiple times), but multiple locks of the * same lock class might be taken, if the order of the locks * is always the same. This ordering rule can be expressed * to lockdep via the _nested() APIs, but enumerating the * subclasses that are used. (If the nesting relationship is * static then another method for expressing nested locking is * the explicit definition of lock class keys and the use of * lockdep_set_class() at lock initialization time. * See Documentation/lockdep-design.txt for more details.) */ the below certainly looks like a nested read lock.
>Jan 7 12:23:35 mshefty-linux3 kernel: -> #0 (� ){--..}: >Jan 7 12:23:35 mshefty-linux3 kernel: [] print_stack_trace+0x6a/0x80 >Jan 7 12:23:35 mshefty-linux3 kernel: [] __lock_acquire+0x612/0x10c0 >Jan 7 12:23:35 mshefty-linux3 kernel: [] __lock_acquire+0x8e7/0x10c0 >Jan 7 12:23:35 mshefty-linux3 kernel: [] lock_acquire+0x53/0x70 >Jan 7 12:23:35 mshefty-linux3 kernel: [] flush_workqueue+0x0/0xa0 >Jan 7 12:23:35 mshefty-linux3 kernel: [] _spin_unlock_irqrestore+0x55/0x70 >Jan 7 12:23:35 mshefty-linux3 kernel: [] flush_workqueue+0x43/0xa0 >Jan 7 12:23:35 mshefty-linux3 kernel: [] ib_unregister_mad_agent+0x297/0x460 [ib_mad] >Jan 7 12:23:35 mshefty-linux3 kernel: [] ib_umad_close+0xbe/0x100 [ib_umad] >Jan 7 12:23:35 mshefty-linux3 kernel: [] __fput+0x1cb/0x200 >Jan 7 12:23:35 mshefty-linux3 kernel: [] filp_close+0x4b/0xa0 >Jan 7 12:23:35 mshefty-linux3 kernel: [] put_files_struct+0x70/0xc0 >Jan 7 12:23:35 mshefty-linux3 kernel: [] do_exit+0x1d8/0x8d0 >Jan 7 12:23:35 mshefty-linux3 kernel: [] __dequeue_signal+0x27/0x1e0 >Jan 7 12:23:35 mshefty-linux3 kernel: [] do_group_exit+0x30/0x90 >Jan 7 12:23:35 mshefty-linux3 kernel: [] get_signal_to_deliver+0x2fe/0x4f0 >Jan 7 12:23:35 mshefty-linux3 kernel: [] do_notify_resume+0xc5/0x750 >Jan 7 12:23:35 mshefty-linux3 kernel: [] trace_hardirqs_on_thunk+0x35/0x3a >Jan 7 12:23:35 mshefty-linux3 kernel: [] trace_hardirqs_on+0xbf/0x160 >Jan 7 12:23:35 mshefty-linux3 kernel: [] sysret_signal+0x21/0x31 >Jan 7 12:23:35 mshefty-linux3 kernel: [] ptregscall_common+0x67/0xb0 >Jan 7 12:23:35 mshefty-linux3 kernel: [] 0xffffffffffffffff From sean.hefty at intel.com Tue Jan 8 09:40:57 2008 From: sean.hefty at intel.com (Sean Hefty) Date: Tue, 8 Jan 2008 09:40:57 -0800 Subject: [ofa-general] RE: librdmacm/man: fix-up man pages In-Reply-To: <47837BFA.7040402@voltaire.com> References: <000101c81a64$3582de80$9c98070a@amr.corp.intel.com> <4726EEAC.3070105@voltaire.com> <472755C4.10600@ichips.intel.com> <47285F53.4060402@voltaire.com> 
<4728BF4A.1060301@ichips.intel.com> <15ddcffd0710311320v6b91b3cm3be0f7882e30ad2b@mail.gmail.com> <000001c81cb5$4ce12160$9c98070a@amr.corp.intel.com> <15ddcffd0711270435t12a18dc3waac2596b3884ac72@mail.gmail.com> <000001c8311a$176cdbe0$63248686@amr.corp.intel.com> <15ddcffd0711280307u7a89c6c2q2854b071f74d9123@mail.gmail.com> <000801c832b6$81feb850$f5d8180a@amr.corp.intel.com> <475FD984.6080203@voltaire.com> <47837BFA.7040402@voltaire.com> Message-ID: <000001c8521d$a05df320$a937170a@amr.corp.intel.com> >> 1. for rdma_disconnect - mention that >> - it applies only to connected service >> - the QP is moved to the error state and following that all the posted >> work requests will be flushed to the completion queue. >From rdma_disconnect.3: "Disconnects a connection and transitions any associated QP to the error state, which will flush any posted work requests to the completion queue." >> 2. for rdma_join_multicast >> - mention that as with unicast, if source address is provided to >> rdma_resolve_addr then the routing table need not be set to route this >> group to an ipoib device >From rdma_join_multicast.3: "Before joining a multicast group, the rdma_cm_id must be bound to an RDMA device by calling rdma_bind_addr or rdma_resolve_addr. Use of rdma_resolve_addr requires the local routing tables to resolve the multicast address to an RDMA device, unless a specific source address is provided." - Sean From prescott at hpc.ufl.edu Tue Jan 8 10:15:49 2008 From: prescott at hpc.ufl.edu (Craig Prescott) Date: Tue, 08 Jan 2008 13:15:49 -0500 Subject: [ofa-general] SDP and iWARP In-Reply-To: <4783B3F5.20600@opengridcomputing.com> References: <4783A5B0.6040603@hpc.ufl.edu> <4783B3F5.20600@opengridcomputing.com> Message-ID: <4783BDD5.7000702@hpc.ufl.edu> Steve Wise wrote: > > > Craig Prescott wrote: >> >> Hi; >> >> Before the holidays, I tried a netperf SDP_STREAM >> test between a pair of Chelsio S310-SR iWARP cards >> using OFED 1.2, which crashed the netperf client >> host. 
Using OFED-1.3-20080107-0942 on both hosts >> the crash no longer happens, which is great, but it >> isn't working quite yet: > > Hey Craig, > > SDP currently isn't suppored over the Chelsio RNICs. No testing has > been done. > Hi Steve; I understand - just following up on the short thread from before the holidays where I started looking at this. I'm interested to try to make it SDP work anyway on these RNICs for a project we are working on. I am willing to put in some time to work on this and (if successful) do some testing and report back (if anyone is interested), and would be grateful for any advice or help from this list. Cheers, Craig From swise at opengridcomputing.com Tue Jan 8 10:38:30 2008 From: swise at opengridcomputing.com (Steve Wise) Date: Tue, 08 Jan 2008 12:38:30 -0600 Subject: [ofa-general] SDP and iWARP In-Reply-To: <4783BDD5.7000702@hpc.ufl.edu> References: <4783A5B0.6040603@hpc.ufl.edu> <4783B3F5.20600@opengridcomputing.com> <4783BDD5.7000702@hpc.ufl.edu> Message-ID: <4783C326.3070306@opengridcomputing.com> Craig Prescott wrote: > Steve Wise wrote: >> >> >> Craig Prescott wrote: >>> >>> Hi; >>> >>> Before the holidays, I tried a netperf SDP_STREAM >>> test between a pair of Chelsio S310-SR iWARP cards >>> using OFED 1.2, which crashed the netperf client >>> host. Using OFED-1.3-20080107-0942 on both hosts >>> the crash no longer happens, which is great, but it >>> isn't working quite yet: >> >> Hey Craig, >> >> SDP currently isn't suppored over the Chelsio RNICs. No testing has >> been done. >> > > Hi Steve; > > I understand - just following up on the short thread from > before the holidays where I started looking at this. > That was sooo last year. :) > I'm interested to try to make it SDP work anyway on these RNICs > for a project we are working on. I am willing to put in some > time to work on this and (if successful) do some testing and > report back (if anyone is interested), and would be grateful > for any advice or help from this list. 
> Ok. First make sure the sdp kernel module uses the rdma cma. Then I'd add printk hooks in cma.c, addr.c, and iwcm.c to see what's going on and where things are failing. Also a wire trace is good if we're getting that far (like at least doing arp resolution). I haven't looked at the sdp code at all, so you'll have to do the leg work. Thanks! Steve. > Cheers, > Craig From jim at mellanox.com Tue Jan 8 11:16:38 2008 From: jim at mellanox.com (Jim Mott) Date: Tue, 8 Jan 2008 11:16:38 -0800 Subject: [ofa-general] SDP and iWARP In-Reply-To: <4783C326.3070306@opengridcomputing.com> References: <4783A5B0.6040603@hpc.ufl.edu> <4783B3F5.20600@opengridcomputing.com> <4783BDD5.7000702@hpc.ufl.edu> <4783C326.3070306@opengridcomputing.com> Message-ID: SDP kernel modules do use rdma CMA. You can get a little info from the kernel module by setting debug: echo 1 > /sys/modules/ib_sdp/debug_level Use dmesg to read the output. This might give you enough info to get to the root of the problem. Before you start adding printf()s, you can rebuild with CONFIG_INFINIBAND_SDP_DEBUG_DATA defined. I just stick a #define at the top of sdp.h. Then: echo 1 > /sys/modules/ib_sdp/data_debug_level JIm -----Original Message----- From: general-bounces at lists.openfabrics.org [mailto:general-bounces at lists.openfabrics.org] On Behalf Of Steve Wise Sent: Tuesday, January 08, 2008 12:39 PM To: Craig Prescott Cc: general at lists.openfabrics.org Subject: Re: [ofa-general] SDP and iWARP Craig Prescott wrote: > Steve Wise wrote: >> >> >> Craig Prescott wrote: >>> >>> Hi; >>> >>> Before the holidays, I tried a netperf SDP_STREAM >>> test between a pair of Chelsio S310-SR iWARP cards >>> using OFED 1.2, which crashed the netperf client >>> host. Using OFED-1.3-20080107-0942 on both hosts >>> the crash no longer happens, which is great, but it >>> isn't working quite yet: >> >> Hey Craig, >> >> SDP currently isn't suppored over the Chelsio RNICs. No testing has >> been done. 
>> > > Hi Steve; > > I understand - just following up on the short thread from > before the holidays where I started looking at this. > That was sooo last year. :) > I'm interested to try to make it SDP work anyway on these RNICs > for a project we are working on. I am willing to put in some > time to work on this and (if successful) do some testing and > report back (if anyone is interested), and would be grateful > for any advice or help from this list. > Ok. First make sure the sdp kernel module uses the rdma cma. Then I'd add printk hooks in cma.c, addr.c, and iwcm.c to see what's going on and where things are failing. Also a wire trace is good if we're getting that far (like at least doing arp resolution). I haven't looked at the sdp code at all, so you'll have to do the leg work. Thanks! Steve. > Cheers, > Craig _______________________________________________ general mailing list general at lists.openfabrics.org http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general From sashak at voltaire.com Tue Jan 8 11:44:57 2008 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Tue, 8 Jan 2008 19:44:57 +0000 Subject: [ofa-general] [PATCH 1/3] libvendor: osm_vendor_get_all_port_attr() rework In-Reply-To: <1199811527.25609.91.camel@hrosenstock-ws.xsigo.com> References: <11951179291471-git-send-email-sashak@voltaire.com> <11951179291903-git-send-email-sashak@voltaire.com> <1199718210.20870.105.camel@hrosenstock-ws.xsigo.com> <20080108162036.GR26304@sashak.voltaire.com> <1199809325.25609.72.camel@hrosenstock-ws.xsigo.com> <20080108170041.GU26304@sashak.voltaire.com> <1199811527.25609.91.camel@hrosenstock-ws.xsigo.com> Message-ID: <20080108194457.GX26304@sashak.voltaire.com> On 08:58 Tue 08 Jan , Hal Rosenstock wrote: > > Yes, but it goes further than checking in tree consumers and not > everyone is paying attention all the time or running complete > regressions frequently so this wasn't 
found until recently. > > This change broke autoselection on a machine running OpenSM with a > combination of IB and iWARP adapters as it selected an iWARP adapter and > exited. Right, after reviewing this code again I see that it is possible failure. > If we care about continuing to support this feature, I suppose code > might be able to be added to OpenSM main.c to handle this rather than it > being in a lower layer as it was before. We don't have appropriate indication in the vendor layer. I think non-IB devices can be filtered out in osm_vendor_get_all_port_attr(), something like this: diff --git a/opensm/libvendor/osm_vendor_ibumad.c b/opensm/libvendor/osm_vendor_ibumad.c index 522325b..977a3b2 100644 --- a/opensm/libvendor/osm_vendor_ibumad.c +++ b/opensm/libvendor/osm_vendor_ibumad.c @@ -571,6 +571,8 @@ osm_vendor_get_all_port_attr(IN osm_vendor_t * const p_vend, * For each CA, retrieve the port guids */ if (umad_get_ca(p_vend->ca_names[i], &ca) == 0) { + if (ca.node_type < 1 || ca.node_type > 3) + continue; for (j = 0; j <= ca.numports; j++) { if (!ca.ports[j]) continue; Sasha From chas at cmf.nrl.navy.mil Tue Jan 8 11:35:04 2008 From: chas at cmf.nrl.navy.mil (chas williams - CONTRACTOR) Date: Tue, 08 Jan 2008 14:35:04 -0500 Subject: [ofa-general] lock dependency in ib_user_mad In-Reply-To: <200801081733.m08HXX3x013059@cmf.nrl.navy.mil> Message-ID: <200801081935.m08JZ4vP014767@cmf.nrl.navy.mil> In message <200801081733.m08HXX3x013059 at cmf.nrl.navy.mil>,"chas williams - CONT RACTOR" writes: >i dont think you are allowed to have nested locks of any sort. i mispoke here. i meant recursive locks. locks can be nested but never recursive.
From rdreier at cisco.com Tue Jan 8 11:42:55 2008 From: rdreier at cisco.com (Roland Dreier) Date: Tue, 08 Jan 2008 11:42:55 -0800 Subject: [ofa-general] Re: [PATCH] mlx4: Fix the value of the pkey_index in the completion In-Reply-To: <47831E68.9050809@dev.mellanox.co.il> (Dotan Barak's message of "Tue, 08 Jan 2008 08:55:36 +0200") References: <200801070901.26213.dotanb@dev.mellanox.co.il> <47831E68.9050809@dev.mellanox.co.il> Message-ID: > > Does this explain the problems that someone saw when using all port 2 > > of a connectx setup? > Yes. without this mask the value of the pkey_index was 0x80. I see -- the high order bit of this 32-bit field in the CQE is 0 for port 1 and 1 for port 2? Is that documented anywhere in the PRM? Anyway I'll apply this and ask Linus to pull it into 2.6.24. - R.
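The masking discussed in the message above can be sketched in a few lines of C. This is only an illustration under stated assumptions: `PKEY_INDEX_MASK` and `pkey_index_from_cqe()` are hypothetical names, and the value 0x7f is assumed here solely because it clears the 0x80 bit that the thread reports seeing on port 2. The real field layout should be checked against the mlx4 driver and the PRM.

```c
#include <stdint.h>

/*
 * Hypothetical mask (an assumption, not taken from the patch itself):
 * keep the low 7 bits of the CQE field and drop bit 7 (0x80), which the
 * thread suggests carries port information rather than index bits.
 */
#define PKEY_INDEX_MASK 0x7fu

/*
 * Hypothetical helper: extract a P_Key index from a raw CQE field that
 * has already been converted to host byte order.
 */
static uint32_t pkey_index_from_cqe(uint32_t raw_field)
{
	return raw_field & PKEY_INDEX_MASK;
}
```

With this sketch, a raw value of 0x80 (the value reported without the mask) yields index 0, while the lower index bits pass through unchanged.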
From hrosenstock at xsigo.com Tue Jan 8 11:47:25 2008 From: hrosenstock at xsigo.com (Hal Rosenstock) Date: Tue, 08 Jan 2008 11:47:25 -0800 Subject: [ofa-general] [PATCH 1/3] libvendor: osm_vendor_get_all_port_attr() rework In-Reply-To: <20080108194457.GX26304@sashak.voltaire.com> References: <11951179291471-git-send-email-sashak@voltaire.com> <11951179291903-git-send-email-sashak@voltaire.com> <1199718210.20870.105.camel@hrosenstock-ws.xsigo.com> <20080108162036.GR26304@sashak.voltaire.com> <1199809325.25609.72.camel@hrosenstock-ws.xsigo.com> <20080108170041.GU26304@sashak.voltaire.com> <1199811527.25609.91.camel@hrosenstock-ws.xsigo.com> <20080108194457.GX26304@sashak.voltaire.com> Message-ID: <1199821645.27852.94.camel@hrosenstock-ws.xsigo.com> On Tue, 2008-01-08 at 19:44 +0000, Sasha Khapyorsky wrote: > On 08:58 Tue 08 Jan , Hal Rosenstock wrote: > > > > Yes, but it goes further than checking in tree consumers and not > > everyone is paying attention all the time or running complete > > regressions frequently so this wasn't found until recently. > > > > This change broke autoselection on a machine running OpenSM with a > > combination of IB and iWARP adapters as it selected an iWARP adapter and > > exited. > > Right, after reviewing this code again I see that it is possible > failure. > > > If we care about continuing to support this feature, I suppose code > > might be able to be added to OpenSM main.c to handle this rather than it > > being in a lower layer as it was before. > > We don't have appropriate indication in the vendor layer. I think non-IB > devices can be filtered out in osm_vendor_get_all_port_attr(), something > like this: Yes, that looks like it would work. Hoping to get it tested. 
-- Hal > diff --git a/opensm/libvendor/osm_vendor_ibumad.c b/opensm/libvendor/osm_vendor_ibumad.c > index 522325b..977a3b2 100644 > --- a/opensm/libvendor/osm_vendor_ibumad.c > +++ b/opensm/libvendor/osm_vendor_ibumad.c > @@ -571,6 +571,8 @@ osm_vendor_get_all_port_attr(IN osm_vendor_t * const p_vend, > * For each CA, retrieve the port guids > */ > if (umad_get_ca(p_vend->ca_names[i], &ca) == 0) { > + if (ca.node_type < 1 || ca.node_type > 3) > + continue; > for (j = 0; j <= ca.numports; j++) { > if (!ca.ports[j]) > continue; > > Sasha > _______________________________________________ > general mailing list > general at lists.openfabrics.org > http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general > > To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general From rdreier at cisco.com Tue Jan 8 12:03:34 2008 From: rdreier at cisco.com (Roland Dreier) Date: Tue, 08 Jan 2008 12:03:34 -0800 Subject: [ofa-general] [PATCH 3/3] IB/srp: use scatter gather chaining In-Reply-To: <537C6C0940C6C143AA46A88946B854170AD87CFA@ORNLEXCHANGE.ornl.gov> (David A. Dillow's message of "Fri, 04 Jan 2008 17:28:00 -0500") References: <1198102155.5649.37.camel@lap75545.ornl.gov> <537C6C0940C6C143AA46A88946B854170AD87CFA@ORNLEXCHANGE.ornl.gov> Message-ID: > You may not get 4MB I/Os because of memory fragmentation, but you should see 1MB > or better. You can use a real file as a data source/sink to verify against > corruption. Great, thanks! I tried it on a system using the ipath driver and it seemed to work fine... I saw SG lists with > 220 entries, which means they really were chained (only 128 entries fit in a page on x86-64). I added this for 2.6.25. - R. 
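The "128 entries fit in a page" figure in the message above can be checked with a short C sketch. The struct below is a hypothetical stand-in for the kernel's scatterlist entry on x86-64 (an assumed 32-byte layout, without debug fields), and `SKETCH_PAGE_SIZE`, `sg_entries_per_page()` and `sg_list_must_be_chained()` are illustrative names rather than kernel APIs.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for a scatterlist entry on x86-64.
 * Assumed layout: 28 bytes of fields, padded to 32 by alignment.
 */
struct sg_entry_sketch {
	unsigned long page_link;   /* page pointer plus chain/end flag bits */
	unsigned int  offset;      /* offset of the data within the page */
	unsigned int  length;      /* length of the segment in bytes */
	uint64_t      dma_address; /* bus address after DMA mapping */
	unsigned int  dma_length;  /* mapped length */
};

/* Assumed page size of 4 KiB, as on x86-64 with default settings. */
#define SKETCH_PAGE_SIZE 4096u

/* Number of entries that fit in one page under the assumptions above. */
static size_t sg_entries_per_page(void)
{
	return SKETCH_PAGE_SIZE / sizeof(struct sg_entry_sketch);
}

/* A list longer than one page of entries can only exist if it is chained. */
static int sg_list_must_be_chained(size_t nents)
{
	return nents > sg_entries_per_page();
}
```

Under these assumptions one page holds 128 entries, so a list of more than 220 entries, as observed in the test, has to span more than one page and is therefore chained.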
From rdreier at cisco.com Tue Jan 8 12:09:08 2008 From: rdreier at cisco.com (Roland Dreier) Date: Tue, 08 Jan 2008 12:09:08 -0800 Subject: [ofa-general] Re: [2.6.24-rc minor bugfix] IB/srp: release transport before removing host In-Reply-To: <1199417741.3636.18.camel@obelisk.thedillows.org> (Dave Dillow's message of "Thu, 03 Jan 2008 22:35:41 -0500") References: <1199393485.7561.51.camel@lap75545.ornl.gov> <20080104094722F.fujita.tomonori@lab.ntt.co.jp> <1199414359.3636.13.camel@obelisk.thedillows.org> <20080104115427F.fujita.tomonori@lab.ntt.co.jp> <1199417741.3636.18.camel@obelisk.thedillows.org> Message-ID: thanks, applied. From sashak at voltaire.com Tue Jan 8 12:23:10 2008 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Tue, 8 Jan 2008 20:23:10 +0000 Subject: [ofa-general] [PATCH 1/3] libvendor: osm_vendor_get_all_port_attr() rework In-Reply-To: <1199821645.27852.94.camel@hrosenstock-ws.xsigo.com> References: <11951179291471-git-send-email-sashak@voltaire.com> <11951179291903-git-send-email-sashak@voltaire.com> <1199718210.20870.105.camel@hrosenstock-ws.xsigo.com> <20080108162036.GR26304@sashak.voltaire.com> <1199809325.25609.72.camel@hrosenstock-ws.xsigo.com> <20080108170041.GU26304@sashak.voltaire.com> <1199811527.25609.91.camel@hrosenstock-ws.xsigo.com> <20080108194457.GX26304@sashak.voltaire.com> <1199821645.27852.94.camel@hrosenstock-ws.xsigo.com> Message-ID: <20080108202310.GY26304@sashak.voltaire.com> On 11:47 Tue 08 Jan , Hal Rosenstock wrote: > On Tue, 2008-01-08 at 19:44 +0000, Sasha Khapyorsky wrote: > > On 08:58 Tue 08 Jan , Hal Rosenstock wrote: > > > > > > Yes, but it goes further than checking in tree consumers and not > > > everyone is paying attention all the time or running complete > > > regressions frequently so this wasn't found until recently. > > > > > > This change broke autoselection on a machine running OpenSM with a > > > combination of IB and iWARP adapters as it selected an iWARP adapter and > > > exited. 
> > > > Right, after reviewing this code again I see that it is possible > > failure. > > > > > If we care about continuing to support this feature, I suppose code > > > might be able to be added to OpenSM main.c to handle this rather than it > > > being in a lower layer as it was before. > > > > We don't have appropriate indication in the vendor layer. I think non-IB > > devices can be filtered out in osm_vendor_get_all_port_attr(), something > > like this: > > Yes, that looks like it would work. Hoping to get it tested. Thanks. I will post this patch to the list yet. Sasha From sashak at voltaire.com Tue Jan 8 12:23:45 2008 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Tue, 8 Jan 2008 20:23:45 +0000 Subject: [ofa-general] [PATCH] opensm/vendor: filter out non-IB devices in osm_vendor_get_all_port_attr() Message-ID: <20080108202345.GZ26304@sashak.voltaire.com> osm_vendor_get_all_port_attr() will return attributes for only IB ports and will filter out all non-IB (iWARP) devices. 
Signed-off-by: Sasha Khapyorsky --- opensm/libvendor/osm_vendor_ibumad.c | 2 ++ 1 files changed, 2 insertions(+), 0 deletions(-) diff --git a/opensm/libvendor/osm_vendor_ibumad.c b/opensm/libvendor/osm_vendor_ibumad.c index 522325b..977a3b2 100644 --- a/opensm/libvendor/osm_vendor_ibumad.c +++ b/opensm/libvendor/osm_vendor_ibumad.c @@ -571,6 +571,8 @@ osm_vendor_get_all_port_attr(IN osm_vendor_t * const p_vend, * For each CA, retrieve the port guids */ if (umad_get_ca(p_vend->ca_names[i], &ca) == 0) { + if (ca.node_type < 1 || ca.node_type > 3) + continue; for (j = 0; j <= ca.numports; j++) { if (!ca.ports[j]) continue; -- 1.5.4.rc2.38.gd6da3 From mshefty at ichips.intel.com Tue Jan 8 12:14:55 2008 From: mshefty at ichips.intel.com (Sean Hefty) Date: Tue, 08 Jan 2008 12:14:55 -0800 Subject: [ofa-general] [PATCH] opensm/vendor: filter out non-IB devices in osm_vendor_get_all_port_attr() In-Reply-To: <20080108202345.GZ26304@sashak.voltaire.com> References: <20080108202345.GZ26304@sashak.voltaire.com> Message-ID: <4783D9BF.6010806@ichips.intel.com> > + if (ca.node_type < 1 || ca.node_type > 3) > + continue; Are there enums for this?
From rdreier at cisco.com Tue Jan 8 12:25:23 2008 From: rdreier at cisco.com (Roland Dreier) Date: Tue, 08 Jan 2008 12:25:23 -0800 Subject: [ofa-general] [GIT PULL] please pull infiniband.git Message-ID: Linus, please pull from master.kernel.org:/pub/scm/linux/kernel/git/roland/infiniband.git for-linus This tree is also available from kernel.org mirrors at: git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband.git for-linus This will get a few one-line fixes: change a maintainer email address that is going away soon, fix a bug that would make the second port on some HCAs useless, and fix SRP the right way now that the root cause is fixed via James's tree.
(And just to counter James's suggestion that the SRP user base is comparable to voyager's: these SRP fixes are coming from a real production user, and that user is @ornl.gov, which means that SRP is actually being used to do something involving things like atomic bombs or sharks with laser beams) Dave Dillow (1): IB/srp: Release transport before removing host Dotan Barak (1): IB/mlx4: Fix value of pkey_index in QP1 completions Sean Hefty (1): MAINTAINERS: Update Sean Hefty's email address MAINTAINERS | 2 +- drivers/infiniband/hw/mlx4/cq.c | 2 +- drivers/infiniband/ulp/srp/ib_srp.c | 2 +- 3 files changed, 3 insertions(+), 3 deletions(-) diff --git a/MAINTAINERS b/MAINTAINERS index 79c711e..56e6159 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -1919,7 +1919,7 @@ INFINIBAND SUBSYSTEM P: Roland Dreier M: rolandd at cisco.com P: Sean Hefty -M: mshefty at ichips.intel.com +M: sean.hefty at intel.com P: Hal Rosenstock M: hal.rosenstock at gmail.com L: general at lists.openfabrics.org diff --git a/drivers/infiniband/hw/mlx4/cq.c b/drivers/infiniband/hw/mlx4/cq.c index 8bf44da..9d32c49 100644 --- a/drivers/infiniband/hw/mlx4/cq.c +++ b/drivers/infiniband/hw/mlx4/cq.c @@ -430,7 +430,7 @@ static int mlx4_ib_poll_one(struct mlx4_ib_cq *cq, wc->dlid_path_bits = (be32_to_cpu(cqe->g_mlpath_rqpn) >> 24) & 0x7f; wc->wc_flags |= be32_to_cpu(cqe->g_mlpath_rqpn) & 0x80000000 ? 
IB_WC_GRH : 0; - wc->pkey_index = be32_to_cpu(cqe->immed_rss_invalid) >> 16; + wc->pkey_index = be32_to_cpu(cqe->immed_rss_invalid) & 0x7f; } return 0; diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c index 77e8b90..bdb6f85 100644 --- a/drivers/infiniband/ulp/srp/ib_srp.c +++ b/drivers/infiniband/ulp/srp/ib_srp.c @@ -2053,8 +2053,8 @@ static void srp_remove_one(struct ib_device *device) list_for_each_entry_safe(target, tmp_target, &host->target_list, list) { - scsi_remove_host(target->scsi_host); srp_remove_host(target->scsi_host); + scsi_remove_host(target->scsi_host); srp_disconnect_target(target); ib_destroy_cm_id(target->cm_id); srp_free_target_ib(target); From pw at osc.edu Tue Jan 8 12:33:58 2008 From: pw at osc.edu (Pete Wyckoff) Date: Tue, 8 Jan 2008 15:33:58 -0500 Subject: [ofa-general] Re: [PATCH v2] IB/srp: add identifying information to log messages In-Reply-To: <1199748221.22987.6.camel@lap75545.ornl.gov> References: <1198269544.9979.26.camel@lap75545.ornl.gov> <20071222145612.GA10085@osc.edu> <1199748221.22987.6.camel@lap75545.ornl.gov> Message-ID: <20080108203358.GA9362@osc.edu> dillowda at ornl.gov wrote on Mon, 07 Jan 2008 18:23 -0500: > When you have multiple targets, it gets really confusing when you try to > track down who did a reset when there is no identifying information in > the log message, especially when the same extension ID is mapped through > two different local IB ports. So, add an identifier that can be used to > track back to which local IB port/remote target pair is the one having > problems. > > Signed-off-by: David Dillow Acked-by: Pete Wyckoff > --- > On Sat, 2007-12-22 at 09:56 -0500, Pete Wyckoff wrote: > > Good idea to fix these. > > > > Could you use the standard dev_err(), dev_printk() and friends here > > instead? dev = &target->scsi_host->shost_gendev. In fact, for > > struct Scsi_host, you can do one better and use shost_printk(). 
> > I finally got back around to working on this; these apply to Linus's > current tree. I reviewed the patch; looks fine to me. -- Pete > diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c > index 77e8b90..154ebb0 100644 > --- a/drivers/infiniband/ulp/srp/ib_srp.c > +++ b/drivers/infiniband/ulp/srp/ib_srp.c > @@ -272,7 +272,8 @@ static void srp_path_rec_completion(int status, > > target->status = status; > if (status) > - printk(KERN_ERR PFX "Got failed path rec status %d\n", status); > + shost_printk(KERN_ERR, target->scsi_host, > + PFX "Got failed path rec status %d\n", status); > else > target->path = *pathrec; > complete(&target->done); > @@ -303,7 +304,8 @@ static int srp_lookup_path(struct srp_target_port *target) > wait_for_completion(&target->done); > > if (target->status < 0) > - printk(KERN_WARNING PFX "Path record query failed\n"); > + shost_printk(KERN_WARNING, target->scsi_host, > + PFX "Path record query failed\n"); > > return target->status; > } > @@ -379,9 +381,10 @@ static int srp_send_req(struct srp_target_port *target) > * the second 8 bytes to the local node GUID. 
> */ > if (srp_target_is_topspin(target)) { > - printk(KERN_DEBUG PFX "Topspin/Cisco initiator port ID workaround " > - "activated for target GUID %016llx\n", > - (unsigned long long) be64_to_cpu(target->ioc_guid)); > + shost_printk(KERN_DEBUG, target->scsi_host, > + PFX "Topspin/Cisco initiator port ID workaround " > + "activated for target GUID %016llx\n", > + (unsigned long long) be64_to_cpu(target->ioc_guid)); > memset(req->priv.initiator_port_id, 0, 8); > memcpy(req->priv.initiator_port_id + 8, > &target->srp_host->dev->dev->node_guid, 8); > @@ -400,7 +403,8 @@ static void srp_disconnect_target(struct srp_target_port *target) > > init_completion(&target->done); > if (ib_send_cm_dreq(target->cm_id, NULL, 0)) { > - printk(KERN_DEBUG PFX "Sending CM DREQ failed\n"); > + shost_printk(KERN_DEBUG, target->scsi_host, > + PFX "Sending CM DREQ failed\n"); > return; > } > wait_for_completion(&target->done); > @@ -568,7 +572,8 @@ static int srp_reconnect_target(struct srp_target_port *target) > return ret; > > err: > - printk(KERN_ERR PFX "reconnect failed (%d), removing target port.\n", ret); > + shost_printk(KERN_ERR, target->scsi_host, > + PFX "reconnect failed (%d), removing target port.\n", ret); > > /* > * We couldn't reconnect, so kill our target port off. 
> @@ -683,8 +688,9 @@ static int srp_map_data(struct scsi_cmnd *scmnd, struct srp_target_port *target, > > if (scmnd->sc_data_direction != DMA_FROM_DEVICE && > scmnd->sc_data_direction != DMA_TO_DEVICE) { > - printk(KERN_WARNING PFX "Unhandled data direction %d\n", > - scmnd->sc_data_direction); > + shost_printk(KERN_WARNING, target->scsi_host, > + PFX "Unhandled data direction %d\n", > + scmnd->sc_data_direction); > return -EINVAL; > } > > @@ -786,8 +792,9 @@ static void srp_process_rsp(struct srp_target_port *target, struct srp_rsp *rsp) > } else { > scmnd = req->scmnd; > if (!scmnd) > - printk(KERN_ERR "Null scmnd for RSP w/tag %016llx\n", > - (unsigned long long) rsp->tag); > + shost_printk(KERN_ERR, target->scsi_host, > + "Null scmnd for RSP w/tag %016llx\n", > + (unsigned long long) rsp->tag); > scmnd->result = rsp->status; > > if (rsp->flags & SRP_RSP_FLAG_SNSVALID) { > @@ -831,7 +838,8 @@ static void srp_handle_recv(struct srp_target_port *target, struct ib_wc *wc) > if (0) { > int i; > > - printk(KERN_ERR PFX "recv completion, opcode 0x%02x\n", opcode); > + shost_printk(KERN_ERR, target->scsi_host, > + PFX "recv completion, opcode 0x%02x\n", opcode); > > for (i = 0; i < wc->byte_len; ++i) { > if (i % 8 == 0) > @@ -852,11 +860,13 @@ static void srp_handle_recv(struct srp_target_port *target, struct ib_wc *wc) > > case SRP_T_LOGOUT: > /* XXX Handle target logout */ > - printk(KERN_WARNING PFX "Got target logout request\n"); > + shost_printk(KERN_WARNING, target->scsi_host, > + PFX "Got target logout request\n"); > break; > > default: > - printk(KERN_WARNING PFX "Unhandled SRP opcode 0x%02x\n", opcode); > + shost_printk(KERN_WARNING, target->scsi_host, > + PFX "Unhandled SRP opcode 0x%02x\n", opcode); > break; > } > > @@ -872,9 +882,10 @@ static void srp_completion(struct ib_cq *cq, void *target_ptr) > ib_req_notify_cq(cq, IB_CQ_NEXT_COMP); > while (ib_poll_cq(cq, 1, &wc) > 0) { > if (wc.status) { > - printk(KERN_ERR PFX "failed %s status %d\n", > - wc.wr_id 
& SRP_OP_RECV ? "receive" : "send", > - wc.status); > + shost_printk(KERN_ERR, target->scsi_host, > + PFX "failed %s status %d\n", > + wc.wr_id & SRP_OP_RECV ? "receive" : "send", > + wc.status); > target->qp_in_error = 1; > break; > } > @@ -1022,12 +1033,13 @@ static int srp_queuecommand(struct scsi_cmnd *scmnd, > > len = srp_map_data(scmnd, target, req); > if (len < 0) { > - printk(KERN_ERR PFX "Failed to map data\n"); > + shost_printk(KERN_ERR, target->scsi_host, > + PFX "Failed to map data\n"); > goto err; > } > > if (__srp_post_recv(target)) { > - printk(KERN_ERR PFX "Recv failed\n"); > + shost_printk(KERN_ERR, target->scsi_host, PFX "Recv failed\n"); > goto err_unmap; > } > > @@ -1035,7 +1047,7 @@ static int srp_queuecommand(struct scsi_cmnd *scmnd, > DMA_TO_DEVICE); > > if (__srp_post_send(target, iu, len)) { > - printk(KERN_ERR PFX "Send failed\n"); > + shost_printk(KERN_ERR, target->scsi_host, PFX "Send failed\n"); > goto err_unmap; > } > > @@ -1090,6 +1102,7 @@ static void srp_cm_rej_handler(struct ib_cm_id *cm_id, > struct ib_cm_event *event, > struct srp_target_port *target) > { > + struct Scsi_Host *shost = target->scsi_host; > struct ib_class_port_info *cpi; > int opcode; > > @@ -1115,19 +1128,22 @@ static void srp_cm_rej_handler(struct ib_cm_id *cm_id, > memcpy(target->path.dgid.raw, > event->param.rej_rcvd.ari, 16); > > - printk(KERN_DEBUG PFX "Topspin/Cisco redirect to target port GID %016llx%016llx\n", > - (unsigned long long) be64_to_cpu(target->path.dgid.global.subnet_prefix), > - (unsigned long long) be64_to_cpu(target->path.dgid.global.interface_id)); > + shost_printk(KERN_DEBUG, shost, > + PFX "Topspin/Cisco redirect to target port GID %016llx%016llx\n", > + (unsigned long long) be64_to_cpu(target->path.dgid.global.subnet_prefix), > + (unsigned long long) be64_to_cpu(target->path.dgid.global.interface_id)); > > target->status = SRP_PORT_REDIRECT; > } else { > - printk(KERN_WARNING " REJ reason: IB_CM_REJ_PORT_REDIRECT\n"); > + 
shost_printk(KERN_WARNING, shost, > + " REJ reason: IB_CM_REJ_PORT_REDIRECT\n"); > target->status = -ECONNRESET; > } > break; > > case IB_CM_REJ_DUPLICATE_LOCAL_COMM_ID: > - printk(KERN_WARNING " REJ reason: IB_CM_REJ_DUPLICATE_LOCAL_COMM_ID\n"); > + shost_printk(KERN_WARNING, shost, > + " REJ reason: IB_CM_REJ_DUPLICATE_LOCAL_COMM_ID\n"); > target->status = -ECONNRESET; > break; > > @@ -1138,20 +1154,21 @@ static void srp_cm_rej_handler(struct ib_cm_id *cm_id, > u32 reason = be32_to_cpu(rej->reason); > > if (reason == SRP_LOGIN_REJ_REQ_IT_IU_LENGTH_TOO_LARGE) > - printk(KERN_WARNING PFX > - "SRP_LOGIN_REJ: requested max_it_iu_len too large\n"); > + shost_printk(KERN_WARNING, shost, > + PFX "SRP_LOGIN_REJ: requested max_it_iu_len too large\n"); > else > - printk(KERN_WARNING PFX > - "SRP LOGIN REJECTED, reason 0x%08x\n", reason); > + shost_printk(KERN_WARNING, shost, > + PFX "SRP LOGIN REJECTED, reason 0x%08x\n", reason); > } else > - printk(KERN_WARNING " REJ reason: IB_CM_REJ_CONSUMER_DEFINED," > - " opcode 0x%02x\n", opcode); > + shost_printk(KERN_WARNING, shost, > + " REJ reason: IB_CM_REJ_CONSUMER_DEFINED," > + " opcode 0x%02x\n", opcode); > target->status = -ECONNRESET; > break; > > default: > - printk(KERN_WARNING " REJ reason 0x%x\n", > - event->param.rej_rcvd.reason); > + shost_printk(KERN_WARNING, shost, " REJ reason 0x%x\n", > + event->param.rej_rcvd.reason); > target->status = -ECONNRESET; > } > } > @@ -1166,7 +1183,8 @@ static int srp_cm_handler(struct ib_cm_id *cm_id, struct ib_cm_event *event) > > switch (event->event) { > case IB_CM_REQ_ERROR: > - printk(KERN_DEBUG PFX "Sending CM REQ failed\n"); > + shost_printk(KERN_DEBUG, target->scsi_host, > + PFX "Sending CM REQ failed\n"); > comp = 1; > target->status = -ECONNRESET; > break; > @@ -1184,7 +1202,8 @@ static int srp_cm_handler(struct ib_cm_id *cm_id, struct ib_cm_event *event) > target->scsi_host->can_queue = min(target->req_lim, > target->scsi_host->can_queue); > } else { > - printk(KERN_WARNING 
PFX "Unhandled RSP opcode %#x\n", opcode); > + shost_printk(KERN_WARNING, target->scsi_host, > + PFX "Unhandled RSP opcode %#x\n", opcode); > target->status = -ECONNRESET; > break; > } > @@ -1230,20 +1249,23 @@ static int srp_cm_handler(struct ib_cm_id *cm_id, struct ib_cm_event *event) > break; > > case IB_CM_REJ_RECEIVED: > - printk(KERN_DEBUG PFX "REJ received\n"); > + shost_printk(KERN_DEBUG, target->scsi_host, PFX "REJ received\n"); > comp = 1; > > srp_cm_rej_handler(cm_id, event, target); > break; > > case IB_CM_DREQ_RECEIVED: > - printk(KERN_WARNING PFX "DREQ received - connection closed\n"); > + shost_printk(KERN_WARNING, target->scsi_host, > + PFX "DREQ received - connection closed\n"); > if (ib_send_cm_drep(cm_id, NULL, 0)) > - printk(KERN_ERR PFX "Sending CM DREP failed\n"); > + shost_printk(KERN_ERR, target->scsi_host, > + PFX "Sending CM DREP failed\n"); > break; > > case IB_CM_TIMEWAIT_EXIT: > - printk(KERN_ERR PFX "connection closed\n"); > + shost_printk(KERN_ERR, target->scsi_host, > + PFX "connection closed\n"); > > comp = 1; > target->status = 0; > @@ -1255,7 +1277,8 @@ static int srp_cm_handler(struct ib_cm_id *cm_id, struct ib_cm_event *event) > break; > > default: > - printk(KERN_WARNING PFX "Unhandled CM event %d\n", event->event); > + shost_printk(KERN_WARNING, target->scsi_host, > + PFX "Unhandled CM event %d\n", event->event); > break; > } > > @@ -1332,7 +1355,7 @@ static int srp_abort(struct scsi_cmnd *scmnd) > struct srp_request *req; > int ret = SUCCESS; > > - printk(KERN_ERR "SRP abort called\n"); > + shost_printk(KERN_ERR, target->scsi_host, "SRP abort called\n"); > > if (target->qp_in_error) > return FAILED; > @@ -1362,7 +1385,7 @@ static int srp_reset_device(struct scsi_cmnd *scmnd) > struct srp_target_port *target = host_to_target(scmnd->device->host); > struct srp_request *req, *tmp; > > - printk(KERN_ERR "SRP reset_device called\n"); > + shost_printk(KERN_ERR, target->scsi_host, "SRP reset_device called\n"); > > if 
(target->qp_in_error) > return FAILED; > @@ -1389,7 +1412,7 @@ static int srp_reset_host(struct scsi_cmnd *scmnd) > struct srp_target_port *target = host_to_target(scmnd->device->host); > int ret = FAILED; > > - printk(KERN_ERR PFX "SRP reset_host called\n"); > + shost_printk(KERN_ERR, target->scsi_host, PFX "SRP reset_host called\n"); > > if (!srp_reconnect_target(target)) > ret = SUCCESS; > @@ -1814,8 +1837,9 @@ static ssize_t srp_create_target(struct class_device *class_dev, > > ib_get_cached_gid(host->dev->dev, host->port, 0, &target->path.sgid); > > - printk(KERN_DEBUG PFX "new target: id_ext %016llx ioc_guid %016llx pkey %04x " > - "service_id %016llx dgid %04x:%04x:%04x:%04x:%04x:%04x:%04x:%04x\n", > + shost_printk(KERN_DEBUG, target->scsi_host, PFX > + "new target: id_ext %016llx ioc_guid %016llx pkey %04x " > + "service_id %016llx dgid %04x:%04x:%04x:%04x:%04x:%04x:%04x:%04x\n", > (unsigned long long) be64_to_cpu(target->id_ext), > (unsigned long long) be64_to_cpu(target->ioc_guid), > be16_to_cpu(target->path.pkey), > @@ -1842,7 +1866,8 @@ static ssize_t srp_create_target(struct class_device *class_dev, > target->qp_in_error = 0; > ret = srp_connect_target(target); > if (ret) { > - printk(KERN_ERR PFX "Connection failed\n"); > + shost_printk(KERN_ERR, target->scsi_host, > + PFX "Connection failed\n"); > goto err_cm_id; > } > > > From rdreier at cisco.com Tue Jan 8 13:10:41 2008 From: rdreier at cisco.com (Roland Dreier) Date: Tue, 08 Jan 2008 13:10:41 -0800 Subject: [ofa-general] OFED and Ubuntu Linux In-Reply-To: (Bart Van Assche's message of "Tue, 8 Jan 2008 16:21:24 +0100") References: Message-ID: > Is the OFED software stack supported on Ubuntu Linux ? While OFED > 1.2.5.4 compiles fine on Ubuntu Linux 7.10, I got the following errors > while installing the RPM's: Just out of curiosity, what packages from OFED are you interested in using on Ubuntu? 
My goal would be to get most IB/RDMA-related stuff into the upstream Debian/Ubuntu distributions directly, so that you don't have to mess around with OFED at all. Currently, Ubuntu 7.10 has a 2.6.22 kernel, which has most IB support built in, and the ubuntu archive has packages for libibverbs and libmthca in universe. 8.04 (Hardy) will have a 2.6.24 kernel and adds openmpi packages (built with libibverbs support). I have libmlx4 packaged for hardy in my PPA: deb http://ppa.launchpad.net/roland.dreier/ubuntu hardy main deb-src http://ppa.launchpad.net/roland.dreier/ubuntu hardy main (libmlx4 is in Debian testing so it should propagate automatically into Ubuntu universe for Hardy+1). I am planning on packaging librdmacm for Debian and Ubuntu in the next few weeks. The packages will appear in my PPA and should be ready in plenty of time for Hardy+1. Are there any other packages you are looking for? - R. From arthur.jones at qlogic.com Tue Jan 8 13:17:54 2008 From: arthur.jones at qlogic.com (Arthur Jones) Date: Tue, 08 Jan 2008 13:17:54 -0800 Subject: [ofa-general] [PATCH] IB/ipath - first prep series for iba7220 Message-ID: <20080108211754.23996.14432.stgit@eng-46.internal.keyresearch.com> hi roland, these are changes which help prepare for the upcoming iba7220 driver series. these are changes to the existing code only, and do not support the iba7220 directly, but should help to make reviewing the iba7220 code easier when it finally arrives. there is one more patch series like this one, then the actual driver will follow... 
these patches, based on your for-2.6.25 branch are avail by git pull from: git://git.qlogic.com/ipath-linux-2.6 for-roland arthur From arthur.jones at qlogic.com Tue Jan 8 13:18:00 2008 From: arthur.jones at qlogic.com (Arthur Jones) Date: Tue, 08 Jan 2008 13:18:00 -0800 Subject: [ofa-general] [PATCH 1/8] IB/ipath - MAD performance sampling registers support In-Reply-To: <20080108211754.23996.14432.stgit@eng-46.internal.keyresearch.com> References: <20080108211754.23996.14432.stgit@eng-46.internal.keyresearch.com> Message-ID: <20080108211800.23996.14094.stgit@eng-46.internal.keyresearch.com> From: Ralph Campbell Add support for QLogic HCAs which have hardware performance sampling registers for PortSamplesControl and PortSamplesResult MADs. Signed-off-by: Ralph Campbell --- drivers/infiniband/hw/ipath/ipath_kernel.h | 9 ++ drivers/infiniband/hw/ipath/ipath_mad.c | 119 ++++++++++++++++--------- drivers/infiniband/hw/ipath/ipath_registers.h | 14 +++ drivers/infiniband/hw/ipath/ipath_verbs.c | 1 drivers/infiniband/hw/ipath/ipath_verbs.h | 2 5 files changed, 101 insertions(+), 44 deletions(-) diff --git a/drivers/infiniband/hw/ipath/ipath_kernel.h b/drivers/infiniband/hw/ipath/ipath_kernel.h index 977e88a..bfe84a1 100644 --- a/drivers/infiniband/hw/ipath/ipath_kernel.h +++ b/drivers/infiniband/hw/ipath/ipath_kernel.h @@ -930,6 +930,15 @@ static inline u32 ipath_read_creg32(const struct ipath_devdata *dd, (char __iomem *)dd->ipath_kregbase)); } +static inline void ipath_write_creg(const struct ipath_devdata *dd, + ipath_creg regno, u64 value) +{ + if (dd->ipath_kregbase) + writeq(value, regno + (u64 __iomem *) + (dd->ipath_cregbase + + (char __iomem *)dd->ipath_kregbase)); +} + static inline void ipath_clear_rcvhdrtail(const struct ipath_portdata *pd) { *((u64 *) pd->port_rcvhdrtail_kvaddr) = 0ULL; diff --git a/drivers/infiniband/hw/ipath/ipath_mad.c b/drivers/infiniband/hw/ipath/ipath_mad.c index 1978c34..d98d5f1 100644 --- a/drivers/infiniband/hw/ipath/ipath_mad.c +++ 
b/drivers/infiniband/hw/ipath/ipath_mad.c @@ -934,6 +934,7 @@ static int recv_pma_get_portsamplescontrol(struct ib_perf *pmp, struct ib_pma_portsamplescontrol *p = (struct ib_pma_portsamplescontrol *)pmp->data; struct ipath_ibdev *dev = to_idev(ibdev); + struct ipath_cregs const *crp = dev->dd->ipath_cregs; unsigned long flags; u8 port_select = p->port_select; @@ -955,7 +956,10 @@ static int recv_pma_get_portsamplescontrol(struct ib_perf *pmp, p->counter_width = 4; /* 32 bit counters */ p->counter_mask0_9 = COUNTER_MASK0_9; spin_lock_irqsave(&dev->pending_lock, flags); - p->sample_status = dev->pma_sample_status; + if (crp->cr_psstat) + p->sample_status = ipath_read_creg32(dev->dd, crp->cr_psstat); + else + p->sample_status = dev->pma_sample_status; p->sample_start = cpu_to_be32(dev->pma_sample_start); p->sample_interval = cpu_to_be32(dev->pma_sample_interval); p->tag = cpu_to_be16(dev->pma_tag); @@ -975,8 +979,9 @@ static int recv_pma_set_portsamplescontrol(struct ib_perf *pmp, struct ib_pma_portsamplescontrol *p = (struct ib_pma_portsamplescontrol *)pmp->data; struct ipath_ibdev *dev = to_idev(ibdev); + struct ipath_cregs const *crp = dev->dd->ipath_cregs; unsigned long flags; - u32 start; + u8 status; int ret; if (pmp->attr_mod != 0 || @@ -986,59 +991,67 @@ static int recv_pma_set_portsamplescontrol(struct ib_perf *pmp, goto bail; } - start = be32_to_cpu(p->sample_start); - if (start != 0) { - spin_lock_irqsave(&dev->pending_lock, flags); - if (dev->pma_sample_status == IB_PMA_SAMPLE_STATUS_DONE) { - dev->pma_sample_status = - IB_PMA_SAMPLE_STATUS_STARTED; - dev->pma_sample_start = start; - dev->pma_sample_interval = - be32_to_cpu(p->sample_interval); - dev->pma_tag = be16_to_cpu(p->tag); - if (p->counter_select[0]) - dev->pma_counter_select[0] = - p->counter_select[0]; - if (p->counter_select[1]) - dev->pma_counter_select[1] = - p->counter_select[1]; - if (p->counter_select[2]) - dev->pma_counter_select[2] = - p->counter_select[2]; - if (p->counter_select[3]) - 
dev->pma_counter_select[3] = - p->counter_select[3]; - if (p->counter_select[4]) - dev->pma_counter_select[4] = - p->counter_select[4]; - } - spin_unlock_irqrestore(&dev->pending_lock, flags); + spin_lock_irqsave(&dev->pending_lock, flags); + if (crp->cr_psstat) + status = ipath_read_creg32(dev->dd, crp->cr_psstat); + else + status = dev->pma_sample_status; + if (status == IB_PMA_SAMPLE_STATUS_DONE) { + dev->pma_sample_start = be32_to_cpu(p->sample_start); + dev->pma_sample_interval = be32_to_cpu(p->sample_interval); + dev->pma_tag = be16_to_cpu(p->tag); + dev->pma_counter_select[0] = p->counter_select[0]; + dev->pma_counter_select[1] = p->counter_select[1]; + dev->pma_counter_select[2] = p->counter_select[2]; + dev->pma_counter_select[3] = p->counter_select[3]; + dev->pma_counter_select[4] = p->counter_select[4]; + if (crp->cr_psstat) { + ipath_write_creg(dev->dd, crp->cr_psinterval, + dev->pma_sample_interval); + ipath_write_creg(dev->dd, crp->cr_psstart, + dev->pma_sample_start); + } else + dev->pma_sample_status = IB_PMA_SAMPLE_STATUS_STARTED; } + spin_unlock_irqrestore(&dev->pending_lock, flags); + ret = recv_pma_get_portsamplescontrol(pmp, ibdev, port); bail: return ret; } -static u64 get_counter(struct ipath_ibdev *dev, __be16 sel) +static u64 get_counter(struct ipath_ibdev *dev, + struct ipath_cregs const *crp, + __be16 sel) { u64 ret; switch (sel) { case IB_PMA_PORT_XMIT_DATA: - ret = dev->ipath_sword; + ret = (crp->cr_psxmitdatacount) ? + ipath_read_creg32(dev->dd, crp->cr_psxmitdatacount) : + dev->ipath_sword; break; case IB_PMA_PORT_RCV_DATA: - ret = dev->ipath_rword; + ret = (crp->cr_psrcvdatacount) ? + ipath_read_creg32(dev->dd, crp->cr_psrcvdatacount) : + dev->ipath_rword; break; case IB_PMA_PORT_XMIT_PKTS: - ret = dev->ipath_spkts; + ret = (crp->cr_psxmitpktscount) ? + ipath_read_creg32(dev->dd, crp->cr_psxmitpktscount) : + dev->ipath_spkts; break; case IB_PMA_PORT_RCV_PKTS: - ret = dev->ipath_rpkts; + ret = (crp->cr_psrcvpktscount) ? 
+ ipath_read_creg32(dev->dd, crp->cr_psrcvpktscount) : + dev->ipath_rpkts; break; case IB_PMA_PORT_XMIT_WAIT: - ret = dev->ipath_xmit_wait; + ret = (crp->cr_psxmitwaitcount) ? + ipath_read_creg32(dev->dd, crp->cr_psxmitwaitcount) : + dev->ipath_xmit_wait; break; default: ret = 0; @@ -1053,14 +1066,21 @@ static int recv_pma_get_portsamplesresult(struct ib_perf *pmp, struct ib_pma_portsamplesresult *p = (struct ib_pma_portsamplesresult *)pmp->data; struct ipath_ibdev *dev = to_idev(ibdev); + struct ipath_cregs const *crp = dev->dd->ipath_cregs; + u8 status; int i; memset(pmp->data, 0, sizeof(pmp->data)); p->tag = cpu_to_be16(dev->pma_tag); - p->sample_status = cpu_to_be16(dev->pma_sample_status); + if (crp->cr_psstat) + status = ipath_read_creg32(dev->dd, crp->cr_psstat); + else + status = dev->pma_sample_status; + p->sample_status = cpu_to_be16(status); for (i = 0; i < ARRAY_SIZE(dev->pma_counter_select); i++) - p->counter[i] = cpu_to_be32( - get_counter(dev, dev->pma_counter_select[i])); + p->counter[i] = (status != IB_PMA_SAMPLE_STATUS_DONE) ? 
0 : + cpu_to_be32( + get_counter(dev, crp, dev->pma_counter_select[i])); return reply((struct ib_smp *) pmp); } @@ -1071,16 +1091,23 @@ static int recv_pma_get_portsamplesresult_ext(struct ib_perf *pmp, struct ib_pma_portsamplesresult_ext *p = (struct ib_pma_portsamplesresult_ext *)pmp->data; struct ipath_ibdev *dev = to_idev(ibdev); + struct ipath_cregs const *crp = dev->dd->ipath_cregs; + u8 status; int i; memset(pmp->data, 0, sizeof(pmp->data)); p->tag = cpu_to_be16(dev->pma_tag); - p->sample_status = cpu_to_be16(dev->pma_sample_status); + if (crp->cr_psstat) + status = ipath_read_creg32(dev->dd, crp->cr_psstat); + else + status = dev->pma_sample_status; + p->sample_status = cpu_to_be16(status); /* 64 bits */ p->extended_width = __constant_cpu_to_be32(0x80000000); for (i = 0; i < ARRAY_SIZE(dev->pma_counter_select); i++) - p->counter[i] = cpu_to_be64( - get_counter(dev, dev->pma_counter_select[i])); + p->counter[i] = (status != IB_PMA_SAMPLE_STATUS_DONE) ? 0 : + cpu_to_be64( + get_counter(dev, crp, dev->pma_counter_select[i])); return reply((struct ib_smp *) pmp); } @@ -1113,6 +1140,8 @@ static int recv_pma_get_portcounters(struct ib_perf *pmp, dev->z_local_link_integrity_errors; cntrs.excessive_buffer_overrun_errors -= dev->z_excessive_buffer_overrun_errors; + cntrs.vl15_dropped -= dev->z_vl15_dropped; + cntrs.vl15_dropped += dev->n_vl15_dropped; memset(pmp->data, 0, sizeof(pmp->data)); @@ -1156,10 +1185,10 @@ static int recv_pma_get_portcounters(struct ib_perf *pmp, cntrs.excessive_buffer_overrun_errors = 0xFUL; p->lli_ebor_errors = (cntrs.local_link_integrity_errors << 4) | cntrs.excessive_buffer_overrun_errors; - if (dev->n_vl15_dropped > 0xFFFFUL) + if (cntrs.vl15_dropped > 0xFFFFUL) p->vl15_dropped = __constant_cpu_to_be16(0xFFFF); else - p->vl15_dropped = cpu_to_be16((u16)dev->n_vl15_dropped); + p->vl15_dropped = cpu_to_be16((u16)cntrs.vl15_dropped); if (cntrs.port_xmit_data > 0xFFFFFFFFUL) p->port_xmit_data = __constant_cpu_to_be32(0xFFFFFFFF); else @@ 
-1262,8 +1291,10 @@ static int recv_pma_set_portcounters(struct ib_perf *pmp, dev->z_excessive_buffer_overrun_errors = cntrs.excessive_buffer_overrun_errors; - if (p->counter_select & IB_PMA_SEL_PORT_VL15_DROPPED) + if (p->counter_select & IB_PMA_SEL_PORT_VL15_DROPPED) { dev->n_vl15_dropped = 0; + dev->z_vl15_dropped = cntrs.vl15_dropped; + } if (p->counter_select & IB_PMA_SEL_PORT_XMIT_DATA) dev->z_port_xmit_data = cntrs.port_xmit_data; diff --git a/drivers/infiniband/hw/ipath/ipath_registers.h b/drivers/infiniband/hw/ipath/ipath_registers.h index d7181d4..156ef14 100644 --- a/drivers/infiniband/hw/ipath/ipath_registers.h +++ b/drivers/infiniband/hw/ipath/ipath_registers.h @@ -469,6 +469,20 @@ struct ipath_cregs { ipath_creg cr_unsupvlcnt; ipath_creg cr_wordrcvcnt; ipath_creg cr_wordsendcnt; + ipath_creg cr_vl15droppedpktcnt; + ipath_creg cr_rxotherlocalphyerrcnt; + ipath_creg cr_excessbufferovflcnt; + ipath_creg cr_locallinkintegrityerrcnt; + ipath_creg cr_rxvlerrcnt; + ipath_creg cr_rxdlidfltrcnt; + ipath_creg cr_psstat; + ipath_creg cr_psstart; + ipath_creg cr_psinterval; + ipath_creg cr_psrcvdatacount; + ipath_creg cr_psrcvpktscount; + ipath_creg cr_psxmitdatacount; + ipath_creg cr_psxmitpktscount; + ipath_creg cr_psxmitwaitcount; }; #endif /* _IPATH_REGISTERS_H */ diff --git a/drivers/infiniband/hw/ipath/ipath_verbs.c b/drivers/infiniband/hw/ipath/ipath_verbs.c index c4c9984..a2baa61 100644 --- a/drivers/infiniband/hw/ipath/ipath_verbs.c +++ b/drivers/infiniband/hw/ipath/ipath_verbs.c @@ -1641,6 +1641,7 @@ int ipath_register_ib_device(struct ipath_devdata *dd) cntrs.local_link_integrity_errors; idev->z_excessive_buffer_overrun_errors = cntrs.excessive_buffer_overrun_errors; + idev->z_vl15_dropped = cntrs.vl15_dropped; /* * The system image GUID is supposed to be the same for all diff --git a/drivers/infiniband/hw/ipath/ipath_verbs.h b/drivers/infiniband/hw/ipath/ipath_verbs.h index 6ccb54f..1c89850 100644 --- a/drivers/infiniband/hw/ipath/ipath_verbs.h +++ 
b/drivers/infiniband/hw/ipath/ipath_verbs.h @@ -554,6 +554,7 @@ struct ipath_ibdev { u32 z_pkey_violations; /* starting count for PMA */ u32 z_local_link_integrity_errors; /* starting count for PMA */ u32 z_excessive_buffer_overrun_errors; /* starting count for PMA */ + u32 z_vl15_dropped; /* starting count for PMA */ u32 n_rc_resends; u32 n_rc_acks; u32 n_rc_qacks; @@ -598,6 +599,7 @@ struct ipath_verbs_counters { u64 port_rcv_packets; u32 local_link_integrity_errors; u32 excessive_buffer_overrun_errors; + u32 vl15_dropped; }; static inline struct ipath_mr *to_imr(struct ib_mr *ibmr) From arthur.jones at qlogic.com Tue Jan 8 13:18:05 2008 From: arthur.jones at qlogic.com (Arthur Jones) Date: Tue, 08 Jan 2008 13:18:05 -0800 Subject: [ofa-general] [PATCH 2/8] IB/ipath - export hardware counters more consistently In-Reply-To: <20080108211754.23996.14432.stgit@eng-46.internal.keyresearch.com> References: <20080108211754.23996.14432.stgit@eng-46.internal.keyresearch.com> Message-ID: <20080108211805.23996.63261.stgit@eng-46.internal.keyresearch.com> From: Ralph Campbell Various hardware counters are exported via the ipath file system (since it is binary data). The old file format was very dependent on the HW offsets for these registers. Newer HCA chips can have different counters at different offsets. This patch adds a level of indirection to make the file format consistent across HCAs. 
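The indirection described above can be sketched in user-space C. This is only an illustration, not code from the patch: every `demo_*` name, the three-field layout, and the register indices are hypothetical. The idea is that each chip supplies its own reader, which copies from chip-specific offsets into one fixed export structure.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fixed, chip-independent layout handed to readers (hypothetical subset). */
struct demo_counters {
	uint64_t tx_data_pkt_cnt;
	uint64_t rx_data_pkt_cnt;
	uint64_t rx_vl15_dropped_pkt_cnt;	/* not implemented on the "old" chip */
};

struct demo_dev;
typedef void (*demo_read_counters_fn)(const struct demo_dev *,
				      struct demo_counters *);

struct demo_dev {
	const uint64_t *regs;			/* stand-in for the counter register file */
	size_t nregs;
	demo_read_counters_fn read_counters;	/* chip-specific reader */
};

/* Hypothetical "old" chip: tx at index 0, rx at index 1, no VL15 counter. */
void demo_old_read_counters(const struct demo_dev *dd, struct demo_counters *c)
{
	assert(dd->nregs >= 2);
	c->tx_data_pkt_cnt = dd->regs[0];
	c->rx_data_pkt_cnt = dd->regs[1];
	c->rx_vl15_dropped_pkt_cnt = 0;		/* report zero for a missing counter */
}

/* Hypothetical "new" chip: same counters, different register offsets. */
void demo_new_read_counters(const struct demo_dev *dd, struct demo_counters *c)
{
	assert(dd->nregs >= 3);
	c->rx_vl15_dropped_pkt_cnt = dd->regs[0];
	c->rx_data_pkt_cnt = dd->regs[1];
	c->tx_data_pkt_cnt = dd->regs[2];
}

/* The export path never looks at register offsets itself. */
void demo_export_counters(const struct demo_dev *dd, struct demo_counters *out)
{
	dd->read_counters(dd, out);
}
```

With this shape, a consumer of the exported structure sees the same field order whichever reader filled it in, and counters a chip lacks simply read as zero.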
Signed-off-by: Ralph Campbell --- drivers/infiniband/hw/ipath/ipath_common.h | 20 +++ drivers/infiniband/hw/ipath/ipath_fs.c | 11 +- drivers/infiniband/hw/ipath/ipath_iba6110.c | 159 ++++++++++++++++++++++++++- drivers/infiniband/hw/ipath/ipath_iba6120.c | 153 +++++++++++++++++++++++++- drivers/infiniband/hw/ipath/ipath_kernel.h | 2 5 files changed, 328 insertions(+), 17 deletions(-) diff --git a/drivers/infiniband/hw/ipath/ipath_common.h b/drivers/infiniband/hw/ipath/ipath_common.h index 851df8a..aa780e7 100644 --- a/drivers/infiniband/hw/ipath/ipath_common.h +++ b/drivers/infiniband/hw/ipath/ipath_common.h @@ -579,7 +579,7 @@ struct ipath_flash { struct infinipath_counters { __u64 LBIntCnt; __u64 LBFlowStallCnt; - __u64 Reserved1; + __u64 TxSDmaDescCnt; /* was Reserved1 */ __u64 TxUnsupVLErrCnt; __u64 TxDataPktCnt; __u64 TxFlowPktCnt; @@ -615,12 +615,26 @@ struct infinipath_counters { __u64 RxP6HdrEgrOvflCnt; __u64 RxP7HdrEgrOvflCnt; __u64 RxP8HdrEgrOvflCnt; - __u64 Reserved6; - __u64 Reserved7; + __u64 RxP9HdrEgrOvflCnt; /* was Reserved6 */ + __u64 RxP10HdrEgrOvflCnt; /* was Reserved7 */ + __u64 RxP11HdrEgrOvflCnt; /* new for IBA7220 */ + __u64 RxP12HdrEgrOvflCnt; /* new for IBA7220 */ + __u64 RxP13HdrEgrOvflCnt; /* new for IBA7220 */ + __u64 RxP14HdrEgrOvflCnt; /* new for IBA7220 */ + __u64 RxP15HdrEgrOvflCnt; /* new for IBA7220 */ + __u64 RxP16HdrEgrOvflCnt; /* new for IBA7220 */ __u64 IBStatusChangeCnt; __u64 IBLinkErrRecoveryCnt; __u64 IBLinkDownedCnt; __u64 IBSymbolErrCnt; + /* The following are new for IBA7220 */ + __u64 RxVL15DroppedPktCnt; + __u64 RxOtherLocalPhyErrCnt; + __u64 PcieRetryBufDiagQwordCnt; + __u64 ExcessBufferOvflCnt; + __u64 LocalLinkIntegrityErrCnt; + __u64 RxVlErrCnt; + __u64 RxDlidFltrCnt; }; /* diff --git a/drivers/infiniband/hw/ipath/ipath_fs.c b/drivers/infiniband/hw/ipath/ipath_fs.c index 262c25d..52325e0 100644 --- a/drivers/infiniband/hw/ipath/ipath_fs.c +++ b/drivers/infiniband/hw/ipath/ipath_fs.c @@ -108,21 +108,16 @@ static 
const struct file_operations atomic_stats_ops = { .read = atomic_stats_read, }; -#define NUM_COUNTERS sizeof(struct infinipath_counters) / sizeof(u64) - static ssize_t atomic_counters_read(struct file *file, char __user *buf, size_t count, loff_t *ppos) { - u64 counters[NUM_COUNTERS]; - u16 i; + struct infinipath_counters counters; struct ipath_devdata *dd; dd = file->f_path.dentry->d_inode->i_private; + dd->ipath_f_read_counters(dd, &counters); - for (i = 0; i < NUM_COUNTERS; i++) - counters[i] = ipath_snap_cntr(dd, i); - - return simple_read_from_buffer(buf, count, ppos, counters, + return simple_read_from_buffer(buf, count, ppos, &counters, sizeof counters); } diff --git a/drivers/infiniband/hw/ipath/ipath_iba6110.c b/drivers/infiniband/hw/ipath/ipath_iba6110.c index c272a73..ce85879 100644 --- a/drivers/infiniband/hw/ipath/ipath_iba6110.c +++ b/drivers/infiniband/hw/ipath/ipath_iba6110.c @@ -148,10 +148,57 @@ struct _infinipath_do_not_use_kernel_regs { unsigned long long ReservedSW2[4]; }; -#define IPATH_KREG_OFFSET(field) (offsetof(struct \ - _infinipath_do_not_use_kernel_regs, field) / sizeof(u64)) +struct _infinipath_do_not_use_counters { + __u64 LBIntCnt; + __u64 LBFlowStallCnt; + __u64 Reserved1; + __u64 TxUnsupVLErrCnt; + __u64 TxDataPktCnt; + __u64 TxFlowPktCnt; + __u64 TxDwordCnt; + __u64 TxLenErrCnt; + __u64 TxMaxMinLenErrCnt; + __u64 TxUnderrunCnt; + __u64 TxFlowStallCnt; + __u64 TxDroppedPktCnt; + __u64 RxDroppedPktCnt; + __u64 RxDataPktCnt; + __u64 RxFlowPktCnt; + __u64 RxDwordCnt; + __u64 RxLenErrCnt; + __u64 RxMaxMinLenErrCnt; + __u64 RxICRCErrCnt; + __u64 RxVCRCErrCnt; + __u64 RxFlowCtrlErrCnt; + __u64 RxBadFormatCnt; + __u64 RxLinkProblemCnt; + __u64 RxEBPCnt; + __u64 RxLPCRCErrCnt; + __u64 RxBufOvflCnt; + __u64 RxTIDFullErrCnt; + __u64 RxTIDValidErrCnt; + __u64 RxPKeyMismatchCnt; + __u64 RxP0HdrEgrOvflCnt; + __u64 RxP1HdrEgrOvflCnt; + __u64 RxP2HdrEgrOvflCnt; + __u64 RxP3HdrEgrOvflCnt; + __u64 RxP4HdrEgrOvflCnt; + __u64 RxP5HdrEgrOvflCnt; + 
__u64 RxP6HdrEgrOvflCnt; + __u64 RxP7HdrEgrOvflCnt; + __u64 RxP8HdrEgrOvflCnt; + __u64 Reserved6; + __u64 Reserved7; + __u64 IBStatusChangeCnt; + __u64 IBLinkErrRecoveryCnt; + __u64 IBLinkDownedCnt; + __u64 IBSymbolErrCnt; +}; + +#define IPATH_KREG_OFFSET(field) (offsetof( \ + struct _infinipath_do_not_use_kernel_regs, field) / sizeof(u64)) #define IPATH_CREG_OFFSET(field) (offsetof( \ - struct infinipath_counters, field) / sizeof(u64)) + struct _infinipath_do_not_use_counters, field) / sizeof(u64)) static const struct ipath_kregs ipath_ht_kregs = { .kr_control = IPATH_KREG_OFFSET(Control), @@ -1614,6 +1661,111 @@ static void ipath_ht_free_irq(struct ipath_devdata *dd) dd->ipath_intconfig = 0; } +static void ipath_ht_read_counters(struct ipath_devdata *dd, + struct infinipath_counters *cntrs) +{ + cntrs->LBIntCnt = + ipath_snap_cntr(dd, IPATH_CREG_OFFSET(LBIntCnt)); + cntrs->LBFlowStallCnt = + ipath_snap_cntr(dd, IPATH_CREG_OFFSET(LBFlowStallCnt)); + cntrs->TxSDmaDescCnt = 0; + cntrs->TxUnsupVLErrCnt = + ipath_snap_cntr(dd, IPATH_CREG_OFFSET(TxUnsupVLErrCnt)); + cntrs->TxDataPktCnt = + ipath_snap_cntr(dd, IPATH_CREG_OFFSET(TxDataPktCnt)); + cntrs->TxFlowPktCnt = + ipath_snap_cntr(dd, IPATH_CREG_OFFSET(TxFlowPktCnt)); + cntrs->TxDwordCnt = + ipath_snap_cntr(dd, IPATH_CREG_OFFSET(TxDwordCnt)); + cntrs->TxLenErrCnt = + ipath_snap_cntr(dd, IPATH_CREG_OFFSET(TxLenErrCnt)); + cntrs->TxMaxMinLenErrCnt = + ipath_snap_cntr(dd, IPATH_CREG_OFFSET(TxMaxMinLenErrCnt)); + cntrs->TxUnderrunCnt = + ipath_snap_cntr(dd, IPATH_CREG_OFFSET(TxUnderrunCnt)); + cntrs->TxFlowStallCnt = + ipath_snap_cntr(dd, IPATH_CREG_OFFSET(TxFlowStallCnt)); + cntrs->TxDroppedPktCnt = + ipath_snap_cntr(dd, IPATH_CREG_OFFSET(TxDroppedPktCnt)); + cntrs->RxDroppedPktCnt = + ipath_snap_cntr(dd, IPATH_CREG_OFFSET(RxDroppedPktCnt)); + cntrs->RxDataPktCnt = + ipath_snap_cntr(dd, IPATH_CREG_OFFSET(RxDataPktCnt)); + cntrs->RxFlowPktCnt = + ipath_snap_cntr(dd, IPATH_CREG_OFFSET(RxFlowPktCnt)); + cntrs->RxDwordCnt 
= + ipath_snap_cntr(dd, IPATH_CREG_OFFSET(RxDwordCnt)); + cntrs->RxLenErrCnt = + ipath_snap_cntr(dd, IPATH_CREG_OFFSET(RxLenErrCnt)); + cntrs->RxMaxMinLenErrCnt = + ipath_snap_cntr(dd, IPATH_CREG_OFFSET(RxMaxMinLenErrCnt)); + cntrs->RxICRCErrCnt = + ipath_snap_cntr(dd, IPATH_CREG_OFFSET(RxICRCErrCnt)); + cntrs->RxVCRCErrCnt = + ipath_snap_cntr(dd, IPATH_CREG_OFFSET(RxVCRCErrCnt)); + cntrs->RxFlowCtrlErrCnt = + ipath_snap_cntr(dd, IPATH_CREG_OFFSET(RxFlowCtrlErrCnt)); + cntrs->RxBadFormatCnt = + ipath_snap_cntr(dd, IPATH_CREG_OFFSET(RxBadFormatCnt)); + cntrs->RxLinkProblemCnt = + ipath_snap_cntr(dd, IPATH_CREG_OFFSET(RxLinkProblemCnt)); + cntrs->RxEBPCnt = + ipath_snap_cntr(dd, IPATH_CREG_OFFSET(RxEBPCnt)); + cntrs->RxLPCRCErrCnt = + ipath_snap_cntr(dd, IPATH_CREG_OFFSET(RxLPCRCErrCnt)); + cntrs->RxBufOvflCnt = + ipath_snap_cntr(dd, IPATH_CREG_OFFSET(RxBufOvflCnt)); + cntrs->RxTIDFullErrCnt = + ipath_snap_cntr(dd, IPATH_CREG_OFFSET(RxTIDFullErrCnt)); + cntrs->RxTIDValidErrCnt = + ipath_snap_cntr(dd, IPATH_CREG_OFFSET(RxTIDValidErrCnt)); + cntrs->RxPKeyMismatchCnt = + ipath_snap_cntr(dd, IPATH_CREG_OFFSET(RxPKeyMismatchCnt)); + cntrs->RxP0HdrEgrOvflCnt = + ipath_snap_cntr(dd, IPATH_CREG_OFFSET(RxP0HdrEgrOvflCnt)); + cntrs->RxP1HdrEgrOvflCnt = + ipath_snap_cntr(dd, IPATH_CREG_OFFSET(RxP1HdrEgrOvflCnt)); + cntrs->RxP2HdrEgrOvflCnt = + ipath_snap_cntr(dd, IPATH_CREG_OFFSET(RxP2HdrEgrOvflCnt)); + cntrs->RxP3HdrEgrOvflCnt = + ipath_snap_cntr(dd, IPATH_CREG_OFFSET(RxP3HdrEgrOvflCnt)); + cntrs->RxP4HdrEgrOvflCnt = + ipath_snap_cntr(dd, IPATH_CREG_OFFSET(RxP4HdrEgrOvflCnt)); + cntrs->RxP5HdrEgrOvflCnt = + ipath_snap_cntr(dd, IPATH_CREG_OFFSET(RxP5HdrEgrOvflCnt)); + cntrs->RxP6HdrEgrOvflCnt = + ipath_snap_cntr(dd, IPATH_CREG_OFFSET(RxP6HdrEgrOvflCnt)); + cntrs->RxP7HdrEgrOvflCnt = + ipath_snap_cntr(dd, IPATH_CREG_OFFSET(RxP7HdrEgrOvflCnt)); + cntrs->RxP8HdrEgrOvflCnt = + ipath_snap_cntr(dd, IPATH_CREG_OFFSET(RxP8HdrEgrOvflCnt)); + cntrs->RxP9HdrEgrOvflCnt = 0; + 
cntrs->RxP10HdrEgrOvflCnt = 0; + cntrs->RxP11HdrEgrOvflCnt = 0; + cntrs->RxP12HdrEgrOvflCnt = 0; + cntrs->RxP13HdrEgrOvflCnt = 0; + cntrs->RxP14HdrEgrOvflCnt = 0; + cntrs->RxP15HdrEgrOvflCnt = 0; + cntrs->RxP16HdrEgrOvflCnt = 0; + cntrs->IBStatusChangeCnt = + ipath_snap_cntr(dd, IPATH_CREG_OFFSET(IBStatusChangeCnt)); + cntrs->IBLinkErrRecoveryCnt = + ipath_snap_cntr(dd, IPATH_CREG_OFFSET(IBLinkErrRecoveryCnt)); + cntrs->IBLinkDownedCnt = + ipath_snap_cntr(dd, IPATH_CREG_OFFSET(IBLinkDownedCnt)); + cntrs->IBSymbolErrCnt = + ipath_snap_cntr(dd, IPATH_CREG_OFFSET(IBSymbolErrCnt)); + cntrs->RxVL15DroppedPktCnt = 0; + cntrs->RxOtherLocalPhyErrCnt = 0; + cntrs->PcieRetryBufDiagQwordCnt = 0; + cntrs->ExcessBufferOvflCnt = dd->ipath_overrun_thresh_errs; + cntrs->LocalLinkIntegrityErrCnt = + (dd->ipath_flags & IPATH_GPIO_ERRINTRS) ? + dd->ipath_lli_errs : dd->ipath_lli_errors; + cntrs->RxVlErrCnt = 0; + cntrs->RxDlidFltrCnt = 0; +} + /** * ipath_init_iba6110_funcs - set up the chip-specific function pointers * @dd: the infinipath device @@ -1638,6 +1790,7 @@ void ipath_init_iba6110_funcs(struct ipath_devdata *dd) dd->ipath_f_setextled = ipath_setup_ht_setextled; dd->ipath_f_get_base_info = ipath_ht_get_base_info; dd->ipath_f_free_irq = ipath_ht_free_irq; + dd->ipath_f_read_counters = ipath_ht_read_counters; /* * initialize chip-specific variables diff --git a/drivers/infiniband/hw/ipath/ipath_iba6120.c b/drivers/infiniband/hw/ipath/ipath_iba6120.c index e6893eb..97ae117 100644 --- a/drivers/infiniband/hw/ipath/ipath_iba6120.c +++ b/drivers/infiniband/hw/ipath/ipath_iba6120.c @@ -145,10 +145,57 @@ struct _infinipath_do_not_use_kernel_regs { unsigned long long Reserved12; }; -#define IPATH_KREG_OFFSET(field) (offsetof(struct \ - _infinipath_do_not_use_kernel_regs, field) / sizeof(u64)) +struct _infinipath_do_not_use_counters { + __u64 LBIntCnt; + __u64 LBFlowStallCnt; + __u64 Reserved1; + __u64 TxUnsupVLErrCnt; + __u64 TxDataPktCnt; + __u64 TxFlowPktCnt; + __u64 TxDwordCnt; + 
__u64 TxLenErrCnt; + __u64 TxMaxMinLenErrCnt; + __u64 TxUnderrunCnt; + __u64 TxFlowStallCnt; + __u64 TxDroppedPktCnt; + __u64 RxDroppedPktCnt; + __u64 RxDataPktCnt; + __u64 RxFlowPktCnt; + __u64 RxDwordCnt; + __u64 RxLenErrCnt; + __u64 RxMaxMinLenErrCnt; + __u64 RxICRCErrCnt; + __u64 RxVCRCErrCnt; + __u64 RxFlowCtrlErrCnt; + __u64 RxBadFormatCnt; + __u64 RxLinkProblemCnt; + __u64 RxEBPCnt; + __u64 RxLPCRCErrCnt; + __u64 RxBufOvflCnt; + __u64 RxTIDFullErrCnt; + __u64 RxTIDValidErrCnt; + __u64 RxPKeyMismatchCnt; + __u64 RxP0HdrEgrOvflCnt; + __u64 RxP1HdrEgrOvflCnt; + __u64 RxP2HdrEgrOvflCnt; + __u64 RxP3HdrEgrOvflCnt; + __u64 RxP4HdrEgrOvflCnt; + __u64 RxP5HdrEgrOvflCnt; + __u64 RxP6HdrEgrOvflCnt; + __u64 RxP7HdrEgrOvflCnt; + __u64 RxP8HdrEgrOvflCnt; + __u64 Reserved6; + __u64 Reserved7; + __u64 IBStatusChangeCnt; + __u64 IBLinkErrRecoveryCnt; + __u64 IBLinkDownedCnt; + __u64 IBSymbolErrCnt; +}; + +#define IPATH_KREG_OFFSET(field) (offsetof( \ + struct _infinipath_do_not_use_kernel_regs, field) / sizeof(u64)) #define IPATH_CREG_OFFSET(field) (offsetof( \ - struct infinipath_counters, field) / sizeof(u64)) + struct _infinipath_do_not_use_counters, field) / sizeof(u64)) static const struct ipath_kregs ipath_pe_kregs = { .kr_control = IPATH_KREG_OFFSET(Control), @@ -1368,6 +1415,105 @@ static void ipath_pe_free_irq(struct ipath_devdata *dd) dd->ipath_irq = 0; } +static void ipath_pe_read_counters(struct ipath_devdata *dd, + struct infinipath_counters *cntrs) +{ + cntrs->LBIntCnt = + ipath_snap_cntr(dd, IPATH_CREG_OFFSET(LBIntCnt)); + cntrs->LBFlowStallCnt = + ipath_snap_cntr(dd, IPATH_CREG_OFFSET(LBFlowStallCnt)); + cntrs->TxSDmaDescCnt = 0; + cntrs->TxUnsupVLErrCnt = + ipath_snap_cntr(dd, IPATH_CREG_OFFSET(TxUnsupVLErrCnt)); + cntrs->TxDataPktCnt = + ipath_snap_cntr(dd, IPATH_CREG_OFFSET(TxDataPktCnt)); + cntrs->TxFlowPktCnt = + ipath_snap_cntr(dd, IPATH_CREG_OFFSET(TxFlowPktCnt)); + cntrs->TxDwordCnt = + ipath_snap_cntr(dd, IPATH_CREG_OFFSET(TxDwordCnt)); + 
cntrs->TxLenErrCnt = + ipath_snap_cntr(dd, IPATH_CREG_OFFSET(TxLenErrCnt)); + cntrs->TxMaxMinLenErrCnt = + ipath_snap_cntr(dd, IPATH_CREG_OFFSET(TxMaxMinLenErrCnt)); + cntrs->TxUnderrunCnt = + ipath_snap_cntr(dd, IPATH_CREG_OFFSET(TxUnderrunCnt)); + cntrs->TxFlowStallCnt = + ipath_snap_cntr(dd, IPATH_CREG_OFFSET(TxFlowStallCnt)); + cntrs->TxDroppedPktCnt = + ipath_snap_cntr(dd, IPATH_CREG_OFFSET(TxDroppedPktCnt)); + cntrs->RxDroppedPktCnt = + ipath_snap_cntr(dd, IPATH_CREG_OFFSET(RxDroppedPktCnt)); + cntrs->RxDataPktCnt = + ipath_snap_cntr(dd, IPATH_CREG_OFFSET(RxDataPktCnt)); + cntrs->RxFlowPktCnt = + ipath_snap_cntr(dd, IPATH_CREG_OFFSET(RxFlowPktCnt)); + cntrs->RxDwordCnt = + ipath_snap_cntr(dd, IPATH_CREG_OFFSET(RxDwordCnt)); + cntrs->RxLenErrCnt = + ipath_snap_cntr(dd, IPATH_CREG_OFFSET(RxLenErrCnt)); + cntrs->RxMaxMinLenErrCnt = + ipath_snap_cntr(dd, IPATH_CREG_OFFSET(RxMaxMinLenErrCnt)); + cntrs->RxICRCErrCnt = + ipath_snap_cntr(dd, IPATH_CREG_OFFSET(RxICRCErrCnt)); + cntrs->RxVCRCErrCnt = + ipath_snap_cntr(dd, IPATH_CREG_OFFSET(RxVCRCErrCnt)); + cntrs->RxFlowCtrlErrCnt = + ipath_snap_cntr(dd, IPATH_CREG_OFFSET(RxFlowCtrlErrCnt)); + cntrs->RxBadFormatCnt = + ipath_snap_cntr(dd, IPATH_CREG_OFFSET(RxBadFormatCnt)); + cntrs->RxLinkProblemCnt = + ipath_snap_cntr(dd, IPATH_CREG_OFFSET(RxLinkProblemCnt)); + cntrs->RxEBPCnt = + ipath_snap_cntr(dd, IPATH_CREG_OFFSET(RxEBPCnt)); + cntrs->RxLPCRCErrCnt = + ipath_snap_cntr(dd, IPATH_CREG_OFFSET(RxLPCRCErrCnt)); + cntrs->RxBufOvflCnt = + ipath_snap_cntr(dd, IPATH_CREG_OFFSET(RxBufOvflCnt)); + cntrs->RxTIDFullErrCnt = + ipath_snap_cntr(dd, IPATH_CREG_OFFSET(RxTIDFullErrCnt)); + cntrs->RxTIDValidErrCnt = + ipath_snap_cntr(dd, IPATH_CREG_OFFSET(RxTIDValidErrCnt)); + cntrs->RxPKeyMismatchCnt = + ipath_snap_cntr(dd, IPATH_CREG_OFFSET(RxPKeyMismatchCnt)); + cntrs->RxP0HdrEgrOvflCnt = + ipath_snap_cntr(dd, IPATH_CREG_OFFSET(RxP0HdrEgrOvflCnt)); + cntrs->RxP1HdrEgrOvflCnt = + ipath_snap_cntr(dd, 
IPATH_CREG_OFFSET(RxP1HdrEgrOvflCnt)); + cntrs->RxP2HdrEgrOvflCnt = + ipath_snap_cntr(dd, IPATH_CREG_OFFSET(RxP2HdrEgrOvflCnt)); + cntrs->RxP3HdrEgrOvflCnt = + ipath_snap_cntr(dd, IPATH_CREG_OFFSET(RxP3HdrEgrOvflCnt)); + cntrs->RxP4HdrEgrOvflCnt = + ipath_snap_cntr(dd, IPATH_CREG_OFFSET(RxP4HdrEgrOvflCnt)); + cntrs->RxP5HdrEgrOvflCnt = 0; + cntrs->RxP6HdrEgrOvflCnt = 0; + cntrs->RxP7HdrEgrOvflCnt = 0; + cntrs->RxP8HdrEgrOvflCnt = 0; + cntrs->RxP9HdrEgrOvflCnt = 0; + cntrs->RxP10HdrEgrOvflCnt = 0; + cntrs->RxP11HdrEgrOvflCnt = 0; + cntrs->RxP12HdrEgrOvflCnt = 0; + cntrs->RxP13HdrEgrOvflCnt = 0; + cntrs->RxP14HdrEgrOvflCnt = 0; + cntrs->RxP15HdrEgrOvflCnt = 0; + cntrs->RxP16HdrEgrOvflCnt = 0; + cntrs->IBStatusChangeCnt = + ipath_snap_cntr(dd, IPATH_CREG_OFFSET(IBStatusChangeCnt)); + cntrs->IBLinkErrRecoveryCnt = + ipath_snap_cntr(dd, IPATH_CREG_OFFSET(IBLinkErrRecoveryCnt)); + cntrs->IBLinkDownedCnt = + ipath_snap_cntr(dd, IPATH_CREG_OFFSET(IBLinkDownedCnt)); + cntrs->IBSymbolErrCnt = + ipath_snap_cntr(dd, IPATH_CREG_OFFSET(IBSymbolErrCnt)); + cntrs->RxVL15DroppedPktCnt = 0; + cntrs->RxOtherLocalPhyErrCnt = 0; + cntrs->PcieRetryBufDiagQwordCnt = 0; + cntrs->ExcessBufferOvflCnt = dd->ipath_overrun_thresh_errs; + cntrs->LocalLinkIntegrityErrCnt = dd->ipath_lli_errs; + cntrs->RxVlErrCnt = 0; + cntrs->RxDlidFltrCnt = 0; +} + /* * On platforms using this chip, and not having ordered WC stores, we * can get TXE parity errors due to speculative reads to the PIO buffers, @@ -1427,6 +1573,7 @@ void ipath_init_iba6120_funcs(struct ipath_devdata *dd) /* initialize chip-specific variables */ dd->ipath_f_tidtemplate = ipath_pe_tidtemplate; + dd->ipath_f_read_counters = ipath_pe_read_counters; /* * setup the register offsets, since they are different for each diff --git a/drivers/infiniband/hw/ipath/ipath_kernel.h b/drivers/infiniband/hw/ipath/ipath_kernel.h index bfe84a1..c76e76c 100644 --- a/drivers/infiniband/hw/ipath/ipath_kernel.h +++ 
b/drivers/infiniband/hw/ipath/ipath_kernel.h @@ -253,6 +253,8 @@ struct ipath_devdata { int (*ipath_f_get_base_info)(struct ipath_portdata *, void *); /* free irq */ void (*ipath_f_free_irq)(struct ipath_devdata *); + void (*ipath_f_read_counters)(struct ipath_devdata *, + struct infinipath_counters *); struct ipath_ibdev *verbs_dev; struct timer_list verbs_timer; /* total dwords sent (summed from counter) */ From arthur.jones at qlogic.com Tue Jan 8 13:18:10 2008 From: arthur.jones at qlogic.com (Arthur Jones) Date: Tue, 08 Jan 2008 13:18:10 -0800 Subject: [ofa-general] [PATCH 3/8] IB/ipath - random comment fixes In-Reply-To: <20080108211754.23996.14432.stgit@eng-46.internal.keyresearch.com> References: <20080108211754.23996.14432.stgit@eng-46.internal.keyresearch.com> Message-ID: <20080108211810.23996.13752.stgit@eng-46.internal.keyresearch.com> From: Dave Olson ipath comments cleanup Signed-off-by: Dave Olson --- drivers/infiniband/hw/ipath/ipath_driver.c | 2 +- drivers/infiniband/hw/ipath/ipath_file_ops.c | 5 ----- drivers/infiniband/hw/ipath/ipath_iba6110.c | 3 ++- drivers/infiniband/hw/ipath/ipath_iba6120.c | 2 +- drivers/infiniband/hw/ipath/ipath_keys.c | 5 ++--- drivers/infiniband/hw/ipath/ipath_verbs.c | 7 +++++-- 6 files changed, 11 insertions(+), 13 deletions(-) diff --git a/drivers/infiniband/hw/ipath/ipath_driver.c b/drivers/infiniband/hw/ipath/ipath_driver.c index 4b37e46..f657295 100644 --- a/drivers/infiniband/hw/ipath/ipath_driver.c +++ b/drivers/infiniband/hw/ipath/ipath_driver.c @@ -1193,7 +1193,7 @@ reloop: be32_to_cpu(hdr->bth[0]) & 0xff); else { /* - * error packet, type of error unknown. + * error packet, type of error unknown. * Probably type 3, but we don't know, so don't * even try to print the opcode, etc. 
*/ diff --git a/drivers/infiniband/hw/ipath/ipath_file_ops.c b/drivers/infiniband/hw/ipath/ipath_file_ops.c index 2a75faf..9e5714d 100644 --- a/drivers/infiniband/hw/ipath/ipath_file_ops.c +++ b/drivers/infiniband/hw/ipath/ipath_file_ops.c @@ -1050,11 +1050,6 @@ static int mmap_piobufs(struct vm_area_struct *vma, phys = dd->ipath_physaddr + piobufs; - /* - * Don't mark this as non-cached, or we don't get the - * write combining behavior we want on the PIO buffers! - */ - #if defined(__powerpc__) /* There isn't a generic way to specify writethrough mappings */ pgprot_val(vma->vm_page_prot) |= _PAGE_NO_CACHE; diff --git a/drivers/infiniband/hw/ipath/ipath_iba6110.c b/drivers/infiniband/hw/ipath/ipath_iba6110.c index ce85879..dffb682 100644 --- a/drivers/infiniband/hw/ipath/ipath_iba6110.c +++ b/drivers/infiniband/hw/ipath/ipath_iba6110.c @@ -1022,7 +1022,8 @@ static int ipath_setup_ht_config(struct ipath_devdata *dd, do { u8 cap_type; - /* the HT capability type byte is 3 bytes after the + /* + * The HT capability type byte is 3 bytes after the * capability byte. 
*/ if (pci_read_config_byte(pdev, pos + 3, &cap_type)) { diff --git a/drivers/infiniband/hw/ipath/ipath_iba6120.c b/drivers/infiniband/hw/ipath/ipath_iba6120.c index 97ae117..66925b2 100644 --- a/drivers/infiniband/hw/ipath/ipath_iba6120.c +++ b/drivers/infiniband/hw/ipath/ipath_iba6120.c @@ -535,7 +535,7 @@ static void ipath_pe_handle_hwerrors(struct ipath_devdata *dd, char *msg, if (hwerrs & INFINIPATH_HWE_SERDESPLLFAILED) { /* - * If it occurs, it is left masked since the eternal + * If it occurs, it is left masked since the external * interface is unused */ dd->ipath_hwerrmask &= ~INFINIPATH_HWE_SERDESPLLFAILED; diff --git a/drivers/infiniband/hw/ipath/ipath_keys.c b/drivers/infiniband/hw/ipath/ipath_keys.c index 85a4aef..8f32b17 100644 --- a/drivers/infiniband/hw/ipath/ipath_keys.c +++ b/drivers/infiniband/hw/ipath/ipath_keys.c @@ -128,9 +128,8 @@ int ipath_lkey_ok(struct ipath_qp *qp, struct ipath_sge *isge, int ret; /* - * We use LKEY == zero to mean a physical kmalloc() address. - * This is a bit of a hack since we rely on dma_map_single() - * being reversible by calling bus_to_virt(). + * We use LKEY == zero for kernel virtual addresses + * (see ipath_get_dma_mr and ipath_dma.c). 
*/ if (sge->lkey == 0) { struct ipath_pd *pd = to_ipd(qp->ibqp.pd); diff --git a/drivers/infiniband/hw/ipath/ipath_verbs.c b/drivers/infiniband/hw/ipath/ipath_verbs.c index a2baa61..904ff15 100644 --- a/drivers/infiniband/hw/ipath/ipath_verbs.c +++ b/drivers/infiniband/hw/ipath/ipath_verbs.c @@ -943,7 +943,7 @@ bail: * ipath_verbs_send - send a packet * @qp: the QP to send on * @hdr: the packet header - * @hdrwords: the number of words in the header + * @hdrwords: the number of 32-bit words in the header * @ss: the SGE to send * @len: the length of the packet in bytes */ @@ -955,7 +955,10 @@ int ipath_verbs_send(struct ipath_qp *qp, struct ipath_ib_header *hdr, int ret; u32 dwords = (len + 3) >> 2; - /* +1 is for the qword padding of pbc */ + /* + * Calculate the send buffer trigger address. + * The +1 counts for the pbc control dword following the pbc length. + */ plen = hdrwords + dwords + 1; /* Drop non-VL15 packets if we are not in the active state */ From arthur.jones at qlogic.com Tue Jan 8 13:18:15 2008 From: arthur.jones at qlogic.com (Arthur Jones) Date: Tue, 08 Jan 2008 13:18:15 -0800 Subject: [ofa-general] [PATCH 4/8] IB/ipath - allow more flexible user register alignments In-Reply-To: <20080108211754.23996.14432.stgit@eng-46.internal.keyresearch.com> References: <20080108211754.23996.14432.stgit@eng-46.internal.keyresearch.com> Message-ID: <20080108211815.23996.70907.stgit@eng-46.internal.keyresearch.com> From: Ralph Campbell User registers have different alignments on different chips (4KB on older, 64KB on 7220). Allow mapping the user registers on kernels with page sizes up to 64K. 
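As a rough illustration of the address arithmetic involved (not code from the patch), the per-port user register offset is the register base plus the alignment times the port number. The helper name `demo_ureg_offset` and the sample values are hypothetical; the power-of-two check is an assumption of this sketch, not something the patch enforces.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Byte offset of one port's user register block:
 * base + alignment * port, mirroring
 * "dd->ipath_uregbase + dd->ipath_ureg_align * pd->port_port".
 */
uint64_t demo_ureg_offset(uint64_t uregbase, uint64_t ureg_align,
			  unsigned int port)
{
	/* Sketch-only assumption: alignment is a nonzero power of two. */
	assert(ureg_align != 0 && (ureg_align & (ureg_align - 1)) == 0);
	return uregbase + ureg_align * port;
}
```

For example, with a 4KB alignment port 3 sits 12288 bytes past the base, while with a 64KB alignment the same port sits 196608 bytes past it.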
Signed-off-by: Ralph Campbell --- drivers/infiniband/hw/ipath/ipath_file_ops.c | 6 +++--- drivers/infiniband/hw/ipath/ipath_iba6110.c | 7 +++++++ drivers/infiniband/hw/ipath/ipath_iba6120.c | 8 ++++++++ drivers/infiniband/hw/ipath/ipath_kernel.h | 6 ++++-- 4 files changed, 22 insertions(+), 5 deletions(-) diff --git a/drivers/infiniband/hw/ipath/ipath_file_ops.c b/drivers/infiniband/hw/ipath/ipath_file_ops.c index 9e5714d..0b877ed 100644 --- a/drivers/infiniband/hw/ipath/ipath_file_ops.c +++ b/drivers/infiniband/hw/ipath/ipath_file_ops.c @@ -169,7 +169,7 @@ static int ipath_get_base_info(struct file *fp, kinfo->spi_piocnt = dd->ipath_pbufsport; kinfo->spi_piobufbase = (u64) pd->port_piobufs; kinfo->__spi_uregbase = (u64) dd->ipath_uregbase + - dd->ipath_palign * pd->port_port; + dd->ipath_ureg_align * pd->port_port; } else if (master) { kinfo->spi_piocnt = (dd->ipath_pbufsport / subport_cnt) + (dd->ipath_pbufsport % subport_cnt); @@ -186,7 +186,7 @@ static int ipath_get_base_info(struct file *fp, } if (shared) { kinfo->spi_port_uregbase = (u64) dd->ipath_uregbase + - dd->ipath_palign * pd->port_port; + dd->ipath_ureg_align * pd->port_port; kinfo->spi_port_rcvegrbuf = kinfo->spi_rcv_egrbufs; kinfo->spi_port_rcvhdr_base = kinfo->spi_rcvhdr_base; kinfo->spi_port_rcvhdr_tailaddr = kinfo->spi_rcvhdr_tailaddr; @@ -1271,7 +1271,7 @@ static int ipath_mmap(struct file *fp, struct vm_area_struct *vma) goto bail; } - ureg = dd->ipath_uregbase + dd->ipath_palign * pd->port_port; + ureg = dd->ipath_uregbase + dd->ipath_ureg_align * pd->port_port; if (!pd->port_subport_cnt) { /* port is not shared */ piocnt = dd->ipath_pbufsport; diff --git a/drivers/infiniband/hw/ipath/ipath_iba6110.c b/drivers/infiniband/hw/ipath/ipath_iba6110.c index dffb682..5ecf65b 100644 --- a/drivers/infiniband/hw/ipath/ipath_iba6110.c +++ b/drivers/infiniband/hw/ipath/ipath_iba6110.c @@ -739,6 +739,13 @@ static int ipath_ht_boardname(struct ipath_devdata *dd, char *name, dd->ipath_htspeed); ret = 0; + /* 
+ * set here, not in ipath_init_*_funcs because we have to do + * it after we can read chip registers. + */ + dd->ipath_ureg_align = + ipath_read_kreg32(dd, dd->ipath_kregs->kr_pagealign); + bail: return ret; } diff --git a/drivers/infiniband/hw/ipath/ipath_iba6120.c b/drivers/infiniband/hw/ipath/ipath_iba6120.c index 66925b2..23de8da 100644 --- a/drivers/infiniband/hw/ipath/ipath_iba6120.c +++ b/drivers/infiniband/hw/ipath/ipath_iba6120.c @@ -613,6 +613,14 @@ static int ipath_pe_boardname(struct ipath_devdata *dd, char *name, dd->ipath_f_put_tid = ipath_pe_put_tid_2; } + + /* + * set here, not in ipath_init_*_funcs because we have to do + * it after we can read chip registers. + */ + dd->ipath_ureg_align = + ipath_read_kreg32(dd, dd->ipath_kregs->kr_pagealign); + return ret; } diff --git a/drivers/infiniband/hw/ipath/ipath_kernel.h b/drivers/infiniband/hw/ipath/ipath_kernel.h index c76e76c..19e0dc2 100644 --- a/drivers/infiniband/hw/ipath/ipath_kernel.h +++ b/drivers/infiniband/hw/ipath/ipath_kernel.h @@ -395,6 +395,8 @@ struct ipath_devdata { void *ipath_dummy_hdrq; /* used after port close */ dma_addr_t ipath_dummy_hdrq_phys; + unsigned long ipath_ureg_align; /* user register alignment */ + /* * Shadow copies of registers; size indicates read access size. 
* Most of them are readonly, but some are write-only register, @@ -865,7 +867,7 @@ static inline u32 ipath_read_ureg32(const struct ipath_devdata *dd, return readl(regno + (u64 __iomem *) (dd->ipath_uregbase + (char __iomem *)dd->ipath_kregbase + - dd->ipath_palign * port)); + dd->ipath_ureg_align * port)); } /** @@ -882,7 +884,7 @@ static inline void ipath_write_ureg(const struct ipath_devdata *dd, { u64 __iomem *ubase = (u64 __iomem *) (dd->ipath_uregbase + (char __iomem *) dd->ipath_kregbase + - dd->ipath_palign * port); + dd->ipath_ureg_align * port); if (dd->ipath_kregbase) writeq(value, &ubase[regno]); } From arthur.jones at qlogic.com Tue Jan 8 13:18:20 2008 From: arthur.jones at qlogic.com (Arthur Jones) Date: Tue, 08 Jan 2008 13:18:20 -0800 Subject: [ofa-general] [PATCH 5/8] IB/ipath - port config has on-chip effects for 7220 In-Reply-To: <20080108211754.23996.14432.stgit@eng-46.internal.keyresearch.com> References: <20080108211754.23996.14432.stgit@eng-46.internal.keyresearch.com> Message-ID: <20080108211820.23996.27322.stgit@eng-46.internal.keyresearch.com> From: Ralph Campbell The number of configured ports for the 7220 changes the number of eager TIDs available per port, for all but port 0 (kernel port) which remains constant, so add a field to give port0 count separate from the portdata structure. 
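The offset calculation this implies can be sketched as follows. This is an illustration under assumed names (`demo_user_egr_offset` is hypothetical) and assumed sample counts; it mirrors the expression `(pd->port_port - 1) * egrcnt + dd->ipath_p0_rcvegrcnt` from the diff, which applies to user ports (port 1 and up).

```c
#include <assert.h>
#include <stdint.h>

/*
 * First eager TID index for a user port when port 0 owns a
 * separately sized block of p0_rcvegrcnt entries at the start
 * and every other port owns rcvegrcnt entries.
 */
uint32_t demo_user_egr_offset(uint32_t port, uint32_t p0_rcvegrcnt,
			      uint32_t rcvegrcnt)
{
	assert(port >= 1);	/* port 0 (kernel port) starts at offset 0 */
	return (port - 1) * rcvegrcnt + p0_rcvegrcnt;
}
```

When port 0 has the same count as the other ports, this reduces to the previous `port * egrcnt` formula, so chips with a uniform count see no change.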
Signed-off-by: Ralph Campbell --- drivers/infiniband/hw/ipath/ipath_file_ops.c | 2 +- drivers/infiniband/hw/ipath/ipath_iba6110.c | 9 +++++++++ drivers/infiniband/hw/ipath/ipath_iba6120.c | 9 +++++++++ drivers/infiniband/hw/ipath/ipath_init_chip.c | 5 ++--- drivers/infiniband/hw/ipath/ipath_kernel.h | 3 +++ 5 files changed, 24 insertions(+), 4 deletions(-) diff --git a/drivers/infiniband/hw/ipath/ipath_file_ops.c b/drivers/infiniband/hw/ipath/ipath_file_ops.c index 0b877ed..7b2f59a 100644 --- a/drivers/infiniband/hw/ipath/ipath_file_ops.c +++ b/drivers/infiniband/hw/ipath/ipath_file_ops.c @@ -882,7 +882,7 @@ static int ipath_create_user_egr(struct ipath_portdata *pd) egrcnt = dd->ipath_rcvegrcnt; /* TID number offset for this port */ - egroff = pd->port_port * egrcnt; + egroff = (pd->port_port - 1) * egrcnt + dd->ipath_p0_rcvegrcnt; egrsize = dd->ipath_rcvegrbufsize; ipath_cdbg(VERBOSE, "Allocating %d egr buffers, at egrtid " "offset %x, egrsize %u\n", egrcnt, egroff, egrsize); diff --git a/drivers/infiniband/hw/ipath/ipath_iba6110.c b/drivers/infiniband/hw/ipath/ipath_iba6110.c index 5ecf65b..0c900c5 100644 --- a/drivers/infiniband/hw/ipath/ipath_iba6110.c +++ b/drivers/infiniband/hw/ipath/ipath_iba6110.c @@ -1669,6 +1669,14 @@ static void ipath_ht_free_irq(struct ipath_devdata *dd) dd->ipath_intconfig = 0; } +static void ipath_ht_config_ports(struct ipath_devdata *dd, ushort cfgports) +{ + dd->ipath_portcnt = + ipath_read_kreg32(dd, dd->ipath_kregs->kr_portcnt); + dd->ipath_p0_rcvegrcnt = + ipath_read_kreg32(dd, dd->ipath_kregs->kr_rcvegrcnt); +} + static void ipath_ht_read_counters(struct ipath_devdata *dd, struct infinipath_counters *cntrs) { @@ -1798,6 +1806,7 @@ void ipath_init_iba6110_funcs(struct ipath_devdata *dd) dd->ipath_f_setextled = ipath_setup_ht_setextled; dd->ipath_f_get_base_info = ipath_ht_get_base_info; dd->ipath_f_free_irq = ipath_ht_free_irq; + dd->ipath_f_config_ports = ipath_ht_config_ports; dd->ipath_f_read_counters = 
ipath_ht_read_counters; /* diff --git a/drivers/infiniband/hw/ipath/ipath_iba6120.c b/drivers/infiniband/hw/ipath/ipath_iba6120.c index 23de8da..066a8ea 100644 --- a/drivers/infiniband/hw/ipath/ipath_iba6120.c +++ b/drivers/infiniband/hw/ipath/ipath_iba6120.c @@ -1423,6 +1423,14 @@ static void ipath_pe_free_irq(struct ipath_devdata *dd) dd->ipath_irq = 0; } +static void ipath_pe_config_ports(struct ipath_devdata *dd, ushort cfgports) +{ + dd->ipath_portcnt = + ipath_read_kreg32(dd, dd->ipath_kregs->kr_portcnt); + dd->ipath_p0_rcvegrcnt = + ipath_read_kreg32(dd, dd->ipath_kregs->kr_rcvegrcnt); +} + static void ipath_pe_read_counters(struct ipath_devdata *dd, struct infinipath_counters *cntrs) { @@ -1581,6 +1589,7 @@ void ipath_init_iba6120_funcs(struct ipath_devdata *dd) /* initialize chip-specific variables */ dd->ipath_f_tidtemplate = ipath_pe_tidtemplate; + dd->ipath_f_config_ports = ipath_pe_config_ports; dd->ipath_f_read_counters = ipath_pe_read_counters; /* diff --git a/drivers/infiniband/hw/ipath/ipath_init_chip.c b/drivers/infiniband/hw/ipath/ipath_init_chip.c index 98b5146..3174c31 100644 --- a/drivers/infiniband/hw/ipath/ipath_init_chip.c +++ b/drivers/infiniband/hw/ipath/ipath_init_chip.c @@ -91,7 +91,7 @@ static int create_port0_egr(struct ipath_devdata *dd) struct ipath_skbinfo *skbinfo; int ret; - egrcnt = dd->ipath_rcvegrcnt; + egrcnt = dd->ipath_p0_rcvegrcnt; skbinfo = vmalloc(sizeof(*dd->ipath_port0_skbinfo) * egrcnt); if (skbinfo == NULL) { @@ -244,8 +244,7 @@ static int init_chip_first(struct ipath_devdata *dd, * cfgports. We do still check and report a difference, if * not same (should be impossible). 
*/ - dd->ipath_portcnt = - ipath_read_kreg32(dd, dd->ipath_kregs->kr_portcnt); + dd->ipath_f_config_ports(dd, ipath_cfgports); if (!ipath_cfgports) dd->ipath_cfgports = dd->ipath_portcnt; else if (ipath_cfgports <= dd->ipath_portcnt) { diff --git a/drivers/infiniband/hw/ipath/ipath_kernel.h b/drivers/infiniband/hw/ipath/ipath_kernel.h index 19e0dc2..ded087c 100644 --- a/drivers/infiniband/hw/ipath/ipath_kernel.h +++ b/drivers/infiniband/hw/ipath/ipath_kernel.h @@ -253,6 +253,7 @@ struct ipath_devdata { int (*ipath_f_get_base_info)(struct ipath_portdata *, void *); /* free irq */ void (*ipath_f_free_irq)(struct ipath_devdata *); + void (*ipath_f_config_ports)(struct ipath_devdata *, ushort); void (*ipath_f_read_counters)(struct ipath_devdata *, struct infinipath_counters *); struct ipath_ibdev *verbs_dev; @@ -326,6 +327,8 @@ struct ipath_devdata { u32 ipath_cfgports; /* count of port 0 hdrqfull errors */ u32 ipath_p0_hdrqfull; + /* port 0 number of receive eager buffers */ + u32 ipath_p0_rcvegrcnt; /* * index of last piobuffer we used. Speeds up searching, by From arthur.jones at qlogic.com Tue Jan 8 13:18:25 2008 From: arthur.jones at qlogic.com (Arthur Jones) Date: Tue, 08 Jan 2008 13:18:25 -0800 Subject: [ofa-general] [PATCH 6/8] IB/ipath - add flag and handling for chips with swapped register bug In-Reply-To: <20080108211754.23996.14432.stgit@eng-46.internal.keyresearch.com> References: <20080108211754.23996.14432.stgit@eng-46.internal.keyresearch.com> Message-ID: <20080108211825.23996.47055.stgit@eng-46.internal.keyresearch.com> From: Ralph Campbell The 6110 had a bug that caused some registers to be swapped; it was fixed for the 7220 (and didn't affect the 6120 because it had fewer registers). This adds a flag and related code to handle that, and includes some minor cleanups in the same area. 
Signed-off-by: Ralph Campbell --- drivers/infiniband/hw/ipath/ipath_driver.c | 11 +++-------- drivers/infiniband/hw/ipath/ipath_iba6110.c | 2 ++ drivers/infiniband/hw/ipath/ipath_init_chip.c | 8 ++------ drivers/infiniband/hw/ipath/ipath_intr.c | 4 ++-- drivers/infiniband/hw/ipath/ipath_kernel.h | 1 + 5 files changed, 10 insertions(+), 16 deletions(-) diff --git a/drivers/infiniband/hw/ipath/ipath_driver.c b/drivers/infiniband/hw/ipath/ipath_driver.c index f657295..c1adf24 100644 --- a/drivers/infiniband/hw/ipath/ipath_driver.c +++ b/drivers/infiniband/hw/ipath/ipath_driver.c @@ -1337,14 +1337,9 @@ static void ipath_update_pio_bufs(struct ipath_devdata *dd) /* * Chip Errata: bug 6641; even and odd qwords>3 are swapped */ - if (i > 3) { - if (i & 1) - piov = le64_to_cpu( - dd->ipath_pioavailregs_dma[i - 1]); - else - piov = le64_to_cpu( - dd->ipath_pioavailregs_dma[i + 1]); - } else + if (i > 3 && (dd->ipath_flags & IPATH_SWAP_PIOBUFS)) + piov = le64_to_cpu(dd->ipath_pioavailregs_dma[i ^ 1]); + else piov = le64_to_cpu(dd->ipath_pioavailregs_dma[i]); pchg = _IPATH_ALL_CHECKBITS & ~(dd->ipath_pioavailshadow[i] ^ piov); diff --git a/drivers/infiniband/hw/ipath/ipath_iba6110.c b/drivers/infiniband/hw/ipath/ipath_iba6110.c index 0c900c5..3bfaf04 100644 --- a/drivers/infiniband/hw/ipath/ipath_iba6110.c +++ b/drivers/infiniband/hw/ipath/ipath_iba6110.c @@ -1043,6 +1043,8 @@ static int ipath_setup_ht_config(struct ipath_devdata *dd, } while ((pos = pci_find_next_capability(pdev, pos, PCI_CAP_ID_HT))); + dd->ipath_flags |= IPATH_SWAP_PIOBUFS; + bail: return ret; } diff --git a/drivers/infiniband/hw/ipath/ipath_init_chip.c b/drivers/infiniband/hw/ipath/ipath_init_chip.c index 3174c31..4471674 100644 --- a/drivers/infiniband/hw/ipath/ipath_init_chip.c +++ b/drivers/infiniband/hw/ipath/ipath_init_chip.c @@ -528,12 +528,8 @@ static void enable_chip(struct ipath_devdata *dd, /* * Chip Errata bug 6641; even and odd qwords>3 are swapped. 
*/ - if (i > 3) { - if (i & 1) - val = dd->ipath_pioavailregs_dma[i - 1]; - else - val = dd->ipath_pioavailregs_dma[i + 1]; - } + if (i > 3 && (dd->ipath_flags & IPATH_SWAP_PIOBUFS)) + val = dd->ipath_pioavailregs_dma[i ^ 1]; else val = dd->ipath_pioavailregs_dma[i]; dd->ipath_pioavailshadow[i] = le64_to_cpu(val); diff --git a/drivers/infiniband/hw/ipath/ipath_intr.c b/drivers/infiniband/hw/ipath/ipath_intr.c index e2ce531..ddc0a19 100644 --- a/drivers/infiniband/hw/ipath/ipath_intr.c +++ b/drivers/infiniband/hw/ipath/ipath_intr.c @@ -831,8 +831,8 @@ void ipath_clear_freeze(struct ipath_devdata *dd) */ for (i = 0; i < dd->ipath_pioavregs; i++) { /* deal with 6110 chip bug */ - im = i > 3 ? ((i&1) ? i-1 : i+1) : i; - val = ipath_read_kreg64(dd, (0x1000/sizeof(u64))+im); + im = i > 3 ? i ^ 1 : i; + val = ipath_read_kreg64(dd, (0x1000 / sizeof(u64)) + im); dd->ipath_pioavailregs_dma[i] = dd->ipath_pioavailshadow[i] = le64_to_cpu(val); } diff --git a/drivers/infiniband/hw/ipath/ipath_kernel.h b/drivers/infiniband/hw/ipath/ipath_kernel.h index ded087c..e55e478 100644 --- a/drivers/infiniband/hw/ipath/ipath_kernel.h +++ b/drivers/infiniband/hw/ipath/ipath_kernel.h @@ -756,6 +756,7 @@ int ipath_set_rx_pol_inv(struct ipath_devdata *dd, u8 new_pol_inv); #define IPATH_DISABLED 0x80000 /* administratively disabled */ /* Use GPIO interrupts for new counters */ #define IPATH_GPIO_ERRINTRS 0x100000 +#define IPATH_SWAP_PIOBUFS 0x200000 /* Bits in GPIO for the added interrupts */ #define IPATH_GPIO_PORT0_BIT 2 From arthur.jones at qlogic.com Tue Jan 8 13:18:30 2008 From: arthur.jones at qlogic.com (Arthur Jones) Date: Tue, 08 Jan 2008 13:18:30 -0800 Subject: [ofa-general] [PATCH 7/8] IB/ipath - inline ipath_read_ireg In-Reply-To: <20080108211754.23996.14432.stgit@eng-46.internal.keyresearch.com> References: <20080108211754.23996.14432.stgit@eng-46.internal.keyresearch.com> Message-ID: <20080108211830.23996.80225.stgit@eng-46.internal.keyresearch.com> Different chips have different 
width interrupt status registers, so add a flag and accessor function to decide which width register read to use. Signed-off-by: Arthur Jones --- drivers/infiniband/hw/ipath/ipath_intr.c | 4 ++-- drivers/infiniband/hw/ipath/ipath_kernel.h | 8 ++++++++ 2 files changed, 10 insertions(+), 2 deletions(-) diff --git a/drivers/infiniband/hw/ipath/ipath_intr.c b/drivers/infiniband/hw/ipath/ipath_intr.c index ddc0a19..92e58c9 100644 --- a/drivers/infiniband/hw/ipath/ipath_intr.c +++ b/drivers/infiniband/hw/ipath/ipath_intr.c @@ -883,7 +883,7 @@ static noinline void ipath_bad_intr(struct ipath_devdata *dd, u32 *unexpectp) dd->ipath_f_free_irq(dd); } } - if (ipath_read_kreg32(dd, dd->ipath_kregs->kr_intmask)) { + if (ipath_read_ireg(dd, dd->ipath_kregs->kr_intmask)) { ipath_dev_err(dd, "%u unexpected interrupts, " "disabling interrupts completely\n", *unexpectp); @@ -1034,7 +1034,7 @@ irqreturn_t ipath_intr(int irq, void *data) goto bail; } - istat = ipath_read_kreg32(dd, dd->ipath_kregs->kr_intstatus); + istat = ipath_read_ireg(dd, dd->ipath_kregs->kr_intstatus); if (unlikely(!istat)) { ipath_stats.sps_nullintr++; diff --git a/drivers/infiniband/hw/ipath/ipath_kernel.h b/drivers/infiniband/hw/ipath/ipath_kernel.h index e55e478..b84039c 100644 --- a/drivers/infiniband/hw/ipath/ipath_kernel.h +++ b/drivers/infiniband/hw/ipath/ipath_kernel.h @@ -753,6 +753,8 @@ int ipath_set_rx_pol_inv(struct ipath_devdata *dd, u8 new_pol_inv); * are 64bit */ #define IPATH_32BITCOUNTERS 0x20000 /* can miss port0 rx interrupts */ + /* Interrupt register is 64 bits */ +#define IPATH_INTREG_64 0x40000 #define IPATH_DISABLED 0x80000 /* administratively disabled */ /* Use GPIO interrupts for new counters */ #define IPATH_GPIO_ERRINTRS 0x100000 @@ -958,6 +960,12 @@ static inline u32 ipath_get_rcvhdrtail(const struct ipath_portdata *pd) pd->port_rcvhdrtail_kvaddr)); } +static inline u64 ipath_read_ireg(const struct ipath_devdata *dd, ipath_kreg r) +{ + return (dd->ipath_flags & IPATH_INTREG_64) ? 
+ ipath_read_kreg64(dd, r) : ipath_read_kreg32(dd, r); +} + /* * sysfs interface. */ From arthur.jones at qlogic.com Tue Jan 8 13:18:35 2008 From: arthur.jones at qlogic.com (Arthur Jones) Date: Tue, 08 Jan 2008 13:18:35 -0800 Subject: [ofa-general] [PATCH 8/8] IB/ipath - drop support for the original QHT7040 board In-Reply-To: <20080108211754.23996.14432.stgit@eng-46.internal.keyresearch.com> References: <20080108211754.23996.14432.stgit@eng-46.internal.keyresearch.com> Message-ID: <20080108211835.23996.30565.stgit@eng-46.internal.keyresearch.com> From: Dave Olson The original QHT7040 had significant performance issues so there was an additional check for a newer serial number. Support for the small quantities of that board shipped has been dropped, so this patch removes the special checks to simplify the code. Signed-off-by: Dave Olson --- drivers/infiniband/hw/ipath/ipath_iba6110.c | 3 +-- 1 files changed, 1 insertions(+), 2 deletions(-) diff --git a/drivers/infiniband/hw/ipath/ipath_iba6110.c b/drivers/infiniband/hw/ipath/ipath_iba6110.c index 3bfaf04..6976d96 100644 --- a/drivers/infiniband/hw/ipath/ipath_iba6110.c +++ b/drivers/infiniband/hw/ipath/ipath_iba6110.c @@ -1595,8 +1595,7 @@ static int ipath_ht_early_init(struct ipath_devdata *dd) } ipath_get_eeprom_info(dd); - if (dd->ipath_boardrev == 5 && dd->ipath_serial[0] == '1' && - dd->ipath_serial[1] == '2' && dd->ipath_serial[2] == '8') { + if (dd->ipath_boardrev == 5) { /* * Later production QHT7040 has same changes as QHT7140, so * can use GPIO interrupts. They have serial #'s starting From rdreier at cisco.com Tue Jan 8 13:29:31 2008 From: rdreier at cisco.com (Roland Dreier) Date: Tue, 08 Jan 2008 13:29:31 -0800 Subject: [ofa-general] lock dependency in ib_user_mad In-Reply-To: <200801081733.m08HXX3x013059@cmf.nrl.navy.mil> (chas williams's message of "Tue, 08 Jan 2008 12:33:33 -0500") References: <200801081733.m08HXX3x013059@cmf.nrl.navy.mil> Message-ID: > * nested locking. 
NOTE: rwsems are not allowed to recurse > * (which occurs if the same task tries to acquire the same > * lock instance multiple times) Ugh, looks like the umad locking needs to be rethought. I see a few other rare error path bugs that need to be fixed too. Well, I'll try to do some rethinking. - R. From dillowda at ornl.gov Tue Jan 8 14:08:52 2008 From: dillowda at ornl.gov (David Dillow) Date: Tue, 08 Jan 2008 17:08:52 -0500 Subject: [ofa-general] [RFC 2.6.24-rc] IB/srp: retry stale connections Message-ID: <1199830132.1988.18.camel@lap75545.ornl.gov> When a host just goes away (crash, power loss, etc.) without tearing down its IB connections, it can get stale connection errors when it tries to reconnect the targets upon rebooting. Retrying the connection a few times will prevent sysadmins from playing the "which disk(s) went missing?" game. Signed-off-by: David Dillow --- This would have made things slightly quicker when tracking down some of the recent bugs, but it also helps quite a bit when you've got a large number of targets hanging off a wedged server. This applies on top of the logging patches for ib_srp. 
ib_srp.c | 53 +++++++++++++++++++++++++++++++++++++++++------------ ib_srp.h | 1 + 2 files changed, 42 insertions(+), 12 deletions(-) diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c index 154ebb0..167b39c 100644 --- a/drivers/infiniband/ulp/srp/ib_srp.c +++ b/drivers/infiniband/ulp/srp/ib_srp.c @@ -204,6 +204,22 @@ out: return ret; } +static int srp_new_cm_id(struct srp_target_port *target) +{ + struct ib_cm_id *new_cm_id; + + new_cm_id = ib_create_cm_id(target->srp_host->dev->dev, + srp_cm_handler, target); + if (IS_ERR(new_cm_id)) + return PTR_ERR(new_cm_id); + + if (target->cm_id) + ib_destroy_cm_id(target->cm_id); + target->cm_id = new_cm_id; + + return 0; +} + static int srp_create_target_ib(struct srp_target_port *target) { struct ib_qp_init_attr *init_attr; @@ -436,6 +452,7 @@ static void srp_remove_work(struct work_struct *work) static int srp_connect_target(struct srp_target_port *target) { + int retries = 3; int ret; ret = srp_lookup_path(target); @@ -468,6 +485,21 @@ static int srp_connect_target(struct srp_target_port *target) case SRP_DLID_REDIRECT: break; + case SRP_STALE_CONN: + /* Our current CM id was stale, and is now in timewait. + * Try to reconnect with a new one. 
+ */ + if (!retries-- || srp_new_cm_id(target)) { + shost_printk(KERN_ERR, target->scsi_host, PFX + "giving up on stale connection\n"); + target->status = -ECONNRESET; + return target->status; + } + + shost_printk(KERN_ERR, target->scsi_host, PFX + "retrying stale connection\n"); + break; + default: return target->status; } @@ -507,7 +539,6 @@ static void srp_reset_req(struct srp_target_port *target, struct srp_request *re static int srp_reconnect_target(struct srp_target_port *target) { - struct ib_cm_id *new_cm_id; struct ib_qp_attr qp_attr; struct srp_request *req, *tmp; struct ib_wc wc; @@ -526,14 +557,9 @@ static int srp_reconnect_target(struct srp_target_port *target) * Now get a new local CM ID so that we avoid confusing the * target in case things are really fouled up. */ - new_cm_id = ib_create_cm_id(target->srp_host->dev->dev, - srp_cm_handler, target); - if (IS_ERR(new_cm_id)) { - ret = PTR_ERR(new_cm_id); + ret = srp_new_cm_id(target); + if (ret) goto err; - } - ib_destroy_cm_id(target->cm_id); - target->cm_id = new_cm_id; qp_attr.qp_state = IB_QPS_RESET; ret = ib_modify_qp(target->qp, &qp_attr, IB_QP_STATE); @@ -1166,6 +1192,11 @@ static void srp_cm_rej_handler(struct ib_cm_id *cm_id, target->status = -ECONNRESET; break; + case IB_CM_REJ_STALE_CONN: + shost_printk(KERN_WARNING, shost, " REJ reason: stale connection\n"); + target->status = SRP_STALE_CONN; + break; + default: shost_printk(KERN_WARNING, shost, " REJ reason 0x%x\n", event->param.rej_rcvd.reason); @@ -1857,11 +1888,9 @@ static ssize_t srp_create_target(struct class_device *class_dev, if (ret) goto err; - target->cm_id = ib_create_cm_id(host->dev->dev, srp_cm_handler, target); - if (IS_ERR(target->cm_id)) { - ret = PTR_ERR(target->cm_id); + ret = srp_new_cm_id(target); + if (ret) goto err_free; - } target->qp_in_error = 0; ret = srp_connect_target(target); diff --git a/drivers/infiniband/ulp/srp/ib_srp.h b/drivers/infiniband/ulp/srp/ib_srp.h index e3573e7..00cfd70 100644 --- 
a/drivers/infiniband/ulp/srp/ib_srp.h +++ b/drivers/infiniband/ulp/srp/ib_srp.h @@ -54,6 +54,7 @@ enum { SRP_PORT_REDIRECT = 1, SRP_DLID_REDIRECT = 2, + SRP_STALE_CONN = 3, SRP_MAX_LUN = 512, SRP_DEF_SG_TABLESIZE = 12, From rdreier at cisco.com Tue Jan 8 14:09:55 2008 From: rdreier at cisco.com (Roland Dreier) Date: Tue, 08 Jan 2008 14:09:55 -0800 Subject: [ofa-general] Re: [PATCH] IPOIB/CM Enable SRQ support on HCAs with less than 16 SG entries In-Reply-To: <476C2B47.5060507@linux.vnet.ibm.com> (Pradeep Satyanarayana's message of "Fri, 21 Dec 2007 13:08:23 -0800") References: <476C2B47.5060507@linux.vnet.ibm.com> Message-ID: thanks, I applied this patch (with a few stylistic changes). From rdreier at cisco.com Tue Jan 8 14:15:56 2008 From: rdreier at cisco.com (Roland Dreier) Date: Tue, 08 Jan 2008 14:15:56 -0800 Subject: [ofa-general] Re: [PATCH v2] IB/srp: add identifying information to log messages In-Reply-To: <1199748221.22987.6.camel@lap75545.ornl.gov> (David Dillow's message of "Mon, 07 Jan 2008 18:23:41 -0500") References: <1198269544.9979.26.camel@lap75545.ornl.gov> <20071222145612.GA10085@osc.edu> <1199748221.22987.6.camel@lap75545.ornl.gov> Message-ID: thanks, applied for 2.6.25 From sashak at voltaire.com Tue Jan 8 14:49:19 2008 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Tue, 8 Jan 2008 22:49:19 +0000 Subject: [ofa-general] [PATCH] opensm/vendor: filter out non-IB devices in osm_vendor_get_all_port_attr() In-Reply-To: <4783D9BF.6010806@ichips.intel.com> References: <20080108202345.GZ26304@sashak.voltaire.com> <4783D9BF.6010806@ichips.intel.com> Message-ID: <20080108224919.GA26304@sashak.voltaire.com> On 12:14 Tue 08 Jan , Sean Hefty wrote: > > + if (ca.node_type < 1 || ca.node_type > 3) > > + continue; > > Are there enums for this? No, not in libibumad. 
Sasha From rdreier at cisco.com Tue Jan 8 14:38:52 2008 From: rdreier at cisco.com (Roland Dreier) Date: Tue, 08 Jan 2008 14:38:52 -0800 Subject: [ofa-general] [PATCH] IB/ipath - first prep series for iba7220 In-Reply-To: <20080108211754.23996.14432.stgit@eng-46.internal.keyresearch.com> (Arthur Jones's message of "Tue, 08 Jan 2008 13:17:54 -0800") References: <20080108211754.23996.14432.stgit@eng-46.internal.keyresearch.com> Message-ID: thanks, applied all to my for-2.6.25 branch. From rdreier at cisco.com Tue Jan 8 15:11:38 2008 From: rdreier at cisco.com (Roland Dreier) Date: Tue, 08 Jan 2008 15:11:38 -0800 Subject: [ofa-general] Re: [PATCH] [2.6.25] rdma/cm: override default responder_resources with user value In-Reply-To: <000101c84f02$37ba1510$a937170a@amr.corp.intel.com> (Sean Hefty's message of "Fri, 4 Jan 2008 10:47:12 -0800") References: <000101c84f02$37ba1510$a937170a@amr.corp.intel.com> Message-ID: thanks, applied for 2.6.25 From valdes at anl.gov Tue Jan 8 15:44:17 2008 From: valdes at anl.gov (John Valdes) Date: Tue, 8 Jan 2008 17:44:17 -0600 Subject: [ofa-general] What causes "SRP abort called" error? Message-ID: <20080108234416.GA31226@starfish.mcs.anl.gov> Hello, I'm new to SRP & IB, so please bear with me... We have a storage server running RHEL 5.1 w/ the bundled OFED 1.2 stack directly attached to an IB port on a DDN 9550. It's been running OK for about a week, but today we're getting a continuous stream of SRP abort errors: # tail /var/log/messages [...] Jan 8 17:00:59 server kernel: SRP abort called Jan 8 17:01:59 server kernel: SRP abort called Jan 8 17:02:04 server kernel: SRP reset_device called Jan 8 17:02:09 server kernel: ib_srp: SRP reset_host called Jan 8 17:02:11 server kernel: ib_srp: connection closed How can I determine the cause of the aborts? 
The physical connection between the server and the DDN seems to be OK (the error counts in /sys/class/infiniband/mthca0/ports/1/counters/* are all zero), and the SM (opensm) is still running. Are the aborts being triggered by the server or by the storage target (the DDN)? I'm guessing something is timing out, but what, and why? To complicate matters, the LUNs on the DDN are shared with 7 other servers as clustered LVM volumes with GFS filesystems. Each of the other servers has its own, direct IB connection to the DDN. Any suggestions on how to track down the cause of the aborts would be welcome. Thanks, John ---------------------------------------------------------------------- John Valdes Mathematics and Computer Science Division valdes at anl.gov Argonne National Laboratory 
From sean.hefty at intel.com Tue Jan 8 16:26:59 2008 From: sean.hefty at intel.com (Sean Hefty) Date: Tue, 8 Jan 2008 16:26:59 -0800 Subject: [ofa-general] Re: CMA can't establish connection with QoS on In-Reply-To: <47837B99.2050508@dev.mellanox.co.il> References: <47600070.8050008@dev.mellanox.co.il> <000001c83ced$2149f5b0$ff0da8c0@amr.corp.intel.com> <47605620.3070105@dev.mellanox.co.il> <47608BE4.7020209@ichips.intel.com><4761315A.1070306@dev.mellanox.co.il><477CF8EF.5010307@dev.mellanox.co.il><477D184D.8020300@ichips.intel.com> <47837B99.2050508@dev.mellanox.co.il> Message-ID: <000201c85256$590935a0$a937170a@amr.corp.intel.com> >I updated the bug with the step-by-step instructions how to burn >the FW and reproduce the error. >I compiled this "how-to" today, so everything there is up to date. Thanks - I don't think that I was programming my FW correctly. I still have problems running opensm with qos enabled on one of my systems, but I can get it to work running on the other system. Anyway, I was able to reproduce the problem, and I believe I understand part of the problem. The send for the CM REQ MAD never completes. A completion never shows up on the GSI's CQ with a wr_id that matches the send wr_id. (I don't see a completion at all.) This results in a reference being held on the ib_cm id that is never released, which causes the hang. (Destruction of the ib_cm id hangs, which blocks the destruction of the rdma_cm_id, which blocks the close from userspace.) If the ib_cm is modified to use SL 0 for the CM MADs, but the connection still uses SL 1, then ucmatose is able to connect and transfer data between the client and server. 
- Sean From sean.hefty at intel.com Tue Jan 8 17:16:27 2008 From: sean.hefty at intel.com (Sean Hefty) Date: Tue, 8 Jan 2008 17:16:27 -0800 Subject: [ofa-general] Re: CMA can't establish connection with QoS on In-Reply-To: <000201c85256$590935a0$a937170a@amr.corp.intel.com> References: <47600070.8050008@dev.mellanox.co.il> <000001c83ced$2149f5b0$ff0da8c0@amr.corp.intel.com> <47605620.3070105@dev.mellanox.co.il> <47608BE4.7020209@ichips.intel.com><4761315A.1070306@dev.mellanox.co.il><477CF8EF.5010307@dev.mellanox.co.il><477D184D.8020300@ichips.intel.com><47837B99.2050508@dev.mellanox.co.il> <000201c85256$590935a0$a937170a@amr.corp.intel.com> Message-ID: <000301c8525d$422a8c60$a937170a@amr.corp.intel.com> >If the ib_cm is modified to use SL 0 for the CM MADs, but the connection still >uses SL 1, then ucmatose is able to connect and transfer data between the >client and server. This also worked for udaddy (UD communication), so it appears that the problem sending with SL 1 is limited to the special QPs. 
From kliteyn at mellanox.co.il Tue Jan 8 17:35:20 2008 From: kliteyn at mellanox.co.il (kliteyn at mellanox.co.il) Date: 9 Jan 2008 03:35:20 +0200 Subject: [ofa-general] nightly osm_sim report 2008-01-09:normal completion Message-ID: OSM Simulation Regression Summary [Generated mail - please do NOT reply] OpenSM binary date = 2008-01-08 OpenSM git rev = Mon_Jan_7_17:13:56_2008 [0b8c48e808518a080ee195436e6c7ff15f038ee7] ibutils git rev = Mon_Dec_24_10:42:01_2007 [675bec82306d6920555dd0b5e2f664983e27e60f] Total=400 Pass=398 Fail=2 Pass: 30 Stability IS1-16.topo 30 Pkey IS1-16.topo 30 OsmTest IS1-16.topo 30 OsmStress IS1-16.topo 30 Multicast IS1-16.topo 30 LidMgr IS1-16.topo 10 Stability IS3-loop.topo 10 Stability IS3-128.topo 10 Pkey IS3-128.topo 10 OsmTest IS3-loop.topo 10 OsmTest IS3-128.topo 10 OsmStress IS3-128.topo 10 Multicast IS3-loop.topo 10 Multicast IS3-128.topo 10 FatTree merge-roots-4-ary-2-tree.topo 10 FatTree merge-root-4-ary-3-tree.topo 10 FatTree gnu-stallion-64.topo 10 FatTree blend-4-ary-2-tree.topo 10 FatTree RhinoDDR.topo 10 FatTree FullGnu.topo 10 FatTree 4-ary-2-tree.topo 10 FatTree 2-ary-4-tree.topo 10 FatTree 12-node-spaced.topo 10 FTreeFail 4-ary-2-tree-missing-sw-link.topo 10 FTreeFail 4-ary-2-tree-links-at-same-rank-2.topo 10 FTreeFail 4-ary-2-tree-links-at-same-rank-1.topo 10 FTreeFail 4-ary-2-tree-diff-num-pgroups.topo 8 LidMgr IS3-128.topo Failures: 2 LidMgr IS3-128.topo From sashak at voltaire.com Tue Jan 8 19:24:02 2008 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Wed, 9 Jan 2008 03:24:02 +0000 Subject: [ofa-general] [PATCH] infiniband-diags/saquery: attribute names support Message-ID: <20080109032401.GA20963@sashak.voltaire.com> This lets the requested SA attribute be passed by name on the command line. Examples: saquery NodeRecord saquery NR The main motivation for this addition is that I cannot find appropriate free option characters for adding new attributes (specifically PKeyTableRecord and SL2VLTableRecord - p, P, s, S are used already). 
This preserves a command line options currently used for same purposes. Signed-off-by: Sasha Khapyorsky --- infiniband-diags/src/saquery.c | 54 ++++++++++++++++++++++++++++++++++++++- 1 files changed, 52 insertions(+), 2 deletions(-) diff --git a/infiniband-diags/src/saquery.c b/infiniband-diags/src/saquery.c index 2017a86..9863860 100644 --- a/infiniband-diags/src/saquery.c +++ b/infiniband-diags/src/saquery.c @@ -91,7 +91,6 @@ ib_net16_t requested_lid = 0; int requested_lid_flag = 0; ib_net64_t requested_guid = 0; int requested_guid_flag = 0; -ib_net16_t query_type = IB_MAD_ATTR_NODE_RECORD; /** * Call back for the various record requests. @@ -1144,12 +1143,46 @@ clean_up(void) osm_vendor_delete(&vendor); } +struct query_cmd { + const char *name, *alias; + ib_net16_t query_type; + int (*handler)(const char *name, osm_bind_handle_t bind_handle, + char *from, char *to); +}; + +static const struct query_cmd query_cmds[] = { + { "ClassPortInfo", "CPI", IB_MAD_ATTR_CLASS_PORT_INFO, }, + { "NodeRecord", "NR", IB_MAD_ATTR_NODE_RECORD, }, + { "PortInfoRecord", "PIR", IB_MAD_ATTR_PORTINFO_RECORD, }, + { "InformInfoRecord", "IIR", IB_MAD_ATTR_INFORM_INFO_RECORD, }, + { "LinkRecord", "LR", IB_MAD_ATTR_LINK_RECORD, }, + { "ServiceRecord", "SR", IB_MAD_ATTR_SERVICE_RECORD, }, + { "PathRecord", "PR", IB_MAD_ATTR_PATH_RECORD, }, + { "MCMemberRecord", "MCMR", IB_MAD_ATTR_MCMEMBER_RECORD, }, + { 0 } +}; + +static const struct query_cmd *find_query(const char *name) +{ + const struct query_cmd *q; + unsigned len = strlen(name); + + for (q = query_cmds; q->name; q++) + if (!strncasecmp(name, q->name, len) || + (q->alias && !strncasecmp(name, q->alias, len))) + return q; + + return NULL; +} + static void usage(void) { + const struct query_cmd *q; + fprintf(stderr, "Usage: %s [-h -d -p -N] [--list | -D] [-S -I -L -l -G" " -O -U -c -s -g -m --src-to-dst --sgid-to-dgid " - "-C -P -t(imeout) ] [ | | ]\n", + "-C -P -t(imeout) ] [query-name] [ | | ]\n", argv0); fprintf(stderr, " Queries 
node records by default\n"); fprintf(stderr, " -d enable debugging\n"); @@ -1184,6 +1217,12 @@ usage(void) "response timeout (default %u msec)\n", DEFAULT_SA_TIMEOUT_MS); fprintf(stderr, " --node-name-map specify a node name map\n"); + fprintf(stderr, "\n Supported query names (and aliases):\n"); + for (q = query_cmds; q->name; q++) + fprintf(stderr, " %s (%s)\n", q->name, + q->alias ? q->alias : ""); + fprintf(stderr, "\n"); + exit(-1); } @@ -1193,10 +1232,12 @@ main(int argc, char **argv) int ch = 0; int members = 0; osm_bind_handle_t bind_handle; + const struct query_cmd *q; char *src = NULL; char *dst = NULL; char *sgid = NULL; char *dgid = NULL; + ib_net16_t query_type = 0; ib_net16_t src_lid; ib_net16_t dst_lid; ib_api_status_t status; @@ -1342,6 +1383,15 @@ main(int argc, char **argv) argc -= optind; argv += optind; + if (!query_type) { + if (!argc || !(q = find_query(argv[0]))) + query_type = IB_MAD_ATTR_NODE_RECORD; + else { + query_type = q->query_type; + argc--; + } + } + if (argc) { if (node_print_desc == NAME_OF_LID) { requested_lid = (ib_net16_t)strtoul(argv[0], NULL, 0); -- 1.5.4.rc2.38.gd6da3 From dillowda at ornl.gov Tue Jan 8 19:48:48 2008 From: dillowda at ornl.gov (David Dillow) Date: Tue, 08 Jan 2008 22:48:48 -0500 Subject: [ofa-general] What causes "SRP abort called" error? In-Reply-To: <20080108234416.GA31226@starfish.mcs.anl.gov> References: <20080108234416.GA31226@starfish.mcs.anl.gov> Message-ID: <1199850528.3115.31.camel@obelisk.thedillows.org> On Tue, 2008-01-08 at 17:44 -0600, John Valdes wrote: > Hello, > > I'm new to SRP & IB, so please bear with me... > > We have a storage server running RHEL 5.1 w/ the bundled OFED 1.2 > stack directly attached to an IB port on a DDN 9550. It's been running > OK for about a week, but today we're getting a continuous stream of > SRP abort errors: > > # tail /var/log/messages > [...] 
> Jan 8 17:00:59 server kernel: SRP abort called > Jan 8 17:01:59 server kernel: SRP abort called srp_abort(), aka scsi_host->eh_abort_handler() This tries to abort a single command. > Jan 8 17:02:04 server kernel: SRP reset_device called srp_reset_device(), aka scsi_host->eh_device_reset_handler() This tries to reset a LUN. > Jan 8 17:02:09 server kernel: ib_srp: SRP reset_host called srp_reset_host(), aka scsi_host->eh_host_reset_handler() This tries to reset the connection to the DDN. > Jan 8 17:02:11 server kernel: ib_srp: connection closed Caused by srp_reset_host() > How can I determine the cause of the aborts? The aborts are caused by a command timing out in the SCSI mid-layer and its error handling taking over -- more details about the escalation are in Documentation/scsi_eh.txt and assorted files. You can turn up the SCSI logging facilities to track down the command that is dying, but expect that to be _very_ noisy on a busy system. I've often seen this during the initial bus scan when adding a target to SRP, and I've seen it happen under heavy load once -- maybe more, but I saw it today for sure. I haven't dug into it yet, as I've been tracking other things. I am curious, though, what command could be getting stuck for long enough for the mid-layer to time it out -- I think the default timeout for the sd driver is 60 seconds, and the INQUIRY timeout is 5 seconds. I just cannot account for what could be taking that long. Also, given how it is quickly progressing through the various error handlers in SRP, I wonder if we're failing something in there. Do your targets come back after this? During the scans, mine do, but today's under load effectively left the target dead. Rebooting the server brought it back. Dave From valdes at anl.gov Tue Jan 8 20:45:03 2008 From: valdes at anl.gov (John Valdes) Date: Tue, 8 Jan 2008 22:45:03 -0600 Subject: [ofa-general] What causes "SRP abort called" error? 
In-Reply-To: <1199850528.3115.31.camel@obelisk.thedillows.org> References: <20080108234416.GA31226@starfish.mcs.anl.gov> <1199850528.3115.31.camel@obelisk.thedillows.org> Message-ID: <20080109044503.GB31226@starfish.mcs.anl.gov> On Tue, Jan 08, 2008 at 10:48:48PM -0500, David Dillow wrote: > > The aborts are caused by a command timing out in the SCSI mid-layer and > its error handling taking over -- more details about the escalation are > in Documentation/scsi_eh.txt and assorted files. Thanks for decoding those errors/warnings. > You can turn up the SCSI logging facilities to track down the command > that is dying, but expect that to be _very_ noisy on a busy system. From coincident errors on the DDN, it looks like these are SCSI Write commands (2A) that are failing. > I've often seen this during the initial bus scan when adding a target to > SRP, and I've seen it happen under heavy load once -- maybe more, but I > saw it today for sure. In our case, I'm pretty sure it is heavy load. Well, I didn't see what was going on at the time this started, but the targets (LUNs) were already mounted, and we've been seeing heavy load on the DDN recently. > I am curious, though, what command could be getting stuck > for long enough for the mid-layer to time it out -- I think the default > timeout for the sd driver is 60 seconds, and the INQUIRY timeout is 5 > seconds. I just cannot account for what could be taking that long. I'm curious too as to why WRITEs are taking so long. :) I think we're overloading the DDN, but it could be something else going on. This is a freshly installed configuration (only about a week old), with 6 GFS file servers reading and writing to ~6 shared LUNs on the DDN over IB/SRP (which in turn are shared off the servers via NFS to a ~350 node HPC cluster). We've been running an identical setup for a few years with another DDN, but over FC. I think we still have a few things to tune/optimize for IB. 
That said, after talking with DDN support, it's looking like something got wedged on the DDN which was causing the timeouts. > Do your targets come back after this? During the scans, mine do, but > today's under load effectively left the target dead. Rebooting the > server brought it back. Yes, after unwedging the DDN, the targets were fully accessible on the server again. Thanks for the reply. John From dillowda at ornl.gov Tue Jan 8 20:55:13 2008 From: dillowda at ornl.gov (David Dillow) Date: Tue, 08 Jan 2008 23:55:13 -0500 Subject: [ofa-general] What causes "SRP abort called" error? In-Reply-To: <20080109044503.GB31226@starfish.mcs.anl.gov> References: <20080108234416.GA31226@starfish.mcs.anl.gov> <1199850528.3115.31.camel@obelisk.thedillows.org> <20080109044503.GB31226@starfish.mcs.anl.gov> Message-ID: <1199854513.3115.36.camel@obelisk.thedillows.org> On Tue, 2008-01-08 at 22:45 -0600, John Valdes wrote: > In our case, I'm pretty sure it is heavy load. Well, I didn't see > what was going on at the time this started, but the targets (LUNs) > were already mounted, and we've been seeing heavy load on the DDN > recently. Search the list archives for some patches I sent to limit the queue length under IB/srp and also to respect the target's credit limit. 
If you don't find them, I can forward them to you when I'm back in the office. As a quick workaround, on each host, you can add the "max_cmd_per_lun=..." parameter to your add-target string to avoid overrunning the array. Set it to floor(30 / <# of LUNs active on this IB port>). If you still have problems, try lowering it in steps of 1. This is a workaround for the initiator overrunning the credit limit, which is known to cause problems on some arrays. Perhaps it will help you. Dave From ogerlitz at voltaire.com Tue Jan 8 23:57:20 2008 From: ogerlitz at voltaire.com (Or Gerlitz) Date: Wed, 09 Jan 2008 09:57:20 +0200 Subject: [ofa-general] Re: librdmacm/man: fix-up man pages In-Reply-To: <000001c8521d$a05df320$a937170a@amr.corp.intel.com> References: <000101c81a64$3582de80$9c98070a@amr.corp.intel.com> <4726EEAC.3070105@voltaire.com> <472755C4.10600@ichips.intel.com> <47285F53.4060402@voltaire.com> <4728BF4A.1060301@ichips.intel.com> <15ddcffd0710311320v6b91b3cm3be0f7882e30ad2b@mail.gmail.com> <000001c81cb5$4ce12160$9c98070a@amr.corp.intel.com> <15ddcffd0711270435t12a18dc3waac2596b3884ac72@mail.gmail.com> <000001c8311a$176cdbe0$63248686@amr.corp.intel.com> <15ddcffd0711280307u7a89c6c2q2854b071f74d9123@mail.gmail.com> <000801c832b6$81feb850$f5d8180a@amr.corp.intel.com> <475FD984.6080203@voltaire.com> <47837BFA.7040402@voltaire.com> <000001c8521d$a05df320$a937170a@amr.corp.intel.com> Message-ID: <47847E60.80208@voltaire.com> Sean Hefty wrote: OK, thanks for adding all those documentation pieces and sorry for asking RTFM
questions... so the only missing piece from my comment was mentioning that rdma_disconnect applies only to the connected service, do you think it's obvious? Or. From keshetti85-student at yahoo.co.in Wed Jan 9 00:54:02 2008 From: keshetti85-student at yahoo.co.in (Keshetti Mahesh) Date: Wed, 9 Jan 2008 14:24:02 +0530 Subject: [ofa-general] ib_macro_model on OMNET++ Message-ID: <829ded920801090054v5ac9b970le468c29001d2a889@mail.gmail.com> Recently, while browsing the internet, I came across the "ib_macro_model" package on the OMNET++ web site (http://www.omnetpp.org), which was contributed by Mellanox Technologies Ltd. (http://www.omnetpp.org/filemgmt/singlefile.php?lid=133). Can anyone on this list tell me what exactly it is and how to use it? Thanks and regards, Mahesh From vlad at lists.openfabrics.org Wed Jan 9 03:10:48 2008 From: vlad at lists.openfabrics.org (Vladimir Sokolovsky Mellanox) Date: Wed, 9 Jan 2008 03:10:48 -0800 (PST) Subject: [ofa-general] ofa_1_3_kernel 20080109-0200 daily build status Message-ID: <20080109111048.85B46E6004D@openfabrics.org> This email was generated automatically, please do not reply git_url: git://git.openfabrics.org/ofed_1_3/linux-2.6.git git_branch: ofed_kernel Common build parameters: --with-ipoib-mod --with-sdp-mod --with-srp-mod --with-user_mad-mod --with-user_access-mod --with-mthca-mod --with-mlx4-mod --with-core-mod --with-addr_trans-mod --with-rds-mod --with-cxgb3-mod --with-nes-mod Passed: Passed on i686 with 2.6.15-23-server Passed on i686 with linux-2.6.22 Passed on i686 with linux-2.6.21.1 Passed on i686 with linux-2.6.18 Passed on i686 with linux-2.6.17 Passed on i686 with linux-2.6.16 Passed on i686 with linux-2.6.19 Passed on i686 with linux-2.6.15 Passed on i686 with linux-2.6.14 Passed on i686 with linux-2.6.13 Passed on i686 with linux-2.6.12 Passed on x86_64 with linux-2.6.22.5-31-default Passed on x86_64 with linux-2.6.16 Passed on ia64 with linux-2.6.22 Passed on x86_64 with linux-2.6.18-1.2798.fc6 Passed on x86_64 with linux-2.6.18 Passed on x86_64 with linux-2.6.22 Passed on x86_64 with linux-2.6.21.1 Passed on x86_64 with linux-2.6.18-53.el5 Passed on x86_64 with linux-2.6.19 Passed on powerpc with linux-2.6.12 Passed on ia64 with linux-2.6.23 Passed on x86_64 with linux-2.6.15 Passed on x86_64 with linux-2.6.17 Passed on powerpc with linux-2.6.13 Passed on powerpc with linux-2.6.14 Passed on ia64 with linux-2.6.14 Passed on ppc64 with linux-2.6.15 Passed on ia64 with linux-2.6.18 Passed on ia64 with linux-2.6.15 Passed on ia64 with linux-2.6.12 Passed on x86_64 with linux-2.6.13 Passed on x86_64 with linux-2.6.20 Passed on
ia64 with linux-2.6.19 Passed on powerpc with linux-2.6.15 Passed on ia64 with linux-2.6.13 Passed on x86_64 with linux-2.6.9-42.ELsmp Passed on x86_64 with linux-2.6.12 Passed on x86_64 with linux-2.6.14 Passed on ppc64 with linux-2.6.16 Passed on ia64 with linux-2.6.17 Passed on ppc64 with linux-2.6.12 Passed on ia64 with linux-2.6.16 Passed on ppc64 with linux-2.6.17 Passed on ppc64 with linux-2.6.14 Passed on ia64 with linux-2.6.21.1 Passed on ppc64 with linux-2.6.13 Passed on x86_64 with linux-2.6.9-55.ELsmp Passed on ppc64 with linux-2.6.19 Passed on ppc64 with linux-2.6.18 Passed on x86_64 with linux-2.6.16.21-0.8-smp Passed on x86_64 with linux-2.6.18-8.el5 Passed on x86_64 with linux-2.6.16.43-0.3-smp Passed on ia64 with linux-2.6.16.21-0.8-default Passed on ppc64 with linux-2.6.18-8.el5 Failed: From kliteyn at dev.mellanox.co.il Wed Jan 9 03:44:34 2008 From: kliteyn at dev.mellanox.co.il (Yevgeny Kliteynik) Date: Wed, 09 Jan 2008 13:44:34 +0200 Subject: [ofa-general] Re: [PATCH RFC] opensm: mcast mgr improvements In-Reply-To: <20071229182718.GA19160@sashak.voltaire.com> References: <4770CDCE.8040200@dev.mellanox.co.il> <20071229182718.GA19160@sashak.voltaire.com> Message-ID: <4784B3A2.5080007@dev.mellanox.co.il> Hi Sasha, Please see below. Sasha Khapyorsky wrote: > This improves handling of mcast join/leave requests storming. Now mcast > routing will be recalculated for all mcast groups where changes occurred > and not one by one. For this it queues mcast groups instead of mcast > rerouting requests, this also makes state_mgr idle queue obsolete. > > Signed-off-by: Sasha Khapyorsky > --- > > Hi Yevgeny, > > For me it looks that it should solve the original problem (mcast group > list is purged in osm_mcast_mgr_process()). Could you review and ideally > test it? Thanks. 
> > Sasha > > diff --git a/opensm/opensm/osm_mcast_mgr.c b/opensm/opensm/osm_mcast_mgr.c > index 50b95fd..f51a45a 100644 > --- a/opensm/opensm/osm_mcast_mgr.c > +++ b/opensm/opensm/osm_mcast_mgr.c > @@ -1254,63 +1254,55 @@ osm_mcast_mgr_process_tree(IN osm_mcast_mgr_t * const p_mgr, > port_guid); > } > > - Exit: > +Exit: > OSM_LOG_EXIT(p_mgr->p_log); > return (status); > } > > /********************************************************************** > Process the entire group. > - > NOTE : The lock should be held externally! > **********************************************************************/ > -static osm_signal_t > -osm_mcast_mgr_process_mgrp(IN osm_mcast_mgr_t * const p_mgr, > - IN osm_mgrp_t * const p_mgrp, > - IN osm_mcast_req_type_t req_type, > - IN ib_net64_t port_guid) > +static ib_api_status_t > +mcast_mgr_process_mgrp(IN osm_mcast_mgr_t * const p_mgr, > + IN osm_mgrp_t * const p_mgrp, > + IN osm_mcast_req_type_t req_type, > + IN ib_net64_t port_guid) > { > - osm_signal_t signal = OSM_SIGNAL_DONE; > ib_api_status_t status; > - osm_switch_t *p_sw; > - cl_qmap_t *p_sw_tbl; > - boolean_t pending_transactions = FALSE; > > OSM_LOG_ENTER(p_mgr->p_log, osm_mcast_mgr_process_mgrp); > > - p_sw_tbl = &p_mgr->p_subn->sw_guid_tbl; > - > status = osm_mcast_mgr_process_tree(p_mgr, p_mgrp, req_type, port_guid); > if (status != IB_SUCCESS) { > osm_log(p_mgr->p_log, OSM_LOG_ERROR, > - "osm_mcast_mgr_process_mgrp: ERR 0A19: " > + "mcast_mgr_process_mgrp: ERR 0A19: " > "Unable to create spanning tree (%s)\n", > ib_get_err_str(status)); > - > goto Exit; > } > + p_mgrp->last_tree_id = p_mgrp->last_change_id; > > - /* > - Walk the switches and download the tables for each. 
> + /* Remove MGRP only if osm_mcm_port_t count is 0 and > + * Not a well known group > */ > - p_sw = (osm_switch_t *) cl_qmap_head(p_sw_tbl); > - while (p_sw != (osm_switch_t *) cl_qmap_end(p_sw_tbl)) { > - signal = __osm_mcast_mgr_set_tbl(p_mgr, p_sw); > - if (signal == OSM_SIGNAL_DONE_PENDING) > - pending_transactions = TRUE; > - > - p_sw = (osm_switch_t *) cl_qmap_next(&p_sw->map_item); > + if (cl_qmap_count(&p_mgrp->mcm_port_tbl) == 0 && !p_mgrp->well_known) { > + osm_log(p_mgr->p_log, OSM_LOG_DEBUG, > + "mcast_mgr_process_mgrp: " > + "Destroying mgrp with lid:0x%X\n", > + cl_ntoh16(p_mgrp->mlid)); > + /* Send a Report to any InformInfo registered for > + Trap 67 : MCGroup delete */ > + osm_mgrp_send_delete_notice(p_mgr->p_subn, p_mgr->p_log, > + p_mgrp); > + cl_qmap_remove_item(&p_mgr->p_subn->mgrp_mlid_tbl, > + (cl_map_item_t *) p_mgrp); > + osm_mgrp_delete(p_mgrp); If the group is empty, p_mgrp is deleted > @@ -1321,14 +1313,13 @@ osm_signal_t osm_mcast_mgr_process(IN osm_mcast_mgr_t * const p_mgr) > osm_switch_t *p_sw; > cl_qmap_t *p_sw_tbl; > cl_qmap_t *p_mcast_tbl; > + cl_qlist_t *p_list = &p_mgr->p_subn->p_osm->sm.mgrp_list; > osm_mgrp_t *p_mgrp; > - ib_api_status_t status; > boolean_t pending_transactions = FALSE; > > OSM_LOG_ENTER(p_mgr->p_log, osm_mcast_mgr_process); > > p_sw_tbl = &p_mgr->p_subn->sw_guid_tbl; > - > p_mcast_tbl = &p_mgr->p_subn->mgrp_mlid_tbl; > /* > While holding the lock, iterate over all the established > @@ -1343,16 +1334,8 @@ osm_signal_t osm_mcast_mgr_process(IN osm_mcast_mgr_t * const p_mgr) > /* We reached here due to some change that caused a heavy sweep > of the subnet. Not due to a specific multicast request. > So the request type is subnet_change and the port guid is 0. 
*/ > - status = osm_mcast_mgr_process_tree(p_mgr, p_mgrp, > - OSM_MCAST_REQ_TYPE_SUBNET_CHANGE, > - 0); > - if (status != IB_SUCCESS) { > - osm_log(p_mgr->p_log, OSM_LOG_ERROR, > - "osm_mcast_mgr_process: ERR 0A20: " > - "Unable to create spanning tree (%s)\n", > - ib_get_err_str(status)); > - } > - > + mcast_mgr_process_mgrp(p_mgr, p_mgrp, > + OSM_MCAST_REQ_TYPE_SUBNET_CHANGE, 0); > p_mgrp = (osm_mgrp_t *) cl_qmap_next(&p_mgrp->map_item); And here there's a call to 'next' on a p_mgrp that was freed, which eventually causes osm to crash on some segfault or assert at some point. -- Yevgeny From kliteyn at dev.mellanox.co.il Wed Jan 9 05:18:19 2008 From: kliteyn at dev.mellanox.co.il (Yevgeny Kliteynik) Date: Wed, 09 Jan 2008 15:18:19 +0200 Subject: [ofa-general] [PATCH] opensm/osm_mcast_mgr.c: fixing a seg. fault in processing mcast groups Message-ID: <4784C99B.5080606@dev.mellanox.co.il> Sasha, This patch fixes a seg. fault in processing mcast groups that I mentioned in my mail previously. Feel free to replace it with more elegant solution. Please apply it to master and ofed_1_3. -- Yevgeny Signed-off-by: Yevgeny Kliteynik --- opensm/opensm/osm_mcast_mgr.c | 4 +++- 1 files changed, 3 insertions(+), 1 deletions(-) diff --git a/opensm/opensm/osm_mcast_mgr.c b/opensm/opensm/osm_mcast_mgr.c index be220c5..ab9c260 100644 --- a/opensm/opensm/osm_mcast_mgr.c +++ b/opensm/opensm/osm_mcast_mgr.c @@ -1307,6 +1307,7 @@ osm_signal_t osm_mcast_mgr_process(IN osm_mcast_mgr_t * const p_mgr) cl_qmap_t *p_mcast_tbl; cl_qlist_t *p_list = &p_mgr->p_subn->p_osm->sm.mgrp_list; osm_mgrp_t *p_mgrp; + osm_mgrp_t *p_tmp_mgrp; boolean_t pending_transactions = FALSE; OSM_LOG_ENTER(p_mgr->p_log, osm_mcast_mgr_process); @@ -1326,9 +1327,10 @@ osm_signal_t osm_mcast_mgr_process(IN osm_mcast_mgr_t * const p_mgr) /* We reached here due to some change that caused a heavy sweep of the subnet. Not due to a specific multicast request. So the request type is subnet_change and the port guid is 0. 
*/ + p_tmp_mgrp = (osm_mgrp_t *) cl_qmap_next(&p_mgrp->map_item); mcast_mgr_process_mgrp(p_mgr, p_mgrp, OSM_MCAST_REQ_TYPE_SUBNET_CHANGE, 0); - p_mgrp = (osm_mgrp_t *) cl_qmap_next(&p_mgrp->map_item); + p_mgrp = p_tmp_mgrp; } /* -- 1.5.1.4 From jackm at dev.mellanox.co.il Wed Jan 9 02:23:14 2008 From: jackm at dev.mellanox.co.il (Jack Morgenstein) Date: Wed, 9 Jan 2008 12:23:14 +0200 Subject: [ofa-general] [PATCH] libmlx4: avoid memcpy in blueflame post_sends Message-ID: <200801091223.14155.jackm@dev.mellanox.co.il> Do not use memcpy when copying to the BlueFlame buffer. memcpy implementations may use move-string-buffer (byte-wise copy) assembler instructions, which do not guarantee copy order into the blueflame buffer. Use a tight for-loop instead. BTW, this patch also slightly improves latency. Signed-off-by: Jack Morgenstein --- diff --git a/src/doorbell.h b/src/doorbell.h index 3171e76..c89ef0e 100644 --- a/src/doorbell.h +++ b/src/doorbell.h @@ -35,6 +35,8 @@ #if SIZEOF_LONG == 8 +typedef uint64_t mlx4_wc_copy_t; + #if __BYTE_ORDER == __LITTLE_ENDIAN # define MLX4_PAIR_TO_64(val) ((uint64_t) val[1] << 32 | val[0]) #elif __BYTE_ORDER == __BIG_ENDIAN @@ -50,6 +52,8 @@ static inline void mlx4_write64(uint32_t val[2], struct mlx4_context *ctx, int o #else +typedef uint32_t mlx4_wc_copy_t; + static inline void mlx4_write64(uint32_t val[2], struct mlx4_context *ctx, int offset) { pthread_spin_lock(&ctx->uar_lock); diff --git a/src/qp.c b/src/qp.c index bced740..8fc8450 100644 --- a/src/qp.c +++ b/src/qp.c @@ -391,7 +391,23 @@ out: pthread_spin_lock(&ctx->bf_lock); - memcpy(ctx->bf_page + ctx->bf_offset, ctrl, align(size * 16, 64)); + /* + * Avoid using memcpy to copy to BlueFlame page, since recent + * memcpy implementations use move-string-buffer assembler + * instructions, which do not guarantee order of copying. 
+ */ + + { + mlx4_wc_copy_t *target = + (mlx4_wc_copy_t *) (ctx->bf_page + ctx->bf_offset); + mlx4_wc_copy_t *src = (mlx4_wc_copy_t *) ctrl; + int n = align(size * 16, 64) / (sizeof(mlx4_wc_copy_t) * 2); + for (; n; --n) { + *target++ = *src++; + *target++ = *src++; + } + } + wc_wmb(); ctx->bf_offset ^= ctx->bf_buf_size; From swise at opengridcomputing.com Wed Jan 9 06:15:07 2008 From: swise at opengridcomputing.com (Steve Wise) Date: Wed, 09 Jan 2008 08:15:07 -0600 Subject: [ofa-general] OFED Jan-07, 2008 meeting summary on readiness toward RC2 In-Reply-To: <6C2C79E72C305246B504CBA17B5500C9030F0B46@mtlexch01.mtl.com> References: <6C2C79E72C305246B504CBA17B5500C9030F0B46@mtlexch01.mtl.com> Message-ID: <4784D6EB.2000608@opengridcomputing.com> Tziporet Koren wrote: > OFED Jan-07, 2008 meeting summary on readiness toward RC2 > > 1. Release status: > * In general there are no major issues - testing continue at all > companies > * There is a wide coverage of platform and OSes > > 2. Tasks that should be completed for RC2: > * XRC - enhanced API - will be ready by next week > * IPoIB performance improvements for small messages - at least > some of the changes will be integrated > * Open MPI 1.2.5-rc2 - will be ready by next week > * Qlogic new driver - done > > 3. Agree on new schedule for the release: > * RC2: Jan 15, 2008 > * RC3: Jan 29, 2008 > * RC4: Feb 12, 2008 > * Release: Feb 19, 2008 > > If we will see that RC3 is stable enough we will try to pull-in > And in any case we do not want to delay the release any more > > 4. 
Review critical and major bugs: > 750 critical raisch at de.ibm.com Problem with modprobe ib_ehca > with older kernel versions > - probably fixed > 760 major eli at mellanox.co.il UDP performance on Rx is lower > than Tx > - related to IPoIB above > 761 major eli at mellanox.co.il Poor and jittery UDP performance > at small messages > - related to IPoIB above > 820 major pasha at mellanox.co.il rpm 4.4.2.2, Binary file matches > Binary file > - patch was sent by OSU will be incorporated by Pasha > 800 major perkinjo at cse.ohio-state.edu MVAPICH2 compile error > on PPC64 > - fixed > 736 major rolandd at cisco.com IBV_WC_RETRY_EXC_ERR errors with > local rdma_reads > - Need Arlin to retest with new FW > 767 major swise at opengridcomputing.com Non backport Kernels > that don't build in genalloc compile errors for cxgb3 > - not a major issue (will be in RN) > > I'm beginning to think I should really fix this. I had another customer hit this issue today. The fix, however, is to _always_ build the genpool backport into the ib_core module. Is that a reasonable fix? I'd basically move the genpool backport patch into kernel_patches/fixes so it always gets applied... Thoughts? From anton.bodner at qlogic.com Wed Jan 9 07:16:15 2008 From: anton.bodner at qlogic.com (Anton Bodner) Date: Wed, 9 Jan 2008 09:16:15 -0600 Subject: [ofa-general] Multiple Apps attempting to register as report processors Message-ID: <99863D2ED484D449811D97A4C44C9CBD62D35D@EPEXCH2.qlogic.org> Hello - I am in process of writing applications using the OFED stack version 1.2.5. In my application - I want to register as a SA class report processor, and I have more than one instance of this application. So that means I have two or more apps registering as SA class report processors FOR THE SAME NODE / HCA PORT. I get an error when I attempt to do the second (and subsequent registrations). It seems as though the OFED stack implementation allows for only 1 report processor per class per node. 
Is that correct? Are there any plans to enhance the OFED stack to allow for multiple report processor registrations for the same node? I am trying to determine if I need to / should modify my application design. Thanks Anton Bodner Jr. QLogic Corporation (610)233-4856 anton.bodner at qlogic.com http://www.qlogic.com From hrosenstock at xsigo.com Wed Jan 9 08:03:19 2008 From: hrosenstock at xsigo.com (Hal Rosenstock) Date: Wed, 09 Jan 2008 08:03:19 -0800 Subject: [ofa-general] Multiple Apps attempting to register as report processors In-Reply-To: <99863D2ED484D449811D97A4C44C9CBD62D35D@EPEXCH2.qlogic.org> References: <99863D2ED484D449811D97A4C44C9CBD62D35D@EPEXCH2.qlogic.org> Message-ID: <1199894600.3611.38.camel@hrosenstock-ws.xsigo.com> Anton, On Wed, 2008-01-09 at 09:16 -0600, Anton Bodner wrote: > Hello - > > I am in process of writing applications using the OFED stack version > 1.2.5. > > In my application - I want to register as a SA class report processor, > and I have more than one instance of this application. > > So that means I have two or more apps registering as SA class report > processors FOR THE SAME NODE / HCA PORT. I get an error when I attempt > to do the second (and subsequent registrations). What APIs are you using ? What is the specific error message ? > It seems as though the OFED stack implementation allows for only 1 > report processor per class per node. Is that correct? Presuming you are using unsolicited MAD registration, only one subscriber is allowed per class, method, and port (rather than node). > Are there any plans to enhance the OFED stack to allow for multiple > report processor registrations for the same node? ^^^^ port Not as far as I know; maybe others know otherwise. -- Hal > I am trying to determine if I need to / should modify my application > design. > Thanks > > > Anton Bodner Jr. 
> QLogic Corporation > (610)233-4856 > anton.bodner at qlogic.com > http://www.qlogic.com > > _______________________________________________ > general mailing list > general at lists.openfabrics.org > http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general > > To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general From sashak at voltaire.com Wed Jan 9 09:04:23 2008 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Wed, 9 Jan 2008 17:04:23 +0000 Subject: [ofa-general] Re: [PATCH RFC] opensm: mcast mgr improvements In-Reply-To: <4784B3A2.5080007@dev.mellanox.co.il> References: <4770CDCE.8040200@dev.mellanox.co.il> <20071229182718.GA19160@sashak.voltaire.com> <4784B3A2.5080007@dev.mellanox.co.il> Message-ID: <20080109170423.GC20963@sashak.voltaire.com> Hi Yevgeny, On 13:44 Wed 09 Jan , Yevgeny Kliteynik wrote: > Hi Sasha, > > Please see below. > > Sasha Khapyorsky wrote: > > This improves handling of mcast join/leave requests storming. Now mcast > > routing will be recalculated for all mcast groups where changes occurred > > and not one by one.
For this it queues mcast groups instead of mcast > > rerouting requests, this also makes state_mgr idle queue obsolete. > > Signed-off-by: Sasha Khapyorsky > > --- > > Hi Yevgeny, > > For me it looks that it should solve the original problem (mcast group > > list is purged in osm_mcast_mgr_process()). Could you review and ideally > > test it? Thanks. > > Sasha > > diff --git a/opensm/opensm/osm_mcast_mgr.c b/opensm/opensm/osm_mcast_mgr.c > > index 50b95fd..f51a45a 100644 > > --- a/opensm/opensm/osm_mcast_mgr.c > > +++ b/opensm/opensm/osm_mcast_mgr.c > > @@ -1254,63 +1254,55 @@ osm_mcast_mgr_process_tree(IN osm_mcast_mgr_t * > > const p_mgr, > > port_guid); > > } > > - Exit: > > +Exit: > > OSM_LOG_EXIT(p_mgr->p_log); > > return (status); > > } > > /********************************************************************** > > Process the entire group. > > - > > NOTE : The lock should be held externally! > > **********************************************************************/ > > -static osm_signal_t > > -osm_mcast_mgr_process_mgrp(IN osm_mcast_mgr_t * const p_mgr, > > - IN osm_mgrp_t * const p_mgrp, > > - IN osm_mcast_req_type_t req_type, > > - IN ib_net64_t port_guid) > > +static ib_api_status_t > > +mcast_mgr_process_mgrp(IN osm_mcast_mgr_t * const p_mgr, > > + IN osm_mgrp_t * const p_mgrp, > > + IN osm_mcast_req_type_t req_type, > > + IN ib_net64_t port_guid) > > { > > - osm_signal_t signal = OSM_SIGNAL_DONE; > > ib_api_status_t status; > > - osm_switch_t *p_sw; > > - cl_qmap_t *p_sw_tbl; > > - boolean_t pending_transactions = FALSE; > > OSM_LOG_ENTER(p_mgr->p_log, osm_mcast_mgr_process_mgrp); > > - p_sw_tbl = &p_mgr->p_subn->sw_guid_tbl; > > - > > status = osm_mcast_mgr_process_tree(p_mgr, p_mgrp, req_type, port_guid); > > if (status != IB_SUCCESS) { > > osm_log(p_mgr->p_log, OSM_LOG_ERROR, > > - "osm_mcast_mgr_process_mgrp: ERR 0A19: " > > + "mcast_mgr_process_mgrp: ERR 0A19: " > > "Unable to create spanning tree (%s)\n", > > ib_get_err_str(status)); > > - > > 
goto Exit; > > } > > + p_mgrp->last_tree_id = p_mgrp->last_change_id; > > - /* > > - Walk the switches and download the tables for each. > > + /* Remove MGRP only if osm_mcm_port_t count is 0 and > > + * Not a well known group > > */ > > - p_sw = (osm_switch_t *) cl_qmap_head(p_sw_tbl); > > - while (p_sw != (osm_switch_t *) cl_qmap_end(p_sw_tbl)) { > > - signal = __osm_mcast_mgr_set_tbl(p_mgr, p_sw); > > - if (signal == OSM_SIGNAL_DONE_PENDING) > > - pending_transactions = TRUE; > > - > > - p_sw = (osm_switch_t *) cl_qmap_next(&p_sw->map_item); > > + if (cl_qmap_count(&p_mgrp->mcm_port_tbl) == 0 && !p_mgrp->well_known) { > > + osm_log(p_mgr->p_log, OSM_LOG_DEBUG, > > + "mcast_mgr_process_mgrp: " > > + "Destroying mgrp with lid:0x%X\n", > > + cl_ntoh16(p_mgrp->mlid)); > > + /* Send a Report to any InformInfo registered for > > + Trap 67 : MCGroup delete */ > > + osm_mgrp_send_delete_notice(p_mgr->p_subn, p_mgr->p_log, > > + p_mgrp); > > + cl_qmap_remove_item(&p_mgr->p_subn->mgrp_mlid_tbl, > > + (cl_map_item_t *) p_mgrp); > > + osm_mgrp_delete(p_mgrp); > > If the group is empty, p_mgrp is deleted > > > @@ -1321,14 +1313,13 @@ osm_signal_t osm_mcast_mgr_process(IN > > osm_mcast_mgr_t * const p_mgr) > > osm_switch_t *p_sw; > > cl_qmap_t *p_sw_tbl; > > cl_qmap_t *p_mcast_tbl; > > + cl_qlist_t *p_list = &p_mgr->p_subn->p_osm->sm.mgrp_list; > > osm_mgrp_t *p_mgrp; > > - ib_api_status_t status; > > boolean_t pending_transactions = FALSE; > > OSM_LOG_ENTER(p_mgr->p_log, osm_mcast_mgr_process); > > p_sw_tbl = &p_mgr->p_subn->sw_guid_tbl; > > - > > p_mcast_tbl = &p_mgr->p_subn->mgrp_mlid_tbl; > > /* > > While holding the lock, iterate over all the established > > @@ -1343,16 +1334,8 @@ osm_signal_t osm_mcast_mgr_process(IN > > osm_mcast_mgr_t * const p_mgr) > > /* We reached here due to some change that caused a heavy sweep > > of the subnet. Not due to a specific multicast request. > > So the request type is subnet_change and the port guid is 0. 
*/ > > - status = osm_mcast_mgr_process_tree(p_mgr, p_mgrp, > > - OSM_MCAST_REQ_TYPE_SUBNET_CHANGE, > > - 0); > > - if (status != IB_SUCCESS) { > > - osm_log(p_mgr->p_log, OSM_LOG_ERROR, > > - "osm_mcast_mgr_process: ERR 0A20: " > > - "Unable to create spanning tree (%s)\n", > > - ib_get_err_str(status)); > > - } > > - > > + mcast_mgr_process_mgrp(p_mgr, p_mgrp, > > + OSM_MCAST_REQ_TYPE_SUBNET_CHANGE, 0); > > p_mgrp = (osm_mgrp_t *) cl_qmap_next(&p_mgrp->map_item); > > And here there's a call to 'next' on a p_mgrp that was freed, > which eventually causes osm to crash on some segfault or assert > at some point. Nice catch! Thanks for the fix! Sasha From sashak at voltaire.com Wed Jan 9 09:13:50 2008 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Wed, 9 Jan 2008 17:13:50 +0000 Subject: [ofa-general] Re: [PATCH] opensm/osm_mcast_mgr.c: fixing a seg. fault in processing mcast groups In-Reply-To: <4784C99B.5080606@dev.mellanox.co.il> References: <4784C99B.5080606@dev.mellanox.co.il> Message-ID: <20080109171350.GD20963@sashak.voltaire.com> On 15:18 Wed 09 Jan , Yevgeny Kliteynik wrote: > Sasha, > > This patch fixes a seg. fault in processing mcast groups > that I mentioned in my mail previously. > Feel free to replace it with more elegant solution. The patch looks fine for me. Just changed p_tmp_mgrp to p_next_mgrp. > > Please apply it to master and ofed_1_3. > > -- Yevgeny > > Signed-off-by: Yevgeny Kliteynik Applied. Thanks. 
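The crash discussed in this thread comes from calling cl_qmap_next() on a group that mcast_mgr_process_mgrp() may already have deleted; the applied fix reads the successor before processing the current entry. A minimal sketch of that pattern on a plain singly linked list might look like the following. The `struct group`, `push` and `purge_empty` names are hypothetical stand-ins chosen for illustration; they are not OpenSM types or functions, and the real code iterates a cl_qmap rather than a list.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a multicast group entry (not osm_mgrp_t). */
struct group {
	int members;		/* number of ports joined to this group */
	struct group *next;
};

/* Prepend a new group with the given member count; returns the new head. */
static struct group *push(struct group *head, int members)
{
	struct group *g = malloc(sizeof(*g));

	assert(g != NULL);
	g->members = members;
	g->next = head;
	return g;
}

/*
 * Free every group that has no members and return how many groups remain.
 * The successor is read BEFORE the current node may be freed, which is
 * the same idea as saving the result of cl_qmap_next() in a temporary.
 */
static int purge_empty(struct group **head)
{
	struct group **link = head;	/* pointer that refers to the current node */
	struct group *g = *head;
	struct group *next;
	int remaining = 0;

	while (g != NULL) {
		next = g->next;		/* save the successor first */
		if (g->members == 0) {
			*link = next;	/* unlink the empty group */
			free(g);	/* g must not be dereferenced after this */
		} else {
			link = &g->next;
			remaining++;
		}
		g = next;		/* advance without touching freed memory */
	}
	return remaining;
}
```

Reading `g->next` after `free(g)` instead would be the list equivalent of the bug reported above; whether it crashes immediately or only later depends on the allocator, which matches the "segfault or assert at some point" description.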
Sasha From weiny2 at llnl.gov Wed Jan 9 09:25:50 2008 From: weiny2 at llnl.gov (Ira Weiny) Date: Wed, 9 Jan 2008 09:25:50 -0800 Subject: [ofa-general] Re: [PATCH] infiniband-diags/saquery: attribute names support In-Reply-To: <20080109032401.GA20963@sashak.voltaire.com> References: <20080109032401.GA20963@sashak.voltaire.com> Message-ID: <20080109092550.50c68032.weiny2@llnl.gov> Sasha, I found this to not work completely for me: 09:23:20 > ./saquery NR 8 root at wopri:~/OpenIB/git-trees/root/sbin 09:23:26 > ./saquery NR wopr3 root at wopri:~/OpenIB/git-trees/root/sbin 09:23:30 > ./saquery -N wopr3 NodeRecord dump: lid.....................0x8 reserved................0x0 base_version............0x1 class_version...........0x1 node_type...............Channel Adapter num_ports...............0x2 sys_guid................0x0002c902002265ef node_guid...............0x0002c902002265ec port_guid...............0x0002c902002265ed partition_cap...........0x40 device_id...............0x6282 revision................0xA0 port_num................0x1 vendor_id...............0x2C9 NodeDescription.........wopr3 I added the following patch. diff --git a/infiniband-diags/src/saquery.c b/infiniband-diags/src/saquery.c index 9863860..26cb0d8 100644 --- a/infiniband-diags/src/saquery.c +++ b/infiniband-diags/src/saquery.c @@ -1389,6 +1389,7 @@ main(int argc, char **argv) else { query_type = q->query_type; argc--; + argv++; } } And things work out. 
09:23:34 > ./saquery NR wopr3 NodeRecord dump: lid.....................0x8 reserved................0x0 base_version............0x1 class_version...........0x1 node_type...............Channel Adapter num_ports...............0x2 sys_guid................0x0002c902002265ef node_guid...............0x0002c902002265ec port_guid...............0x0002c902002265ed partition_cap...........0x40 device_id...............0x6282 revision................0xA0 port_num................0x1 vendor_id...............0x2C9 NodeDescription.........wopr3 Ira On Wed, 9 Jan 2008 03:24:02 +0000 Sasha Khapyorsky wrote: > > This let to pass requested via command line SA attribute by name. > Examples: > > saquery NodeRecord > saquery NR > > Main motivation for this addition is that I cannot find appropriate free > characters for adding new attributes (specifically PKeyTableRecord and > SL2VLTableRecord - p, P, s, S are used already). > > This preserves a command line options currently used for same > purposes. > > Signed-off-by: Sasha Khapyorsky > --- > infiniband-diags/src/saquery.c | 54 ++++++++++++++++++++++++++++++++++++++- > 1 files changed, 52 insertions(+), 2 deletions(-) > > diff --git a/infiniband-diags/src/saquery.c b/infiniband-diags/src/saquery.c > index 2017a86..9863860 100644 > --- a/infiniband-diags/src/saquery.c > +++ b/infiniband-diags/src/saquery.c > @@ -91,7 +91,6 @@ ib_net16_t requested_lid = 0; > int requested_lid_flag = 0; > ib_net64_t requested_guid = 0; > int requested_guid_flag = 0; > -ib_net16_t query_type = IB_MAD_ATTR_NODE_RECORD; > > /** > * Call back for the various record requests. 
> @@ -1144,12 +1143,46 @@ clean_up(void) > osm_vendor_delete(&vendor); > } > > +struct query_cmd { > + const char *name, *alias; > + ib_net16_t query_type; > + int (*handler)(const char *name, osm_bind_handle_t bind_handle, > + char *from, char *to); > +}; > + > +static const struct query_cmd query_cmds[] = { > + { "ClassPortInfo", "CPI", IB_MAD_ATTR_CLASS_PORT_INFO, }, > + { "NodeRecord", "NR", IB_MAD_ATTR_NODE_RECORD, }, > + { "PortInfoRecord", "PIR", IB_MAD_ATTR_PORTINFO_RECORD, }, > + { "InformInfoRecord", "IIR", IB_MAD_ATTR_INFORM_INFO_RECORD, }, > + { "LinkRecord", "LR", IB_MAD_ATTR_LINK_RECORD, }, > + { "ServiceRecord", "SR", IB_MAD_ATTR_SERVICE_RECORD, }, > + { "PathRecord", "PR", IB_MAD_ATTR_PATH_RECORD, }, > + { "MCMemberRecord", "MCMR", IB_MAD_ATTR_MCMEMBER_RECORD, }, > + { 0 } > +}; > + > +static const struct query_cmd *find_query(const char *name) > +{ > + const struct query_cmd *q; > + unsigned len = strlen(name); > + > + for (q = query_cmds; q->name; q++) > + if (!strncasecmp(name, q->name, len) || > + (q->alias && !strncasecmp(name, q->alias, len))) > + return q; > + > + return NULL; > +} > + > static void > usage(void) > { > + const struct query_cmd *q; > + > fprintf(stderr, "Usage: %s [-h -d -p -N] [--list | -D] [-S -I -L -l -G" > " -O -U -c -s -g -m --src-to-dst --sgid-to-dgid " > - "-C -P -t(imeout) ] [ | | ]\n", > + "-C -P -t(imeout) ] [query-name] [ | | ]\n", > argv0); > fprintf(stderr, " Queries node records by default\n"); > fprintf(stderr, " -d enable debugging\n"); > @@ -1184,6 +1217,12 @@ usage(void) > "response timeout (default %u msec)\n", > DEFAULT_SA_TIMEOUT_MS); > fprintf(stderr, " --node-name-map specify a node name map\n"); > + fprintf(stderr, "\n Supported query names (and aliases):\n"); > + for (q = query_cmds; q->name; q++) > + fprintf(stderr, " %s (%s)\n", q->name, > + q->alias ? 
q->alias : ""); > + fprintf(stderr, "\n"); > + > exit(-1); > } > > @@ -1193,10 +1232,12 @@ main(int argc, char **argv) > int ch = 0; > int members = 0; > osm_bind_handle_t bind_handle; > + const struct query_cmd *q; > char *src = NULL; > char *dst = NULL; > char *sgid = NULL; > char *dgid = NULL; > + ib_net16_t query_type = 0; > ib_net16_t src_lid; > ib_net16_t dst_lid; > ib_api_status_t status; > @@ -1342,6 +1383,15 @@ main(int argc, char **argv) > argc -= optind; > argv += optind; > > + if (!query_type) { > + if (!argc || !(q = find_query(argv[0]))) > + query_type = IB_MAD_ATTR_NODE_RECORD; > + else { > + query_type = q->query_type; > + argc--; > + } > + } > + > if (argc) { > if (node_print_desc == NAME_OF_LID) { > requested_lid = (ib_net16_t)strtoul(argv[0], NULL, 0); > -- > 1.5.4.rc2.38.gd6da3 From sean.hefty at intel.com Wed Jan 9 09:31:27 2008 From: sean.hefty at intel.com (Sean Hefty) Date: Wed, 9 Jan 2008 09:31:27 -0800 Subject: [ofa-general] RE: librdmacm/man: fix-up man pages In-Reply-To: <47847E60.80208@voltaire.com> References: <000101c81a64$3582de80$9c98070a@amr.corp.intel.com> <4726EEAC.3070105@voltaire.com> <472755C4.10600@ichips.intel.com> <47285F53.4060402@voltaire.com> <4728BF4A.1060301@ichips.intel.com> <15ddcffd0710311320v6b91b3cm3be0f7882e30ad2b@mail.gmail.com> <000001c81cb5$4ce12160$9c98070a@amr.corp.intel.com> <15ddcffd0711270435t12a18dc3waac2596b3884ac72@mail.gmail.com> <000001c8311a$176cdbe0$63248686@amr.corp.intel.com> <15ddcffd0711280307u7a89c6c2q2854b071f74d9123@mail.gmail.com> <000801c832b6$81feb850$f5d8180a@amr.corp.intel.com> <475FD984.6080203@voltaire.com> <47837BFA.7040402@voltaire.com> <000001c8521d$a05df320$a937170a@amr.corp.intel.com> <47847E60.80208@voltaire.com> Message-ID: <000101c852e5$776befd0$3cd9180a@amr.corp.intel.com> >mentioning that rdma_disconnect applies only to the connected service, >do you think its obvious? I was viewing this as covered by the text: "Disconnects a connection.." 
- Sean From sashak at voltaire.com Wed Jan 9 09:57:29 2008 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Wed, 9 Jan 2008 17:57:29 +0000 Subject: [ofa-general] Re: [PATCH] infiniband-diags/saquery: attribute names support In-Reply-To: <20080109092550.50c68032.weiny2@llnl.gov> References: <20080109032401.GA20963@sashak.voltaire.com> <20080109092550.50c68032.weiny2@llnl.gov> Message-ID: <20080109175729.GE20963@sashak.voltaire.com> Hi Ira, On 09:25 Wed 09 Jan , Ira Weiny wrote: > Sasha, > > I found this to not work completely for me: > > 09:23:20 > ./saquery NR 8 > root at wopri:~/OpenIB/git-trees/root/sbin Yes, this version of the patch was buggy (argv++ is missing, but I see - you found this too :)). I will repost a fixed version. Thanks for looking at this. Sasha > 09:23:26 > ./saquery NR wopr3 > root at wopri:~/OpenIB/git-trees/root/sbin > 09:23:30 > ./saquery -N wopr3 > NodeRecord dump: > lid.....................0x8 > reserved................0x0 > base_version............0x1 > class_version...........0x1 > node_type...............Channel Adapter > num_ports...............0x2 > sys_guid................0x0002c902002265ef > node_guid...............0x0002c902002265ec > port_guid...............0x0002c902002265ed > partition_cap...........0x40 > device_id...............0x6282 > revision................0xA0 > port_num................0x1 > vendor_id...............0x2C9 > NodeDescription.........wopr3 > > I added the following patch. > > diff --git a/infiniband-diags/src/saquery.c b/infiniband-diags/src/saquery.c > index 9863860..26cb0d8 100644 > --- a/infiniband-diags/src/saquery.c > +++ b/infiniband-diags/src/saquery.c > @@ -1389,6 +1389,7 @@ main(int argc, char **argv) > else { > query_type = q->query_type; > argc--; > + argv++; > } > } > > > And things work out. 
> > 09:23:34 > ./saquery NR wopr3 > NodeRecord dump: > lid.....................0x8 > reserved................0x0 > base_version............0x1 > class_version...........0x1 > node_type...............Channel Adapter > num_ports...............0x2 > sys_guid................0x0002c902002265ef > node_guid...............0x0002c902002265ec > port_guid...............0x0002c902002265ed > partition_cap...........0x40 > device_id...............0x6282 > revision................0xA0 > port_num................0x1 > vendor_id...............0x2C9 > NodeDescription.........wopr3 > > Ira > > On Wed, 9 Jan 2008 03:24:02 +0000 > Sasha Khapyorsky wrote: > > > > > This let to pass requested via command line SA attribute by name. > > Examples: > > > > saquery NodeRecord > > saquery NR > > > > Main motivation for this addition is that I cannot find appropriate free > > characters for adding new attributes (specifically PKeyTableRecord and > > SL2VLTableRecord - p, P, s, S are used already). > > > > This preserves a command line options currently used for same > > purposes. > > > > Signed-off-by: Sasha Khapyorsky > > --- > > infiniband-diags/src/saquery.c | 54 ++++++++++++++++++++++++++++++++++++++- > > 1 files changed, 52 insertions(+), 2 deletions(-) > > > > diff --git a/infiniband-diags/src/saquery.c b/infiniband-diags/src/saquery.c > > index 2017a86..9863860 100644 > > --- a/infiniband-diags/src/saquery.c > > +++ b/infiniband-diags/src/saquery.c > > @@ -91,7 +91,6 @@ ib_net16_t requested_lid = 0; > > int requested_lid_flag = 0; > > ib_net64_t requested_guid = 0; > > int requested_guid_flag = 0; > > -ib_net16_t query_type = IB_MAD_ATTR_NODE_RECORD; > > > > /** > > * Call back for the various record requests. 
> > @@ -1144,12 +1143,46 @@ clean_up(void) > > osm_vendor_delete(&vendor); > > } > > > > +struct query_cmd { > > + const char *name, *alias; > > + ib_net16_t query_type; > > + int (*handler)(const char *name, osm_bind_handle_t bind_handle, > > + char *from, char *to); > > +}; > > + > > +static const struct query_cmd query_cmds[] = { > > + { "ClassPortInfo", "CPI", IB_MAD_ATTR_CLASS_PORT_INFO, }, > > + { "NodeRecord", "NR", IB_MAD_ATTR_NODE_RECORD, }, > > + { "PortInfoRecord", "PIR", IB_MAD_ATTR_PORTINFO_RECORD, }, > > + { "InformInfoRecord", "IIR", IB_MAD_ATTR_INFORM_INFO_RECORD, }, > > + { "LinkRecord", "LR", IB_MAD_ATTR_LINK_RECORD, }, > > + { "ServiceRecord", "SR", IB_MAD_ATTR_SERVICE_RECORD, }, > > + { "PathRecord", "PR", IB_MAD_ATTR_PATH_RECORD, }, > > + { "MCMemberRecord", "MCMR", IB_MAD_ATTR_MCMEMBER_RECORD, }, > > + { 0 } > > +}; > > + > > +static const struct query_cmd *find_query(const char *name) > > +{ > > + const struct query_cmd *q; > > + unsigned len = strlen(name); > > + > > + for (q = query_cmds; q->name; q++) > > + if (!strncasecmp(name, q->name, len) || > > + (q->alias && !strncasecmp(name, q->alias, len))) > > + return q; > > + > > + return NULL; > > +} > > + > > static void > > usage(void) > > { > > + const struct query_cmd *q; > > + > > fprintf(stderr, "Usage: %s [-h -d -p -N] [--list | -D] [-S -I -L -l -G" > > " -O -U -c -s -g -m --src-to-dst --sgid-to-dgid " > > - "-C -P -t(imeout) ] [ | | ]\n", > > + "-C -P -t(imeout) ] [query-name] [ | | ]\n", > > argv0); > > fprintf(stderr, " Queries node records by default\n"); > > fprintf(stderr, " -d enable debugging\n"); > > @@ -1184,6 +1217,12 @@ usage(void) > > "response timeout (default %u msec)\n", > > DEFAULT_SA_TIMEOUT_MS); > > fprintf(stderr, " --node-name-map specify a node name map\n"); > > + fprintf(stderr, "\n Supported query names (and aliases):\n"); > > + for (q = query_cmds; q->name; q++) > > + fprintf(stderr, " %s (%s)\n", q->name, > > + q->alias ? 
q->alias : "");
> > +        fprintf(stderr, "\n");
> > +
> >          exit(-1);
> >  }
> >
> > @@ -1193,10 +1232,12 @@ main(int argc, char **argv)
> >          int ch = 0;
> >          int members = 0;
> >          osm_bind_handle_t bind_handle;
> > +        const struct query_cmd *q;
> >          char *src = NULL;
> >          char *dst = NULL;
> >          char *sgid = NULL;
> >          char *dgid = NULL;
> > +        ib_net16_t query_type = 0;
> >          ib_net16_t src_lid;
> >          ib_net16_t dst_lid;
> >          ib_api_status_t status;
> > @@ -1342,6 +1383,15 @@ main(int argc, char **argv)
> >          argc -= optind;
> >          argv += optind;
> >
> > +        if (!query_type) {
> > +                if (!argc || !(q = find_query(argv[0])))
> > +                        query_type = IB_MAD_ATTR_NODE_RECORD;
> > +                else {
> > +                        query_type = q->query_type;
> > +                        argc--;
> > +                }
> > +        }
> > +
> >          if (argc) {
> >                  if (node_print_desc == NAME_OF_LID) {
> >                          requested_lid = (ib_net16_t)strtoul(argv[0], NULL, 0);
> > --
> > 1.5.4.rc2.38.gd6da3

From sashak at voltaire.com Wed Jan 9 09:59:02 2008
From: sashak at voltaire.com (Sasha Khapyorsky)
Date: Wed, 9 Jan 2008 17:59:02 +0000
Subject: [ofa-general] [PATCH v2] infiniband-diags/saquery: attribute names support
In-Reply-To: <20080109032401.GA20963@sashak.voltaire.com>
References: <20080109032401.GA20963@sashak.voltaire.com>
Message-ID: <20080109175902.GF20963@sashak.voltaire.com>

This lets the requested SA attribute be passed by name on the command line.
Examples:

  saquery NodeRecord
  saquery NR

The main motivation for this addition is that I cannot find appropriate free
characters for adding new attributes (specifically PKeyTableRecord and
SL2VLTableRecord: p, P, s, and S are already used).

This preserves the command-line options currently used for the same
purpose.
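
A minimal, self-contained sketch of the lookup described above, not the patch
itself. The table is cut down to three entries, the ATTR_* constants are
hypothetical stand-ins for the IB_MAD_ATTR_* values, and select_query() is a
made-up helper name. It keeps the patch's case-insensitive prefix match (first
table entry wins), adds a guard against an empty name (an assumption, not in
the patch), and consumes a matched name by adjusting argc and argv together so
that the code that follows sees the next positional argument.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <strings.h>	/* strncasecmp (POSIX) */

/* Hypothetical stand-ins for the IB_MAD_ATTR_* constants. */
enum { ATTR_NONE = 0, ATTR_NODE_RECORD, ATTR_LINK_RECORD, ATTR_PATH_RECORD };

struct query_cmd {
	const char *name, *alias;
	int query_type;
};

static const struct query_cmd query_cmds[] = {
	{ "NodeRecord", "NR", ATTR_NODE_RECORD },
	{ "LinkRecord", "LR", ATTR_LINK_RECORD },
	{ "PathRecord", "PR", ATTR_PATH_RECORD },
	{ NULL, NULL, ATTR_NONE }	/* sentinel: name == NULL ends the table */
};

/*
 * Case-insensitive prefix match against name or alias; the first matching
 * table entry wins. The empty-string check is an addition for this sketch:
 * with len == 0, strncasecmp() would report a match for every entry.
 */
static const struct query_cmd *find_query(const char *name)
{
	const struct query_cmd *q;
	size_t len = strlen(name);

	if (!len)
		return NULL;
	for (q = query_cmds; q->name; q++)
		if (!strncasecmp(name, q->name, len) ||
		    (q->alias && !strncasecmp(name, q->alias, len)))
			return q;
	return NULL;
}

/*
 * Choose the query type from the first positional argument if it names one.
 * On a match the argument is consumed: argc is decremented AND argv is
 * advanced. Doing only one of the two leaves argv[0] pointing at the
 * query name instead of the argument after it.
 */
static int select_query(int *argc, char ***argv)
{
	const struct query_cmd *q;

	assert(argc && argv);
	if (*argc < 1 || !(q = find_query((*argv)[0])))
		return ATTR_NODE_RECORD;	/* default: node records */
	(*argc)--;
	(*argv)++;
	return q->query_type;
}
```

With the arguments "NR wopr3", select_query() returns the NodeRecord stand-in
and leaves argv[0] pointing at "wopr3"; with "wopr3" alone nothing is consumed
and the NodeRecord default is used.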

Signed-off-by: Sasha Khapyorsky
---
 infiniband-diags/src/saquery.c |   55 ++++++++++++++++++++++++++++++++++++++-
 1 files changed, 53 insertions(+), 2 deletions(-)

diff --git a/infiniband-diags/src/saquery.c b/infiniband-diags/src/saquery.c
index 2017a86..26cb0d8 100644
--- a/infiniband-diags/src/saquery.c
+++ b/infiniband-diags/src/saquery.c
@@ -91,7 +91,6 @@ ib_net16_t requested_lid = 0;
 int requested_lid_flag = 0;
 ib_net64_t requested_guid = 0;
 int requested_guid_flag = 0;
-ib_net16_t query_type = IB_MAD_ATTR_NODE_RECORD;

 /**
  * Call back for the various record requests.
@@ -1144,12 +1143,46 @@ clean_up(void)
         osm_vendor_delete(&vendor);
 }

+struct query_cmd {
+        const char *name, *alias;
+        ib_net16_t query_type;
+        int (*handler)(const char *name, osm_bind_handle_t bind_handle,
+                       char *from, char *to);
+};
+
+static const struct query_cmd query_cmds[] = {
+        { "ClassPortInfo", "CPI", IB_MAD_ATTR_CLASS_PORT_INFO, },
+        { "NodeRecord", "NR", IB_MAD_ATTR_NODE_RECORD, },
+        { "PortInfoRecord", "PIR", IB_MAD_ATTR_PORTINFO_RECORD, },
+        { "InformInfoRecord", "IIR", IB_MAD_ATTR_INFORM_INFO_RECORD, },
+        { "LinkRecord", "LR", IB_MAD_ATTR_LINK_RECORD, },
+        { "ServiceRecord", "SR", IB_MAD_ATTR_SERVICE_RECORD, },
+        { "PathRecord", "PR", IB_MAD_ATTR_PATH_RECORD, },
+        { "MCMemberRecord", "MCMR", IB_MAD_ATTR_MCMEMBER_RECORD, },
+        { 0 }
+};
+
+static const struct query_cmd *find_query(const char *name)
+{
+        const struct query_cmd *q;
+        unsigned len = strlen(name);
+
+        for (q = query_cmds; q->name; q++)
+                if (!strncasecmp(name, q->name, len) ||
+                    (q->alias && !strncasecmp(name, q->alias, len)))
+                        return q;
+
+        return NULL;
+}
+
 static void
 usage(void)
 {
+        const struct query_cmd *q;
+
         fprintf(stderr, "Usage: %s [-h -d -p -N] [--list | -D] [-S -I -L -l -G"
                 " -O -U -c -s -g -m --src-to-dst --sgid-to-dgid "
-                "-C -P -t(imeout) ] [ | | ]\n",
+                "-C -P -t(imeout) ] [query-name] [ | | ]\n",
                 argv0);
         fprintf(stderr, " Queries node records by default\n");
         fprintf(stderr, " -d enable debugging\n");
@@ -1184,6 +1217,12 @@ usage(void)
                 "response timeout (default %u msec)\n",
                 DEFAULT_SA_TIMEOUT_MS);
         fprintf(stderr, " --node-name-map specify a node name map\n");
+        fprintf(stderr, "\n Supported query names (and aliases):\n");
+        for (q = query_cmds; q->name; q++)
+                fprintf(stderr, " %s (%s)\n", q->name,
+                        q->alias ? q->alias : "");
+        fprintf(stderr, "\n");
+
         exit(-1);
 }

@@ -1193,10 +1232,12 @@ main(int argc, char **argv)
         int ch = 0;
         int members = 0;
         osm_bind_handle_t bind_handle;
+        const struct query_cmd *q;
         char *src = NULL;
         char *dst = NULL;
         char *sgid = NULL;
         char *dgid = NULL;
+        ib_net16_t query_type = 0;
         ib_net16_t src_lid;
         ib_net16_t dst_lid;
         ib_api_status_t status;
@@ -1342,6 +1383,16 @@ main(int argc, char **argv)
         argc -= optind;
         argv += optind;

+        if (!query_type) {
+                if (!argc || !(q = find_query(argv[0])))
+                        query_type = IB_MAD_ATTR_NODE_RECORD;
+                else {
+                        query_type = q->query_type;
+                        argc--;
+                        argv++;
+                }
+        }
+
         if (argc) {
                 if (node_print_desc == NAME_OF_LID) {
                         requested_lid = (ib_net16_t)strtoul(argv[0], NULL, 0);
--
1.5.4.rc2.38.gd6da3

From brian.budge at gmail.com Wed Jan 9 10:04:43 2008
From: brian.budge at gmail.com (Brian Budge)
Date: Wed, 9 Jan 2008 10:04:43 -0800
Subject: [ofa-general] strange problem with infiniband and GPUs
Message-ID: <5b7094580801091004l352877f1wb7b562d67a024d17@mail.gmail.com>

Hi all -

I'm new to the list, and I hope this is the correct place to post this.
I am running an MPI application which uses CUDA and NVIDIA GPUs to
accelerate computation. I am using mvapich2 to get multi-thread-safe MPI
with infiniband. If I run mvapich2 configured for tcp, my application runs
fine (or if I run it in single node mode without MPI), but if I run it
configured for infiniband, my application fails on GPU initialization about
80% of the time (the other 20% of the time, my application runs fine to
completion). I'm not sure what could be happening.
I'm not sure if somehow one of the infiniband drivers could be interacting
with the nvidia driver?

Thanks for any help,
Brian

From weiny2 at llnl.gov Wed Jan 9 10:35:01 2008
From: weiny2 at llnl.gov (Ira Weiny)
Date: Wed, 9 Jan 2008 10:35:01 -0800
Subject: [ofa-general] ***SPAM*** [PATCH] Fix perfmgr enable in spec file.
Message-ID: <20080109103501.57d84fd1.weiny2@llnl.gov>

From f419375b725d00b9b6bfe37e335fa1e0e4259174 Mon Sep 17 00:00:00 2001
From: Ira K. Weiny
Date: Wed, 9 Jan 2008 10:31:48 -0800
Subject: [PATCH] Fix perfmgr enable in spec file.

Signed-off-by: Ira K. Weiny
---
 opensm/opensm.spec.in |    2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/opensm/opensm.spec.in b/opensm/opensm.spec.in
index 4fe2ea4..20bcc7b 100644
--- a/opensm/opensm.spec.in
+++ b/opensm/opensm.spec.in
@@ -75,6 +75,8 @@ Static version of the opensm libraries
 %configure \
         %{?_enable_console_socket} \
         %{?_disable_console_socket} \
+        %{?_enable_perf_mgr} \
+        %{?_disable_perf_mgr} \
         %{?_enable_event_plugin} \
         %{?_disable_event_plugin}
 make %{?_smp_mflags}
--
1.5.1

-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-Fix-perfmgr-enable-in-spec-file.patch
Type: application/octet-stream
Size: 793 bytes
Desc: not available

From anton.bodner at qlogic.com Wed Jan 9 10:55:24 2008
From: anton.bodner at qlogic.com (Anton Bodner)
Date: Wed, 9 Jan 2008 12:55:24 -0600
Subject: [ofa-general] Multiple Apps attempting to register as reportprocessors
In-Reply-To: <1199894600.3611.38.camel@hrosenstock-ws.xsigo.com>
References: <99863D2ED484D449811D97A4C44C9CBD62D35D@EPEXCH2.qlogic.org> <1199894600.3611.38.camel@hrosenstock-ws.xsigo.com>
Message-ID: <99863D2ED484D449811D97A4C44C9CBD62D3B7@EPEXCH2.qlogic.org>

We are using the umad interface [umad_register ]... with the capability bit
IB_MAD_METHOD_REPORT set.
The method returns -1 (EPERM), and the kernel logs the error:
Ib_mad:Method 6 already in use.

And yes - I did mean (as you suggest) one subscriber per class, method and
PORT.

From your feedback - it seems as though I need to make some mods to my
application.

Anton Bodner Jr.
QLogic Corporation
(610)233-4856
anton.bodner at qlogic.com
http://www.qlogic.com

-----Original Message-----
From: Hal Rosenstock [mailto:hrosenstock at xsigo.com]
Sent: Wednesday, January 09, 2008 11:03 AM
To: Anton Bodner
Cc: general at lists.openfabrics.org
Subject: Re: [ofa-general] Multiple Apps attempting to register as reportprocessors

Anton,

On Wed, 2008-01-09 at 09:16 -0600, Anton Bodner wrote:
> Hello -
>
> I am in process of writing applications using the OFED stack version
> 1.2.5.
>
> In my application - I want to register as a SA class report processor,
> and I have more than one instance of this application.
>
> So that means I have two or more apps registering as SA class report
> processors FOR THE SAME NODE / HCA PORT. I get an error when I attempt
> to do the second (and subsequent registrations).

What APIs are you using ? What is the specific error message ?

> It seems as though the OFED stack implementation allows for only 1
> report processor per class per node. Is that correct?

Presuming you are using unsolicited MAD registration, only one
subscriber is allowed per class, method, and port (rather than node).

> Are there any plans to enhance the OFED stack to allow for multiple
> report processor registrations for the same node?
                                              ^^^^ port
Not as far as I know; maybe others know otherwise.

-- Hal

> I am trying to determine if I need to / should modify my application
> design.
> Thanks
>
>
> Anton Bodner Jr.
> QLogic Corporation
> (610)233-4856
> anton.bodner at qlogic.com
> http://www.qlogic.com
>
> _______________________________________________
> general mailing list
> general at lists.openfabrics.org
> http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general
>
> To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general

From sashak at voltaire.com Wed Jan 9 11:39:14 2008
From: sashak at voltaire.com (Sasha Khapyorsky)
Date: Wed, 9 Jan 2008 19:39:14 +0000
Subject: [ofa-general] [PATCH] infiniband-diags/saquery: support for SL2VLTableRecord attribute
In-Reply-To: <20080109175902.GF20963@sashak.voltaire.com>
References: <20080109032401.GA20963@sashak.voltaire.com> <20080109175902.GF20963@sashak.voltaire.com>
Message-ID: <20080109193914.GG20963@sashak.voltaire.com>

This adds support for the SL2VLTableRecord attribute. The port numbers can
optionally be specified on the command line, appended to the LID in the
following format: /[in-port]/[out-port]. Examples:

  saquery SL2VLTableRecord 28 - query lid 28
  saquery SL2VL 28/1/3 - query lid 28, in-port 1, out-port 3
  saquery SL2VL 28//3 - query lid 28, out-port 3

Signed-off-by: Sasha Khapyorsky
---
 infiniband-diags/src/saquery.c |  122 ++++++++++++++++++++++++++++++++++++---
 1 files changed, 112 insertions(+), 10 deletions(-)

diff --git a/infiniband-diags/src/saquery.c b/infiniband-diags/src/saquery.c
index 26cb0d8..23c2d92 100644
--- a/infiniband-diags/src/saquery.c
+++ b/infiniband-diags/src/saquery.c
@@ -57,6 +57,13 @@

 #include "ibdiag_common.h"

+struct query_cmd {
+        const char *name, *alias;
+        ib_net16_t query_type;
+        int (*handler)(const struct query_cmd *q, osm_bind_handle_t bind_handle,
+                       int argc, char *argv[]);
+};
+
 char *argv0 = "saquery";

 static char *node_name_map_file = NULL;
@@ -568,6 +575,27 @@ static void dump_one_link_record(ib_link_record_t *lr)
                lr->to_port_num, cl_ntoh16(lr->to_lid));
 }

+static void dump_one_slvl_record(ib_slvl_table_record_t *slvl)
+{
+        ib_slvl_table_t *t = &slvl->slvl_tbl;
+        printf("SL2VLTableRecord dump:\n"
+               "\t\tLID....................%u\n"
+               "\t\tInPort...................%u\n"
+               "\t\tOutPort.....................%u\n"
+               "\t\tSL: 0| 1| 2| 3| 4| 5| 6| 7| 8| 9|10|11|12|13|14|15|\n"
+               "\t\tVL:%2u|%2u|%2u|%2u|%2u|%2u|%2u|%2u|%2u|%2u|%2u|%2u|%2u"
+               "|%2u|%2u|%2u|\n",
+               cl_ntoh16(slvl->lid), slvl->in_port_num, slvl->out_port_num,
+               ib_slvl_table_get(t, 0), ib_slvl_table_get(t, 1),
+               ib_slvl_table_get(t, 2), ib_slvl_table_get(t, 3),
+               ib_slvl_table_get(t, 4), ib_slvl_table_get(t, 5),
+               ib_slvl_table_get(t, 6), ib_slvl_table_get(t, 7),
+               ib_slvl_table_get(t, 8), ib_slvl_table_get(t, 9),
+               ib_slvl_table_get(t, 10), ib_slvl_table_get(t, 11),
+               ib_slvl_table_get(t, 12), ib_slvl_table_get(t, 13),
+               ib_slvl_table_get(t, 14), ib_slvl_table_get(t, 15));
+}
+
 static void
 return_mad(void)
 {
@@ -737,6 +765,33 @@ static ib_api_status_t get_link_records(osm_bind_handle_t bind_handle,
                               ib_get_attr_offset(sizeof(ib_link_record_t)), 0);
 }

+static ib_api_status_t get_slvl_records(osm_bind_handle_t bind_handle,
+                                        int lid, int in_port, int out_port)
+{
+        ib_slvl_table_record_t slvl;
+        ib_net64_t comp_mask;
+
+        memset(&slvl, 0, sizeof(slvl));
+        comp_mask = 0;
+
+        if (lid > 0) {
+                slvl.lid = cl_hton16(lid);
+                comp_mask |= IB_SLVL_COMPMASK_LID;
+        }
+        if (in_port >= 0) {
+                slvl.in_port_num = in_port;
+                comp_mask |= IB_SLVL_COMPMASK_IN_PORT;
+        }
+        if (out_port >= 0) {
+                slvl.out_port_num = out_port;
+                comp_mask |= IB_SLVL_COMPMASK_OUT_PORT;
+        }
+
+        return get_any_records(bind_handle, IB_MAD_ATTR_SLVL_RECORD, 0,
+                               comp_mask, &slvl,
+                               ib_get_attr_offset(sizeof(ib_slvl_table_record_t)), 0);
+}
+
 static ib_api_status_t
 print_node_records(osm_bind_handle_t bind_handle)
 {
@@ -1069,6 +1124,54 @@ print_link_records(osm_bind_handle_t bind_handle, char *from, char *to)
         return status;
 }

+static int
+print_sl2vl_records(const struct query_cmd *q, osm_bind_handle_t bind_handle,
+                    int argc, char *argv[])
+{
+        int i;
+        ib_slvl_table_record_t *slvl;
+        int lid = 0, in_port = -1, out_port = -1;
+        ib_api_status_t status;
+
+        char *p, *s, *e;
+
+        if (argc < 1)
+                goto _query;
+
+        p = argv[0];
+        s = strchr(p, '/');
+        if (s) *s = '\0';
+        lid = get_lid(bind_handle, p);
+
+        if (!s)
+                goto _query;
+        p = s + 1;
+        s = strchr(p, '/');
+        if (s) *s = '\0';
+        in_port = strtoul(p, &e, 0);
+        if (e == p)
+                in_port = -1;
+
+        if (!s)
+                goto _query;
+        p = s + 1;
+        out_port = strtoul(p, &e, 0);
+        if (e == p)
+                out_port = -1;
+
+_query:
+        status = get_slvl_records(bind_handle, lid, in_port, out_port);
+        if (status != IB_SUCCESS)
+                return status;
+
+        for (i = 0; i < result.result_cnt; i++) {
+                slvl = osmv_get_query_result(result.p_result_madw, i);
+                dump_one_slvl_record(slvl);
+        }
+        return_mad();
+        return status;
+}
+
 static osm_bind_handle_t
 get_bind_handle(void)
 {
@@ -1143,17 +1246,12 @@ clean_up(void)
         osm_vendor_delete(&vendor);
 }

-struct query_cmd {
-        const char *name, *alias;
-        ib_net16_t query_type;
-        int (*handler)(const char *name, osm_bind_handle_t bind_handle,
-                       char *from, char *to);
-};
-
 static const struct query_cmd query_cmds[] = {
         { "ClassPortInfo", "CPI", IB_MAD_ATTR_CLASS_PORT_INFO, },
         { "NodeRecord", "NR", IB_MAD_ATTR_NODE_RECORD, },
         { "PortInfoRecord", "PIR", IB_MAD_ATTR_PORTINFO_RECORD, },
+        { "SL2VLTableRecord", "SL2VL", IB_MAD_ATTR_SLVL_RECORD,
+          print_sl2vl_records },
         { "InformInfoRecord", "IIR", IB_MAD_ATTR_INFORM_INFO_RECORD, },
         { "LinkRecord", "LR", IB_MAD_ATTR_LINK_RECORD, },
         { "ServiceRecord", "SR", IB_MAD_ATTR_SERVICE_RECORD, },
@@ -1232,7 +1330,7 @@ main(int argc, char **argv)
         int ch = 0;
         int members = 0;
         osm_bind_handle_t bind_handle;
-        const struct query_cmd *q;
+        const struct query_cmd *q = NULL;
         char *src = NULL;
         char *dst = NULL;
         char *sgid = NULL;
@@ -1484,8 +1582,12 @@ main(int argc, char **argv)
                 status = print_link_records(bind_handle, src, dst);
                 break;
         default:
-                fprintf(stderr, "Unknown query type %d\n", query_type);
-                status = IB_UNKNOWN_ERROR;
+                if (q && q->handler)
+                        status = q->handler(q, bind_handle, argc, argv);
+                else {
+                        fprintf(stderr, "Unknown query type %d\n", query_type);
+                        status = IB_UNKNOWN_ERROR;
+                }
                 break;
         }

--
1.5.4.rc2.38.gd6da3

From sashak at voltaire.com Wed Jan 9 11:41:53 2008
From: sashak at voltaire.com (Sasha Khapyorsky)
Date: Wed, 9 Jan 2008 19:41:53 +0000
Subject: [ofa-general] [PATCH] opensm/osm_sa_slvl_record: fix overflow crash
Message-ID: <20080109194153.GH20963@sashak.voltaire.com>

When SL2VLTableRecord is requested for a switch by LID only (no in and out
ports are selected in the compmask), it overflows its own physical ports
table.

Signed-off-by: Sasha Khapyorsky
---
 opensm/opensm/osm_sa_slvl_record.c |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/opensm/opensm/osm_sa_slvl_record.c b/opensm/opensm/osm_sa_slvl_record.c
index 28dddd4..cc21765 100644
--- a/opensm/opensm/osm_sa_slvl_record.c
+++ b/opensm/opensm/osm_sa_slvl_record.c
@@ -149,9 +149,9 @@ __osm_sa_slvl_by_comp_mask(IN osm_sa_t * sa,
         comp_mask = p_ctxt->comp_mask;
         num_ports = osm_node_get_num_physp(p_port->p_node);
         in_port_start = 0;
-        in_port_end = num_ports;
+        in_port_end = num_ports - 1;
         out_port_start = 0;
-        out_port_end = num_ports;
+        out_port_end = num_ports - 1;
         p_req_physp = p_ctxt->p_req_physp;

         if (p_port->p_node->node_info.node_type != IB_NODE_TYPE_SWITCH) {
--
1.5.4.rc2.38.gd6da3

From sashak at voltaire.com Wed Jan 9 11:56:19 2008
From: sashak at voltaire.com (Sasha Khapyorsky)
Date: Wed, 9 Jan 2008 19:56:19 +0000
Subject: [ofa-general] Re: [PATCH] Fix perfmgr enable in spec file.
In-Reply-To: <20080109103501.57d84fd1.weiny2@llnl.gov>
References: <20080109103501.57d84fd1.weiny2@llnl.gov>
Message-ID: <20080109195619.GK20963@sashak.voltaire.com>

On 10:35 Wed 09 Jan , Ira Weiny wrote:
> From f419375b725d00b9b6bfe37e335fa1e0e4259174 Mon Sep 17 00:00:00 2001
> From: Ira K. Weiny
> Date: Wed, 9 Jan 2008 10:31:48 -0800
> Subject: [PATCH] Fix perfmgr enable in spec file.
>
>
> Signed-off-by: Ira K. Weiny

Applied (for OFED-1.3 too). Thanks.

Sasha

From small at cs.fsu.edu Wed Jan 9 11:51:47 2008
From: small at cs.fsu.edu (small at cs.fsu.edu)
Date: Wed, 9 Jan 2008 14:51:47 -0500
Subject: [ofa-general] Problems running dtest - DAT_INVALID_ADDRESS
Message-ID: <20080109145147.m22cb935wg0o0sso@system.cs.fsu.edu>

I am VERY new to infiniband and I am trying to figure out the DAT API so I
can perhaps create a simple ping-pong program using uDAPL. I have not found
any good examples of accessing infiniband through a user API.

My adviser has a sixteen node infiniband cluster that we have used primarily
for MPI research and when I try and run the dtest program on one of the
nodes I get the following error:

$ dtest
24924 Running as server - OpenIB-cma
24924: Error Adaptor open: DAT_INVALID_ADDRESS

I saw a previous thread with a similar problem and the solution was to add
the ib1 ip address to the /etc/dat.conf file, but I cannot find access to
any infiniband ip information. We have ethernet adapters on each node which
we use to address each node in MPI programs. But then with the shared file
system I have 16 ip addresses and one copy of dat.conf. Which ip
address(es) should I put into it and where?

When I do an 'ifconfig' no ib adapter is present and when I try
'ifconfig ib0' I get:

ib0       Link encap:UNSPEC  HWaddr 00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00
          BROADCAST MULTICAST  MTU:2044  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:128
          RX bytes:0 (0.0 b)  TX bytes:0 (0.0 b)

When I do an ifconfig on ib1-9 they do not exist.

Here is my /etc/dat.conf :

...
#
OpenIB-cma u1.2 nonthreadsafe default /usr/lib64/libdaplcma.so dapl.1.2 "ib0 0" ""
OpenIB-cma-1 u1.2 nonthreadsafe default /usr/lib64/libdaplcma.so dapl.1.2 "ib1 0" ""
OpenIB-cma-2 u1.2 nonthreadsafe default /usr/lib64/libdaplcma.so dapl.1.2 "ib2 0" ""
OpenIB-cma-3 u1.2 nonthreadsafe default /usr/lib64/libdaplcma.so dapl.1.2 "ib3 0" ""
OpenIB-bond u1.2 nonthreadsafe default /usr/lib64/libdaplcma.so dapl.1.2 "bond0 0" ""

Any help would be much appreciated though I know I am probably missing
something very simple and obvious (or just have a too limited understanding
of my adviser's cluster). Also, I would like to know if the uDAPl is in fact
the best way to interface with infiniband and any links to help me get
started with it would be nice.

----------------------------------------------------------------
This message was sent using IMP, the Internet Messaging Program.

From or.gerlitz at gmail.com Wed Jan 9 12:03:42 2008
From: or.gerlitz at gmail.com (Or Gerlitz)
Date: Wed, 9 Jan 2008 22:03:42 +0200
Subject: [ofa-general] RE: librdmacm/man: fix-up man pages
In-Reply-To: <000101c852e5$776befd0$3cd9180a@amr.corp.intel.com>
References: <000101c81a64$3582de80$9c98070a@amr.corp.intel.com> <15ddcffd0711270435t12a18dc3waac2596b3884ac72@mail.gmail.com> <000001c8311a$176cdbe0$63248686@amr.corp.intel.com> <15ddcffd0711280307u7a89c6c2q2854b071f74d9123@mail.gmail.com> <000801c832b6$81feb850$f5d8180a@amr.corp.intel.com> <475FD984.6080203@voltaire.com> <47837BFA.7040402@voltaire.com> <000001c8521d$a05df320$a937170a@amr.corp.intel.com> <47847E60.80208@voltaire.com> <000101c852e5$776befd0$3cd9180a@amr.corp.intel.com>
Message-ID: <15ddcffd0801091203x3ad33c9fy30aae325f08e69a5@mail.gmail.com>

On 1/9/08, Sean Hefty wrote:
>
> >mentioning that rdma_disconnect applies only to the connected service,
> >do you think its obvious?
>
> I was viewing this as covered by the text: "Disconnects a connection.."
>

OK

Or.
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From sashak at voltaire.com Wed Jan 9 13:05:23 2008
From: sashak at voltaire.com (Sasha Khapyorsky)
Date: Wed, 9 Jan 2008 21:05:23 +0000
Subject: [ofa-general] [PATCH] infiniband-diags/saquery: support ports with LinkRecord query
In-Reply-To: <20080109193914.GG20963@sashak.voltaire.com>
References: <20080109032401.GA20963@sashak.voltaire.com> <20080109175902.GF20963@sashak.voltaire.com> <20080109193914.GG20963@sashak.voltaire.com>
Message-ID: <20080109210523.GL20963@sashak.voltaire.com>

Ports can now be selected with a LinkRecord query. They should be passed on
the command line in the format [LID]/[port]. Examples:

  saquery LinkRecord - queries all LinkRecords in a fabric
  saquery LR 28 - all LinkRecords for lid 28
  saquery LR 28/3 - all LinkRecords for lid 28 port 3
  saquery LR 28/3 36/4 - all LinkRecords from lid 28 port 3 to lid 36 port 4
  saquery LR /3 - all LinkRecords from ports 3 in a fabric

Signed-off-by: Sasha Khapyorsky
---
 infiniband-diags/src/saquery.c |   88 +++++++++++++++++++++++----------------
 1 files changed, 52 insertions(+), 36 deletions(-)

diff --git a/infiniband-diags/src/saquery.c b/infiniband-diags/src/saquery.c
index 23c2d92..d1cfe37 100644
--- a/infiniband-diags/src/saquery.c
+++ b/infiniband-diags/src/saquery.c
@@ -712,6 +712,42 @@ get_lid(osm_bind_handle_t bind_handle, const char * name)
         return (rc_lid);
 }

+static int parse_lid_and_ports(osm_bind_handle_t bind_handle,
+                               char *str, int *lid, int *port1, int *port2)
+{
+        char *p, *e;
+
+        if (port1) *port1 = -1;
+        if (port2) *port2 = -1;
+
+        p = strchr(str, '/');
+        if (p) *p = '\0';
+        if (lid)
+                *lid = get_lid(bind_handle, str);
+
+        if (!p)
+                return 0;
+        str = p + 1;
+        p = strchr(str, '/');
+        if (p) *p = '\0';
+        if (port1) {
+                *port1 = strtoul(str, &e, 0);
+                if (e == str)
+                        *port1 = -1;
+        }
+
+        if (!p)
+                return 0;
+        str = p + 1;
+        if (port2) {
+                *port2 = strtoul(str, &e, 0);
+                if (e == str)
+                        *port2 = -1;
+        }
+
+        return 0;
+}
+
 /*
  * Get the portinfo records available with IsSM or IsSMdisabled
CapabilityMask bit on. */ @@ -748,7 +784,7 @@ static ib_api_status_t get_link_records(osm_bind_handle_t bind_handle, comp_mask |= IB_LR_COMPMASK_FROM_LID; } if (from_port >= 0) { - lr.from_port_num = cl_hton16(from_port); + lr.from_port_num = from_port; comp_mask |= IB_LR_COMPMASK_FROM_PORT; } if (to_lid > 0) { @@ -756,7 +792,7 @@ static ib_api_status_t get_link_records(osm_bind_handle_t bind_handle, comp_mask |= IB_LR_COMPMASK_TO_LID; } if (to_port >= 0) { - lr.to_port_num = cl_hton16(to_port); + lr.to_port_num = to_port; comp_mask |= IB_LR_COMPMASK_TO_PORT; } @@ -1099,17 +1135,20 @@ print_inform_info_records(osm_bind_handle_t bind_handle) } static ib_api_status_t -print_link_records(osm_bind_handle_t bind_handle, char *from, char *to) +print_link_records(osm_bind_handle_t bind_handle, int argc, char *argv[]) { int i; ib_link_record_t *lr; - int from_lid, to_lid, from_port, to_port; + int from_lid = 0, to_lid = 0, from_port = -1, to_port = -1; ib_api_status_t status; - from_lid = get_lid(bind_handle, from); - to_lid = get_lid(bind_handle, to); - from_port = -1; - to_port = -1; + if (argc > 0) + parse_lid_and_ports(bind_handle, argv[0], + &from_lid, &from_port, NULL); + + if (argc > 1) + parse_lid_and_ports(bind_handle, argv[1], + &to_lid, &to_port, NULL); status = get_link_records(bind_handle, from_lid, from_port, to_lid, to_port); @@ -1133,33 +1172,10 @@ print_sl2vl_records(const struct query_cmd *q, osm_bind_handle_t bind_handle, int lid = 0, in_port = -1, out_port = -1; ib_api_status_t status; - char *p, *s, *e; - - if (argc < 1) - goto _query; - - p = argv[0]; - s = strchr(p, '/'); - if (s) *s = '\0'; - lid = get_lid(bind_handle, p); - - if (!s) - goto _query; - p = s + 1; - s = strchr(p, '/'); - if (s) *s = '\0'; - in_port = strtoul(p, &e, 0); - if (e == p) - in_port = -1; - - if (!s) - goto _query; - p = s + 1; - out_port = strtoul(p, &e, 0); - if (e == p) - out_port = -1; - -_query: + if (argc > 0) + parse_lid_and_ports(bind_handle, argv[0], + &lid, 
&in_port, &out_port); + status = get_slvl_records(bind_handle, lid, in_port, out_port); if (status != IB_SUCCESS) return status; @@ -1579,7 +1595,7 @@ main(int argc, char **argv) status = print_inform_info_records(bind_handle); break; case IB_MAD_ATTR_LINK_RECORD: - status = print_link_records(bind_handle, src, dst); + status = print_link_records(bind_handle, argc, argv); break; default: if (q && q->handler) -- 1.5.4.rc2.38.gd6da3 
From rdreier at cisco.com Wed Jan 9 13:53:34 2008 From: rdreier at cisco.com (Roland Dreier) Date: Wed, 09 Jan 2008 13:53:34 -0800 Subject: [ofa-general] Re: [PATCH] libmlx4: avoid memcpy in blueflame post_sends In-Reply-To: <200801091223.14155.jackm@dev.mellanox.co.il> (Jack Morgenstein's message of "Wed, 9 Jan 2008 12:23:14 +0200") References: <200801091223.14155.jackm@dev.mellanox.co.il> Message-ID: OK, but I'm not sure why this complexity: > #if SIZEOF_LONG == 8 > +typedef uint64_t mlx4_wc_copy_t; > #else > +typedef uint32_t mlx4_wc_copy_t; Isn't that just a crazy way of coming up with a new name for unsigned long? How about something like this (it seems to generate pretty good code on x86-64 at least): diff --git a/src/qp.c b/src/qp.c index 8b4adaa..5721860 100644 --- a/src/qp.c +++ b/src/qp.c @@ -168,6 +168,20 @@ static void set_data_seg(struct mlx4_wqe_data_seg *dseg, struct ibv_sge *sg) dseg->byte_count = htonl(sg->length); } +/* + * Avoid using memcpy() to copy to BlueFlame page, since memcpy() + * implementations may use move-string-buffer assembler instructions, + * which do not guarantee order of copying. 
+ */ +static void mlx4_bf_copy(unsigned long *dst, unsigned long *src, unsigned bytecnt) +{ + while (bytecnt > 0) { + *dst++ = *src++; + *dst++ = *src++; + bytecnt -= 2 * sizeof (long); + } +} + int mlx4_post_send(struct ibv_qp *ibqp, struct ibv_send_wr *wr, struct ibv_send_wr **bad_wr) { @@ -388,7 +402,8 @@ out: pthread_spin_lock(&ctx->bf_lock); - memcpy(ctx->bf_page + ctx->bf_offset, ctrl, align(size * 16, 64)); + mlx4_bf_copy(ctx->bf_page + ctx->bf_offset, (unsigned long *) ctrl, + align(size * 16, 64)); wc_wmb(); ctx->bf_offset ^= ctx->bf_buf_size; From ardavis at ichips.intel.com Wed Jan 9 14:25:19 2008 From: ardavis at ichips.intel.com (Arlin Davis) Date: Wed, 09 Jan 2008 14:25:19 -0800 Subject: [ofa-general] Problems running dtest - DAT_INVALID_ADDRESS In-Reply-To: <20080109145147.m22cb935wg0o0sso@system.cs.fsu.edu> References: <20080109145147.m22cb935wg0o0sso@system.cs.fsu.edu> Message-ID: <478549CF.2040408@ichips.intel.com> small at cs.fsu.edu wrote: > dat.conf. Which ip address(') should I put into it and where? When I do > an 'ifconfig' no ib adapter is present and when I try 'ifconfig ib0' I get: DAPL requires IPoIB to be up and configured with an address. Assuming your adapters are all connected via port 1, you need to configure the ib0 devices (static addresses) on the fabric. For example: /etc/sysconfig/network-scripts/ifcfg-ib0 DEVICE=ib0 BOOTPROTO=static ONBOOT=yes IPADDR=192.168.0.1 NETMASK=255.255.255.0 > # > OpenIB-cma u1.2 nonthreadsafe default /usr/lib64/libdaplcma.so dapl.1.2 > "ib0 0" "" uDAPL applications, dtest included, call dat_ia_open with a provider device name. In this case, the "OpenIB-cma" name is used and if you look at /etc/dat.conf you will see the link to a provider library and the device of ib0. The INVALID_ADDRESS is returned because no IP address is configured on netdev ib0. See dtest manpage for dtest client/server usage. 
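A quick way to confirm the condition described above (no IPv4 address configured on ib0) before running dtest is to walk the interface list. The following is only a minimal sketch and is not part of uDAPL or dtest: the helper name iface_has_ipv4 is made up for illustration, and it relies on nothing but the standard getifaddrs() call.

```c
#include <assert.h>
#include <ifaddrs.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/*
 * Hypothetical helper: returns 1 if the named interface has at least
 * one IPv4 address, 0 if it has none (or does not exist), and -1 if
 * the interface list could not be read.
 */
int iface_has_ipv4(const char *ifname)
{
	struct ifaddrs *list, *ifa;
	int found = 0;

	assert(ifname != NULL);

	if (getifaddrs(&list) != 0)
		return -1;

	for (ifa = list; ifa != NULL; ifa = ifa->ifa_next) {
		/* Some entries carry no address at all; skip them. */
		if (ifa->ifa_addr == NULL)
			continue;
		if (ifa->ifa_addr->sa_family == AF_INET &&
		    strcmp(ifa->ifa_name, ifname) == 0) {
			found = 1;
			break;
		}
	}

	freeifaddrs(list);
	return found;
}
```

If iface_has_ipv4("ib0") returns 0, that would match the DAT_INVALID_ADDRESS case discussed here, and the ifcfg-ib0 settings are the first thing to look at.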
Also, I would like to know if > the uDAPL is in fact the best way to interface with InfiniBand and any > links to help me get started with it would be nice. It depends on how much portability you want and how many different devices you wish to support with your application. uDAPL provides both O/S and transport independence, so if you have plans to run your application on multiple O/S'es or want to support more than just OFA RDMA devices then uDAPL is your best choice. -arlin From weiny2 at llnl.gov Wed Jan 9 14:30:59 2008 From: weiny2 at llnl.gov (Ira Weiny) Date: Wed, 9 Jan 2008 14:30:59 -0800 Subject: [ofa-general] Re: [PATCH] Fix perfmgr enable in spec file. In-Reply-To: <20080109195619.GK20963@sashak.voltaire.com> References: <20080109103501.57d84fd1.weiny2@llnl.gov> <20080109195619.GK20963@sashak.voltaire.com> Message-ID: <20080109143059.07e2c944.weiny2@llnl.gov> On Wed, 9 Jan 2008 19:56:19 +0000 Sasha Khapyorsky wrote: > On 10:35 Wed 09 Jan , Ira Weiny wrote: > > From f419375b725d00b9b6bfe37e335fa1e0e4259174 Mon Sep 17 00:00:00 2001 > > From: Ira K. Weiny > > Date: Wed, 9 Jan 2008 10:31:48 -0800 > > Subject: [PATCH] Fix perfmgr enable in spec file. > > > > > > Signed-off-by: Ira K. Weiny > > Applied (for OFED-1.3 too). Thanks. > Thanks, I forgot to mention it should be for 1.3 as well. Ira From rajouri.jammu at gmail.com Wed Jan 9 14:34:38 2008 From: rajouri.jammu at gmail.com (Rajouri Jammu) Date: Wed, 9 Jan 2008 14:34:38 -0800 Subject: [ofa-general] retry exceeded problem with rdma_read Message-ID: <3307cdf90801091434q4298cab0sf8e670c21087afad@mail.gmail.com> Occasionally, I'm getting a retry exceeded error on the qp (error 12) when doing rdma_reads. Under what conditions would this kind of problem happen? I have the retry_count = 5 and am using rdma_cm for all the connection setup. OFED version is 1.2.5 -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From sashak at voltaire.com Wed Jan 9 17:19:50 2008 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Thu, 10 Jan 2008 01:19:50 +0000 Subject: [ofa-general] [PATCH] infiniband-diags/saquery: support for VLArb and PKey Table Records In-Reply-To: <20080109210523.GL20963@sashak.voltaire.com> References: <20080109032401.GA20963@sashak.voltaire.com> <20080109175902.GF20963@sashak.voltaire.com> <20080109193914.GG20963@sashak.voltaire.com> <20080109210523.GL20963@sashak.voltaire.com> Message-ID: <20080110011950.GM20963@sashak.voltaire.com> This adds support for VLArbTableRecord and PKeyTableRecord attributes. Signed-off-by: Sasha Khapyorsky --- infiniband-diags/src/saquery.c | 161 +++++++++++++++++++++++++++++++++++++++- 1 files changed, 158 insertions(+), 3 deletions(-) diff --git a/infiniband-diags/src/saquery.c b/infiniband-diags/src/saquery.c index d1cfe37..8c0aff8 100644 --- a/infiniband-diags/src/saquery.c +++ b/infiniband-diags/src/saquery.c @@ -579,9 +579,9 @@ static void dump_one_slvl_record(ib_slvl_table_record_t *slvl) { ib_slvl_table_t *t = &slvl->slvl_tbl; printf("SL2VLTableRecord dump:\n" - "\t\tLID....................%u\n" - "\t\tInPort...................%u\n" - "\t\tOutPort.....................%u\n" + "\t\tLID........................%u\n" + "\t\tInPort.....................%u\n" + "\t\tOutPort....................%u\n" "\t\tSL: 0| 1| 2| 3| 4| 5| 6| 7| 8| 9|10|11|12|13|14|15|\n" "\t\tVL:%2u|%2u|%2u|%2u|%2u|%2u|%2u|%2u|%2u|%2u|%2u|%2u|%2u" "|%2u|%2u|%2u|\n", @@ -596,6 +596,54 @@ static void dump_one_slvl_record(ib_slvl_table_record_t *slvl) ib_slvl_table_get(t, 14), ib_slvl_table_get(t, 15)); } +static void dump_one_vlarb_record(ib_vl_arb_table_record_t *vlarb) +{ + ib_vl_arb_element_t *e = vlarb->vl_arb_tbl.vl_entry; + int i; + printf("VLArbTableRecord dump:\n" + "\t\tLID........................%u\n" + "\t\tPort.......................%u\n" + "\t\tBlock......................%u\n", + cl_ntoh16(vlarb->lid), vlarb->port_num, vlarb->block_num); + for (i = 
0; i < 32 ; i += 16) { + printf("\t\tVL :%2u|%2u|%2u|%2u|%2u|%2u|%2u|%2u|" + "%2u|%2u|%2u|%2u|%2u|%2u|%2u|%2u|", + e[i + 0].vl, e[i + 1].vl, e[i + 2].vl, e[i + 3].vl, + e[i + 4].vl, e[i + 5].vl, e[i + 6].vl, e[i + 7].vl, + e[i + 8].vl, e[i + 9].vl, e[i + 10].vl, e[i + 11].vl, + e[i + 12].vl, e[i + 13].vl, e[i + 14].vl, e[i + 15].vl); + printf("\n\t\tWeight:%2u|%2u|%2u|%2u|%2u|%2u|%2u|%2u|" + "%2u|%2u|%2u|%2u|%2u|%2u|%2u|%2u|", + e[i + 0].weight, e[i + 1].weight, e[i + 2].weight, + e[i + 3].weight, e[i + 4].weight, e[i + 5].weight, + e[i + 6].weight, e[i + 7].weight, e[i + 8].weight, + e[i + 9].weight, e[i + 10].weight, e[i + 11].weight, + e[i + 12].weight, e[i + 13].weight, e[i + 14].weight, + e[i + 15].weight); + printf("\n"); + } +} + +static void dump_one_pkey_tbl_record(ib_pkey_table_record_t *pktr) +{ + ib_net16_t *p = pktr->pkey_tbl.pkey_entry; + int i; + printf("PKeyTableRecord dump:\n" + "\t\tLID........................%u\n" + "\t\tPort.......................%u\n" + "\t\tBlock......................%u\n" + "\t\tPKey Table:\n", + cl_ntoh16(pktr->lid), pktr->port_num, pktr->block_num); + for (i = 0; i < 32 ; i += 8) + printf("\t\t0x%04x 0x%04x 0x%04x 0x%04x" + " 0x%04x 0x%04x 0x%04x 0x%04x\n", + cl_ntoh16(p[i + 0]), cl_ntoh16(p[i + 1]), + cl_ntoh16(p[i + 2]), cl_ntoh16(p[i + 3]), + cl_ntoh16(p[i + 4]), cl_ntoh16(p[i + 5]), + cl_ntoh16(p[i + 6]), cl_ntoh16(p[i + 7])); + printf("\n"); +} + static void return_mad(void) { @@ -828,6 +876,59 @@ static ib_api_status_t get_slvl_records(osm_bind_handle_t bind_handle, ib_get_attr_offset(sizeof(ib_slvl_table_record_t)), 0); } +static ib_api_status_t get_vlarb_records(osm_bind_handle_t bind_handle, + int lid, int port, int block) +{ + ib_vl_arb_table_record_t vlarb; + ib_net64_t comp_mask = 0; + + memset(&vlarb, 0, sizeof(vlarb)); + + if (lid > 0) { + vlarb.lid = cl_hton16(lid); + comp_mask |= IB_VLA_COMPMASK_LID; + } + if (port >= 0) { + vlarb.port_num = port; + comp_mask |= IB_VLA_COMPMASK_OUT_PORT; + } + if (block >= 
0) { + vlarb.block_num = block; + comp_mask |= IB_VLA_COMPMASK_BLOCK; + } + + return get_any_records(bind_handle, IB_MAD_ATTR_VLARB_RECORD, 0, + comp_mask, &vlarb, + ib_get_attr_offset(sizeof(ib_vl_arb_table_record_t)), 0); +} + +static ib_api_status_t get_pkey_tbl_records(osm_bind_handle_t bind_handle, + int lid, int port, int block) +{ + ib_pkey_table_record_t pktr; + ib_net64_t comp_mask = 0; + + memset(&pktr, 0, sizeof(pktr)); + + if (lid > 0) { + pktr.lid = cl_hton16(lid); + comp_mask |= IB_PKEY_COMPMASK_LID; + } + if (port >= 0) { + pktr.port_num = port; + comp_mask |= IB_PKEY_COMPMASK_PORT; + } + if (block >= 0) { + pktr.block_num = block; + comp_mask |= IB_PKEY_COMPMASK_BLOCK; + } + + return get_any_records(bind_handle, IB_MAD_ATTR_PKEY_TBL_RECORD, 0, + comp_mask, &pktr, + ib_get_attr_offset(sizeof(pktr)), + OSM_DEFAULT_SM_KEY); +} + static ib_api_status_t print_node_records(osm_bind_handle_t bind_handle) { @@ -1188,6 +1289,56 @@ print_sl2vl_records(const struct query_cmd *q, osm_bind_handle_t bind_handle, return status; } +static int +print_vlarb_records(const struct query_cmd *q, osm_bind_handle_t bind_handle, + int argc, char *argv[]) +{ + int i; + ib_vl_arb_table_record_t *vlarb; + int lid = 0, port = -1, block = -1; + ib_api_status_t status; + + if (argc > 0) + parse_lid_and_ports(bind_handle, argv[0], + &lid, &port, &block); + + status = get_vlarb_records(bind_handle, lid, port, block); + if (status != IB_SUCCESS) + return status; + + for (i = 0; i < result.result_cnt; i++) { + vlarb = osmv_get_query_result(result.p_result_madw, i); + dump_one_vlarb_record(vlarb); + } + return_mad(); + return status; +} + +static int +print_pkey_tbl_records(const struct query_cmd *q, osm_bind_handle_t bind_handle, + int argc, char *argv[]) +{ + int i; + ib_pkey_table_record_t *pktr; + int lid = 0, port = -1, block = -1; + ib_api_status_t status; + + if (argc > 0) + parse_lid_and_ports(bind_handle, argv[0], + &lid, &port, &block); + + status = 
get_pkey_tbl_records(bind_handle, lid, port, block); + if (status != IB_SUCCESS) + return status; + + for (i = 0; i < result.result_cnt; i++) { + pktr = osmv_get_query_result(result.p_result_madw, i); + dump_one_pkey_tbl_record(pktr); + } + return_mad(); + return status; +} + static osm_bind_handle_t get_bind_handle(void) { @@ -1268,6 +1419,10 @@ static const struct query_cmd query_cmds[] = { { "PortInfoRecord", "PIR", IB_MAD_ATTR_PORTINFO_RECORD, }, { "SL2VLTableRecord", "SL2VL", IB_MAD_ATTR_SLVL_RECORD, print_sl2vl_records }, + { "PKeyTableRecord", "PKTR", IB_MAD_ATTR_PKEY_TBL_RECORD, + print_pkey_tbl_records }, + { "VLArbitrationTableRecord", "VLAR", IB_MAD_ATTR_VLARB_RECORD, + print_vlarb_records }, { "InformInfoRecord", "IIR", IB_MAD_ATTR_INFORM_INFO_RECORD, }, { "LinkRecord", "LR", IB_MAD_ATTR_LINK_RECORD, }, { "ServiceRecord", "SR", IB_MAD_ATTR_SERVICE_RECORD, }, -- 1.5.4.rc2.38.gd6da3 From kliteyn at mellanox.co.il Wed Jan 9 17:37:07 2008 From: kliteyn at mellanox.co.il (kliteyn at mellanox.co.il) Date: 10 Jan 2008 03:37:07 +0200 Subject: [ofa-general] nightly osm_sim report 2008-01-10:normal completion Message-ID: OSM Simulation Regression Summary [Generated mail - please do NOT reply] OpenSM binary date = 2008-01-09 OpenSM git rev = Wed_Jan_9_15:18:19_2008 [f6a173f94d746c077b8e27ae8ebdd51cbe75f94f] ibutils git rev = Mon_Dec_24_10:42:01_2007 [675bec82306d6920555dd0b5e2f664983e27e60f] Total=400 Pass=398 Fail=2 Pass: 30 Stability IS1-16.topo 30 Pkey IS1-16.topo 30 OsmTest IS1-16.topo 30 OsmStress IS1-16.topo 30 Multicast IS1-16.topo 30 LidMgr IS1-16.topo 10 Stability IS3-loop.topo 10 Stability IS3-128.topo 10 Pkey IS3-128.topo 10 OsmTest IS3-loop.topo 10 OsmTest IS3-128.topo 10 OsmStress IS3-128.topo 10 Multicast IS3-loop.topo 10 Multicast IS3-128.topo 10 FatTree merge-roots-4-ary-2-tree.topo 10 FatTree merge-root-4-ary-3-tree.topo 10 FatTree gnu-stallion-64.topo 10 FatTree blend-4-ary-2-tree.topo 10 FatTree RhinoDDR.topo 10 FatTree FullGnu.topo 10 FatTree 
4-ary-2-tree.topo 10 FatTree 2-ary-4-tree.topo 10 FatTree 12-node-spaced.topo 10 FTreeFail 4-ary-2-tree-missing-sw-link.topo 10 FTreeFail 4-ary-2-tree-links-at-same-rank-2.topo 10 FTreeFail 4-ary-2-tree-links-at-same-rank-1.topo 10 FTreeFail 4-ary-2-tree-diff-num-pgroups.topo 8 LidMgr IS3-128.topo Failures: 2 LidMgr IS3-128.topo From dotanb at dev.mellanox.co.il Wed Jan 9 22:16:02 2008 From: dotanb at dev.mellanox.co.il (Dotan Barak) Date: Thu, 10 Jan 2008 08:16:02 +0200 Subject: [ofa-general] retry exceeded problem with rdma_read In-Reply-To: <3307cdf90801091434q4298cab0sf8e670c21087afad@mail.gmail.com> References: <3307cdf90801091434q4298cab0sf8e670c21087afad@mail.gmail.com> Message-ID: <4785B822.80306@dev.mellanox.co.il> Rajouri Jammu wrote: > Occasionally, I'm getting a retry exceeded error on the qp (error 12) > when doing rdma_reads. > > Under what conditions would this kind of problem happen? > > I have the retry_count = 5 and am using rdma_cm for all the > connection setup. > > OFED version is 1.2.5 Does it happen between different HCAs? If it happens while the QPs are already in use (not on the first message), then check the following: this can happen if the QP attribute values max_rd_atomic and max_dest_rd_atomic are mismatched between the two sides. For sides A and B the values should satisfy: A.max_rd_atomic <= B.max_dest_rd_atomic A.max_dest_rd_atomic >= B.max_rd_atomic (that is, the number of outstanding RDMA Reads/atomics a side issues as initiator must not exceed what the other side supports as destination). You can check this by querying the QP in use and verifying those values. If it happens at the beginning of the connection, there may be another problem and I need more info. 
Dotan From keshetti85-student at yahoo.co.in Wed Jan 9 22:45:51 2008 From: keshetti85-student at yahoo.co.in (Keshetti Mahesh) Date: Thu, 10 Jan 2008 12:15:51 +0530 Subject: [ofa-general] ib_macro_model on OMNET++ Message-ID: <829ded920801092245j3c11c251n3711a0d23ac55a30@mail.gmail.com> Recently while browsing internet I came across "ib_macro_model" package on the OMNET++ web site (http://www.omnetpp.org) which is contributed by Mellanox Technologies Ltd. (http://www.omnetpp.org/filemgmt/singlefile.php?lid=133) Can any one on this list tell me what is it exactly and how to use it ? Thanks and regards, Mahesh From bart.vanassche at gmail.com Wed Jan 9 23:38:47 2008 From: bart.vanassche at gmail.com (Bart Van Assche) Date: Thu, 10 Jan 2008 08:38:47 +0100 Subject: [ofa-general] OFED and Ubuntu Linux In-Reply-To: References: Message-ID: On Jan 8, 2008 10:10 PM, Roland Dreier wrote: > > Is the OFED software stack supported on Ubuntu Linux ? While OFED > > 1.2.5.4 compiles fine on Ubuntu Linux 7.10, I got the following errors > > while installing the RPM's: > > Just out of curiosity, what packages from OFED are you interested in > using on Ubuntu? My goal would be to get most IB/RDMA-related stuff > into the upstream Debian/Ubuntu distributions directly, so that you > don't have to mess around with OFED at all. > > Currently, Ubuntu 7.10 has a 2.6.22 kernel, which has most IB support > built in, and the ubuntu archive has packages for libibverbs and > libmthca in universe. 8.04 (Hardy) will have a 2.6.24 kernel and adds > openmpi packages (built with libibverbs support). I have libmlx4 > packaged for hardy in my PPA: > > deb http://ppa.launchpad.net/roland.dreier/ubuntu hardy main > deb-src http://ppa.launchpad.net/roland.dreier/ubuntu hardy main > > (libmlx4 is in Debian testing so it should propagate automatically > into Ubuntu universe for Hardy+1). > > I am planning on packaging librdmacm for Debian and Ubuntu in the next > few weeks. 
The packages will appear in my PPA and should be ready in > plenty of time for Hardy+1. Are there any other packages you are > looking for? Hello Roland, I'm looking for iSCSI, iSER initiator, iSER target, SDP initiator, SDP target and uDAPL support. I'm not sure which are the corresponding OFED packages -- probably perftest, libibverbs, libibverbs-utils, libibverbs-devel, libmlx4, libmthca, librdmacm, librdmacm-devel and ofed-docs ? Bart Van Assche. From jackm at dev.mellanox.co.il Wed Jan 9 23:43:48 2008 From: jackm at dev.mellanox.co.il (Jack Morgenstein) Date: Thu, 10 Jan 2008 09:43:48 +0200 Subject: [ofa-general] Re: [PATCH] libmlx4: avoid memcpy in blueflame post_sends In-Reply-To: References: <200801091223.14155.jackm@dev.mellanox.co.il> Message-ID: <200801100943.48907.jackm@dev.mellanox.co.il> On Wednesday 09 January 2008 23:53, Roland Dreier wrote: > Isn't that just a crazy way of coming up with a new name for unsigned long? > > How about something like this (it seems to generate pretty good code > on x86-64 at least): > > diff --git a/src/qp.c b/src/qp.c > index 8b4adaa..5721860 100644 > --- a/src/qp.c > +++ b/src/qp.c > @@ -168,6 +168,20 @@ static void set_data_seg(struct mlx4_wqe_data_seg *dseg, struct ibv_sge *sg) > dseg->byte_count = htonl(sg->length); > } > > +/* > + * Avoid using memcpy() to copy to BlueFlame page, since memcpy() > + * implementations may use move-string-buffer assembler instructions, > + * which do not guarantee order of copying. > + */ > +static void mlx4_bf_copy(unsigned long *dst, unsigned long *src, unsigned bytecnt) > +{ > + while (bytecnt > 0) { > + *dst++ = *src++; > + *dst++ = *src++; > + bytecnt -= 2 * sizeof (long); > + } > +} > + Agreed regarding the unsigned long. However, your solution still results in a procedure call (mlx4_bf_copy is compiled as a procedure using gcc 4.1.0 on an X86_64 host, even if I add "inline"). 
I would prefer the patch below (which does generate inline code, and does the (sizeof(unsigned long) * 2) calculation just once). - Jack ======================= diff --git a/src/qp.c b/src/qp.c index bced740..8fc8450 100644 --- a/src/qp.c +++ b/src/qp.c @@ -391,7 +391,24 @@ out: pthread_spin_lock(&ctx->bf_lock); - memcpy(ctx->bf_page + ctx->bf_offset, ctrl, align(size * 16, 64)); + /* + * Avoid using memcpy to copy to BlueFlame page, since recent + * memcpy implementations use move-string-buffer assembler + * instructions, which do not guarantee order of copying. + */ + + { + unsigned long *target = + (unsigned long *) (ctx->bf_page + ctx->bf_offset); + unsigned long *src = (unsigned long *) ctrl; + int n = align(size * 16, 64) / (sizeof(unsigned long) * 2); + while (n > 0) { + *target++ = *src++; + *target++ = *src++; + --n; + } + } + wc_wmb(); ctx->bf_offset ^= ctx->bf_buf_size; From jackm at dev.mellanox.co.il Thu Jan 10 02:39:23 2008 From: jackm at dev.mellanox.co.il (Jack Morgenstein) Date: Thu, 10 Jan 2008 12:39:23 +0200 Subject: [ofa-general] [PATCH] libmlx4: revert commit which eliminated extra CQE. Message-ID: <200801101239.23455.jackm@dev.mellanox.co.il> commit 078f2170e768e707b8c71eae315f87e2a0c3ab12 Author: Jack Morgenstein Date: Tue Jan 8 09:53:35 2008 +0200 Revert "Don't add an extra entry to CQs" It turns out that this entry is needed for the resize-cq implementation. This reverts commit 216b90eac10cc8e11b9abaa710385986e26fbf85. Signed-off-by: Jack Morgenstein --- Roland, We need to return the extra CQE -- it is needed for a special CQE (opcode = 16h) which denotes that the resizing operation has completed. This CQE is placed in the old CQ buffer, and indicates that it is no longer used by the HCA. Sorry about that! 
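For readers following the index arithmetic, here is a rough sketch of the sizing and ownership logic that the extra entry implies. It is an illustration under assumptions, not libmlx4 code: all helper names are hypothetical, and it assumes the buffer size is rounded up to a power of two with ibv_cq.cqe holding that size minus one, as the masks used in the patch suggest.

```c
#include <assert.h>

/* Hypothetical helper: smallest power of two that is >= req. */
unsigned round_up_pow2(unsigned req)
{
	unsigned n = 1;

	while (n < req)
		n <<= 1;
	return n;
}

/*
 * Buffer entries for a CQ that must hold `requested` completions plus
 * one spare slot for the special resize-complete CQE.
 */
unsigned cq_buf_entries(unsigned requested)
{
	return round_up_pow2(requested + 1);
}

/* Slot for running index n; mask is (buffer entries - 1). */
unsigned cq_slot(unsigned n, unsigned mask)
{
	/* The mask only works if the buffer size is a power of two. */
	assert(((mask + 1) & mask) == 0);
	return n & mask;
}

/*
 * Software owns the entry when its owner bit matches the wrap parity
 * of the running index, mirroring the test in get_sw_cqe().
 */
int cqe_is_sw_owned(int owner_bit, unsigned n, unsigned mask)
{
	int wrap = !!(n & (mask + 1));

	return !(!!owner_bit ^ wrap);
}
```

With this layout a request for 4 entries yields an 8-entry buffer and a mask of 7, so the spare slot never takes away from what the caller asked for.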
- Jack diff --git a/src/cq.c b/src/cq.c index 06ae9e2..d9ebff1 100644 --- a/src/cq.c +++ b/src/cq.c @@ -114,10 +114,10 @@ static struct mlx4_cqe *get_cqe(struct mlx4_cq *cq, int entry) static void *get_sw_cqe(struct mlx4_cq *cq, int n) { - struct mlx4_cqe *cqe = get_cqe(cq, n & (cq->ibv_cq.cqe - 1)); + struct mlx4_cqe *cqe = get_cqe(cq, n & cq->ibv_cq.cqe); return (!!(cqe->owner_sr_opcode & MLX4_CQE_OWNER_MASK) ^ - !!(n & cq->ibv_cq.cqe)) ? NULL : cqe; + !!(n & (cq->ibv_cq.cqe + 1))) ? NULL : cqe; } static struct mlx4_cqe *next_cqe_sw(struct mlx4_cq *cq) @@ -398,7 +398,7 @@ void mlx4_cq_clean(struct mlx4_cq *cq, uint32_t qpn, struct mlx4_srq *srq) * from our QP and therefore don't need to be checked. */ for (prod_index = cq->cons_index; get_sw_cqe(cq, prod_index); ++prod_index) - if (prod_index == cq->cons_index + cq->ibv_cq.cqe - 1) + if (prod_index == cq->cons_index + cq->ibv_cq.cqe) break; /* @@ -406,13 +406,13 @@ void mlx4_cq_clean(struct mlx4_cq *cq, uint32_t qpn, struct mlx4_srq *srq) * that match our QP by copying older entries on top of them. 
*/ while ((int) --prod_index - (int) cq->cons_index >= 0) { - cqe = get_cqe(cq, prod_index & (cq->ibv_cq.cqe - 1)); + cqe = get_cqe(cq, prod_index & cq->ibv_cq.cqe); if ((ntohl(cqe->my_qpn) & 0xffffff) == qpn) { if (srq && !(cqe->owner_sr_opcode & MLX4_CQE_IS_SEND_MASK)) mlx4_free_srq_wqe(srq, ntohs(cqe->wqe_index)); ++nfreed; } else if (nfreed) { - dest = get_cqe(cq, (prod_index + nfreed) & (cq->ibv_cq.cqe - 1)); + dest = get_cqe(cq, (prod_index + nfreed) & cq->ibv_cq.cqe); owner_bit = dest->owner_sr_opcode & MLX4_CQE_OWNER_MASK; memcpy(dest, cqe, sizeof *cqe); dest->owner_sr_opcode = owner_bit | diff --git a/src/verbs.c b/src/verbs.c index 0bbab57..50e0947 100644 --- a/src/verbs.c +++ b/src/verbs.c @@ -182,11 +182,7 @@ struct ibv_cq *mlx4_create_cq(struct ibv_context *context, int cqe, if (pthread_spin_init(&cq->lock, PTHREAD_PROCESS_PRIVATE)) goto err; - cqe = align_queue_size(cqe); - - /* Always allocate at least two CQEs to keep things simple */ - if (cqe < 2) - cqe = 2; + cqe = align_queue_size(cqe + 1); if (mlx4_alloc_buf(&cq->buf, cqe * MLX4_CQ_ENTRY_SIZE, to_mdev(context->device)->page_size)) @@ -206,8 +202,6 @@ struct ibv_cq *mlx4_create_cq(struct ibv_context *context, int cqe, cmd.buf_addr = (uintptr_t) cq->buf.buf; cmd.db_addr = (uintptr_t) cq->set_ci_db; - /* Subtract 1 from the number of entries we pass into the - * kernel because the kernel mlx4_ib driver will add 1 again. 
*/ ret = ibv_cmd_create_cq(context, cqe - 1, channel, comp_vector, &cq->ibv_cq, &cmd.ibv_cmd, sizeof cmd, &resp.ibv_resp, sizeof resp); @@ -215,8 +209,6 @@ struct ibv_cq *mlx4_create_cq(struct ibv_context *context, int cqe, goto err_db; cq->cqn = resp.cqn; - /* Bump the number of entries to make up for subtracting 1 above */ - ++cq->ibv_cq.cqe; return &cq->ibv_cq; From vlad at lists.openfabrics.org Thu Jan 10 03:11:30 2008 From: vlad at lists.openfabrics.org (Vladimir Sokolovsky Mellanox) Date: Thu, 10 Jan 2008 03:11:30 -0800 (PST) Subject: [ofa-general] ofa_1_3_kernel 20080110-0200 daily build status Message-ID: <20080110111130.ABA7FE601A6@openfabrics.org> This email was generated automatically, please do not reply git_url: git://git.openfabrics.org/ofed_1_3/linux-2.6.git git_branch: ofed_kernel Common build parameters: --with-ipoib-mod --with-sdp-mod --with-srp-mod --with-user_mad-mod --with-user_access-mod --with-mthca-mod --with-mlx4-mod --with-core-mod --with-addr_trans-mod --with-rds-mod --with-cxgb3-mod --with-nes-mod Passed: Passed on i686 with 2.6.15-23-server Passed on i686 with linux-2.6.21.1 Passed on i686 with linux-2.6.22 Passed on i686 with linux-2.6.18 Passed on i686 with linux-2.6.17 Passed on i686 with linux-2.6.16 Passed on i686 with linux-2.6.13 Passed on i686 with linux-2.6.19 Passed on i686 with linux-2.6.14 Passed on i686 with linux-2.6.12 Passed on i686 with linux-2.6.15 Passed on x86_64 with linux-2.6.22.5-31-default Passed on x86_64 with linux-2.6.22 Passed on x86_64 with linux-2.6.21.1 Passed on x86_64 with linux-2.6.19 Passed on ia64 with linux-2.6.22 Passed on ia64 with linux-2.6.21.1 Passed on x86_64 with linux-2.6.16 Passed on ia64 with linux-2.6.23 Passed on x86_64 with linux-2.6.13 Passed on powerpc with linux-2.6.12 Passed on ppc64 with linux-2.6.16 Passed on powerpc with linux-2.6.13 Passed on x86_64 with linux-2.6.20 Passed on x86_64 with linux-2.6.17 Passed on ia64 with linux-2.6.17 Passed on ppc64 with linux-2.6.18 Passed on 
powerpc with linux-2.6.15 Passed on x86_64 with linux-2.6.16.43-0.3-smp Passed on x86_64 with linux-2.6.12 Passed on ia64 with linux-2.6.12 Passed on ppc64 with linux-2.6.17 Passed on ia64 with linux-2.6.18 Passed on ia64 with linux-2.6.13 Passed on ia64 with linux-2.6.19 Passed on x86_64 with linux-2.6.18 Passed on ppc64 with linux-2.6.12 Passed on ppc64 with linux-2.6.14 Passed on powerpc with linux-2.6.14 Passed on ppc64 with linux-2.6.15 Passed on x86_64 with linux-2.6.14 Passed on ppc64 with linux-2.6.19 Passed on x86_64 with linux-2.6.9-42.ELsmp Passed on ia64 with linux-2.6.14 Passed on ia64 with linux-2.6.16 Passed on x86_64 with linux-2.6.18-1.2798.fc6 Passed on ia64 with linux-2.6.15 Passed on x86_64 with linux-2.6.15 Passed on ppc64 with linux-2.6.13 Passed on x86_64 with linux-2.6.18-8.el5 Passed on x86_64 with linux-2.6.16.21-0.8-smp Passed on x86_64 with linux-2.6.9-55.ELsmp Passed on ia64 with linux-2.6.16.21-0.8-default Passed on ppc64 with linux-2.6.18-8.el5 Passed on x86_64 with linux-2.6.18-53.el5 Failed:
From prescott at hpc.ufl.edu Thu Jan 10 07:07:17 2008 From: prescott at hpc.ufl.edu (Craig Prescott) Date: Thu, 10 Jan 2008 10:07:17 -0500 Subject: [ofa-general] SDP and iWARP In-Reply-To: <4783C326.3070306@opengridcomputing.com> References: <4783A5B0.6040603@hpc.ufl.edu> <4783B3F5.20600@opengridcomputing.com> <4783BDD5.7000702@hpc.ufl.edu> <4783C326.3070306@opengridcomputing.com> Message-ID: <478634A5.3080204@hpc.ufl.edu> Steve Wise wrote: > > First make sure the sdp kernel module uses the rdma cma. Then I'd add > printk hooks in cma.c, addr.c, and iwcm.c to see what's going on and > where things are failing. Also a wire trace is good if we're getting > that far (like at least doing arp resolution). > Small update - a little progress.
printk's spinkled liberally and ib_sdp debug options turned on. The initial problem was on the listener during an IW_CM_EVENT_CONNECT_REQUEST event; the SDP hello header was rejected in sdp_cma.c:sdp_connect_handler() because its max_adverts field was zero, which is not permissible. In fact, all of the sdp_hh fields were zero. Comparing with the RDMA_TRANSPORT_IB case, I saw that cma.c:cma_connect_ib() does some work to create the SDP header via cma_format_hdr(). But cma_connect_iw() did not. I patched cma_connect_iw() to create the SDP header as cma_connect_ib() does. This gets us farther - examining the SDP header on the listener side looks right now, and the listener at least enters rdma_accept(), but iw_cm_accept() fails due to cm_id->device->iwcm->accept(cm_id, iw_param) returning -104. The above call also emits a couple of messages into the listener's syslog now : Jan 9 21:53:54 tebow2 kernel: iwch_ev_dispatch - CQE Err qpid 0x20 opcode 14 status 0x6 type 1 wrid.hi 0x0 wrid.lo 0x80000000 Jan 9 21:53:54 tebow2 kernel: post_qp_event - AE qpid 0x20 opcode 14 status 0x6 type 1 wrid.hi 0x0 wrid.lo 0x80000000 In the end, we still end up in rdma_reject(). Will keep digging. Cheers, Craig From swise at opengridcomputing.com Thu Jan 10 07:19:48 2008 From: swise at opengridcomputing.com (Steve Wise) Date: Thu, 10 Jan 2008 09:19:48 -0600 Subject: [ofa-general] SDP and iWARP In-Reply-To: <478634A5.3080204@hpc.ufl.edu> References: <4783A5B0.6040603@hpc.ufl.edu> <4783B3F5.20600@opengridcomputing.com> <4783BDD5.7000702@hpc.ufl.edu> <4783C326.3070306@opengridcomputing.com> <478634A5.3080204@hpc.ufl.edu> Message-ID: <47863794.9080709@opengridcomputing.com> Craig Prescott wrote: > Steve Wise wrote: >> >> First make sure the sdp kernel module uses the rdma cma. Then I'd >> add printk hooks in cma.c, addr.c, and iwcm.c to see what's going on >> and where things are failing. Also a wire trace is good if we're >> getting that far (like at least doing arp resolution). 
>> > > Small update - a little progress. printk's spinkled liberally and > ib_sdp debug options turned on. The initial problem was on the > listener during an IW_CM_EVENT_CONNECT_REQUEST event; the SDP hello > header was rejected in sdp_cma.c:sdp_connect_handler() because its > max_adverts field was zero, which is not permissible. In fact, all > of the sdp_hh fields were zero. > > Comparing with the RDMA_TRANSPORT_IB case, I saw that > cma.c:cma_connect_ib() does some work to create the SDP header > via cma_format_hdr(). But cma_connect_iw() did not. > Why is this SDP protocol stuff done in the CMA?? That's seems like a layer violation... > I patched cma_connect_iw() to create the SDP header as > cma_connect_ib() does. This gets us farther - examining the > SDP header on the listener side looks right now, and the > listener at least enters rdma_accept(), but iw_cm_accept() > fails due to cm_id->device->iwcm->accept(cm_id, iw_param) > returning -104. 104 == ECONNRESET, so the client side must have reset the connection. Did this happen after 10 seconds? (there's a 10 second MPA negiation timeout in the chelsio cm). Also, a wire trace might be useful. If this reset happens immediately, then you should look on the client side and see why it reset the connection. > The above call also emits a couple of messages > into the listener's syslog now : > > Jan 9 21:53:54 tebow2 kernel: iwch_ev_dispatch - CQE Err qpid 0x20 > opcode 14 status 0x6 type 1 wrid.hi 0x0 wrid.lo 0x80000000 > Jan 9 21:53:54 tebow2 kernel: post_qp_event - AE qpid 0x20 opcode 14 > status 0x6 type 1 wrid.hi 0x0 wrid.lo 0x80000000 > This is an async event generated due to a failure processing a SQ WR, I think. opcodes and status codes for iw_cxgb3 are in cxio_wr.h. type 1 means it was an egress (SQ) failure status 0x6 is a base/bounds violation, but 14 seems incorrect. That's not a valid T3 opcode. ???? > In the end, we still end up in rdma_reject(). Will keep digging. 
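The reading of the -104 return value above can be checked from user space. A minimal sketch, assuming a Linux host where kernel functions report failure as a negated errno value; `describe_kernel_rc` is a hypothetical helper and not part of any OFED code.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: kernel-style return codes are negated errno
 * values, so flip the sign before looking up the description.
 * Writes the text into buf and returns buf for convenience.
 */
static const char *describe_kernel_rc(int rc, char *buf, size_t len)
{
	if (rc >= 0)
		snprintf(buf, len, "success (%d)", rc);
	else
		snprintf(buf, len, "%d: %s", rc, strerror(-rc));
	return buf;
}
```

On Linux/x86, ECONNRESET is defined as 104, so passing -104 to this helper should describe a connection reset by the peer, which matches the RST seen from the other side.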
> > Cheers, > Craig From hrosenstock at xsigo.com Thu Jan 10 07:24:45 2008 From: hrosenstock at xsigo.com (Hal Rosenstock) Date: Thu, 10 Jan 2008 07:24:45 -0800 Subject: [ofa-general] Re: [PATCH] opensm/osm_sa_slvl_record: fix overflow crash In-Reply-To: <20080109194153.GH20963@sashak.voltaire.com> References: <20080109194153.GH20963@sashak.voltaire.com> Message-ID: <1199978685.3611.109.camel@hrosenstock-ws.xsigo.com> Sasha, On Wed, 2008-01-09 at 19:41 +0000, Sasha Khapyorsky wrote: > When SL2VLTableRecord is requested for switch by lid only (no in and out > ports are selected in compmask) it overflows its own physical ports > table. > > Signed-off-by: Sasha Khapyorsky > --- > opensm/opensm/osm_sa_slvl_record.c | 4 ++-- > 1 files changed, 2 insertions(+), 2 deletions(-) > > diff --git a/opensm/opensm/osm_sa_slvl_record.c b/opensm/opensm/osm_sa_slvl_record.c > index 28dddd4..cc21765 100644 > --- a/opensm/opensm/osm_sa_slvl_record.c > +++ b/opensm/opensm/osm_sa_slvl_record.c > @@ -149,9 +149,9 @@ __osm_sa_slvl_by_comp_mask(IN osm_sa_t * sa, > comp_mask = p_ctxt->comp_mask; > num_ports = osm_node_get_num_physp(p_port->p_node); > in_port_start = 0; > - in_port_end = num_ports; > + in_port_end = num_ports - 1; > out_port_start = 0; > - out_port_end = num_ports; > + out_port_end = num_ports - 1; > p_req_physp = p_ctxt->p_req_physp; > > if (p_port->p_node->node_info.node_type != IB_NODE_TYPE_SWITCH) { Minor comment: Rather than subtracting 1 from in/out_port_end, wouldn't changing the input and output port number comparisons to < rather than <= work (and be more consistent with other SA record handling) ? -- Hal From slava at auto.ru Thu Jan 10 07:38:26 2008 From: slava at auto.ru (Viatcheslav E. Kouznetsov) Date: Thu, 10 Jan 2008 18:38:26 +0300 Subject: [ofa-general] build & install error Message-ID: <200801101838.26820.slava@auto.ru> Hi All! 
I have a some trouble with building & installing OFED software If I try to build OFED-1.2.5.4, i get next error ---- make -f scripts/Makefile.build obj=/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2.5.4/net/rds gcc -Wp,-MD,/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2.5.4/net/rds/.af_rds.o.d -nostdinc -isystem /usr/lib/gcc/x86_64-redhat-linux/4.1.2/include -D__KERNEL__ \ -I/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2.5.4/kernel_addons/backport/2.6.18_RH_5.1 /include/ \ -I/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2.5.4/include \ -I/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2.5.4/drivers/infiniband/include \ -Iinclude \ \ -include include/linux/autoconf.h \ -include /var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2.5.4/include/linux/autoconf.h \ -Wall -Wundef -Wstrict-prototypes -Wno-trigraphs -fno-strict-aliasing -fno-common -Wstrict-prototypes -Wundef -Werror-implicit-function-declaration -Os -mtune=generic -m64 -mno-red-zone -mcmodel=kernel -pipe -fno-reorder-blocks -Wno-sign-compare -fno-asynchronous-unwind-tables -funit-at-a-time -mno-sse -mno-mmx -mno-sse2 -mno-3dnow -fomit-frame-pointer -g -fno-stack-protector -Wdeclaration-after-statement -Wno-pointer-sign -I/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2.5.4/include -I/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2.5.4/drivers/infiniband/include -I/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2.5.4/drivers/infiniband/ulp/ipoib -I/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2.5.4/drivers/infiniband/debug -I/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2.5.4/drivers/infiniband/hw/cxgb3/core -I/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2.5.4/drivers/net/cxgb3 -I/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2.5.4/net/rds -I/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2.5.4/drivers/net/mlx4 -I/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2.5.4/drivers/infiniband/hw/mlx4 -DMODULE -D"KBUILD_STR(s)=#s" -D"KBUILD_BASENAME=KBUILD_STR(af_rds)" -D"KBUILD_MODNAME=KBUILD_STR(rds)" -c -o /var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2.5.4/net/rds/.tmp_af_rds.o /var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2.5.4/net/rds/af_rds.c In file included from 
/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2.5.4/net/rds/af_rds.c:39: /var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2.5.4/net/rds/rds.h:167: error: expected specifier-qualifier-list before '__sum16' /var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2.5.4/net/rds/rds.h: In function 'rds_message_make_checksum': /var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2.5.4/net/rds/rds.h:480: error: 'struct rds_header' has no member named 'h_csum' /var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2.5.4/net/rds/rds.h:481: error: 'struct rds_header' has no member named 'h_csum' /var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2.5.4/net/rds/rds.h: In function 'rds_message_verify_checksum': /var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2.5.4/net/rds/rds.h:486: error: 'const struct rds_header' has no member named 'h_c sum' make[3]: *** [/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2.5.4/net/rds/af_rds.o] Error 1 make[2]: *** [/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2.5.4/net/rds] Error 2 make[1]: *** [_module_/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2.5.4] Error 2 make[1]: Leaving directory `/usr/src/redhat/BUILD/kernel-2.6.18/linux-2.6.18.x86_64' make: *** [kernel] Error 2 When I build OFED-1.2.5.4-20080107-0713, building OK, but when I run modprobe command, I get next error ---- [root at blade02 ~]# modprobe ib_ipoib WARNING: Error inserting ib_core (/lib/modules/2.6.18-53.el5/updates/kernel/drivers/infiniband/core/ib_core.ko): Invalid module format WARNING: Error inserting ib_mad (/lib/modules/2.6.18-53.el5/updates/kernel/drivers/infiniband/core/ib_mad.ko): Invalid module format WARNING: Error inserting ib_sa (/lib/modules/2.6.18-53.el5/updates/kernel/drivers/infiniband/core/ib_sa.ko): Invalid module format WARNING: Error inserting ib_cm (/lib/modules/2.6.18-53.el5/updates/kernel/drivers/infiniband/core/ib_cm.ko): Invalid module format FATAL: Error inserting ib_ipoib (/lib/modules/2.6.18-53.el5/updates/kernel/drivers/infiniband/ulp/ipoib/ib_ipoib.ko): Invalid module format OS - CentOS 5.1 Kernel 2.6.18-53.el5 Hardware Supermicro SuperBlade with 
http://supermicro.com/products/superblade/module/SBI-7125B-T1.cfm modules. lspci -vv: InfiniBand: Mellanox Technologies MT25208 InfiniHost III Ex (rev 20) Subsystem: Mellanox Technologies MT25208 InfiniHost III Ex From erezz at Voltaire.COM Thu Jan 10 07:43:31 2008 From: erezz at Voltaire.COM (Erez Zilber) Date: Thu, 10 Jan 2008 17:43:31 +0200 Subject: [ofa-general] OFED and Ubuntu Linux In-Reply-To: References: Message-ID: <47863D23.3060801@Voltaire.COM> Bart Van Assche wrote: > > On Jan 8, 2008 10:10 PM, Roland Dreier wrote: > > > Is the OFED software stack supported on Ubuntu Linux ? While OFED > > > 1.2.5.4 compiles fine on Ubuntu Linux 7.10, I got the following > errors > > > while installing the RPM's: > > > > Just out of curiosity, what packages from OFED are you interested in > > using on Ubuntu? My goal would be to get most IB/RDMA-related stuff > > into the upstream Debian/Ubuntu distributions directly, so that you > > don't have to mess around with OFED at all. > > > > Currently, Ubuntu 7.10 has a 2.6.22 kernel, which has most IB support > > built in, and the ubuntu archive has packages for libibverbs and > > libmthca in universe. 8.04 (Hardy) will have a 2.6.24 kernel and adds > > openmpi packages (built with libibverbs support). I have libmlx4 > > packaged for hardy in my PPA: > > > > deb http://ppa.launchpad.net/roland.dreier/ubuntu hardy main > > deb-src http://ppa.launchpad.net/roland.dreier/ubuntu hardy main > > > > (libmlx4 is in Debian testing so it should propagate automatically > > into Ubuntu universe for Hardy+1). > > > > I am planning on packaging librdmacm for Debian and Ubuntu in the next > > few weeks. The packages will appear in my PPA and should be ready in > > plenty of time for Hardy+1. Are there any other packages you are > > looking for? > > Hello Roland, > > I'm looking for iSCSI, iSER initiator, iSER target, SDP initiator, SDP > target and uDAPL support. 
I'm not sure which are the corresponding > OFED packages -- probably perftest, libibverbs, libibverbs-utils, > libibverbs-devel, libmlx4, libmthca, librdmacm, librdmacm-devel and > ofed-docs ? > I hope that I can help you with the iSCSI over iSER part. I don't have an Ubuntu machine, but I checked the status on Debian (lenny/sid): * OFED contains open-iscsi (iSCSI initiator) with iSER support. However, it doesn't run on Debian/Ubuntu. * Debian has an open-iscsi package that contains the userspace code. The kernel code is in Debian's kernel (iSER was pushed into the kernel in 2.6.18, so it should be there). In order to run open-iscsi over iSER, you don't need any IB userspace packages. Therefore, you don't need OFED for the initiator side. * OFED 1.3 doesn't contain an iSCSI over iSER target. We plan to add that in OFED 1.4. I still don't know if it will have support for Debian/Ubuntu. The target is called stgt. * If you still want to run an iSCSI over iSER target, I suggest that you follow the instructions in a short howto that we wrote: https://wiki.openfabrics.org/tiki-index.php?page=ISER-target I hope it helps. Erez From eli at mellanox.co.il Thu Jan 10 08:01:39 2008 From: eli at mellanox.co.il (Eli Cohen) Date: Thu, 10 Jan 2008 18:01:39 +0200 Subject: [ofa-general] [PATCH] ib/core: Add creation flags to create QP Message-ID: <1199980899.11174.91.camel@mtls03> Add creation flags to create QP This allows the verbs consumer to pass flags to the low level drivers so they can act accordingly. Signed-off-by: Eli Cohen --- In the case of LSO for example, the mlx4 layer needs to know that so it can allocate space in the QP send queue buffer to hold the headers. 
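As a rough illustration of how a low-level driver might act on the new field, the sketch below is a user-space mock rather than kernel code. The enum and the `create_flags` member mirror the patch; `struct mock_qp_init_attr`, `mock_sq_header_space` and the 64-byte figure are invented for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Mirrors the enum proposed in the patch. */
enum qp_create_flags {
	QP_CREATE_LSO = 1 << 0,
};

/* Cut-down, hypothetical stand-in for struct ib_qp_init_attr. */
struct mock_qp_init_attr {
	unsigned char port_num;
	enum qp_create_flags create_flags;
};

/* Placeholder value only; a real driver would derive this from the hardware. */
#define MOCK_LSO_HEADER_BYTES 64

/*
 * Hypothetical driver-side check: reserve extra room per send WQE only
 * when the consumer asked for LSO at QP creation time.
 */
static size_t mock_sq_header_space(const struct mock_qp_init_attr *attr)
{
	return (attr->create_flags & QP_CREATE_LSO) ? MOCK_LSO_HEADER_BYTES : 0;
}
```

A consumer that leaves create_flags at zero would see no extra space reserved, so existing callers would presumably be unaffected.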
include/rdma/ib_verbs.h | 5 +++++ 1 files changed, 5 insertions(+), 0 deletions(-) diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h index 11f3960..6d766d9 100644 --- a/include/rdma/ib_verbs.h +++ b/include/rdma/ib_verbs.h @@ -486,6 +486,10 @@ enum ib_qp_type { IB_QPT_RAW_ETY }; +enum qp_create_flags { + QP_CREATE_LSO = 1 << 0, +}; + struct ib_qp_init_attr { void (*event_handler)(struct ib_event *, void *); void *qp_context; @@ -496,6 +500,7 @@ struct ib_qp_init_attr { enum ib_sig_type sq_sig_type; enum ib_qp_type qp_type; u8 port_num; /* special QP types only */ + enum qp_create_flags create_flags; }; enum ib_rnr_timeout { -- 1.5.3.8 From prescott at hpc.ufl.edu Thu Jan 10 09:47:54 2008 From: prescott at hpc.ufl.edu (Craig Prescott) Date: Thu, 10 Jan 2008 12:47:54 -0500 Subject: [ofa-general] SDP and iWARP In-Reply-To: <47863794.9080709@opengridcomputing.com> References: <4783A5B0.6040603@hpc.ufl.edu> <4783B3F5.20600@opengridcomputing.com> <4783BDD5.7000702@hpc.ufl.edu> <4783C326.3070306@opengridcomputing.com> <478634A5.3080204@hpc.ufl.edu> <47863794.9080709@opengridcomputing.com> Message-ID: <47865A4A.4070603@hpc.ufl.edu> Steve Wise wrote: > > Craig Prescott wrote: >> >> I patched cma_connect_iw() to create the SDP header as >> cma_connect_ib() does. This gets us farther - examining the >> SDP header on the listener side looks right now, and the >> listener at least enters rdma_accept(), but iw_cm_accept() >> fails due to cm_id->device->iwcm->accept(cm_id, iw_param) >> returning -104. > 104 == ECONNRESET, so the client side must have reset the connection. > Did this happen after 10 seconds? (there's a 10 second MPA negiation > timeout in the chelsio cm). Also, a wire trace might be useful. If > this reset happens immediately, then you should look on the client side > and see why it reset the connection. The reset happens after 10 seconds. 
Here is tcpdump output from the netperf client host (tebow1): 12:00:17.156120 arp who-has tebow2.hpc.ufl.edu tell tebow1.hpc.ufl.edu 12:00:17.156178 arp reply tebow2.hpc.ufl.edu is-at 00:07:43:05:11:8a (oui Unknown) 12:00:27.180401 IP tebow1.hpc.ufl.edu.41353 > tebow2.hpc.ufl.edu.12865: S 697245480:697245480(0) win 17920 12:00:30.180571 IP tebow1.hpc.ufl.edu.41353 > tebow2.hpc.ufl.edu.12865: S 697245480:697245480(0) win 17920 12:00:30.180616 IP tebow2.hpc.ufl.edu.12865 > tebow1.hpc.ufl.edu.41353: S 1878582380:1878582380(0) ack 697245481 win 65535 12:00:30.180630 IP tebow1.hpc.ufl.edu.41353 > tebow2.hpc.ufl.edu.12865: . ack 1 win 35 12:00:30.255717 IP tebow1.hpc.ufl.edu.41353 > tebow2.hpc.ufl.edu.12865: P 1:257(256) ack 1 win 35 12:00:30.255753 IP tebow2.hpc.ufl.edu.12865 > tebow1.hpc.ufl.edu.41353: . ack 257 win 32736 12:00:30.255763 IP tebow2.hpc.ufl.edu.12865 > tebow1.hpc.ufl.edu.41353: R 1:1(0) ack 257 win 0 On the netserver host (tebow2), we see only the initial arp. >> The above call also emits a couple of messages >> into the listener's syslog now : >> >> Jan 9 21:53:54 tebow2 kernel: iwch_ev_dispatch - CQE Err qpid 0x20 >> opcode 14 status 0x6 type 1 wrid.hi 0x0 wrid.lo 0x80000000 >> Jan 9 21:53:54 tebow2 kernel: post_qp_event - AE qpid 0x20 opcode 14 >> status 0x6 type 1 wrid.hi 0x0 wrid.lo 0x80000000 >> > This is an async event generated due to a failure processing a SQ WR, I > think. opcodes and status codes for iw_cxgb3 are in cxio_wr.h. > type 1 means it was an egress (SQ) failure > status 0x6 is a base/bounds violation, > but 14 seems incorrect. That's not a valid T3 opcode. ???? > Ok, thanks! I guess I'm not sure what to make of that yet, though. 
Thanks, Craig From sean.hefty at intel.com Thu Jan 10 09:50:37 2008 From: sean.hefty at intel.com (Sean Hefty) Date: Thu, 10 Jan 2008 09:50:37 -0800 Subject: [ofa-general] SDP and iWARP In-Reply-To: <47863794.9080709@opengridcomputing.com> References: <4783A5B0.6040603@hpc.ufl.edu><4783B3F5.20600@opengridcomputing.com><4783BDD5.7000702@hpc.ufl.edu><4783C326.3070306@opengridcomputing.com><478634A5.3080204@hpc.ufl.edu> <47863794.9080709@opengridcomputing.com> Message-ID: <000001c853b1$4efc6620$ff0da8c0@amr.corp.intel.com> >Why is this SDP protocol stuff done in the CMA?? That's seems like a >layer violation... The alternative is to have SDP duplicate much of the rdma_cm functionality. - Sean From rdreier at cisco.com Thu Jan 10 09:54:47 2008 From: rdreier at cisco.com (Roland Dreier) Date: Thu, 10 Jan 2008 09:54:47 -0800 Subject: [ofa-general] Re: [PATCH] libmlx4: revert commit which eliminated extra CQE. In-Reply-To: <200801101239.23455.jackm@dev.mellanox.co.il> (Jack Morgenstein's message of "Thu, 10 Jan 2008 12:39:23 +0200") References: <200801101239.23455.jackm@dev.mellanox.co.il> Message-ID: thank, I reverted the commit.
From swise at opengridcomputing.com Thu Jan 10 10:05:15 2008 From: swise at opengridcomputing.com (Steve Wise) Date: Thu, 10 Jan 2008 12:05:15 -0600 Subject: [ofa-general] SDP and iWARP In-Reply-To: <47865A4A.4070603@hpc.ufl.edu> References: <4783A5B0.6040603@hpc.ufl.edu> <4783B3F5.20600@opengridcomputing.com> <4783BDD5.7000702@hpc.ufl.edu> <4783C326.3070306@opengridcomputing.com> <478634A5.3080204@hpc.ufl.edu> <47863794.9080709@opengridcomputing.com> <47865A4A.4070603@hpc.ufl.edu> Message-ID: <47865E5B.4030607@opengridcomputing.com> Craig Prescott wrote: > Steve Wise wrote: >> >> Craig Prescott wrote: >>> >>> I patched cma_connect_iw() to create the SDP header as >>> cma_connect_ib() does. This gets us farther - examining the >>> SDP header on the listener side looks right now, and the >>> listener at least enters rdma_accept(), but iw_cm_accept() >>> fails due to cm_id->device->iwcm->accept(cm_id, iw_param) >>> returning -104. >> 104 == ECONNRESET, so the client side must have reset the connection. >> Did this happen after 10 seconds? (there's a 10 second MPA negiation >> timeout in the chelsio cm). Also, a wire trace might be useful. If >> this reset happens immediately, then you should look on the client >> side and see why it reset the connection. > > The reset happens after 10 seconds. > > Here is tcpdump output from the netperf client host (tebow1): > > 12:00:17.156120 arp who-has tebow2.hpc.ufl.edu tell tebow1.hpc.ufl.edu > 12:00:17.156178 arp reply tebow2.hpc.ufl.edu is-at 00:07:43:05:11:8a > (oui Unknown) > 12:00:27.180401 IP tebow1.hpc.ufl.edu.41353 > tebow2.hpc.ufl.edu.12865: > S 697245480:697245480(0) win 17920 > 12:00:30.180571 IP tebow1.hpc.ufl.edu.41353 > tebow2.hpc.ufl.edu.12865: > S 697245480:697245480(0) win 17920 > 12:00:30.180616 IP tebow2.hpc.ufl.edu.12865 > tebow1.hpc.ufl.edu.41353: > S 1878582380:1878582380(0) ack 697245481 win 65535 > 12:00:30.180630 IP tebow1.hpc.ufl.edu.41353 > tebow2.hpc.ufl.edu.12865: > .
ack 1 win 35 > 12:00:30.255717 IP tebow1.hpc.ufl.edu.41353 > tebow2.hpc.ufl.edu.12865: > P 1:257(256) ack 1 win 35 The above packet is the mpa-start with the SDP hello as private data, I think. > 12:00:30.255753 IP tebow2.hpc.ufl.edu.12865 > tebow1.hpc.ufl.edu.41353: > . ack 257 win 32736 > 12:00:30.255763 IP tebow2.hpc.ufl.edu.12865 > tebow1.hpc.ufl.edu.41353: > R 1:1(0) ack 257 win 0 And then nothing happens from the listening side, so the mpa-start reply never comes out. > > On the netserver host (tebow2), we see only the initial arp. > >>> The above call also emits a couple of messages >>> into the listener's syslog now : >>> >>> Jan 9 21:53:54 tebow2 kernel: iwch_ev_dispatch - CQE Err qpid 0x20 >>> opcode 14 status 0x6 type 1 wrid.hi 0x0 wrid.lo 0x80000000 >>> Jan 9 21:53:54 tebow2 kernel: post_qp_event - AE qpid 0x20 opcode 14 >>> status 0x6 type 1 wrid.hi 0x0 wrid.lo 0x80000000 >>> >> This is an async event generated due to a failure processing a SQ WR, >> I think. opcodes and status codes for iw_cxgb3 are in cxio_wr.h. >> type 1 means it was an egress (SQ) failure >> status 0x6 is a base/bounds violation, >> but 14 seems incorrect. That's not a valid T3 opcode. ???? >> > > Ok, thanks! I guess I'm not sure what to make of that yet, though. > See where in iwch_accept_cr() the failure is happening. It doesn't look like send_mpa_reply() is being called. From rdreier at cisco.com Thu Jan 10 10:08:38 2008 From: rdreier at cisco.com (Roland Dreier) Date: Thu, 10 Jan 2008 10:08:38 -0800 Subject: [ofa-general] Re: [PATCH] libmlx4: avoid memcpy in blueflame post_sends In-Reply-To: <200801100943.48907.jackm@dev.mellanox.co.il> (Jack Morgenstein's message of "Thu, 10 Jan 2008 09:43:48 +0200") References: <200801091223.14155.jackm@dev.mellanox.co.il> <200801100943.48907.jackm@dev.mellanox.co.il> Message-ID: > However, your solution still results in a procedure call (mlx4_bf_copy > is compiled as a procedure using gcc 4.1.0 on an X86_64 host, even if > I add "inline"). 

Can you give more detail on the platform and how you compiled? I can't
reproduce it with gcc 4.1.3 here. Are you compiling with optimization
enabled? Are other things like set_atomic_seg() getting inlined
properly?

> I would prefer the patch below (which does generate inline code, and does the
> (sizeof(unsigned long) * 2) calculation just once).

Dividing by 2 * sizeof (long) seems to generate slightly worse code for
me. Since sizeof (long) is a compile time constant, in my version the
compiler just generates a sub $10, while in your version there is a sub
$1 instead (which costs the same) plus an extra right shift at the
beginning of the loop.

 - R.

From rajouri.jammu at gmail.com Thu Jan 10 11:21:40 2008
From: rajouri.jammu at gmail.com (Rajouri Jammu)
Date: Thu, 10 Jan 2008 11:21:40 -0800
Subject: [ofa-general] retry exceeded problem with rdma_read
In-Reply-To: <4785B822.80306@dev.mellanox.co.il>
References: <3307cdf90801091434q4298cab0sf8e670c21087afad@mail.gmail.com> <4785B822.80306@dev.mellanox.co.il>
Message-ID: <3307cdf90801101121u39f4fa29l5673ea4c3ffe790c@mail.gmail.com>

I have the following set both on rdma_connect as well as at rdma_accept.

conn_param.responder_resources = 4;
conn_param.initiator_depth = 4;

Should initiator_depth be lower for better behavior?

On Jan 9, 2008 10:16 PM, Dotan Barak wrote:

> Rajouri Jammu wrote:
> > Occasionally, I'm getting a retry exceeded error on the qp (error 12)
> > when doing rdma_reads.
> >
> > Under what conditions would this kind of problem happen?
> >
> > I have the retry_count = 5 and 'am using rdma_cm for all the
> > connection setup.
> >
> > OFED version is 1.2.5
> Does it happen between different HCAs?
>
> If this happens during working with the QPs (not in the first message)
> then check the following thing:
>
> If the QP attribute values of max_rd_atomic and max_dest_rd_atomic
> do not match, this may happen.
>
> The values should be (for sides A and B):
> A.max_rd_atomic <= B.max_dest_rd_atomic
> A.max_dest_rd_atomic >= B.max_rd_atomic
>
> (which means that RDMA Reads/atomic as initiator shouldn't be larger
> than the supported value as the destination)
>
> You can check it by querying the QP in use and verifying those values.
>
>
>
> If it happens at the beginning of the connection, there may be another
> problem and I need more info ....
>
> Dotan
>

From mdidomenico at gmail.com Thu Jan 10 12:16:39 2008
From: mdidomenico at gmail.com (Michael Di Domenico)
Date: Thu, 10 Jan 2008 15:16:39 -0500
Subject: [ofa-general] ofed against lustre kernel
Message-ID: <97a7c7ed0801101216p16ad7f55padac2b7200c3a4be@mail.gmail.com>

Is there a trick to getting OFED 1.2.5.1 to compile against the lustre
kernel on RedHat 5 x86_64?

The first time I tried I got the below error, which looks like a problem
with the lustre kernel source tree. I'm trying to work through it, but
I have a feeling I'm starting to wander down a rabbit hole...

Does anyone know if there is a step guide for installing redhat 5, ofed
1.2.5.1, and lustre 1.6.4.1?

gcc -Wp,-MD,/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2.5.1/net/rds/.ib_sysctl.o.d -nostdinc -isystem /usr/lib/gcc-I/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2.5.1/kernel_addons/backport/2.6.18_FC6/include/ \
-I/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2.5.1/include \
-I/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2.5.1/drivers/infiniband/include \
-Iinclude \
-Iinclude2 -I/usr/src/linux-2.6.18-8.1.14.el5_lustre.1.6.4.1/include \
-include include/linux/autoconf.h \
-include /var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2.5.1/include/linux/autoconf.h \
-I/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2.5.1/net/rds -Wall -Wundef -Wstrict-prototypes -Wno-trigraphs -fno-st
ld -m elf_x86_64 -r -o /var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2.5.1/net/rds/rds.o /var/tmp/OFEDRPM/BUILD/ofa_ke
Building modules, stage 2.
make -rR -f /usr/src/linux-2.6.18-8.1.14.el5_lustre.1.6.4.1/scripts/Makefile.modpost
/usr/src/linux-2.6.18-8.1.14.el5_lustre.1.6.4.1/scripts/Makefile.modpost:38: include/config/auto.conf: No such
make[4]: *** No rule to make target `include/config/auto.conf'. Stop.
make[3]: *** [modules] Error 2
make[2]: *** [modules] Error 2
make[1]: *** [modules] Error 2
make[1]: Leaving directory `/usr/src/linux-2.6.18-8.1.14.el5_lustre.1.6.4.1-obj/x86_64/smp'
make: *** [kernel] Error 2
error: Bad exit status from /var/tmp/rpm-tmp.65645 (%install)

From kliteyn at mellanox.co.il Thu Jan 10 17:20:50 2008
From: kliteyn at mellanox.co.il (kliteyn at mellanox.co.il)
Date: 11 Jan 2008 03:20:50 +0200
Subject: [ofa-general] nightly osm_sim report 2008-01-11:normal completion
Message-ID:

OSM Simulation Regression Summary

[Generated mail - please do NOT reply]

OpenSM binary date = 2008-01-10
OpenSM git rev = Thu_Jan_10_03:48:16_2008 [7bb2045bd9f659f8466a4494f4ec983f0edbf96a]
ibutils git rev = Mon_Dec_24_10:42:01_2007 [675bec82306d6920555dd0b5e2f664983e27e60f]

Total=400 Pass=399 Fail=1

Pass:
30 Stability IS1-16.topo
30 Pkey IS1-16.topo
30 OsmTest IS1-16.topo
30 OsmStress IS1-16.topo
30 Multicast IS1-16.topo
30 LidMgr IS1-16.topo
10 Stability IS3-loop.topo
10 Stability IS3-128.topo
10 Pkey IS3-128.topo
10 OsmTest IS3-loop.topo
10 OsmTest IS3-128.topo
10 OsmStress IS3-128.topo
10 Multicast IS3-loop.topo
10 Multicast IS3-128.topo
10 FatTree merge-roots-4-ary-2-tree.topo
10 FatTree merge-root-4-ary-3-tree.topo
10 FatTree gnu-stallion-64.topo
10 FatTree blend-4-ary-2-tree.topo
10 FatTree RhinoDDR.topo
10 FatTree FullGnu.topo
10 FatTree 4-ary-2-tree.topo
10 FatTree 2-ary-4-tree.topo
10 FatTree 12-node-spaced.topo
10 FTreeFail 4-ary-2-tree-missing-sw-link.topo
10 FTreeFail 4-ary-2-tree-links-at-same-rank-2.topo
10 FTreeFail 4-ary-2-tree-links-at-same-rank-1.topo
10 FTreeFail 4-ary-2-tree-diff-num-pgroups.topo
9 LidMgr IS3-128.topo

Failures:
1 LidMgr IS3-128.topo

From davem at davemloft.net Thu Jan 10 20:56:03 2008
From: davem at davemloft.net (David Miller)
Date: Thu, 10 Jan 2008 20:56:03 -0800 (PST)
Subject: [ofa-general] Re: [RFC PATCH] IPoIB: improve IPv4/IPv6 to IB mcast mapping functions
In-Reply-To:
References: <20071210203544.GI30090@obsidianresearch.com> <20071210203841.GJ30090@obsidianresearch.com>
Message-ID: <20080110.205603.78484957.davem@davemloft.net>

From: Roland Dreier
Date: Fri, 04 Jan 2008 14:05:29 -0800

> Any objection to merging the following for 2.6.25?

The ipv4/ipv6 bits look fine to me.

From bart.vanassche at gmail.com Thu Jan 10 23:53:13 2008
From: bart.vanassche at gmail.com (Bart Van Assche)
Date: Fri, 11 Jan 2008 08:53:13 +0100
Subject: [ofa-general] OFED and Ubuntu Linux
In-Reply-To: <47863D23.3060801@Voltaire.COM>
References: <47863D23.3060801@Voltaire.COM>
Message-ID:

On Jan 10, 2008 4:43 PM, Erez Zilber wrote:
> Bart Van Assche wrote:
>
> > I'm looking for iSCSI, iSER initiator, iSER target, SDP initiator, SDP
> > target and uDAPL support. I'm not sure which are the corresponding
> > OFED packages -- probably perftest, libibverbs, libibverbs-utils,
> > libibverbs-devel, libmlx4, libmthca, librdmacm, librdmacm-devel and
> > ofed-docs ?
>
> I hope that I can help you with the iSCSI over iSER part. I don't have
> an Ubuntu machine, but I checked the status on Debian (lenny/sid):

Thanks for all the help -- I already managed to set up an iSCSI
initiator and target. I'm struggling with SDP however. I'll post a
separate mail about SDP.

Bart.

From makc at sgi.com Fri Jan 11 03:07:39 2008
From: makc at sgi.com (Max Matveev)
Date: Fri, 11 Jan 2008 22:07:39 +1100
Subject: [ofa-general] opensm dumps core when using LASH for routing
Message-ID: <18311.19963.735177.83038@kuku.melbourne.sgi.com>

I've got opensm 3.0.3 from OFED 1.2 dying on startup when using LASH
for routing. Here is the trace:

#0 0x0000000000459abf in get_lash_id (p_sw=0x5fbf40)
    at osm_ucast_lash.c:1124
#1 0x000000000045a704 in osm_get_lash_sl (p_osm=0x7fffa0aba4d0,
    p_src_port=0x2aab06971400, p_dst_port=0x61ade0)
    at osm_ucast_lash.c:1450
#2 0x000000000042e80d in __osm_pr_rcv_get_path_parms(p_rcv=0x7fffa0abba80,
    p_pr=0x2aab0661add0, p_src_port=0x2aab06971400, p_dest_port=0x61ade0,
    dest_lid_ho=2, comp_mask=580964351930793984, p_parms=0x649eef20)
    at osm_sa_path_record.c:685
#3 0x000000000042f02b in __osm_pr_rcv_get_lid_pair_path (
    p_rcv=0x7fffa0abba80, p_pr=0x2aab0661add0, p_src_port=0x2aab06971400,
    p_dest_port=0x61ade0, p_dgid=0x649ef0a0, src_lid_ho=1, dest_lid_ho=2,
    comp_mask=580964351930793984, preference=0 '\0')
    at osm_sa_path_record.c:852
#4 0x000000000042f5d6 in __osm_pr_rcv_get_port_pair_paths (
    p_rcv=0x7fffa0abba80, p_madw=0x6ecbb0, p_req_port=0x2aab06971400,
    p_src_port=0x2aab06971400, p_dest_port=0x61ade0, p_dgid=0x649ef0a0,
    comp_mask=580964351930793984, p_list=0x649ef0b0)
    at osm_sa_path_record.c:1072
#5 0x000000000042fdc5 in __osm_pr_rcv_process_half(p_rcv=0x7fffa0abba80,
    p_madw=0x6ecbb0, requester_port=0x2aab06971400,
    p_src_port=0x2aab06971400, p_dest_port=0x0, p_dgid=0x649ef0a0,
    comp_mask=580964351930793984, p_list=0x649ef0b0)
    at osm_sa_path_record.c:1437
#6 0x0000000000430c6f in osm_pr_rcv_process (context=0x7fffa0abba80,
    data=0x6ecbb0) at osm_sa_path_record.c:2003
#7 0x00002b110a54ef57 in __cl_disp_worker (context=0x7fffa0abcb30)
    at cl_dispatcher.c:102
#8 0x00002b110a5563b7 in __cl_thread_pool_routine(context=0x7fffa0abcba8)
    at cl_threadpool.c:74
#9 0x00002b110a55620a in __cl_thread_wrapper (arg=0x5a4a40)
    at cl_thread.c:58
#10 0x00002b110a21b143 in start_thread () from /lib64/libpthread.so.0
#11 0x00002b110a82774d in clone () from /lib64/libc.so.6
#12 0x0000000000000000 in ?? ()

This is the switch:

(gdb) p *( osm_switch_t *)0x5fbf40
$3 = {map_item = {pool_item = {list_item = {p_next = 0x2aab065b4ed0,
        p_prev = 0x2aab069850f0}}, p_left = 0x2aab069850f0,
    p_right = 0x2aab065b4ed0, p_up = 0x2aab06587cd0,
    color = CL_MAP_BLACK, key = 17582052945261297672},
  p_node = 0x608c80, switch_info = {
    lin_cap = 192, rand_cap = 0, mcast_cap = 4, lin_top = 32769,
    def_port = 0 '\0', def_mcast_pri_port = 0 '\0',
    def_mcast_not_port = 0 '\0', life_state = 144 '\220',
    lids_per_port = 0, enforce_cap = 8192, flags = 240 ''},
  max_lid_ho = 0, num_ports = 25 '\031', num_hops = 0, hops = 0x0,
  p_prof = 0x5fbff0, fwd_tbl = {p_rnd_tbl = 0x0, p_lin_tbl = 0x7dd0f0},
  mcast_tbl = {
    num_ports = 25 '\031', max_position = 1 '\001', max_block = 31,
    max_block_in_use = -1, num_entries = 1024, max_mlid_ho = 50176,
    p_mask_tbl = 0x7e9100}, discovery_count = 3, priv = 0x0}

As you can see, the priv pointer is NULL; get_lash_id() follows it and
dies.

There is an obvious fix - simply check for priv in osm_get_lash_sl()
and return OSM_DEFAULT_SL: it already does it when checking for src_id
but not for dst_id. But I'm not sure it's the right fix, because I
cannot quite understand how priv got to be NULL - it was reset in
lash_cleanup(), but I don't see any threads which are inside
discover_network_properties(), and I would've thought that when opensm
gets out of there, all switches must be initialized properly.

I'm also not sure who initiated the PATH_RECORD query - it does not look
like opensm would do it to itself, yet the requester_port was on the
same HCA. It could be another process running on the same host for all
I know.
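
The guard described above could be sketched roughly as follows. This is
only an illustration under stated assumptions, not OpenSM code: struct
sw, struct lash_priv, lash_id_or_neg(), get_sl() and the DEFAULT_SL
value are made-up stand-ins for osm_switch_t, its priv data,
get_lash_id(), osm_get_lash_sl() and OSM_DEFAULT_SL, and the flat
sl_table layout is assumed purely for the example.

```c
#include <stddef.h>

/* Stand-in for OSM_DEFAULT_SL; the value 0 is an assumption. */
#define DEFAULT_SL 0

/* Hypothetical per-switch routing state (not the real LASH struct). */
struct lash_priv {
    int id;
};

/* Hypothetical switch object: priv stays NULL until routing sets it. */
struct sw {
    struct lash_priv *priv;
};

/* Return the routing id, or -1 when the switch has no routing state. */
static int lash_id_or_neg(const struct sw *s)
{
    if (s == NULL || s->priv == NULL)
        return -1;
    return s->priv->id;
}

/*
 * Look up a service level in an n x n table. Both endpoints are
 * checked, so a missing priv on either side yields the default
 * instead of a NULL dereference.
 */
static int get_sl(const struct sw *src, const struct sw *dst,
                  const int *sl_table, int n)
{
    int src_id = lash_id_or_neg(src);
    int dst_id = lash_id_or_neg(dst);

    if (src_id < 0 || dst_id < 0 || src_id >= n || dst_id >= n)
        return DEFAULT_SL;
    return sl_table[src_id * n + dst_id];
}
```

Checking source and destination the same way only hides the crash; it
does not explain why priv was NULL in the first place, which is the open
question here.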

max

From vlad at lists.openfabrics.org Fri Jan 11 03:11:31 2008
From: vlad at lists.openfabrics.org (Vladimir Sokolovsky Mellanox)
Date: Fri, 11 Jan 2008 03:11:31 -0800 (PST)
Subject: [ofa-general] ofa_1_3_kernel 20080111-0200 daily build status
Message-ID: <20080111111131.18478E60086@openfabrics.org>

This email was generated automatically, please do not reply

git_url: git://git.openfabrics.org/ofed_1_3/linux-2.6.git
git_branch: ofed_kernel

Common build parameters: --with-ipoib-mod --with-sdp-mod --with-srp-mod --with-user_mad-mod --with-user_access-mod --with-mthca-mod --with-mlx4-mod --with-core-mod --with-addr_trans-mod --with-rds-mod --with-cxgb3-mod --with-nes-mod

Passed:
Passed on i686 with 2.6.15-23-server
Passed on i686 with linux-2.6.21.1
Passed on i686 with linux-2.6.22
Passed on i686 with linux-2.6.18
Passed on i686 with linux-2.6.19
Passed on i686 with linux-2.6.17
Passed on i686 with linux-2.6.16
Passed on i686 with linux-2.6.13
Passed on i686 with linux-2.6.15
Passed on i686 with linux-2.6.12
Passed on i686 with linux-2.6.14
Passed on x86_64 with linux-2.6.22
Passed on x86_64 with linux-2.6.22.5-31-default
Passed on ia64 with linux-2.6.22
Passed on x86_64 with linux-2.6.16
Passed on ia64 with linux-2.6.19
Passed on ia64 with linux-2.6.18
Passed on x86_64 with linux-2.6.20
Passed on powerpc with linux-2.6.15
Passed on ppc64 with linux-2.6.15
Passed on powerpc with linux-2.6.12
Passed on powerpc with linux-2.6.13
Passed on ia64 with linux-2.6.23
Passed on ia64 with linux-2.6.14
Passed on x86_64 with linux-2.6.18-53.el5
Passed on x86_64 with linux-2.6.17
Passed on x86_64 with linux-2.6.16.43-0.3-smp
Passed on ia64 with linux-2.6.13
Passed on ia64 with linux-2.6.12
Passed on x86_64 with linux-2.6.21.1
Passed on ppc64 with linux-2.6.18
Passed on ppc64 with linux-2.6.16
Passed on ppc64 with linux-2.6.12
Passed on x86_64 with linux-2.6.9-42.ELsmp
Passed on x86_64 with linux-2.6.12
Passed on ia64 with linux-2.6.17
Passed on ppc64 with linux-2.6.19
Passed on x86_64 with linux-2.6.19
Passed on ppc64 with linux-2.6.13
Passed on x86_64 with linux-2.6.18
Passed on ppc64 with linux-2.6.18-8.el5
Passed on ppc64 with linux-2.6.17
Passed on ppc64 with linux-2.6.14
Passed on powerpc with linux-2.6.14
Passed on x86_64 with linux-2.6.15
Passed on x86_64 with linux-2.6.13
Passed on ia64 with linux-2.6.15
Passed on x86_64 with linux-2.6.14
Passed on ia64 with linux-2.6.16
Passed on x86_64 with linux-2.6.9-55.ELsmp
Passed on ia64 with linux-2.6.16.21-0.8-default
Passed on x86_64 with linux-2.6.16.21-0.8-smp
Passed on x86_64 with linux-2.6.18-8.el5
Passed on ia64 with linux-2.6.21.1
Passed on x86_64 with linux-2.6.18-1.2798.fc6

Failed:

From prescott at hpc.ufl.edu Fri Jan 11 08:03:58 2008
From: prescott at hpc.ufl.edu (Craig Prescott)
Date: Fri, 11 Jan 2008 11:03:58 -0500
Subject: [ofa-general] SDP and iWARP
In-Reply-To: <47865E5B.4030607@opengridcomputing.com>
References: <4783A5B0.6040603@hpc.ufl.edu> <4783B3F5.20600@opengridcomputing.com> <4783BDD5.7000702@hpc.ufl.edu> <4783C326.3070306@opengridcomputing.com> <478634A5.3080204@hpc.ufl.edu> <47863794.9080709@opengridcomputing.com> <47865A4A.4070603@hpc.ufl.edu> <47865E5B.4030607@opengridcomputing.com>
Message-ID: <4787936E.5010603@hpc.ufl.edu>

Steve Wise wrote:
> Craig Prescott wrote:
>> Steve Wise wrote:
>>>
>>> Craig Prescott wrote:
>>>>
>>>> The above call also emits a couple of messages
>>>> into the listener's syslog now :
>>>>
>>>> Jan 9 21:53:54 tebow2 kernel: iwch_ev_dispatch - CQE Err qpid 0x20
>>>> opcode 14 status 0x6 type 1 wrid.hi 0x0 wrid.lo 0x80000000
>>>> Jan 9 21:53:54 tebow2 kernel: post_qp_event - AE qpid 0x20 opcode
>>>> 14 status 0x6 type 1 wrid.hi 0x0 wrid.lo 0x80000000
>>>>
>>> This is an async event generated due to a failure processing a SQ WR,
>>> I think. opcodes and status codes for iw_cxgb3 are in cxio_wr.h.
>>> type 1 means it was an egress (SQ) failure
>>> status 0x6 is a base/bounds violation,
>>> but 14 seems incorrect. That's not a valid T3 opcode. ????
>>>
>>
>> Ok, thanks! I guess I'm not sure what to make of that yet, though.
>>
>
> See where in iwch_accept_cr() the failure is happening. It doesn't look
> like send_mpa_reply() is being called.
>

The ECONNRESET is coming from here in iwch_accept_cr():

...
	/* wait for wr_ack */
	wait_event(ep->com.waitq, ep->com.rpl_done);
	err = ep->com.rpl_err;
...

Is that what you thought was happening?

Thanks,
Craig

From swise at opengridcomputing.com Fri Jan 11 08:21:18 2008
From: swise at opengridcomputing.com (Steve Wise)
Date: Fri, 11 Jan 2008 10:21:18 -0600
Subject: [ofa-general] SDP and iWARP
In-Reply-To: <4787936E.5010603@hpc.ufl.edu>
References: <4783A5B0.6040603@hpc.ufl.edu> <4783B3F5.20600@opengridcomputing.com> <4783BDD5.7000702@hpc.ufl.edu> <4783C326.3070306@opengridcomputing.com> <478634A5.3080204@hpc.ufl.edu> <47863794.9080709@opengridcomputing.com> <47865A4A.4070603@hpc.ufl.edu> <47865E5B.4030607@opengridcomputing.com> <4787936E.5010603@hpc.ufl.edu>
Message-ID: <4787977E.509@opengridcomputing.com>

Craig Prescott wrote:
> Steve Wise wrote:
>> Craig Prescott wrote:
>>> Steve Wise wrote:
>>>>
>>>> Craig Prescott wrote:
>>>>>
>>>>> The above call also emits a couple of messages
>>>>> into the listener's syslog now :
>>>>>
>>>>> Jan 9 21:53:54 tebow2 kernel: iwch_ev_dispatch - CQE Err qpid 0x20
>>>>> opcode 14 status 0x6 type 1 wrid.hi 0x0 wrid.lo 0x80000000
>>>>> Jan 9 21:53:54 tebow2 kernel: post_qp_event - AE qpid 0x20 opcode
>>>>> 14 status 0x6 type 1 wrid.hi 0x0 wrid.lo 0x80000000
>>>>>
>>>> This is an async event generated due to a failure processing a SQ
>>>> WR, I think. opcodes and status codes for iw_cxgb3 are in cxio_wr.h.
>>>> type 1 means it was an egress (SQ) failure
>>>> status 0x6 is a base/bounds violation,
>>>> but 14 seems incorrect. That's not a valid T3 opcode. ????
>>>>
>>>
>>> Ok, thanks! I guess I'm not sure what to make of that yet, though.
>>>
>>
>> See where in iwch_accept_cr() the failure is happening. It doesn't
>> look like send_mpa_reply() is being called.
>>
>
> The ECONNRESET is coming from here in iwch_accept_cr():
>
> ...
> /* wait for wr_ack */
> wait_event(ep->com.waitq, ep->com.rpl_done);
> err = ep->com.rpl_err;
> ...
>
> Is that what you thought was happening?

I don't know exactly what is going on!
But the code above means that the firmware never successfully sent the
last streaming message (the mpa-start reply) and never transitioned the
connection into rdma mode. And the async error might indicate that some
WR was posted prior to doing the rdma_accept() and that WR had problems.

A few questions:

What firmware are you running? ethtool -i will tell you.

What ofed version exactly?

Does sdp post a SQ or RQ WR prior to doing the rdma_accept()? Can you
dump that work request? Maybe in iwch_post_send and iwch_post_recv,
dump the work request after it is built and before the code rings the
doorbell. You can dump it as 8B flits, and be sure to put the flits in
host byte order. See cxio_dump_wqe() in cxio_dbg.c...

Steve.

From davem at systemfabricworks.com Fri Jan 11 08:28:15 2008
From: davem at systemfabricworks.com (davem at systemfabricworks.com)
Date: Fri, 11 Jan 2008 10:28:15 -0600
Subject: [ofa-general] [PATCH] drivers/infiniband/ulp/srpt: Fix target data corruption
Message-ID: <4787991F.mailCSZ16MSG5@systemfabricworks.com>

Change the local buffer allocator to use a spin-lock protected linked
list instead of an array of atomic_t used/free variables. The atomic_t
code was open to a multi-thread race between test and set. This has
been observed with the result that the same data buffer was used for
more than one SCSI operation, either writing the wrong data to the disk
or sending the wrong data to the initiator.

Signed-off-by: Robert Pearson
Signed-off-by: David A. McMillen
---
 ib_srpt.c | 113 +++++++++++++++++++++++++++++++++++++++---------------------
 1 files changed, 73 insertions(+), 40 deletions(-)

diff --git a/ib_srpt.c b/ib_srpt.c
index 1fb9d56..44cb98a 100644
--- a/ib_srpt.c
+++ b/ib_srpt.c
@@ -56,10 +56,14 @@ MODULE_LICENSE("Dual BSD/GPL");

 struct mem_elem {
 	struct page *page;
+	struct mem_elem *next;
+	int ndx;
 	u32 len;
-	atomic_t used;
 };

+static spinlock_t mempool_lock;
+static struct mem_elem *mempool_freelist;
+
 struct srpt_thread {
 	spinlock_t thread_lock;
 	struct list_head thread_ioctx_list;
@@ -70,7 +74,7 @@ static u64 mellanox_ioc_guid = 0;
 static struct list_head srpt_devices;
 static int mem_size = 32768;
 static int mem_elements = 4096;
-static atomic_t mem_avail;
+static int mem_avail;
 static int cur_pos = 1;
 static struct mem_elem *srpt_mempool = NULL;
 static int thread = 1;
@@ -100,26 +104,22 @@ static struct ib_client srpt_client = {

 static struct mem_elem *srpt_get_mem_elem(int *index)
 {
-	int i, end_pos;
+	struct mem_elem *elem;

 	*index = 0;
-	end_pos = mem_elements;
-	for (i = cur_pos; i <= end_pos; ++i) {
-		if (i == mem_elements) {
-			end_pos = cur_pos;
-			i = 1;
-		}
-		if (atomic_read(&srpt_mempool[i].used) == 0) {
-			atomic_inc(&srpt_mempool[i].used);
-			smp_mb__after_atomic_inc();
-			*index = i;
-			cur_pos = i + 1;
-			atomic_dec(&mem_avail);
-			return &srpt_mempool[i];
-		}
+	spin_lock(&mempool_lock);
+	elem = mempool_freelist;
+	if (elem) {
+		mempool_freelist = elem->next;
+		mem_avail--;
+		*index = elem->ndx;
+	} else {
+		*index = 0;
 	}

-	return NULL;
+	spin_unlock(&mempool_lock);
+
+	return elem;
 }

 static int srpt_free_mem_elem(int index)
@@ -127,17 +127,42 @@ static int srpt_free_mem_elem(int index)
 	if (!index || index >= mem_elements)
 		return -EINVAL;

-	atomic_dec(&srpt_mempool[index].used);
-	smp_mb__after_atomic_dec();
-	atomic_inc(&mem_avail);
+	spin_lock(&mempool_lock);
+	srpt_mempool[index].next = mempool_freelist;
+	mempool_freelist = &srpt_mempool[index];
+	mem_avail++;
+	spin_unlock(&mempool_lock);
return 0; } static int srpt_mempool_create(void) { - int i, order; + int i, order, array_size; - order = get_order(mem_elements * sizeof(struct mem_elem)); + if (mem_elements <= 0) + return -ENOMEM; + array_size = mem_elements * sizeof(struct mem_elem); + while (array_size & (array_size - 1)) + array_size += array_size & ~(array_size - 1); + /* array_size now first power of 2 >= actual array size */ + + if (mem_size < PAGE_SIZE) { + printk(KERN_ERR PFX "mem_size parameter changed from" + " %d to %lu (PAGE_SIZE)\n", mem_size, PAGE_SIZE); + mem_size = PAGE_SIZE; + } + i = mem_size; + while (i & (i - 1)) + i += i & (i - 1); + if (i != mem_size) { + printk(KERN_ERR PFX "mem_size parameter rounded up from" + " %d to %d\n", mem_size, i); + mem_size = i; + } + /* mem_size now also a power of 2 >= PAGE_SIZE */ + + spin_lock_init(&mempool_lock); + order = get_order(array_size); srpt_mempool = (struct mem_elem *) __get_free_pages(GFP_KERNEL, order); if (!srpt_mempool) return -ENOMEM; @@ -151,32 +176,39 @@ static int srpt_mempool_create(void) mem_size, i); goto free_mem; } - atomic_set(&srpt_mempool[i].used, 0); + srpt_mempool[i].ndx = i; + srpt_mempool[i].next = mempool_freelist; + mempool_freelist = &srpt_mempool[i]; } - atomic_set(&mem_avail, mem_elements); + mem_avail = mem_elements; return 0; free_mem: while (i > 1) __free_pages(srpt_mempool[--i].page, order); - free_pages((unsigned long) srpt_mempool, - get_order(mem_elements * sizeof(struct mem_elem))); + free_pages((unsigned long) srpt_mempool, get_order(array_size)); return -ENOMEM; } static void srpt_mempool_destroy(void) { - int i, order; + int i, order, array_size; + + if (srpt_mempool == NULL) return; + + array_size = mem_elements * sizeof(struct mem_elem); + while (array_size & (array_size - 1)) + array_size += array_size & ~(array_size - 1); + /* array_size now first power of 2 >= actual array size */ order = get_order(mem_size); for (i = 1; i < mem_elements; ++i) __free_pages(srpt_mempool[i].page, order); - 
free_pages((unsigned long) srpt_mempool, - get_order(mem_elements * sizeof(struct mem_elem))); + free_pages((unsigned long) srpt_mempool, get_order(array_size)); } static void srpt_event_handler(struct ib_event_handler *handler, @@ -450,6 +482,7 @@ static int srpt_refresh_port(struct srpt_port *sport) goto err_query_port; if (!sport->mad_agent) { + memset(&reg_req, 0, sizeof reg_req); reg_req.mgmt_class = IB_MGMT_CLASS_DEVICE_MGMT; reg_req.mgmt_class_version = IB_MGMT_BASE_VERSION; set_bit(IB_MGMT_METHOD_GET, reg_req.method_mask); @@ -900,7 +933,7 @@ static void srpt_handle_new_iu(struct srpt_rdma_ch *ch, dir = SCST_DATA_WRITE; else dir = SCST_DATA_NONE; - } else + } else dir = SCST_DATA_NONE; scmnd = scst_rx_cmd(ch->scst_sess, (u8 *) & srp_cmd->lun, @@ -939,7 +972,7 @@ static void srpt_handle_new_iu(struct srpt_rdma_ch *ch, /* * if we have buffer in mem pool and * IO is > mem_size - we will allocate the memory buffer - * for data xfer to avoid big scatterlist with 4KB each + * for data xfer to avoid big scatterlist with 4KB each * allocated by scst memory module. */ if (mem_elements && (ioctx->data_len >= mem_size)) @@ -1682,7 +1715,7 @@ static int srpt_map_sg_to_ib_sge(struct srpt_rdma_ch *ch, dma_len = sg_dma_len(&scat[0]); riu = ioctx->rdma_ius; - /* + /* * For each remote desc - calculate the #ib_sge. * If #ib_sge < SRPT_DEF_SG_TABLESIZE per rdma operation then * each remote desc rdma_iu is required a rdma wr; @@ -2116,8 +2149,8 @@ static int srpt_alloc_data_buf(struct scst_cmd *scmnd) tsize = scst_cmd_get_bufflen(scmnd); nrdma = tsize / mem_size + ioctx->n_rbuf; - if (nrdma > atomic_read(&mem_avail)) { - printk(KERN_ERR PFX "!!ALERT mem_avail= %d\n", atomic_read(&mem_avail)); + if (nrdma > mem_avail) { + printk(KERN_ERR PFX "!!ALERT mem_avail= %d\n", mem_avail); goto out; } @@ -2161,7 +2194,7 @@ static int srpt_alloc_data_buf(struct scst_cmd *scmnd) release_mem: printk(KERN_WARNING PFX "ALERT! 
alloc_mem mem_avail= %d n_mem_elem=%d\n", - atomic_read(&mem_avail), i); + mem_avail, i); riu = ioctx->rdma_ius; while (i) @@ -2295,7 +2328,7 @@ static CLASS_DEVICE_ATTR(login_info, S_IRUGO, show_login_info, NULL); static ssize_t show_mem_info(struct class_device *class_dev, char *buf) { return sprintf(buf, "mem_avail= %d mem_elements= %d mem_size= %d\n", - atomic_read(&mem_avail), mem_elements, mem_size); + mem_avail, mem_elements, mem_size); } static CLASS_DEVICE_ATTR(mem_info, S_IRUGO, show_mem_info, NULL); @@ -2362,10 +2395,10 @@ static void srpt_add_one(struct ib_device *device) (unsigned long long) mellanox_ioc_guid, (unsigned long long) mellanox_ioc_guid); - /* + /* * We do not have a consistent service_id (ie. also id_ext of target_id) - * to identify this target. We currently use the guid of the first HCA - * in the system as service_id; therefore, the target_id will change + * to identify this target. We currently use the guid of the first HCA + * in the system as service_id; therefore, the target_id will change * if this HCA is gone bad and replaced by different HCA */ if (ib_cm_listen(sdev->cm_id, cpu_to_be64(mellanox_ioc_guid), 0, NULL)) From prescott at hpc.ufl.edu Fri Jan 11 08:40:13 2008 From: prescott at hpc.ufl.edu (Craig Prescott) Date: Fri, 11 Jan 2008 11:40:13 -0500 Subject: [ofa-general] SDP and iWARP In-Reply-To: <4787977E.509@opengridcomputing.com> References: <4783A5B0.6040603@hpc.ufl.edu> <4783B3F5.20600@opengridcomputing.com> <4783BDD5.7000702@hpc.ufl.edu> <4783C326.3070306@opengridcomputing.com> <478634A5.3080204@hpc.ufl.edu> <47863794.9080709@opengridcomputing.com> <47865A4A.4070603@hpc.ufl.edu> <47865E5B.4030607@opengridcomputing.com> <4787936E.5010603@hpc.ufl.edu> <4787977E.509@opengridcomputing.com> Message-ID: <47879BED.6060104@hpc.ufl.edu> Steve Wise wrote: > Craig Prescott wrote: >> Steve Wise wrote: >>> Craig Prescott wrote: >>>> Steve Wise wrote: >>>>> >>>>> Craig Prescott wrote: >>>>>> >>>>>> The above call also emits a couple 
of messages >>>>>> into the listener's syslog now : >>>>>> >>>>>> Jan 9 21:53:54 tebow2 kernel: iwch_ev_dispatch - CQE Err qpid >>>>>> 0x20 opcode 14 status 0x6 type 1 wrid.hi 0x0 wrid.lo 0x80000000 >>>>>> Jan 9 21:53:54 tebow2 kernel: post_qp_event - AE qpid 0x20 opcode >>>>>> 14 status 0x6 type 1 wrid.hi 0x0 wrid.lo 0x80000000 >>>>>> >>>>> This is an async event generated due to a failure processing a SQ >>>>> WR, I think. opcodes and status codes for iw_cxgb3 are in cxio_wr.h. >>>>> type 1 means it was an egress (SQ) failure >>>>> status 0x6 is a base/bounds violation, >>>>> but 14 seems incorrect. That's not a valid T3 opcode. ???? >>>>> >>>> >>>> Ok, thanks! I guess I'm not sure what to make of that yet, though. >>>> >>> >>> See where in iwch_accept_cr() the failure is happening. It doesn't >>> look like send_mpa_reply() is being called. >>> >> >> The ECONNRESET is coming from here in iwch_accept_cr(): >> >> ... >> /* wait for wr_ack */ >> wait_event(ep->com.waitq, ep->com.rpl_done); >> err = ep->com.rpl_err; >> ... >> >> Is that what you thought was happening? > > I don't know exactly what is going on! But the code above means that > the firmware never successfully sent the last streaming message (the > mpa-start reply) and never transitioned the connection into rdma mode. > And the async error might indicate that some WR was posted prior to > doing the rdma_accept() and that WR had problems. > > a few questions: > > What firmware are you running? ethtool -i will tell you. [root at tebow2 ~]# ethtool -i eth4 driver: cxgb3 version: 1.0-ko firmware-version: T 5.0.0 TP 1.1.0 bus-info: 0000:86:00.0 > What ofed version exactly? I am still using the same code as at the top of the thread, an OFED 1.3 daily from Monday morning: OFED-1.3-20080107-0942 > Does sdp post a SQ or RQ WR prior to doing the rdma_accept()? Can you > dump that work request? 
Maybe in iwch_post_send and iwch_post_recv, > dump the work request after it is built and before the code rings the > doorbell. You can dump it as 8B flits, and be sure an put the flits in > host byte order. See cxio_dump_wqe() in cxio_dbg.c... > Ok, I'll do this. I really appreciate your help and advice! Thanks, Craig From rdreier at cisco.com Fri Jan 11 11:05:18 2008 From: rdreier at cisco.com (Roland Dreier) Date: Fri, 11 Jan 2008 11:05:18 -0800 Subject: [ofa-general] [PATCH] libmlx4: Fix the value of the pkey_index in the completion In-Reply-To: <47831C71.9040008@dev.mellanox.co.il> (Dotan Barak's message of "Tue, 08 Jan 2008 08:47:13 +0200") References: <200801061801.25386.dotanb@dev.mellanox.co.il> <47831C71.9040008@dev.mellanox.co.il> Message-ID: I guess we might as well get the code right anyway... I applied this. From rdreier at cisco.com Fri Jan 11 11:09:27 2008 From: rdreier at cisco.com (Roland Dreier) Date: Fri, 11 Jan 2008 11:09:27 -0800 Subject: [ofa-general] [PATCH] Use round_jiffies() in ehca timer In-Reply-To: <20071015054907.GE3257@kryten> (Anton Blanchard's message of "Mon, 15 Oct 2007 00:49:07 -0500") References: <20071015054907.GE3257@kryten> Message-ID: ehca guys -- this looks sane to me, and I've had it sitting in my inbox for a while. Any objection to merging it for 2.6.25? - R. > Use round_jiffies() to align the 1 second timer with other timers and > potentially save power by sleeping cores for longer. 
> > Signed-off-by: Anton Blanchard > --- > > diff --git a/drivers/infiniband/hw/ehca/ehca_main.c b/drivers/infiniband/hw/ehca/ehca_main.c > index 403467f..23000b7 100644 > --- a/drivers/infiniband/hw/ehca/ehca_main.c > +++ b/drivers/infiniband/hw/ehca/ehca_main.c > @@ -902,7 +902,7 @@ void ehca_poll_eqs(unsigned long data) > ehca_process_eq(shca, 0); > } > } > - mod_timer(&poll_eqs_timer, jiffies + HZ); > + mod_timer(&poll_eqs_timer, round_jiffies(jiffies + HZ)); > spin_unlock(&shca_list_lock); > } > From rdreier at cisco.com Fri Jan 11 11:40:53 2008 From: rdreier at cisco.com (Roland Dreier) Date: Fri, 11 Jan 2008 11:40:53 -0800 Subject: [ofa-general] Possible ipath bugs detected by compiler/sparse warnings Message-ID: I've been looking at warnings coming from running sparse on drivers/infiniband, and I've spotted a few things in ipath that look like real bugs, but I don't know enough about the HW to know how to fix things properly. It all looks like error path stuff that wouldn't get exercised very often. Here's what I've seen: ipath_iba6110.c:840: pci_write_config_byte(pdev, link_off, linkctrl & (0xf << 8)); The low byte of linkctrl & (0xf << 8) is always 0 -- should this be pci_write_config_word() to match the way linkctrl is read a few lines earlier? ipath_intr.c:835: __le64 val; //... val = ipath_read_kreg64(dd, (0x1000 / sizeof(u64)) + im); dd->ipath_pioavailregs_dma[i] = dd->ipath_pioavailshadow[i] = le64_to_cpu(val); ipath_read_kreg64() seems as if it will return a value in CPU endian, since it is just readq(). And there seems to be some confusion here -- ipath_pioavailregs_dma seems to be in little-endian and ipath_pioavailshadow is in host-endian, so you can't assign both with the same value, right? Has this code ever triggered on a big-endian system? ipath_intr.c:62: u32 __iomem *pbuf; //... *pbuf = dwcnt+1; /* no flush required, since already in freeze */ pbuf is declared as __iomem, but then you write directly to it in CPU endian. 
Should this be a writel() to handle byte-swapping and general IO space stuff properly? Finally, there are a few warnings about symbols shadowing earlier declarations. Most of them look trivial to fix... any objection to merging the following for 2.6.25? diff --git a/drivers/infiniband/hw/ipath/ipath_eeprom.c b/drivers/infiniband/hw/ipath/ipath_eeprom.c index a5b6299..e28a42f 100644 --- a/drivers/infiniband/hw/ipath/ipath_eeprom.c +++ b/drivers/infiniband/hw/ipath/ipath_eeprom.c @@ -574,7 +574,7 @@ void ipath_get_eeprom_info(struct ipath_devdata *dd) struct ipath_devdata *dd0 = ipath_lookup(0); if (t && dd0->ipath_nguid > 1 && t <= dd0->ipath_nguid) { - u8 *bguid, oguid; + u8 oguid; dd->ipath_guid = dd0->ipath_guid; bguid = (u8 *) & dd->ipath_guid; @@ -674,7 +674,6 @@ void ipath_get_eeprom_info(struct ipath_devdata *dd) * elsewhere for backward-compatibility. */ char *snp = dd->ipath_serial; - int len; memcpy(snp, ifp->if_sprefix, sizeof ifp->if_sprefix); snp[sizeof ifp->if_sprefix] = '\0'; len = strlen(snp); diff --git a/drivers/infiniband/hw/ipath/ipath_qp.c b/drivers/infiniband/hw/ipath/ipath_qp.c index b405906..80dc623 100644 --- a/drivers/infiniband/hw/ipath/ipath_qp.c +++ b/drivers/infiniband/hw/ipath/ipath_qp.c @@ -855,8 +855,6 @@ struct ib_qp *ipath_create_qp(struct ib_pd *ibpd, * See ipath_mmap() for details. */ if (udata && udata->outlen >= sizeof(__u64)) { - int err; - if (!qp->r_rq.wq) { __u64 offset = 0; From rdreier at cisco.com Fri Jan 11 11:42:12 2008 From: rdreier at cisco.com (Roland Dreier) Date: Fri, 11 Jan 2008 11:42:12 -0800 Subject: [ofa-general] [PATCH] RDMA/cxgb3: Endianness annotation Message-ID: Steve, is this change correct? The only place the irs field is assigned to is with cpu_to_be32(). If it looks good, I'll merge it for 2.6.25... 
diff --git a/drivers/infiniband/hw/cxgb3/cxio_wr.h b/drivers/infiniband/hw/cxgb3/cxio_wr.h index c84d4ac..de366b0 100644 --- a/drivers/infiniband/hw/cxgb3/cxio_wr.h +++ b/drivers/infiniband/hw/cxgb3/cxio_wr.h @@ -315,7 +315,7 @@ struct t3_rdma_init_wr { __be32 ird; __be64 qp_dma_addr; /* 7 */ __be32 qp_dma_size; /* 8 */ - u32 irs; + __be32 irs; }; struct t3_genbit { From jon at opengridcomputing.com Fri Jan 11 11:46:39 2008 From: jon at opengridcomputing.com (Jon Mason) Date: Fri, 11 Jan 2008 13:46:39 -0600 Subject: [ofa-general] man page for rdma_create_event_channel garbled Message-ID: <20080111194639.GG30920@opengridcomputing.com> I noticed the first line of the man page of rdma_create_event_channel is garbled. `man rdma_create_event_channel` shows as the first line: RDMA_CREATE_EVENT_CHANNELLibrdmacm Programmer’s MaRDMA_CREATE_EVENT_CHANNEL(3) Comparitively a `man rdma_bind_addr` shows as its first line: RDMA_BIND_ADDR(3) Librdmacm Programmer’s Manual RDMA_BIND_ADDR(3) Thanks, Jon From HNGUYEN at de.ibm.com Fri Jan 11 11:55:35 2008 From: HNGUYEN at de.ibm.com (Hoang-Nam Nguyen) Date: Fri, 11 Jan 2008 20:55:35 +0100 Subject: [ofa-general] [PATCH] Use round_jiffies() in ehca timer In-Reply-To: Message-ID: Roland Dreier wrote on 11.01.2008 20:09:27: > ehca guys -- this looks sane to me, and I've had it sitting in my > inbox for a while. Any objection to merging it for 2.6.25? > > - R. > > > Use round_jiffies() to align the 1 second timer with other timers and > > potentially save power by sleeping cores for longer. > > > > Signed-off-by: Anton Blanchard Acked-by: Hoang-Nam Nguyen From swise at opengridcomputing.com Fri Jan 11 11:59:24 2008 From: swise at opengridcomputing.com (Steve Wise) Date: Fri, 11 Jan 2008 13:59:24 -0600 Subject: [ofa-general] Re: [PATCH] RDMA/cxgb3: Endianness annotation In-Reply-To: References: Message-ID: <4787CA9C.4020902@opengridcomputing.com> Roland Dreier wrote: > Steve, is this change correct? 
The only place the irs field is > assigned to is with cpu_to_be32(). If it looks good, I'll merge it > for 2.6.25... > > diff --git a/drivers/infiniband/hw/cxgb3/cxio_wr.h b/drivers/infiniband/hw/cxgb3/cxio_wr.h > index c84d4ac..de366b0 100644 > --- a/drivers/infiniband/hw/cxgb3/cxio_wr.h > +++ b/drivers/infiniband/hw/cxgb3/cxio_wr.h > @@ -315,7 +315,7 @@ struct t3_rdma_init_wr { > __be32 ird; > __be64 qp_dma_addr; /* 7 */ > __be32 qp_dma_size; /* 8 */ > - u32 irs; > + __be32 irs; > }; > > struct t3_genbit { Yes, the irs field should be __be32. Thanks, Steve. From ashwin.kalbag at lehman.com Fri Jan 11 12:05:56 2008 From: ashwin.kalbag at lehman.com (Kalbag, Ashwin) Date: Fri, 11 Jan 2008 15:05:56 -0500 Subject: [ofa-general] Verbs questions... Message-ID: <0B877605F0F36F45A439AF0DEEA18CCF0DC519CC@njpcmg1exms305.leh.lbcorp.lehman.com> Question 1: Here's a section of the man page for ibv_post_send: "The attribute send_flags describes the properties of the WR. It is either 0 or the bitwise OR of one or more of the following flags: IBV_SEND_FENCE Set the fence indicator. Valid only for QPs with Transport Service Type IBV_QPT_RC IBV_SEND_SIGNALED Set the completion notification indicator. Relevant only if QP was created with sq_sig_all=0 IBV_SEND_SOLICITED Set the solicited event indicator. Valid only for Send and RDMA Write with immediate IBV_SEND_INLINE Send data in given gather list as inline data in a send WQE. Valid only for Send and RDMA Write. The L_Key will not be checked." a. What is the fence indicator? Under what circumstances would I use this? b. How is a solicited event different from a signaled event? Under what circumstances would I use this? c. What is not apparent from this man page is whether the signal is generated on the sending side on send completion or on the receiving side on completion of the corresponding posted recv. It's not explicitly stated, but I am assuming that the signaling refers to send completion on the sending side. 
Is it true that regardless of whether the send is signaled on the sending side, it will generate a signal on the receiving side when the recv operation completes? Question 2: Say I'm trying to optimize between polling and completion event notification. Could you please see whether I'm conceptualizing this correctly? At one extreme, you could poll continuously (without signaling sends), until you want to send the next message. The polling would take care of recycling memory region elements on send completions, handling received messages for recv completions. At the other extreme, every event would be signaled. You could set sq_sig_all=1 while creating the queue pair, or alternatively always use the IBV_SEND_SIGNALED flag when posting sends. To take the middle road, one possibility is to generate completions for every nth send. If the messages are being sent rapidly, you can afford to signal fewer sends. If sends are fewer, you would need to either signal more sends, say every send, or compensate by expending CPU in polling. Seems like the signaling needs to be adaptive to the rate of sending. If this is the case, you still have a choice in how you process send completions, assuming that signals will always be generated on the receiving side. Even if you didn't signal every send, you could still process send completions by relying exclusively on signaling instead of polling. If you did this, you'd need to signal at least once per "send queue depth", correct? There is no urgency in processing send completions so long as you have some available depth in the send queue and elements to post sends with. But there is urgency in processing recv completions. So, it may be more optimal to process send completions with fewer signals, say 2 signals per send queue depth, for a margin of safety, so at any time, you will have half the send queue available. This asymmetry would imply it is better to use separate completion queues for sending and receiving. 
Am I right in surmising this? On the recv side, you want to process completions in a hurry and post more recvs, to keep up with the incoming message traffic. Now here's where you might benefit from polling even after your recv event completion was signaled, so you avoid context switching out and back in, and the concomitant delay, just in case there are actually more recvs that completed. Is this correct? 
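[Editorial sketch, not part of the original message.] The "two signals per send queue depth" idea in the question above can be illustrated with a little bookkeeping code. This is only a sketch under stated assumptions: it does not call libibverbs, every name in it (send_signal_policy, policy_init, policy_can_post, policy_on_post, policy_on_completion) is hypothetical, and it assumes that send completions are reported in posting order, so that one signaled completion also retires the unsignaled sends posted before it.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical bookkeeping for "signal every Nth send".
 * None of these names come from libibverbs. */
struct send_signal_policy {
	unsigned int sq_depth;        /* send queue depth chosen by the application */
	unsigned int signal_interval; /* request a completion every this many sends */
	unsigned int since_signal;    /* sends posted since the last signaled one */
	unsigned int outstanding;     /* posted sends not yet known to be complete */
};

static void policy_init(struct send_signal_policy *p, unsigned int sq_depth)
{
	p->sq_depth = sq_depth;
	/* Two signals per queue depth, as suggested in the message. */
	p->signal_interval = sq_depth >= 2 ? sq_depth / 2 : 1;
	p->since_signal = 0;
	p->outstanding = 0;
}

/* True if the application may post another send without overrunning
 * the send queue, according to this bookkeeping. */
static bool policy_can_post(const struct send_signal_policy *p)
{
	return p->outstanding < p->sq_depth;
}

/* Call once per posted send. Returns true if this send should be
 * posted as signaled (IBV_SEND_SIGNALED in the quoted man page). */
static bool policy_on_post(struct send_signal_policy *p)
{
	p->outstanding++;
	if (++p->since_signal >= p->signal_interval) {
		p->since_signal = 0;
		return true;
	}
	return false;
}

/* Call once per signaled send completion. Assumes in-order completion,
 * so the signaled send and the unsignaled sends before it are all done. */
static void policy_on_completion(struct send_signal_policy *p)
{
	assert(p->outstanding >= p->signal_interval);
	p->outstanding -= p->signal_interval;
}
```

With a depth of 8 this signals the 4th and 8th sends; once the first of those completions is seen, half of the queue is available again, which is the margin of safety the message describes.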
From rdreier at cisco.com Fri Jan 11 12:07:02 2008 From: rdreier at cisco.com (Roland Dreier) Date: Fri, 11 Jan 2008 12:07:02 -0800 Subject: [ofa-general] [PATCH] Use round_jiffies() in ehca timer In-Reply-To: (Hoang-Nam Nguyen's message of "Fri, 11 Jan 2008 20:55:35 +0100") References: Message-ID: thanks, I queued this for 2.6.25 From arthur.jones at qlogic.com Fri Jan 11 13:10:28 2008 From: arthur.jones at qlogic.com (Arthur Jones) Date: Fri, 11 Jan 2008 13:10:28 -0800 Subject: [ofa-general] Re: Possible ipath bugs detected by compiler/sparse warnings In-Reply-To: References: Message-ID: <20080111211028.GL3949@bauxite.pathscale.com> hi roland, ... On Fri, Jan 11, 2008 at 11:40:53AM -0800, Roland Dreier wrote: > I've been looking at warnings coming from running sparse on > drivers/infiniband [...] do you have a way to turn down the false positives on make C=1|2? i get a lot of spurious warnings on ARRAY_SIZE and the list_for_each macros. a new sparse perhaps? or do you just grind your way through? > [...] and I've spotted a few things in ipath that look > like real bugs, but I don't know enough about the HW to know how to > fix things properly. It all looks like error path stuff that wouldn't > get exercised very often. Here's what I've seen: > > ipath_iba6110.c:840: > [...] > ipath_intr.c:835: > [...] i don't know right now, but i'll find out, they both look very suspicious... > ipath_intr.c:62: > > u32 __iomem *pbuf; > //... > *pbuf = dwcnt+1; /* no flush required, since already in freeze */ > > pbuf is declared as __iomem, but then you write directly to it in > CPU endian. Should this be a writel() to handle byte-swapping and > general IO space stuff properly? yes, we had an internal bug open on this already... > Finally, there are a few warnings about symbols shadowing earlier > declarations. Most of them look trivial to fix... any objection to > merging the following for 2.6.25? no objections, the patch looks perfect, thanks... 
arthur From sean.hefty at intel.com Fri Jan 11 13:15:56 2008 From: sean.hefty at intel.com (Sean Hefty) Date: Fri, 11 Jan 2008 13:15:56 -0800 Subject: [ofa-general] RE: man page for rdma_create_event_channel garbled In-Reply-To: <20080111194639.GG30920@opengridcomputing.com> References: <20080111194639.GG30920@opengridcomputing.com> Message-ID: <000101c85497$27c07ba0$3c98070a@amr.corp.intel.com> >I noticed the first line of the man page of rdma_create_event_channel is >garbled. `man rdma_create_event_channel` shows as the first line: > >RDMA_CREATE_EVENT_CHANNELLibrdmacm Programmer's MaRDMA_CREATE_EVENT_CHANNEL(3) > >Comparitively a `man rdma_bind_addr` shows as its first line: > >RDMA_BIND_ADDR(3) Librdmacm Programmer's Manual RDMA_BIND_ADDR(3) I think this is simply a result of the function name being a little long. I'm not sure what the correct fix for this would be. Anyone have any ideas? - Sean 
From davem at systemfabricworks.com Fri Jan 11 14:44:18 2008 From: davem at systemfabricworks.com (davem at systemfabricworks.com) Date: Fri, 11 Jan 2008 16:44:18 -0600 Subject: [ofa-general] [PATCH][REPOST] drivers/infiniband/ulp/srpt: Fix target data corruption Message-ID: <4787F142.mailGZ011OYG6@systemfabricworks.com> This is an updated version of [PATCH] drivers/infiniband/ulp/srpt: Fix target data corruption It was pointed out to me that the code to round up to a power of 2 was not as clean as it should be, plus I extracted two unrelated patches and submitted them separately. ===================================================================== Change the local buffer allocator to use a spin-lock protected linked list instead of an array of atomic_t used/free variables. The atomic_t code was open to a multi-thread race between test and set. This has been observed with the result that the same data buffer was used for more than one SCSI operation, either writing the wrong data to the disk or sending the wrong data to the initiator. Signed-off-by: Robert Pearson Signed-off-by: David A. 
McMillen --- ib_srpt.c | 95 +++++++++++++++++++++++++++++++++++++++--------------------- 1 files changed, 62 insertions(+), 33 deletions(-) diff --git a/ib_srpt.c b/ib_srpt.c index 1fb9d56..97088ea 100644 --- a/ib_srpt.c +++ b/ib_srpt.c @@ -56,10 +56,14 @@ MODULE_LICENSE("Dual BSD/GPL"); struct mem_elem { struct page *page; + struct mem_elem *next; + int ndx; u32 len; - atomic_t used; }; +static spinlock_t mempool_lock; +static struct mem_elem *mempool_freelist; + struct srpt_thread { spinlock_t thread_lock; struct list_head thread_ioctx_list; @@ -70,7 +74,7 @@ static u64 mellanox_ioc_guid = 0; static struct list_head srpt_devices; static int mem_size = 32768; static int mem_elements = 4096; -static atomic_t mem_avail; +static int mem_avail; static int cur_pos = 1; static struct mem_elem *srpt_mempool = NULL; static int thread = 1; @@ -98,28 +102,29 @@ static struct ib_client srpt_client = { .remove = srpt_remove_one }; +static int round_up_power_of_2(int val) +{ + while (val & (val - 1)) + val += val & ~(val - 1); + return val; +} + static struct mem_elem *srpt_get_mem_elem(int *index) { - int i, end_pos; + struct mem_elem *elem; *index = 0; - end_pos = mem_elements; - for (i = cur_pos; i <= end_pos; ++i) { - if (i == mem_elements) { - end_pos = cur_pos; - i = 1; - } - if (atomic_read(&srpt_mempool[i].used) == 0) { - atomic_inc(&srpt_mempool[i].used); - smp_mb__after_atomic_inc(); - *index = i; - cur_pos = i + 1; - atomic_dec(&mem_avail); - return &srpt_mempool[i]; - } + spin_lock(&mempool_lock); + elem = mempool_freelist; + if (elem) { + mempool_freelist = elem->next; + mem_avail--; + *index = elem->ndx; } - return NULL; + spin_unlock(&mempool_lock); + + return elem; } static int srpt_free_mem_elem(int index) @@ -127,23 +132,45 @@ static int srpt_free_mem_elem(int index) if (!index || index >= mem_elements) return -EINVAL; - atomic_dec(&srpt_mempool[index].used); - smp_mb__after_atomic_dec(); - atomic_inc(&mem_avail); + spin_lock(&mempool_lock); + 
srpt_mempool[index].next = mempool_freelist; + mempool_freelist = &srpt_mempool[index]; + mem_avail++; + spin_unlock(&mempool_lock); return 0; } static int srpt_mempool_create(void) { - int i, order; + int i, order, array_size; - order = get_order(mem_elements * sizeof(struct mem_elem)); + if (mem_elements <= 0) + return -ENOMEM; + array_size = mem_elements * sizeof(struct mem_elem); + array_size = round_up_power_of_2(array_size); + /* array_size now first power of 2 >= actual array size */ + + if (mem_size < PAGE_SIZE) { + printk(KERN_ERR PFX "mem_size parameter changed from" + " %d to %lu (PAGE_SIZE)\n", mem_size, PAGE_SIZE); + mem_size = PAGE_SIZE; + } + i = round_up_power_of_2(mem_size); + if (i != mem_size) { + printk(KERN_ERR PFX "mem_size parameter rounded up from" + " %d to %d\n", mem_size, i); + mem_size = i; + } + /* mem_size now also a power of 2 >= PAGE_SIZE */ + + order = get_order(array_size); + spin_lock_init(&mempool_lock); srpt_mempool = (struct mem_elem *) __get_free_pages(GFP_KERNEL, order); if (!srpt_mempool) return -ENOMEM; + order = get_order(mem_size); for (i = 1; i < mem_elements; ++i) { - order = get_order(mem_size); srpt_mempool[i].page = alloc_pages(GFP_KERNEL, order); if (!srpt_mempool[i].page) { printk(KERN_ERR PFX @@ -151,18 +178,19 @@ static int srpt_mempool_create(void) mem_size, i); goto free_mem; } - atomic_set(&srpt_mempool[i].used, 0); + srpt_mempool[i].ndx = i; + srpt_mempool[i].next = mempool_freelist; + mempool_freelist = &srpt_mempool[i]; } - atomic_set(&mem_avail, mem_elements); + mem_avail = mem_elements; return 0; free_mem: while (i > 1) __free_pages(srpt_mempool[--i].page, order); - free_pages((unsigned long) srpt_mempool, - get_order(mem_elements * sizeof(struct mem_elem))); + free_pages((unsigned long) srpt_mempool, get_order(array_size)); return -ENOMEM; } @@ -176,7 +204,8 @@ static void srpt_mempool_destroy(void) __free_pages(srpt_mempool[i].page, order); free_pages((unsigned long) srpt_mempool, - 
get_order(mem_elements * sizeof(struct mem_elem))); + get_order(round_up_power_of_2( + mem_elements * sizeof(struct mem_elem)))); } static void srpt_event_handler(struct ib_event_handler *handler, @@ -2116,8 +2145,8 @@ static int srpt_alloc_data_buf(struct scst_cmd *scmnd) tsize = scst_cmd_get_bufflen(scmnd); nrdma = tsize / mem_size + ioctx->n_rbuf; - if (nrdma > atomic_read(&mem_avail)) { - printk(KERN_ERR PFX "!!ALERT mem_avail= %d\n", atomic_read(&mem_avail)); + if (nrdma > mem_avail) { + printk(KERN_ERR PFX "!!ALERT mem_avail= %d\n", mem_avail); goto out; } @@ -2161,7 +2190,7 @@ static int srpt_alloc_data_buf(struct scst_cmd *scmnd) release_mem: printk(KERN_WARNING PFX "ALERT! alloc_mem mem_avail= %d n_mem_elem=%d\n", - atomic_read(&mem_avail), i); + mem_avail, i); riu = ioctx->rdma_ius; while (i) @@ -2295,7 +2324,7 @@ static CLASS_DEVICE_ATTR(login_info, S_IRUGO, show_login_info, NULL); static ssize_t show_mem_info(struct class_device *class_dev, char *buf) { return sprintf(buf, "mem_avail= %d mem_elements= %d mem_size= %d\n", - atomic_read(&mem_avail), mem_elements, mem_size); + mem_avail, mem_elements, mem_size); } static CLASS_DEVICE_ATTR(mem_info, S_IRUGO, show_mem_info, NULL); From davem at systemfabricworks.com Fri Jan 11 14:45:10 2008 From: davem at systemfabricworks.com (davem at systemfabricworks.com) Date: Fri, 11 Jan 2008 16:45:10 -0600 Subject: [ofa-general] [PATCH][REPOST] drivers/infiniband/ulp/srpt: Fix structure initialization Message-ID: <4787F176.mailGZV1T5RRI@systemfabricworks.com> This is a repost of an unrelated patch that was inadvertently included in [PATCH] drivers/infiniband/ulp/srpt: Fix target data corruption ===================================================================== Initialize the reg_req structure before setting bits. Signed-off-by: David A. 
McMillen --- ib_srpt.c | 1 + 1 files changed, 1 insertions(+), 0 deletions(-) diff --git a/ib_srpt.c b/ib_srpt.c index 1fb9d56..35c30a5 100644 --- a/ib_srpt.c +++ b/ib_srpt.c @@ -450,6 +450,7 @@ static int srpt_refresh_port(struct srpt_port *sport) goto err_query_port; if (!sport->mad_agent) { + memset(&reg_req, 0, sizeof reg_req); reg_req.mgmt_class = IB_MGMT_CLASS_DEVICE_MGMT; reg_req.mgmt_class_version = IB_MGMT_BASE_VERSION; set_bit(IB_MGMT_METHOD_GET, reg_req.method_mask); From davem at systemfabricworks.com Fri Jan 11 14:46:02 2008 From: davem at systemfabricworks.com (davem at systemfabricworks.com) Date: Fri, 11 Jan 2008 16:46:02 -0600 Subject: [ofa-general] [PATCH][REPOST] drivers/infiniband/ulp/srpt: Don't deallocate pool if not allocated Message-ID: <4787F1AA.mailH001FJKFI@systemfabricworks.com> This is a repost of an unrelated patch that was inadvertently included in [PATCH] drivers/infiniband/ulp/srpt: Fix target data corruption ===================================================================== If local buffer pool was not allocated, don't try to deallocate it. Signed-off-by: David A. 
McMillen --- ib_srpt.c | 3 +++ 1 files changed, 3 insertions(+), 0 deletions(-) diff --git a/ib_srpt.c b/ib_srpt.c index 1fb9d56..452cda9 100644 --- a/ib_srpt.c +++ b/ib_srpt.c @@ -166,3 +166,4 @@ static int srpt_mempool_create(void) + srpt_mempool = NULL; return -ENOMEM; } @@ -171,6 +172,8 @@ static void srpt_mempool_destroy(void) { int i, order; + if (srpt_mempool == NULL) return; + order = get_order(mem_size); for (i = 1; i < mem_elements; ++i) __free_pages(srpt_mempool[i].page, order); From rdreier at cisco.com Fri Jan 11 15:08:07 2008 From: rdreier at cisco.com (Roland Dreier) Date: Fri, 11 Jan 2008 15:08:07 -0800 Subject: [ofa-general] Re: Possible ipath bugs detected by compiler/sparse warnings In-Reply-To: <20080111211028.GL3949@bauxite.pathscale.com> (Arthur Jones's message of "Fri, 11 Jan 2008 13:10:28 -0800") References: <20080111211028.GL3949@bauxite.pathscale.com> Message-ID: > do you have a way to turn down the false positives > on make C=1|2? i get a lot of spurious warnings > on ARRAY_SIZE and the list_for_each macros. a new > sparse perhaps? or do you just grind your way through? I don't see many false positives on my system... so probably a newer version of sparse will help. I seem to be using version 0.4.1 here. > no objections, the patch looks perfect, thanks... Thanks, I'll queue it up. From arthur.jones at qlogic.com Fri Jan 11 15:46:33 2008 From: arthur.jones at qlogic.com (Arthur Jones) Date: Fri, 11 Jan 2008 15:46:33 -0800 Subject: [ofa-general] Re: Possible ipath bugs detected by compiler/sparse warnings In-Reply-To: References: <20080111211028.GL3949@bauxite.pathscale.com> Message-ID: <20080111234633.GN3949@bauxite.pathscale.com> hi roland, ... On Fri, Jan 11, 2008 at 03:08:07PM -0800, Roland Dreier wrote: > > do you have a way to turn down the false positives > > on make C=1|2? i get a lot of spurious warnings > > on ARRAY_SIZE and the list_for_each macros. a new > > sparse perhaps? or do you just grind your way through? 
> > I don't see many false positives on my system... so probably a newer > version of sparse will help. I seem to be using version 0.4.1 here. yes this is _much_ better, i upgraded from 0.2, now the false positives seem to have mostly gone away... arthur From rdreier at cisco.com Fri Jan 11 16:05:25 2008 From: rdreier at cisco.com (Roland Dreier) Date: Fri, 11 Jan 2008 16:05:25 -0800 Subject: [ofa-general] lock dependency in ib_user_mad In-Reply-To: <200801081733.m08HXX3x013059@cmf.nrl.navy.mil> (chas williams's message of "Tue, 08 Jan 2008 12:33:33 -0500") References: <200801081733.m08HXX3x013059@cmf.nrl.navy.mil> Message-ID: OK, just in time for the weekend, here's a patch that rewrites and simplifies the locking for ib_umad. It passes light testing for me, and I see no warnings with CONFIG_LOCKDEP=y, but I'd really appreciate more tests (especially if you can try to reproduce the hangs/lockdep warnings) and more review. The basic idea is to add a new mutex to each open file handle, and switch the port mutex to a simple mutex. By rejiggering things, I seem to be able to always call ib_unregister_mad_agent() with no locks held, which avoids deadlocks. We set agents_dead (protected by file->mutex) whenever anything is going away, so I don't think there are any races between file closing and removing a device or anything like that. [John -- this eliminates all use of rwsem and downgrade_write(), so I think it should work well with CONFIG_PREEMPT_RT. If you get a chance, can you confirm this too?]
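[Editorial sketch, not part of the original message.] The locking idea described above can be illustrated in user space. This is only a rough sketch under stated assumptions: POSIX mutexes stand in for kernel mutexes, and every name here (sketch_file, sketch_queue_packet, sketch_unregister, sketch_close, SKETCH_MAX_AGENTS) is hypothetical rather than taken from user_mad.c. It shows the pattern of marking agents dead under the per-file lock and only then unregistering with no lock held, so that a callback which takes the same lock cannot deadlock.

```c
#include <pthread.h>
#include <stddef.h>

#define SKETCH_MAX_AGENTS 4	/* hypothetical; stands in for IB_UMAD_MAX_AGENTS */

struct sketch_file {
	pthread_mutex_t mutex;		/* per-file lock, playing the role of file->mutex */
	int agents_dead;		/* set once teardown has started */
	int queued;			/* packets accepted by the callback */
	void *agent[SKETCH_MAX_AGENTS];
};

/* Receive-path callback: takes the per-file lock, as queue_packet() does. */
static int sketch_queue_packet(struct sketch_file *f, void *agent)
{
	int i, ret = 1;

	pthread_mutex_lock(&f->mutex);
	for (i = 0; i < SKETCH_MAX_AGENTS; i++)
		if (!f->agents_dead && f->agent[i] == agent) {
			f->queued++;
			ret = 0;
			break;
		}
	pthread_mutex_unlock(&f->mutex);
	return ret;
}

/* Stand-in for an unregister call that may call back into the file. */
static void sketch_unregister(struct sketch_file *f, void *agent)
{
	/* This could block forever if the caller still held f->mutex. */
	sketch_queue_packet(f, agent);
}

/* Mark agents dead under the lock, then unregister with no lock held. */
static int sketch_close(struct sketch_file *f)
{
	int i, already_dead, n = 0;

	pthread_mutex_lock(&f->mutex);
	already_dead = f->agents_dead;
	f->agents_dead = 1;
	pthread_mutex_unlock(&f->mutex);

	if (!already_dead)
		for (i = 0; i < SKETCH_MAX_AGENTS; i++)
			if (f->agent[i]) {
				sketch_unregister(f, f->agent[i]);
				n++;
			}
	return n;	/* number of agents unregistered by this call */
}

/* Demo: two agents, closed twice; encodes both results as first*10 + second. */
int sketch_demo(void)
{
	static int a, b;
	struct sketch_file f = { .agents_dead = 0 };
	int first, second;

	pthread_mutex_init(&f.mutex, NULL);
	f.agent[0] = &a;
	f.agent[2] = &b;
	first = sketch_close(&f);	/* unregisters both agents */
	second = sketch_close(&f);	/* already dead, does nothing */
	pthread_mutex_destroy(&f.mutex);
	return first * 10 + second;
}
```

In this sketch the second sketch_close() call is a no-op because agents_dead was already set, which mirrors the already_dead check in the patch.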
Thanks, Roland drivers/infiniband/core/user_mad.c | 109 +++++++++++++++--------------------- 1 files changed, 46 insertions(+), 63 deletions(-) diff --git a/drivers/infiniband/core/user_mad.c b/drivers/infiniband/core/user_mad.c index b53eac4..473ba14 100644 --- a/drivers/infiniband/core/user_mad.c +++ b/drivers/infiniband/core/user_mad.c @@ -42,7 +42,7 @@ #include #include #include -#include +#include #include #include @@ -94,7 +94,7 @@ struct ib_umad_port { struct class_device *sm_class_dev; struct semaphore sm_sem; - struct rw_semaphore mutex; + struct mutex file_mutex; struct list_head file_list; struct ib_device *ib_dev; @@ -110,11 +110,11 @@ struct ib_umad_device { }; struct ib_umad_file { + struct mutex mutex; struct ib_umad_port *port; struct list_head recv_list; struct list_head send_list; struct list_head port_list; - spinlock_t recv_lock; spinlock_t send_lock; wait_queue_head_t recv_wait; struct ib_mad_agent *agent[IB_UMAD_MAX_AGENTS]; @@ -156,7 +156,7 @@ static int hdr_size(struct ib_umad_file *file) sizeof (struct ib_user_mad_hdr_old); } -/* caller must hold port->mutex at least for reading */ +/* caller must hold file->mutex */ static struct ib_mad_agent *__get_agent(struct ib_umad_file *file, int id) { return file->agents_dead ? 
NULL : file->agent[id]; @@ -168,32 +168,30 @@ static int queue_packet(struct ib_umad_file *file, { int ret = 1; - down_read(&file->port->mutex); + mutex_lock(&file->mutex); for (packet->mad.hdr.id = 0; packet->mad.hdr.id < IB_UMAD_MAX_AGENTS; packet->mad.hdr.id++) if (agent == __get_agent(file, packet->mad.hdr.id)) { - spin_lock_irq(&file->recv_lock); list_add_tail(&packet->list, &file->recv_list); - spin_unlock_irq(&file->recv_lock); wake_up_interruptible(&file->recv_wait); ret = 0; break; } - up_read(&file->port->mutex); + mutex_unlock(&file->mutex); return ret; } static void dequeue_send(struct ib_umad_file *file, struct ib_umad_packet *packet) - { +{ spin_lock_irq(&file->send_lock); list_del(&packet->list); spin_unlock_irq(&file->send_lock); - } +} static void send_handler(struct ib_mad_agent *agent, struct ib_mad_send_wc *send_wc) @@ -341,10 +339,10 @@ static ssize_t ib_umad_read(struct file *filp, char __user *buf, if (count < hdr_size(file)) return -EINVAL; - spin_lock_irq(&file->recv_lock); + mutex_lock(&file->mutex); while (list_empty(&file->recv_list)) { - spin_unlock_irq(&file->recv_lock); + mutex_unlock(&file->mutex); if (filp->f_flags & O_NONBLOCK) return -EAGAIN; @@ -353,13 +351,13 @@ static ssize_t ib_umad_read(struct file *filp, char __user *buf, !list_empty(&file->recv_list))) return -ERESTARTSYS; - spin_lock_irq(&file->recv_lock); + mutex_lock(&file->mutex); } packet = list_entry(file->recv_list.next, struct ib_umad_packet, list); list_del(&packet->list); - spin_unlock_irq(&file->recv_lock); + mutex_unlock(&file->mutex); if (packet->recv_wc) ret = copy_recv_mad(file, buf, packet, count); @@ -368,9 +366,9 @@ static ssize_t ib_umad_read(struct file *filp, char __user *buf, if (ret < 0) { /* Requeue packet */ - spin_lock_irq(&file->recv_lock); + mutex_lock(&file->mutex); list_add(&packet->list, &file->recv_list); - spin_unlock_irq(&file->recv_lock); + mutex_unlock(&file->mutex); } else { if (packet->recv_wc) ib_free_recv_mad(packet->recv_wc); @@ 
-481,7 +479,7 @@ static ssize_t ib_umad_write(struct file *filp, const char __user *buf, goto err; } - down_read(&file->port->mutex); + mutex_lock(&file->mutex); agent = __get_agent(file, packet->mad.hdr.id); if (!agent) { @@ -577,7 +575,7 @@ static ssize_t ib_umad_write(struct file *filp, const char __user *buf, if (ret) goto err_send; - up_read(&file->port->mutex); + mutex_unlock(&file->mutex); return count; err_send: @@ -587,7 +585,7 @@ err_msg: err_ah: ib_destroy_ah(ah); err_up: - up_read(&file->port->mutex); + mutex_unlock(&file->mutex); err: kfree(packet); return ret; @@ -613,11 +611,11 @@ static int ib_umad_reg_agent(struct ib_umad_file *file, void __user *arg, { struct ib_user_mad_reg_req ureq; struct ib_mad_reg_req req; - struct ib_mad_agent *agent; + struct ib_mad_agent *agent = NULL; int agent_id; int ret; - down_write(&file->port->mutex); + mutex_lock(&file->mutex); if (!file->port->ib_dev) { ret = -EPIPE; @@ -666,13 +664,13 @@ found: send_handler, recv_handler, file); if (IS_ERR(agent)) { ret = PTR_ERR(agent); + agent = NULL; goto out; } if (put_user(agent_id, (u32 __user *) (arg + offsetof(struct ib_user_mad_reg_req, id)))) { ret = -EFAULT; - ib_unregister_mad_agent(agent); goto out; } @@ -690,7 +688,11 @@ found: ret = 0; out: - up_write(&file->port->mutex); + mutex_unlock(&file->mutex); + + if (ret && agent) + ib_unregister_mad_agent(agent); + return ret; } @@ -703,7 +705,7 @@ static int ib_umad_unreg_agent(struct ib_umad_file *file, u32 __user *arg) if (get_user(id, arg)) return -EFAULT; - down_write(&file->port->mutex); + mutex_lock(&file->mutex); if (id < 0 || id >= IB_UMAD_MAX_AGENTS || !__get_agent(file, id)) { ret = -EINVAL; @@ -714,7 +716,7 @@ static int ib_umad_unreg_agent(struct ib_umad_file *file, u32 __user *arg) file->agent[id] = NULL; out: - up_write(&file->port->mutex); + mutex_unlock(&file->mutex); if (agent) ib_unregister_mad_agent(agent); @@ -726,12 +728,12 @@ static long ib_umad_enable_pkey(struct ib_umad_file *file) { int ret = 0; 
- down_write(&file->port->mutex); + mutex_lock(&file->mutex); if (file->already_used) ret = -EINVAL; else file->use_pkey_index = 1; - up_write(&file->port->mutex); + mutex_unlock(&file->mutex); return ret; } @@ -783,7 +785,7 @@ static int ib_umad_open(struct inode *inode, struct file *filp) if (!port) return -ENXIO; - down_write(&port->mutex); + mutex_lock(&port->file_mutex); if (!port->ib_dev) { ret = -ENXIO; @@ -797,7 +799,7 @@ static int ib_umad_open(struct inode *inode, struct file *filp) goto out; } - spin_lock_init(&file->recv_lock); + mutex_init(&file->mutex); spin_lock_init(&file->send_lock); INIT_LIST_HEAD(&file->recv_list); INIT_LIST_HEAD(&file->send_list); @@ -809,7 +811,7 @@ static int ib_umad_open(struct inode *inode, struct file *filp) list_add_tail(&file->port_list, &port->file_list); out: - up_write(&port->mutex); + mutex_unlock(&port->file_mutex); return ret; } @@ -821,7 +823,8 @@ static int ib_umad_close(struct inode *inode, struct file *filp) int already_dead; int i; - down_write(&file->port->mutex); + mutex_lock(&file->port->file_mutex); + mutex_lock(&file->mutex); already_dead = file->agents_dead; file->agents_dead = 1; @@ -834,15 +837,14 @@ static int ib_umad_close(struct inode *inode, struct file *filp) list_del(&file->port_list); - downgrade_write(&file->port->mutex); + mutex_unlock(&file->mutex); + mutex_unlock(&file->port->file_mutex); if (!already_dead) for (i = 0; i < IB_UMAD_MAX_AGENTS; ++i) if (file->agent[i]) ib_unregister_mad_agent(file->agent[i]); - up_read(&file->port->mutex); - kfree(file); kref_put(&dev->ref, ib_umad_release_dev); @@ -914,10 +916,10 @@ static int ib_umad_sm_close(struct inode *inode, struct file *filp) }; int ret = 0; - down_write(&port->mutex); + mutex_lock(&port->file_mutex); if (port->ib_dev) ret = ib_modify_port(port->ib_dev, port->port_num, 0, &props); - up_write(&port->mutex); + mutex_unlock(&port->file_mutex); up(&port->sm_sem); @@ -981,7 +983,7 @@ static int ib_umad_init_port(struct ib_device *device, int 
port_num, port->ib_dev = device; port->port_num = port_num; init_MUTEX(&port->sm_sem); - init_rwsem(&port->mutex); + mutex_init(&port->file_mutex); INIT_LIST_HEAD(&port->file_list); port->dev = cdev_alloc(); @@ -1052,6 +1054,7 @@ err_cdev: static void ib_umad_kill_port(struct ib_umad_port *port) { struct ib_umad_file *file; + int already_dead; int id; class_set_devdata(port->class_dev, NULL); @@ -1067,42 +1070,22 @@ static void ib_umad_kill_port(struct ib_umad_port *port) umad_port[port->dev_num] = NULL; spin_unlock(&port_lock); - down_write(&port->mutex); + mutex_lock(&port->file_mutex); port->ib_dev = NULL; - /* - * Now go through the list of files attached to this port and - * unregister all of their MAD agents. We need to hold - * port->mutex while doing this to avoid racing with - * ib_umad_close(), but we can't hold the mutex for writing - * while calling ib_unregister_mad_agent(), since that might - * deadlock by calling back into queue_packet(). So we - * downgrade our lock to a read lock, and then drop and - * reacquire the write lock for the next iteration. - * - * We do list_del_init() on the file's list_head so that the - * list_del in ib_umad_close() is still OK, even after the - * file is removed from the list. 
- */ - while (!list_empty(&port->file_list)) { - file = list_entry(port->file_list.next, struct ib_umad_file, - port_list); - + list_for_each_entry(file, &port->file_list, port_list) { + mutex_lock(&file->mutex); + already_dead = file->agents_dead; file->agents_dead = 1; - list_del_init(&file->port_list); - - downgrade_write(&port->mutex); + mutex_unlock(&file->mutex); for (id = 0; id < IB_UMAD_MAX_AGENTS; ++id) if (file->agent[id]) ib_unregister_mad_agent(file->agent[id]); - - up_read(&port->mutex); - down_write(&port->mutex); } - up_write(&port->mutex); + mutex_unlock(&port->file_mutex); clear_bit(port->dev_num, dev_map); } From kliteyn at mellanox.co.il Fri Jan 11 17:16:55 2008 From: kliteyn at mellanox.co.il (kliteyn at mellanox.co.il) Date: 12 Jan 2008 03:16:55 +0200 Subject: [ofa-general] nightly osm_sim report 2008-01-12:normal completion Message-ID: OSM Simulation Regression Summary [Generated mail - please do NOT reply] OpenSM binary date = 2008-01-11 OpenSM git rev = Thu_Jan_10_03:48:16_2008 [7bb2045bd9f659f8466a4494f4ec983f0edbf96a] ibutils git rev = Mon_Dec_24_10:42:01_2007 [675bec82306d6920555dd0b5e2f664983e27e60f] Total=400 Pass=398 Fail=2 Pass: 30 Stability IS1-16.topo 30 Pkey IS1-16.topo 30 OsmTest IS1-16.topo 30 OsmStress IS1-16.topo 30 Multicast IS1-16.topo 30 LidMgr IS1-16.topo 10 Stability IS3-loop.topo 10 Stability IS3-128.topo 10 Pkey IS3-128.topo 10 OsmTest IS3-loop.topo 10 OsmTest IS3-128.topo 10 Multicast IS3-loop.topo 10 Multicast IS3-128.topo 10 FatTree merge-roots-4-ary-2-tree.topo 10 FatTree merge-root-4-ary-3-tree.topo 10 FatTree gnu-stallion-64.topo 10 FatTree blend-4-ary-2-tree.topo 10 FatTree RhinoDDR.topo 10 FatTree FullGnu.topo 10 FatTree 4-ary-2-tree.topo 10 FatTree 2-ary-4-tree.topo 10 FatTree 12-node-spaced.topo 10 FTreeFail 4-ary-2-tree-missing-sw-link.topo 10 FTreeFail 4-ary-2-tree-links-at-same-rank-2.topo 10 FTreeFail 4-ary-2-tree-links-at-same-rank-1.topo 10 FTreeFail 4-ary-2-tree-diff-num-pgroups.topo 9 OsmStress 
IS3-128.topo 9 LidMgr IS3-128.topo Failures: 1 OsmStress IS3-128.topo 1 LidMgr IS3-128.topo From john.benninghoff at intel.com Fri Jan 11 17:23:12 2008 From: john.benninghoff at intel.com (Benninghoff, John) Date: Fri, 11 Jan 2008 17:23:12 -0800 Subject: [ofa-general] ofa_kernel build error on RH 4.6 (2.6.9-67.ELsmp) Message-ID: <2E020D3DD4A80647AE77E1692F6E97D9C8C65E@FMSMSX420> I can build ofa_user RPMs but building the ofa_kernel RPMs fails with the error below. Is this a known issue with RH 4.6? make -f scripts/Makefile.build obj=/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2.5.4/drivers/infiniband/core gcc -Wp,-MD,/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2.5.4/drivers/infiniband/cor e/.addr.o.d -nostdinc -iwithprefix include -D__KERNEL__ -I/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2.5.4/include -I/var/tmp/OFEDRPM/BUILD/ofa_kernel-1 .2.5.4/drivers/infiniband/include -Iinclude -include include/linux/autoconf.h -include /var/tmp/OFEDRPM/BUI LD/ofa_kernel-1.2.5.4/include/linux/autoconf.h -Wall -Wstrict-prototypes -Wno-trigraphs -fno-strict-aliasing - fno-common -Os -fomit-frame-pointer -g -Wdeclaration-after-statement -mno-red-zone -mcmodel=kernel -pipe -fno-r eorder-blocks -Wno-sign-compare -funit-at-a-time -I/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2.5.4/include -I/va r/tmp/OFEDRPM/BUILD/ofa_kernel-1.2.5.4/drivers/infiniband/include -I/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2.5.4/d rivers/infiniband/ulp/ipoib -I/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2.5.4/drivers/infiniband/debug -I/var/tmp/OF EDRPM/BUILD/ofa_kernel-1.2.5.4/drivers/infiniband/hw/cxgb3/core -I/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2.5.4/dri vers/net/cxgb3 -I/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2.5.4/net/rds -I/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2.5.4 /drivers/net/mlx4 -I/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2.5.4/drivers/infiniband/hw/mlx4 -DMODULE -DKBUILD_BA SENAME=addr -DKBUILD_MODNAME=ib_addr -c -o /var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2.5.4/drivers/infiniband/core/.tm p_addr.o 
/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2.5.4/drivers/infiniband/core/addr.c In file included from /var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2.5.4/drivers/infiniband/core/addr.c :32: include/linux/inetdevice.h:50: error: field `mr_gq_timer' has incomplete type include/linux/inetdevice.h:51: error: field `mr_ifc_timer' has incomplete type include/linux/inetdevice.h:95: error: `IFNAMSIZ' undeclared here (not in a function) ..... From rdreier at cisco.com Fri Jan 11 19:36:14 2008 From: rdreier at cisco.com (Roland Dreier) Date: Fri, 11 Jan 2008 19:36:14 -0800 Subject: [ofa-general] lock dependency in ib_user_mad In-Reply-To: (Roland Dreier's message of "Fri, 11 Jan 2008 16:05:25 -0800") References: <200801081733.m08HXX3x013059@cmf.nrl.navy.mil> Message-ID: > @@ -714,7 +716,7 @@ static int ib_umad_unreg_agent(struct ib_umad_file *file, u32 __user *arg) > file->agent[id] = NULL; > > out: > - up_write(&file->port->mutex); > + mutex_unlock(&file->mutex); > > if (agent) > ib_unregister_mad_agent(agent); Actually, I think places like this need to hold file->port->file_mutex across the call to ib_unregister_mad_agent() to avoid a window where a device could be hot-unplugged after the mutex_unlock() but before the call to ib_unregister_mad_agent().
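[Editorial sketch, not part of the original message.] The window described above can be illustrated with a small user-space model. It assumes POSIX mutexes in place of kernel mutexes, and all names (sketch_port, sketch_ufile, sketch_try_unplug, sketch_unreg_agent) are hypothetical, not the ib_umad code. The outer port lock is taken first and held across the unregister step, while the inner file lock is dropped before it; a simulated hot-unplug that needs the outer lock therefore cannot run in between.

```c
#include <errno.h>
#include <pthread.h>
#include <stddef.h>

struct sketch_port {
	pthread_mutex_t file_mutex;	/* outer lock, in the role of port->file_mutex */
	int dev_present;		/* cleared on hot-unplug, like port->ib_dev */
};

struct sketch_ufile {
	pthread_mutex_t mutex;		/* inner lock, in the role of file->mutex */
	struct sketch_port *port;
	void *agent;
	int unplug_blocked;		/* observed while unregistering */
};

/* Hot-unplug path: needs the outer lock before it may clear the device. */
static int sketch_try_unplug(struct sketch_port *p)
{
	int rc = pthread_mutex_trylock(&p->file_mutex);

	if (rc != 0)
		return rc;		/* EBUSY: a teardown is in progress */
	p->dev_present = 0;
	pthread_mutex_unlock(&p->file_mutex);
	return 0;
}

/* Stand-in for the unregister call; probes whether unplug could run now. */
static void sketch_unregister_agent(struct sketch_ufile *f)
{
	f->unplug_blocked = (sketch_try_unplug(f->port) == EBUSY);
}

/* Lock order: outer port lock first, then inner file lock. */
static int sketch_unreg_agent(struct sketch_ufile *f)
{
	void *agent;

	pthread_mutex_lock(&f->port->file_mutex);
	pthread_mutex_lock(&f->mutex);
	agent = f->agent;
	f->agent = NULL;
	pthread_mutex_unlock(&f->mutex);	/* inner lock dropped first */

	if (agent)
		sketch_unregister_agent(f);	/* outer lock still held */

	pthread_mutex_unlock(&f->port->file_mutex);
	return agent ? 0 : -1;
}

/* Demo: returns 1 when unplug was blocked during unregister and works after. */
int sketch_unplug_demo(void)
{
	static int dummy_agent;
	struct sketch_port port = { .dev_present = 1 };
	struct sketch_ufile file = { .port = &port, .agent = &dummy_agent };
	int ok;

	pthread_mutex_init(&port.file_mutex, NULL);
	pthread_mutex_init(&file.mutex, NULL);

	ok = sketch_unreg_agent(&file) == 0	/* agent was unregistered */
	    && file.unplug_blocked		/* unplug could not slip in */
	    && port.dev_present			/* device still present */
	    && sketch_try_unplug(&port) == 0	/* unplug succeeds afterwards */
	    && !port.dev_present;

	pthread_mutex_destroy(&file.mutex);
	pthread_mutex_destroy(&port.file_mutex);
	return ok;
}
```

The sketch relies on pthread_mutex_trylock() returning EBUSY for a default mutex that is already locked, which is how the blocked unplug attempt is detected without a second thread.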
Please try the patch below instead: drivers/infiniband/core/user_mad.c | 114 ++++++++++++++++------------------- 1 files changed, 52 insertions(+), 62 deletions(-) diff --git a/drivers/infiniband/core/user_mad.c b/drivers/infiniband/core/user_mad.c index b53eac4..d47c307 100644 --- a/drivers/infiniband/core/user_mad.c +++ b/drivers/infiniband/core/user_mad.c @@ -42,7 +42,7 @@ #include #include #include -#include +#include #include #include @@ -94,7 +94,7 @@ struct ib_umad_port { struct class_device *sm_class_dev; struct semaphore sm_sem; - struct rw_semaphore mutex; + struct mutex file_mutex; struct list_head file_list; struct ib_device *ib_dev; @@ -110,11 +110,11 @@ struct ib_umad_device { }; struct ib_umad_file { + struct mutex mutex; struct ib_umad_port *port; struct list_head recv_list; struct list_head send_list; struct list_head port_list; - spinlock_t recv_lock; spinlock_t send_lock; wait_queue_head_t recv_wait; struct ib_mad_agent *agent[IB_UMAD_MAX_AGENTS]; @@ -156,7 +156,7 @@ static int hdr_size(struct ib_umad_file *file) sizeof (struct ib_user_mad_hdr_old); } -/* caller must hold port->mutex at least for reading */ +/* caller must hold file->mutex */ static struct ib_mad_agent *__get_agent(struct ib_umad_file *file, int id) { return file->agents_dead ? 
NULL : file->agent[id]; @@ -168,32 +168,30 @@ static int queue_packet(struct ib_umad_file *file, { int ret = 1; - down_read(&file->port->mutex); + mutex_lock(&file->mutex); for (packet->mad.hdr.id = 0; packet->mad.hdr.id < IB_UMAD_MAX_AGENTS; packet->mad.hdr.id++) if (agent == __get_agent(file, packet->mad.hdr.id)) { - spin_lock_irq(&file->recv_lock); list_add_tail(&packet->list, &file->recv_list); - spin_unlock_irq(&file->recv_lock); wake_up_interruptible(&file->recv_wait); ret = 0; break; } - up_read(&file->port->mutex); + mutex_unlock(&file->mutex); return ret; } static void dequeue_send(struct ib_umad_file *file, struct ib_umad_packet *packet) - { +{ spin_lock_irq(&file->send_lock); list_del(&packet->list); spin_unlock_irq(&file->send_lock); - } +} static void send_handler(struct ib_mad_agent *agent, struct ib_mad_send_wc *send_wc) @@ -341,10 +339,10 @@ static ssize_t ib_umad_read(struct file *filp, char __user *buf, if (count < hdr_size(file)) return -EINVAL; - spin_lock_irq(&file->recv_lock); + mutex_lock(&file->mutex); while (list_empty(&file->recv_list)) { - spin_unlock_irq(&file->recv_lock); + mutex_unlock(&file->mutex); if (filp->f_flags & O_NONBLOCK) return -EAGAIN; @@ -353,13 +351,13 @@ static ssize_t ib_umad_read(struct file *filp, char __user *buf, !list_empty(&file->recv_list))) return -ERESTARTSYS; - spin_lock_irq(&file->recv_lock); + mutex_lock(&file->mutex); } packet = list_entry(file->recv_list.next, struct ib_umad_packet, list); list_del(&packet->list); - spin_unlock_irq(&file->recv_lock); + mutex_unlock(&file->mutex); if (packet->recv_wc) ret = copy_recv_mad(file, buf, packet, count); @@ -368,9 +366,9 @@ static ssize_t ib_umad_read(struct file *filp, char __user *buf, if (ret < 0) { /* Requeue packet */ - spin_lock_irq(&file->recv_lock); + mutex_lock(&file->mutex); list_add(&packet->list, &file->recv_list); - spin_unlock_irq(&file->recv_lock); + mutex_unlock(&file->mutex); } else { if (packet->recv_wc) ib_free_recv_mad(packet->recv_wc); @@ 
-481,7 +479,7 @@ static ssize_t ib_umad_write(struct file *filp, const char __user *buf, goto err; } - down_read(&file->port->mutex); + mutex_lock(&file->mutex); agent = __get_agent(file, packet->mad.hdr.id); if (!agent) { @@ -577,7 +575,7 @@ static ssize_t ib_umad_write(struct file *filp, const char __user *buf, if (ret) goto err_send; - up_read(&file->port->mutex); + mutex_unlock(&file->mutex); return count; err_send: @@ -587,7 +585,7 @@ err_msg: err_ah: ib_destroy_ah(ah); err_up: - up_read(&file->port->mutex); + mutex_unlock(&file->mutex); err: kfree(packet); return ret; @@ -613,11 +611,12 @@ static int ib_umad_reg_agent(struct ib_umad_file *file, void __user *arg, { struct ib_user_mad_reg_req ureq; struct ib_mad_reg_req req; - struct ib_mad_agent *agent; + struct ib_mad_agent *agent = NULL; int agent_id; int ret; - down_write(&file->port->mutex); + mutex_lock(&file->port->file_mutex); + mutex_lock(&file->mutex); if (!file->port->ib_dev) { ret = -EPIPE; @@ -666,13 +665,13 @@ found: send_handler, recv_handler, file); if (IS_ERR(agent)) { ret = PTR_ERR(agent); + agent = NULL; goto out; } if (put_user(agent_id, (u32 __user *) (arg + offsetof(struct ib_user_mad_reg_req, id)))) { ret = -EFAULT; - ib_unregister_mad_agent(agent); goto out; } @@ -690,7 +689,13 @@ found: ret = 0; out: - up_write(&file->port->mutex); + mutex_unlock(&file->mutex); + + if (ret && agent) + ib_unregister_mad_agent(agent); + + mutex_unlock(&file->port->file_mutex); + return ret; } @@ -703,7 +708,8 @@ static int ib_umad_unreg_agent(struct ib_umad_file *file, u32 __user *arg) if (get_user(id, arg)) return -EFAULT; - down_write(&file->port->mutex); + mutex_lock(&file->port->file_mutex); + mutex_lock(&file->mutex); if (id < 0 || id >= IB_UMAD_MAX_AGENTS || !__get_agent(file, id)) { ret = -EINVAL; @@ -714,11 +720,13 @@ static int ib_umad_unreg_agent(struct ib_umad_file *file, u32 __user *arg) file->agent[id] = NULL; out: - up_write(&file->port->mutex); + mutex_unlock(&file->mutex); if (agent)
ib_unregister_mad_agent(agent); + mutex_unlock(&file->port->file_mutex); + return ret; } @@ -726,12 +734,12 @@ static long ib_umad_enable_pkey(struct ib_umad_file *file) { int ret = 0; - down_write(&file->port->mutex); + mutex_lock(&file->mutex); if (file->already_used) ret = -EINVAL; else file->use_pkey_index = 1; - up_write(&file->port->mutex); + mutex_unlock(&file->mutex); return ret; } @@ -783,7 +791,7 @@ static int ib_umad_open(struct inode *inode, struct file *filp) if (!port) return -ENXIO; - down_write(&port->mutex); + mutex_lock(&port->file_mutex); if (!port->ib_dev) { ret = -ENXIO; @@ -797,7 +805,7 @@ static int ib_umad_open(struct inode *inode, struct file *filp) goto out; } - spin_lock_init(&file->recv_lock); + mutex_init(&file->mutex); spin_lock_init(&file->send_lock); INIT_LIST_HEAD(&file->recv_list); INIT_LIST_HEAD(&file->send_list); @@ -809,7 +817,7 @@ static int ib_umad_open(struct inode *inode, struct file *filp) list_add_tail(&file->port_list, &port->file_list); out: - up_write(&port->mutex); + mutex_unlock(&port->file_mutex); return ret; } @@ -821,7 +829,8 @@ static int ib_umad_close(struct inode *inode, struct file *filp) int already_dead; int i; - down_write(&file->port->mutex); + mutex_lock(&file->port->file_mutex); + mutex_lock(&file->mutex); already_dead = file->agents_dead; file->agents_dead = 1; @@ -834,14 +843,14 @@ static int ib_umad_close(struct inode *inode, struct file *filp) list_del(&file->port_list); - downgrade_write(&file->port->mutex); + mutex_unlock(&file->mutex); if (!already_dead) for (i = 0; i < IB_UMAD_MAX_AGENTS; ++i) if (file->agent[i]) ib_unregister_mad_agent(file->agent[i]); - up_read(&file->port->mutex); + mutex_unlock(&file->port->file_mutex); kfree(file); kref_put(&dev->ref, ib_umad_release_dev); @@ -914,10 +923,10 @@ static int ib_umad_sm_close(struct inode *inode, struct file *filp) }; int ret = 0; - down_write(&port->mutex); + mutex_lock(&port->file_mutex); if (port->ib_dev) ret = ib_modify_port(port->ib_dev, 
port->port_num, 0, &props); - up_write(&port->mutex); + mutex_unlock(&port->file_mutex); up(&port->sm_sem); @@ -981,7 +990,7 @@ static int ib_umad_init_port(struct ib_device *device, int port_num, port->ib_dev = device; port->port_num = port_num; init_MUTEX(&port->sm_sem); - init_rwsem(&port->mutex); + mutex_init(&port->file_mutex); INIT_LIST_HEAD(&port->file_list); port->dev = cdev_alloc(); @@ -1052,6 +1061,7 @@ err_cdev: static void ib_umad_kill_port(struct ib_umad_port *port) { struct ib_umad_file *file; + int already_dead; int id; class_set_devdata(port->class_dev, NULL); @@ -1067,42 +1077,22 @@ static void ib_umad_kill_port(struct ib_umad_port *port) umad_port[port->dev_num] = NULL; spin_unlock(&port_lock); - down_write(&port->mutex); + mutex_lock(&port->file_mutex); port->ib_dev = NULL; - /* - * Now go through the list of files attached to this port and - * unregister all of their MAD agents. We need to hold - * port->mutex while doing this to avoid racing with - * ib_umad_close(), but we can't hold the mutex for writing - * while calling ib_unregister_mad_agent(), since that might - * deadlock by calling back into queue_packet(). So we - * downgrade our lock to a read lock, and then drop and - * reacquire the write lock for the next iteration. - * - * We do list_del_init() on the file's list_head so that the - * list_del in ib_umad_close() is still OK, even after the - * file is removed from the list. 
- */ - while (!list_empty(&port->file_list)) { - file = list_entry(port->file_list.next, struct ib_umad_file, - port_list); - + list_for_each_entry(file, &port->file_list, port_list) { + mutex_lock(&file->mutex); + already_dead = file->agents_dead; file->agents_dead = 1; - list_del_init(&file->port_list); - - downgrade_write(&port->mutex); + mutex_unlock(&file->mutex); for (id = 0; id < IB_UMAD_MAX_AGENTS; ++id) if (file->agent[id]) ib_unregister_mad_agent(file->agent[id]); - - up_read(&port->mutex); - down_write(&port->mutex); } - up_write(&port->mutex); + mutex_unlock(&port->file_mutex); clear_bit(port->dev_num, dev_map); } From weiny2 at llnl.gov Fri Jan 11 19:36:57 2008 From: weiny2 at llnl.gov (Ira Weiny) Date: Fri, 11 Jan 2008 19:36:57 -0800 Subject: [ofa-general] IPoIB, OFED 1.2.5, and multicast groups. Message-ID: <20080111193657.58477fb0.weiny2@llnl.gov> I don't really understand the inner workings of IPoIB, so forgive me if this is a really stupid question, but: Is it a bug that there is a Multicast group created for every node in our clusters? If it is not a bug, why is this done? We just tried to boot on a 1151 node cluster and opensm is complaining there are not enough multicast groups. Jan 11 18:30:42 728984 [40C05960] -> __get_new_mlid: ERR 1B23: All available:1024 mlids are taken Jan 11 18:30:42 729050 [40C05960] -> osm_mcmr_rcv_create_new_mgrp: ERR 1B19: __get_new_mlid failed Jan 11 18:30:42 730647 [40401960] -> __get_new_mlid: ERR 1B23: All available:1024 mlids are taken Jan 11 18:30:42 730691 [40401960] -> osm_mcmr_rcv_create_new_mgrp: ERR 1B19: __get_new_mlid failed Here is the output from my small test cluster: (ibnodesinmcast uses saquery a couple of times to print this nice report.)
19:17:24 > whatsup up: 9: wopr[0-7],wopri down: 0: root at wopri:/tftpboot/images 19:25:03 > ibnodesinmcast -g 0xC000 (0xff12401bffff0000 : 0x00000000ffffffff) In 9: wopr[0-7],wopri Out 0: 0 0xC001 (0xff12401bffff0000 : 0x0000000000000001) In 9: wopr[0-7],wopri Out 0: 0 0xC002 (0xff12601bffff0000 : 0x00000001ff2265ed) In 1: wopr3 Out 8: wopr[0-2,4-7],wopri 0xC003 (0xff12601bffff0000 : 0x0000000000000001) In 9: wopr[0-7],wopri Out 0: 0 0xC004 (0xff12601bffff0000 : 0x00000001ff222729) In 1: wopr4 Out 8: wopr[0-3,5-7],wopri 0xC005 (0xff12601bffff0000 : 0x00000001ff219e65) In 1: wopri Out 8: wopr[0-7] 0xC006 (0xff12601bffff0000 : 0x00000001ff00232d) In 1: wopr6 Out 8: wopr[0-5,7],wopri 0xC007 (0xff12601bffff0000 : 0x00000001ff002325) In 1: wopr7 Out 8: wopr[0-6],wopri 0xC008 (0xff12601bffff0000 : 0x00000001ff228d35) In 1: wopr1 Out 8: wopr[0,2-7],wopri 0xC009 (0xff12601bffff0000 : 0x00000001ff2227f1) In 1: wopr2 Out 8: wopr[0-1,3-7],wopri 0xC00A (0xff12601bffff0000 : 0x00000001ff219ef1) In 1: wopr0 Out 8: wopr[1-7],wopri 0xC00B (0xff12601bffff0000 : 0x00000001ff0021e9) In 1: wopr5 Out 8: wopr[0-4,6-7],wopri Each of these MGIDS of the prefix (0xff12601bffff0000) have just one node in them and represent an ipv6 address. Could you turn off ipv6 with the latest IPoIB? In a bind, Ira From weiny2 at llnl.gov Fri Jan 11 22:04:56 2008 From: weiny2 at llnl.gov (Ira Weiny) Date: Fri, 11 Jan 2008 22:04:56 -0800 Subject: [ofa-general] IPoIB, OFED 1.2.5, and multicast groups. In-Reply-To: <20080111193657.58477fb0.weiny2@llnl.gov> References: <20080111193657.58477fb0.weiny2@llnl.gov> Message-ID: <20080111220456.0d62de97.weiny2@llnl.gov> Ok, I found my own answer. Sorry for the spam. 
http://lists.openfabrics.org/pipermail/general/2006-November/029617.html Sorry, Ira On Fri, 11 Jan 2008 19:36:57 -0800 Ira Weiny wrote: > I don't really understand the innerworkings of IPoIB so forgive me if this is a > really stupid question but: > > Is it a bug that there is a Multicast group created for every node in our > clusters? > > If not a bug why is this done? We just tried to boot on a 1151 node cluster > and opensm is complaining there are not enough multicast groups. > > Jan 11 18:30:42 728984 [40C05960] -> __get_new_mlid: ERR 1B23: All available:1024 mlids are taken > Jan 11 18:30:42 729050 [40C05960] -> osm_mcmr_rcv_create_new_mgrp: ERR 1B19: __get_new_mlid failed > Jan 11 18:30:42 730647 [40401960] -> __get_new_mlid: ERR 1B23: All available:1024 mlids are taken > Jan 11 18:30:42 730691 [40401960] -> osm_mcmr_rcv_create_new_mgrp: ERR 1B19: __get_new_mlid failed > > > Here is the output from my small test cluster: (ibnodesinmcast uses saquery a > couple of times to print this nice report.) 
> > > 19:17:24 > whatsup > up: 9: wopr[0-7],wopri > down: 0: > root at wopri:/tftpboot/images > 19:25:03 > ibnodesinmcast -g > 0xC000 (0xff12401bffff0000 : 0x00000000ffffffff) > In 9: wopr[0-7],wopri > Out 0: 0 > 0xC001 (0xff12401bffff0000 : 0x0000000000000001) > In 9: wopr[0-7],wopri > Out 0: 0 > 0xC002 (0xff12601bffff0000 : 0x00000001ff2265ed) > In 1: wopr3 > Out 8: wopr[0-2,4-7],wopri > 0xC003 (0xff12601bffff0000 : 0x0000000000000001) > In 9: wopr[0-7],wopri > Out 0: 0 > 0xC004 (0xff12601bffff0000 : 0x00000001ff222729) > In 1: wopr4 > Out 8: wopr[0-3,5-7],wopri > 0xC005 (0xff12601bffff0000 : 0x00000001ff219e65) > In 1: wopri > Out 8: wopr[0-7] > 0xC006 (0xff12601bffff0000 : 0x00000001ff00232d) > In 1: wopr6 > Out 8: wopr[0-5,7],wopri > 0xC007 (0xff12601bffff0000 : 0x00000001ff002325) > In 1: wopr7 > Out 8: wopr[0-6],wopri > 0xC008 (0xff12601bffff0000 : 0x00000001ff228d35) > In 1: wopr1 > Out 8: wopr[0,2-7],wopri > 0xC009 (0xff12601bffff0000 : 0x00000001ff2227f1) > In 1: wopr2 > Out 8: wopr[0-1,3-7],wopri > 0xC00A (0xff12601bffff0000 : 0x00000001ff219ef1) > In 1: wopr0 > Out 8: wopr[1-7],wopri > 0xC00B (0xff12601bffff0000 : 0x00000001ff0021e9) > In 1: wopr5 > Out 8: wopr[0-4,6-7],wopri > > > Each of these MGIDS of the prefix (0xff12601bffff0000) have just one node in > them and represent an ipv6 address. Could you turn off ipv6 with the latest > IPoIB? > > In a bind, > Ira > _______________________________________________ > general mailing list > general at lists.openfabrics.org > http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general > > To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general From weiny2 at llnl.gov Sat Jan 12 00:01:17 2008 From: weiny2 at llnl.gov (Ira Weiny) Date: Sat, 12 Jan 2008 00:01:17 -0800 Subject: [PATCH] opensm: Special Case the IPv6 Solicited Node Multicast address to use a single Mcast (WAS: Re: [ofa-general] IPoIB, OFED 1.2.5, and multicast groups.) 
In-Reply-To: <20080111220456.0d62de97.weiny2@llnl.gov> References: <20080111193657.58477fb0.weiny2@llnl.gov> <20080111220456.0d62de97.weiny2@llnl.gov> Message-ID: <20080112000117.6b52b53c.weiny2@llnl.gov> And to further answer my question...[*] This seems to fix the problem for us, however I know that it could be better. For example it only takes care of partition 0xFFFF, and I think Jason's idea of having say 16 Mcast Groups and some hash of these into them would be nice. But is this on the right track? Am I missing some other place in the code? Thanks, Ira [*] Again I apologize for the spam but we were in a bit of a panic as we only have the big system for the weekend and IB was not part of the test... ;-) >From 35e35a9534bd49147886ac93ab1601acadcdbe26 Mon Sep 17 00:00:00 2001 From: Ira K. Weiny Date: Fri, 11 Jan 2008 22:58:19 -0800 Subject: [PATCH] Special Case the IPv6 Solicited Node Multicast address to use a single Mcast Group. Signed-off-by: root --- opensm/opensm/osm_sa_mcmember_record.c | 30 +++++++++++++++++++++++++++++- opensm/opensm/osm_sa_path_record.c | 31 ++++++++++++++++++++++++++++++- 2 files changed, 59 insertions(+), 2 deletions(-) diff --git a/opensm/opensm/osm_sa_mcmember_record.c b/opensm/opensm/osm_sa_mcmember_record.c index 8eb97ad..6bcc124 100644 --- a/opensm/opensm/osm_sa_mcmember_record.c +++ b/opensm/opensm/osm_sa_mcmember_record.c @@ -124,9 +124,37 @@ __search_mgrp_by_mgid(IN cl_map_item_t * const p_map_item, IN void *context) /* compare entire MGID so different scope will not sneak in for the same MGID */ if (memcmp(&p_mgrp->mcmember_rec.mgid, - &p_recvd_mcmember_rec->mgid, sizeof(ib_gid_t))) + &p_recvd_mcmember_rec->mgid, sizeof(ib_gid_t))) { + + /* Special Case IPV6 Multicast Loopback addresses */ + /* 0xff12601bffff0000 : 0x00000001ffXXXXXX */ +#define SPEC_PREFIX (0xff12601bffff0000) +#define INT_ID_MASK (0x00000001ff000000) + uint64_t g_prefix = cl_ntoh64(p_mgrp->mcmember_rec.mgid.unicast.prefix); + uint64_t g_interface_id = 
cl_ntoh64(p_mgrp->mcmember_rec.mgid.unicast.interface_id); + uint64_t rcv_prefix = cl_ntoh64(p_recvd_mcmember_rec->mgid.unicast.prefix); + uint64_t rcv_interface_id = cl_ntoh64(p_recvd_mcmember_rec->mgid.unicast.interface_id); + + if (rcv_prefix == SPEC_PREFIX + && + (rcv_interface_id & INT_ID_MASK) == INT_ID_MASK) { + + if ((g_prefix == rcv_prefix) + && + (g_interface_id & INT_ID_MASK) == + (rcv_interface_id & INT_ID_MASK) + ) { + osm_log(sa->p_log, OSM_LOG_INFO, + "Special Case Mcast Join for MGID " + " MGID 0x%016"PRIx64" : 0x%016"PRIx64"\n", + rcv_prefix, rcv_interface_id); + goto match; + } + } return; + } +match: if (p_ctxt->p_mgrp) { osm_log(sa->p_log, OSM_LOG_ERROR, "__search_mgrp_by_mgid: ERR 1B03: " diff --git a/opensm/opensm/osm_sa_path_record.c b/opensm/opensm/osm_sa_path_record.c index 749a936..469773a 100644 --- a/opensm/opensm/osm_sa_path_record.c +++ b/opensm/opensm/osm_sa_path_record.c @@ -1536,8 +1536,37 @@ __search_mgrp_by_mgid(IN cl_map_item_t * const p_map_item, IN void *context) /* compare entire MGID so different scope will not sneak in for the same MGID */ - if (memcmp(&p_mgrp->mcmember_rec.mgid, p_recvd_mgid, sizeof(ib_gid_t))) + if (memcmp(&p_mgrp->mcmember_rec.mgid, p_recvd_mgid, sizeof(ib_gid_t))) { + + /* Special Case IPV6 Multicast Loopback addresses */ + /* 0xff12601bffff0000 : 0x00000001ffXXXXXX */ +#define SPEC_PREFIX (0xff12601bffff0000) +#define INT_ID_MASK (0x00000001ff000000) + uint64_t g_prefix = cl_ntoh64(p_mgrp->mcmember_rec.mgid.unicast.prefix); + uint64_t g_interface_id = cl_ntoh64(p_mgrp->mcmember_rec.mgid.unicast.interface_id); + uint64_t rcv_prefix = cl_ntoh64(p_recvd_mgid->unicast.prefix); + uint64_t rcv_interface_id = cl_ntoh64(p_recvd_mgid->unicast.interface_id); + + if (rcv_prefix == SPEC_PREFIX + && + (rcv_interface_id & INT_ID_MASK) == INT_ID_MASK) { + + if ((g_prefix == rcv_prefix) + && + (g_interface_id & INT_ID_MASK) == + (rcv_interface_id & INT_ID_MASK) + ) { + osm_log(sa->p_log, OSM_LOG_INFO, + "Special Case 
Mcast Join for MGID " + " MGID 0x%016"PRIx64" : 0x%016"PRIx64"\n", + rcv_prefix, rcv_interface_id); + goto match; + } + } return; + } + +match: #if 0 for (i = 0; -- 1.5.1 On Fri, 11 Jan 2008 22:04:56 -0800 Ira Weiny wrote: > Ok, > > I found my own answer. Sorry for the spam. > > http://lists.openfabrics.org/pipermail/general/2006-November/029617.html > > Sorry, > Ira > > > On Fri, 11 Jan 2008 19:36:57 -0800 > Ira Weiny wrote: > > > I don't really understand the innerworkings of IPoIB so forgive me if this is a > > really stupid question but: > > > > Is it a bug that there is a Multicast group created for every node in our > > clusters? > > > > If not a bug why is this done? We just tried to boot on a 1151 node cluster > > and opensm is complaining there are not enough multicast groups. > > > > Jan 11 18:30:42 728984 [40C05960] -> __get_new_mlid: ERR 1B23: All available:1024 mlids are taken > > Jan 11 18:30:42 729050 [40C05960] -> osm_mcmr_rcv_create_new_mgrp: ERR 1B19: __get_new_mlid failed > > Jan 11 18:30:42 730647 [40401960] -> __get_new_mlid: ERR 1B23: All available:1024 mlids are taken > > Jan 11 18:30:42 730691 [40401960] -> osm_mcmr_rcv_create_new_mgrp: ERR 1B19: __get_new_mlid failed > > > > > > Here is the output from my small test cluster: (ibnodesinmcast uses saquery a > > couple of times to print this nice report.) 
> > > > > > 19:17:24 > whatsup > > up: 9: wopr[0-7],wopri > > down: 0: > > root at wopri:/tftpboot/images > > 19:25:03 > ibnodesinmcast -g > > 0xC000 (0xff12401bffff0000 : 0x00000000ffffffff) > > In 9: wopr[0-7],wopri > > Out 0: 0 > > 0xC001 (0xff12401bffff0000 : 0x0000000000000001) > > In 9: wopr[0-7],wopri > > Out 0: 0 > > 0xC002 (0xff12601bffff0000 : 0x00000001ff2265ed) > > In 1: wopr3 > > Out 8: wopr[0-2,4-7],wopri > > 0xC003 (0xff12601bffff0000 : 0x0000000000000001) > > In 9: wopr[0-7],wopri > > Out 0: 0 > > 0xC004 (0xff12601bffff0000 : 0x00000001ff222729) > > In 1: wopr4 > > Out 8: wopr[0-3,5-7],wopri > > 0xC005 (0xff12601bffff0000 : 0x00000001ff219e65) > > In 1: wopri > > Out 8: wopr[0-7] > > 0xC006 (0xff12601bffff0000 : 0x00000001ff00232d) > > In 1: wopr6 > > Out 8: wopr[0-5,7],wopri > > 0xC007 (0xff12601bffff0000 : 0x00000001ff002325) > > In 1: wopr7 > > Out 8: wopr[0-6],wopri > > 0xC008 (0xff12601bffff0000 : 0x00000001ff228d35) > > In 1: wopr1 > > Out 8: wopr[0,2-7],wopri > > 0xC009 (0xff12601bffff0000 : 0x00000001ff2227f1) > > In 1: wopr2 > > Out 8: wopr[0-1,3-7],wopri > > 0xC00A (0xff12601bffff0000 : 0x00000001ff219ef1) > > In 1: wopr0 > > Out 8: wopr[1-7],wopri > > 0xC00B (0xff12601bffff0000 : 0x00000001ff0021e9) > > In 1: wopr5 > > Out 8: wopr[0-4,6-7],wopri > > > > > > Each of these MGIDS of the prefix (0xff12601bffff0000) have just one node in > > them and represent an ipv6 address. Could you turn off ipv6 with the latest > > IPoIB? > > > > In a bind, > > Ira > > _______________________________________________ > > general mailing list > > general at lists.openfabrics.org > > http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general > > > > To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general -------------- next part -------------- A non-text attachment was scrubbed... 
Name: 0001-Special-Case-the-IPv6-Solicited-Node-Multicast-addre.patch Type: application/octet-stream Size: 3637 bytes Desc: not available URL: From vst at vlnb.net Sat Jan 12 01:51:27 2008 From: vst at vlnb.net (Vladislav Bolkhovitin) Date: Sat, 12 Jan 2008 12:51:27 +0300 Subject: [ofa-general] [PATCH][REPOST] drivers/infiniband/ulp/srpt: Fix target data corruption In-Reply-To: <4787F142.mailGZ011OYG6@systemfabricworks.com> References: <4787F142.mailGZ011OYG6@systemfabricworks.com> Message-ID: <47888D9F.3020905@vlnb.net> davem at systemfabricworks.com wrote: > This is an updated version of [PATCH] drivers/infiniband/ulp/srpt: Fix > target data corruption > > It was pointed out to me that the code to round up to a power of 2 was > not as clean as it should be, plus I extracted two unrelated patches and > submitted them separately. > > ===================================================================== > > Change the local buffer allocator to use a spin-lock protected linked > list instead of an array of atomic_t used/free variables. The atomic_t > code was open to a multi-thread race between test and set. This has > been observed with the result that the same data buffer was used for > more than one SCSI operation, either writing the wrong data to the disk > or sending the wrong data to the initiator. I, as a main SCST developer and implementor, would suggest to completely remove internal memory management from the SRPT driver and use SCST memory management instead. It will provide the following advantages: 1. Simplify SRPT driver and completely remove such kind of bugs. 2. Make SRPT target driver compatible with scst_user module, i.e. will allow to use SRPT target driver with backstorage devices, implemented in user space. Usual example of such devices is a VTL (Virtual Tape Library). 3. 
(Most likely, since I'm not too familiar with SRPT drivers internals, but for me it looks like so) Allow SRPT driver to reliably work with many outstanding commands with big data transfer sizes (>=1MB) 4. Might improve performance by caching and reusing already allocated and "iomaped" to Infiniband hardware SG vectors. Vu knows the details, we discussed them with him. It will require some minor SCST modifications (extending its interface with target drivers), but I'm willing to make them if somebody ask for it. Vlad From vlad at lists.openfabrics.org Sat Jan 12 03:08:03 2008 From: vlad at lists.openfabrics.org (Vladimir Sokolovsky Mellanox) Date: Sat, 12 Jan 2008 03:08:03 -0800 (PST) Subject: [ofa-general] ofa_1_3_kernel 20080112-0200 daily build status Message-ID: <20080112110803.BE168E601CA@openfabrics.org> This email was generated automatically, please do not reply git_url: git://git.openfabrics.org/ofed_1_3/linux-2.6.git git_branch: ofed_kernel Common build parameters: --with-ipoib-mod --with-sdp-mod --with-srp-mod --with-user_mad-mod --with-user_access-mod --with-mthca-mod --with-mlx4-mod --with-core-mod --with-addr_trans-mod --with-rds-mod --with-cxgb3-mod --with-nes-mod Passed: Passed on i686 with 2.6.15-23-server Passed on i686 with linux-2.6.21.1 Passed on i686 with linux-2.6.22 Passed on i686 with linux-2.6.18 Passed on i686 with linux-2.6.17 Passed on i686 with linux-2.6.16 Passed on i686 with linux-2.6.13 Passed on i686 with linux-2.6.19 Passed on i686 with linux-2.6.14 Passed on i686 with linux-2.6.15 Passed on i686 with linux-2.6.12 Passed on x86_64 with linux-2.6.22.5-31-default Passed on ia64 with linux-2.6.22 Passed on x86_64 with linux-2.6.22 Passed on x86_64 with linux-2.6.19 Passed on ia64 with linux-2.6.23 Passed on x86_64 with linux-2.6.16 Passed on ia64 with linux-2.6.18 Passed on x86_64 with linux-2.6.21.1 Passed on x86_64 with linux-2.6.20 Passed on ppc64 with linux-2.6.15 Passed on ia64 with linux-2.6.19 Passed on ia64 with linux-2.6.21.1 
Passed on x86_64 with linux-2.6.18 Passed on ppc64 with linux-2.6.19 Passed on ia64 with linux-2.6.17 Passed on x86_64 with linux-2.6.9-55.ELsmp Passed on x86_64 with linux-2.6.17 Passed on x86_64 with linux-2.6.18-53.el5 Passed on ia64 with linux-2.6.14 Passed on x86_64 with linux-2.6.18-1.2798.fc6 Passed on ppc64 with linux-2.6.12 Passed on powerpc with linux-2.6.13 Passed on x86_64 with linux-2.6.9-42.ELsmp Passed on powerpc with linux-2.6.14 Passed on ppc64 with linux-2.6.14 Passed on ia64 with linux-2.6.12 Passed on x86_64 with linux-2.6.15 Passed on ia64 with linux-2.6.13 Passed on x86_64 with linux-2.6.14 Passed on x86_64 with linux-2.6.12 Passed on x86_64 with linux-2.6.13 Passed on powerpc with linux-2.6.15 Passed on powerpc with linux-2.6.12 Passed on ppc64 with linux-2.6.16 Passed on ia64 with linux-2.6.16 Passed on ppc64 with linux-2.6.18-8.el5 Passed on x86_64 with linux-2.6.16.43-0.3-smp Passed on ia64 with linux-2.6.15 Passed on x86_64 with linux-2.6.16.21-0.8-smp Passed on ppc64 with linux-2.6.17 Passed on ppc64 with linux-2.6.13 Passed on ia64 with linux-2.6.16.21-0.8-default Passed on x86_64 with linux-2.6.18-8.el5 Passed on ppc64 with linux-2.6.18 Failed:
From rpearson at systemfabricworks.com Sat Jan 12 10:41:22 2008 From: rpearson at systemfabricworks.com (Robert Pearson) Date: Sat, 12 Jan 2008 12:41:22 -0600 Subject: [ofa-general] [PATCH][REPOST] drivers/infiniband/ulp/srpt: Fixtarget data corruption In-Reply-To: <47888D9F.3020905@vlnb.net> Message-ID: <6grf06$3258oc@rrcs-agw-02.hrndva.rr.com> Vlad, I think we agree. But, when we tried the experiment of running without the local memory allocator scst hung when we did large IO operations. Probably something simple. We can look harder. Next step for us is to sync up with Vu on a few other changes in the works. Bob Pearson -----Original Message----- From: general-bounces at lists.openfabrics.org [mailto:general-bounces at lists.openfabrics.org] On Behalf Of Vladislav Bolkhovitin Sent: Saturday, January 12, 2008 3:51 AM To: davem at systemfabricworks.com Cc: vu at mellanox.com; general at lists.openfabrics.org Subject: Re: [ofa-general] [PATCH][REPOST] drivers/infiniband/ulp/srpt: Fixtarget data corruption davem at systemfabricworks.com wrote: > This is an updated version of [PATCH] drivers/infiniband/ulp/srpt: Fix > target data corruption > > It was pointed out to me that the code to round up to a power of 2 was > not as clean as it should be, plus I extracted two unrelated patches and > submitted them separately. > > ===================================================================== > > Change the local buffer allocator to use a spin-lock protected linked > list instead of an array of atomic_t used/free variables. The atomic_t > code was open to a multi-thread race between test and set. This has > been observed with the result that the same data buffer was used for > more than one SCSI operation, either writing the wrong data to the disk > or sending the wrong data to the initiator. I, as a main SCST developer and implementor, would suggest to completely remove internal memory management from the SRPT driver and use SCST memory management instead. 
It will provide the following advantages: 1. Simplify SRPT driver and completely remove such kind of bugs. 2. Make SRPT target driver compatible with scst_user module, i.e. will allow to use SRPT target driver with backstorage devices, implemented in user space. Usual example of such devices is a VTL (Virtual Tape Library). 3. (Most likely, since I'm not too familiar with SRPT drivers internals, but for me it looks like so) Allow SRPT driver to reliably work with many outstanding commands with big data transfer sizes (>=1MB) 4. Might improve performance by caching and reusing already allocated and "iomaped" to Infiniband hardware SG vectors. Vu knows the details, we discussed them with him. It will require some minor SCST modifications (extending its interface with target drivers), but I'm willing to make them if somebody ask for it. Vlad _______________________________________________ general mailing list general at lists.openfabrics.org http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general From kliteyn at mellanox.co.il Sat Jan 12 17:27:21 2008 From: kliteyn at mellanox.co.il (kliteyn at mellanox.co.il) Date: 13 Jan 2008 03:27:21 +0200 Subject: [ofa-general] nightly osm_sim report 2008-01-13:normal completion Message-ID: OSM Simulation Regression Summary [Generated mail - please do NOT reply] OpenSM binary date = 2008-01-12 OpenSM git rev = Thu_Jan_10_03:48:16_2008 [7bb2045bd9f659f8466a4494f4ec983f0edbf96a] ibutils git rev = Mon_Dec_24_10:42:01_2007 [675bec82306d6920555dd0b5e2f664983e27e60f] Total=400 Pass=399 Fail=1 Pass: 30 Stability IS1-16.topo 30 Pkey IS1-16.topo 30 OsmTest IS1-16.topo 30 OsmStress IS1-16.topo 30 Multicast IS1-16.topo 30 LidMgr IS1-16.topo 10 Stability IS3-loop.topo 10 Stability IS3-128.topo 10 Pkey IS3-128.topo 10 OsmTest IS3-loop.topo 10 OsmTest IS3-128.topo 10 OsmStress IS3-128.topo 10 Multicast IS3-loop.topo 10 Multicast IS3-128.topo 10 FatTree 
merge-roots-4-ary-2-tree.topo 10 FatTree merge-root-4-ary-3-tree.topo 10 FatTree gnu-stallion-64.topo 10 FatTree blend-4-ary-2-tree.topo 10 FatTree RhinoDDR.topo 10 FatTree FullGnu.topo 10 FatTree 4-ary-2-tree.topo 10 FatTree 2-ary-4-tree.topo 10 FatTree 12-node-spaced.topo 10 FTreeFail 4-ary-2-tree-missing-sw-link.topo 10 FTreeFail 4-ary-2-tree-links-at-same-rank-2.topo 10 FTreeFail 4-ary-2-tree-links-at-same-rank-1.topo 10 FTreeFail 4-ary-2-tree-diff-num-pgroups.topo 9 LidMgr IS3-128.topo Failures: 1 LidMgr IS3-128.topo From dotanb at dev.mellanox.co.il Sat Jan 12 23:04:47 2008 From: dotanb at dev.mellanox.co.il (Dotan Barak) Date: Sun, 13 Jan 2008 09:04:47 +0200 Subject: [ofa-general] retry exceeded problem with rdma_read In-Reply-To: <3307cdf90801101121u39f4fa29l5673ea4c3ffe790c@mail.gmail.com> References: <3307cdf90801091434q4298cab0sf8e670c21087afad@mail.gmail.com> <4785B822.80306@dev.mellanox.co.il> <3307cdf90801101121u39f4fa29l5673ea4c3ffe790c@mail.gmail.com> Message-ID: <4789B80F.7010900@dev.mellanox.co.il> Rajouri Jammu wrote: > I have the following set both on rdma_connect as well as at rdma_accept. > > conn_param.responder_resources = 4; > conn_param.initiator_depth = 4; > > Should initiator_depth be lower for better behavior? > Higher values for these attributes mean that more outstanding RDMA Read/Atomic WRs can be handled. It doesn't matter which value you put in initiator_depth (for example: 1 or 4) as long as it is in sync with the responder_resources value. Dotan > On Jan 9, 2008 10:16 PM, Dotan Barak < dotanb at dev.mellanox.co.il > > wrote: > > Rajouri Jammu wrote: > > Occasionally, I'm getting a retry exceeded error on the qp > (error 12) > > when doing rdma_reads. > > > > Under what conditions would thins kind of problem happen? > > > > I have the retry_count = 5 and 'am using rdma_cm for all the > > connection setup. > > > > OFED version is 1.2.5 > Does it happen between different HCAs? 
> > If this happens during working with the QPs (not in the first message) > than check the following thing: > > If the QP attributes values of max_rd_atomic and max_dest_rd_atomic > this may happen. > > The values should be (for sides A and B): > A.max_rd_atomic <= B.max_dest_rd_atomic > A.max_dest_rd_atomic >= B.max_rd_atomic > > (which means that RDMA Reads/atomic as initiator shouldn't be larger > than the supported value as the destination) > > You can check it by query the used QP and verify those values. > > > > If it happens at the beginning of the connection, there may be other > problem and i need more info .... > > Dotan > > From dotanb at dev.mellanox.co.il Sat Jan 12 23:44:23 2008 From: dotanb at dev.mellanox.co.il (Dotan Barak) Date: Sun, 13 Jan 2008 09:44:23 +0200 Subject: [ofa-general] Verbs questions... In-Reply-To: <0B877605F0F36F45A439AF0DEEA18CCF0DC519CC@njpcmg1exms305.leh.lbcorp.lehman.com> References: <0B877605F0F36F45A439AF0DEEA18CCF0DC519CC@njpcmg1exms305.leh.lbcorp.lehman.com> Message-ID: <4789C157.9080803@dev.mellanox.co.il> All of the questions that you are asking can be answered by the IB spec. But i will try to do my best to answer anyway... Kalbag, Ashwin wrote: > Question 1: > Here's a section of the man page for ibv_post_send: > "The attribute send_flags describes the properties of the WR. It is > either 0 or the bitwise OR of one or more of the following flags: > IBV_SEND_FENCE Set the fence indicator. Valid only for QPs with > Transport Service Type IBV_QPT_RC > IBV_SEND_SIGNALED Set the completion notification indicator. > Relevant only if QP was created with sq_sig_all=0 > IBV_SEND_SOLICITED Set the solicited event indicator. Valid only > for Send and RDMA Write with immediate > IBV_SEND_INLINE Send data in given gather list as inline data > in a send WQE. Valid only for Send and RDMA Write. The > L_Key will not be checked." > > a. What is the fence indicator? Under what circumstances would I use > this? 
> Here is the line from the IB spec describing the use of the fence indicator: "When the Fence Indicator has been set in a Work Request, the Send Queue shall not begin processing that Work Request until all prior RDMA Read and Atomic Operations on that Send Queue have completed." > b. How is a solicited event different from a signaled event? Under what > circumstances would I use this? > Signaled affects the requester (sender) and solicited affects the responder (receiver). There isn't any connection between them, and they can be used alone (only one of them set) or together. Signaled means that a completion will be created when the SR finishes (if the QP was created with sq_sig_all=0, only an SR posted with the SIGNALED indicator will create a completion when it ends; if sq_sig_all=1, EVERY SR will create a completion when it ends). Solicited means that a special event, called a solicited event, will occur on the OTHER side, which can wait for a completion event with the solicited bit set (after ibv_req_notify_cq(cq, 1), the completion event is raised once such a completion is created). Using the solicited bit, the sender can influence when the receiver side will handle the received completions. > c. What is not apparent from this man page is whether the signal is > generated on the sending side on send completion or on the receiving > side on completion of the corresponding posted recv. It's not > explicitly stated, but I am assuming that the signaling refers to send > completion on the sending side. Is it true that regardless of whether > the send is signaled on the sending side, it will generate a signal on > the receiving side when the recv operation completes? > I don't quite understand your question, but I think that the answers I gave above cover this. If not, can you please rephrase the question? > Question 2: > Say I'm trying to optimize between polling and completion event > notification. 
Could you please see whether I'm conceptualizing this > correctly? > > At one extreme, you could poll continuously (without signaling sends), > until you want to send the next message. The polling would take care of > recycling memory region elements on send completions, handling received > messages for recv completions. At the other extreme, every event would > be signaled. You could set sg_sig_all=1 while creating the completion > queue, or alternatively always use the IBV_SEND_SIGNALED flag when > posting sends. > FYI: The term is every "Send Request". And yes, completion of a WR (send or receive request) means that the operation was completed and you can reclaim the buffers that it accessed and reuse/free them. If sq_sig_all=0, you can arrange that completions are not created for every WR. > To take the middle road, one possibility is to generate completions for > every nth send. If the messages are being sent rapidly, you can afford > to signal fewer sends. If sends are fewer, you would need to either > signal more sends, say every send, or compensate by expending CPU in > polling. Seems like the signaling needs to be adaptive to the rate of > sending. > You are right, this is what this mechanism is all about. > If this is the case, you still have a choice in how you process send > completions, assuming that signals will always be generated on the > receiving side. On the receiver side, completions will be created for EVERY WR. > Even if you didn't signal every send, you could still > process send completions by relying exclusively on signaling instead of > polling. If you did this, you'd need to signal at least once per "send > queue depth", correct? There is no urgency in processing send > completions so long as you have some available depth in the send queue > and elements to post sends with. But there is urgency in processing > recv completions. 
So, it may be more optimal to process send > completions with fewer signals, say 2 signals per send queue depth, for > a margin of safety, so at any time, you will have half the send queue > available. This asymmetry would imply it is better to use separate > completion queues for sending and receiving. Am I right in surmising > this? > On the receiver side you get a completion for every WR because you don't know when a message will arrive, but on the sender side you know when you send the message, so you don't have to create a completion for every SR. You can separate the Send CQ and the Receive CQ (for example: one of them can be used with polling and the other with completion events). This depends on the way you write your code. Maybe handling one CQ will be easier for you ... > On the recv side, you want to process completions in a hurry and post > more recvs, to keep up with the incoming message traffic. Now here's > where you might benefit from polling even after your recv event > completion was signaled, so you avoid context switching out and back in, > and the concomitant delay, just in case there are actually more recvs > that completed. Is this correct? > This is InfiniBand; you don't have any context switch in the data operations. Posting an SR and polling for completions are executed in the context you are in. You are right that incoming messages need higher priority than messages that you send. I hope that I helped a little bit. If there is any point that I missed, you are welcome to ask again. Dotan 
From jackm at dev.mellanox.co.il Sun Jan 13 00:01:04 2008 From: jackm at dev.mellanox.co.il (Jack Morgenstein) Date: Sun, 13 Jan 2008 10:01:04 +0200 Subject: [ofa-general] Re: [PATCH] libmlx4: avoid memcpy in blueflame post_sends In-Reply-To: References: <200801091223.14155.jackm@dev.mellanox.co.il> <200801100943.48907.jackm@dev.mellanox.co.il> Message-ID: <200801131001.04353.jackm@dev.mellanox.co.il> On Thursday 10 January 2008 20:08, Roland Dreier wrote: > > However, your solution still results in a procedure call (mlx4_bf_copy > > is compiled as a procedure using gcc 4.1.0 on an X86_64 host, even if > > I add "inline"). > > Can you give more detail on the platform and how you compiled? I > can't reproduce it with gcc 4.1.3 here. Are you compiling with > optimization enabled? 
Are other things like set_atomic_seg() getting > inlined properly? > > > I would prefer the patch below (which does generate inline code, and does the > > (sizeof(unsigned long) * 2) calculation just once). > > Dividing by 2 * sizeof (long) seems to generate slightly worse code > for me. Since sizeof (long) is a compile time constant, in my version > the compiler just generates a sub $10, while in your version there is > a sub $1 instead (which costs the same) plus an extra right shift at > the beginning of the loop. > > - R. Your implementation is the better one. I did not notice at the time, but I had evidently (a long time ago) defined CFLAGS in the local bash environment to just an include path. The result was that this local env variable was used (instead of the correct one) when the makefile was generated at installation time -- so the -O2 flag was absent. Once I corrected this, the code from your patch was properly generated, and I could see that the result was better than the patch I proposed. Let's go with your patch. - Jack From eli at dev.mellanox.co.il Sun Jan 13 00:05:26 2008 From: eli at dev.mellanox.co.il (Eli Cohen) Date: Sun, 13 Jan 2008 10:05:26 +0200 Subject: [ofa-general] IPoIB, OFED 1.2.5, and multicast groups. In-Reply-To: <20080111193657.58477fb0.weiny2@llnl.gov> References: <20080111193657.58477fb0.weiny2@llnl.gov> Message-ID: <1200211526.11174.128.camel@mtls03> IPoIB does not initiate a join to a multicast group (except for the broadcast group). Such joins come from routing protocols or user space sockets. Do you run processes that use many different multicast groups? On Fri, 2008-01-11 at 19:36 -0800, Ira Weiny wrote: > I don't really understand the innerworkings of IPoIB so forgive me if this is a > really stupid question but: > > Is it a bug that there is a Multicast group created for every node in our > clusters? > > If not a bug why is this done? 
We just tried to boot on a 1151 node cluster > and opensm is complaining there are not enough multicast groups. > > Jan 11 18:30:42 728984 [40C05960] -> __get_new_mlid: ERR 1B23: All available:1024 mlids are taken > Jan 11 18:30:42 729050 [40C05960] -> osm_mcmr_rcv_create_new_mgrp: ERR 1B19: __get_new_mlid failed > Jan 11 18:30:42 730647 [40401960] -> __get_new_mlid: ERR 1B23: All available:1024 mlids are taken > Jan 11 18:30:42 730691 [40401960] -> osm_mcmr_rcv_create_new_mgrp: ERR 1B19: __get_new_mlid failed > > > Here is the output from my small test cluster: (ibnodesinmcast uses saquery a > couple of times to print this nice report.) > > > 19:17:24 > whatsup > up: 9: wopr[0-7],wopri > down: 0: > root at wopri:/tftpboot/images > 19:25:03 > ibnodesinmcast -g > 0xC000 (0xff12401bffff0000 : 0x00000000ffffffff) > In 9: wopr[0-7],wopri > Out 0: 0 > 0xC001 (0xff12401bffff0000 : 0x0000000000000001) > In 9: wopr[0-7],wopri > Out 0: 0 > 0xC002 (0xff12601bffff0000 : 0x00000001ff2265ed) > In 1: wopr3 > Out 8: wopr[0-2,4-7],wopri > 0xC003 (0xff12601bffff0000 : 0x0000000000000001) > In 9: wopr[0-7],wopri > Out 0: 0 > 0xC004 (0xff12601bffff0000 : 0x00000001ff222729) > In 1: wopr4 > Out 8: wopr[0-3,5-7],wopri > 0xC005 (0xff12601bffff0000 : 0x00000001ff219e65) > In 1: wopri > Out 8: wopr[0-7] > 0xC006 (0xff12601bffff0000 : 0x00000001ff00232d) > In 1: wopr6 > Out 8: wopr[0-5,7],wopri > 0xC007 (0xff12601bffff0000 : 0x00000001ff002325) > In 1: wopr7 > Out 8: wopr[0-6],wopri > 0xC008 (0xff12601bffff0000 : 0x00000001ff228d35) > In 1: wopr1 > Out 8: wopr[0,2-7],wopri > 0xC009 (0xff12601bffff0000 : 0x00000001ff2227f1) > In 1: wopr2 > Out 8: wopr[0-1,3-7],wopri > 0xC00A (0xff12601bffff0000 : 0x00000001ff219ef1) > In 1: wopr0 > Out 8: wopr[1-7],wopri > 0xC00B (0xff12601bffff0000 : 0x00000001ff0021e9) > In 1: wopr5 > Out 8: wopr[0-4,6-7],wopri > > > Each of these MGIDS of the prefix (0xff12601bffff0000) have just one node in > them and represent an ipv6 address. 
Could you turn off ipv6 with the latest > IPoIB? > > In a bind, > Ira > _______________________________________________ > general mailing list > general at lists.openfabrics.org > http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general > > To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general From sashak at voltaire.com Sun Jan 13 00:21:04 2008 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Sun, 13 Jan 2008 08:21:04 +0000 Subject: [ofa-general] Re: [PATCH] opensm/osm_sa_slvl_record: fix overflow crash In-Reply-To: <1199978685.3611.109.camel@hrosenstock-ws.xsigo.com> References: <20080109194153.GH20963@sashak.voltaire.com> <1199978685.3611.109.camel@hrosenstock-ws.xsigo.com> Message-ID: <20080113082104.GB1903@sashak.voltaire.com> Hi Hal, On 07:24 Thu 10 Jan , Hal Rosenstock wrote: > On Wed, 2008-01-09 at 19:41 +0000, Sasha Khapyorsky wrote: > > When SL2VLTableRecord is requested for switch by lid only (no in and out > > ports are selected in compmask) it overflows its own physical ports > > table. 
> > > > Signed-off-by: Sasha Khapyorsky > > --- > > opensm/opensm/osm_sa_slvl_record.c | 4 ++-- > > 1 files changed, 2 insertions(+), 2 deletions(-) > > > > diff --git a/opensm/opensm/osm_sa_slvl_record.c b/opensm/opensm/osm_sa_slvl_record.c > > index 28dddd4..cc21765 100644 > > --- a/opensm/opensm/osm_sa_slvl_record.c > > +++ b/opensm/opensm/osm_sa_slvl_record.c > > @@ -149,9 +149,9 @@ __osm_sa_slvl_by_comp_mask(IN osm_sa_t * sa, > > comp_mask = p_ctxt->comp_mask; > > num_ports = osm_node_get_num_physp(p_port->p_node); > > in_port_start = 0; > > - in_port_end = num_ports; > > + in_port_end = num_ports - 1; > > out_port_start = 0; > > - out_port_end = num_ports; > > + out_port_end = num_ports - 1; > > p_req_physp = p_ctxt->p_req_physp; > > > > if (p_port->p_node->node_info.node_type != IB_NODE_TYPE_SWITCH) { > > Minor comment: > Rather than subtracting 1 from in/out_port_end, wouldn't changing the > input and output port number comparisons to < rather than <= work (and > be more consistent with other SA record handling) ? In this particular SA processor, merely using '<' instead of '<=' would break the case where the in or out port is requested in the comp mask. Sasha From sashak at voltaire.com Sun Jan 13 00:40:01 2008 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Sun, 13 Jan 2008 08:40:01 +0000 Subject: [ofa-general] opensm dumps core when using LASH for routing In-Reply-To: <18311.19963.735177.83038@kuku.melbourne.sgi.com> References: <18311.19963.735177.83038@kuku.melbourne.sgi.com> Message-ID: <20080113084001.GC1903@sashak.voltaire.com> Hi Max, On 22:07 Fri 11 Jan , Max Matveev wrote: > > I've got opensm 3.0.3 from OFED 1.2 dying on startup when using LASH > for routing. 
Here is the trace: > > #0 0x0000000000459abf in get_lash_id (p_sw=0x5fbf40) at osm_ucast_lash.c:1124 > #1 0x000000000045a704 in osm_get_lash_sl (p_osm=0x7fffa0aba4d0, > p_src_port=0x2aab06971400, p_dst_port=0x61ade0) at > osm_ucast_lash.c:1450 > #2 0x000000000042e80d in __osm_pr_rcv_get_path_parms(p_rcv=0x7fffa0abba80, > p_pr=0x2aab0661add0, p_src_port=0x2aab06971400, > p_dest_port=0x61ade0, > dest_lid_ho=2, comp_mask=580964351930793984, p_parms=0x649eef20) > at osm_sa_path_record.c:685 > #3 0x000000000042f02b in __osm_pr_rcv_get_lid_pair_path ( > p_rcv=0x7fffa0abba80, p_pr=0x2aab0661add0, > p_src_port=0x2aab06971400, > p_dest_port=0x61ade0, p_dgid=0x649ef0a0, src_lid_ho=1, > dest_lid_ho=2, > comp_mask=580964351930793984, preference=0 '\0') > at osm_sa_path_record.c:852 > #4 0x000000000042f5d6 in __osm_pr_rcv_get_port_pair_paths ( > p_rcv=0x7fffa0abba80, p_madw=0x6ecbb0, p_req_port=0x2aab06971400, > p_src_port=0x2aab06971400, p_dest_port=0x61ade0, > p_dgid=0x649ef0a0, > comp_mask=580964351930793984, p_list=0x649ef0b0) > at osm_sa_path_record.c:1072 > #5 0x000000000042fdc5 in __osm_pr_rcv_process_half(p_rcv=0x7fffa0abba80, > p_madw=0x6ecbb0, requester_port=0x2aab06971400, > p_src_port=0x2aab06971400, > p_dest_port=0x0, p_dgid=0x649ef0a0, comp_mask=580964351930793984, > p_list=0x649ef0b0) at osm_sa_path_record.c:1437 > #6 0x0000000000430c6f in osm_pr_rcv_process (context=0x7fffa0abba80, > data=0x6ecbb0) at osm_sa_path_record.c:2003 > #7 0x00002b110a54ef57 in __cl_disp_worker (context=0x7fffa0abcb30) > at cl_dispatcher.c:102 > #8 0x00002b110a5563b7 in __cl_thread_pool_routine(context=0x7fffa0abcba8) > at cl_threadpool.c:74 > #9 0x00002b110a55620a in __cl_thread_wrapper (arg=0x5a4a40) at cl_thread.c:58 > #10 0x00002b110a21b143 in start_thread () from /lib64/libpthread.so.0 > #11 0x00002b110a82774d in clone () from /lib64/libc.so.6 > #12 0x0000000000000000 in ?? 
() > > This is the switch: > > (gdb) p *( osm_switch_t *)0x5fbf40 > $3 = {map_item = {pool_item = {list_item = {p_next = 0x2aab065b4ed0, > p_prev = 0x2aab069850f0}}, p_left = 0x2aab069850f0, > p_right = 0x2aab065b4ed0, p_up = 0x2aab06587cd0, color = > CL_MAP_BLACK, > key = 17582052945261297672}, p_node = 0x608c80, switch_info = { > lin_cap = 192, rand_cap = 0, mcast_cap = 4, lin_top = 32769, > def_port = 0 '\0', def_mcast_pri_port = 0 '\0', > def_mcast_not_port = 0 '\0', life_state = 144 '\220', > lids_per_port = 0, > enforce_cap = 8192, flags = 240 ''}, max_lid_ho = 0, > num_ports = 25 '\031', num_hops = 0, hops = 0x0, p_prof = 0x5fbff0, > fwd_tbl = {p_rnd_tbl = 0x0, p_lin_tbl = 0x7dd0f0}, mcast_tbl = { > num_ports = 25 '\031', max_position = 1 '\001', max_block = 31, > max_block_in_use = -1, num_entries = 1024, max_mlid_ho = 50176, > p_mask_tbl = 0x7e9100}, discovery_count = 3, priv = 0x0} > > As you can see the priv pointer is NULL; get_lash_id() follows it and > dies. > > There is an obvious fix - simply check for priv in osm_get_lash_sl() > and return OSM_DEFAULT_SL: it already does it when checking for src_id > but not for dst_id but I'm not sure it's the right fix because I > cannot quite understand how priv got to be NULL - it was reset in > lash_cleanup() I suspect that the failure scenario is different. This switch was just connected/discovered by OpenSM (it still has hops = 0x0, which indicates that it has not passed the lid matrix generation stage yet) and it is still uninitialized by LASH. If that is really so, checking ->priv for NULL looks like a valid fix. Is this a reproducible failure? Sasha > but I don't see any threads which are inside > discover_network_properties() and I would've thought that when opensm > gets out of there, all switches must be initialized properly. > > I'm also not sure who initiated PATH_RECORD query - it does not look > like opensm would do it to itself yet the requestor_port was on the > same HCA.
It could be another process running on the same host for > what I know. > > max From vlad at lists.openfabrics.org Sun Jan 13 03:07:16 2008 From: vlad at lists.openfabrics.org (Vladimir Sokolovsky Mellanox) Date: Sun, 13 Jan 2008 03:07:16 -0800 (PST) Subject: [ofa-general] ofa_1_3_kernel 20080113-0200 daily build status Message-ID: <20080113110716.8B2B0E6019B@openfabrics.org> This email was generated automatically, please do not reply git_url: git://git.openfabrics.org/ofed_1_3/linux-2.6.git git_branch: ofed_kernel Common build parameters: --with-ipoib-mod --with-sdp-mod --with-srp-mod --with-user_mad-mod --with-user_access-mod --with-mthca-mod --with-mlx4-mod --with-core-mod --with-addr_trans-mod --with-rds-mod --with-cxgb3-mod --with-nes-mod Passed: Passed on i686 with 2.6.15-23-server Passed on i686 with linux-2.6.22 Passed on i686 with linux-2.6.21.1 Passed on i686 with linux-2.6.18 Passed on i686 with linux-2.6.17 Passed on i686 with linux-2.6.14 Passed on i686 with linux-2.6.13 Passed on i686 with linux-2.6.19 Passed on i686 with linux-2.6.16 Passed on i686 with linux-2.6.15 Passed on i686 with linux-2.6.12 Passed on x86_64 with linux-2.6.22 Passed on ia64 with linux-2.6.23 Passed on x86_64 with linux-2.6.21.1 Passed on x86_64 with linux-2.6.16 Passed on ia64 with linux-2.6.21.1 Passed on x86_64 with linux-2.6.22.5-31-default Passed on ia64 with linux-2.6.19 Passed on ia64 with linux-2.6.22 Passed on powerpc with linux-2.6.12 Passed on x86_64 with linux-2.6.18-53.el5 Passed on powerpc with linux-2.6.15 Passed on x86_64 with linux-2.6.17 Passed on ppc64 with linux-2.6.13 Passed on ia64 with linux-2.6.18 Passed on ia64 with linux-2.6.13 Passed on x86_64 with linux-2.6.18 Passed on ppc64 with linux-2.6.15 Passed
on ppc64 with linux-2.6.14 Passed on ia64 with linux-2.6.16 Passed on powerpc with linux-2.6.14 Passed on x86_64 with linux-2.6.16.43-0.3-smp Passed on powerpc with linux-2.6.13 Passed on ia64 with linux-2.6.12 Passed on ppc64 with linux-2.6.12 Passed on ia64 with linux-2.6.14 Passed on ia64 with linux-2.6.17 Passed on x86_64 with linux-2.6.19 Passed on x86_64 with linux-2.6.9-55.ELsmp Passed on ppc64 with linux-2.6.16 Passed on x86_64 with linux-2.6.20 Passed on x86_64 with linux-2.6.13 Passed on ia64 with linux-2.6.15 Passed on x86_64 with linux-2.6.9-42.ELsmp Passed on x86_64 with linux-2.6.14 Passed on x86_64 with linux-2.6.12 Passed on x86_64 with linux-2.6.15 Passed on x86_64 with linux-2.6.16.21-0.8-smp Passed on x86_64 with linux-2.6.18-1.2798.fc6 Passed on x86_64 with linux-2.6.18-8.el5 Passed on ppc64 with linux-2.6.18 Passed on ia64 with linux-2.6.16.21-0.8-default Passed on ppc64 with linux-2.6.17 Passed on ppc64 with linux-2.6.19 Passed on ppc64 with linux-2.6.18-8.el5 Failed: From makc at sgi.com Sun Jan 13 03:25:48 2008 From: makc at sgi.com (Max Matveev) Date: Sun, 13 Jan 2008 22:25:48 +1100 Subject: [ofa-general] opensm dumps core when using LASH for routing In-Reply-To: <20080113084001.GC1903@sashak.voltaire.com> References: <18311.19963.735177.83038@kuku.melbourne.sgi.com> <20080113084001.GC1903@sashak.voltaire.com> Message-ID: <18313.62780.11344.45768@kuku.melbourne.sgi.com> >>>>> "sashak" == Sasha Khapyorsky writes: sashak> I suspect that the failure scenario is different. This switch sashak> was just connected/discovered by OpenSM (it has hops = 0x0 sashak> yet - this indicates that it does not pass lid matrix sashak> generation stage yet) and it still be uninitialized by sashak> LASH. If it is really so checking ->priv for NULL looks like sashak> valid fix. Should opensm ignore requests while it's initializing? sashak> Is this reproducible failure? 
We've hit it twice - first time cores were disabled, so I only know that opensm died in get_lash_id() but I don't know where it was called from. And this is the second time. It does not happen on each restart or fabric re-scan though. max From dotanb at dev.mellanox.co.il Sun Jan 13 07:37:20 2008 From: dotanb at dev.mellanox.co.il (Dotan Barak) Date: Sun, 13 Jan 2008 17:37:20 +0200 Subject: [ofa-general] [PATCH] libmthca: prevent seg fault when sending big messages as inline Message-ID: <200801131737.20356.dotanb@dev.mellanox.co.il> Fix the type of the variable that holds the number of bytes sent as an inline message so far. If the user tries to send very big messages (total > 2^31 bytes), there will be a seg fault. Signed-off-by: Dotan Barak --- diff --git a/src/qp.c b/src/qp.c index 2ea9dc0..cb290ec 100644 --- a/src/qp.c +++ b/src/qp.c @@ -244,7 +244,7 @@ int mthca_tavor_post_send(struct ibv_qp *ibqp, struct ibv_send_wr *wr, if (wr->send_flags & IBV_SEND_INLINE) { if (wr->num_sge) { struct mthca_inline_seg *seg = wqe; - int s = 0; + unsigned int s = 0; wqe += sizeof *seg; for (i = 0; i < wr->num_sge; ++i) { From dotanb at dev.mellanox.co.il Sun Jan 13 07:43:23 2008 From: dotanb at dev.mellanox.co.il (Dotan Barak) Date: Sun, 13 Jan 2008 17:43:23 +0200 Subject: [ofa-general] [PATCH] libmlx4: prevent seg fault when sending big messages as inline Message-ID: <200801131743.24014.dotanb@dev.mellanox.co.il> Fix the type of the variable that holds the number of bytes sent as an inline message so far. Without the patch, if the user tries to send very big messages (total > 2^31 bytes), there will be a seg fault.
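To illustrate why the type of the running byte counter matters, here is a minimal, self-contained sketch. It is not the libmlx4 or libmthca code: the function names and the 400-byte limit are assumptions made up for this example. It shows that a gather list whose lengths sum past 2^31 cannot be held in a 32-bit signed counter, and one possible way to write the limit check so that an unsigned running total cannot wrap either:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns 1 when the sum of all segment lengths exceeds INT32_MAX,
 * i.e. when a 32-bit signed running total would overflow. */
int total_overflows_int32(const uint32_t *len, size_t n)
{
	uint64_t total = 0;	/* 64-bit accumulator, used only for the demonstration */
	size_t i;

	for (i = 0; i < n; i++)
		total += len[i];
	return total > (uint64_t)INT32_MAX;
}

/* Returns 0 if all segments fit into 'limit' bytes of inline data,
 * -1 otherwise. Each length is compared against the remaining room,
 * so 'total' never exceeds 'limit' and the addition cannot wrap. */
int check_inline_total(const uint32_t *len, size_t n, uint32_t limit)
{
	uint32_t total = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		if (len[i] > limit - total)	/* total <= limit holds here */
			return -1;
		total += len[i];
	}
	return 0;
}

/* Exercises both helpers with an oversized and a small gather list. */
void self_check(void)
{
	const uint32_t big[] = { 0x7fffffffu, 0x7fffffffu };	/* sums past 2^31 */
	const uint32_t small[] = { 64u, 128u };

	assert(total_overflows_int32(big, 2));
	assert(!total_overflows_int32(small, 2));
	assert(check_inline_total(big, 2, 400u) == -1);	/* 400 is a made-up limit */
	assert(check_inline_total(small, 2, 400u) == 0);
}
```

Calling self_check() exercises both cases. Comparing each length against the remaining room (limit - total) is a design choice of this sketch, not something taken from the patch; it keeps the running total bounded by the limit at every step.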
Signed-off-by: Dotan Barak --- diff --git a/src/qp.c b/src/qp.c index 8b4adaa..abac597 100644 --- a/src/qp.c +++ b/src/qp.c @@ -177,7 +177,7 @@ int mlx4_post_send(struct ibv_qp *ibqp, struct ibv_send_wr *wr, struct mlx4_wqe_ctrl_seg *ctrl; int ind; int nreq; - int inl = 0; + unsigned int inl = 0; int ret = 0; int size; int i; From sashak at voltaire.com Sun Jan 13 11:34:35 2008 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Sun, 13 Jan 2008 19:34:35 +0000 Subject: [ofa-general] [PATCH] libmlx4: Fix the value of the pkey_index in the completion In-Reply-To: <1199718187.20870.102.camel@hrosenstock-ws.xsigo.com> References: <200801061801.25386.dotanb@dev.mellanox.co.il> <4781F18F.1070506@voltaire.com> <4781FB41.6040204@dev.mellanox.co.il> <4782160B.1080709@dev.mellanox.co.il> <478217A6.80307@dev.mellanox.co.il> <47821FF3.7020705@dev.mellanox.co.il> <1199718187.20870.102.camel@hrosenstock-ws.xsigo.com> Message-ID: <20080113193435.GG10650@sashak.voltaire.com> Hi Hal, On 07:03 Mon 07 Jan , Hal Rosenstock wrote: > > > > The problem that Yevgeny mentioned exists in the openSM and i opened a > > bug on this issue. > > Some comments on the issues raised above: > > 1. There has already been discussion on this list of other OpenFabrics > components assuming the default PKey at index 0 (and yes, this is not > mandated by IBA). > > 2. As to the impact of using a non default PKey (in the BTH) for > querying SA, is this really used in any implementations ? It makes > deployment of SM difficult (needing much more configuration). That's not > to say this isn't a bug but more speaks to the severity of it. IMO it > should be documented as a current limitation. > > [Also, note that user MAD API only supports pkey index at recent kernel > and library versions.] This could be handled in the way similar to how umad_set_pkey() does. I will post the patches shortly. 
Sasha From sashak at voltaire.com Sun Jan 13 11:35:59 2008 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Sun, 13 Jan 2008 19:35:59 +0000 Subject: [ofa-general] [PATCH] libibumad: umad_get_pkey() function In-Reply-To: <20080113193435.GG10650@sashak.voltaire.com> References: <200801061801.25386.dotanb@dev.mellanox.co.il> <4781F18F.1070506@voltaire.com> <4781FB41.6040204@dev.mellanox.co.il> <4782160B.1080709@dev.mellanox.co.il> <478217A6.80307@dev.mellanox.co.il> <47821FF3.7020705@dev.mellanox.co.il> <1199718187.20870.102.camel@hrosenstock-ws.xsigo.com> <20080113193435.GG10650@sashak.voltaire.com> Message-ID: <20080113193559.GH10650@sashak.voltaire.com> This returns value of pkey_index in network byte order from user_mad header. If we are running with kernel where pkey_index is not supported yet it will return 0. Signed-off-by: Sasha Khapyorsky --- libibumad/include/infiniband/umad.h | 1 + libibumad/src/libibumad.map | 1 + libibumad/src/umad.c | 13 ++++++++++++- 3 files changed, 14 insertions(+), 1 deletions(-) diff --git a/libibumad/include/infiniband/umad.h b/libibumad/include/infiniband/umad.h index 681b440..742c7b0 100644 --- a/libibumad/include/infiniband/umad.h +++ b/libibumad/include/infiniband/umad.h @@ -174,6 +174,7 @@ int umad_set_grh(void *umad, void *mad_addr); int umad_set_addr_net(void *umad, int dlid, int dqp, int sl, int qkey); int umad_set_addr(void *umad, int dlid, int dqp, int sl, int qkey); int umad_set_pkey(void *umad, int pkey_index); +int umad_get_pkey(void *umad); int umad_send(int portid, int agentid, void *umad, int length, int timeout_ms, int retries); diff --git a/libibumad/src/libibumad.map b/libibumad/src/libibumad.map index 9444aa9..0154b7f 100644 --- a/libibumad/src/libibumad.map +++ b/libibumad/src/libibumad.map @@ -15,6 +15,7 @@ IBUMAD_1.0 { umad_size; umad_set_grh; umad_set_pkey; + umad_get_pkey; umad_set_addr; umad_set_addr_net; umad_send; diff --git a/libibumad/src/umad.c b/libibumad/src/umad.c index 1dc328d..b01e313 100644 
--- a/libibumad/src/umad.c +++ b/libibumad/src/umad.c @@ -722,7 +722,18 @@ umad_set_pkey(void *umad, int pkey_index) struct ib_user_mad *mad = umad; if (new_user_mad_api) - mad->addr.pkey_index = htons(pkey_index); + mad->addr.pkey_index = pkey_index; + + return 0; +} + +int +umad_get_pkey(void *umad) +{ + struct ib_user_mad *mad = umad; + + if (new_user_mad_api) + return mad->addr.pkey_index; return 0; } -- 1.5.4.rc2.60.gb2e62 From sashak at voltaire.com Sun Jan 13 11:36:43 2008 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Sun, 13 Jan 2008 19:36:43 +0000 Subject: [ofa-general] [PATCH] opensm/vendor: use valid pkey index value for gsi mads In-Reply-To: <20080113193559.GH10650@sashak.voltaire.com> References: <200801061801.25386.dotanb@dev.mellanox.co.il> <4781F18F.1070506@voltaire.com> <4781FB41.6040204@dev.mellanox.co.il> <4782160B.1080709@dev.mellanox.co.il> <478217A6.80307@dev.mellanox.co.il> <47821FF3.7020705@dev.mellanox.co.il> <1199718187.20870.102.camel@hrosenstock-ws.xsigo.com> <20080113193435.GG10650@sashak.voltaire.com> <20080113193559.GH10650@sashak.voltaire.com> Message-ID: <20080113193643.GI10650@sashak.voltaire.com> Use valid (as received from user_mad) pkey index value for gsi mads. 
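As background for the P_Key value versus P_Key index distinction that this thread keeps returning to, a minimal sketch follows. It assumes a hypothetical in-memory P_Key table: struct pkey_table, pkey_to_index() and index_to_pkey() are illustrative names, not libibumad or OpenSM functions. Keys are matched on their low 15 bits, with the top bit treated as the membership bit:

```c
#include <assert.h>
#include <stdint.h>

#define PKEY_BASE_MASK 0x7fffu	/* low 15 bits carry the key, bit 15 is membership */

/* Hypothetical port P_Key table (host byte order); a real table
 * would be read from the port rather than built by hand. */
struct pkey_table {
	const uint16_t *pkeys;
	int len;
};

/* Returns the index of the first entry whose 15-bit base matches 'pkey',
 * or -1 if there is none. A base of 0 is treated as "no key" here. */
int pkey_to_index(const struct pkey_table *t, uint16_t pkey)
{
	int i;

	if ((pkey & PKEY_BASE_MASK) == 0)
		return -1;
	for (i = 0; i < t->len; i++)
		if ((t->pkeys[i] & PKEY_BASE_MASK) == (pkey & PKEY_BASE_MASK))
			return i;
	return -1;
}

/* Returns the P_Key stored at 'index', or 0 if the index is out of range. */
uint16_t index_to_pkey(const struct pkey_table *t, int index)
{
	if (index < 0 || index >= t->len)
		return 0;
	return t->pkeys[index];
}
```

With a table of { 0xffff, 0x8001, 0x0002 }, the value 0x8001 maps to index 1 and index 0 maps back to the default key 0xffff. Code that stores an index where a value is expected (or the reverse) only happens to work when the default key sits at index 0.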
Signed-off-by: Sasha Khapyorsky --- opensm/libvendor/osm_vendor_ibumad.c | 10 +++++----- 1 files changed, 5 insertions(+), 5 deletions(-) diff --git a/opensm/libvendor/osm_vendor_ibumad.c b/opensm/libvendor/osm_vendor_ibumad.c index 977a3b2..8d1f070 100644 --- a/opensm/libvendor/osm_vendor_ibumad.c +++ b/opensm/libvendor/osm_vendor_ibumad.c @@ -199,9 +199,10 @@ put_madw(osm_vendor_t * p_vend, osm_madw_t * p_madw, ib_net64_t * tid) } static void -ib_mad_addr_conv(ib_mad_addr_t * ib_mad_addr, osm_mad_addr_t * osm_mad_addr, +ib_mad_addr_conv(ib_user_mad_t * umad, osm_mad_addr_t * osm_mad_addr, int is_smi) { + ib_mad_addr_t *ib_mad_addr = umad_get_mad_addr(umad); osm_mad_addr->dest_lid = ib_mad_addr->lid; osm_mad_addr->path_bits = ib_mad_addr->path_bits; osm_mad_addr->static_rate = 0; @@ -214,7 +215,7 @@ ib_mad_addr_conv(ib_mad_addr_t * ib_mad_addr, osm_mad_addr_t * osm_mad_addr, osm_mad_addr->addr_type.gsi.remote_qp = ib_mad_addr->qpn; osm_mad_addr->addr_type.gsi.remote_qkey = ib_mad_addr->qkey; - osm_mad_addr->addr_type.gsi.pkey = IB_DEFAULT_PKEY; /* FIXME: support real pkey */ + osm_mad_addr->addr_type.gsi.pkey = umad_get_pkey(umad); osm_mad_addr->addr_type.gsi.service_level = ib_mad_addr->sl; osm_mad_addr->addr_type.gsi.global_route = 0; /* FIXME: handle GRH */ memset(&osm_mad_addr->addr_type.gsi.grh_info, 0, @@ -303,7 +304,7 @@ static void *umad_receiver(void *p_ptr) mad = (ib_mad_t *) umad_get_mad(umad); ib_mad_addr = umad_get_mad_addr(umad); - ib_mad_addr_conv(ib_mad_addr, &osm_addr, + ib_mad_addr_conv(umad, &osm_addr, mad->mgmt_class == IB_MCLASS_SUBN_LID || mad->mgmt_class == IB_MCLASS_SUBN_DIR); @@ -1046,8 +1047,7 @@ osm_vendor_send(IN osm_bind_handle_t h_bind, p_mad_addr->addr_type.gsi.service_level, IB_QP1_WELL_KNOWN_Q_KEY); umad_set_grh(p_vw->umad, 0); /* FIXME: GRH support */ - umad_set_pkey(p_vw->umad, 0); - /* FIXME: p_mad_addr->addr_type.gsi.pkey to index */ + umad_set_pkey(p_vw->umad, p_mad_addr->addr_type.gsi.pkey); if 
(ib_class_is_rmpp(p_mad->mgmt_class)) { /* RMPP GSI classes FIXME: no GRH */ if (!ib_rmpp_is_flag_set((ib_rmpp_mad_t *) p_sa, IB_RMPP_FLAG_ACTIVE)) { -- 1.5.4.rc2.60.gb2e62 From sashak at voltaire.com Sun Jan 13 12:08:59 2008 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Sun, 13 Jan 2008 20:08:59 +0000 Subject: [ofa-general] [PATCH] libmlx4: Fix the value of the pkey_index in the completion In-Reply-To: <47821FF3.7020705@dev.mellanox.co.il> References: <200801061801.25386.dotanb@dev.mellanox.co.il> <4781F18F.1070506@voltaire.com> <4781FB41.6040204@dev.mellanox.co.il> <4782160B.1080709@dev.mellanox.co.il> <478217A6.80307@dev.mellanox.co.il> <47821FF3.7020705@dev.mellanox.co.il> Message-ID: <20080113200859.GJ10650@sashak.voltaire.com> Hi Dotan, On 14:49 Mon 07 Jan , Dotan Barak wrote: > Dotan Barak wrote: > > Yevgeny Kliteynik wrote: > >> Dotan Barak wrote: > >>> Or Gerlitz wrote: > >>>> Dotan Barak wrote: > >>>>> Fix the value of the pkey_index in the completion to get a valid value > >>>>> for GSI QPs. > >>>> > >>>> Is libmthca fine in that respect? > >>> As much as i know, everything is fine with mthca/libmthca. > >>> > >>> We saw several problems only in ConnectX (because of the new low level > >>> driver). > >>> > >>> Right now, we are doing some more checks to check the mlx4_0 low level > >>> driver as well as the IB core. > >>> After that we'll check the mthca low level driver too. > >> > >> Currently OpenSM doesn't support any non-default pkey > >> (or any pkey at index other than 0) in sa queries. > >> When a request is received, opensm doesn't extract the > >> right pkey from the mad header - it replaces it with a > >> default pkey, and when a response is sent, OpenSM always > >> uses pkey at index 0. > >> > >> -- Yevgeny > > FYI: after several testings it seems that mthca low level driver don't have > > this problem. 
> > > > Dotan > > > > Just to make sure that everything is clear: I checked that the mthca low > level driver can extract > the right pkey_index in the completion of GSI QP. > > The problem that Yevgeny mentioned exists in the openSM and i opened a bug > on this issue. I tried mthca connected back-to-back (between ports 1 and 2). When a non-default P_Key value is configured (at any index, with full membership in both ports' pkey tables and no 0xffff), saquery times out and trap 257 (Bad P_Key) is reported to OpenSM. I'm using kernel 2.6.24-rc7-gcdf71a10 and FW 3.2.000. Could this be an old FW issue? Sasha From sashak at voltaire.com Sun Jan 13 12:17:47 2008 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Sun, 13 Jan 2008 20:17:47 +0000 Subject: [ofa-general] opensm dumps core when using LASH for routing In-Reply-To: <18313.62780.11344.45768@kuku.melbourne.sgi.com> References: <18311.19963.735177.83038@kuku.melbourne.sgi.com> <20080113084001.GC1903@sashak.voltaire.com> <18313.62780.11344.45768@kuku.melbourne.sgi.com> Message-ID: <20080113201747.GK10650@sashak.voltaire.com> On 22:25 Sun 13 Jan , Max Matveev wrote: > >>>>> "sashak" == Sasha Khapyorsky writes: > > sashak> I suspect that the failure scenario is different. This switch > sashak> was just connected/discovered by OpenSM (it has hops = 0x0 > sashak> yet - this indicates that it does not pass lid matrix > sashak> generation stage yet) and it still be uninitialized by > sashak> LASH. If it is really so checking ->priv for NULL looks like > sashak> valid fix. > > Should opensm ignore requests while it's initializing? It is initialized, except for a newly added switch. I ran some tests today to try to reproduce the failure with the simulator, but without much success: a PathRecord query should be rejected when it passes through non-prepared switches. At least that is the case with the master branch. > sashak> Is this reproducible failure?
> > We've hit it twice - first time cores were disabled, so I only know > what opensm died in get_lash_id() but I don't know where it was called > from. And this is the second time. Would be interesting to know in which OpenSM state it happens. Could you send me the core file and exact git tree hash? I would like to investigate this deeper. Sasha From weiny2 at llnl.gov Sun Jan 13 15:38:57 2008 From: weiny2 at llnl.gov (Ira Weiny) Date: Sun, 13 Jan 2008 15:38:57 -0800 Subject: [ofa-general] [PATCH] Consolidate the 2 __get_mgrp_by_mgid functions into one global function which actually takes a mgid and returns a mgrp. Message-ID: <20080113153857.26657832.weiny2@llnl.gov> Sasha, this is based directly off of master and does _not_ require the patch I submitted a couple of days ago to special case the IPv6 stuff. Basically this is just a clean up of the code. From 18869ffce87b6b3fa906d300b793b881ce37fb9e Mon Sep 17 00:00:00 2001 From: Ira K. Weiny Date: Sun, 13 Jan 2008 15:28:34 -0800 Subject: [PATCH] Consolidate the 2 __get_mgrp_by_mgid functions into one global function which actually takes a mgid and returns a mgrp. Signed-off-by: Ira K.
Weiny --- opensm/include/opensm/osm_sa.h | 5 + opensm/opensm/osm_sa_mcmember_record.c | 137 +++++++++++++++---------------- opensm/opensm/osm_sa_path_record.c | 74 +----------------- 3 files changed, 72 insertions(+), 144 deletions(-) diff --git a/opensm/include/opensm/osm_sa.h b/opensm/include/opensm/osm_sa.h index 82ca1dc..751bc96 100644 --- a/opensm/include/opensm/osm_sa.h +++ b/opensm/include/opensm/osm_sa.h @@ -469,5 +469,10 @@ osm_mcmr_rcv_find_or_create_new_mgrp(IN osm_sa_t * sa, * *********/ +ib_api_status_t +osm_get_mgrp_by_mgid(IN osm_sa_t * sa, + IN ib_gid_t *p_mgid, + OUT osm_mgrp_t ** pp_mgrp); + END_C_DECLS #endif /* _OSM_SA_H_ */ diff --git a/opensm/opensm/osm_sa_mcmember_record.c b/opensm/opensm/osm_sa_mcmember_record.c index 8eb97ad..bd1f42b 100644 --- a/opensm/opensm/osm_sa_mcmember_record.c +++ b/opensm/opensm/osm_sa_mcmember_record.c @@ -96,49 +96,6 @@ typedef struct osm_sa_mcmr_search_ctxt { } osm_sa_mcmr_search_ctxt_t; /********************************************************************** - A search function that compares the given mgrp with the search context - if there is a match by mgid the p_mgrp is copied to the search context - p_mgrp component - - Inputs: - p_map_item - which is part of a mgrp object - context - points to the osm_sa_mcmr_search_ctxt_t including the mgid - looked for and the result p_mgrp -**********************************************************************/ -static void -__search_mgrp_by_mgid(IN cl_map_item_t * const p_map_item, IN void *context) -{ - osm_mgrp_t *p_mgrp = (osm_mgrp_t *) p_map_item; - osm_sa_mcmr_search_ctxt_t *p_ctxt = - (osm_sa_mcmr_search_ctxt_t *) context; - const ib_member_rec_t *p_recvd_mcmember_rec; - osm_sa_t *sa; - - p_recvd_mcmember_rec = p_ctxt->p_mcmember_rec; - sa = p_ctxt->sa; - - /* ignore groups marked for deletion */ - if (p_mgrp->to_be_deleted) - return; - - /* compare entire MGID so different scope will not sneak in for - the same MGID */ - if (memcmp(&p_mgrp->mcmember_rec.mgid, - 
&p_recvd_mcmember_rec->mgid, sizeof(ib_gid_t))) - return; - - if (p_ctxt->p_mgrp) { - osm_log(sa->p_log, OSM_LOG_ERROR, - "__search_mgrp_by_mgid: ERR 1B03: " - "Multiple MC groups for same MGID\n"); - return; - } - - p_ctxt->p_mgrp = p_mgrp; - -} - -/********************************************************************** Look for a MGRP in the mgrp_mlid_tbl by mlid **********************************************************************/ static osm_mgrp_t *__get_mgrp_by_mlid(IN osm_sa_t * sa, @@ -154,31 +111,6 @@ static osm_mgrp_t *__get_mgrp_by_mlid(IN osm_sa_t * sa, } -/********************************************************************** -Look for a MGRP in the mgrp_mlid_tbl by mgid -***********************************************************************/ -static ib_api_status_t -__get_mgrp_by_mgid(IN osm_sa_t * sa, - IN ib_member_rec_t * p_recvd_mcmember_rec, - OUT osm_mgrp_t ** pp_mgrp) -{ - osm_sa_mcmr_search_ctxt_t mcmr_search_context; - - mcmr_search_context.p_mcmember_rec = p_recvd_mcmember_rec; - mcmr_search_context.sa = sa; - mcmr_search_context.p_mgrp = NULL; - - cl_qmap_apply_func(&sa->p_subn->mgrp_mlid_tbl, - __search_mgrp_by_mgid, &mcmr_search_context); - - if (mcmr_search_context.p_mgrp == NULL) { - return IB_NOT_FOUND; - } - - *pp_mgrp = mcmr_search_context.p_mgrp; - return IB_SUCCESS; -} - /********************************************************************* Copy certain fields between two mcmember records used during the process of join request to copy data from the mgrp to the @@ -1208,6 +1140,69 @@ osm_mcmr_rcv_create_new_mgrp(IN osm_sa_t * sa, } + +typedef struct osm_sa_pr_mcmr_search_ctxt { + ib_gid_t *p_mgid; + osm_mgrp_t *p_mgrp; + osm_sa_t *sa; +} osm_sa_pr_mcmr_search_ctxt_t; + +/********************************************************************** + *********************************************************************/ +static void +__search_mgrp_by_mgid(IN cl_map_item_t * const p_map_item, IN void *context) +{ + osm_mgrp_t *p_mgrp = 
(osm_mgrp_t *) p_map_item; + osm_sa_pr_mcmr_search_ctxt_t *p_ctxt = + (osm_sa_pr_mcmr_search_ctxt_t *) context; + const ib_gid_t *p_recvd_mgid; + osm_sa_t *sa; + /* uint32_t i; */ + + p_recvd_mgid = p_ctxt->p_mgid; + sa = p_ctxt->sa; + + /* ignore groups marked for deletion */ + if (p_mgrp->to_be_deleted) + return; + + /* compare entire MGID so different scope will not sneak in for + the same MGID */ + if (memcmp(&p_mgrp->mcmember_rec.mgid, p_recvd_mgid, sizeof(ib_gid_t))) + return; + + if (p_ctxt->p_mgrp) { + osm_log(sa->p_log, OSM_LOG_ERROR, + "__search_mgrp_by_mgid: ERR 1F08: " + "Multiple MC groups for same MGID\n"); + return; + } + p_ctxt->p_mgrp = p_mgrp; +} + +/********************************************************************** + **********************************************************************/ +ib_api_status_t +osm_get_mgrp_by_mgid(IN osm_sa_t *sa, + IN ib_gid_t *p_mgid, + OUT osm_mgrp_t **pp_mgrp) +{ + osm_sa_pr_mcmr_search_ctxt_t mcmr_search_context; + + mcmr_search_context.p_mgid = p_mgid; + mcmr_search_context.sa = sa; + mcmr_search_context.p_mgrp = NULL; + + cl_qmap_apply_func(&sa->p_subn->mgrp_mlid_tbl, + __search_mgrp_by_mgid, &mcmr_search_context); + + if (mcmr_search_context.p_mgrp == NULL) + return IB_NOT_FOUND; + + *pp_mgrp = mcmr_search_context.p_mgrp; + return IB_SUCCESS; +} + /********************************************************************** Call this function to find or create a new mgrp. 
**********************************************************************/ @@ -1220,7 +1215,7 @@ osm_mcmr_rcv_find_or_create_new_mgrp(IN osm_sa_t * sa, { ib_api_status_t status; - status = __get_mgrp_by_mgid(sa, p_recvd_mcmember_rec, pp_mgrp); + status = osm_get_mgrp_by_mgid(sa, &p_recvd_mcmember_rec->mgid, pp_mgrp); if (status == IB_SUCCESS) return status; return osm_mcmr_rcv_create_new_mgrp(sa, comp_mask, @@ -1264,7 +1259,7 @@ __osm_mcmr_rcv_leave_mgrp(IN osm_sa_t * sa, } CL_PLOCK_EXCL_ACQUIRE(sa->p_lock); - status = __get_mgrp_by_mgid(sa, p_recvd_mcmember_rec, &p_mgrp); + status = osm_get_mgrp_by_mgid(sa, &p_recvd_mcmember_rec->mgid, &p_mgrp); if (status == IB_SUCCESS) { mlid = p_mgrp->mlid; portguid = p_recvd_mcmember_rec->port_gid.unicast.interface_id; @@ -1440,7 +1435,7 @@ __osm_mcmr_rcv_join_mgrp(IN osm_sa_t * sa, &join_state); /* do we need to create a new group? */ - status = __get_mgrp_by_mgid(sa, p_recvd_mcmember_rec, &p_mgrp); + status = osm_get_mgrp_by_mgid(sa, &p_recvd_mcmember_rec->mgid, &p_mgrp); if ((status == IB_NOT_FOUND) || p_mgrp->to_be_deleted) { /* check for JoinState.FullMember = 1 o15.0.1.9 */ if ((join_state & 0x01) != 0x01) { diff --git a/opensm/opensm/osm_sa_path_record.c b/opensm/opensm/osm_sa_path_record.c index 749a936..816e8e2 100644 --- a/opensm/opensm/osm_sa_path_record.c +++ b/opensm/opensm/osm_sa_path_record.c @@ -88,12 +88,6 @@ typedef struct _osm_path_parms { boolean_t reversible; } osm_path_parms_t; -typedef struct osm_sa_pr_mcmr_search_ctxt { - ib_gid_t *p_mgid; - osm_mgrp_t *p_mgrp; - osm_sa_t *sa; -} osm_sa_pr_mcmr_search_ctxt_t; - static const ib_gid_t zero_gid = { {0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, @@ -1516,72 +1510,6 @@ __osm_pr_rcv_process_pair(IN osm_sa_t * sa, } /********************************************************************** - *********************************************************************/ -static void -__search_mgrp_by_mgid(IN cl_map_item_t * const p_map_item, IN void 
*context) -{ - osm_mgrp_t *p_mgrp = (osm_mgrp_t *) p_map_item; - osm_sa_pr_mcmr_search_ctxt_t *p_ctxt = - (osm_sa_pr_mcmr_search_ctxt_t *) context; - const ib_gid_t *p_recvd_mgid; - osm_sa_t *sa; - /* uint32_t i; */ - - p_recvd_mgid = p_ctxt->p_mgid; - sa = p_ctxt->sa; - - /* ignore groups marked for deletion */ - if (p_mgrp->to_be_deleted) - return; - - /* compare entire MGID so different scope will not sneak in for - the same MGID */ - if (memcmp(&p_mgrp->mcmember_rec.mgid, p_recvd_mgid, sizeof(ib_gid_t))) - return; - -#if 0 - for (i = 0; - i < sizeof(p_mgrp->mcmember_rec.mgid.multicast.raw_group_id); - i++) { - if (p_mgrp->mcmember_rec.mgid.multicast.raw_group_id[i] != - p_recvd_mgid->mgid.multicast.raw_group_id[i]) - return; - } -#endif - - if (p_ctxt->p_mgrp) { - osm_log(sa->p_log, OSM_LOG_ERROR, - "__search_mgrp_by_mgid: ERR 1F08: " - "Multiple MC groups for same MGID\n"); - return; - } - p_ctxt->p_mgrp = p_mgrp; -} - -/********************************************************************** - **********************************************************************/ -static ib_api_status_t -__get_mgrp_by_mgid(IN osm_sa_t * sa, - IN ib_path_rec_t * p_recvd_path_rec, - OUT osm_mgrp_t ** pp_mgrp) -{ - osm_sa_pr_mcmr_search_ctxt_t mcmr_search_context; - - mcmr_search_context.p_mgid = &p_recvd_path_rec->dgid; - mcmr_search_context.sa = sa; - mcmr_search_context.p_mgrp = NULL; - - cl_qmap_apply_func(&sa->p_subn->mgrp_mlid_tbl, - __search_mgrp_by_mgid, &mcmr_search_context); - - if (mcmr_search_context.p_mgrp == NULL) - return IB_NOT_FOUND; - - *pp_mgrp = mcmr_search_context.p_mgrp; - return IB_SUCCESS; -} - -/********************************************************************** **********************************************************************/ static osm_mgrp_t *__get_mgrp_by_mlid(IN osm_sa_t * sa, IN ib_net16_t const mlid) @@ -1615,7 +1543,7 @@ __osm_pr_get_mgrp(IN osm_sa_t * sa, comp_mask = p_sa_mad->comp_mask; if (comp_mask & IB_PR_COMPMASK_DGID) { - status = 
__get_mgrp_by_mgid(sa, p_pr, pp_mgrp); + status = osm_get_mgrp_by_mgid(sa, &p_pr->dgid, pp_mgrp); if (status != IB_SUCCESS) { osm_log(sa->p_log, OSM_LOG_ERROR, "__osm_pr_get_mgrp: ERR 1F09: " -- 1.5.1 From kliteyn at mellanox.co.il Sun Jan 13 17:31:23 2008 From: kliteyn at mellanox.co.il (kliteyn at mellanox.co.il) Date: 14 Jan 2008 03:31:23 +0200 Subject: [ofa-general] nightly osm_sim report 2008-01-14:normal completion Message-ID: OSM Simulation Regression Summary [Generated mail - please do NOT reply] OpenSM binary date = 2008-01-13 OpenSM git rev = Thu_Jan_10_03:48:16_2008 [7bb2045bd9f659f8466a4494f4ec983f0edbf96a] ibutils git rev = Mon_Dec_24_10:42:01_2007 [675bec82306d6920555dd0b5e2f664983e27e60f] Total=400 Pass=399 Fail=1 Pass: 30 Stability IS1-16.topo 30 Pkey IS1-16.topo 30 OsmTest IS1-16.topo 30 OsmStress IS1-16.topo 30 Multicast IS1-16.topo 30 LidMgr IS1-16.topo 10 Stability IS3-loop.topo 10 Stability IS3-128.topo 10 Pkey IS3-128.topo 10 OsmTest IS3-loop.topo 10 OsmTest IS3-128.topo 10 OsmStress IS3-128.topo 10 Multicast IS3-loop.topo 10 Multicast IS3-128.topo 10 FatTree merge-roots-4-ary-2-tree.topo 10 FatTree merge-root-4-ary-3-tree.topo 10 FatTree gnu-stallion-64.topo 10 FatTree blend-4-ary-2-tree.topo 10 FatTree RhinoDDR.topo 10 FatTree FullGnu.topo 10 FatTree 4-ary-2-tree.topo 10 FatTree 2-ary-4-tree.topo 10 FatTree 12-node-spaced.topo 10 FTreeFail 4-ary-2-tree-missing-sw-link.topo 10 FTreeFail 4-ary-2-tree-links-at-same-rank-2.topo 10 FTreeFail 4-ary-2-tree-links-at-same-rank-1.topo 10 FTreeFail 4-ary-2-tree-diff-num-pgroups.topo 9 LidMgr IS3-128.topo Failures: 1 LidMgr IS3-128.topo From makc at sgi.com Sun Jan 13 20:21:52 2008 From: makc at sgi.com (Max Matveev) Date: Mon, 14 Jan 2008 15:21:52 +1100 Subject: [ofa-general] opensm
dumps core when using LASH for routing In-Reply-To: <20080113201747.GK10650@sashak.voltaire.com> References: <18311.19963.735177.83038@kuku.melbourne.sgi.com> <20080113084001.GC1903@sashak.voltaire.com> <18313.62780.11344.45768@kuku.melbourne.sgi.com> <20080113201747.GK10650@sashak.voltaire.com> Message-ID: <18314.58208.527011.666002@kuku.melbourne.sgi.com> >>>>> "sashak" == Sasha Khapyorsky writes: >> Should opensm ignore requests while it's initializing? sashak> It is initialized, except a newly added switch. By "initialized" I've meant that it has finished switch discovery and LID assignment at least for switches. BTW, this is a fabric with 22 switches and ~260 nodes. sashak> Could you send me the core file and exact git tree hash? I would like sashak> to investigate this deeper. I can send you the core but I cannot get git tree hash - I've only got source tarball (srpm, actually). It supposed to be OFED 1.2 GA. I can give you a tarball of opensm and its link dependencies from a similar machine. And, just in case it makes a difference, this is on x86_64 box. max From keshetti85-student at yahoo.co.in Sun Jan 13 20:33:55 2008 From: keshetti85-student at yahoo.co.in (Keshetti Mahesh) Date: Mon, 14 Jan 2008 10:03:55 +0530 Subject: [ofa-general] Re: ib_macro_model on OMNET++ In-Reply-To: <6C2C79E72C305246B504CBA17B5500C90314A996@mtlexch01.mtl.com> References: <829ded920801092245j3c11c251n3711a0d23ac55a30@mail.gmail.com> <6C2C79E72C305246B504CBA17B5500C90314A996@mtlexch01.mtl.com> Message-ID: <829ded920801132033j46cef44ak9d215efd73065a24@mail.gmail.com> > The model describes IB HCAs and Switches. > One can build networks of these models and > simulate traffic through the network. > The model accuratly describes how credits are flowing through the network. > The switches are built out of virtual output queues. > It also let you play with parameters for the switches and HCAs. > FDBs are programable. Hi Eitan, Thanks for the reply. 
I have successfully installed OMNET++ package on my machine and I am able to run samples available in that package. But when I tried to run a sample network from the ib_macro_model package, the run is getting aborted with the following message. -------------------------------------------------------------------------------------------------------------------------------- [root at n161 2h_1s]# ./2h_1s OMNeT++/OMNEST Discrete Event Simulation (C) 1992-2005 Andras Varga Release: 3.2, edition: Academic Public License. See the license for distribution terms and warranty disclaimer Setting up Cmdenv... Loading NED file: /home/maheshk/softwares/ib_macro_model/networks/2h_1s/../../src/hca.ned Loading NED file: /home/maheshk/softwares/ib_macro_model/networks/2h_1s/../../src/switch.ned Loading NED file: /home/maheshk/softwares/ib_macro_model/networks/2h_1s/../../src/gen.ned Loading NED file: /home/maheshk/softwares/ib_macro_model/networks/2h_1s/../../src/sink.ned Loading NED file: /home/maheshk/softwares/ib_macro_model/networks/2h_1s/../../src/ibuf.ned Loading NED file: /home/maheshk/softwares/ib_macro_model/networks/2h_1s/../../src/obuf.ned Loading NED file: /home/maheshk/softwares/ib_macro_model/networks/2h_1s/../../src/vlarb.ned Preparing for Run #1... Setting up network `FABRIC'... Initializing... RUNTIME ERROR. A cRuntimeError exception is about to be thrown, and you requested (by setting debug-on-errors=true in the ini file) that errors abort execution and break into the debugger. - on Linux or Unix-like systems: you should now probably be running the simulation under gdb or another debugger. The simulation kernel will now raise a SIGABRT signal which will get you into the debugger. If you're not running under a debugger, you can still use the core dump for post-mortem debugging. - on Windows: your should have a just-in-time debugger (such as the Visual C++ IDE) enabled. 
The simulation kernel will now cause a debugger interrupt to get you into the debugger -- press the [Debug] button in the dialog that comes up. Once in the debugger, use its "view stack trace" command (in gdb: "bt") to see the context of the runtime error. See error text below. Error in module (IBGenerator) FABRIC.H_1.gen: has no parameter called `GenModel'. Aborted -------------------------------------------------------------------------------------------------------------------------------- Do you have any idea why it is happening ? Thanks and Regards, -Mahesh > > Etc etc > > Eitan From eitan at mellanox.co.il Mon Jan 14 00:01:46 2008 From: eitan at mellanox.co.il (Eitan Zahavi) Date: Mon, 14 Jan 2008 10:01:46 +0200 Subject: [ofa-general] RE: ib_macro_model on OMNET++ In-Reply-To: <829ded920801132033j46cef44ak9d215efd73065a24@mail.gmail.com> Message-ID: <6C2C79E72C305246B504CBA17B5500C90319F7B4@mtlexch01.mtl.com> Hi Mahesh I suspect the non existing parameter "GenModel" is still accessed by the gen.cc code. So there must be a bug in the gen.cc code. As you can guess the code I opened is a stripped down version of our internal model. As such I did not do too much of testing after the strip down. I will provide a fix later this week.
If you are able to debug and fix it yourself - please let me know. Thanks Eitan -----Original Message----- From: keshetti.mahesh at gmail.com [mailto:keshetti.mahesh at gmail.com] On Behalf Of Keshetti Mahesh Sent: Mon 14 January 2008 06:34 To: Eitan Zahavi Cc: openIB Subject: Re: ib_macro_model on OMNET++ > The model describes IB HCAs and Switches. > One can build networks of these models and simulate traffic through > the network. > The model accuratly describes how credits are flowing through the network. > The switches are built out of virtual output queues. > It also let you play with parameters for the switches and HCAs. > FDBs are programable. Hi Eitan, Thanks for the reply. I have successfully installed OMNET++ package on my machine and I am able to run samples available in that package. But when I tried to run a sample network from the ib_macro_model package, the run is getting aborted with the following message. -------------------------------------------------------------------------------------------------------------------------------- [root at n161 2h_1s]# ./2h_1s OMNeT++/OMNEST Discrete Event Simulation (C) 1992-2005 Andras Varga Release: 3.2, edition: Academic Public License. See the license for distribution terms and warranty disclaimer Setting up Cmdenv... Loading NED file: /home/maheshk/softwares/ib_macro_model/networks/2h_1s/../../src/hca.ned Loading NED file: /home/maheshk/softwares/ib_macro_model/networks/2h_1s/../../src/switch.ned Loading NED file: /home/maheshk/softwares/ib_macro_model/networks/2h_1s/../../src/gen.ned Loading NED file: /home/maheshk/softwares/ib_macro_model/networks/2h_1s/../../src/sink.ned Loading NED file: /home/maheshk/softwares/ib_macro_model/networks/2h_1s/../../src/ibuf.ned Loading NED file: /home/maheshk/softwares/ib_macro_model/networks/2h_1s/../../src/obuf.ned Loading NED file: /home/maheshk/softwares/ib_macro_model/networks/2h_1s/../../src/vlarb.ned Preparing for Run #1... Setting up network `FABRIC'... Initializing...
RUNTIME ERROR. A cRuntimeError exception is about to be thrown, and you requested (by setting debug-on-errors=true in the ini file) that errors abort execution and break into the debugger. - on Linux or Unix-like systems: you should now probably be running the simulation under gdb or another debugger. The simulation kernel will now raise a SIGABRT signal which will get you into the debugger. If you're not running under a debugger, you can still use the core dump for post-mortem debugging. - on Windows: your should have a just-in-time debugger (such as the Visual C++ IDE) enabled. The simulation kernel will now cause a debugger interrupt to get you into the debugger -- press the [Debug] button in the dialog that comes up. Once in the debugger, use its "view stack trace" command (in gdb: "bt") to see the context of the runtime error. See error text below. Error in module (IBGenerator) FABRIC.H_1.gen: has no parameter called `GenModel'. Aborted -------------------------------------------------------------------------------------------------------------------------------- Do you have any idea why it is happening ? Thanks and Regards, -Mahesh > > Etc etc > > Eitan From koen.segers at vrt.be Mon Jan 14 00:22:39 2008 From: koen.segers at vrt.be (Koen Segers) Date: Mon, 14 Jan 2008 09:22:39 +0100 Subject: [ofa-general] RE: ib_macro_model on OMNET++ In-Reply-To: <6C2C79E72C305246B504CBA17B5500C90319F7B4@mtlexch01.mtl.com> References: <6C2C79E72C305246B504CBA17B5500C90319F7B4@mtlexch01.mtl.com> Message-ID: <1200298959.6668.7.camel@koenVRT> Hi, Is this simulation package based on ofed-1.2.5 or an older version of ofed? Might it be possible to load different ofed versions in omnet++? I'm not aware of the internal structure of omnet++, but I thought it was possible in ns2 to insert different library versions (for instance of tcp). 
Kind regards, Koen On Mon, 2008-01-14 at 10:01 +0200, Eitan Zahavi wrote: > Hi Mahesh > > I suspect the non existing parameter "GenModel" is still accessed by the gen.cc code. > So there must be a bug in the gen.cc code. > > As you can guess the code I opened is a stripped down version of our internal model. > As such I did not do too much of testing after the strip down. > > I will provide a fix later this week. > If you are able to debug and fix it yourself - please let me know. > > Thanks > > Eitan > > > > > -----Original Message----- > From: keshetti.mahesh at gmail.com [mailto:keshetti.mahesh at gmail.com] On Behalf Of Keshetti Mahesh > Sent: Mon 14 January 2008 06:34 > To: Eitan Zahavi > Cc: openIB > Subject: Re: ib_macro_model on OMNET++ > > > The model describes IB HCAs and Switches. > > One can build networks of these models and simulate traffic through > > the network. > > The model accuratly describes how credits are flowing through the network. > > The switches are built out of virtual output queues. > > It also let you play with parameters for the switches and HCAs. > > FDBs are programable. > > Hi Eitan, > > Thanks for the reply. I have successfully installed OMNET++ package on my machine and I am able to run samples available in that package. > > But when I tried to run a sample network from the ib_macro_model package, the run is getting aborted with the following message. > -------------------------------------------------------------------------------------------------------------------------------- > [root at n161 2h_1s]# ./2h_1s > OMNeT++/OMNEST Discrete Event Simulation (C) 1992-2005 Andras Varga > Release: 3.2, edition: Academic Public License. > See the license for distribution terms and warranty disclaimer Setting up Cmdenv...
> > Loading NED file: > /home/maheshk/softwares/ib_macro_model/networks/2h_1s/../../src/hca.ned > Loading NED file: > /home/maheshk/softwares/ib_macro_model/networks/2h_1s/../../src/switch.ned > Loading NED file: > /home/maheshk/softwares/ib_macro_model/networks/2h_1s/../../src/gen.ned > Loading NED file: > /home/maheshk/softwares/ib_macro_model/networks/2h_1s/../../src/sink.ned > Loading NED file: > /home/maheshk/softwares/ib_macro_model/networks/2h_1s/../../src/ibuf.ned > Loading NED file: > /home/maheshk/softwares/ib_macro_model/networks/2h_1s/../../src/obuf.ned > Loading NED file: > /home/maheshk/softwares/ib_macro_model/networks/2h_1s/../../src/vlarb.ned > > Preparing for Run #1... > Setting up network `FABRIC'... > Initializing... > > RUNTIME ERROR. A cRuntimeError exception is about to be thrown, and you requested (by setting debug-on-errors=true in the ini file) that errors abort execution and break into the debugger. > - on Linux or Unix-like systems: you should now probably be running the > simulation under gdb or another debugger. The simulation kernel will now > raise a SIGABRT signal which will get you into the debugger. If you're not > running under a debugger, you can still use the core dump for post-mortem > debugging. > - on Windows: your should have a just-in-time debugger (such as > the Visual C++ IDE) enabled. The simulation kernel will now > cause a debugger interrupt to get you into the debugger -- press > the [Debug] button in the dialog that comes up. > Once in the debugger, use its "view stack trace" command (in gdb: "bt") to see the context of the runtime error. See error text below. > > Error in module (IBGenerator) FABRIC.H_1.gen: has no parameter called `GenModel'. > Aborted > -------------------------------------------------------------------------------------------------------------------------------- > > Do you have any idea why it is happening ? 
> > Thanks and Regards, > -Mahesh > > > > Etc etc > > > > Eitan > _______________________________________________ > general mailing list > general at lists.openfabrics.org > http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general > > To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general *** Disclaimer *** Vlaamse Radio- en Televisieomroep Auguste Reyerslaan 52, 1043 Brussel nv van publiek recht BTW BE 0244.142.664 RPR Brussel http://www.vrt.be/disclaimer From eitan at mellanox.co.il Mon Jan 14 00:23:46 2008 From: eitan at mellanox.co.il (Eitan Zahavi) Date: Mon, 14 Jan 2008 10:23:46 +0200 Subject: [ofa-general] RE: ib_macro_model on OMNET++ In-Reply-To: <1200298959.6668.7.camel@koenVRT> Message-ID: <6C2C79E72C305246B504CBA17B5500C90319F80B@mtlexch01.mtl.com> Has nothing to do with OFED -----Original Message----- From: Koen Segers [mailto:koen.segers at vrt.be] Sent: Mon 14 January 2008 10:23 To: Eitan Zahavi Cc: keshetti.mahesh at gmail.com; openIB Subject: Re: [ofa-general] RE: ib_macro_model on OMNET++ Hi, Is this simulation package based on ofed-1.2.5 or an older version of ofed? Might it be possible to load different ofed versions in omnet++? I'm not aware of the internal structure of omnet++, but I thought it was possible in ns2 to insert different library versions (for instance of tcp). Kind regards, Koen On Mon, 2008-01-14 at 10:01 +0200, Eitan Zahavi wrote: > Hi Mahesh > > I suspect the non existing parameter "GenModel" is still accessed by the gen.cc code. > So there must be a bug in the gen.cc code. > > As you can guess the code I opened is a stripped down version of our internal model. > As such I did not do too much of testing after the strip down. > > I will provide a fix later this week. > If you are able to debug and fix it yourself - please let me know.
> > Thanks > > Eitan > > > > > -----Original Message----- > From: keshetti.mahesh at gmail.com [mailto:keshetti.mahesh at gmail.com] On > Behalf Of Keshetti Mahesh > Sent: Mon 14 January 2008 06:34 > To: Eitan Zahavi > Cc: openIB > Subject: Re: ib_macro_model on OMNET++ > > > The model describes IB HCAs and Switches. > > One can build networks of these models and simulate traffic through > > the network. > > The model accuratly describes how credits are flowing through the network. > > The switches are built out of virtual output queues. > > It also let you play with parameters for the switches and HCAs. > > FDBs are programable. > > Hi Eitan, > > Thanks for the reply. I have successfully installed OMNET++ package on my machine and I am able to run samples available in that package. > > But when I tried to run a sample network from the ib_macro_model package, the run is getting aborted with the following message. > ---------------------------------------------------------------------- > ---------------------------------------------------------- > [root at n161 2h_1s]# ./2h_1s > OMNeT++/OMNEST Discrete Event Simulation (C) 1992-2005 Andras Varga > Release: 3.2, edition: Academic Public License. > See the license for distribution terms and warranty disclaimer Setting up Cmdenv... > > Loading NED file: > /home/maheshk/softwares/ib_macro_model/networks/2h_1s/../../src/hca.ne > d > Loading NED file: > /home/maheshk/softwares/ib_macro_model/networks/2h_1s/../../src/switch > .ned > Loading NED file: > /home/maheshk/softwares/ib_macro_model/networks/2h_1s/../../src/gen.ne > d > Loading NED file: > /home/maheshk/softwares/ib_macro_model/networks/2h_1s/../../src/sink.n > ed > Loading NED file: > /home/maheshk/softwares/ib_macro_model/networks/2h_1s/../../src/ibuf.n > ed > Loading NED file: > /home/maheshk/softwares/ib_macro_model/networks/2h_1s/../../src/obuf.n > ed > Loading NED file: > /home/maheshk/softwares/ib_macro_model/networks/2h_1s/../../src/vlarb.
> ned > > Preparing for Run #1... > Setting up network `FABRIC'... > Initializing... > > RUNTIME ERROR. A cRuntimeError exception is about to be thrown, and you requested (by setting debug-on-errors=true in the ini file) that errors abort execution and break into the debugger. > - on Linux or Unix-like systems: you should now probably be running the > simulation under gdb or another debugger. The simulation kernel will now > raise a SIGABRT signal which will get you into the debugger. If you're not > running under a debugger, you can still use the core dump for post-mortem > debugging. > - on Windows: your should have a just-in-time debugger (such as > the Visual C++ IDE) enabled. The simulation kernel will now > cause a debugger interrupt to get you into the debugger -- press > the [Debug] button in the dialog that comes up. > Once in the debugger, use its "view stack trace" command (in gdb: "bt") to see the context of the runtime error. See error text below. > > Error in module (IBGenerator) FABRIC.H_1.gen: has no parameter called `GenModel'. > Aborted > ---------------------------------------------------------------------- > ---------------------------------------------------------- > > Do you have any idea why it is happening ? 
> > Thanks and Regards, > -Mahesh > > > > > Etc etc > > > > Eitan > _______________________________________________ > general mailing list > general at lists.openfabrics.org > http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general > > To unsubscribe, please visit > http://openib.org/mailman/listinfo/openib-general *** Disclaimer *** Vlaamse Radio- en Televisieomroep Auguste Reyerslaan 52, 1043 Brussel nv van publiek recht BTW BE 0244.142.664 RPR Brussel http://www.vrt.be/disclaimer From dotanb at dev.mellanox.co.il Mon Jan 14 01:31:26 2008 From: dotanb at dev.mellanox.co.il (Dotan Barak) Date: Mon, 14 Jan 2008 11:31:26 +0200 Subject: [ofa-general] [PATCH] libmlx4: Fix the value of the pkey_index in the completion In-Reply-To: <20080113200859.GJ10650@sashak.voltaire.com> References: <200801061801.25386.dotanb@dev.mellanox.co.il> <4781F18F.1070506@voltaire.com> <4781FB41.6040204@dev.mellanox.co.il> <4782160B.1080709@dev.mellanox.co.il> <478217A6.80307@dev.mellanox.co.il> <47821FF3.7020705@dev.mellanox.co.il> <20080113200859.GJ10650@sashak.voltaire.com> Message-ID: <478B2BEE.4000706@dev.mellanox.co.il> Sasha Khapyorsky wrote: > Hi Dotan, > > On 14:49 Mon 07 Jan , Dotan Barak wrote: > >> Dotan Barak wrote: >> >>> Yevgeny Kliteynik wrote: >>> >>>> Dotan Barak wrote: >>>> >>>>> Or Gerlitz wrote: >>>>> >>>>>> Dotan Barak wrote: >>>>>> >>>>>>> Fix the value of the pkey_index in the completion to get a valid value >>>>>>> for GSI QPs. >>>>>>> >>>>>> Is libmthca fine in that respect? >>>>>> >>>>> As much as i know, everything is fine with mthca/libmthca. >>>>> >>>>> We saw several problems only in ConnectX (because of the new low level >>>>> driver). >>>>> >>>>> Right now, we are doing some more checks to check the mlx4_0 low level >>>>> driver as well as the IB core. >>>>> After that we'll check the mthca low level driver too. >>>>> >>>> Currently OpenSM doesn't support any non-default pkey >>>> (or any pkey at index other than 0) in sa queries. 
>>>> When a request is received, opensm doesn't extract the >>>> right pkey from the mad header - it replaces it with a >>>> default pkey, and when a response is sent, OpenSM always >>>> uses pkey at index 0. >>>> >>>> -- Yevgeny >>>> >>> FYI: after several testings it seems that mthca low level driver don't have >>> this problem. >>> >>> Dotan >>> >>> >> Just to make sure that everything is clear: I checked that the mthca low >> level driver can extract >> the right pkey_index in the completion of GSI QP. >> >> The problem that Yevgeny mentioned exists in the openSM and i opened a bug >> on this issue. >> > > I tried mthca connected back-to-back (between ports 1 and 2). When > non-default P_Key value is configured (at any index, full membership on > both ports pkey tables and no 0xffff), saquery is timed out and trap 257 > (Bad P_Key) is reported to OpenSM. > > I'm using kernel 2.6.24-rc7-gcdf71a10 and FW 3.2.000. Could this be old > FW issue? > This is a really old FW and you should consider updating it ... What exactly did you do (and how)? I have trouble setting the pkey table in the subnet so that it does not use the default pkey. Dotan
From vst at vlnb.net Mon Jan 14 02:48:06 2008 From: vst at vlnb.net (Vladislav Bolkhovitin) Date: Mon, 14 Jan 2008 13:48:06 +0300 Subject: [ofa-general] [PATCH][REPOST] drivers/infiniband/ulp/srpt: Fix target data corruption In-Reply-To: <47888D9F.3020905@vlnb.net> References: <4787F142.mailGZ011OYG6@systemfabricworks.com> <47888D9F.3020905@vlnb.net> Message-ID: <478B3DE6.2080104@vlnb.net> Vladislav Bolkhovitin wrote: > davem at systemfabricworks.com wrote: > >> This is an updated version of [PATCH] drivers/infiniband/ulp/srpt: Fix >> target data corruption >> >> It was pointed out to me that the code to round up to a power of 2 was >> not as clean as it should be, plus I extracted two unrelated patches and >> submitted them separately. >> >> ===================================================================== >> >> Change the local buffer allocator to use a spin-lock protected linked >> list instead of an array of atomic_t used/free variables. The >> atomic_t >> code was open to a multi-thread race between test and set. This has >> been observed with the result that the same data buffer was used for >> more than one SCSI operation, either writing the wrong data to the >> disk >> or sending the wrong data to the initiator. > > > I, as a main SCST developer and implementor, would suggest to completely > remove internal memory management from the SRPT driver and use SCST > memory management instead. It will provide the following advantages: > > 1. Simplify SRPT driver and completely remove such kind of bugs. > > 2. Make SRPT target driver compatible with scst_user module, i.e. will > allow to use SRPT target driver with backstorage devices, implemented in > user space. Usual example of such devices is a VTL (Virtual Tape Library). > > 3. (Most likely, since I'm not too familiar with SRPT drivers internals, > but for me it looks like so) Allow SRPT driver to reliably work with > many outstanding commands with big data transfer sizes (>=1MB) > > 4.
Might improve performance by caching and reusing already allocated > and "iomaped" to Infiniband hardware SG vectors. Vu knows the details, > we discussed them with him. It will require some minor SCST > modifications (extending its interface with target drivers), but I'm > willing to make them if somebody ask for it. Oops, I forgot to mention, probably, the most important advantage: 5. Make SRPT target driver compatible with the upcoming zero-copy cache usage, where the system cache will be used directly without the extra copy. > Vlad > _______________________________________________ > general mailing list > general at lists.openfabrics.org > http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general > > To unsubscribe, please visit > http://openib.org/mailman/listinfo/openib-general > From vst at vlnb.net Mon Jan 14 02:49:02 2008 From: vst at vlnb.net (Vladislav Bolkhovitin) Date: Mon, 14 Jan 2008 13:49:02 +0300 Subject: [ofa-general] [PATCH][REPOST] drivers/infiniband/ulp/srpt: Fixtarget data corruption In-Reply-To: <6grf06$3258oc@rrcs-agw-02.hrndva.rr.com> References: <6grf06$3258oc@rrcs-agw-02.hrndva.rr.com> Message-ID: <478B3E1E.5000608@vlnb.net> Robert Pearson wrote: > Vlad, > > I think we agree. But, when we tried the experiment of running without the > local memory allocator scst hung when we did large IO operations. Probably > something simple. Why do you think it scst hung, not something else? Do you have evidences for that? ;) According to Vu, you can't simply switch to SCST memory allocator, because IB hardware has very limited number of available SG entries in commands (few tens), so for large request, where there are too many SG entries, they should be "iomapped" using the corresponding IB facility. > We can look harder. Next step for us is to sync up with Vu > on a few other changes in the works. 
> > Bob Pearson > > -----Original Message----- > From: general-bounces at lists.openfabrics.org > [mailto:general-bounces at lists.openfabrics.org] On Behalf Of Vladislav > Bolkhovitin > Sent: Saturday, January 12, 2008 3:51 AM > To: davem at systemfabricworks.com > Cc: vu at mellanox.com; general at lists.openfabrics.org > Subject: Re: [ofa-general] [PATCH][REPOST] drivers/infiniband/ulp/srpt: > Fixtarget data corruption > > davem at systemfabricworks.com wrote: > >>This is an updated version of [PATCH] drivers/infiniband/ulp/srpt: Fix >>target data corruption >> >>It was pointed out to me that the code to round up to a power of 2 was >>not as clean as it should be, plus I extracted two unrelated patches and >>submitted them separately. >> >>===================================================================== >> >> Change the local buffer allocator to use a spin-lock protected linked >> list instead of an array of atomic_t used/free variables. The atomic_t >> code was open to a multi-thread race between test and set. This has >> been observed with the result that the same data buffer was used for >> more than one SCSI operation, either writing the wrong data to the disk >> or sending the wrong data to the initiator. > > > I, as a main SCST developer and implementor, would suggest to completely > remove internal memory management from the SRPT driver and use SCST > memory management instead. It will provide the following advantages: > > 1. Simplify SRPT driver and completely remove such kind of bugs. > > 2. Make SRPT target driver compatible with scst_user module, i.e. will > allow to use SRPT target driver with backstorage devices, implemented in > user space. Usual example of such devices is a VTL (Virtual Tape Library). > > 3. (Most likely, since I'm not too familiar with SRPT drivers internals, > but for me it looks like so) Allow SRPT driver to reliably work with > many outstanding commands with big data transfer sizes (>=1MB) > > 4. 
Might improve performance by caching and reusing already allocated > and "iomaped" to Infiniband hardware SG vectors. Vu knows the details, > we discussed them with him. It will require some minor SCST > modifications (extending its interface with target drivers), but I'm > willing to make them if somebody ask for it. > > Vlad > _______________________________________________ > general mailing list > general at lists.openfabrics.org > http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general > > To unsubscribe, please visit > http://openib.org/mailman/listinfo/openib-general > > From vlad at lists.openfabrics.org Mon Jan 14 03:11:01 2008 From: vlad at lists.openfabrics.org (Vladimir Sokolovsky Mellanox) Date: Mon, 14 Jan 2008 03:11:01 -0800 (PST) Subject: [ofa-general] ofa_1_3_kernel 20080114-0200 daily build status Message-ID: <20080114111102.05879E6008A@openfabrics.org> This email was generated automatically, please do not reply git_url: git://git.openfabrics.org/ofed_1_3/linux-2.6.git git_branch: ofed_kernel Common build parameters: --with-ipoib-mod --with-sdp-mod --with-srp-mod --with-user_mad-mod --with-user_access-mod --with-mthca-mod --with-mlx4-mod --with-core-mod --with-addr_trans-mod --with-rds-mod --with-cxgb3-mod --with-nes-mod Passed: Passed on i686 with 2.6.15-23-server Passed on i686 with linux-2.6.21.1 Passed on i686 with linux-2.6.22 Passed on i686 with linux-2.6.18 Passed on i686 with linux-2.6.17 Passed on i686 with linux-2.6.16 Passed on i686 with linux-2.6.14 Passed on i686 with linux-2.6.12 Passed on i686 with linux-2.6.15 Passed on i686 with linux-2.6.13 Passed on i686 with linux-2.6.19 Passed on x86_64 with linux-2.6.22.5-31-default Passed on x86_64 with linux-2.6.21.1 Passed on x86_64 with linux-2.6.22 Passed on ia64 with linux-2.6.23 Passed on x86_64 with linux-2.6.16 Passed on ppc64 with linux-2.6.16 Passed on ia64 with linux-2.6.22 Passed on ia64 with linux-2.6.21.1 Passed on x86_64 with linux-2.6.19 Passed on x86_64 with 
linux-2.6.18-53.el5 Passed on ppc64 with linux-2.6.15 Passed on ia64 with linux-2.6.17 Passed on ppc64 with linux-2.6.18-8.el5 Passed on x86_64 with linux-2.6.13 Passed on x86_64 with linux-2.6.17 Passed on x86_64 with linux-2.6.20 Passed on powerpc with linux-2.6.13 Passed on powerpc with linux-2.6.14 Passed on ppc64 with linux-2.6.12 Passed on ppc64 with linux-2.6.14 Passed on ia64 with linux-2.6.12 Passed on x86_64 with linux-2.6.18 Passed on x86_64 with linux-2.6.9-42.ELsmp Passed on powerpc with linux-2.6.15 Passed on ia64 with linux-2.6.18 Passed on ia64 with linux-2.6.19 Passed on ia64 with linux-2.6.16 Passed on ia64 with linux-2.6.13 Passed on ppc64 with linux-2.6.18 Passed on ia64 with linux-2.6.15 Passed on powerpc with linux-2.6.12 Passed on ppc64 with linux-2.6.19 Passed on x86_64 with linux-2.6.15 Passed on x86_64 with linux-2.6.12 Passed on x86_64 with linux-2.6.18-8.el5 Passed on x86_64 with linux-2.6.14 Passed on ia64 with linux-2.6.14 Passed on x86_64 with linux-2.6.16.43-0.3-smp Passed on ppc64 with linux-2.6.17 Passed on x86_64 with linux-2.6.18-1.2798.fc6 Passed on x86_64 with linux-2.6.16.21-0.8-smp Passed on ppc64 with linux-2.6.13 Passed on x86_64 with linux-2.6.9-55.ELsmp Passed on ia64 with linux-2.6.16.21-0.8-default Failed: From tziporet at dev.mellanox.co.il Mon Jan 14 04:01:20 2008 From: tziporet at dev.mellanox.co.il (Tziporet Koren) Date: Mon, 14 Jan 2008 14:01:20 +0200 Subject: [ofa-general] ofa_kernel build error on RH 4.6 (2.6.9-67.ELsmp) In-Reply-To: <2E020D3DD4A80647AE77E1692F6E97D9C8C65E@FMSMSX420> References: <2E020D3DD4A80647AE77E1692F6E97D9C8C65E@FMSMSX420> Message-ID: <478B4F10.70707@mellanox.co.il> Benninghoff, John wrote: > I can build ofa_user RPMs but building the ofa_kernel RPMs fails with > the error below. Is this a known issue with RH 4.6? 
> > OFED 1.2.5.4 does not support RHEL4 up6. Only 1.2.5.5, which will be released soon, will support it. Meanwhile you can use the RC2 at: http://www.openfabrics.org/builds/connectx/ Tziporet From eli at mellanox.co.il Mon Jan 14 05:53:29 2008 From: eli at mellanox.co.il (Eli Cohen) Date: Mon, 14 Jan 2008 15:53:29 +0200 Subject: [ofa-general] [PATCH] libmthca: Ensure an Rx WQE is in memory before linking Message-ID: <1200318809.11174.191.camel@mtls03> Ensure an Rx WQE is in memory before linking Use a write memory barrier to ensure a WQE is written to memory before linking it to the previous WQE.
Signed-off-by: Eli Cohen
---
 src/qp.c | 1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/src/qp.c b/src/qp.c
index 23d7774..841e316 100644
--- a/src/qp.c
+++ b/src/qp.c
@@ -390,6 +390,7 @@ int mthca_tavor_post_recv(struct ibv_qp *ibqp, struct ibv_recv_wr *wr,
 		((struct mthca_next_seg *) prev_wqe)->nda_op =
 			htonl((ind << qp->rq.wqe_shift) | 1);
+		wmb();
 		((struct mthca_next_seg *) prev_wqe)->ee_nds =
 			htonl(MTHCA_NEXT_DBD | size);
--
1.5.3.8

From pk at q-leap.com Mon Jan 14 06:47:25 2008
From: pk at q-leap.com (Peter Kruse)
Date: Mon, 14 Jan 2008 15:47:25 +0100
Subject: [ofa-general] "File listed twice" when running build.sh
Message-ID: <478B75FD.4060500@q-leap.com>

Hello,

while trying to build OFED 1.2.5.5-rc2 (and 1.2.5.4) under SL 5
(Scientific Linux), I get these messages:

File listed twice: /usr/lib/librdmacm.so.1.0.0
File listed twice: /usr/lib/librdmacm.so.1.0.2
File listed twice: /usr/lib/librdmacm.so
File not found: /var/tmp/OFED/etc/dat.conf
File listed twice: /usr/lib/libdaplcma.a
File listed twice: /usr/lib/libdat.a

See Bug #856 for the complete logs. The build fails.
Where do these "File listed twice" messages come from, and what can I do
to run this build.sh script successfully?

Thanks,

Peter

--
Peter Kruse
Q-Leap Networks GmbH
phone: +497071-703171, mobile: +49172-6340044

From tziporet at dev.mellanox.co.il Mon Jan 14 07:36:18 2008
From: tziporet at dev.mellanox.co.il (Tziporet Koren)
Date: Mon, 14 Jan 2008 17:36:18 +0200
Subject: [ofa-general] Bogus Receive Completions
In-Reply-To: References: <475607AA.301@dls.net>
Message-ID: <478B8172.2010104@mellanox.co.il>

Roland Dreier wrote:
> > Mellanox: can you take this test case and see if it is indeed a
> > firmware issue? I could believe that there is a bug in libmthca's
> > mthca_tavor_post_recv() function too...
>
> Hi Tziporet -- any update about this issue (bad WQE address in CQE on
> non-mem-free HCAs)?
>
>
We succeeded in reproducing this problem here, and it is under debug.

Tziporet

From tziporet at dev.mellanox.co.il Mon Jan 14 08:15:22 2008
From: tziporet at dev.mellanox.co.il (Tziporet Koren)
Date: Mon, 14 Jan 2008 18:15:22 +0200
Subject: [ofa-general] Agenda for OFED meeting today
Message-ID: <478B8A9A.6020502@mellanox.co.il>

1. Review release status
2. Update on tasks that should be completed for RC2:
   * XRC - enhanced API - will be ready on Wed
   * IPoIB performance improvements for small messages - not a must for RC2
   * Open MPI 1.2.5-rc2 - done
3. Bugs review: Critical and major

bug_id  bug_severity  assigned_to                    short_short_desc
858     critical      bugzilla at openib.org         ibv_cmd_query_device fails
846     critical      jim at mellanox.com            SDP crash on RHEL5 ppc64 running netserver
849     critical      vlad at mellanox.co.il         IB_core is not compiling on ppc64 with rhel-4.6
760     major         eli at mellanox.co.il          UDP performance on Rx is lower than Tx
761     major         eli at mellanox.co.il          Poor and jittery UDP performance at small messages
736     major         rolandd at cisco.com           IBV_WC_RETRY_EXC_ERR errors with local rdma_reads
767     major         swise at opengridcomputing.com Non backport Kernels that don't build in genalloc cause compile errors for cxgb3

From sean.hefty at intel.com Mon Jan 14 09:26:32 2008
From: sean.hefty at intel.com (Sean Hefty)
Date: Mon, 14 Jan 2008 09:26:32 -0800
Subject: [ofa-general] [PATCH] libibumad: umad_get_pkey() function
In-Reply-To: <20080113193559.GH10650@sashak.voltaire.com>
References: <200801061801.25386.dotanb@dev.mellanox.co.il><4781F18F.1070506@voltaire.com><4781FB41.6040204@dev.mellanox.co.il><4782160B.1080709@dev.mellanox.co.il><478217A6.80307@dev.mellanox.co.il><47821FF3.7020705@dev.mellanox.co.il><1199718187.20870.102.camel@hrosenstock-ws.xsigo.com><20080113193435.GG10650@sashak.voltaire.com> <20080113193559.GH10650@sashak.voltaire.com>
Message-ID: <000201c856d2$9b8ae0b0$c0d0180a@amr.corp.intel.com>

>This returns value of pkey_index in network byte order from user_mad

Why would
you return an index in network byte order? - Sean From sean.hefty at intel.com Mon Jan 14 09:28:48 2008 From: sean.hefty at intel.com (Sean Hefty) Date: Mon, 14 Jan 2008 09:28:48 -0800 Subject: [ofa-general] [PATCH] libmlx4: prevent seg fault when sending bigmessages as inline In-Reply-To: <200801131743.24014.dotanb@dev.mellanox.co.il> References: <200801131743.24014.dotanb@dev.mellanox.co.il> Message-ID: <000301c856d2$ec87e8f0$c0d0180a@amr.corp.intel.com> >Fix the type of the variable that hold the number of bytes sent >as inline message so far. >Without the patch, If the user will try to use very big messages (total > 2^31) >there will be a seg fault. 2^31 is the max message size supported by IB. - Sean From hrosenstock at xsigo.com Mon Jan 14 09:47:35 2008 From: hrosenstock at xsigo.com (Hal Rosenstock) Date: Mon, 14 Jan 2008 09:47:35 -0800 Subject: [ofa-general] [PATCH] libibumad: umad_get_pkey() function In-Reply-To: <000201c856d2$9b8ae0b0$c0d0180a@amr.corp.intel.com> References: <200801061801.25386.dotanb@dev.mellanox.co.il> <4781F18F.1070506@voltaire.com><4781FB41.6040204@dev.mellanox.co.il> <4782160B.1080709@dev.mellanox.co.il><478217A6.80307@dev.mellanox.co.il> <47821FF3.7020705@dev.mellanox.co.il> <1199718187.20870.102.camel@hrosenstock-ws.xsigo.com> <20080113193435.GG10650@sashak.voltaire.com> <20080113193559.GH10650@sashak.voltaire.com> <000201c856d2$9b8ae0b0$c0d0180a@amr.corp.intel.com> Message-ID: <1200332855.8962.58.camel@hrosenstock-ws.xsigo.com> On Mon, 2008-01-14 at 09:26 -0800, Sean Hefty wrote: > >This returns value of pkey_index in network byte order from user_mad > > Why would you return an index in network byte order? Looking at the code, I think the description is wrong in terms of that and should remove the words "in network byte order" but the code looks right to me. In fact, it appears with new_user_api in umad_set_pkey that there was a bug (in the existing code) in that pkey_index was converted to network order. 
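
To make the byte-order point concrete, here is a minimal sketch. It is not libibumad code: the table, its contents, and the helper names lookup_pkey() and pkey_to_wire() are all hypothetical. The idea is that an index only selects a table slot, so it should stay a host-order integer, while a P_Key value that goes on the wire is what gets converted.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PKEY_TABLE_LEN 4	/* hypothetical table size */

/* Hypothetical P_Key table; the values are for illustration only. */
static const uint16_t pkey_table[PKEY_TABLE_LEN] = {
	0xffff, 0x8001, 0x8002, 0x8003
};

/*
 * Look up a P_Key by index. The index is a plain host-order integer.
 * Passing htons(index) here would select the wrong slot, or fall out
 * of range, on a little-endian host.
 */
static int lookup_pkey(unsigned int index, uint16_t *pkey)
{
	assert(pkey != NULL);
	if (index >= PKEY_TABLE_LEN)
		return -1;
	*pkey = pkey_table[index];
	return 0;
}

/* Only the P_Key value itself is converted when it goes on the wire. */
static uint16_t pkey_to_wire(uint16_t pkey)
{
	return htons(pkey);
}
```

On a little-endian host htons(1) is 0x0100, so a byte-swapped index of 1 would be rejected by the range check above instead of returning the second entry.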
-- Hal > - Sean > > _______________________________________________ > general mailing list > general at lists.openfabrics.org > http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general > > To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general From hrosenstock at xsigo.com Mon Jan 14 09:50:56 2008 From: hrosenstock at xsigo.com (Hal Rosenstock) Date: Mon, 14 Jan 2008 09:50:56 -0800 Subject: [ofa-general] IPoIB, OFED 1.2.5, and multicast groups. In-Reply-To: <1200211526.11174.128.camel@mtls03> References: <20080111193657.58477fb0.weiny2@llnl.gov> <1200211526.11174.128.camel@mtls03> Message-ID: <1200333056.8962.61.camel@hrosenstock-ws.xsigo.com> On Sun, 2008-01-13 at 10:05 +0200, Eli Cohen wrote: > IPOIB does not initiate a join to a mulitcast group (except for the > broadcast group). IPv6 does indeed do this on an IPoIB interface for solicited node multicast. > This comes from routing protocols or use space > sockets. Do you run processes that use many different multicast groups? > > > On Fri, 2008-01-11 at 19:36 -0800, Ira Weiny wrote: > > I don't really understand the innerworkings of IPoIB so forgive me if this is a > > really stupid question but: > > > > Is it a bug that there is a Multicast group created for every node in our > > clusters? > > > > If not a bug why is this done? We just tried to boot on a 1151 node cluster > > and opensm is complaining there are not enough multicast groups. > > > > Jan 11 18:30:42 728984 [40C05960] -> __get_new_mlid: ERR 1B23: All available:1024 mlids are taken > > Jan 11 18:30:42 729050 [40C05960] -> osm_mcmr_rcv_create_new_mgrp: ERR 1B19: __get_new_mlid failed > > Jan 11 18:30:42 730647 [40401960] -> __get_new_mlid: ERR 1B23: All available:1024 mlids are taken > > Jan 11 18:30:42 730691 [40401960] -> osm_mcmr_rcv_create_new_mgrp: ERR 1B19: __get_new_mlid failed > > > > > > Here is the output from my small test cluster: (ibnodesinmcast uses saquery a > > couple of times to print this nice report.) 
> > > > > > 19:17:24 > whatsup > > up: 9: wopr[0-7],wopri > > down: 0: > > root at wopri:/tftpboot/images > > 19:25:03 > ibnodesinmcast -g > > 0xC000 (0xff12401bffff0000 : 0x00000000ffffffff) > > In 9: wopr[0-7],wopri > > Out 0: 0 > > 0xC001 (0xff12401bffff0000 : 0x0000000000000001) > > In 9: wopr[0-7],wopri > > Out 0: 0 > > 0xC002 (0xff12601bffff0000 : 0x00000001ff2265ed) > > In 1: wopr3 > > Out 8: wopr[0-2,4-7],wopri > > 0xC003 (0xff12601bffff0000 : 0x0000000000000001) > > In 9: wopr[0-7],wopri > > Out 0: 0 > > 0xC004 (0xff12601bffff0000 : 0x00000001ff222729) > > In 1: wopr4 > > Out 8: wopr[0-3,5-7],wopri > > 0xC005 (0xff12601bffff0000 : 0x00000001ff219e65) > > In 1: wopri > > Out 8: wopr[0-7] > > 0xC006 (0xff12601bffff0000 : 0x00000001ff00232d) > > In 1: wopr6 > > Out 8: wopr[0-5,7],wopri > > 0xC007 (0xff12601bffff0000 : 0x00000001ff002325) > > In 1: wopr7 > > Out 8: wopr[0-6],wopri > > 0xC008 (0xff12601bffff0000 : 0x00000001ff228d35) > > In 1: wopr1 > > Out 8: wopr[0,2-7],wopri > > 0xC009 (0xff12601bffff0000 : 0x00000001ff2227f1) > > In 1: wopr2 > > Out 8: wopr[0-1,3-7],wopri > > 0xC00A (0xff12601bffff0000 : 0x00000001ff219ef1) > > In 1: wopr0 > > Out 8: wopr[1-7],wopri > > 0xC00B (0xff12601bffff0000 : 0x00000001ff0021e9) > > In 1: wopr5 > > Out 8: wopr[0-4,6-7],wopri > > > > > > Each of these MGIDS of the prefix (0xff12601bffff0000) have just one node in > > them and represent an ipv6 address. Could you turn off ipv6 with the latest > > IPoIB? 
> > > > In a bind, > > Ira > > _______________________________________________ > > general mailing list > > general at lists.openfabrics.org > > http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general > > > > To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general > > _______________________________________________ > general mailing list > general at lists.openfabrics.org > http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general > > To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general From hal.rosenstock at gmail.com Mon Jan 14 09:57:45 2008 From: hal.rosenstock at gmail.com (Hal Rosenstock) Date: Mon, 14 Jan 2008 12:57:45 -0500 Subject: [PATCH] opensm: Special Case the IPv6 Solicited Node Multicast address to use a single Mcast (WAS: Re: [ofa-general] IPoIB, OFED 1.2.5, and multicast groups.) In-Reply-To: <20080112000117.6b52b53c.weiny2@llnl.gov> References: <20080111193657.58477fb0.weiny2@llnl.gov> <20080111220456.0d62de97.weiny2@llnl.gov> <20080112000117.6b52b53c.weiny2@llnl.gov> Message-ID: Hi Ira, On 1/12/08, Ira Weiny wrote: > And to further answer my question...[*] > > This seems to fix the problem for us, however I know that it could be better. > For example it only takes care of partition 0xFFFF, and I think Jason's idea of > having say 16 Mcast Groups and some hash of these into them would be nice. But > is this on the right track? Am I missing some other place in the code? This is a start. Some initial comments on a quick scan of the approach used: This assumes a homogeneous subnet (in terms of rates and MTUs). I think that only groups which share the same rate and MTU can share the same MLID. Also, MLIDs will now need to be use counted and only removed when all the groups sharing that MLID are removed. I think this is a policy and rather than this always being the case, there should be a policy parameter added to OpenSM for this. IMO default should be to not do this. Maybe more later... 
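
A rough sketch of what use counting a shared MLID could look like, under the assumption that only groups with the same MTU and rate may share it. This is not OpenSM code: struct shared_mlid and the mlid_*() helpers are hypothetical names used only to illustrate the idea.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical bookkeeping for one MLID shared by several groups. */
struct shared_mlid {
	uint16_t mlid;
	uint8_t mtu;		/* MTU common to all groups on this MLID */
	uint8_t rate;		/* rate common to all groups on this MLID */
	unsigned int use_count;	/* number of groups currently sharing it */
};

/* A group may share the MLID only if its MTU and rate both match. */
static int mlid_can_share(const struct shared_mlid *m,
			  uint8_t mtu, uint8_t rate)
{
	return m->mtu == mtu && m->rate == rate;
}

/* Attach one more group; returns 0 on success, -1 on a mismatch. */
static int mlid_attach(struct shared_mlid *m, uint8_t mtu, uint8_t rate)
{
	if (!mlid_can_share(m, mtu, rate))
		return -1;
	m->use_count++;
	return 0;
}

/*
 * Detach one group. Returns 1 when the last user is gone and the MLID
 * can be released, 0 while other groups still use it.
 */
static int mlid_detach(struct shared_mlid *m)
{
	assert(m->use_count > 0);
	m->use_count--;
	return m->use_count == 0;
}
```

With something along these lines the MLID would be freed only on the detach that returns 1, not when the first of the sharing groups is removed.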
-- Hal > Thanks, > Ira > > [*] Again I apologize for the spam but we were in a bit of a panic as we only > have the big system for the weekend and IB was not part of the test... ;-) > > >From 35e35a9534bd49147886ac93ab1601acadcdbe26 Mon Sep 17 00:00:00 2001 > From: Ira K. Weiny > Date: Fri, 11 Jan 2008 22:58:19 -0800 > Subject: [PATCH] Special Case the IPv6 Solicited Node Multicast address to use a single Mcast > Group. > > Signed-off-by: root > --- > opensm/opensm/osm_sa_mcmember_record.c | 30 +++++++++++++++++++++++++++++- > opensm/opensm/osm_sa_path_record.c | 31 ++++++++++++++++++++++++++++++- > 2 files changed, 59 insertions(+), 2 deletions(-) > > diff --git a/opensm/opensm/osm_sa_mcmember_record.c b/opensm/opensm/osm_sa_mcmember_record.c > index 8eb97ad..6bcc124 100644 > --- a/opensm/opensm/osm_sa_mcmember_record.c > +++ b/opensm/opensm/osm_sa_mcmember_record.c > @@ -124,9 +124,37 @@ __search_mgrp_by_mgid(IN cl_map_item_t * const p_map_item, IN void *context) > /* compare entire MGID so different scope will not sneak in for > the same MGID */ > if (memcmp(&p_mgrp->mcmember_rec.mgid, > - &p_recvd_mcmember_rec->mgid, sizeof(ib_gid_t))) > + &p_recvd_mcmember_rec->mgid, sizeof(ib_gid_t))) { > + > + /* Special Case IPV6 Multicast Loopback addresses */ > + /* 0xff12601bffff0000 : 0x00000001ffXXXXXX */ > +#define SPEC_PREFIX (0xff12601bffff0000) > +#define INT_ID_MASK (0x00000001ff000000) > + uint64_t g_prefix = cl_ntoh64(p_mgrp->mcmember_rec.mgid.unicast.prefix); > + uint64_t g_interface_id = cl_ntoh64(p_mgrp->mcmember_rec.mgid.unicast.interface_id); > + uint64_t rcv_prefix = cl_ntoh64(p_recvd_mcmember_rec->mgid.unicast.prefix); > + uint64_t rcv_interface_id = cl_ntoh64(p_recvd_mcmember_rec->mgid.unicast.interface_id); > + > + if (rcv_prefix == SPEC_PREFIX > + && > + (rcv_interface_id & INT_ID_MASK) == INT_ID_MASK) { > + > + if ((g_prefix == rcv_prefix) > + && > + (g_interface_id & INT_ID_MASK) == > + (rcv_interface_id & INT_ID_MASK) > + ) { > + osm_log(sa->p_log, 
OSM_LOG_INFO, > + "Special Case Mcast Join for MGID " > + " MGID 0x%016"PRIx64" : 0x%016"PRIx64"\n", > + rcv_prefix, rcv_interface_id); > + goto match; > + } > + } > return; > + } > > +match: > if (p_ctxt->p_mgrp) { > osm_log(sa->p_log, OSM_LOG_ERROR, > "__search_mgrp_by_mgid: ERR 1B03: " > diff --git a/opensm/opensm/osm_sa_path_record.c b/opensm/opensm/osm_sa_path_record.c > index 749a936..469773a 100644 > --- a/opensm/opensm/osm_sa_path_record.c > +++ b/opensm/opensm/osm_sa_path_record.c > @@ -1536,8 +1536,37 @@ __search_mgrp_by_mgid(IN cl_map_item_t * const p_map_item, IN void *context) > > /* compare entire MGID so different scope will not sneak in for > the same MGID */ > - if (memcmp(&p_mgrp->mcmember_rec.mgid, p_recvd_mgid, sizeof(ib_gid_t))) > + if (memcmp(&p_mgrp->mcmember_rec.mgid, p_recvd_mgid, sizeof(ib_gid_t))) { > + > + /* Special Case IPV6 Multicast Loopback addresses */ > + /* 0xff12601bffff0000 : 0x00000001ffXXXXXX */ > +#define SPEC_PREFIX (0xff12601bffff0000) > +#define INT_ID_MASK (0x00000001ff000000) > + uint64_t g_prefix = cl_ntoh64(p_mgrp->mcmember_rec.mgid.unicast.prefix); > + uint64_t g_interface_id = cl_ntoh64(p_mgrp->mcmember_rec.mgid.unicast.interface_id); > + uint64_t rcv_prefix = cl_ntoh64(p_recvd_mgid->unicast.prefix); > + uint64_t rcv_interface_id = cl_ntoh64(p_recvd_mgid->unicast.interface_id); > + > + if (rcv_prefix == SPEC_PREFIX > + && > + (rcv_interface_id & INT_ID_MASK) == INT_ID_MASK) { > + > + if ((g_prefix == rcv_prefix) > + && > + (g_interface_id & INT_ID_MASK) == > + (rcv_interface_id & INT_ID_MASK) > + ) { > + osm_log(sa->p_log, OSM_LOG_INFO, > + "Special Case Mcast Join for MGID " > + " MGID 0x%016"PRIx64" : 0x%016"PRIx64"\n", > + rcv_prefix, rcv_interface_id); > + goto match; > + } > + } > return; > + } > + > +match: > > #if 0 > for (i = 0; > -- > 1.5.1 > > > > On Fri, 11 Jan 2008 22:04:56 -0800 > Ira Weiny wrote: > > > Ok, > > > > I found my own answer. Sorry for the spam. 
> > > > http://lists.openfabrics.org/pipermail/general/2006-November/029617.html > > > > Sorry, > > Ira > > > > > > On Fri, 11 Jan 2008 19:36:57 -0800 > > Ira Weiny wrote: > > > > > I don't really understand the innerworkings of IPoIB so forgive me if this is a > > > really stupid question but: > > > > > > Is it a bug that there is a Multicast group created for every node in our > > > clusters? > > > > > > If not a bug why is this done? We just tried to boot on a 1151 node cluster > > > and opensm is complaining there are not enough multicast groups. > > > > > > Jan 11 18:30:42 728984 [40C05960] -> __get_new_mlid: ERR 1B23: All available:1024 mlids are taken > > > Jan 11 18:30:42 729050 [40C05960] -> osm_mcmr_rcv_create_new_mgrp: ERR 1B19: __get_new_mlid failed > > > Jan 11 18:30:42 730647 [40401960] -> __get_new_mlid: ERR 1B23: All available:1024 mlids are taken > > > Jan 11 18:30:42 730691 [40401960] -> osm_mcmr_rcv_create_new_mgrp: ERR 1B19: __get_new_mlid failed > > > > > > > > > Here is the output from my small test cluster: (ibnodesinmcast uses saquery a > > > couple of times to print this nice report.) 
> > > > > > > > > 19:17:24 > whatsup > > > up: 9: wopr[0-7],wopri > > > down: 0: > > > root at wopri:/tftpboot/images > > > 19:25:03 > ibnodesinmcast -g > > > 0xC000 (0xff12401bffff0000 : 0x00000000ffffffff) > > > In 9: wopr[0-7],wopri > > > Out 0: 0 > > > 0xC001 (0xff12401bffff0000 : 0x0000000000000001) > > > In 9: wopr[0-7],wopri > > > Out 0: 0 > > > 0xC002 (0xff12601bffff0000 : 0x00000001ff2265ed) > > > In 1: wopr3 > > > Out 8: wopr[0-2,4-7],wopri > > > 0xC003 (0xff12601bffff0000 : 0x0000000000000001) > > > In 9: wopr[0-7],wopri > > > Out 0: 0 > > > 0xC004 (0xff12601bffff0000 : 0x00000001ff222729) > > > In 1: wopr4 > > > Out 8: wopr[0-3,5-7],wopri > > > 0xC005 (0xff12601bffff0000 : 0x00000001ff219e65) > > > In 1: wopri > > > Out 8: wopr[0-7] > > > 0xC006 (0xff12601bffff0000 : 0x00000001ff00232d) > > > In 1: wopr6 > > > Out 8: wopr[0-5,7],wopri > > > 0xC007 (0xff12601bffff0000 : 0x00000001ff002325) > > > In 1: wopr7 > > > Out 8: wopr[0-6],wopri > > > 0xC008 (0xff12601bffff0000 : 0x00000001ff228d35) > > > In 1: wopr1 > > > Out 8: wopr[0,2-7],wopri > > > 0xC009 (0xff12601bffff0000 : 0x00000001ff2227f1) > > > In 1: wopr2 > > > Out 8: wopr[0-1,3-7],wopri > > > 0xC00A (0xff12601bffff0000 : 0x00000001ff219ef1) > > > In 1: wopr0 > > > Out 8: wopr[1-7],wopri > > > 0xC00B (0xff12601bffff0000 : 0x00000001ff0021e9) > > > In 1: wopr5 > > > Out 8: wopr[0-4,6-7],wopri > > > > > > > > > Each of these MGIDS of the prefix (0xff12601bffff0000) have just one node in > > > them and represent an ipv6 address. Could you turn off ipv6 with the latest > > > IPoIB? 
> > > > > > In a bind, > > > Ira > > > _______________________________________________ > > > general mailing list > > > general at lists.openfabrics.org > > > http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general > > > > > > To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general > > _______________________________________________ > general mailing list > general at lists.openfabrics.org > http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general > > To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general > > From weiny2 at llnl.gov Mon Jan 14 10:51:32 2008 From: weiny2 at llnl.gov (Ira Weiny) Date: Mon, 14 Jan 2008 10:51:32 -0800 Subject: [PATCH] opensm: Special Case the IPv6 Solicited Node Multicast address to use a single Mcast (WAS: Re: [ofa-general] IPoIB, OFED 1.2.5, and multicast groups.) In-Reply-To: References: <20080111193657.58477fb0.weiny2@llnl.gov> <20080111220456.0d62de97.weiny2@llnl.gov> <20080112000117.6b52b53c.weiny2@llnl.gov> Message-ID: <20080114105132.19eafcee.weiny2@llnl.gov> Hey Hal, thanks for the response. Comments below. On Mon, 14 Jan 2008 12:57:45 -0500 "Hal Rosenstock" wrote: > Hi Ira, > > On 1/12/08, Ira Weiny wrote: > > And to further answer my question...[*] > > > > This seems to fix the problem for us, however I know that it could be better. > > For example it only takes care of partition 0xFFFF, and I think Jason's idea of > > having say 16 Mcast Groups and some hash of these into them would be nice. But > > is this on the right track? Am I missing some other place in the code? > > This is a start. > > Some initial comments on a quick scan of the approach used: > > This assumes a homogeneous subnet (in terms of rates and MTUs). I > think that only groups which share the same rate and MTU can share the > same MLID. Ah indeed this might be an issue. This might not be the best place for the code. 
:-( > > Also, MLIDs will now need to be use counted and only removed when all > the groups sharing that MLID are removed. I don't quite understand what you mean here. There is still a 1:1 mapping of MLID's to MGID's. All of the requests for this type of MGRP join are routed to one group. Therefore, I thought the same rules for deleting the group would apply; when all the members are gone it is removed? Just to be clear, after this patch the mgroups are: 09:36:40 > saquery -g MCMemberRecord group dump: MGID....................0xff12401bffff0000 : 0x00000000ffffffff Mlid....................0xC000 Mtu.....................0x84 pkey....................0xFFFF Rate....................0x83 MCMemberRecord group dump: MGID....................0xff12401bffff0000 : 0x0000000000000001 Mlid....................0xC001 Mtu.....................0x84 pkey....................0xFFFF Rate....................0x83 MCMemberRecord group dump: MGID....................0xff12601bffff0000 : 0x00000001ff0021e9 Mlid....................0xC002 Mtu.....................0x84 pkey....................0xFFFF Rate....................0x83 MCMemberRecord group dump: MGID....................0xff12601bffff0000 : 0x0000000000000001 Mlid....................0xC003 Mtu.....................0x84 pkey....................0xFFFF Rate....................0x83 All of these requests are added to the MGID....................0xff12601bffff0000 : 0x00000001ff0021e9 Mlid....................0xC002 group. But as you say, how do we determine that the pkey, mtu, and rate are valid? :-/ But here is a question: What happens if someone with an incorrect MTU tries to join the MGID....................0xff12401bffff0000 : 0x0000000000000001 group? Wouldn't this code return this mgrp pointer and the subsequent MTU and rate checks fail? I seem to recall a thread discussing this before. I don't remember what the outcome was. 
I seem to remember the question was if OpenSM should create/modify a group to the "lowest common" MTU/Rate, and succeed all the joins, vs enforcing the faster MTU/Rate and failing the joins. > > I think this is a policy and rather than this always being the case, > there should be a policy parameter added to OpenSM for this. IMO > default should be to not do this. Yes, for sure there needs to be some options to control the behavior. > > Maybe more later... Thanks again, Ira > > -- Hal > > > Thanks, > > Ira > > > > [*] Again I apologize for the spam but we were in a bit of a panic as we only > > have the big system for the weekend and IB was not part of the test... ;-) > > > > >From 35e35a9534bd49147886ac93ab1601acadcdbe26 Mon Sep 17 00:00:00 2001 > > From: Ira K. Weiny > > Date: Fri, 11 Jan 2008 22:58:19 -0800 > > Subject: [PATCH] Special Case the IPv6 Solicited Node Multicast address to use a single Mcast > > Group. > > > > Signed-off-by: root > > --- > > opensm/opensm/osm_sa_mcmember_record.c | 30 +++++++++++++++++++++++++++++- > > opensm/opensm/osm_sa_path_record.c | 31 ++++++++++++++++++++++++++++++- > > 2 files changed, 59 insertions(+), 2 deletions(-) > > > > diff --git a/opensm/opensm/osm_sa_mcmember_record.c b/opensm/opensm/osm_sa_mcmember_record.c > > index 8eb97ad..6bcc124 100644 > > --- a/opensm/opensm/osm_sa_mcmember_record.c > > +++ b/opensm/opensm/osm_sa_mcmember_record.c > > @@ -124,9 +124,37 @@ __search_mgrp_by_mgid(IN cl_map_item_t * const p_map_item, IN void *context) > > /* compare entire MGID so different scope will not sneak in for > > the same MGID */ > > if (memcmp(&p_mgrp->mcmember_rec.mgid, > > - &p_recvd_mcmember_rec->mgid, sizeof(ib_gid_t))) > > + &p_recvd_mcmember_rec->mgid, sizeof(ib_gid_t))) { > > + > > + /* Special Case IPV6 Multicast Loopback addresses */ > > + /* 0xff12601bffff0000 : 0x00000001ffXXXXXX */ > > +#define SPEC_PREFIX (0xff12601bffff0000) > > +#define INT_ID_MASK (0x00000001ff000000) > > + uint64_t g_prefix = 
cl_ntoh64(p_mgrp->mcmember_rec.mgid.unicast.prefix); > > + uint64_t g_interface_id = cl_ntoh64(p_mgrp->mcmember_rec.mgid.unicast.interface_id); > > + uint64_t rcv_prefix = cl_ntoh64(p_recvd_mcmember_rec->mgid.unicast.prefix); > > + uint64_t rcv_interface_id = cl_ntoh64(p_recvd_mcmember_rec->mgid.unicast.interface_id); > > + > > + if (rcv_prefix == SPEC_PREFIX > > + && > > + (rcv_interface_id & INT_ID_MASK) == INT_ID_MASK) { > > + > > + if ((g_prefix == rcv_prefix) > > + && > > + (g_interface_id & INT_ID_MASK) == > > + (rcv_interface_id & INT_ID_MASK) > > + ) { > > + osm_log(sa->p_log, OSM_LOG_INFO, > > + "Special Case Mcast Join for MGID " > > + " MGID 0x%016"PRIx64" : 0x%016"PRIx64"\n", > > + rcv_prefix, rcv_interface_id); > > + goto match; > > + } > > + } > > return; > > + } > > > > +match: > > if (p_ctxt->p_mgrp) { > > osm_log(sa->p_log, OSM_LOG_ERROR, > > "__search_mgrp_by_mgid: ERR 1B03: " > > diff --git a/opensm/opensm/osm_sa_path_record.c b/opensm/opensm/osm_sa_path_record.c > > index 749a936..469773a 100644 > > --- a/opensm/opensm/osm_sa_path_record.c > > +++ b/opensm/opensm/osm_sa_path_record.c > > @@ -1536,8 +1536,37 @@ __search_mgrp_by_mgid(IN cl_map_item_t * const p_map_item, IN void *context) > > > > /* compare entire MGID so different scope will not sneak in for > > the same MGID */ > > - if (memcmp(&p_mgrp->mcmember_rec.mgid, p_recvd_mgid, sizeof(ib_gid_t))) > > + if (memcmp(&p_mgrp->mcmember_rec.mgid, p_recvd_mgid, sizeof(ib_gid_t))) { > > + > > + /* Special Case IPV6 Multicast Loopback addresses */ > > + /* 0xff12601bffff0000 : 0x00000001ffXXXXXX */ > > +#define SPEC_PREFIX (0xff12601bffff0000) > > +#define INT_ID_MASK (0x00000001ff000000) > > + uint64_t g_prefix = cl_ntoh64(p_mgrp->mcmember_rec.mgid.unicast.prefix); > > + uint64_t g_interface_id = cl_ntoh64(p_mgrp->mcmember_rec.mgid.unicast.interface_id); > > + uint64_t rcv_prefix = cl_ntoh64(p_recvd_mgid->unicast.prefix); > > + uint64_t rcv_interface_id = 
cl_ntoh64(p_recvd_mgid->unicast.interface_id); > > + > > + if (rcv_prefix == SPEC_PREFIX > > + && > > + (rcv_interface_id & INT_ID_MASK) == INT_ID_MASK) { > > + > > + if ((g_prefix == rcv_prefix) > > + && > > + (g_interface_id & INT_ID_MASK) == > > + (rcv_interface_id & INT_ID_MASK) > > + ) { > > + osm_log(sa->p_log, OSM_LOG_INFO, > > + "Special Case Mcast Join for MGID " > > + " MGID 0x%016"PRIx64" : 0x%016"PRIx64"\n", > > + rcv_prefix, rcv_interface_id); > > + goto match; > > + } > > + } > > return; > > + } > > + > > +match: > > > > #if 0 > > for (i = 0; > > -- > > 1.5.1 > > > > > > > > On Fri, 11 Jan 2008 22:04:56 -0800 > > Ira Weiny wrote: > > > > > Ok, > > > > > > I found my own answer. Sorry for the spam. > > > > > > http://lists.openfabrics.org/pipermail/general/2006-November/029617.html > > > > > > Sorry, > > > Ira > > > > > > > > > On Fri, 11 Jan 2008 19:36:57 -0800 > > > Ira Weiny wrote: > > > > > > > I don't really understand the innerworkings of IPoIB so forgive me if this is a > > > > really stupid question but: > > > > > > > > Is it a bug that there is a Multicast group created for every node in our > > > > clusters? > > > > > > > > If not a bug why is this done? We just tried to boot on a 1151 node cluster > > > > and opensm is complaining there are not enough multicast groups. > > > > > > > > Jan 11 18:30:42 728984 [40C05960] -> __get_new_mlid: ERR 1B23: All available:1024 mlids are taken > > > > Jan 11 18:30:42 729050 [40C05960] -> osm_mcmr_rcv_create_new_mgrp: ERR 1B19: __get_new_mlid failed > > > > Jan 11 18:30:42 730647 [40401960] -> __get_new_mlid: ERR 1B23: All available:1024 mlids are taken > > > > Jan 11 18:30:42 730691 [40401960] -> osm_mcmr_rcv_create_new_mgrp: ERR 1B19: __get_new_mlid failed > > > > > > > > > > > > Here is the output from my small test cluster: (ibnodesinmcast uses saquery a > > > > couple of times to print this nice report.) 
> > > > > > > > > > > > 19:17:24 > whatsup > > > > up: 9: wopr[0-7],wopri > > > > down: 0: > > > > root at wopri:/tftpboot/images > > > > 19:25:03 > ibnodesinmcast -g > > > > 0xC000 (0xff12401bffff0000 : 0x00000000ffffffff) > > > > In 9: wopr[0-7],wopri > > > > Out 0: 0 > > > > 0xC001 (0xff12401bffff0000 : 0x0000000000000001) > > > > In 9: wopr[0-7],wopri > > > > Out 0: 0 > > > > 0xC002 (0xff12601bffff0000 : 0x00000001ff2265ed) > > > > In 1: wopr3 > > > > Out 8: wopr[0-2,4-7],wopri > > > > 0xC003 (0xff12601bffff0000 : 0x0000000000000001) > > > > In 9: wopr[0-7],wopri > > > > Out 0: 0 > > > > 0xC004 (0xff12601bffff0000 : 0x00000001ff222729) > > > > In 1: wopr4 > > > > Out 8: wopr[0-3,5-7],wopri > > > > 0xC005 (0xff12601bffff0000 : 0x00000001ff219e65) > > > > In 1: wopri > > > > Out 8: wopr[0-7] > > > > 0xC006 (0xff12601bffff0000 : 0x00000001ff00232d) > > > > In 1: wopr6 > > > > Out 8: wopr[0-5,7],wopri > > > > 0xC007 (0xff12601bffff0000 : 0x00000001ff002325) > > > > In 1: wopr7 > > > > Out 8: wopr[0-6],wopri > > > > 0xC008 (0xff12601bffff0000 : 0x00000001ff228d35) > > > > In 1: wopr1 > > > > Out 8: wopr[0,2-7],wopri > > > > 0xC009 (0xff12601bffff0000 : 0x00000001ff2227f1) > > > > In 1: wopr2 > > > > Out 8: wopr[0-1,3-7],wopri > > > > 0xC00A (0xff12601bffff0000 : 0x00000001ff219ef1) > > > > In 1: wopr0 > > > > Out 8: wopr[1-7],wopri > > > > 0xC00B (0xff12601bffff0000 : 0x00000001ff0021e9) > > > > In 1: wopr5 > > > > Out 8: wopr[0-4,6-7],wopri > > > > > > > > > > > > Each of these MGIDS of the prefix (0xff12601bffff0000) have just one node in > > > > them and represent an ipv6 address. Could you turn off ipv6 with the latest > > > > IPoIB? 
> > > > > > > > In a bind, > > > > Ira > > > > _______________________________________________ > > > > general mailing list > > > > general at lists.openfabrics.org > > > > http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general > > > > > > > > To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general > > > > _______________________________________________ > > general mailing list > > general at lists.openfabrics.org > > http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general > > > > To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general > > > > From vu at mellanox.com Mon Jan 14 10:54:32 2008 From: vu at mellanox.com (Vu Pham) Date: Mon, 14 Jan 2008 10:54:32 -0800 Subject: [ofa-general] [PATCH][REPOST] drivers/infiniband/ulp/srpt: Fixtarget data corruption In-Reply-To: <478B3E1E.5000608@vlnb.net> References: <6grf06$3258oc@rrcs-agw-02.hrndva.rr.com> <478B3E1E.5000608@vlnb.net> Message-ID: <478BAFE8.1090108@mellanox.com> Vladislav Bolkhovitin wrote: > Robert Pearson wrote: >> Vlad, >> >> I think we agree. But, when we tried the experiment of running without >> the >> local memory allocator scst hung when we did large IO operations. >> Probably >> something simple. > > Why do you think it scst hung, not something else? Do you have evidences > for that? ;) According to Vu, you can't simply switch to SCST memory > allocator, because IB hardware has very limited number of available SG > entries in commands (few tens), so for large request, where there are > too many SG entries, they should be "iomapped" using the corresponding > IB facility. No - you can easily switch to SCST memory by set srpt's module parameter mem_element=0. There is limited number of sg entries per IB work request (29); however, the current srpt can submit several IB work requests to cover large sg entries IO. -vu > >> We can look harder. Next step for us is to sync up with Vu >> on a few other changes in the works. 
>> >> Bob Pearson >> >> -----Original Message----- >> From: general-bounces at lists.openfabrics.org >> [mailto:general-bounces at lists.openfabrics.org] On Behalf Of Vladislav >> Bolkhovitin >> Sent: Saturday, January 12, 2008 3:51 AM >> To: davem at systemfabricworks.com >> Cc: vu at mellanox.com; general at lists.openfabrics.org >> Subject: Re: [ofa-general] [PATCH][REPOST] drivers/infiniband/ulp/srpt: >> Fixtarget data corruption >> >> davem at systemfabricworks.com wrote: >> >>> This is an updated version of [PATCH] drivers/infiniband/ulp/srpt: Fix >>> target data corruption >>> >>> It was pointed out to me that the code to round up to a power of 2 was >>> not as clean as it should be, plus I extracted two unrelated patches and >>> submitted them separately. >>> >>> ===================================================================== >>> >>> Change the local buffer allocator to use a spin-lock protected linked >>> list instead of an array of atomic_t used/free variables. The >>> atomic_t >>> code was open to a multi-thread race between test and set. This has >>> been observed with the result that the same data buffer was used for >>> more than one SCSI operation, either writing the wrong data to the >>> disk >>> or sending the wrong data to the initiator. >> >> >> I, as a main SCST developer and implementor, would suggest to >> completely remove internal memory management from the SRPT driver and >> use SCST memory management instead. It will provide the following >> advantages: >> >> 1. Simplify SRPT driver and completely remove such kind of bugs. >> >> 2. Make SRPT target driver compatible with scst_user module, i.e. will >> allow to use SRPT target driver with backstorage devices, implemented >> in user space. Usual example of such devices is a VTL (Virtual Tape >> Library). >> >> 3. 
(Most likely, since I'm not too familiar with SRPT drivers >> internals, but for me it looks like so) Allow SRPT driver to reliably >> work with many outstanding commands with big data transfer sizes (>=1MB) >> >> 4. Might improve performance by caching and reusing already allocated >> and "iomaped" to Infiniband hardware SG vectors. Vu knows the details, >> we discussed them with him. It will require some minor SCST >> modifications (extending its interface with target drivers), but I'm >> willing to make them if somebody ask for it. >> >> Vlad >> _______________________________________________ >> general mailing list >> general at lists.openfabrics.org >> http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general >> >> To unsubscribe, please visit >> http://openib.org/mailman/listinfo/openib-general >> >> > From vst at vlnb.net Mon Jan 14 11:02:18 2008 From: vst at vlnb.net (Vladislav Bolkhovitin) Date: Mon, 14 Jan 2008 22:02:18 +0300 Subject: [ofa-general] [PATCH][REPOST] drivers/infiniband/ulp/srpt: Fixtarget data corruption In-Reply-To: <478BAFE8.1090108@mellanox.com> References: <6grf06$3258oc@rrcs-agw-02.hrndva.rr.com> <478B3E1E.5000608@vlnb.net> <478BAFE8.1090108@mellanox.com> Message-ID: <478BB1BA.5040402@vlnb.net> Vu Pham wrote: > Vladislav Bolkhovitin wrote: > >> Robert Pearson wrote: >> >>> Vlad, >>> >>> I think we agree. But, when we tried the experiment of running >>> without the >>> local memory allocator scst hung when we did large IO operations. >>> Probably >>> something simple. >> >> Why do you think it scst hung, not something else? Do you have >> evidences for that? ;) According to Vu, you can't simply switch to >> SCST memory allocator, because IB hardware has very limited number of >> available SG entries in commands (few tens), so for large request, >> where there are too many SG entries, they should be "iomapped" using >> the corresponding IB facility. 
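[Editor's sketch, not part of the thread.] Vu's reply gives a limit of 29 SG entries per IB work request and says srpt covers larger IOs by submitting several work requests. The arithmetic can be sketched as below. This is only an illustration under that stated limit; the names MAX_SGE_PER_WR, wr_count_for_sg and sge_in_wr are hypothetical and are not taken from the srpt source.

```c
#include <stddef.h>

/* Assumed limit, taken from the thread: 29 SG entries per IB work request. */
#define MAX_SGE_PER_WR 29

/* Number of work requests needed to carry sg_count scatter/gather entries. */
static size_t wr_count_for_sg(size_t sg_count)
{
	return (sg_count + MAX_SGE_PER_WR - 1) / MAX_SGE_PER_WR;
}

/* Number of SG entries carried by work request wr_idx (0-based). */
static size_t sge_in_wr(size_t sg_count, size_t wr_idx)
{
	size_t start = wr_idx * MAX_SGE_PER_WR;

	if (start >= sg_count)
		return 0;
	return (sg_count - start < MAX_SGE_PER_WR) ?
		sg_count - start : MAX_SGE_PER_WR;
}
```

For example, a 1 MB transfer built from 4 KB pages has 256 SG entries, so under this limit it would need 9 work requests: eight full ones and a last one carrying 24 entries.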
> > No - you can easily switch to SCST memory by set srpt's module parameter > mem_element=0. There is limited number of sg entries per IB work request > (29); however, the current srpt can submit several IB work requests to > cover large sg entries IO. What's the point in the internal memory management in the SRPT driver then? > -vu > >> >>> We can look harder. Next step for us is to sync up with Vu >>> on a few other changes in the works. >>> >>> Bob Pearson >>> >>> -----Original Message----- >>> From: general-bounces at lists.openfabrics.org >>> [mailto:general-bounces at lists.openfabrics.org] On Behalf Of Vladislav >>> Bolkhovitin >>> Sent: Saturday, January 12, 2008 3:51 AM >>> To: davem at systemfabricworks.com >>> Cc: vu at mellanox.com; general at lists.openfabrics.org >>> Subject: Re: [ofa-general] [PATCH][REPOST] drivers/infiniband/ulp/srpt: >>> Fixtarget data corruption >>> >>> davem at systemfabricworks.com wrote: >>> >>>> This is an updated version of [PATCH] drivers/infiniband/ulp/srpt: Fix >>>> target data corruption >>>> >>>> It was pointed out to me that the code to round up to a power of 2 was >>>> not as clean as it should be, plus I extracted two unrelated patches >>>> and >>>> submitted them separately. >>>> >>>> ===================================================================== >>>> >>>> Change the local buffer allocator to use a spin-lock protected linked >>>> list instead of an array of atomic_t used/free variables. The >>>> atomic_t >>>> code was open to a multi-thread race between test and set. This has >>>> been observed with the result that the same data buffer was used for >>>> more than one SCSI operation, either writing the wrong data to the >>>> disk >>>> or sending the wrong data to the initiator. >>> >>> >>> >>> I, as a main SCST developer and implementor, would suggest to >>> completely remove internal memory management from the SRPT driver and >>> use SCST memory management instead. 
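[Editor's sketch, not part of the thread.] The quoted patch summary describes replacing an array of atomic_t used/free flags, where the test and the set were separate steps, with a linked list protected by a spin-lock. The idea can be sketched in user space as below. This is not the srpt code: a pthread mutex stands in for the kernel spinlock, and struct buf, struct buf_pool, pool_init, pool_get and pool_put are hypothetical names.

```c
#include <pthread.h>
#include <stddef.h>

struct buf {
	struct buf *next;	/* link while the buffer sits on the free list */
	char data[4096];	/* hypothetical element size */
};

struct buf_pool {
	pthread_mutex_t lock;	/* stands in for the driver's spinlock */
	struct buf *free_list;	/* singly linked list of free buffers */
};

/* Put all n buffers on the free list. */
static void pool_init(struct buf_pool *p, struct buf *bufs, size_t n)
{
	size_t i;

	pthread_mutex_init(&p->lock, NULL);
	p->free_list = NULL;
	for (i = 0; i < n; i++) {
		bufs[i].next = p->free_list;
		p->free_list = &bufs[i];
	}
}

/* Take one buffer, or NULL if none is free. The check and the unlink
 * happen under one lock, so two threads cannot get the same buffer. */
static struct buf *pool_get(struct buf_pool *p)
{
	struct buf *b;

	pthread_mutex_lock(&p->lock);
	b = p->free_list;
	if (b)
		p->free_list = b->next;
	pthread_mutex_unlock(&p->lock);
	return b;
}

/* Return a buffer to the free list. */
static void pool_put(struct buf_pool *p, struct buf *b)
{
	pthread_mutex_lock(&p->lock);
	b->next = p->free_list;
	p->free_list = b;
	pthread_mutex_unlock(&p->lock);
}
```

The point of the design is that "is it free?" and "mark it taken" are no longer two separate operations that another thread can slip between.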
It will provide the following >>> advantages: >>> >>> 1. Simplify SRPT driver and completely remove such kind of bugs. >>> >>> 2. Make SRPT target driver compatible with scst_user module, i.e. >>> will allow to use SRPT target driver with backstorage devices, >>> implemented in user space. Usual example of such devices is a VTL >>> (Virtual Tape Library). >>> >>> 3. (Most likely, since I'm not too familiar with SRPT drivers >>> internals, but for me it looks like so) Allow SRPT driver to reliably >>> work with many outstanding commands with big data transfer sizes (>=1MB) >>> >>> 4. Might improve performance by caching and reusing already allocated >>> and "iomaped" to Infiniband hardware SG vectors. Vu knows the >>> details, we discussed them with him. It will require some minor SCST >>> modifications (extending its interface with target drivers), but I'm >>> willing to make them if somebody ask for it. >>> >>> Vlad >>> _______________________________________________ >>> general mailing list >>> general at lists.openfabrics.org >>> http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general >>> >>> To unsubscribe, please visit >>> http://openib.org/mailman/listinfo/openib-general >>> >>> >> > > From weiny2 at llnl.gov Mon Jan 14 11:45:25 2008 From: weiny2 at llnl.gov (Ira Weiny) Date: Mon, 14 Jan 2008 11:45:25 -0800 Subject: [ofa-general] [PATCH 0/3] clean up "__get_mgrp_by_mgid" and add option --consolodate_ipv6_snm_req Message-ID: <20080114114525.1555fa6c.weiny2@llnl.gov> The following 3 patches are a much cleaner implementation of what I sent to the list on Friday. The first 2 patches are just code clean up and I feel should be applied. The 3rd adds the option --consolodate_ipv6_snm_req which causes all IPv6 Solicited Node Multicast requests to be grouped into one MCast group per partition. Hal has already raised concerns over the MTU and rate differences which might cause problems with this approach. 
I would also like to make an option to create multiple groups via some sort of hash function. But for now this is a clean patch which works for a homogeneous network. Thanks, Ira From weiny2 at llnl.gov Mon Jan 14 11:45:28 2008 From: weiny2 at llnl.gov (Ira Weiny) Date: Mon, 14 Jan 2008 11:45:28 -0800 Subject: [ofa-general] [PATCH 2/3] Removed unused and commented out var. Message-ID: <20080114114528.00df98ec.weiny2@llnl.gov> >From 42844aad5bf498a04ff36d4b97babb243caa83b3 Mon Sep 17 00:00:00 2001 From: Ira K. Weiny Date: Sun, 13 Jan 2008 15:42:58 -0800 Subject: [PATCH] Removed unused and commented out var. Signed-off-by: Ira K. Weiny --- opensm/opensm/osm_sa_mcmember_record.c | 1 - 1 files changed, 0 insertions(+), 1 deletions(-) diff --git a/opensm/opensm/osm_sa_mcmember_record.c b/opensm/opensm/osm_sa_mcmember_record.c index bd1f42b..d37a655 100644 --- a/opensm/opensm/osm_sa_mcmember_record.c +++ b/opensm/opensm/osm_sa_mcmember_record.c @@ -1157,7 +1157,6 @@ __search_mgrp_by_mgid(IN cl_map_item_t * const p_map_item, IN void *context) (osm_sa_pr_mcmr_search_ctxt_t *) context; const ib_gid_t *p_recvd_mgid; osm_sa_t *sa; - /* uint32_t i; */ p_recvd_mgid = p_ctxt->p_mgid; sa = p_ctxt->sa; -- 1.5.1 -------------- next part -------------- A non-text attachment was scrubbed... Name: 0002-Removed-unused-and-commented-out-var.patch Type: application/octet-stream Size: 846 bytes Desc: not available URL: From weiny2 at llnl.gov Mon Jan 14 11:45:27 2008 From: weiny2 at llnl.gov (Ira Weiny) Date: Mon, 14 Jan 2008 11:45:27 -0800 Subject: [ofa-general] [PATCH 1/3] Consolidate the 2 __get_mgrp_by_mgid functions into one global function which actually takes a mgid and returns a mgrp. Message-ID: <20080114114527.68a7109d.weiny2@llnl.gov> >From 18869ffce87b6b3fa906d300b793b881ce37fb9e Mon Sep 17 00:00:00 2001 From: Ira K.
Weiny Date: Sun, 13 Jan 2008 15:28:34 -0800 Subject: [PATCH] Consolidate the 2 __get_mgrp_by_mgid functions into one global function which actually takes a mgid and returns a mgrp. Signed-off-by: Ira K. Weiny --- opensm/include/opensm/osm_sa.h | 5 + opensm/opensm/osm_sa_mcmember_record.c | 137 +++++++++++++++---------------- opensm/opensm/osm_sa_path_record.c | 74 +----------------- 3 files changed, 72 insertions(+), 144 deletions(-) diff --git a/opensm/include/opensm/osm_sa.h b/opensm/include/opensm/osm_sa.h index 82ca1dc..751bc96 100644 --- a/opensm/include/opensm/osm_sa.h +++ b/opensm/include/opensm/osm_sa.h @@ -469,5 +469,10 @@ osm_mcmr_rcv_find_or_create_new_mgrp(IN osm_sa_t * sa, * *********/ +ib_api_status_t +osm_get_mgrp_by_mgid(IN osm_sa_t * sa, + IN ib_gid_t *p_mgid, + OUT osm_mgrp_t ** pp_mgrp); + END_C_DECLS #endif /* _OSM_SA_H_ */ diff --git a/opensm/opensm/osm_sa_mcmember_record.c b/opensm/opensm/osm_sa_mcmember_record.c index 8eb97ad..bd1f42b 100644 --- a/opensm/opensm/osm_sa_mcmember_record.c +++ b/opensm/opensm/osm_sa_mcmember_record.c @@ -96,49 +96,6 @@ typedef struct osm_sa_mcmr_search_ctxt { } osm_sa_mcmr_search_ctxt_t; /********************************************************************** - A search function that compares the given mgrp with the search context - if there is a match by mgid the p_mgrp is copied to the search context - p_mgrp component - - Inputs: - p_map_item - which is part of a mgrp object - context - points to the osm_sa_mcmr_search_ctxt_t including the mgid - looked for and the result p_mgrp -**********************************************************************/ -static void -__search_mgrp_by_mgid(IN cl_map_item_t * const p_map_item, IN void *context) -{ - osm_mgrp_t *p_mgrp = (osm_mgrp_t *) p_map_item; - osm_sa_mcmr_search_ctxt_t *p_ctxt = - (osm_sa_mcmr_search_ctxt_t *) context; - const ib_member_rec_t *p_recvd_mcmember_rec; - osm_sa_t *sa; - - p_recvd_mcmember_rec = p_ctxt->p_mcmember_rec; - sa = p_ctxt->sa; - - /* 
ignore groups marked for deletion */ - if (p_mgrp->to_be_deleted) - return; - - /* compare entire MGID so different scope will not sneak in for - the same MGID */ - if (memcmp(&p_mgrp->mcmember_rec.mgid, - &p_recvd_mcmember_rec->mgid, sizeof(ib_gid_t))) - return; - - if (p_ctxt->p_mgrp) { - osm_log(sa->p_log, OSM_LOG_ERROR, - "__search_mgrp_by_mgid: ERR 1B03: " - "Multiple MC groups for same MGID\n"); - return; - } - - p_ctxt->p_mgrp = p_mgrp; - -} - -/********************************************************************** Look for a MGRP in the mgrp_mlid_tbl by mlid **********************************************************************/ static osm_mgrp_t *__get_mgrp_by_mlid(IN osm_sa_t * sa, @@ -154,31 +111,6 @@ static osm_mgrp_t *__get_mgrp_by_mlid(IN osm_sa_t * sa, } -/********************************************************************** -Look for a MGRP in the mgrp_mlid_tbl by mgid -***********************************************************************/ -static ib_api_status_t -__get_mgrp_by_mgid(IN osm_sa_t * sa, - IN ib_member_rec_t * p_recvd_mcmember_rec, - OUT osm_mgrp_t ** pp_mgrp) -{ - osm_sa_mcmr_search_ctxt_t mcmr_search_context; - - mcmr_search_context.p_mcmember_rec = p_recvd_mcmember_rec; - mcmr_search_context.sa = sa; - mcmr_search_context.p_mgrp = NULL; - - cl_qmap_apply_func(&sa->p_subn->mgrp_mlid_tbl, - __search_mgrp_by_mgid, &mcmr_search_context); - - if (mcmr_search_context.p_mgrp == NULL) { - return IB_NOT_FOUND; - } - - *pp_mgrp = mcmr_search_context.p_mgrp; - return IB_SUCCESS; -} - /********************************************************************* Copy certain fields between two mcmember records used during the process of join request to copy data from the mgrp to the @@ -1208,6 +1140,69 @@ osm_mcmr_rcv_create_new_mgrp(IN osm_sa_t * sa, } + +typedef struct osm_sa_pr_mcmr_search_ctxt { + ib_gid_t *p_mgid; + osm_mgrp_t *p_mgrp; + osm_sa_t *sa; +} osm_sa_pr_mcmr_search_ctxt_t; + 
+/********************************************************************** + *********************************************************************/ +static void +__search_mgrp_by_mgid(IN cl_map_item_t * const p_map_item, IN void *context) +{ + osm_mgrp_t *p_mgrp = (osm_mgrp_t *) p_map_item; + osm_sa_pr_mcmr_search_ctxt_t *p_ctxt = + (osm_sa_pr_mcmr_search_ctxt_t *) context; + const ib_gid_t *p_recvd_mgid; + osm_sa_t *sa; + /* uint32_t i; */ + + p_recvd_mgid = p_ctxt->p_mgid; + sa = p_ctxt->sa; + + /* ignore groups marked for deletion */ + if (p_mgrp->to_be_deleted) + return; + + /* compare entire MGID so different scope will not sneak in for + the same MGID */ + if (memcmp(&p_mgrp->mcmember_rec.mgid, p_recvd_mgid, sizeof(ib_gid_t))) + return; + + if (p_ctxt->p_mgrp) { + osm_log(sa->p_log, OSM_LOG_ERROR, + "__search_mgrp_by_mgid: ERR 1F08: " + "Multiple MC groups for same MGID\n"); + return; + } + p_ctxt->p_mgrp = p_mgrp; +} + +/********************************************************************** + **********************************************************************/ +ib_api_status_t +osm_get_mgrp_by_mgid(IN osm_sa_t *sa, + IN ib_gid_t *p_mgid, + OUT osm_mgrp_t **pp_mgrp) +{ + osm_sa_pr_mcmr_search_ctxt_t mcmr_search_context; + + mcmr_search_context.p_mgid = p_mgid; + mcmr_search_context.sa = sa; + mcmr_search_context.p_mgrp = NULL; + + cl_qmap_apply_func(&sa->p_subn->mgrp_mlid_tbl, + __search_mgrp_by_mgid, &mcmr_search_context); + + if (mcmr_search_context.p_mgrp == NULL) + return IB_NOT_FOUND; + + *pp_mgrp = mcmr_search_context.p_mgrp; + return IB_SUCCESS; +} + /********************************************************************** Call this function to find or create a new mgrp. 
**********************************************************************/ @@ -1220,7 +1215,7 @@ osm_mcmr_rcv_find_or_create_new_mgrp(IN osm_sa_t * sa, { ib_api_status_t status; - status = __get_mgrp_by_mgid(sa, p_recvd_mcmember_rec, pp_mgrp); + status = osm_get_mgrp_by_mgid(sa, &p_recvd_mcmember_rec->mgid, pp_mgrp); if (status == IB_SUCCESS) return status; return osm_mcmr_rcv_create_new_mgrp(sa, comp_mask, @@ -1264,7 +1259,7 @@ __osm_mcmr_rcv_leave_mgrp(IN osm_sa_t * sa, } CL_PLOCK_EXCL_ACQUIRE(sa->p_lock); - status = __get_mgrp_by_mgid(sa, p_recvd_mcmember_rec, &p_mgrp); + status = osm_get_mgrp_by_mgid(sa, &p_recvd_mcmember_rec->mgid, &p_mgrp); if (status == IB_SUCCESS) { mlid = p_mgrp->mlid; portguid = p_recvd_mcmember_rec->port_gid.unicast.interface_id; @@ -1440,7 +1435,7 @@ __osm_mcmr_rcv_join_mgrp(IN osm_sa_t * sa, &join_state); /* do we need to create a new group? */ - status = __get_mgrp_by_mgid(sa, p_recvd_mcmember_rec, &p_mgrp); + status = osm_get_mgrp_by_mgid(sa, &p_recvd_mcmember_rec->mgid, &p_mgrp); if ((status == IB_NOT_FOUND) || p_mgrp->to_be_deleted) { /* check for JoinState.FullMember = 1 o15.0.1.9 */ if ((join_state & 0x01) != 0x01) { diff --git a/opensm/opensm/osm_sa_path_record.c b/opensm/opensm/osm_sa_path_record.c index 749a936..816e8e2 100644 --- a/opensm/opensm/osm_sa_path_record.c +++ b/opensm/opensm/osm_sa_path_record.c @@ -88,12 +88,6 @@ typedef struct _osm_path_parms { boolean_t reversible; } osm_path_parms_t; -typedef struct osm_sa_pr_mcmr_search_ctxt { - ib_gid_t *p_mgid; - osm_mgrp_t *p_mgrp; - osm_sa_t *sa; -} osm_sa_pr_mcmr_search_ctxt_t; - static const ib_gid_t zero_gid = { {0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, @@ -1516,72 +1510,6 @@ __osm_pr_rcv_process_pair(IN osm_sa_t * sa, } /********************************************************************** - *********************************************************************/ -static void -__search_mgrp_by_mgid(IN cl_map_item_t * const p_map_item, IN void 
*context) -{ - osm_mgrp_t *p_mgrp = (osm_mgrp_t *) p_map_item; - osm_sa_pr_mcmr_search_ctxt_t *p_ctxt = - (osm_sa_pr_mcmr_search_ctxt_t *) context; - const ib_gid_t *p_recvd_mgid; - osm_sa_t *sa; - /* uint32_t i; */ - - p_recvd_mgid = p_ctxt->p_mgid; - sa = p_ctxt->sa; - - /* ignore groups marked for deletion */ - if (p_mgrp->to_be_deleted) - return; - - /* compare entire MGID so different scope will not sneak in for - the same MGID */ - if (memcmp(&p_mgrp->mcmember_rec.mgid, p_recvd_mgid, sizeof(ib_gid_t))) - return; - -#if 0 - for (i = 0; - i < sizeof(p_mgrp->mcmember_rec.mgid.multicast.raw_group_id); - i++) { - if (p_mgrp->mcmember_rec.mgid.multicast.raw_group_id[i] != - p_recvd_mgid->mgid.multicast.raw_group_id[i]) - return; - } -#endif - - if (p_ctxt->p_mgrp) { - osm_log(sa->p_log, OSM_LOG_ERROR, - "__search_mgrp_by_mgid: ERR 1F08: " - "Multiple MC groups for same MGID\n"); - return; - } - p_ctxt->p_mgrp = p_mgrp; -} - -/********************************************************************** - **********************************************************************/ -static ib_api_status_t -__get_mgrp_by_mgid(IN osm_sa_t * sa, - IN ib_path_rec_t * p_recvd_path_rec, - OUT osm_mgrp_t ** pp_mgrp) -{ - osm_sa_pr_mcmr_search_ctxt_t mcmr_search_context; - - mcmr_search_context.p_mgid = &p_recvd_path_rec->dgid; - mcmr_search_context.sa = sa; - mcmr_search_context.p_mgrp = NULL; - - cl_qmap_apply_func(&sa->p_subn->mgrp_mlid_tbl, - __search_mgrp_by_mgid, &mcmr_search_context); - - if (mcmr_search_context.p_mgrp == NULL) - return IB_NOT_FOUND; - - *pp_mgrp = mcmr_search_context.p_mgrp; - return IB_SUCCESS; -} - -/********************************************************************** **********************************************************************/ static osm_mgrp_t *__get_mgrp_by_mlid(IN osm_sa_t * sa, IN ib_net16_t const mlid) @@ -1615,7 +1543,7 @@ __osm_pr_get_mgrp(IN osm_sa_t * sa, comp_mask = p_sa_mad->comp_mask; if (comp_mask & IB_PR_COMPMASK_DGID) { - status = 
__get_mgrp_by_mgid(sa, p_pr, pp_mgrp); + status = osm_get_mgrp_by_mgid(sa, &p_pr->dgid, pp_mgrp); if (status != IB_SUCCESS) { osm_log(sa->p_log, OSM_LOG_ERROR, "__osm_pr_get_mgrp: ERR 1F09: " -- 1.5.1 -------------- next part -------------- A non-text attachment was scrubbed... Name: 0001-Consolidate-the-2-__get_mgrp_by_mgid-functions-into.patch Type: application/octet-stream Size: 10252 bytes Desc: not available URL: From weiny2 at llnl.gov Mon Jan 14 11:45:30 2008 From: weiny2 at llnl.gov (Ira Weiny) Date: Mon, 14 Jan 2008 11:45:30 -0800 Subject: [ofa-general] [PATCH 3/3] Add option to Special Case the IPv6 Solicited Node Multicast address into a single Mcast Group Message-ID: <20080114114530.22b15f58.weiny2@llnl.gov> >From a1d38895e7e34e9fec297b1dbdb0637ed858d6f0 Mon Sep 17 00:00:00 2001 From: Ira K. Weiny Date: Sun, 13 Jan 2008 16:03:31 -0800 Subject: [PATCH] Add option to Special Case the IPv6 Solicited Node Multicast address into a single Mcast Group Signed-off-by: Ira K. Weiny --- opensm/include/opensm/osm_subnet.h | 1 + opensm/man/opensm.8 | 4 +++ opensm/opensm/main.c | 4 +++ opensm/opensm/osm_sa_mcmember_record.c | 35 +++++++++++++++++++++++++++++++- opensm/opensm/osm_subnet.c | 9 ++++++++ 5 files changed, 52 insertions(+), 1 deletions(-) diff --git a/opensm/include/opensm/osm_subnet.h b/opensm/include/opensm/osm_subnet.h index 2a28045..558b34e 100644 --- a/opensm/include/opensm/osm_subnet.h +++ b/opensm/include/opensm/osm_subnet.h @@ -283,6 +283,7 @@ typedef struct _osm_subn_opt { char *event_plugin_name; char *node_name_map_name; char *prefix_routes_file; + boolean_t consolodate_ipv6_snm_req; } osm_subn_opt_t; /* * FIELDS diff --git a/opensm/man/opensm.8 b/opensm/man/opensm.8 index 475eeec..9c7b371 100644 --- a/opensm/man/opensm.8 +++ b/opensm/man/opensm.8 @@ -239,6 +239,10 @@ Specify the sweep time for the performance manager in seconds (default is 180 seconds). Only takes effect if --enable-perfmgr was specified at configure time. 
.TP +.BI --consolodate_ipv6_snm_reqests +Consolodate IPv6 Solicited Node Multicast group joins into 1 IB multicast +group. +.TP \fB\-v\fR, \fB\-\-verbose\fR This option increases the log verbosity level. The -v option may be specified multiple times diff --git a/opensm/opensm/main.c b/opensm/opensm/main.c index 4d0d51d..a84f6c2 100644 --- a/opensm/opensm/main.c +++ b/opensm/opensm/main.c @@ -615,6 +615,7 @@ int main(int argc, char *argv[]) {"perfmgr_sweep_time_s", 1, NULL, 2}, #endif {"prefix_routes_file", 1, NULL, 3}, + {"consolodate_ipv6_snm_reqests", 0, NULL, 4}, {NULL, 0, NULL, 0} /* Required at the end of the array */ }; @@ -916,6 +917,9 @@ int main(int argc, char *argv[]) case 3: opt.prefix_routes_file = optarg; break; + case 4: + opt.consolodate_ipv6_snm_req = TRUE; + break; case 'h': case '?': case ':': diff --git a/opensm/opensm/osm_sa_mcmember_record.c b/opensm/opensm/osm_sa_mcmember_record.c index d37a655..bfa5d2d 100644 --- a/opensm/opensm/osm_sa_mcmember_record.c +++ b/opensm/opensm/osm_sa_mcmember_record.c @@ -1167,9 +1167,42 @@ __search_mgrp_by_mgid(IN cl_map_item_t * const p_map_item, IN void *context) /* compare entire MGID so different scope will not sneak in for the same MGID */ - if (memcmp(&p_mgrp->mcmember_rec.mgid, p_recvd_mgid, sizeof(ib_gid_t))) + if (memcmp(&p_mgrp->mcmember_rec.mgid, p_recvd_mgid, sizeof(ib_gid_t))) { + + if (sa->p_subn->opt.consolodate_ipv6_snm_req) { + /* Special Case IPV6 Multicast Loopback addresses */ + /* 0xff12601bXXXX0000 : 0x00000001ffYYYYYY */ + /* Where XXXX is the partition and YYYYYY is the last 24 bits + * of the port guid */ +#define PREFIX_MASK (0xff12601b00000000) +#define INT_ID_MASK (0x00000001ff000000) + uint64_t g_prefix = cl_ntoh64(p_mgrp->mcmember_rec.mgid.unicast.prefix); + uint64_t g_interface_id = cl_ntoh64(p_mgrp->mcmember_rec.mgid.unicast.interface_id); + uint64_t rcv_prefix = cl_ntoh64(p_recvd_mgid->unicast.prefix); + uint64_t rcv_interface_id = cl_ntoh64(p_recvd_mgid->unicast.interface_id); + 
+ if (((rcv_prefix & PREFIX_MASK) == PREFIX_MASK) + && + (rcv_interface_id & INT_ID_MASK) == INT_ID_MASK) { + + if ((g_prefix == rcv_prefix) + && + (g_interface_id & INT_ID_MASK) == + (rcv_interface_id & INT_ID_MASK) + ) { + osm_log(sa->p_log, OSM_LOG_INFO, + "Special Case Mcast Join for MGID " + " MGID 0x%016"PRIx64" : 0x%016"PRIx64"\n", + rcv_prefix, rcv_interface_id); + goto match; + } + } + } + return; + } +match: if (p_ctxt->p_mgrp) { osm_log(sa->p_log, OSM_LOG_ERROR, "__search_mgrp_by_mgid: ERR 1F08: " diff --git a/opensm/opensm/osm_subnet.c b/opensm/opensm/osm_subnet.c index 0103940..558ea68 100644 --- a/opensm/opensm/osm_subnet.c +++ b/opensm/opensm/osm_subnet.c @@ -481,6 +481,7 @@ void osm_subn_set_default_opt(IN osm_subn_opt_t * const p_opt) p_opt->enable_quirks = FALSE; p_opt->no_clients_rereg = FALSE; p_opt->prefix_routes_file = OSM_DEFAULT_PREFIX_ROUTES_FILE; + p_opt->consolodate_ipv6_snm_req = FALSE; subn_set_default_qos_options(&p_opt->qos_options); subn_set_default_qos_options(&p_opt->qos_ca_options); subn_set_default_qos_options(&p_opt->qos_sw0_options); @@ -1394,6 +1395,9 @@ ib_api_status_t osm_subn_parse_conf_file(IN osm_subn_opt_t * const p_opts) opts_unpack_charp("prefix_routes_file", p_key, p_val, &p_opts->prefix_routes_file); + + opts_unpack_boolean("consolodate_ipv6_snm_req", + p_key, p_val, &p_opts->consolodate_ipv6_snm_req); } fclose(opts_file); @@ -1721,6 +1725,11 @@ ib_api_status_t osm_subn_write_conf_file(IN osm_subn_opt_t * const p_opts) "prefix_routes_file %s\n\n", p_opts->prefix_routes_file); + fprintf(opts_file, + "#\n# IPv6 MCast Options\n#\n" + "consolodate_ipv6_snm_req %s\n\n", + p_opts->consolodate_ipv6_snm_req ? "TRUE" : "FALSE"); + /* optional string attributes ... */ fclose(opts_file); -- 1.5.1 -------------- next part -------------- A non-text attachment was scrubbed... 
Name: 0003-Add-option-to-Special-Case-the-IPv6-Solicited-Node-M.patch Type: application/octet-stream Size: 5330 bytes Desc: not available URL: From hrosenstock at xsigo.com Mon Jan 14 12:23:34 2008 From: hrosenstock at xsigo.com (Hal Rosenstock) Date: Mon, 14 Jan 2008 12:23:34 -0800 Subject: [PATCH] opensm: Special Case the IPv6 Solicited Node Multicast address to use a single Mcast (WAS: Re: [ofa-general] IPoIB, OFED 1.2.5, and multicast groups.) In-Reply-To: <20080114105132.19eafcee.weiny2@llnl.gov> References: <20080111193657.58477fb0.weiny2@llnl.gov> <20080111220456.0d62de97.weiny2@llnl.gov> <20080112000117.6b52b53c.weiny2@llnl.gov> <20080114105132.19eafcee.weiny2@llnl.gov> Message-ID: <1200342214.8962.75.camel@hrosenstock-ws.xsigo.com> On Mon, 2008-01-14 at 10:51 -0800, Ira Weiny wrote: > Hey Hal, thanks for the response. Comments below. > > On Mon, 14 Jan 2008 12:57:45 -0500 > "Hal Rosenstock" wrote: > > > Hi Ira, > > > > On 1/12/08, Ira Weiny wrote: > > > And to further answer my question...[*] > > > > > > This seems to fix the problem for us, however I know that it could be better. > > > For example it only takes care of partition 0xFFFF, and I think Jason's idea of > > > having say 16 Mcast Groups and some hash of these into them would be nice. But > > > is this on the right track? Am I missing some other place in the code? > > > > This is a start. > > > > Some initial comments on a quick scan of the approach used: > > > > This assumes a homogeneous subnet (in terms of rates and MTUs). I > > think that only groups which share the same rate and MTU can share the > > same MLID. > > Ah indeed this might be an issue. This might not be the best place for the > code. :-( > > > > > Also, MLIDs will now need to be use counted and only removed when all > > the groups sharing that MLID are removed. > > I don't quite understand what you mean here. There is still a 1:1 mapping of > MLID's to MGID's. Didn't you just change that in that many MGIDs go to one MLID ? 
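[Editor's sketch, not part of the thread.] Hal's point is that once several MGIDs map to one MLID, the MLID can only be released when the last group using it is gone. A minimal use-count sketch of that idea follows. It is not OpenSM code: the table, MLID_COUNT, mlid_get and mlid_put are hypothetical, and callers are assumed to pass an MLID inside the table range.

```c
#include <stdint.h>

#define MLID_BASE  0xC000u	/* start of the InfiniBand multicast LID range */
#define MLID_COUNT 1024u	/* hypothetical table size */

static uint32_t mlid_users[MLID_COUNT];	/* number of groups sharing each MLID */

/* A group starts using mlid; returns the new use count. */
static uint32_t mlid_get(uint16_t mlid)
{
	return ++mlid_users[mlid - MLID_BASE];
}

/* A group stops using mlid; returns 1 when no group uses it any more
 * and the MLID itself may be released, 0 otherwise. */
static int mlid_put(uint16_t mlid)
{
	uint32_t *cnt = &mlid_users[mlid - MLID_BASE];

	if (*cnt > 0)
		(*cnt)--;
	return *cnt == 0;
}
```

With a count like this, deleting one group that shares an MLID would leave the MLID and its forwarding entries in place for the remaining groups.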
> All of the requests for this type of MGRP join are routed to > one group. Therefore, I thought the same rules for deleting the group would > apply; when all the members are gone it is removed? Yes, the group may go but not the underlying MLID as there are other groups which are sharing this. That's not what happens now. > Just to be clear, after > this patch the mgroups are: > > 09:36:40 > saquery -g > MCMemberRecord group dump: > MGID....................0xff12401bffff0000 : 0x00000000ffffffff > Mlid....................0xC000 > Mtu.....................0x84 > pkey....................0xFFFF > Rate....................0x83 > MCMemberRecord group dump: > MGID....................0xff12401bffff0000 : 0x0000000000000001 > Mlid....................0xC001 > Mtu.....................0x84 > pkey....................0xFFFF > Rate....................0x83 > MCMemberRecord group dump: > MGID....................0xff12601bffff0000 : 0x00000001ff0021e9 > Mlid....................0xC002 > Mtu.....................0x84 > pkey....................0xFFFF > Rate....................0x83 > MCMemberRecord group dump: > MGID....................0xff12601bffff0000 : 0x0000000000000001 > Mlid....................0xC003 > Mtu.....................0x84 > pkey....................0xFFFF > Rate....................0x83 > > All of these requests are added to the > MGID....................0xff12601bffff0000 : 0x00000001ff0021e9 > Mlid....................0xC002 > group. But as you say, how do we determine that the pkey, mtu, and rate are > valid? :-/ > > But here is a question: > > What happens if someone with an incorrect MTU tries to join the > MGID....................0xff12401bffff0000 : 0x0000000000000001 > group? Wouldn't this code return this mgrp pointer and the subsequent MTU and > rate checks fail? I seem to recall a thread discussing this before. I don't > remember what the outcome was. 
I seem to remember the question was if OpenSM > should create/modify a group to the "lowest common" MTU/Rate, and succeed all > the joins, vs enforcing the faster MTU/Rate and failing the joins. Yes, the join would fail, but I don't think that's what we would want. The alternative with the patch is to make it the lowest rate but there is a minimum MTU which might not be right. > > I think this is a policy and rather than this always being the case, > > there should be a policy parameter added to OpenSM for this. IMO > > default should be to not do this. > > Yes, for sure there needs to be some options to control the behavior. > > > > > Maybe more later... > > Thanks again, > Ira > > > > > -- Hal > > > > > Thanks, > > > Ira > > > > > > [*] Again I apologize for the spam but we were in a bit of a panic as we only > > > have the big system for the weekend and IB was not part of the test... ;-) > > > > > > >From 35e35a9534bd49147886ac93ab1601acadcdbe26 Mon Sep 17 00:00:00 2001 > > > From: Ira K. Weiny > > > Date: Fri, 11 Jan 2008 22:58:19 -0800 > > > Subject: [PATCH] Special Case the IPv6 Solicited Node Multicast address to use a single Mcast > > > Group. 
> > >
> > > Signed-off-by: root
> > > ---
> > >  opensm/opensm/osm_sa_mcmember_record.c |   30 +++++++++++++++++++++++++++++-
> > >  opensm/opensm/osm_sa_path_record.c     |   31 ++++++++++++++++++++++++++++++-
> > >  2 files changed, 59 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/opensm/opensm/osm_sa_mcmember_record.c b/opensm/opensm/osm_sa_mcmember_record.c
> > > index 8eb97ad..6bcc124 100644
> > > --- a/opensm/opensm/osm_sa_mcmember_record.c
> > > +++ b/opensm/opensm/osm_sa_mcmember_record.c
> > > @@ -124,9 +124,37 @@ __search_mgrp_by_mgid(IN cl_map_item_t * const p_map_item, IN void *context)
> > >  	/* compare entire MGID so different scope will not sneak in for
> > >  	   the same MGID */
> > >  	if (memcmp(&p_mgrp->mcmember_rec.mgid,
> > > -		   &p_recvd_mcmember_rec->mgid, sizeof(ib_gid_t)))
> > > +		   &p_recvd_mcmember_rec->mgid, sizeof(ib_gid_t))) {
> > > +
> > > +		/* Special Case IPV6 Multicast Loopback addresses */
> > > +		/* 0xff12601bffff0000 : 0x00000001ffXXXXXX */
> > > +#define SPEC_PREFIX (0xff12601bffff0000)
> > > +#define INT_ID_MASK (0x00000001ff000000)
> > > +		uint64_t g_prefix = cl_ntoh64(p_mgrp->mcmember_rec.mgid.unicast.prefix);
> > > +		uint64_t g_interface_id = cl_ntoh64(p_mgrp->mcmember_rec.mgid.unicast.interface_id);
> > > +		uint64_t rcv_prefix = cl_ntoh64(p_recvd_mcmember_rec->mgid.unicast.prefix);
> > > +		uint64_t rcv_interface_id = cl_ntoh64(p_recvd_mcmember_rec->mgid.unicast.interface_id);
> > > +
> > > +		if (rcv_prefix == SPEC_PREFIX
> > > +			&&
> > > +			(rcv_interface_id & INT_ID_MASK) == INT_ID_MASK) {
> > > +
> > > +			if ((g_prefix == rcv_prefix)
> > > +				&&
> > > +				(g_interface_id & INT_ID_MASK) ==
> > > +					(rcv_interface_id & INT_ID_MASK)
> > > +				) {
> > > +				osm_log(sa->p_log, OSM_LOG_INFO,
> > > +					"Special Case Mcast Join for MGID "
> > > +					" MGID 0x%016"PRIx64" : 0x%016"PRIx64"\n",
> > > +					rcv_prefix, rcv_interface_id);
> > > +				goto match;
> > > +			}
> > > +		}
> > >  		return;
> > > +	}
> > >
> > > +match:
> > >  	if (p_ctxt->p_mgrp) {
> > >  		osm_log(sa->p_log, OSM_LOG_ERROR,
> > >  			"__search_mgrp_by_mgid: ERR 1B03: "
> > > diff --git a/opensm/opensm/osm_sa_path_record.c b/opensm/opensm/osm_sa_path_record.c
> > > index 749a936..469773a 100644
> > > --- a/opensm/opensm/osm_sa_path_record.c
> > > +++ b/opensm/opensm/osm_sa_path_record.c
> > > @@ -1536,8 +1536,37 @@ __search_mgrp_by_mgid(IN cl_map_item_t * const p_map_item, IN void *context)
> > >
> > >  	/* compare entire MGID so different scope will not sneak in for
> > >  	   the same MGID */
> > > -	if (memcmp(&p_mgrp->mcmember_rec.mgid, p_recvd_mgid, sizeof(ib_gid_t)))
> > > +	if (memcmp(&p_mgrp->mcmember_rec.mgid, p_recvd_mgid, sizeof(ib_gid_t))) {
> > > +
> > > +		/* Special Case IPV6 Multicast Loopback addresses */
> > > +		/* 0xff12601bffff0000 : 0x00000001ffXXXXXX */
> > > +#define SPEC_PREFIX (0xff12601bffff0000)
> > > +#define INT_ID_MASK (0x00000001ff000000)
> > > +		uint64_t g_prefix = cl_ntoh64(p_mgrp->mcmember_rec.mgid.unicast.prefix);
> > > +		uint64_t g_interface_id = cl_ntoh64(p_mgrp->mcmember_rec.mgid.unicast.interface_id);
> > > +		uint64_t rcv_prefix = cl_ntoh64(p_recvd_mgid->unicast.prefix);
> > > +		uint64_t rcv_interface_id = cl_ntoh64(p_recvd_mgid->unicast.interface_id);
> > > +
> > > +		if (rcv_prefix == SPEC_PREFIX
> > > +			&&
> > > +			(rcv_interface_id & INT_ID_MASK) == INT_ID_MASK) {
> > > +
> > > +			if ((g_prefix == rcv_prefix)
> > > +				&&
> > > +				(g_interface_id & INT_ID_MASK) ==
> > > +					(rcv_interface_id & INT_ID_MASK)
> > > +				) {
> > > +				osm_log(sa->p_log, OSM_LOG_INFO,
> > > +					"Special Case Mcast Join for MGID "
> > > +					" MGID 0x%016"PRIx64" : 0x%016"PRIx64"\n",
> > > +					rcv_prefix, rcv_interface_id);
> > > +				goto match;
> > > +			}
> > > +		}
> > >  		return;
> > > +	}
> > > +
> > > +match:
> > >
> > >  #if 0
> > >  	for (i = 0;
> > > --
> > > 1.5.1
> > >
> > >
> > >
> > > On Fri, 11 Jan 2008 22:04:56 -0800
> > > Ira Weiny wrote:
> > >
> > > > Ok,
> > > >
> > > > I found my own answer.  Sorry for the spam.
> > > >
> > > > http://lists.openfabrics.org/pipermail/general/2006-November/029617.html
> > > >
> > > > Sorry,
> > > > Ira
> > > >
> > > >
> > > > On Fri, 11 Jan 2008 19:36:57 -0800
> > > > Ira Weiny wrote:
> > > >
> > > > > I don't really understand the inner workings of IPoIB so forgive me if this is a
> > > > > really stupid question but:
> > > > >
> > > > > Is it a bug that there is a Multicast group created for every node in our
> > > > > clusters?
> > > > >
> > > > > If not a bug why is this done?  We just tried to boot on a 1151 node cluster
> > > > > and opensm is complaining there are not enough multicast groups.
> > > > >
> > > > >    Jan 11 18:30:42 728984 [40C05960] -> __get_new_mlid: ERR 1B23: All available:1024 mlids are taken
> > > > >    Jan 11 18:30:42 729050 [40C05960] -> osm_mcmr_rcv_create_new_mgrp: ERR 1B19: __get_new_mlid failed
> > > > >    Jan 11 18:30:42 730647 [40401960] -> __get_new_mlid: ERR 1B23: All available:1024 mlids are taken
> > > > >    Jan 11 18:30:42 730691 [40401960] -> osm_mcmr_rcv_create_new_mgrp: ERR 1B19: __get_new_mlid failed
> > > > >
> > > > >
> > > > > Here is the output from my small test cluster:  (ibnodesinmcast uses saquery a
> > > > > couple of times to print this nice report.)
> > > > >
> > > > >
> > > > > 19:17:24 > whatsup
> > > > > up:      9: wopr[0-7],wopri
> > > > > down:    0:
> > > > > root at wopri:/tftpboot/images
> > > > > 19:25:03 > ibnodesinmcast -g
> > > > > 0xC000 (0xff12401bffff0000 : 0x00000000ffffffff)
> > > > >    In  9: wopr[0-7],wopri
> > > > >    Out 0: 0
> > > > > 0xC001 (0xff12401bffff0000 : 0x0000000000000001)
> > > > >    In  9: wopr[0-7],wopri
> > > > >    Out 0: 0
> > > > > 0xC002 (0xff12601bffff0000 : 0x00000001ff2265ed)
> > > > >    In  1: wopr3
> > > > >    Out 8: wopr[0-2,4-7],wopri
> > > > > 0xC003 (0xff12601bffff0000 : 0x0000000000000001)
> > > > >    In  9: wopr[0-7],wopri
> > > > >    Out 0: 0
> > > > > 0xC004 (0xff12601bffff0000 : 0x00000001ff222729)
> > > > >    In  1: wopr4
> > > > >    Out 8: wopr[0-3,5-7],wopri
> > > > > 0xC005 (0xff12601bffff0000 : 0x00000001ff219e65)
> > > > >    In  1: wopri
> > > > >    Out 8: wopr[0-7]
> > > > > 0xC006 (0xff12601bffff0000 : 0x00000001ff00232d)
> > > > >    In  1: wopr6
> > > > >    Out 8: wopr[0-5,7],wopri
> > > > > 0xC007 (0xff12601bffff0000 : 0x00000001ff002325)
> > > > >    In  1: wopr7
> > > > >    Out 8: wopr[0-6],wopri
> > > > > 0xC008 (0xff12601bffff0000 : 0x00000001ff228d35)
> > > > >    In  1: wopr1
> > > > >    Out 8: wopr[0,2-7],wopri
> > > > > 0xC009 (0xff12601bffff0000 : 0x00000001ff2227f1)
> > > > >    In  1: wopr2
> > > > >    Out 8: wopr[0-1,3-7],wopri
> > > > > 0xC00A (0xff12601bffff0000 : 0x00000001ff219ef1)
> > > > >    In  1: wopr0
> > > > >    Out 8: wopr[1-7],wopri
> > > > > 0xC00B (0xff12601bffff0000 : 0x00000001ff0021e9)
> > > > >    In  1: wopr5
> > > > >    Out 8: wopr[0-4,6-7],wopri
> > > > >
> > > > >
> > > > > Each of these MGIDS of the prefix (0xff12601bffff0000) have just one node in
> > > > > them and represent an ipv6 address.  Could you turn off ipv6 with the latest
> > > > > IPoIB?
> > > > >
> > > > > In a bind,
> > > > > Ira
> > > > > _______________________________________________
> > > > > general mailing list
> > > > > general at lists.openfabrics.org
> > > > > http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general
> > > > >
> > > > > To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general

From hrosenstock at xsigo.com Mon Jan 14 12:24:59 2008
From: hrosenstock at xsigo.com (Hal Rosenstock)
Date: Mon, 14 Jan 2008 12:24:59 -0800
Subject: [ofa-general] [PATCH 0/3] clean up "__get_mgrp_by_mgid" and add option --consolodate_ipv6_snm_req
In-Reply-To: <20080114114525.1555fa6c.weiny2@llnl.gov>
References: <20080114114525.1555fa6c.weiny2@llnl.gov>
Message-ID: <1200342299.8962.78.camel@hrosenstock-ws.xsigo.com>

On Mon, 2008-01-14 at 11:45 -0800, Ira Weiny wrote:
> The following 3 patches are a much cleaner implementation of what I sent to the
> list on Friday.
>
> The first 2 patches are just code clean up and I feel should be applied.  The
> 3rd adds the option --consolodate_ipv6_snm_req which causes all IPv6 Solicited
> Node Multicast requests to be grouped into one MCast group per partition.
>
> Hal has already raised concerns over the MTU and rate differences which might
> cause problems with this approach.  I would also like to make an option to
> create multiple groups via some sort of hash function.
> But for now this is a clean patch which works for a homogeneous network.

I'm not so sure; I think there are problems on the deletion side. I
think the MLID gets torn down with the first group going; and the other
groups wouldn't work.

-- Hal

>
> Thanks,
> Ira

From sean.hefty at intel.com Mon Jan 14 13:02:27 2008
From: sean.hefty at intel.com (Sean Hefty)
Date: Mon, 14 Jan 2008 13:02:27 -0800
Subject: [ofa-general] lock dependency in ib_user_mad
In-Reply-To:
References: <200801081733.m08HXX3x013059@cmf.nrl.navy.mil>
Message-ID: <000201c856f0$c50ea430$9b37170a@amr.corp.intel.com>

>@@ -703,7 +708,8 @@ static int ib_umad_unreg_agent(struct ib_umad_file *file, u32 __user *arg)
> 	if (get_user(id, arg))
> 		return -EFAULT;
>
>-	down_write(&file->port->mutex);
>+	mutex_unlock(&file->port->file_mutex);

This should be mutex_lock().  The other changes look okay.  With your latest
patch and the change above, I was not able to reproduce the lockdep warnings.
(I've seen the warnings 3 or 4 times now, but it's not easy to reproduce.)

Thanks
- Sean

From rdreier at cisco.com Mon Jan 14 13:24:51 2008
From: rdreier at cisco.com (Roland Dreier)
Date: Mon, 14 Jan 2008 13:24:51 -0800
Subject: [ofa-general] lock dependency in ib_user_mad
In-Reply-To: <000201c856f0$c50ea430$9b37170a@amr.corp.intel.com> (Sean Hefty's message of "Mon, 14 Jan 2008 13:02:27 -0800")
References: <200801081733.m08HXX3x013059@cmf.nrl.navy.mil>
	<000201c856f0$c50ea430$9b37170a@amr.corp.intel.com>
Message-ID:

 > >-	down_write(&file->port->mutex);
 > >+	mutex_unlock(&file->port->file_mutex);
 >
 > This should be mutex_lock().

Yup, thanks... I fixed it in my tree.  I guess I never tested
unregistering a MAD agent...

 > The other changes look okay.
 > With your latest patch and the change above, I was not able to
 > reproduce the lockdep warnings.  (I've seen the warnings 3 or 4
 > times now, but it's not easy to reproduce.)

OK, thanks.  I think these changes are an improvement anyway, so I
guess I'll queue them up for 2.6.25 and hope things are fixed...

 - R.

From hrosenstock at xsigo.com Mon Jan 14 13:29:35 2008
From: hrosenstock at xsigo.com (Hal Rosenstock)
Date: Mon, 14 Jan 2008 13:29:35 -0800
Subject: [ofa-general] lock dependency in ib_user_mad
In-Reply-To:
References: <200801081733.m08HXX3x013059@cmf.nrl.navy.mil>
	<000201c856f0$c50ea430$9b37170a@amr.corp.intel.com>
Message-ID: <1200346175.8962.91.camel@hrosenstock-ws.xsigo.com>

Roland,

On Mon, 2008-01-14 at 13:24 -0800, Roland Dreier wrote:
> > >-	down_write(&file->port->mutex);
> > >+	mutex_unlock(&file->port->file_mutex);
> >
> > This should be mutex_lock().
>
> Yup, thanks... I fixed it in my tree.  I guess I never tested
> unregistering a MAD agent...
>
> > The other changes look okay.  With your latest patch and the change above, I was
> > not able to reproduce the lockdep warnings.  (I've seen the warnings 3 or 4
> > times now, but it's not easy to reproduce.)
>
> OK, thanks.  I think these changes are an improvement anyway, so I
> guess I'll queue them up for 2.6.25 and hope things are fixed...

Has there been any OpenSM (and diags) testing with this? I'd like Sasha
to ack this change (including testing multiple instances of opensm)
prior to submitting this to 2.6.25.

-- Hal

> - R.

From rdreier at cisco.com Mon Jan 14 13:29:44 2008
From: rdreier at cisco.com (Roland Dreier)
Date: Mon, 14 Jan 2008 13:29:44 -0800
Subject: [ofa-general] Re: [PATCH] libmthca: Ensure an Rx WQE is in memory before linking
In-Reply-To: <1200318809.11174.191.camel@mtls03> (Eli Cohen's message of "Mon, 14 Jan 2008 15:53:29 +0200")
References: <1200318809.11174.191.camel@mtls03>
Message-ID:

thanks, applied.  Did you find this from code review or is it fixing a
real problem on some platform?

 - R.

From rdreier at cisco.com Mon Jan 14 13:58:58 2008
From: rdreier at cisco.com (Roland Dreier)
Date: Mon, 14 Jan 2008 13:58:58 -0800
Subject: [ofa-general] lock dependency in ib_user_mad
In-Reply-To: <1200346175.8962.91.camel@hrosenstock-ws.xsigo.com> (Hal Rosenstock's message of "Mon, 14 Jan 2008 13:29:35 -0800")
References: <200801081733.m08HXX3x013059@cmf.nrl.navy.mil>
	<000201c856f0$c50ea430$9b37170a@amr.corp.intel.com>
	<1200346175.8962.91.camel@hrosenstock-ws.xsigo.com>
Message-ID:

 > Has there been any OpenSM (and diags) testing with this? I'd like Sasha
 > to ack this change (including testing multiple instances of opensm)
 > prior to submitting this to 2.6.25.

I did run opensm and osmtest briefly and they seemed to be fine.  I
haven't done any extensive testing though.

 - R.
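
The one-line fix agreed on earlier in this thread (take file_mutex with mutex_lock() on entry rather than calling mutex_unlock()) can be illustrated with a small user-space sketch. Everything below is hypothetical: the toy lock and the toy_* names are stand-ins invented for illustration, not the kernel's struct mutex or the ib_user_mad code, and the sketch assumes the real function releases the lock again on exit, which the quoted hunk does not show.

```c
#include <stdbool.h>

/* Hypothetical toy lock that records misuse; not the kernel's struct mutex. */
struct toy_mutex {
	bool held;
	int misuse;	/* counts unlock-while-not-held and lock-while-held */
};

static void toy_lock(struct toy_mutex *m)
{
	if (m->held)
		m->misuse++;
	m->held = true;
}

static void toy_unlock(struct toy_mutex *m)
{
	if (!m->held)
		m->misuse++;
	m->held = false;
}

/* Shape of the buggy hunk: "unlock" on entry, then unlock again on exit. */
static int toy_unreg_buggy(struct toy_mutex *m)
{
	toy_unlock(m);		/* should have been toy_lock() */
	/* ... the critical section would run here without the lock ... */
	toy_unlock(m);
	return m->misuse;
}

/* Corrected shape: every unlock is preceded by a matching lock. */
static int toy_unreg_fixed(struct toy_mutex *m)
{
	toy_lock(m);
	/* ... critical section ... */
	toy_unlock(m);
	return m->misuse;
}
```

Calling toy_unreg_buggy() on a fresh lock records two misuses (both unlocks hit a lock that is not held), while toy_unreg_fixed() records none. A tool such as lockdep exists to flag this kind of imbalance in the real kernel.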

From hrosenstock at xsigo.com Mon Jan 14 14:11:22 2008
From: hrosenstock at xsigo.com (Hal Rosenstock)
Date: Mon, 14 Jan 2008 14:11:22 -0800
Subject: [ofa-general] lock dependency in ib_user_mad
In-Reply-To:
References: <200801081733.m08HXX3x013059@cmf.nrl.navy.mil>
	<000201c856f0$c50ea430$9b37170a@amr.corp.intel.com>
	<1200346175.8962.91.camel@hrosenstock-ws.xsigo.com>
Message-ID: <1200348682.8962.107.camel@hrosenstock-ws.xsigo.com>

On Mon, 2008-01-14 at 13:58 -0800, Roland Dreier wrote:
> > Has there been any OpenSM (and diags) testing with this? I'd like Sasha
> > to ack this change (including testing multiple instances of opensm)
> > prior to submitting this to 2.6.25.
>
> I did run opensm and osmtest briefly and they seemed to be fine.

Glad to hear this :-)

> I haven't done any extensive testing though.

Did you try multiple OpenSM instances on the same port? My memory is a
little fuzzy on this but I think that there used to be some user_mad
error returned for this rather than it blocking. Was that behavior
preserved?

-- Hal

> - R.

From rdreier at cisco.com Mon Jan 14 14:13:33 2008
From: rdreier at cisco.com (Roland Dreier)
Date: Mon, 14 Jan 2008 14:13:33 -0800
Subject: [ofa-general] lock dependency in ib_user_mad
In-Reply-To: <1200348682.8962.107.camel@hrosenstock-ws.xsigo.com> (Hal Rosenstock's message of "Mon, 14 Jan 2008 14:11:22 -0800")
References: <200801081733.m08HXX3x013059@cmf.nrl.navy.mil>
	<000201c856f0$c50ea430$9b37170a@amr.corp.intel.com>
	<1200346175.8962.91.camel@hrosenstock-ws.xsigo.com>
	<1200348682.8962.107.camel@hrosenstock-ws.xsigo.com>
Message-ID:

 > Did you try multiple OpenSM instances on the same port? My memory is a
 > little fuzzy on this but I think that there used to be some user_mad
 > error returned for this rather than it blocking. Was that behavior
 > preserved?

There should be no changes in behavior other than fixes like not
deadlocking in some cases.
If I run opensm twice I get

    Error from osm_opensm_bind (0x2A)
    Perhaps another instance of OpenSM is already running
    Exiting SM

with the new code.

 - R.

From rdreier at cisco.com Mon Jan 14 14:18:57 2008
From: rdreier at cisco.com (Roland Dreier)
Date: Mon, 14 Jan 2008 14:18:57 -0800
Subject: [ofa-general] Re: [PATCH] ib/core: Add creation flags to create QP
In-Reply-To: <1199980899.11174.91.camel@mtls03> (Eli Cohen's message of "Thu, 10 Jan 2008 18:01:39 +0200")
References: <1199980899.11174.91.camel@mtls03>
Message-ID:

 > +enum qp_create_flags {
 > +	QP_CREATE_LSO = 1 << 0,
 > +};
 > +
 >  struct ib_qp_init_attr {
 >  	void (*event_handler)(struct ib_event *, void *);
 >  	void *qp_context;
 > @@ -496,6 +500,7 @@ struct ib_qp_init_attr {
 >  	enum ib_sig_type sq_sig_type;
 >  	enum ib_qp_type qp_type;
 >  	u8 port_num; /* special QP types only */
 > +	enum qp_create_flags create_flags;
 >  };

Not sure if this approach is a good one... would it make sense to
create a new QP type like IB_QPT_UD_LSO to handle LSO instead?  Are
there other flags we're going to want to add too?

Also this patch doesn't make much sense without the rest of the LSO
stuff really.

Finally, I think you need to audit all the places where struct
ib_qp_init_attr is used to make sure the flags are set correctly; for
example the uverbs_cmd.c create QP function seems like it would end up
passing a random stack value into create_flags.

 - R.
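
The last concern above (a caller that fills in only the members it knows about and so leaves a newly added member holding whatever was on the stack) can be sketched in user space. The names below (sketch_qp_init_attr, sketch_fill_known_fields, and so on) are hypothetical stand-ins made up for this illustration; they are not the kernel's struct ib_qp_init_attr or the uverbs_cmd.c code, and zeroing the struct is only one possible way to address the issue, not necessarily what the eventual patch did.

```c
#include <string.h>

/* Hypothetical stand-ins for illustration; not the kernel definitions. */
enum sketch_create_flags {
	SKETCH_CREATE_LSO = 1 << 0,
};

struct sketch_qp_init_attr {
	int qp_type;
	unsigned char port_num;
	enum sketch_create_flags create_flags;	/* newly added member */
};

/*
 * Models an existing caller written before create_flags existed: it
 * assigns only the members it knows about.
 */
static void sketch_fill_known_fields(struct sketch_qp_init_attr *attr)
{
	attr->qp_type = 2;
	attr->port_num = 1;
}

/*
 * Clearing the whole struct first gives every member, including ones
 * added later, a defined value of zero before the caller fills it in.
 */
static struct sketch_qp_init_attr sketch_make_attr(void)
{
	struct sketch_qp_init_attr attr;

	memset(&attr, 0, sizeof(attr));
	sketch_fill_known_fields(&attr);
	return attr;
}
```

With the memset() in place, create_flags is zero (no LSO requested) even though sketch_fill_known_fields() never mentions it. Without it, the value would be indeterminate, which is the "random stack value" described above.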

From sashak at voltaire.com Mon Jan 14 14:46:25 2008
From: sashak at voltaire.com (Sasha Khapyorsky)
Date: Mon, 14 Jan 2008 22:46:25 +0000
Subject: [ofa-general] [PATCH] libmlx4: Fix the value of the pkey_index in the completion
In-Reply-To: <478B2BEE.4000706@dev.mellanox.co.il>
References: <200801061801.25386.dotanb@dev.mellanox.co.il>
	<4781F18F.1070506@voltaire.com>
	<4781FB41.6040204@dev.mellanox.co.il>
	<4782160B.1080709@dev.mellanox.co.il>
	<478217A6.80307@dev.mellanox.co.il>
	<47821FF3.7020705@dev.mellanox.co.il>
	<20080113200859.GJ10650@sashak.voltaire.com>
	<478B2BEE.4000706@dev.mellanox.co.il>
Message-ID: <20080114224625.GA16009@sashak.voltaire.com>

On 11:31 Mon 14 Jan , Dotan Barak wrote:
> >>>
> >> Just to make sure that everything is clear: I checked that the mthca
> >> low level driver can extract the right pkey_index in the completion
> >> of GSI QP.
> >>
> >> The problem that Yevgeny mentioned exists in the openSM and I opened a
> >> bug on this issue.
> >>
> >
> > I tried mthca connected back-to-back (between ports 1 and 2). When
> > non-default P_Key value is configured (at any index, full membership on
> > both ports pkey tables and no 0xffff), saquery times out and trap 257
> > (Bad P_Key) is reported to OpenSM.
> >
> > I'm using kernel 2.6.24-rc7-gcdf71a10 and FW 3.2.000. Could this be an
> > old FW issue?
> >
> This is a really old FW and you should consider updating it ...

Actually everything works fine now (even with old FW) - it was a bug in
my test yesterday.

> What exactly did you do (and how)?

I'm using this to force saquery to use a non-zero pkey index:

diff --git a/opensm/libvendor/osm_vendor_ibumad_sa.c b/opensm/libvendor/osm_vendor_ibumad_sa.c
index 24f70bb..f23e67d 100644
--- a/opensm/libvendor/osm_vendor_ibumad_sa.c
+++ b/opensm/libvendor/osm_vendor_ibumad_sa.c
@@ -440,6 +440,7 @@ __osmv_send_sa_req(IN osmv_sa_bind_info_t * p_bind,
 	p_madw->mad_addr.addr_type.smi.source_lid =
 	    cl_hton16(p_bind->p_vendor->umad_port.base_lid);
 	p_madw->mad_addr.addr_type.gsi.remote_qp = CL_HTON32(1);
+	p_madw->mad_addr.addr_type.gsi.pkey = 1;
 	p_madw->resp_expected = TRUE;
 	p_madw->fail_msg = CL_DISP_MSGID_NONE;

>
> I have trouble setting the pkey table in the subnet to not use the default
> pkey.

This will prevent OpenSM from creating/updating the 0xffff P_Key:

diff --git a/opensm/opensm/osm_prtn.c b/opensm/opensm/osm_prtn.c
index 15a9c2a..bfb682f 100644
--- a/opensm/opensm/osm_prtn.c
+++ b/opensm/opensm/osm_prtn.c
@@ -368,9 +368,11 @@ ib_api_status_t osm_prtn_make_partitions(osm_log_t * const p_log,

 	global_pkey_counter = 0;

+#if 0
 	status = osm_prtn_make_default(p_log, p_subn, !is_config);
 	if (status != IB_SUCCESS)
 		goto _err;
+#endif

 	if (is_config && osm_prtn_config_parse_file(p_log, p_subn, file_name)) {
 		osm_log(p_log, OSM_LOG_VERBOSE,

And with it I'm using a partition config file to create various test cases.
Something like:

#Default=0xffff: ALL=limi ;
#P1=0x8001 : ALL = full ;
P2=0x0002 : ALL = limi, SELF=full ;
P3=0x0003 : ALL ;

Sasha

From sashak at voltaire.com Mon Jan 14 14:50:19 2008
From: sashak at voltaire.com (Sasha Khapyorsky)
Date: Mon, 14 Jan 2008 22:50:19 +0000
Subject: [ofa-general] [PATCH] libibumad: umad_get_pkey() function
In-Reply-To: <1200332855.8962.58.camel@hrosenstock-ws.xsigo.com>
References: <200801061801.25386.dotanb@dev.mellanox.co.il>
	<47821FF3.7020705@dev.mellanox.co.il>
	<1199718187.20870.102.camel@hrosenstock-ws.xsigo.com>
	<20080113193435.GG10650@sashak.voltaire.com>
	<20080113193559.GH10650@sashak.voltaire.com>
	<000201c856d2$9b8ae0b0$c0d0180a@amr.corp.intel.com>
	<1200332855.8962.58.camel@hrosenstock-ws.xsigo.com>
Message-ID: <20080114225019.GB16009@sashak.voltaire.com>

On 09:47 Mon 14 Jan , Hal Rosenstock wrote:
> On Mon, 2008-01-14 at 09:26 -0800, Sean Hefty wrote:
> > >This returns value of pkey_index in network byte order from user_mad
> >
> > Why would you return an index in network byte order?
>
> Looking at the code, I think the description is wrong in terms of that
> and should remove the words "in network byte order" but the code looks
> right to me. In fact, it appears with new_user_api in umad_set_pkey that
> there was a bug (in the existing code) in that pkey_index was converted
> to network order.

Right, there was a bug there. And I agree about the wrong patch
description; I will change it.

Sasha

From sashak at voltaire.com Mon Jan 14 14:54:15 2008
From: sashak at voltaire.com (Sasha Khapyorsky)
Date: Mon, 14 Jan 2008 22:54:15 +0000
Subject: [ofa-general] opensm dumps core when using LASH for routing
In-Reply-To: <18314.58208.527011.666002@kuku.melbourne.sgi.com>
References: <18311.19963.735177.83038@kuku.melbourne.sgi.com>
	<20080113084001.GC1903@sashak.voltaire.com>
	<18313.62780.11344.45768@kuku.melbourne.sgi.com>
	<20080113201747.GK10650@sashak.voltaire.com>
	<18314.58208.527011.666002@kuku.melbourne.sgi.com>
Message-ID: <20080114225415.GC16009@sashak.voltaire.com>

On 15:21 Mon 14 Jan , Max Matveev wrote:
> >>>>> "sashak" == Sasha Khapyorsky writes:
>
>  >> Should opensm ignore requests while it's initializing?
>
>  sashak> It is initialized, except a newly added switch.
> By "initialized" I meant that it has finished switch discovery and
> LID assignment at least for switches.
>
> BTW, this is a fabric with 22 switches and ~260 nodes.
>
>  sashak> Could you send me the core file and exact git tree hash? I would like
>  sashak> to investigate this deeper.
> I can send you the core but I cannot get git tree hash - I've only got
> source tarball (srpm, actually). It's supposed to be OFED 1.2 GA.
>
> I can give you a tarball of opensm and its link dependencies from a
> similar machine.

Yes, it would be fine. Thanks.

Sasha

> And, just in case it makes a difference, this is on x86_64 box.
>
> max

From weiny2 at llnl.gov Mon Jan 14 15:35:46 2008
From: weiny2 at llnl.gov (Ira Weiny)
Date: Mon, 14 Jan 2008 15:35:46 -0800
Subject: [PATCH] opensm: Special Case the IPv6 Solicited Node Multicast address to use a single Mcast (WAS: Re: [ofa-general] IPoIB, OFED 1.2.5, and multicast groups.)
In-Reply-To: <1200342214.8962.75.camel@hrosenstock-ws.xsigo.com>
References: <20080111193657.58477fb0.weiny2@llnl.gov>
	<20080111220456.0d62de97.weiny2@llnl.gov>
	<20080112000117.6b52b53c.weiny2@llnl.gov>
	<20080114105132.19eafcee.weiny2@llnl.gov>
	<1200342214.8962.75.camel@hrosenstock-ws.xsigo.com>
Message-ID: <20080114153546.73d43d6b.weiny2@llnl.gov>

On Mon, 14 Jan 2008 12:23:34 -0800
Hal Rosenstock wrote:

> On Mon, 2008-01-14 at 10:51 -0800, Ira Weiny wrote:
> > Hey Hal, thanks for the response.  Comments below.
> >
> > On Mon, 14 Jan 2008 12:57:45 -0500
> > "Hal Rosenstock" wrote:
> >
> > > Hi Ira,
> > >
> > > On 1/12/08, Ira Weiny wrote:
> > > > And to further answer my question...[*]
> > > >
> > > > This seems to fix the problem for us, however I know that it could be better.
> > > > For example it only takes care of partition 0xFFFF, and I think Jason's idea of
> > > > having say 16 Mcast Groups and some hash of these into them would be nice.  But
> > > > is this on the right track?  Am I missing some other place in the code?
> > >
> > > This is a start.
> > >
> > > Some initial comments on a quick scan of the approach used:
> > >
> > > This assumes a homogeneous subnet (in terms of rates and MTUs). I
> > > think that only groups which share the same rate and MTU can share the
> > > same MLID.
> >
> > Ah indeed this might be an issue.  This might not be the best place for the
> > code.  :-(
> >
> > >
> > > Also, MLIDs will now need to be use counted and only removed when all
> > > the groups sharing that MLID are removed.
> >
> > I don't quite understand what you mean here.  There is still a 1:1 mapping of
> > MLID's to MGID's.
>
> Didn't you just change that in that many MGIDs go to one MLID?

Ah, this is where the confusion has been.  No, this is _not_ what I did...  I
see now; that is what was proposed in the thread a year ago, however, I don't
think mapping many MGIDs to 1 MLID will work well.

What I did was to allow the first IPv6 request to create the group and then all
other requests were added to this group.  This sends all the neighbor discovery
messages to all nodes on the network.  This might seem inefficient but should
work.  (... and seems to.)

>
> > All of the requests for this type of MGRP join are routed to
> > one group.  Therefore, I thought the same rules for deleting the group would
> > apply; when all the members are gone it is removed?
>
> Yes, the group may go but not the underlying MLID as there are other
> groups which are sharing this. That's not what happens now.

No, since there is only 1 group in this implementation it should work like
others.  The first node of this "mgid type" will create the group.  Others will
join it and will continue to use it even if the creator leaves.

Does this make more sense?

Ira

>
> > Just to be clear, after
> > this patch the mgroups are:
> >
> > 09:36:40 > saquery -g
> > MCMemberRecord group dump:
> >     MGID....................0xff12401bffff0000 : 0x00000000ffffffff
> >     Mlid....................0xC000
> >     Mtu.....................0x84
> >     pkey....................0xFFFF
> >     Rate....................0x83
> > MCMemberRecord group dump:
> >     MGID....................0xff12401bffff0000 : 0x0000000000000001
> >     Mlid....................0xC001
> >     Mtu.....................0x84
> >     pkey....................0xFFFF
> >     Rate....................0x83
> > MCMemberRecord group dump:
> >     MGID....................0xff12601bffff0000 : 0x00000001ff0021e9
> >     Mlid....................0xC002
> >     Mtu.....................0x84
> >     pkey....................0xFFFF
> >     Rate....................0x83
> > MCMemberRecord group dump:
> >     MGID....................0xff12601bffff0000 : 0x0000000000000001
> >     Mlid....................0xC003
> >     Mtu.....................0x84
> >     pkey....................0xFFFF
> >     Rate....................0x83
> >
> > All of these requests are added to the
> >     MGID....................0xff12601bffff0000 : 0x00000001ff0021e9
> >     Mlid....................0xC002
> > group.  But as you say, how do we determine that the pkey, mtu, and rate are
> > valid?  :-/
> >
> > But here is a question:
> >
> > What happens if someone with an incorrect MTU tries to join the
> >     MGID....................0xff12401bffff0000 : 0x0000000000000001
> > group?  Wouldn't this code return this mgrp pointer and the subsequent MTU and
> > rate checks fail?  I seem to recall a thread discussing this before.  I don't
> > remember what the outcome was.  I seem to remember the question was if OpenSM
> > should create/modify a group to the "lowest common" MTU/Rate, and succeed all
> > the joins, vs enforcing the faster MTU/Rate and failing the joins.
>
> Yes, the join would fail, but I don't think that's what we would want.
> The alternative with the patch is to make it the lowest rate but there
> is a minimum MTU which might not be right.
>
> > > I think this is a policy and rather than this always being the case,
> > > there should be a policy parameter added to OpenSM for this. IMO
> > > default should be to not do this.
> >
> > Yes, for sure there needs to be some options to control the behavior.
> >
> > >
> > > Maybe more later...
> >
> > Thanks again,
> > Ira
> >
> > >
> > > -- Hal
> > >
> > > > Thanks,
> > > > Ira
> > > >
> > > > [*] Again I apologize for the spam but we were in a bit of a panic as we only
> > > > have the big system for the weekend and IB was not part of the test...  ;-)
> > > >
> > > > >From 35e35a9534bd49147886ac93ab1601acadcdbe26 Mon Sep 17 00:00:00 2001
> > > > From: Ira K. Weiny
> > > > Date: Fri, 11 Jan 2008 22:58:19 -0800
> > > > Subject: [PATCH] Special Case the IPv6 Solicited Node Multicast address to use a single Mcast
> > > >  Group.
> > > > > > > > > > > > In a bind, > > > > > > Ira > > > > > > _______________________________________________ > > > > > > general mailing list > > > > > > general at lists.openfabrics.org > > > > > > http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general > > > > > > > > > > > > To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general > > > > > > > > _______________________________________________ > > > > general mailing list > > > > general at lists.openfabrics.org > > > > http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general > > > > > > > > To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general > > > > > > > > > > _______________________________________________ > > general mailing list > > general at lists.openfabrics.org > > http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general > > > > To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general From sashak at voltaire.com Mon Jan 14 16:11:18 2008 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Tue, 15 Jan 2008 00:11:18 +0000 Subject: [ofa-general] Re: [PATCH 0/3] clean up "__get_mgrp_by_mgid" and add option --consolodate_ipv6_snm_req In-Reply-To: <20080114114525.1555fa6c.weiny2@llnl.gov> References: <20080114114525.1555fa6c.weiny2@llnl.gov> Message-ID: <20080115001118.GE16009@sashak.voltaire.com> On 11:45 Mon 14 Jan , Ira Weiny wrote: > The following 3 patches are a much cleaner implementation of what I sent to the > list on Friday. > > The first 2 patches are just code clean up and I feel should be applied. The > 3rd adds the option --consolodate_ipv6_snm_req which causes all IPv6 Solicited > Node Multicast requests to be grouped into one MCast group per partition. All three patches are applied. Thanks. 
Sasha

From vu at mellanox.com Mon Jan 14 15:58:36 2008
From: vu at mellanox.com (Vu Pham)
Date: Mon, 14 Jan 2008 15:58:36 -0800
Subject: [ofa-general] [PATCH][REPOST] drivers/infiniband/ulp/srpt: Fix target data corruption
In-Reply-To: <478BB1BA.5040402@vlnb.net>
References: <6grf06$3258oc@rrcs-agw-02.hrndva.rr.com> <478B3E1E.5000608@vlnb.net> <478BAFE8.1090108@mellanox.com> <478BB1BA.5040402@vlnb.net>
Message-ID: <478BF72C.1000009@mellanox.com>

Vladislav Bolkhovitin wrote:
> Vu Pham wrote:
>> Vladislav Bolkhovitin wrote:
>>
>>> Robert Pearson wrote:
>>>
>>>> Vlad,
>>>>
>>>> I think we agree. But, when we tried the experiment of running without the
>>>> local memory allocator scst hung when we did large IO operations. Probably
>>>> something simple.
>>>
>>> Why do you think it was scst that hung, not something else? Do you have
>>> evidence for that? ;) According to Vu, you can't simply switch to
>>> SCST memory allocator, because IB hardware has very limited number of
>>> available SG entries in commands (few tens), so for large request,
>>> where there are too many SG entries, they should be "iomapped" using
>>> the corresponding IB facility.
>>
>> No - you can easily switch to SCST memory by setting srpt's module
>> parameter mem_element=0. There is a limited number of sg entries per IB
>> work request (29); however, the current srpt can submit several IB
>> work requests to cover IO with many sg entries.
>
> What's the point in the internal memory management in the SRPT driver then?
>

+ To avoid some back-end storage which cannot handle big sg entries
+ To save work requests

However, I agree that I need to remove srpt's internal memory management.

-vu

From hrosenstock at xsigo.com Mon Jan 14 16:05:00 2008
From: hrosenstock at xsigo.com (Hal Rosenstock)
Date: Mon, 14 Jan 2008 16:05:00 -0800
Subject: [PATCH] opensm: Special Case the IPv6 Solicited Node Multicast address to use a single Mcast (WAS: Re: [ofa-general] IPoIB, OFED 1.2.5, and multicast groups.)
In-Reply-To: <20080114153546.73d43d6b.weiny2@llnl.gov> References: <20080111193657.58477fb0.weiny2@llnl.gov> <20080111220456.0d62de97.weiny2@llnl.gov> <20080112000117.6b52b53c.weiny2@llnl.gov> <20080114105132.19eafcee.weiny2@llnl.gov> <1200342214.8962.75.camel@hrosenstock-ws.xsigo.com> <20080114153546.73d43d6b.weiny2@llnl.gov> Message-ID: <1200355500.8962.148.camel@hrosenstock-ws.xsigo.com> On Mon, 2008-01-14 at 15:35 -0800, Ira Weiny wrote: > On Mon, 14 Jan 2008 12:23:34 -0800 > Hal Rosenstock wrote: > > > On Mon, 2008-01-14 at 10:51 -0800, Ira Weiny wrote: > > > Hey Hal, thanks for the response. Comments below. > > > > > > On Mon, 14 Jan 2008 12:57:45 -0500 > > > "Hal Rosenstock" wrote: > > > > > > > Hi Ira, > > > > > > > > On 1/12/08, Ira Weiny wrote: > > > > > And to further answer my question...[*] > > > > > > > > > > This seems to fix the problem for us, however I know that it could be better. > > > > > For example it only takes care of partition 0xFFFF, and I think Jason's idea of > > > > > having say 16 Mcast Groups and some hash of these into them would be nice. But > > > > > is this on the right track? Am I missing some other place in the code? > > > > > > > > This is a start. > > > > > > > > Some initial comments on a quick scan of the approach used: > > > > > > > > This assumes a homogeneous subnet (in terms of rates and MTUs). I > > > > think that only groups which share the same rate and MTU can share the > > > > same MLID. > > > > > > Ah indeed this might be an issue. This might not be the best place for the > > > code. :-( > > > > > > > > > > > Also, MLIDs will now need to be use counted and only removed when all > > > > the groups sharing that MLID are removed. > > > > > > I don't quite understand what you mean here. There is still a 1:1 mapping of > > > MLID's to MGID's. > > > > Didn't you just change that in that many MGIDs go to one MLID ? > > Ah, this is where the confusion has been. No, this is _not_ what I did... 
I
> see now; that is what was proposed in the thread a year ago, however, I don't
> think mapping many MGIDs to 1 MLID will work well.

Why not ?

It appears to be what you did (multiple MGIDs are mapped onto MLID (in
the case below 0xc002)). Am I mistaken ?

> What I did was to allow the first IPv6 request to create the group and then all
> other requests were added to this group.

You are using the word group loosely here and that is the source of the
confusion IMO. I think by group you mean MLID.

> This sends all the neighbor discovery messages to all nodes on the network.

All nodes part of that MLID tree.

> This might seem inefficient but should work. (... and seems to.)

Sure; the hosts will filter based on MGID. The tradeoff is MLID
utilization versus fabric utilization.

> > > All of the requests for this type of MGRP join are routed to
> > > one group. Therefore, I thought the same rules for deleting the group would
> > > apply; when all the members are gone it is removed?
> >
> > Yes, the group may go but not the underlying MLID as there are other
> > groups which are sharing this. That's not what happens now.
>
> No, since there is only 1 group in this implementation it should work like
> others. The first node of this "mgid type" will create the group. Others will
> join it and will continue to use it even if the creator leaves.

Are you saying all these groups appear as 1 "group" to OpenSM (as the
real groups are masked to the same value) ?

-- Hal

> Does this make more sense?
> > Ira > > > > > > Just to be clear, after > > > this patch the mgroups are: > > > > > > 09:36:40 > saquery -g > > > MCMemberRecord group dump: > > > MGID....................0xff12401bffff0000 : 0x00000000ffffffff > > > Mlid....................0xC000 > > > Mtu.....................0x84 > > > pkey....................0xFFFF > > > Rate....................0x83 > > > MCMemberRecord group dump: > > > MGID....................0xff12401bffff0000 : 0x0000000000000001 > > > Mlid....................0xC001 > > > Mtu.....................0x84 > > > pkey....................0xFFFF > > > Rate....................0x83 > > > MCMemberRecord group dump: > > > MGID....................0xff12601bffff0000 : 0x00000001ff0021e9 > > > Mlid....................0xC002 > > > Mtu.....................0x84 > > > pkey....................0xFFFF > > > Rate....................0x83 > > > MCMemberRecord group dump: > > > MGID....................0xff12601bffff0000 : 0x0000000000000001 > > > Mlid....................0xC003 > > > Mtu.....................0x84 > > > pkey....................0xFFFF > > > Rate....................0x83 > > > > > > All of these requests are added to the > > > MGID....................0xff12601bffff0000 : 0x00000001ff0021e9 > > > Mlid....................0xC002 > > > group. But as you say, how do we determine that the pkey, mtu, and rate are > > > valid? :-/ > > > > > > But here is a question: > > > > > > What happens if someone with an incorrect MTU tries to join the > > > MGID....................0xff12401bffff0000 : 0x0000000000000001 > > > group? Wouldn't this code return this mgrp pointer and the subsequent MTU and > > > rate checks fail? I seem to recall a thread discussing this before. I don't > > > remember what the outcome was. I seem to remember the question was if OpenSM > > > should create/modify a group to the "lowest common" MTU/Rate, and succeed all > > > the joins, vs enforcing the faster MTU/Rate and failing the joins. 
> > > > Yes, the join would fail, but I don't think that's what we would want. > > The alternative with the patch is to make it the lowest rate but there > > is a minimum MTU which might not be right. > > > > > > I think this is a policy and rather than this always being the case, > > > > there should be a policy parameter added to OpenSM for this. IMO > > > > default should be to not do this. > > > > > > Yes, for sure there needs to be some options to control the behavior. > > > > > > > > > > > Maybe more later... > > > > > > Thanks again, > > > Ira > > > > > > > > > > > -- Hal > > > > > > > > > Thanks, > > > > > Ira > > > > > > > > > > [*] Again I apologize for the spam but we were in a bit of a panic as we only > > > > > have the big system for the weekend and IB was not part of the test... ;-) > > > > > > > > > > >From 35e35a9534bd49147886ac93ab1601acadcdbe26 Mon Sep 17 00:00:00 2001 > > > > > From: Ira K. Weiny > > > > > Date: Fri, 11 Jan 2008 22:58:19 -0800 > > > > > Subject: [PATCH] Special Case the IPv6 Solicited Node Multicast address to use a single Mcast > > > > > Group. 
> > > > > > > > > > Signed-off-by: root > > > > > --- > > > > > opensm/opensm/osm_sa_mcmember_record.c | 30 +++++++++++++++++++++++++++++- > > > > > opensm/opensm/osm_sa_path_record.c | 31 ++++++++++++++++++++++++++++++- > > > > > 2 files changed, 59 insertions(+), 2 deletions(-) > > > > > > > > > > diff --git a/opensm/opensm/osm_sa_mcmember_record.c b/opensm/opensm/osm_sa_mcmember_record.c > > > > > index 8eb97ad..6bcc124 100644 > > > > > --- a/opensm/opensm/osm_sa_mcmember_record.c > > > > > +++ b/opensm/opensm/osm_sa_mcmember_record.c > > > > > @@ -124,9 +124,37 @@ __search_mgrp_by_mgid(IN cl_map_item_t * const p_map_item, IN void *context) > > > > > /* compare entire MGID so different scope will not sneak in for > > > > > the same MGID */ > > > > > if (memcmp(&p_mgrp->mcmember_rec.mgid, > > > > > - &p_recvd_mcmember_rec->mgid, sizeof(ib_gid_t))) > > > > > + &p_recvd_mcmember_rec->mgid, sizeof(ib_gid_t))) { > > > > > + > > > > > + /* Special Case IPV6 Multicast Loopback addresses */ > > > > > + /* 0xff12601bffff0000 : 0x00000001ffXXXXXX */ > > > > > +#define SPEC_PREFIX (0xff12601bffff0000) > > > > > +#define INT_ID_MASK (0x00000001ff000000) > > > > > + uint64_t g_prefix = cl_ntoh64(p_mgrp->mcmember_rec.mgid.unicast.prefix); > > > > > + uint64_t g_interface_id = cl_ntoh64(p_mgrp->mcmember_rec.mgid.unicast.interface_id); > > > > > + uint64_t rcv_prefix = cl_ntoh64(p_recvd_mcmember_rec->mgid.unicast.prefix); > > > > > + uint64_t rcv_interface_id = cl_ntoh64(p_recvd_mcmember_rec->mgid.unicast.interface_id); > > > > > + > > > > > + if (rcv_prefix == SPEC_PREFIX > > > > > + && > > > > > + (rcv_interface_id & INT_ID_MASK) == INT_ID_MASK) { > > > > > + > > > > > + if ((g_prefix == rcv_prefix) > > > > > + && > > > > > + (g_interface_id & INT_ID_MASK) == > > > > > + (rcv_interface_id & INT_ID_MASK) > > > > > + ) { > > > > > + osm_log(sa->p_log, OSM_LOG_INFO, > > > > > + "Special Case Mcast Join for MGID " > > > > > + " MGID 0x%016"PRIx64" : 0x%016"PRIx64"\n", > > > > > + 
rcv_prefix, rcv_interface_id); > > > > > + goto match; > > > > > + } > > > > > + } > > > > > return; > > > > > + } > > > > > > > > > > +match: > > > > > if (p_ctxt->p_mgrp) { > > > > > osm_log(sa->p_log, OSM_LOG_ERROR, > > > > > "__search_mgrp_by_mgid: ERR 1B03: " > > > > > diff --git a/opensm/opensm/osm_sa_path_record.c b/opensm/opensm/osm_sa_path_record.c > > > > > index 749a936..469773a 100644 > > > > > --- a/opensm/opensm/osm_sa_path_record.c > > > > > +++ b/opensm/opensm/osm_sa_path_record.c > > > > > @@ -1536,8 +1536,37 @@ __search_mgrp_by_mgid(IN cl_map_item_t * const p_map_item, IN void *context) > > > > > > > > > > /* compare entire MGID so different scope will not sneak in for > > > > > the same MGID */ > > > > > - if (memcmp(&p_mgrp->mcmember_rec.mgid, p_recvd_mgid, sizeof(ib_gid_t))) > > > > > + if (memcmp(&p_mgrp->mcmember_rec.mgid, p_recvd_mgid, sizeof(ib_gid_t))) { > > > > > + > > > > > + /* Special Case IPV6 Multicast Loopback addresses */ > > > > > + /* 0xff12601bffff0000 : 0x00000001ffXXXXXX */ > > > > > +#define SPEC_PREFIX (0xff12601bffff0000) > > > > > +#define INT_ID_MASK (0x00000001ff000000) > > > > > + uint64_t g_prefix = cl_ntoh64(p_mgrp->mcmember_rec.mgid.unicast.prefix); > > > > > + uint64_t g_interface_id = cl_ntoh64(p_mgrp->mcmember_rec.mgid.unicast.interface_id); > > > > > + uint64_t rcv_prefix = cl_ntoh64(p_recvd_mgid->unicast.prefix); > > > > > + uint64_t rcv_interface_id = cl_ntoh64(p_recvd_mgid->unicast.interface_id); > > > > > + > > > > > + if (rcv_prefix == SPEC_PREFIX > > > > > + && > > > > > + (rcv_interface_id & INT_ID_MASK) == INT_ID_MASK) { > > > > > + > > > > > + if ((g_prefix == rcv_prefix) > > > > > + && > > > > > + (g_interface_id & INT_ID_MASK) == > > > > > + (rcv_interface_id & INT_ID_MASK) > > > > > + ) { > > > > > + osm_log(sa->p_log, OSM_LOG_INFO, > > > > > + "Special Case Mcast Join for MGID " > > > > > + " MGID 0x%016"PRIx64" : 0x%016"PRIx64"\n", > > > > > + rcv_prefix, rcv_interface_id); > > > > > + goto match; > 
> > > > + } > > > > > + } > > > > > return; > > > > > + } > > > > > + > > > > > +match: > > > > > > > > > > #if 0 > > > > > for (i = 0; > > > > > -- > > > > > 1.5.1 > > > > > > > > > > > > > > > > > > > > On Fri, 11 Jan 2008 22:04:56 -0800 > > > > > Ira Weiny wrote: > > > > > > > > > > > Ok, > > > > > > > > > > > > I found my own answer. Sorry for the spam. > > > > > > > > > > > > http://lists.openfabrics.org/pipermail/general/2006-November/029617.html > > > > > > > > > > > > Sorry, > > > > > > Ira > > > > > > > > > > > > > > > > > > On Fri, 11 Jan 2008 19:36:57 -0800 > > > > > > Ira Weiny wrote: > > > > > > > > > > > > > I don't really understand the innerworkings of IPoIB so forgive me if this is a > > > > > > > really stupid question but: > > > > > > > > > > > > > > Is it a bug that there is a Multicast group created for every node in our > > > > > > > clusters? > > > > > > > > > > > > > > If not a bug why is this done? We just tried to boot on a 1151 node cluster > > > > > > > and opensm is complaining there are not enough multicast groups. > > > > > > > > > > > > > > Jan 11 18:30:42 728984 [40C05960] -> __get_new_mlid: ERR 1B23: All available:1024 mlids are taken > > > > > > > Jan 11 18:30:42 729050 [40C05960] -> osm_mcmr_rcv_create_new_mgrp: ERR 1B19: __get_new_mlid failed > > > > > > > Jan 11 18:30:42 730647 [40401960] -> __get_new_mlid: ERR 1B23: All available:1024 mlids are taken > > > > > > > Jan 11 18:30:42 730691 [40401960] -> osm_mcmr_rcv_create_new_mgrp: ERR 1B19: __get_new_mlid failed > > > > > > > > > > > > > > > > > > > > > Here is the output from my small test cluster: (ibnodesinmcast uses saquery a > > > > > > > couple of times to print this nice report.) 
> > > > > > > > > > > > > > > > > > > > > 19:17:24 > whatsup > > > > > > > up: 9: wopr[0-7],wopri > > > > > > > down: 0: > > > > > > > root at wopri:/tftpboot/images > > > > > > > 19:25:03 > ibnodesinmcast -g > > > > > > > 0xC000 (0xff12401bffff0000 : 0x00000000ffffffff) > > > > > > > In 9: wopr[0-7],wopri > > > > > > > Out 0: 0 > > > > > > > 0xC001 (0xff12401bffff0000 : 0x0000000000000001) > > > > > > > In 9: wopr[0-7],wopri > > > > > > > Out 0: 0 > > > > > > > 0xC002 (0xff12601bffff0000 : 0x00000001ff2265ed) > > > > > > > In 1: wopr3 > > > > > > > Out 8: wopr[0-2,4-7],wopri > > > > > > > 0xC003 (0xff12601bffff0000 : 0x0000000000000001) > > > > > > > In 9: wopr[0-7],wopri > > > > > > > Out 0: 0 > > > > > > > 0xC004 (0xff12601bffff0000 : 0x00000001ff222729) > > > > > > > In 1: wopr4 > > > > > > > Out 8: wopr[0-3,5-7],wopri > > > > > > > 0xC005 (0xff12601bffff0000 : 0x00000001ff219e65) > > > > > > > In 1: wopri > > > > > > > Out 8: wopr[0-7] > > > > > > > 0xC006 (0xff12601bffff0000 : 0x00000001ff00232d) > > > > > > > In 1: wopr6 > > > > > > > Out 8: wopr[0-5,7],wopri > > > > > > > 0xC007 (0xff12601bffff0000 : 0x00000001ff002325) > > > > > > > In 1: wopr7 > > > > > > > Out 8: wopr[0-6],wopri > > > > > > > 0xC008 (0xff12601bffff0000 : 0x00000001ff228d35) > > > > > > > In 1: wopr1 > > > > > > > Out 8: wopr[0,2-7],wopri > > > > > > > 0xC009 (0xff12601bffff0000 : 0x00000001ff2227f1) > > > > > > > In 1: wopr2 > > > > > > > Out 8: wopr[0-1,3-7],wopri > > > > > > > 0xC00A (0xff12601bffff0000 : 0x00000001ff219ef1) > > > > > > > In 1: wopr0 > > > > > > > Out 8: wopr[1-7],wopri > > > > > > > 0xC00B (0xff12601bffff0000 : 0x00000001ff0021e9) > > > > > > > In 1: wopr5 > > > > > > > Out 8: wopr[0-4,6-7],wopri > > > > > > > > > > > > > > > > > > > > > Each of these MGIDS of the prefix (0xff12601bffff0000) have just one node in > > > > > > > them and represent an ipv6 address. Could you turn off ipv6 with the latest > > > > > > > IPoIB? 
> > > > > > > > > > > > > > In a bind, > > > > > > > Ira > > > > > > > _______________________________________________ > > > > > > > general mailing list > > > > > > > general at lists.openfabrics.org > > > > > > > http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general > > > > > > > > > > > > > > To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general > > > > > > > > > > _______________________________________________ > > > > > general mailing list > > > > > general at lists.openfabrics.org > > > > > http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general > > > > > > > > > > To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general > > > > > > > > > > > > > _______________________________________________ > > > general mailing list > > > general at lists.openfabrics.org > > > http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general > > > > > > To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general From hrosenstock at xsigo.com Mon Jan 14 16:09:51 2008 From: hrosenstock at xsigo.com (Hal Rosenstock) Date: Mon, 14 Jan 2008 16:09:51 -0800 Subject: [ofa-general] Re: [PATCH 0/3] clean up "__get_mgrp_by_mgid" and add option --consolodate_ipv6_snm_req In-Reply-To: <20080115001118.GE16009@sashak.voltaire.com> References: <20080114114525.1555fa6c.weiny2@llnl.gov> <20080115001118.GE16009@sashak.voltaire.com> Message-ID: <1200355791.8962.153.camel@hrosenstock-ws.xsigo.com> On Tue, 2008-01-15 at 00:11 +0000, Sasha Khapyorsky wrote: > On 11:45 Mon 14 Jan , Ira Weiny wrote: > > The following 3 patches are a much cleaner implementation of what I sent to the > > list on Friday. > > > > The first 2 patches are just code clean up and I feel should be applied. The > > 3rd adds the option --consolodate_ipv6_snm_req which causes all IPv6 Solicited > > Node Multicast requests to be grouped into one MCast group per partition. > > All three patches are applied. Thanks. This seems a little fast for me. 
I didn't think we were even done with the discussion yet.

-- Hal

> Sasha

From sashak at voltaire.com Mon Jan 14 16:42:37 2008
From: sashak at voltaire.com (Sasha Khapyorsky)
Date: Tue, 15 Jan 2008 00:42:37 +0000
Subject: [ofa-general] Re: [PATCH 0/3] clean up "__get_mgrp_by_mgid" and add option --consolodate_ipv6_snm_req
In-Reply-To: <1200355791.8962.153.camel@hrosenstock-ws.xsigo.com>
References: <20080114114525.1555fa6c.weiny2@llnl.gov> <20080115001118.GE16009@sashak.voltaire.com> <1200355791.8962.153.camel@hrosenstock-ws.xsigo.com>
Message-ID: <20080115004237.GF16009@sashak.voltaire.com>

On 16:09 Mon 14 Jan , Hal Rosenstock wrote:
> On Tue, 2008-01-15 at 00:11 +0000, Sasha Khapyorsky wrote:
> > On 11:45 Mon 14 Jan , Ira Weiny wrote:
> > > The following 3 patches are a much cleaner implementation of what I sent to the
> > > list on Friday.
> > >
> > > The first 2 patches are just code clean up and I feel should be applied. The
> > > 3rd adds the option --consolodate_ipv6_snm_req which causes all IPv6 Solicited
> > > Node Multicast requests to be grouped into one MCast group per partition.
> >
> > All three patches are applied. Thanks.
>
> This seems a little fast for me.

It looks like a nice start to me.

> I didn't think we were even done with
> the discussion yet.

Sure, and we can continue from this point. (Ira stated that this patch
series is not a final solution, but it already provides sophisticated
functionality right now. The feature itself is optional and I don't see
a big risk here.)
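
For readers following the thread, the grouping rule under discussion (all IPv6
Solicited Node Multicast joins share one group per partition) can be sketched
roughly as below. This is only an illustration, not the code from the applied
patches. The struct and the names is_ipv6_snm(), same_group() and
count_groups() are made up for the example. It assumes MGIDs are held in host
byte order, that the P_Key sits in the third 16-bit word of the prefix (as
0xffff does in the saquery output earlier in the thread), and it uses a
slightly stricter interface ID test than the INT_ID_MASK check in the first
patch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical host-byte-order view of a 128-bit MGID. */
struct mgid {
	uint64_t prefix;
	uint64_t interface_id;
};

/* 0xff12601b<pkey>0000 : 0x00000001ffXXXXXX, as seen in the thread. */
#define SNM_PREFIX_NO_PKEY 0xff12601b00000000ULL
#define PKEY_MASK          0x00000000ffff0000ULL /* assumed P_Key position */
#define SNM_ID_FIXED_MASK  0xffffffffff000000ULL /* everything but the low 24 bits */
#define SNM_ID_FIXED       0x00000001ff000000ULL

/* True if the MGID looks like an IPv6 solicited-node multicast group. */
static bool is_ipv6_snm(const struct mgid *g)
{
	return (g->prefix & ~PKEY_MASK) == SNM_PREFIX_NO_PKEY &&
	       (g->interface_id & SNM_ID_FIXED_MASK) == SNM_ID_FIXED;
}

/*
 * Exact MGID match is the normal rule. With consolidation enabled, two
 * solicited-node MGIDs with the same prefix (and so the same P_Key) are
 * also treated as one group.
 */
static bool same_group(const struct mgid *a, const struct mgid *b,
		       bool consolidate)
{
	if (a->prefix == b->prefix && a->interface_id == b->interface_id)
		return true;
	return consolidate && a->prefix == b->prefix &&
	       is_ipv6_snm(a) && is_ipv6_snm(b);
}

/* Number of distinct groups (one MLID each) a list of joins would need. */
static size_t count_groups(const struct mgid *v, size_t n, bool consolidate)
{
	size_t groups = 0;

	for (size_t i = 0; i < n; i++) {
		bool seen = false;

		for (size_t j = 0; j < i && !seen; j++)
			seen = same_group(&v[i], &v[j], consolidate);
		if (!seen)
			groups++;
	}
	return groups;
}
```

Feeding count_groups() the twelve MGIDs from the 9-node ibnodesinmcast report
gives twelve groups without consolidation and four with it, which is the same
count as the saquery -g dump taken after the first patch.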
Sasha From sashak at voltaire.com Mon Jan 14 16:50:45 2008 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Tue, 15 Jan 2008 00:50:45 +0000 Subject: [PATCH] opensm: Special Case the IPv6 Solicited Node Multicast address to use a single Mcast (WAS: Re: [ofa-general] IPoIB, OFED 1.2.5, and multicast groups.) In-Reply-To: <1200355500.8962.148.camel@hrosenstock-ws.xsigo.com> References: <20080111193657.58477fb0.weiny2@llnl.gov> <20080111220456.0d62de97.weiny2@llnl.gov> <20080112000117.6b52b53c.weiny2@llnl.gov> <20080114105132.19eafcee.weiny2@llnl.gov> <1200342214.8962.75.camel@hrosenstock-ws.xsigo.com> <20080114153546.73d43d6b.weiny2@llnl.gov> <1200355500.8962.148.camel@hrosenstock-ws.xsigo.com> Message-ID: <20080115005045.GG16009@sashak.voltaire.com> On 16:05 Mon 14 Jan , Hal Rosenstock wrote: > On Mon, 2008-01-14 at 15:35 -0800, Ira Weiny wrote: > > On Mon, 14 Jan 2008 12:23:34 -0800 > > Hal Rosenstock wrote: > > > > > On Mon, 2008-01-14 at 10:51 -0800, Ira Weiny wrote: > > > > Hey Hal, thanks for the response. Comments below. > > > > > > > > On Mon, 14 Jan 2008 12:57:45 -0500 > > > > "Hal Rosenstock" wrote: > > > > > > > > > Hi Ira, > > > > > > > > > > On 1/12/08, Ira Weiny wrote: > > > > > > And to further answer my question...[*] > > > > > > > > > > > > This seems to fix the problem for us, however I know that it could be better. > > > > > > For example it only takes care of partition 0xFFFF, and I think Jason's idea of > > > > > > having say 16 Mcast Groups and some hash of these into them would be nice. But > > > > > > is this on the right track? Am I missing some other place in the code? > > > > > > > > > > This is a start. > > > > > > > > > > Some initial comments on a quick scan of the approach used: > > > > > > > > > > This assumes a homogeneous subnet (in terms of rates and MTUs). I > > > > > think that only groups which share the same rate and MTU can share the > > > > > same MLID. > > > > > > > > Ah indeed this might be an issue. 
This might not be the best place for the
> > > > code. :-(
> > > >
> > > > >
> > > > > Also, MLIDs will now need to be use counted and only removed when all
> > > > > the groups sharing that MLID are removed.
> > > >
> > > > I don't quite understand what you mean here. There is still a 1:1 mapping of
> > > > MLID's to MGID's.
> > >
> > > Didn't you just change that in that many MGIDs go to one MLID ?
> >
> > Ah, this is where the confusion has been. No, this is _not_ what I did... I
> > see now; that is what was proposed in the thread a year ago, however, I don't
> > think mapping many MGIDs to 1 MLID will work well.
>
> Why not ?
>
> It appears to be what you did (multiple MGIDs are mapped onto MLID (in
> the case below 0xc002)). Am I mistaken ?

As far as I understand this patch, it is different. Here multiple ports
which match the IPv6 solicited node multicast address will try to join a
single MC group (with a single MGID and a unique MLID).

Sasha

>
> > What I did was to allow the first IPv6 request to create the group and then all
> > other requests were added to this group.
>
> You are using the word group loosely here and that is the source of the
> confusion IMO. I think by group you mean MLID.
>
> > This sends all the neighbor discovery messages to all nodes on the network.
>
> All nodes part of that MLID tree.
>
> > This might seem inefficient but should work. (... and seems to.)
>
> Sure; the hosts will filter based on MGID. The tradeoff is MLID
> utilization versus fabric utilization.
>
> > > > All of the requests for this type of MGRP join are routed to
> > > > one group. Therefore, I thought the same rules for deleting the group would
> > > > apply; when all the members are gone it is removed?
> > >
> > > Yes, the group may go but not the underlying MLID as there are other
> > > groups which are sharing this. That's not what happens now.
> >
> > No, since there is only 1 group in this implementation it should work like
> > others.
The first node of this "mgid type" will create the group. Others will > > join it and will continue to use it even if the creator leaves. > > Are you saying all these groups appear as 1 "group" to OpenSM (as the > real groups are masked to the same value) ? > > -- Hal > > > Does this make more sense? > > > > Ira > > > > > > > > > Just to be clear, after > > > > this patch the mgroups are: > > > > > > > > 09:36:40 > saquery -g > > > > MCMemberRecord group dump: > > > > MGID....................0xff12401bffff0000 : 0x00000000ffffffff > > > > Mlid....................0xC000 > > > > Mtu.....................0x84 > > > > pkey....................0xFFFF > > > > Rate....................0x83 > > > > MCMemberRecord group dump: > > > > MGID....................0xff12401bffff0000 : 0x0000000000000001 > > > > Mlid....................0xC001 > > > > Mtu.....................0x84 > > > > pkey....................0xFFFF > > > > Rate....................0x83 > > > > MCMemberRecord group dump: > > > > MGID....................0xff12601bffff0000 : 0x00000001ff0021e9 > > > > Mlid....................0xC002 > > > > Mtu.....................0x84 > > > > pkey....................0xFFFF > > > > Rate....................0x83 > > > > MCMemberRecord group dump: > > > > MGID....................0xff12601bffff0000 : 0x0000000000000001 > > > > Mlid....................0xC003 > > > > Mtu.....................0x84 > > > > pkey....................0xFFFF > > > > Rate....................0x83 > > > > > > > > All of these requests are added to the > > > > MGID....................0xff12601bffff0000 : 0x00000001ff0021e9 > > > > Mlid....................0xC002 > > > > group. But as you say, how do we determine that the pkey, mtu, and rate are > > > > valid? :-/ > > > > > > > > But here is a question: > > > > > > > > What happens if someone with an incorrect MTU tries to join the > > > > MGID....................0xff12401bffff0000 : 0x0000000000000001 > > > > group? 
Wouldn't this code return this mgrp pointer and the subsequent MTU and > > > > rate checks fail? I seem to recall a thread discussing this before. I don't > > > > remember what the outcome was. I seem to remember the question was if OpenSM > > > > should create/modify a group to the "lowest common" MTU/Rate, and succeed all > > > > the joins, vs enforcing the faster MTU/Rate and failing the joins. > > > > > > Yes, the join would fail, but I don't think that's what we would want. > > > The alternative with the patch is to make it the lowest rate but there > > > is a minimum MTU which might not be right. > > > > > > > > I think this is a policy and rather than this always being the case, > > > > > there should be a policy parameter added to OpenSM for this. IMO > > > > > default should be to not do this. > > > > > > > > Yes, for sure there needs to be some options to control the behavior. > > > > > > > > > > > > > > Maybe more later... > > > > > > > > Thanks again, > > > > Ira > > > > > > > > > > > > > > -- Hal > > > > > > > > > > > Thanks, > > > > > > Ira > > > > > > > > > > > > [*] Again I apologize for the spam but we were in a bit of a panic as we only > > > > > > have the big system for the weekend and IB was not part of the test... ;-) > > > > > > > > > > > > >From 35e35a9534bd49147886ac93ab1601acadcdbe26 Mon Sep 17 00:00:00 2001 > > > > > > From: Ira K. Weiny > > > > > > Date: Fri, 11 Jan 2008 22:58:19 -0800 > > > > > > Subject: [PATCH] Special Case the IPv6 Solicited Node Multicast address to use a single Mcast > > > > > > Group. 
> > > > > > > > > > > > Signed-off-by: root > > > > > > --- > > > > > > opensm/opensm/osm_sa_mcmember_record.c | 30 +++++++++++++++++++++++++++++- > > > > > > opensm/opensm/osm_sa_path_record.c | 31 ++++++++++++++++++++++++++++++- > > > > > > 2 files changed, 59 insertions(+), 2 deletions(-) > > > > > > > > > > > > diff --git a/opensm/opensm/osm_sa_mcmember_record.c b/opensm/opensm/osm_sa_mcmember_record.c > > > > > > index 8eb97ad..6bcc124 100644 > > > > > > --- a/opensm/opensm/osm_sa_mcmember_record.c > > > > > > +++ b/opensm/opensm/osm_sa_mcmember_record.c > > > > > > @@ -124,9 +124,37 @@ __search_mgrp_by_mgid(IN cl_map_item_t * const p_map_item, IN void *context) > > > > > > /* compare entire MGID so different scope will not sneak in for > > > > > > the same MGID */ > > > > > > if (memcmp(&p_mgrp->mcmember_rec.mgid, > > > > > > - &p_recvd_mcmember_rec->mgid, sizeof(ib_gid_t))) > > > > > > + &p_recvd_mcmember_rec->mgid, sizeof(ib_gid_t))) { > > > > > > + > > > > > > + /* Special Case IPV6 Multicast Loopback addresses */ > > > > > > + /* 0xff12601bffff0000 : 0x00000001ffXXXXXX */ > > > > > > +#define SPEC_PREFIX (0xff12601bffff0000) > > > > > > +#define INT_ID_MASK (0x00000001ff000000) > > > > > > + uint64_t g_prefix = cl_ntoh64(p_mgrp->mcmember_rec.mgid.unicast.prefix); > > > > > > + uint64_t g_interface_id = cl_ntoh64(p_mgrp->mcmember_rec.mgid.unicast.interface_id); > > > > > > + uint64_t rcv_prefix = cl_ntoh64(p_recvd_mcmember_rec->mgid.unicast.prefix); > > > > > > + uint64_t rcv_interface_id = cl_ntoh64(p_recvd_mcmember_rec->mgid.unicast.interface_id); > > > > > > + > > > > > > + if (rcv_prefix == SPEC_PREFIX > > > > > > + && > > > > > > + (rcv_interface_id & INT_ID_MASK) == INT_ID_MASK) { > > > > > > + > > > > > > + if ((g_prefix == rcv_prefix) > > > > > > + && > > > > > > + (g_interface_id & INT_ID_MASK) == > > > > > > + (rcv_interface_id & INT_ID_MASK) > > > > > > + ) { > > > > > > + osm_log(sa->p_log, OSM_LOG_INFO, > > > > > > + "Special Case Mcast Join 
for MGID " > > > > > > + " MGID 0x%016"PRIx64" : 0x%016"PRIx64"\n", > > > > > > + rcv_prefix, rcv_interface_id); > > > > > > + goto match; > > > > > > + } > > > > > > + } > > > > > > return; > > > > > > + } > > > > > > > > > > > > +match: > > > > > > if (p_ctxt->p_mgrp) { > > > > > > osm_log(sa->p_log, OSM_LOG_ERROR, > > > > > > "__search_mgrp_by_mgid: ERR 1B03: " > > > > > > diff --git a/opensm/opensm/osm_sa_path_record.c b/opensm/opensm/osm_sa_path_record.c > > > > > > index 749a936..469773a 100644 > > > > > > --- a/opensm/opensm/osm_sa_path_record.c > > > > > > +++ b/opensm/opensm/osm_sa_path_record.c > > > > > > @@ -1536,8 +1536,37 @@ __search_mgrp_by_mgid(IN cl_map_item_t * const p_map_item, IN void *context) > > > > > > > > > > > > /* compare entire MGID so different scope will not sneak in for > > > > > > the same MGID */ > > > > > > - if (memcmp(&p_mgrp->mcmember_rec.mgid, p_recvd_mgid, sizeof(ib_gid_t))) > > > > > > + if (memcmp(&p_mgrp->mcmember_rec.mgid, p_recvd_mgid, sizeof(ib_gid_t))) { > > > > > > + > > > > > > + /* Special Case IPV6 Multicast Loopback addresses */ > > > > > > + /* 0xff12601bffff0000 : 0x00000001ffXXXXXX */ > > > > > > +#define SPEC_PREFIX (0xff12601bffff0000) > > > > > > +#define INT_ID_MASK (0x00000001ff000000) > > > > > > + uint64_t g_prefix = cl_ntoh64(p_mgrp->mcmember_rec.mgid.unicast.prefix); > > > > > > + uint64_t g_interface_id = cl_ntoh64(p_mgrp->mcmember_rec.mgid.unicast.interface_id); > > > > > > + uint64_t rcv_prefix = cl_ntoh64(p_recvd_mgid->unicast.prefix); > > > > > > + uint64_t rcv_interface_id = cl_ntoh64(p_recvd_mgid->unicast.interface_id); > > > > > > + > > > > > > + if (rcv_prefix == SPEC_PREFIX > > > > > > + && > > > > > > + (rcv_interface_id & INT_ID_MASK) == INT_ID_MASK) { > > > > > > + > > > > > > + if ((g_prefix == rcv_prefix) > > > > > > + && > > > > > > + (g_interface_id & INT_ID_MASK) == > > > > > > + (rcv_interface_id & INT_ID_MASK) > > > > > > + ) { > > > > > > + osm_log(sa->p_log, OSM_LOG_INFO, > > > > > 
> + "Special Case Mcast Join for MGID " > > > > > > + " MGID 0x%016"PRIx64" : 0x%016"PRIx64"\n", > > > > > > + rcv_prefix, rcv_interface_id); > > > > > > + goto match; > > > > > > + } > > > > > > + } > > > > > > return; > > > > > > + } > > > > > > + > > > > > > +match: > > > > > > > > > > > > #if 0 > > > > > > for (i = 0; > > > > > > -- > > > > > > 1.5.1 > > > > > > > > > > > > > > > > > > > > > > > > On Fri, 11 Jan 2008 22:04:56 -0800 > > > > > > Ira Weiny wrote: > > > > > > > > > > > > > Ok, > > > > > > > > > > > > > > I found my own answer. Sorry for the spam. > > > > > > > > > > > > > > http://lists.openfabrics.org/pipermail/general/2006-November/029617.html > > > > > > > > > > > > > > Sorry, > > > > > > > Ira > > > > > > > > > > > > > > > > > > > > > On Fri, 11 Jan 2008 19:36:57 -0800 > > > > > > > Ira Weiny wrote: > > > > > > > > > > > > > > > I don't really understand the innerworkings of IPoIB so forgive me if this is a > > > > > > > > really stupid question but: > > > > > > > > > > > > > > > > Is it a bug that there is a Multicast group created for every node in our > > > > > > > > clusters? > > > > > > > > > > > > > > > > If not a bug why is this done? We just tried to boot on a 1151 node cluster > > > > > > > > and opensm is complaining there are not enough multicast groups. > > > > > > > > > > > > > > > > Jan 11 18:30:42 728984 [40C05960] -> __get_new_mlid: ERR 1B23: All available:1024 mlids are taken > > > > > > > > Jan 11 18:30:42 729050 [40C05960] -> osm_mcmr_rcv_create_new_mgrp: ERR 1B19: __get_new_mlid failed > > > > > > > > Jan 11 18:30:42 730647 [40401960] -> __get_new_mlid: ERR 1B23: All available:1024 mlids are taken > > > > > > > > Jan 11 18:30:42 730691 [40401960] -> osm_mcmr_rcv_create_new_mgrp: ERR 1B19: __get_new_mlid failed > > > > > > > > > > > > > > > > > > > > > > > > Here is the output from my small test cluster: (ibnodesinmcast uses saquery a > > > > > > > > couple of times to print this nice report.) 
> > > > > > > > > > > > > > > > > > > > > > > > 19:17:24 > whatsup > > > > > > > > up: 9: wopr[0-7],wopri > > > > > > > > down: 0: > > > > > > > > root at wopri:/tftpboot/images > > > > > > > > 19:25:03 > ibnodesinmcast -g > > > > > > > > 0xC000 (0xff12401bffff0000 : 0x00000000ffffffff) > > > > > > > > In 9: wopr[0-7],wopri > > > > > > > > Out 0: 0 > > > > > > > > 0xC001 (0xff12401bffff0000 : 0x0000000000000001) > > > > > > > > In 9: wopr[0-7],wopri > > > > > > > > Out 0: 0 > > > > > > > > 0xC002 (0xff12601bffff0000 : 0x00000001ff2265ed) > > > > > > > > In 1: wopr3 > > > > > > > > Out 8: wopr[0-2,4-7],wopri > > > > > > > > 0xC003 (0xff12601bffff0000 : 0x0000000000000001) > > > > > > > > In 9: wopr[0-7],wopri > > > > > > > > Out 0: 0 > > > > > > > > 0xC004 (0xff12601bffff0000 : 0x00000001ff222729) > > > > > > > > In 1: wopr4 > > > > > > > > Out 8: wopr[0-3,5-7],wopri > > > > > > > > 0xC005 (0xff12601bffff0000 : 0x00000001ff219e65) > > > > > > > > In 1: wopri > > > > > > > > Out 8: wopr[0-7] > > > > > > > > 0xC006 (0xff12601bffff0000 : 0x00000001ff00232d) > > > > > > > > In 1: wopr6 > > > > > > > > Out 8: wopr[0-5,7],wopri > > > > > > > > 0xC007 (0xff12601bffff0000 : 0x00000001ff002325) > > > > > > > > In 1: wopr7 > > > > > > > > Out 8: wopr[0-6],wopri > > > > > > > > 0xC008 (0xff12601bffff0000 : 0x00000001ff228d35) > > > > > > > > In 1: wopr1 > > > > > > > > Out 8: wopr[0,2-7],wopri > > > > > > > > 0xC009 (0xff12601bffff0000 : 0x00000001ff2227f1) > > > > > > > > In 1: wopr2 > > > > > > > > Out 8: wopr[0-1,3-7],wopri > > > > > > > > 0xC00A (0xff12601bffff0000 : 0x00000001ff219ef1) > > > > > > > > In 1: wopr0 > > > > > > > > Out 8: wopr[1-7],wopri > > > > > > > > 0xC00B (0xff12601bffff0000 : 0x00000001ff0021e9) > > > > > > > > In 1: wopr5 > > > > > > > > Out 8: wopr[0-4,6-7],wopri > > > > > > > > > > > > > > > > > > > > > > > > Each of these MGIDS of the prefix (0xff12601bffff0000) have just one node in > > > > > > > > them and represent an ipv6 address. 
Could you turn off ipv6 with the latest > > > > > > > > IPoIB? > > > > > > > > > > > > > > > > In a bind, > > > > > > > > Ira > > > > > > > > _______________________________________________ > > > > > > > > general mailing list > > > > > > > > general at lists.openfabrics.org > > > > > > > > http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general > > > > > > > > > > > > > > > > To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general From hrosenstock at xsigo.com Mon Jan 14 17:08:16 2008 From: hrosenstock at xsigo.com (Hal Rosenstock) Date: Mon, 14 Jan 2008 17:08:16 -0800 Subject: [ofa-general] Re: [PATCH 0/3] clean up "__get_mgrp_by_mgid" and add option --consolodate_ipv6_snm_req In-Reply-To: <20080115004237.GF16009@sashak.voltaire.com> References: <20080114114525.1555fa6c.weiny2@llnl.gov> <20080115001118.GE16009@sashak.voltaire.com> <1200355791.8962.153.camel@hrosenstock-ws.xsigo.com> <20080115004237.GF16009@sashak.voltaire.com> Message-ID: <1200359296.8962.186.camel@hrosenstock-ws.xsigo.com> On Tue, 2008-01-15 at 00:42 +0000, Sasha Khapyorsky wrote: > On
16:09 Mon 14 Jan , Hal Rosenstock wrote: > > On Tue, 2008-01-15 at 00:11 +0000, Sasha Khapyorsky wrote: > > > On 11:45 Mon 14 Jan , Ira Weiny wrote: > > > > The following 3 patches are a much cleaner implementation of what I sent to the > > > > list on Friday. > > > > > > > > The first 2 patches are just code clean up and I feel should be applied. The > > > > 3rd adds the option --consolodate_ipv6_snm_req which causes all IPv6 Solicited > > > > Node Multicast requests to be grouped into one MCast group per partition. > > > > > > All three patches are applied. Thanks. > > > > This seems a little fast for me. > > It looks like a nice start for me. > > > I didn't think we were even done with > > the discussion yet. > > Sure, and we can continue from this point. > > (Ira stated that this patch series are not a final solution, but already > provides sophisticated functionality right now. The feature itself is > optional and I don't see a big risk here.) It may work in Ira's configuration but I'm not even sure it follows the IBA spec. It may also cause issues if misused (e.g. MTU, rate, partitions, etc.). 
-- Hal

>
> Sasha

From kliteyn at mellanox.co.il Mon Jan 14 17:10:14 2008
From: kliteyn at mellanox.co.il (kliteyn at mellanox.co.il)
Date: 15 Jan 2008 03:10:14 +0200
Subject: [ofa-general] nightly osm_sim report 2008-01-15:normal completion
Message-ID:

OSM Simulation Regression Summary

[Generated mail - please do NOT reply]

OpenSM binary date = 2008-01-14
OpenSM git rev = Thu_Jan_10_03:48:16_2008 [7bb2045bd9f659f8466a4494f4ec983f0edbf96a]
ibutils git rev = Mon_Dec_24_10:42:01_2007 [675bec82306d6920555dd0b5e2f664983e27e60f]

Total=400 Pass=400 Fail=0

Pass:
30 Stability IS1-16.topo
30 Pkey IS1-16.topo
30 OsmTest IS1-16.topo
30 OsmStress IS1-16.topo
30 Multicast IS1-16.topo
30 LidMgr IS1-16.topo
10 Stability IS3-loop.topo
10 Stability IS3-128.topo
10 Pkey IS3-128.topo
10 OsmTest IS3-loop.topo
10 OsmTest IS3-128.topo
10 OsmStress IS3-128.topo
10 Multicast IS3-loop.topo
10 Multicast IS3-128.topo
10 LidMgr IS3-128.topo
10 FatTree merge-roots-4-ary-2-tree.topo
10 FatTree merge-root-4-ary-3-tree.topo
10 FatTree gnu-stallion-64.topo
10 FatTree blend-4-ary-2-tree.topo
10 FatTree RhinoDDR.topo
10 FatTree FullGnu.topo
10 FatTree 4-ary-2-tree.topo
10 FatTree 2-ary-4-tree.topo
10 FatTree 12-node-spaced.topo
10 FTreeFail 4-ary-2-tree-missing-sw-link.topo
10 FTreeFail 4-ary-2-tree-links-at-same-rank-2.topo
10 FTreeFail 4-ary-2-tree-links-at-same-rank-1.topo
10 FTreeFail 4-ary-2-tree-diff-num-pgroups.topo

Failures:

From john.benninghoff at intel.com Mon Jan 14 17:11:28 2008
From: john.benninghoff at intel.com (Benninghoff, John)
Date: Mon, 14 Jan 2008 17:11:28 -0800
Subject: [ofa-general] ofa_kernel build error on RH 4.6 (2.6.9-67.ELsmp)
In-Reply-To: <478B4F10.70707@mellanox.co.il>
References:
<2E020D3DD4A80647AE77E1692F6E97D9C8C65E@FMSMSX420> <478B4F10.70707@mellanox.co.il>
Message-ID: <2E020D3DD4A80647AE77E1692F6E97D9CCCEE3@FMSMSX420>

That built. Thanks.

Found bug #849, which could use this solution and could probably be closed.

/jb

-----Original Message-----
From: Tziporet Koren [mailto:tziporet at dev.mellanox.co.il]
Sent: Monday, January 14, 2008 4:01 AM
To: Benninghoff, John
Cc: general at lists.openfabrics.org
Subject: Re: [ofa-general] ofa_kernel build error on RH 4.6 (2.6.9-67.ELsmp)

Benninghoff, John wrote:
> I can build ofa_user RPMs but building the ofa_kernel RPMs fails with
> the error below. Is this a known issue with RH 4.6?
>
>

OFED 1.2.5.4 does not support RHEL4 up6.
Only 1.2.5.5, which will be released soon, will support it.
Meanwhile you can use the RC2 at:
http://www.openfabrics.org/builds/connectx/

Tziporet

From sashak at voltaire.com Mon Jan 14 17:31:38 2008
From: sashak at voltaire.com (Sasha Khapyorsky)
Date: Tue, 15 Jan 2008 01:31:38 +0000
Subject: [ofa-general] Re: [PATCH 0/3] clean up "__get_mgrp_by_mgid" and add option --consolodate_ipv6_snm_req
In-Reply-To: <1200359296.8962.186.camel@hrosenstock-ws.xsigo.com>
References: <20080114114525.1555fa6c.weiny2@llnl.gov> <20080115001118.GE16009@sashak.voltaire.com> <1200355791.8962.153.camel@hrosenstock-ws.xsigo.com> <20080115004237.GF16009@sashak.voltaire.com> <1200359296.8962.186.camel@hrosenstock-ws.xsigo.com>
Message-ID: <20080115013138.GJ16009@sashak.voltaire.com>

On 17:08 Mon 14 Jan , Hal Rosenstock wrote:
> On Tue, 2008-01-15 at 00:42 +0000, Sasha Khapyorsky wrote:
> > On 16:09 Mon 14 Jan , Hal Rosenstock wrote:
> > > On Tue, 2008-01-15 at 00:11 +0000, Sasha Khapyorsky wrote:
> > > > On 11:45 Mon 14 Jan , Ira Weiny wrote:
> > > > > The following 3 patches are a much cleaner implementation of what I sent to the
> > > > > list on Friday.
> > > > >
> > > > > The first 2 patches are just code clean up and I feel should be applied.
The > > > > > 3rd adds the option --consolodate_ipv6_snm_req which causes all IPv6 Solicited > > > > > Node Multicast requests to be grouped into one MCast group per partition. > > > > > > > > All three patches are applied. Thanks. > > > > > > This seems a little fast for me. > > > > It looks like a nice start for me. > > > > > I didn't think we were even done with > > > the discussion yet. > > > > Sure, and we can continue from this point. > > > > (Ira stated that this patch series are not a final solution, but already > > provides sophisticated functionality right now. The feature itself is > > optional and I don't see a big risk here.) > > It may work in Ira's configuration but I'm not even sure it follows the > IBA spec. Why it is not? > It may also cause issues if misused (e.g. MTU, rate, > partitions, etc.). Right, but it is optional. Default OpenSM behavior is not changed. Sasha From hrosenstock at xsigo.com Mon Jan 14 17:22:40 2008 From: hrosenstock at xsigo.com (Hal Rosenstock) Date: Mon, 14 Jan 2008 17:22:40 -0800 Subject: [PATCH] opensm: Special Case the IPv6 Solicited Node Multicast address to use a single Mcast (WAS: Re: [ofa-general] IPoIB, OFED 1.2.5, and multicast groups.) 
In-Reply-To: <20080115005045.GG16009@sashak.voltaire.com> References: <20080111193657.58477fb0.weiny2@llnl.gov> <20080111220456.0d62de97.weiny2@llnl.gov> <20080112000117.6b52b53c.weiny2@llnl.gov> <20080114105132.19eafcee.weiny2@llnl.gov> <1200342214.8962.75.camel@hrosenstock-ws.xsigo.com> <20080114153546.73d43d6b.weiny2@llnl.gov> <1200355500.8962.148.camel@hrosenstock-ws.xsigo.com> <20080115005045.GG16009@sashak.voltaire.com> Message-ID: <1200360160.8962.196.camel@hrosenstock-ws.xsigo.com> On Tue, 2008-01-15 at 00:50 +0000, Sasha Khapyorsky wrote: > On 16:05 Mon 14 Jan , Hal Rosenstock wrote: > > On Mon, 2008-01-14 at 15:35 -0800, Ira Weiny wrote: > > > On Mon, 14 Jan 2008 12:23:34 -0800 > > > Hal Rosenstock wrote: > > > > > > > On Mon, 2008-01-14 at 10:51 -0800, Ira Weiny wrote: > > > > > Hey Hal, thanks for the response. Comments below. > > > > > > > > > > On Mon, 14 Jan 2008 12:57:45 -0500 > > > > > "Hal Rosenstock" wrote: > > > > > > > > > > > Hi Ira, > > > > > > > > > > > > On 1/12/08, Ira Weiny wrote: > > > > > > > And to further answer my question...[*] > > > > > > > > > > > > > > This seems to fix the problem for us, however I know that it could be better. > > > > > > > For example it only takes care of partition 0xFFFF, and I think Jason's idea of > > > > > > > having say 16 Mcast Groups and some hash of these into them would be nice. But > > > > > > > is this on the right track? Am I missing some other place in the code? > > > > > > > > > > > > This is a start. > > > > > > > > > > > > Some initial comments on a quick scan of the approach used: > > > > > > > > > > > > This assumes a homogeneous subnet (in terms of rates and MTUs). I > > > > > > think that only groups which share the same rate and MTU can share the > > > > > > same MLID. > > > > > > > > > > Ah indeed this might be an issue. This might not be the best place for the > > > > > code. 
:-( > > > > > > > > > > > > > > > > > Also, MLIDs will now need to be use counted and only removed when all > > > > > > the groups sharing that MLID are removed. > > > > > > > > > > I don't quite understand what you mean here. There is still a 1:1 mapping of > > > > > MLID's to MGID's. > > > > > > > > Didn't you just change that in that many MGIDs go to one MLID ? > > > > > > Ah, this is where the confusion has been. No, this is _not_ what I did... I > > > see now; that is what was proposed in the thread a year ago, however, I don't > > > think mapping many MGIDs to 1 MLID will work well. > > > > Why not ? > > > > It appears to be what you did (multiple MGIDs are mapped onto MLID (in > > the case below 0xc002)). Am I mistaken ? > > As far as I understand this patch it is the different. Here multiple > ports which match ipv6 solicited node multicast address will try to > join a single MC group (with single MGID and unique MLID). I don't think you are using the IBA defined terminology. A MC group is an MGID in terms of the IBA spec. Also, the SA GetTable with MGIDs wildcarded shows all the MGIDs. (Does it show that "special" MGID ?) I would phrase this differently: All IPv6 SNM groups are mapped to a single MLID (when this feature is enabled). It so happens that OpenSM internally does the accounting on membership by treating them all as members of the same "base" or "masked" group by masking off partition and the low 24 bits (port GUID). -- Hal > Sasha > > > > > > What I did was to allow the first IPv6 request to create the group and then all > > > other requests were added to this group. > > > > You are using the word group loosely here and that is the source of the > > confusion IMO. I think by group you mean MLID. > > > > > This sends all the neighbor discovery messages to all nodes on the network. > > > > All nodes part of that MLID tree. > > > > > This might seem inefficient but should work. (... and seems to.) > > > > Sure; the hosts will filter based on MGID. 
The tradeoff is MLID > > utilization versus fabric utilization. > > > > > > > All of the requests for this type of MGRP join are routed to > > > > > one group. Therefore, I thought the same rules for deleting the group would > > > > > apply; when all the members are gone it is removed? > > > > > > > > Yes, the group may go but not the underlying MLID as there are other > > > > groups which are sharing this. That's not what happens now. > > > > > > No, since there is only 1 group in this implementation it should work like > > > others. The first node of this "mgid type" will create the group. Others will > > > join it and will continue to use it even if the creator leaves. > > > > Are you saying all these groups appear as 1 "group" to OpenSM (as the > > real groups are masked to the same value) ? > > > > -- Hal > > > > > Does this make more sense? > > > > > > Ira > > > > > > > > > > > > Just to be clear, after > > > > > this patch the mgroups are: > > > > > > > > > > 09:36:40 > saquery -g > > > > > MCMemberRecord group dump: > > > > > MGID....................0xff12401bffff0000 : 0x00000000ffffffff > > > > > Mlid....................0xC000 > > > > > Mtu.....................0x84 > > > > > pkey....................0xFFFF > > > > > Rate....................0x83 > > > > > MCMemberRecord group dump: > > > > > MGID....................0xff12401bffff0000 : 0x0000000000000001 > > > > > Mlid....................0xC001 > > > > > Mtu.....................0x84 > > > > > pkey....................0xFFFF > > > > > Rate....................0x83 > > > > > MCMemberRecord group dump: > > > > > MGID....................0xff12601bffff0000 : 0x00000001ff0021e9 > > > > > Mlid....................0xC002 > > > > > Mtu.....................0x84 > > > > > pkey....................0xFFFF > > > > > Rate....................0x83 > > > > > MCMemberRecord group dump: > > > > > MGID....................0xff12601bffff0000 : 0x0000000000000001 > > > > > Mlid....................0xC003 > > > > > 
Mtu.....................0x84 > > > > > pkey....................0xFFFF > > > > > Rate....................0x83 > > > > > > > > > > All of these requests are added to the > > > > > MGID....................0xff12601bffff0000 : 0x00000001ff0021e9 > > > > > Mlid....................0xC002 > > > > > group. But as you say, how do we determine that the pkey, mtu, and rate are > > > > > valid? :-/ > > > > > > > > > > But here is a question: > > > > > > > > > > What happens if someone with an incorrect MTU tries to join the > > > > > MGID....................0xff12401bffff0000 : 0x0000000000000001 > > > > > group? Wouldn't this code return this mgrp pointer and the subsequent MTU and > > > > > rate checks fail? I seem to recall a thread discussing this before. I don't > > > > > remember what the outcome was. I seem to remember the question was if OpenSM > > > > > should create/modify a group to the "lowest common" MTU/Rate, and succeed all > > > > > the joins, vs enforcing the faster MTU/Rate and failing the joins. > > > > > > > > Yes, the join would fail, but I don't think that's what we would want. > > > > The alternative with the patch is to make it the lowest rate but there > > > > is a minimum MTU which might not be right. > > > > > > > > > > I think this is a policy and rather than this always being the case, > > > > > > there should be a policy parameter added to OpenSM for this. IMO > > > > > > default should be to not do this. > > > > > > > > > > Yes, for sure there needs to be some options to control the behavior. > > > > > > > > > > > > > > > > > Maybe more later... > > > > > > > > > > Thanks again, > > > > > Ira > > > > > > > > > > > > > > > > > -- Hal > > > > > > > > > > > > > Thanks, > > > > > > > Ira > > > > > > > > > > > > > > [*] Again I apologize for the spam but we were in a bit of a panic as we only > > > > > > > have the big system for the weekend and IB was not part of the test... 
;-) > > > > > > > > > > > > > > >From 35e35a9534bd49147886ac93ab1601acadcdbe26 Mon Sep 17 00:00:00 2001 > > > > > > > From: Ira K. Weiny > > > > > > > Date: Fri, 11 Jan 2008 22:58:19 -0800 > > > > > > > Subject: [PATCH] Special Case the IPv6 Solicited Node Multicast address to use a single Mcast > > > > > > > Group. > > > > > > > > > > > > > > Signed-off-by: root > > > > > > > --- > > > > > > > opensm/opensm/osm_sa_mcmember_record.c | 30 +++++++++++++++++++++++++++++- > > > > > > > opensm/opensm/osm_sa_path_record.c | 31 ++++++++++++++++++++++++++++++- > > > > > > > 2 files changed, 59 insertions(+), 2 deletions(-) > > > > > > > > > > > > > > diff --git a/opensm/opensm/osm_sa_mcmember_record.c b/opensm/opensm/osm_sa_mcmember_record.c > > > > > > > index 8eb97ad..6bcc124 100644 > > > > > > > --- a/opensm/opensm/osm_sa_mcmember_record.c > > > > > > > +++ b/opensm/opensm/osm_sa_mcmember_record.c > > > > > > > @@ -124,9 +124,37 @@ __search_mgrp_by_mgid(IN cl_map_item_t * const p_map_item, IN void *context) > > > > > > > /* compare entire MGID so different scope will not sneak in for > > > > > > > the same MGID */ > > > > > > > if (memcmp(&p_mgrp->mcmember_rec.mgid, > > > > > > > - &p_recvd_mcmember_rec->mgid, sizeof(ib_gid_t))) > > > > > > > + &p_recvd_mcmember_rec->mgid, sizeof(ib_gid_t))) { > > > > > > > + > > > > > > > + /* Special Case IPV6 Multicast Loopback addresses */ > > > > > > > + /* 0xff12601bffff0000 : 0x00000001ffXXXXXX */ > > > > > > > +#define SPEC_PREFIX (0xff12601bffff0000) > > > > > > > +#define INT_ID_MASK (0x00000001ff000000) > > > > > > > + uint64_t g_prefix = cl_ntoh64(p_mgrp->mcmember_rec.mgid.unicast.prefix); > > > > > > > + uint64_t g_interface_id = cl_ntoh64(p_mgrp->mcmember_rec.mgid.unicast.interface_id); > > > > > > > + uint64_t rcv_prefix = cl_ntoh64(p_recvd_mcmember_rec->mgid.unicast.prefix); > > > > > > > + uint64_t rcv_interface_id = cl_ntoh64(p_recvd_mcmember_rec->mgid.unicast.interface_id); > > > > > > > + > > > > > > > + if 
(rcv_prefix == SPEC_PREFIX > > > > > > > + && > > > > > > > + (rcv_interface_id & INT_ID_MASK) == INT_ID_MASK) { > > > > > > > + > > > > > > > + if ((g_prefix == rcv_prefix) > > > > > > > + && > > > > > > > + (g_interface_id & INT_ID_MASK) == > > > > > > > + (rcv_interface_id & INT_ID_MASK) > > > > > > > + ) { > > > > > > > + osm_log(sa->p_log, OSM_LOG_INFO, > > > > > > > + "Special Case Mcast Join for MGID " > > > > > > > + " MGID 0x%016"PRIx64" : 0x%016"PRIx64"\n", > > > > > > > + rcv_prefix, rcv_interface_id); > > > > > > > + goto match; > > > > > > > + } > > > > > > > + } > > > > > > > return; > > > > > > > + } > > > > > > > > > > > > > > +match: > > > > > > > if (p_ctxt->p_mgrp) { > > > > > > > osm_log(sa->p_log, OSM_LOG_ERROR, > > > > > > > "__search_mgrp_by_mgid: ERR 1B03: " > > > > > > > diff --git a/opensm/opensm/osm_sa_path_record.c b/opensm/opensm/osm_sa_path_record.c > > > > > > > index 749a936..469773a 100644 > > > > > > > --- a/opensm/opensm/osm_sa_path_record.c > > > > > > > +++ b/opensm/opensm/osm_sa_path_record.c > > > > > > > @@ -1536,8 +1536,37 @@ __search_mgrp_by_mgid(IN cl_map_item_t * const p_map_item, IN void *context) > > > > > > > > > > > > > > /* compare entire MGID so different scope will not sneak in for > > > > > > > the same MGID */ > > > > > > > - if (memcmp(&p_mgrp->mcmember_rec.mgid, p_recvd_mgid, sizeof(ib_gid_t))) > > > > > > > + if (memcmp(&p_mgrp->mcmember_rec.mgid, p_recvd_mgid, sizeof(ib_gid_t))) { > > > > > > > + > > > > > > > + /* Special Case IPV6 Multicast Loopback addresses */ > > > > > > > + /* 0xff12601bffff0000 : 0x00000001ffXXXXXX */ > > > > > > > +#define SPEC_PREFIX (0xff12601bffff0000) > > > > > > > +#define INT_ID_MASK (0x00000001ff000000) > > > > > > > + uint64_t g_prefix = cl_ntoh64(p_mgrp->mcmember_rec.mgid.unicast.prefix); > > > > > > > + uint64_t g_interface_id = cl_ntoh64(p_mgrp->mcmember_rec.mgid.unicast.interface_id); > > > > > > > + uint64_t rcv_prefix = cl_ntoh64(p_recvd_mgid->unicast.prefix); > > > > > 
> > + uint64_t rcv_interface_id = cl_ntoh64(p_recvd_mgid->unicast.interface_id); > > > > > > > + > > > > > > > + if (rcv_prefix == SPEC_PREFIX > > > > > > > + && > > > > > > > + (rcv_interface_id & INT_ID_MASK) == INT_ID_MASK) { > > > > > > > + > > > > > > > + if ((g_prefix == rcv_prefix) > > > > > > > + && > > > > > > > + (g_interface_id & INT_ID_MASK) == > > > > > > > + (rcv_interface_id & INT_ID_MASK) > > > > > > > + ) { > > > > > > > + osm_log(sa->p_log, OSM_LOG_INFO, > > > > > > > + "Special Case Mcast Join for MGID " > > > > > > > + " MGID 0x%016"PRIx64" : 0x%016"PRIx64"\n", > > > > > > > + rcv_prefix, rcv_interface_id); > > > > > > > + goto match; > > > > > > > + } > > > > > > > + } > > > > > > > return; > > > > > > > + } > > > > > > > + > > > > > > > +match: > > > > > > > > > > > > > > #if 0 > > > > > > > for (i = 0; > > > > > > > -- > > > > > > > 1.5.1 > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Fri, 11 Jan 2008 22:04:56 -0800 > > > > > > > Ira Weiny wrote: > > > > > > > > > > > > > > > Ok, > > > > > > > > > > > > > > > > I found my own answer. Sorry for the spam. > > > > > > > > > > > > > > > > http://lists.openfabrics.org/pipermail/general/2006-November/029617.html > > > > > > > > > > > > > > > > Sorry, > > > > > > > > Ira > > > > > > > > > > > > > > > > > > > > > > > > On Fri, 11 Jan 2008 19:36:57 -0800 > > > > > > > > Ira Weiny wrote: > > > > > > > > > > > > > > > > > I don't really understand the innerworkings of IPoIB so forgive me if this is a > > > > > > > > > really stupid question but: > > > > > > > > > > > > > > > > > > Is it a bug that there is a Multicast group created for every node in our > > > > > > > > > clusters? > > > > > > > > > > > > > > > > > > If not a bug why is this done? We just tried to boot on a 1151 node cluster > > > > > > > > > and opensm is complaining there are not enough multicast groups. 
> > > > > > > > > > > > > > > > > > Jan 11 18:30:42 728984 [40C05960] -> __get_new_mlid: ERR 1B23: All available:1024 mlids are taken > > > > > > > > > Jan 11 18:30:42 729050 [40C05960] -> osm_mcmr_rcv_create_new_mgrp: ERR 1B19: __get_new_mlid failed > > > > > > > > > Jan 11 18:30:42 730647 [40401960] -> __get_new_mlid: ERR 1B23: All available:1024 mlids are taken > > > > > > > > > Jan 11 18:30:42 730691 [40401960] -> osm_mcmr_rcv_create_new_mgrp: ERR 1B19: __get_new_mlid failed > > > > > > > > > > > > > > > > > > > > > > > > > > > Here is the output from my small test cluster: (ibnodesinmcast uses saquery a > > > > > > > > > couple of times to print this nice report.) > > > > > > > > > > > > > > > > > > > > > > > > > > > 19:17:24 > whatsup > > > > > > > > > up: 9: wopr[0-7],wopri > > > > > > > > > down: 0: > > > > > > > > > root at wopri:/tftpboot/images > > > > > > > > > 19:25:03 > ibnodesinmcast -g > > > > > > > > > 0xC000 (0xff12401bffff0000 : 0x00000000ffffffff) > > > > > > > > > In 9: wopr[0-7],wopri > > > > > > > > > Out 0: 0 > > > > > > > > > 0xC001 (0xff12401bffff0000 : 0x0000000000000001) > > > > > > > > > In 9: wopr[0-7],wopri > > > > > > > > > Out 0: 0 > > > > > > > > > 0xC002 (0xff12601bffff0000 : 0x00000001ff2265ed) > > > > > > > > > In 1: wopr3 > > > > > > > > > Out 8: wopr[0-2,4-7],wopri > > > > > > > > > 0xC003 (0xff12601bffff0000 : 0x0000000000000001) > > > > > > > > > In 9: wopr[0-7],wopri > > > > > > > > > Out 0: 0 > > > > > > > > > 0xC004 (0xff12601bffff0000 : 0x00000001ff222729) > > > > > > > > > In 1: wopr4 > > > > > > > > > Out 8: wopr[0-3,5-7],wopri > > > > > > > > > 0xC005 (0xff12601bffff0000 : 0x00000001ff219e65) > > > > > > > > > In 1: wopri > > > > > > > > > Out 8: wopr[0-7] > > > > > > > > > 0xC006 (0xff12601bffff0000 : 0x00000001ff00232d) > > > > > > > > > In 1: wopr6 > > > > > > > > > Out 8: wopr[0-5,7],wopri > > > > > > > > > 0xC007 (0xff12601bffff0000 : 0x00000001ff002325) > > > > > > > > > In 1: wopr7 > > > > > > > > > Out 8: 
wopr[0-6],wopri > > > > > > > > > 0xC008 (0xff12601bffff0000 : 0x00000001ff228d35) > > > > > > > > > In 1: wopr1 > > > > > > > > > Out 8: wopr[0,2-7],wopri > > > > > > > > > 0xC009 (0xff12601bffff0000 : 0x00000001ff2227f1) > > > > > > > > > In 1: wopr2 > > > > > > > > > Out 8: wopr[0-1,3-7],wopri > > > > > > > > > 0xC00A (0xff12601bffff0000 : 0x00000001ff219ef1) > > > > > > > > > In 1: wopr0 > > > > > > > > > Out 8: wopr[1-7],wopri > > > > > > > > > 0xC00B (0xff12601bffff0000 : 0x00000001ff0021e9) > > > > > > > > > In 1: wopr5 > > > > > > > > > Out 8: wopr[0-4,6-7],wopri > > > > > > > > > > > > > > > > > > > > > > > > > > > Each of these MGIDS of the prefix (0xff12601bffff0000) have just one node in > > > > > > > > > them and represent an ipv6 address. Could you turn off ipv6 with the latest > > > > > > > > > IPoIB? > > > > > > > > > > > > > > > > > > In a bind, > > > > > > > > > Ira > > > > > > > > > _______________________________________________ > > > > > > > > > general mailing list > > > > > > > > > general at lists.openfabrics.org > > > > > > > > > http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general > > > > > > > > > > > > > > > > > > To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general > > > > > > > > > > > > > > _______________________________________________ > > > > > > > general mailing list > > > > > > > general at lists.openfabrics.org > > > > > > > http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general > > > > > > > > > > > > > > To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general > > > > > > > > > > > > > > > > > > > _______________________________________________ > > > > > general mailing list > > > > > general at lists.openfabrics.org > > > > > http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general > > > > > > > > > > To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general > > _______________________________________________ > > general mailing list > 
> general at lists.openfabrics.org > > http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general > > > > To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general From hrosenstock at xsigo.com Mon Jan 14 17:26:38 2008 From: hrosenstock at xsigo.com (Hal Rosenstock) Date: Mon, 14 Jan 2008 17:26:38 -0800 Subject: [ofa-general] Re: [PATCH 0/3] clean up "__get_mgrp_by_mgid" and add option --consolodate_ipv6_snm_req In-Reply-To: <20080115013138.GJ16009@sashak.voltaire.com> References: <20080114114525.1555fa6c.weiny2@llnl.gov> <20080115001118.GE16009@sashak.voltaire.com> <1200355791.8962.153.camel@hrosenstock-ws.xsigo.com> <20080115004237.GF16009@sashak.voltaire.com> <1200359296.8962.186.camel@hrosenstock-ws.xsigo.com> <20080115013138.GJ16009@sashak.voltaire.com> Message-ID: <1200360398.8962.201.camel@hrosenstock-ws.xsigo.com> On Tue, 2008-01-15 at 01:31 +0000, Sasha Khapyorsky wrote: > On 17:08 Mon 14 Jan , Hal Rosenstock wrote: > > On Tue, 2008-01-15 at 00:42 +0000, Sasha Khapyorsky wrote: > > > On 16:09 Mon 14 Jan , Hal Rosenstock wrote: > > > > On Tue, 2008-01-15 at 00:11 +0000, Sasha Khapyorsky wrote: > > > > > On 11:45 Mon 14 Jan , Ira Weiny wrote: > > > > > > The following 3 patches are a much cleaner implementation of what I sent to the > > > > > > list on Friday. > > > > > > > > > > > > The first 2 patches are just code clean up and I feel should be applied. The > > > > > > 3rd adds the option --consolodate_ipv6_snm_req which causes all IPv6 Solicited > > > > > > Node Multicast requests to be grouped into one MCast group per partition. > > > > > > > > > > All three patches are applied. Thanks. > > > > > > > > This seems a little fast for me. > > > > > > It looks like a nice start for me. > > > > > > > I didn't think we were even done with > > > > the discussion yet. > > > > > > Sure, and we can continue from this point. 
> > >
> > > (Ira stated that this patch series are not a final solution, but already
> > > provides sophisticated functionality right now. The feature itself is
> > > optional and I don't see a big risk here.)
> >
> > It may work in Ira's configuration but I'm not even sure it follows the
> > IBA spec.
>
> Why it is not?

I'll dig this out shortly; it's been discussed on the list before.

> > It may also cause issues if misused (e.g. MTU, rate,
> > partitions, etc.).
>
> Right, but it is optional. Default OpenSM behavior is not changed.

That doesn't stop someone else from misusing it; e.g. IPv6 is broken, or
in the presence of partitioning, certain things don't work right, etc. I
may be wrong, but I think that, depending on whether and how quickly the
shortcomings are addressed, this becomes a support issue.

-- Hal

> Sasha

From sashak at voltaire.com Mon Jan 14 17:43:55 2008
From: sashak at voltaire.com (Sasha Khapyorsky)
Date: Tue, 15 Jan 2008 01:43:55 +0000
Subject: [PATCH] opensm: Special Case the IPv6 Solicited Node Multicast address to use a single Mcast (WAS: Re: [ofa-general] IPoIB, OFED 1.2.5, and multicast groups.)
In-Reply-To: <1200360160.8962.196.camel@hrosenstock-ws.xsigo.com>
References: <20080111193657.58477fb0.weiny2@llnl.gov> <20080111220456.0d62de97.weiny2@llnl.gov> <20080112000117.6b52b53c.weiny2@llnl.gov> <20080114105132.19eafcee.weiny2@llnl.gov> <1200342214.8962.75.camel@hrosenstock-ws.xsigo.com> <20080114153546.73d43d6b.weiny2@llnl.gov> <1200355500.8962.148.camel@hrosenstock-ws.xsigo.com> <20080115005045.GG16009@sashak.voltaire.com> <1200360160.8962.196.camel@hrosenstock-ws.xsigo.com>
Message-ID: <20080115014355.GK16009@sashak.voltaire.com>

On 17:22 Mon 14 Jan , Hal Rosenstock wrote:
> On Tue, 2008-01-15 at 00:50 +0000, Sasha Khapyorsky wrote:
> > On 16:05 Mon 14 Jan , Hal Rosenstock wrote:
> > > On Mon, 2008-01-14 at 15:35 -0800, Ira Weiny wrote:
> > > > On Mon, 14 Jan 2008 12:23:34 -0800
> > > > Hal Rosenstock wrote:
> > > >
> > > > > On Mon, 2008-01-14 at 10:51 -0800, Ira Weiny wrote:
> > > > > > Hey Hal, thanks for the response. Comments below.
> > > > > >
> > > > > > On Mon, 14 Jan 2008 12:57:45 -0500
> > > > > > "Hal Rosenstock" wrote:
> > > > > >
> > > > > > > Hi Ira,
> > > > > > >
> > > > > > > On 1/12/08, Ira Weiny wrote:
> > > > > > > > And to further answer my question...[*]
> > > > > > > >
> > > > > > > > This seems to fix the problem for us, however I know that it could be better.
> > > > > > > > For example it only takes care of partition 0xFFFF, and I think Jason's idea of
> > > > > > > > having say 16 Mcast Groups and some hash of these into them would be nice. But
> > > > > > > > is this on the right track? Am I missing some other place in the code?
> > > > > > >
> > > > > > > This is a start.
> > > > > > >
> > > > > > > Some initial comments on a quick scan of the approach used:
> > > > > > >
> > > > > > > This assumes a homogeneous subnet (in terms of rates and MTUs). I
> > > > > > > think that only groups which share the same rate and MTU can share the
> > > > > > > same MLID.
> > > > > >
> > > > > > Ah indeed this might be an issue.
This might not be the best place for the
> > > > > > code. :-(
> > > > > >
> > > > > > >
> > > > > > > Also, MLIDs will now need to be use counted and only removed when all
> > > > > > > the groups sharing that MLID are removed.
> > > > > >
> > > > > > I don't quite understand what you mean here. There is still a 1:1 mapping of
> > > > > > MLID's to MGID's.
> > > > >
> > > > > Didn't you just change that in that many MGIDs go to one MLID ?
> > > >
> > > > Ah, this is where the confusion has been. No, this is _not_ what I did... I
> > > > see now; that is what was proposed in the thread a year ago, however, I don't
> > > > think mapping many MGIDs to 1 MLID will work well.
> > >
> > > Why not ?
> > >
> > > It appears to be what you did (multiple MGIDs are mapped onto MLID (in
> > > the case below 0xc002)). Am I mistaken ?
> >
> > As far as I understand this patch it is the different. Here multiple
> > ports which match ipv6 solicited node multicast address will try to
> > join a single MC group (with single MGID and unique MLID).
>
> I don't think you are using the IBA defined terminology.
>
> A MC group is an MGID in terms of the IBA spec. Also, the SA GetTable
> with MGIDs wildcarded shows all the MGIDs. (Does it show that "special"
> MGID ?)

Yes, it has MLID 0xc002 in Ira's 'saquery -g' example.

> I would phrase this differently:
> All IPv6 SNM groups are mapped to a single MLID (when this feature is
> enabled).

No, all ports join single IPv6 SNM MC group, and yes, it has single MGID
(and single MLID).

Sasha

> It so happens that OpenSM internally does the accounting on
> membership by treating them all as members of the same "base" or
> "masked" group by masking off partition and the low 24 bits (port GUID).
>
> -- Hal
>
> > Sasha
> >
> > >
> > > > What I did was to allow the first IPv6 request to create the group and then all
> > > > other requests were added to this group.
> > > > > > You are using the word group loosely here and that is the source of the > > > confusion IMO. I think by group you mean MLID. > > > > > > > This sends all the neighbor discovery messages to all nodes on the network. > > > > > > All nodes part of that MLID tree. > > > > > > > This might seem inefficient but should work. (... and seems to.) > > > > > > Sure; the hosts will filter based on MGID. The tradeoff is MLID > > > utilization versus fabric utilization. > > > > > > > > > All of the requests for this type of MGRP join are routed to > > > > > > one group. Therefore, I thought the same rules for deleting the group would > > > > > > apply; when all the members are gone it is removed? > > > > > > > > > > Yes, the group may go but not the underlying MLID as there are other > > > > > groups which are sharing this. That's not what happens now. > > > > > > > > No, since there is only 1 group in this implementation it should work like > > > > others. The first node of this "mgid type" will create the group. Others will > > > > join it and will continue to use it even if the creator leaves. > > > > > > Are you saying all these groups appear as 1 "group" to OpenSM (as the > > > real groups are masked to the same value) ? > > > > > > -- Hal > > > > > > > Does this make more sense? 
> > > > > > > > Ira > > > > > > > > > > > > > > > Just to be clear, after > > > > > > this patch the mgroups are: > > > > > > > > > > > > 09:36:40 > saquery -g > > > > > > MCMemberRecord group dump: > > > > > > MGID....................0xff12401bffff0000 : 0x00000000ffffffff > > > > > > Mlid....................0xC000 > > > > > > Mtu.....................0x84 > > > > > > pkey....................0xFFFF > > > > > > Rate....................0x83 > > > > > > MCMemberRecord group dump: > > > > > > MGID....................0xff12401bffff0000 : 0x0000000000000001 > > > > > > Mlid....................0xC001 > > > > > > Mtu.....................0x84 > > > > > > pkey....................0xFFFF > > > > > > Rate....................0x83 > > > > > > MCMemberRecord group dump: > > > > > > MGID....................0xff12601bffff0000 : 0x00000001ff0021e9 > > > > > > Mlid....................0xC002 > > > > > > Mtu.....................0x84 > > > > > > pkey....................0xFFFF > > > > > > Rate....................0x83 > > > > > > MCMemberRecord group dump: > > > > > > MGID....................0xff12601bffff0000 : 0x0000000000000001 > > > > > > Mlid....................0xC003 > > > > > > Mtu.....................0x84 > > > > > > pkey....................0xFFFF > > > > > > Rate....................0x83 > > > > > > > > > > > > All of these requests are added to the > > > > > > MGID....................0xff12601bffff0000 : 0x00000001ff0021e9 > > > > > > Mlid....................0xC002 > > > > > > group. But as you say, how do we determine that the pkey, mtu, and rate are > > > > > > valid? :-/ > > > > > > > > > > > > But here is a question: > > > > > > > > > > > > What happens if someone with an incorrect MTU tries to join the > > > > > > MGID....................0xff12401bffff0000 : 0x0000000000000001 > > > > > > group? Wouldn't this code return this mgrp pointer and the subsequent MTU and > > > > > > rate checks fail? I seem to recall a thread discussing this before. 
I don't > > > > > > remember what the outcome was. I seem to remember the question was if OpenSM > > > > > > should create/modify a group to the "lowest common" MTU/Rate, and succeed all > > > > > > the joins, vs enforcing the faster MTU/Rate and failing the joins. > > > > > > > > > > Yes, the join would fail, but I don't think that's what we would want. > > > > > The alternative with the patch is to make it the lowest rate but there > > > > > is a minimum MTU which might not be right. > > > > > > > > > > > > I think this is a policy and rather than this always being the case, > > > > > > > there should be a policy parameter added to OpenSM for this. IMO > > > > > > > default should be to not do this. > > > > > > > > > > > > Yes, for sure there needs to be some options to control the behavior. > > > > > > > > > > > > > > > > > > > > Maybe more later... > > > > > > > > > > > > Thanks again, > > > > > > Ira > > > > > > > > > > > > > > > > > > > > -- Hal > > > > > > > > > > > > > > > Thanks, > > > > > > > > Ira > > > > > > > > > > > > > > > > [*] Again I apologize for the spam but we were in a bit of a panic as we only > > > > > > > > have the big system for the weekend and IB was not part of the test... ;-) > > > > > > > > > > > > > > > > >From 35e35a9534bd49147886ac93ab1601acadcdbe26 Mon Sep 17 00:00:00 2001 > > > > > > > > From: Ira K. Weiny > > > > > > > > Date: Fri, 11 Jan 2008 22:58:19 -0800 > > > > > > > > Subject: [PATCH] Special Case the IPv6 Solicited Node Multicast address to use a single Mcast > > > > > > > > Group. 
> > > > > > > > > > > > > > > > Signed-off-by: root > > > > > > > > --- > > > > > > > > opensm/opensm/osm_sa_mcmember_record.c | 30 +++++++++++++++++++++++++++++- > > > > > > > > opensm/opensm/osm_sa_path_record.c | 31 ++++++++++++++++++++++++++++++- > > > > > > > > 2 files changed, 59 insertions(+), 2 deletions(-) > > > > > > > > > > > > > > > > diff --git a/opensm/opensm/osm_sa_mcmember_record.c b/opensm/opensm/osm_sa_mcmember_record.c > > > > > > > > index 8eb97ad..6bcc124 100644 > > > > > > > > --- a/opensm/opensm/osm_sa_mcmember_record.c > > > > > > > > +++ b/opensm/opensm/osm_sa_mcmember_record.c > > > > > > > > @@ -124,9 +124,37 @@ __search_mgrp_by_mgid(IN cl_map_item_t * const p_map_item, IN void *context) > > > > > > > > /* compare entire MGID so different scope will not sneak in for > > > > > > > > the same MGID */ > > > > > > > > if (memcmp(&p_mgrp->mcmember_rec.mgid, > > > > > > > > - &p_recvd_mcmember_rec->mgid, sizeof(ib_gid_t))) > > > > > > > > + &p_recvd_mcmember_rec->mgid, sizeof(ib_gid_t))) { > > > > > > > > + > > > > > > > > + /* Special Case IPV6 Multicast Loopback addresses */ > > > > > > > > + /* 0xff12601bffff0000 : 0x00000001ffXXXXXX */ > > > > > > > > +#define SPEC_PREFIX (0xff12601bffff0000) > > > > > > > > +#define INT_ID_MASK (0x00000001ff000000) > > > > > > > > + uint64_t g_prefix = cl_ntoh64(p_mgrp->mcmember_rec.mgid.unicast.prefix); > > > > > > > > + uint64_t g_interface_id = cl_ntoh64(p_mgrp->mcmember_rec.mgid.unicast.interface_id); > > > > > > > > + uint64_t rcv_prefix = cl_ntoh64(p_recvd_mcmember_rec->mgid.unicast.prefix); > > > > > > > > + uint64_t rcv_interface_id = cl_ntoh64(p_recvd_mcmember_rec->mgid.unicast.interface_id); > > > > > > > > + > > > > > > > > + if (rcv_prefix == SPEC_PREFIX > > > > > > > > + && > > > > > > > > + (rcv_interface_id & INT_ID_MASK) == INT_ID_MASK) { > > > > > > > > + > > > > > > > > + if ((g_prefix == rcv_prefix) > > > > > > > > + && > > > > > > > > + (g_interface_id & INT_ID_MASK) == > > > > > > > > 
+ (rcv_interface_id & INT_ID_MASK) > > > > > > > > + ) { > > > > > > > > + osm_log(sa->p_log, OSM_LOG_INFO, > > > > > > > > + "Special Case Mcast Join for MGID " > > > > > > > > + " MGID 0x%016"PRIx64" : 0x%016"PRIx64"\n", > > > > > > > > + rcv_prefix, rcv_interface_id); > > > > > > > > + goto match; > > > > > > > > + } > > > > > > > > + } > > > > > > > > return; > > > > > > > > + } > > > > > > > > > > > > > > > > +match: > > > > > > > > if (p_ctxt->p_mgrp) { > > > > > > > > osm_log(sa->p_log, OSM_LOG_ERROR, > > > > > > > > "__search_mgrp_by_mgid: ERR 1B03: " > > > > > > > > diff --git a/opensm/opensm/osm_sa_path_record.c b/opensm/opensm/osm_sa_path_record.c > > > > > > > > index 749a936..469773a 100644 > > > > > > > > --- a/opensm/opensm/osm_sa_path_record.c > > > > > > > > +++ b/opensm/opensm/osm_sa_path_record.c > > > > > > > > @@ -1536,8 +1536,37 @@ __search_mgrp_by_mgid(IN cl_map_item_t * const p_map_item, IN void *context) > > > > > > > > > > > > > > > > /* compare entire MGID so different scope will not sneak in for > > > > > > > > the same MGID */ > > > > > > > > - if (memcmp(&p_mgrp->mcmember_rec.mgid, p_recvd_mgid, sizeof(ib_gid_t))) > > > > > > > > + if (memcmp(&p_mgrp->mcmember_rec.mgid, p_recvd_mgid, sizeof(ib_gid_t))) { > > > > > > > > + > > > > > > > > + /* Special Case IPV6 Multicast Loopback addresses */ > > > > > > > > + /* 0xff12601bffff0000 : 0x00000001ffXXXXXX */ > > > > > > > > +#define SPEC_PREFIX (0xff12601bffff0000) > > > > > > > > +#define INT_ID_MASK (0x00000001ff000000) > > > > > > > > + uint64_t g_prefix = cl_ntoh64(p_mgrp->mcmember_rec.mgid.unicast.prefix); > > > > > > > > + uint64_t g_interface_id = cl_ntoh64(p_mgrp->mcmember_rec.mgid.unicast.interface_id); > > > > > > > > + uint64_t rcv_prefix = cl_ntoh64(p_recvd_mgid->unicast.prefix); > > > > > > > > + uint64_t rcv_interface_id = cl_ntoh64(p_recvd_mgid->unicast.interface_id); > > > > > > > > + > > > > > > > > + if (rcv_prefix == SPEC_PREFIX > > > > > > > > + && > > > > > > > > + 
(rcv_interface_id & INT_ID_MASK) == INT_ID_MASK) { > > > > > > > > + > > > > > > > > + if ((g_prefix == rcv_prefix) > > > > > > > > + && > > > > > > > > + (g_interface_id & INT_ID_MASK) == > > > > > > > > + (rcv_interface_id & INT_ID_MASK) > > > > > > > > + ) { > > > > > > > > + osm_log(sa->p_log, OSM_LOG_INFO, > > > > > > > > + "Special Case Mcast Join for MGID " > > > > > > > > + " MGID 0x%016"PRIx64" : 0x%016"PRIx64"\n", > > > > > > > > + rcv_prefix, rcv_interface_id); > > > > > > > > + goto match; > > > > > > > > + } > > > > > > > > + } > > > > > > > > return; > > > > > > > > + } > > > > > > > > + > > > > > > > > +match: > > > > > > > > > > > > > > > > #if 0 > > > > > > > > for (i = 0; > > > > > > > > -- > > > > > > > > 1.5.1 > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Fri, 11 Jan 2008 22:04:56 -0800 > > > > > > > > Ira Weiny wrote: > > > > > > > > > > > > > > > > > Ok, > > > > > > > > > > > > > > > > > > I found my own answer. Sorry for the spam. > > > > > > > > > > > > > > > > > > http://lists.openfabrics.org/pipermail/general/2006-November/029617.html > > > > > > > > > > > > > > > > > > Sorry, > > > > > > > > > Ira > > > > > > > > > > > > > > > > > > > > > > > > > > > On Fri, 11 Jan 2008 19:36:57 -0800 > > > > > > > > > Ira Weiny wrote: > > > > > > > > > > > > > > > > > > > I don't really understand the innerworkings of IPoIB so forgive me if this is a > > > > > > > > > > really stupid question but: > > > > > > > > > > > > > > > > > > > > Is it a bug that there is a Multicast group created for every node in our > > > > > > > > > > clusters? > > > > > > > > > > > > > > > > > > > > If not a bug why is this done? We just tried to boot on a 1151 node cluster > > > > > > > > > > and opensm is complaining there are not enough multicast groups. 
> > > > > > > > > > > > > > > > > > > > Jan 11 18:30:42 728984 [40C05960] -> __get_new_mlid: ERR 1B23: All available:1024 mlids are taken > > > > > > > > > > Jan 11 18:30:42 729050 [40C05960] -> osm_mcmr_rcv_create_new_mgrp: ERR 1B19: __get_new_mlid failed > > > > > > > > > > Jan 11 18:30:42 730647 [40401960] -> __get_new_mlid: ERR 1B23: All available:1024 mlids are taken > > > > > > > > > > Jan 11 18:30:42 730691 [40401960] -> osm_mcmr_rcv_create_new_mgrp: ERR 1B19: __get_new_mlid failed > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Here is the output from my small test cluster: (ibnodesinmcast uses saquery a > > > > > > > > > > couple of times to print this nice report.) > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > 19:17:24 > whatsup > > > > > > > > > > up: 9: wopr[0-7],wopri > > > > > > > > > > down: 0: > > > > > > > > > > root at wopri:/tftpboot/images > > > > > > > > > > 19:25:03 > ibnodesinmcast -g > > > > > > > > > > 0xC000 (0xff12401bffff0000 : 0x00000000ffffffff) > > > > > > > > > > In 9: wopr[0-7],wopri > > > > > > > > > > Out 0: 0 > > > > > > > > > > 0xC001 (0xff12401bffff0000 : 0x0000000000000001) > > > > > > > > > > In 9: wopr[0-7],wopri > > > > > > > > > > Out 0: 0 > > > > > > > > > > 0xC002 (0xff12601bffff0000 : 0x00000001ff2265ed) > > > > > > > > > > In 1: wopr3 > > > > > > > > > > Out 8: wopr[0-2,4-7],wopri > > > > > > > > > > 0xC003 (0xff12601bffff0000 : 0x0000000000000001) > > > > > > > > > > In 9: wopr[0-7],wopri > > > > > > > > > > Out 0: 0 > > > > > > > > > > 0xC004 (0xff12601bffff0000 : 0x00000001ff222729) > > > > > > > > > > In 1: wopr4 > > > > > > > > > > Out 8: wopr[0-3,5-7],wopri > > > > > > > > > > 0xC005 (0xff12601bffff0000 : 0x00000001ff219e65) > > > > > > > > > > In 1: wopri > > > > > > > > > > Out 8: wopr[0-7] > > > > > > > > > > 0xC006 (0xff12601bffff0000 : 0x00000001ff00232d) > > > > > > > > > > In 1: wopr6 > > > > > > > > > > Out 8: wopr[0-5,7],wopri > > > > > > > > > > 0xC007 (0xff12601bffff0000 : 
0x00000001ff002325) > > > > > > > > > > In 1: wopr7 > > > > > > > > > > Out 8: wopr[0-6],wopri > > > > > > > > > > 0xC008 (0xff12601bffff0000 : 0x00000001ff228d35) > > > > > > > > > > In 1: wopr1 > > > > > > > > > > Out 8: wopr[0,2-7],wopri > > > > > > > > > > 0xC009 (0xff12601bffff0000 : 0x00000001ff2227f1) > > > > > > > > > > In 1: wopr2 > > > > > > > > > > Out 8: wopr[0-1,3-7],wopri > > > > > > > > > > 0xC00A (0xff12601bffff0000 : 0x00000001ff219ef1) > > > > > > > > > > In 1: wopr0 > > > > > > > > > > Out 8: wopr[1-7],wopri > > > > > > > > > > 0xC00B (0xff12601bffff0000 : 0x00000001ff0021e9) > > > > > > > > > > In 1: wopr5 > > > > > > > > > > Out 8: wopr[0-4,6-7],wopri > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Each of these MGIDS of the prefix (0xff12601bffff0000) have just one node in > > > > > > > > > > them and represent an ipv6 address. Could you turn off ipv6 with the latest > > > > > > > > > > IPoIB? > > > > > > > > > > > > > > > > > > > > In a bind, > > > > > > > > > > Ira > > > > > > > > > > _______________________________________________ > > > > > > > > > > general mailing list > > > > > > > > > > general at lists.openfabrics.org > > > > > > > > > > http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general > > > > > > > > > > > > > > > > > > > > To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general > > > > > > > > > > > > > > > > _______________________________________________ > > > > > > > > general mailing list > > > > > > > > general at lists.openfabrics.org > > > > > > > > http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general > > > > > > > > > > > > > > > > To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general > > > > > > > > > > > > > > > > > > > > > > _______________________________________________ > > > > > > general mailing list > > > > > > general at lists.openfabrics.org > > > > > > http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general > > > > > > > > > > > 
> To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general

From hrosenstock at xsigo.com Mon Jan 14 17:40:33 2008
From: hrosenstock at xsigo.com (Hal Rosenstock)
Date: Mon, 14 Jan 2008 17:40:33 -0800
Subject: [PATCH] opensm: Special Case the IPv6 Solicited Node Multicast address to use a single Mcast (WAS: Re: [ofa-general] IPoIB, OFED 1.2.5, and multicast groups.)
In-Reply-To: <20080115014355.GK16009@sashak.voltaire.com>
References: <20080111193657.58477fb0.weiny2@llnl.gov> <20080111220456.0d62de97.weiny2@llnl.gov> <20080112000117.6b52b53c.weiny2@llnl.gov> <20080114105132.19eafcee.weiny2@llnl.gov> <1200342214.8962.75.camel@hrosenstock-ws.xsigo.com> <20080114153546.73d43d6b.weiny2@llnl.gov> <1200355500.8962.148.camel@hrosenstock-ws.xsigo.com> <20080115005045.GG16009@sashak.voltaire.com> <1200360160.8962.196.camel@hrosenstock-ws.xsigo.com> <20080115014355.GK16009@sashak.voltaire.com>
Message-ID: <1200361233.8962.206.camel@hrosenstock-ws.xsigo.com>

On Tue, 2008-01-15 at 01:43 +0000, Sasha Khapyorsky wrote:
> On 17:22 Mon 14 Jan , Hal Rosenstock wrote:
> > On Tue, 2008-01-15 at 00:50 +0000, Sasha Khapyorsky wrote:
> > > On 16:05 Mon 14 Jan , Hal Rosenstock wrote:
> > > > On Mon, 2008-01-14 at 15:35 -0800, Ira Weiny wrote:
> > > > > On Mon, 14 Jan 2008 12:23:34 -0800
> > > > > Hal Rosenstock wrote:
> > > > >
> > > > > > On Mon, 2008-01-14 at 10:51 -0800, Ira Weiny wrote:
> > > > > > > Hey Hal, thanks for the response. Comments below.
> > > > > > >
> > > > > > > On Mon, 14 Jan 2008 12:57:45 -0500
> > > > > > > "Hal Rosenstock" wrote:
> > > > > > >
> > > > > > > > Hi Ira,
> > > > > > > >
> > > > > > > > On 1/12/08, Ira Weiny wrote:
> > > > > > > > > And to further answer my question...[*]
> > > > > > > > >
> > > > > > > > > This seems to fix the problem for us, however I know that it could be better.
> > > > > > > > > For example it only takes care of partition 0xFFFF, and I think Jason's idea of
> > > > > > > > > having say 16 Mcast Groups and some hash of these into them would be nice. But
> > > > > > > > > is this on the right track? Am I missing some other place in the code?
> > > > > > > >
> > > > > > > > This is a start.
> > > > > > > >
> > > > > > > > Some initial comments on a quick scan of the approach used:
> > > > > > > >
> > > > > > > > This assumes a homogeneous subnet (in terms of rates and MTUs). I
> > > > > > > > think that only groups which share the same rate and MTU can share the
> > > > > > > > same MLID.
> > > > > > >
> > > > > > > Ah indeed this might be an issue. This might not be the best place for the
> > > > > > > code. :-(
> > > > > > >
> > > > > > > >
> > > > > > > > Also, MLIDs will now need to be use counted and only removed when all
> > > > > > > > the groups sharing that MLID are removed.
> > > > > > >
> > > > > > > I don't quite understand what you mean here. There is still a 1:1 mapping of
> > > > > > > MLID's to MGID's.
> > > > > >
> > > > > > Didn't you just change that in that many MGIDs go to one MLID ?
> > > > >
> > > > > Ah, this is where the confusion has been. No, this is _not_ what I did... I
> > > > > see now; that is what was proposed in the thread a year ago, however, I don't
> > > > > think mapping many MGIDs to 1 MLID will work well.
> > > >
> > > > Why not ?
> > > >
> > > > It appears to be what you did (multiple MGIDs are mapped onto MLID (in
> > > > the case below 0xc002)). Am I mistaken ?
> > >
> > > As far as I understand this patch it is the different. Here multiple
> > > ports which match ipv6 solicited node multicast address will try to
> > > join a single MC group (with single MGID and unique MLID).
> >
> > I don't think you are using the IBA defined terminology.
> >
> > A MC group is an MGID in terms of the IBA spec. Also, the SA GetTable
> > with MGIDs wildcarded shows all the MGIDs. (Does it show that "special"
> > MGID ?)
>
> Yes, it has MLID 0xc002 in Ira's 'saquery -g' example.

No, MLID is not the group (at least in IBA terms); I was referring to the
base SNM MGID (with partition and low 24 bits masked off).

> > I would phrase this differently:
> > All IPv6 SNM groups are mapped to a single MLID (when this feature is
> > enabled).
>
> No, all ports join single IPv6 SNM MC group, and yes, it has single MGID
> (and single MLID).

It does not have a single MGID; it has many MGIDs including the base one
(just look at the group dump).

-- Hal

> Sasha
>
> > It so happens that OpenSM internally does the accounting on
> > membership by treating them all as members of the same "base" or
> > "masked" group by masking off partition and the low 24 bits (port GUID).
> >
> > -- Hal
> >
> > > Sasha
> > >
> > > >
> > > > > What I did was to allow the first IPv6 request to create the group and then all
> > > > > other requests were added to this group.
> > > >
> > > > You are using the word group loosely here and that is the source of the
> > > > confusion IMO. I think by group you mean MLID.
> > > >
> > > > > This sends all the neighbor discovery messages to all nodes on the network.
> > > >
> > > > All nodes part of that MLID tree.
> > > >
> > > > > This might seem inefficient but should work. (... and seems to.)
> > > >
> > > > Sure; the hosts will filter based on MGID. The tradeoff is MLID
> > > > utilization versus fabric utilization.
> > > >
> > > > > > > All of the requests for this type of MGRP join are routed to
> > > > > > > one group.
Therefore, I thought the same rules for deleting the group would > > > > > > > apply; when all the members are gone it is removed? > > > > > > > > > > > > Yes, the group may go but not the underlying MLID as there are other > > > > > > groups which are sharing this. That's not what happens now. > > > > > > > > > > No, since there is only 1 group in this implementation it should work like > > > > > others. The first node of this "mgid type" will create the group. Others will > > > > > join it and will continue to use it even if the creator leaves. > > > > > > > > Are you saying all these groups appear as 1 "group" to OpenSM (as the > > > > real groups are masked to the same value) ? > > > > > > > > -- Hal > > > > > > > > > Does this make more sense? > > > > > > > > > > Ira > > > > > > > > > > > > > > > > > > Just to be clear, after > > > > > > > this patch the mgroups are: > > > > > > > > > > > > > > 09:36:40 > saquery -g > > > > > > > MCMemberRecord group dump: > > > > > > > MGID....................0xff12401bffff0000 : 0x00000000ffffffff > > > > > > > Mlid....................0xC000 > > > > > > > Mtu.....................0x84 > > > > > > > pkey....................0xFFFF > > > > > > > Rate....................0x83 > > > > > > > MCMemberRecord group dump: > > > > > > > MGID....................0xff12401bffff0000 : 0x0000000000000001 > > > > > > > Mlid....................0xC001 > > > > > > > Mtu.....................0x84 > > > > > > > pkey....................0xFFFF > > > > > > > Rate....................0x83 > > > > > > > MCMemberRecord group dump: > > > > > > > MGID....................0xff12601bffff0000 : 0x00000001ff0021e9 > > > > > > > Mlid....................0xC002 > > > > > > > Mtu.....................0x84 > > > > > > > pkey....................0xFFFF > > > > > > > Rate....................0x83 > > > > > > > MCMemberRecord group dump: > > > > > > > MGID....................0xff12601bffff0000 : 0x0000000000000001 > > > > > > > Mlid....................0xC003 > > > > > > > 
Mtu.....................0x84 > > > > > > > pkey....................0xFFFF > > > > > > > Rate....................0x83 > > > > > > > > > > > > > > All of these requests are added to the > > > > > > > MGID....................0xff12601bffff0000 : 0x00000001ff0021e9 > > > > > > > Mlid....................0xC002 > > > > > > > group. But as you say, how do we determine that the pkey, mtu, and rate are > > > > > > > valid? :-/ > > > > > > > > > > > > > > But here is a question: > > > > > > > > > > > > > > What happens if someone with an incorrect MTU tries to join the > > > > > > > MGID....................0xff12401bffff0000 : 0x0000000000000001 > > > > > > > group? Wouldn't this code return this mgrp pointer and the subsequent MTU and > > > > > > > rate checks fail? I seem to recall a thread discussing this before. I don't > > > > > > > remember what the outcome was. I seem to remember the question was if OpenSM > > > > > > > should create/modify a group to the "lowest common" MTU/Rate, and succeed all > > > > > > > the joins, vs enforcing the faster MTU/Rate and failing the joins. > > > > > > > > > > > > Yes, the join would fail, but I don't think that's what we would want. > > > > > > The alternative with the patch is to make it the lowest rate but there > > > > > > is a minimum MTU which might not be right. > > > > > > > > > > > > > > I think this is a policy and rather than this always being the case, > > > > > > > > there should be a policy parameter added to OpenSM for this. IMO > > > > > > > > default should be to not do this. > > > > > > > > > > > > > > Yes, for sure there needs to be some options to control the behavior. > > > > > > > > > > > > > > > > > > > > > > > Maybe more later... 
> > > > > > > > > > > > > > Thanks again, > > > > > > > Ira > > > > > > > > > > > > > > > > > > > > > > > -- Hal > > > > > > > > > > > > > > > > > Thanks, > > > > > > > > > Ira > > > > > > > > > > > > > > > > > > [*] Again I apologize for the spam but we were in a bit of a panic as we only > > > > > > > > > have the big system for the weekend and IB was not part of the test... ;-) > > > > > > > > > > > > > > > > > > >From 35e35a9534bd49147886ac93ab1601acadcdbe26 Mon Sep 17 00:00:00 2001 > > > > > > > > > From: Ira K. Weiny > > > > > > > > > Date: Fri, 11 Jan 2008 22:58:19 -0800 > > > > > > > > > Subject: [PATCH] Special Case the IPv6 Solicited Node Multicast address to use a single Mcast > > > > > > > > > Group. > > > > > > > > > > > > > > > > > > Signed-off-by: root > > > > > > > > > --- > > > > > > > > > opensm/opensm/osm_sa_mcmember_record.c | 30 +++++++++++++++++++++++++++++- > > > > > > > > > opensm/opensm/osm_sa_path_record.c | 31 ++++++++++++++++++++++++++++++- > > > > > > > > > 2 files changed, 59 insertions(+), 2 deletions(-) > > > > > > > > > > > > > > > > > > diff --git a/opensm/opensm/osm_sa_mcmember_record.c b/opensm/opensm/osm_sa_mcmember_record.c > > > > > > > > > index 8eb97ad..6bcc124 100644 > > > > > > > > > --- a/opensm/opensm/osm_sa_mcmember_record.c > > > > > > > > > +++ b/opensm/opensm/osm_sa_mcmember_record.c > > > > > > > > > @@ -124,9 +124,37 @@ __search_mgrp_by_mgid(IN cl_map_item_t * const p_map_item, IN void *context) > > > > > > > > > /* compare entire MGID so different scope will not sneak in for > > > > > > > > > the same MGID */ > > > > > > > > > if (memcmp(&p_mgrp->mcmember_rec.mgid, > > > > > > > > > - &p_recvd_mcmember_rec->mgid, sizeof(ib_gid_t))) > > > > > > > > > + &p_recvd_mcmember_rec->mgid, sizeof(ib_gid_t))) { > > > > > > > > > + > > > > > > > > > + /* Special Case IPV6 Multicast Loopback addresses */ > > > > > > > > > + /* 0xff12601bffff0000 : 0x00000001ffXXXXXX */ > > > > > > > > > +#define SPEC_PREFIX 
(0xff12601bffff0000) > > > > > > > > > +#define INT_ID_MASK (0x00000001ff000000) > > > > > > > > > + uint64_t g_prefix = cl_ntoh64(p_mgrp->mcmember_rec.mgid.unicast.prefix); > > > > > > > > > + uint64_t g_interface_id = cl_ntoh64(p_mgrp->mcmember_rec.mgid.unicast.interface_id); > > > > > > > > > + uint64_t rcv_prefix = cl_ntoh64(p_recvd_mcmember_rec->mgid.unicast.prefix); > > > > > > > > > + uint64_t rcv_interface_id = cl_ntoh64(p_recvd_mcmember_rec->mgid.unicast.interface_id); > > > > > > > > > + > > > > > > > > > + if (rcv_prefix == SPEC_PREFIX > > > > > > > > > + && > > > > > > > > > + (rcv_interface_id & INT_ID_MASK) == INT_ID_MASK) { > > > > > > > > > + > > > > > > > > > + if ((g_prefix == rcv_prefix) > > > > > > > > > + && > > > > > > > > > + (g_interface_id & INT_ID_MASK) == > > > > > > > > > + (rcv_interface_id & INT_ID_MASK) > > > > > > > > > + ) { > > > > > > > > > + osm_log(sa->p_log, OSM_LOG_INFO, > > > > > > > > > + "Special Case Mcast Join for MGID " > > > > > > > > > + " MGID 0x%016"PRIx64" : 0x%016"PRIx64"\n", > > > > > > > > > + rcv_prefix, rcv_interface_id); > > > > > > > > > + goto match; > > > > > > > > > + } > > > > > > > > > + } > > > > > > > > > return; > > > > > > > > > + } > > > > > > > > > > > > > > > > > > +match: > > > > > > > > > if (p_ctxt->p_mgrp) { > > > > > > > > > osm_log(sa->p_log, OSM_LOG_ERROR, > > > > > > > > > "__search_mgrp_by_mgid: ERR 1B03: " > > > > > > > > > diff --git a/opensm/opensm/osm_sa_path_record.c b/opensm/opensm/osm_sa_path_record.c > > > > > > > > > index 749a936..469773a 100644 > > > > > > > > > --- a/opensm/opensm/osm_sa_path_record.c > > > > > > > > > +++ b/opensm/opensm/osm_sa_path_record.c > > > > > > > > > @@ -1536,8 +1536,37 @@ __search_mgrp_by_mgid(IN cl_map_item_t * const p_map_item, IN void *context) > > > > > > > > > > > > > > > > > > /* compare entire MGID so different scope will not sneak in for > > > > > > > > > the same MGID */ > > > > > > > > > - if (memcmp(&p_mgrp->mcmember_rec.mgid, 
p_recvd_mgid, sizeof(ib_gid_t))) > > > > > > > > > + if (memcmp(&p_mgrp->mcmember_rec.mgid, p_recvd_mgid, sizeof(ib_gid_t))) { > > > > > > > > > + > > > > > > > > > + /* Special Case IPV6 Multicast Loopback addresses */ > > > > > > > > > + /* 0xff12601bffff0000 : 0x00000001ffXXXXXX */ > > > > > > > > > +#define SPEC_PREFIX (0xff12601bffff0000) > > > > > > > > > +#define INT_ID_MASK (0x00000001ff000000) > > > > > > > > > + uint64_t g_prefix = cl_ntoh64(p_mgrp->mcmember_rec.mgid.unicast.prefix); > > > > > > > > > + uint64_t g_interface_id = cl_ntoh64(p_mgrp->mcmember_rec.mgid.unicast.interface_id); > > > > > > > > > + uint64_t rcv_prefix = cl_ntoh64(p_recvd_mgid->unicast.prefix); > > > > > > > > > + uint64_t rcv_interface_id = cl_ntoh64(p_recvd_mgid->unicast.interface_id); > > > > > > > > > + > > > > > > > > > + if (rcv_prefix == SPEC_PREFIX > > > > > > > > > + && > > > > > > > > > + (rcv_interface_id & INT_ID_MASK) == INT_ID_MASK) { > > > > > > > > > + > > > > > > > > > + if ((g_prefix == rcv_prefix) > > > > > > > > > + && > > > > > > > > > + (g_interface_id & INT_ID_MASK) == > > > > > > > > > + (rcv_interface_id & INT_ID_MASK) > > > > > > > > > + ) { > > > > > > > > > + osm_log(sa->p_log, OSM_LOG_INFO, > > > > > > > > > + "Special Case Mcast Join for MGID " > > > > > > > > > + " MGID 0x%016"PRIx64" : 0x%016"PRIx64"\n", > > > > > > > > > + rcv_prefix, rcv_interface_id); > > > > > > > > > + goto match; > > > > > > > > > + } > > > > > > > > > + } > > > > > > > > > return; > > > > > > > > > + } > > > > > > > > > + > > > > > > > > > +match: > > > > > > > > > > > > > > > > > > #if 0 > > > > > > > > > for (i = 0; > > > > > > > > > -- > > > > > > > > > 1.5.1 > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Fri, 11 Jan 2008 22:04:56 -0800 > > > > > > > > > Ira Weiny wrote: > > > > > > > > > > > > > > > > > > > Ok, > > > > > > > > > > > > > > > > > > > > I found my own answer. Sorry for the spam. 
> > > > > > > > > > > > > > > > > > > > http://lists.openfabrics.org/pipermail/general/2006-November/029617.html > > > > > > > > > > > > > > > > > > > > Sorry, > > > > > > > > > > Ira > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Fri, 11 Jan 2008 19:36:57 -0800 > > > > > > > > > > Ira Weiny wrote: > > > > > > > > > > > > > > > > > > > > > I don't really understand the innerworkings of IPoIB so forgive me if this is a > > > > > > > > > > > really stupid question but: > > > > > > > > > > > > > > > > > > > > > > Is it a bug that there is a Multicast group created for every node in our > > > > > > > > > > > clusters? > > > > > > > > > > > > > > > > > > > > > > If not a bug why is this done? We just tried to boot on a 1151 node cluster > > > > > > > > > > > and opensm is complaining there are not enough multicast groups. > > > > > > > > > > > > > > > > > > > > > > Jan 11 18:30:42 728984 [40C05960] -> __get_new_mlid: ERR 1B23: All available:1024 mlids are taken > > > > > > > > > > > Jan 11 18:30:42 729050 [40C05960] -> osm_mcmr_rcv_create_new_mgrp: ERR 1B19: __get_new_mlid failed > > > > > > > > > > > Jan 11 18:30:42 730647 [40401960] -> __get_new_mlid: ERR 1B23: All available:1024 mlids are taken > > > > > > > > > > > Jan 11 18:30:42 730691 [40401960] -> osm_mcmr_rcv_create_new_mgrp: ERR 1B19: __get_new_mlid failed > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Here is the output from my small test cluster: (ibnodesinmcast uses saquery a > > > > > > > > > > > couple of times to print this nice report.) 
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > 19:17:24 > whatsup > > > > > > > > > > > up: 9: wopr[0-7],wopri > > > > > > > > > > > down: 0: > > > > > > > > > > > root at wopri:/tftpboot/images > > > > > > > > > > > 19:25:03 > ibnodesinmcast -g > > > > > > > > > > > 0xC000 (0xff12401bffff0000 : 0x00000000ffffffff) > > > > > > > > > > > In 9: wopr[0-7],wopri > > > > > > > > > > > Out 0: 0 > > > > > > > > > > > 0xC001 (0xff12401bffff0000 : 0x0000000000000001) > > > > > > > > > > > In 9: wopr[0-7],wopri > > > > > > > > > > > Out 0: 0 > > > > > > > > > > > 0xC002 (0xff12601bffff0000 : 0x00000001ff2265ed) > > > > > > > > > > > In 1: wopr3 > > > > > > > > > > > Out 8: wopr[0-2,4-7],wopri > > > > > > > > > > > 0xC003 (0xff12601bffff0000 : 0x0000000000000001) > > > > > > > > > > > In 9: wopr[0-7],wopri > > > > > > > > > > > Out 0: 0 > > > > > > > > > > > 0xC004 (0xff12601bffff0000 : 0x00000001ff222729) > > > > > > > > > > > In 1: wopr4 > > > > > > > > > > > Out 8: wopr[0-3,5-7],wopri > > > > > > > > > > > 0xC005 (0xff12601bffff0000 : 0x00000001ff219e65) > > > > > > > > > > > In 1: wopri > > > > > > > > > > > Out 8: wopr[0-7] > > > > > > > > > > > 0xC006 (0xff12601bffff0000 : 0x00000001ff00232d) > > > > > > > > > > > In 1: wopr6 > > > > > > > > > > > Out 8: wopr[0-5,7],wopri > > > > > > > > > > > 0xC007 (0xff12601bffff0000 : 0x00000001ff002325) > > > > > > > > > > > In 1: wopr7 > > > > > > > > > > > Out 8: wopr[0-6],wopri > > > > > > > > > > > 0xC008 (0xff12601bffff0000 : 0x00000001ff228d35) > > > > > > > > > > > In 1: wopr1 > > > > > > > > > > > Out 8: wopr[0,2-7],wopri > > > > > > > > > > > 0xC009 (0xff12601bffff0000 : 0x00000001ff2227f1) > > > > > > > > > > > In 1: wopr2 > > > > > > > > > > > Out 8: wopr[0-1,3-7],wopri > > > > > > > > > > > 0xC00A (0xff12601bffff0000 : 0x00000001ff219ef1) > > > > > > > > > > > In 1: wopr0 > > > > > > > > > > > Out 8: wopr[1-7],wopri > > > > > > > > > > > 0xC00B (0xff12601bffff0000 : 0x00000001ff0021e9) > > > > > > > > 
> > > In 1: wopr5 > > > > > > > > > > > Out 8: wopr[0-4,6-7],wopri > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Each of these MGIDS of the prefix (0xff12601bffff0000) have just one node in > > > > > > > > > > > them and represent an ipv6 address. Could you turn off ipv6 with the latest > > > > > > > > > > > IPoIB? > > > > > > > > > > > > > > > > > > > > > > In a bind, > > > > > > > > > > > Ira > > > > > > > > > > > _______________________________________________ > > > > > > > > > > > general mailing list > > > > > > > > > > > general at lists.openfabrics.org > > > > > > > > > > > http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general > > > > > > > > > > > > > > > > > > > > > > To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general > _______________________________________________ > general mailing list > general at lists.openfabrics.org > http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general > > To
unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general From hal.rosenstock at gmail.com Mon Jan 14 18:06:49 2008 From: hal.rosenstock at gmail.com (Hal Rosenstock) Date: Mon, 14 Jan 2008 21:06:49 -0500 Subject: [ofa-general] [PATCH] libibumad: umad_get_pkey() function In-Reply-To: <20080113193559.GH10650@sashak.voltaire.com> References: <200801061801.25386.dotanb@dev.mellanox.co.il> <4781F18F.1070506@voltaire.com> <4781FB41.6040204@dev.mellanox.co.il> <4782160B.1080709@dev.mellanox.co.il> <478217A6.80307@dev.mellanox.co.il> <47821FF3.7020705@dev.mellanox.co.il> <1199718187.20870.102.camel@hrosenstock-ws.xsigo.com> <20080113193435.GG10650@sashak.voltaire.com> <20080113193559.GH10650@sashak.voltaire.com> Message-ID: On 1/13/08, Sasha Khapyorsky wrote: > > This returns value of pkey_index in network byte order from user_mad > header. If we are running with kernel where pkey_index is not supported > yet it will return 0. > > Signed-off-by: Sasha Khapyorsky > --- > libibumad/include/infiniband/umad.h | 1 + > libibumad/src/libibumad.map | 1 + > libibumad/src/umad.c | 13 ++++++++++++- > 3 files changed, 14 insertions(+), 1 deletions(-) > > diff --git a/libibumad/include/infiniband/umad.h b/libibumad/include/infiniband/umad.h > index 681b440..742c7b0 100644 > --- a/libibumad/include/infiniband/umad.h > +++ b/libibumad/include/infiniband/umad.h > @@ -174,6 +174,7 @@ int umad_set_grh(void *umad, void *mad_addr); > int umad_set_addr_net(void *umad, int dlid, int dqp, int sl, int qkey); > int umad_set_addr(void *umad, int dlid, int dqp, int sl, int qkey); > int umad_set_pkey(void *umad, int pkey_index); > +int umad_get_pkey(void *umad); > > int umad_send(int portid, int agentid, void *umad, int length, > int timeout_ms, int retries); > diff --git a/libibumad/src/libibumad.map b/libibumad/src/libibumad.map > index 9444aa9..0154b7f 100644 > --- a/libibumad/src/libibumad.map > +++ b/libibumad/src/libibumad.map > @@ -15,6 +15,7 @@ IBUMAD_1.0 { > umad_size; > 
umad_set_grh; > umad_set_pkey; > + umad_get_pkey; Shouldn't running rev in libibumad.ver be updated to go along with this added API ? -- Hal > umad_set_addr; > umad_set_addr_net; > umad_send; > diff --git a/libibumad/src/umad.c b/libibumad/src/umad.c > index 1dc328d..b01e313 100644 > --- a/libibumad/src/umad.c > +++ b/libibumad/src/umad.c > @@ -722,7 +722,18 @@ umad_set_pkey(void *umad, int pkey_index) > struct ib_user_mad *mad = umad; > > if (new_user_mad_api) > - mad->addr.pkey_index = htons(pkey_index); > + mad->addr.pkey_index = pkey_index; > + > + return 0; > +} > + > +int > +umad_get_pkey(void *umad) > +{ > + struct ib_user_mad *mad = umad; > + > + if (new_user_mad_api) > + return mad->addr.pkey_index; > > return 0; > } > -- > 1.5.4.rc2.60.gb2e62 > > _______________________________________________ > general mailing list > general at lists.openfabrics.org > http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general > > To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general > From hrosenstock at xsigo.com Mon Jan 14 18:31:43 2008 From: hrosenstock at xsigo.com (Hal Rosenstock) Date: Mon, 14 Jan 2008 18:31:43 -0800 Subject: [ofa-general] [PATCH 3/3] Add option to Special Case the IPv6 Solicited Node Multicast address into a single Mcast Group In-Reply-To: <20080114114530.22b15f58.weiny2@llnl.gov> References: <20080114114530.22b15f58.weiny2@llnl.gov> Message-ID: <1200364303.8962.211.camel@hrosenstock-ws.xsigo.com> On Mon, 2008-01-14 at 11:45 -0800, Ira Weiny wrote: > >From a1d38895e7e34e9fec297b1dbdb0637ed858d6f0 Mon Sep 17 00:00:00 2001 > From: Ira K. Weiny > Date: Sun, 13 Jan 2008 16:03:31 -0800 > Subject: [PATCH] Add option to Special Case the IPv6 Solicited Node Multicast address into a single Mcast Group > > > Signed-off-by: Ira K. 
Weiny > --- > opensm/include/opensm/osm_subnet.h | 1 + > opensm/man/opensm.8 | 4 +++ > opensm/opensm/main.c | 4 +++ > opensm/opensm/osm_sa_mcmember_record.c | 35 +++++++++++++++++++++++++++++++- > opensm/opensm/osm_subnet.c | 9 ++++++++ > 5 files changed, 52 insertions(+), 1 deletions(-) > > diff --git a/opensm/include/opensm/osm_subnet.h b/opensm/include/opensm/osm_subnet.h > index 2a28045..558b34e 100644 > --- a/opensm/include/opensm/osm_subnet.h > +++ b/opensm/include/opensm/osm_subnet.h > @@ -283,6 +283,7 @@ typedef struct _osm_subn_opt { > char *event_plugin_name; > char *node_name_map_name; > char *prefix_routes_file; > + boolean_t consolodate_ipv6_snm_req; Nit: in all of the this, consolodate -> consolidate > } osm_subn_opt_t; > /* > * FIELDS > diff --git a/opensm/man/opensm.8 b/opensm/man/opensm.8 > index 475eeec..9c7b371 100644 > --- a/opensm/man/opensm.8 > +++ b/opensm/man/opensm.8 > @@ -239,6 +239,10 @@ Specify the sweep time for the performance manager in seconds > (default is 180 seconds). Only takes > effect if --enable-perfmgr was specified at configure time. > .TP > +.BI --consolodate_ipv6_snm_reqests > +Consolodate IPv6 Solicited Node Multicast group joins into 1 IB multicast > +group. > +.TP > \fB\-v\fR, \fB\-\-verbose\fR > This option increases the log verbosity level. 
> The -v option may be specified multiple times > diff --git a/opensm/opensm/main.c b/opensm/opensm/main.c > index 4d0d51d..a84f6c2 100644 > --- a/opensm/opensm/main.c > +++ b/opensm/opensm/main.c > @@ -615,6 +615,7 @@ int main(int argc, char *argv[]) > {"perfmgr_sweep_time_s", 1, NULL, 2}, > #endif > {"prefix_routes_file", 1, NULL, 3}, > + {"consolodate_ipv6_snm_reqests", 0, NULL, 4}, > {NULL, 0, NULL, 0} /* Required at the end of the array */ > }; > > @@ -916,6 +917,9 @@ int main(int argc, char *argv[]) > case 3: > opt.prefix_routes_file = optarg; > break; > + case 4: > + opt.consolodate_ipv6_snm_req = TRUE; > + break; > case 'h': > case '?': > case ':': > diff --git a/opensm/opensm/osm_sa_mcmember_record.c b/opensm/opensm/osm_sa_mcmember_record.c > index d37a655..bfa5d2d 100644 > --- a/opensm/opensm/osm_sa_mcmember_record.c > +++ b/opensm/opensm/osm_sa_mcmember_record.c > @@ -1167,9 +1167,42 @@ __search_mgrp_by_mgid(IN cl_map_item_t * const p_map_item, IN void *context) > > /* compare entire MGID so different scope will not sneak in for > the same MGID */ > - if (memcmp(&p_mgrp->mcmember_rec.mgid, p_recvd_mgid, sizeof(ib_gid_t))) > + if (memcmp(&p_mgrp->mcmember_rec.mgid, p_recvd_mgid, sizeof(ib_gid_t))) { > + > + if (sa->p_subn->opt.consolodate_ipv6_snm_req) { > + /* Special Case IPV6 Multicast Loopback addresses */ > + /* 0xff12601bXXXX0000 : 0x00000001ffYYYYYY */ > + /* Where XXXX is the partition and YYYYYY is the last 24 bits > + * of the port guid */ Masking off the partition is counter to IBA 1.2.1 vol 1 p. 151 10) which states: "When a multicast LID is overloaded, the multicast groups sharing the same MLID must have the same P_Key. This simplification is required to allow switches and routers that implement optional P_Key enforcement for multicast operations." 
-- Hal > +#define PREFIX_MASK (0xff12601b00000000) > +#define INT_ID_MASK (0x00000001ff000000) > + uint64_t g_prefix = cl_ntoh64(p_mgrp->mcmember_rec.mgid.unicast.prefix); > + uint64_t g_interface_id = cl_ntoh64(p_mgrp->mcmember_rec.mgid.unicast.interface_id); > + uint64_t rcv_prefix = cl_ntoh64(p_recvd_mgid->unicast.prefix); > + uint64_t rcv_interface_id = cl_ntoh64(p_recvd_mgid->unicast.interface_id); > + > + if (((rcv_prefix & PREFIX_MASK) == PREFIX_MASK) > + && > + (rcv_interface_id & INT_ID_MASK) == INT_ID_MASK) { > + > + if ((g_prefix == rcv_prefix) > + && > + (g_interface_id & INT_ID_MASK) == > + (rcv_interface_id & INT_ID_MASK) > + ) { > + osm_log(sa->p_log, OSM_LOG_INFO, > + "Special Case Mcast Join for MGID " > + " MGID 0x%016"PRIx64" : 0x%016"PRIx64"\n", > + rcv_prefix, rcv_interface_id); > + goto match; > + } > + } > + } > + > return; > + } > > +match: > if (p_ctxt->p_mgrp) { > osm_log(sa->p_log, OSM_LOG_ERROR, > "__search_mgrp_by_mgid: ERR 1F08: " > diff --git a/opensm/opensm/osm_subnet.c b/opensm/opensm/osm_subnet.c > index 0103940..558ea68 100644 > --- a/opensm/opensm/osm_subnet.c > +++ b/opensm/opensm/osm_subnet.c > @@ -481,6 +481,7 @@ void osm_subn_set_default_opt(IN osm_subn_opt_t * const p_opt) > p_opt->enable_quirks = FALSE; > p_opt->no_clients_rereg = FALSE; > p_opt->prefix_routes_file = OSM_DEFAULT_PREFIX_ROUTES_FILE; > + p_opt->consolodate_ipv6_snm_req = FALSE; > subn_set_default_qos_options(&p_opt->qos_options); > subn_set_default_qos_options(&p_opt->qos_ca_options); > subn_set_default_qos_options(&p_opt->qos_sw0_options); > @@ -1394,6 +1395,9 @@ ib_api_status_t osm_subn_parse_conf_file(IN osm_subn_opt_t * const p_opts) > > opts_unpack_charp("prefix_routes_file", > p_key, p_val, &p_opts->prefix_routes_file); > + > + opts_unpack_boolean("consolodate_ipv6_snm_req", > + p_key, p_val, &p_opts->consolodate_ipv6_snm_req); > } > fclose(opts_file); > > @@ -1721,6 +1725,11 @@ ib_api_status_t osm_subn_write_conf_file(IN osm_subn_opt_t * const p_opts) 
> "prefix_routes_file %s\n\n", > p_opts->prefix_routes_file); > > + fprintf(opts_file, > + "#\n# IPv6 MCast Options\n#\n" > + "consolodate_ipv6_snm_req %s\n\n", > + p_opts->consolodate_ipv6_snm_req ? "TRUE" : "FALSE"); > + > /* optional string attributes ... */ > > fclose(opts_file); > _______________________________________________ > general mailing list > general at lists.openfabrics.org > http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general > > To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general From eli at mellanox.co.il Mon Jan 14 23:16:09 2008 From: eli at mellanox.co.il (Eli Cohen) Date: Tue, 15 Jan 2008 09:16:09 +0200 Subject: [ofa-general] Re: [PATCH] libmthca: Ensure an Rx WQE is in memory before linking In-Reply-To: References: <1200318809.11174.191.camel@mtls03> Message-ID: <1200381369.11174.236.camel@mtls03> On Mon, 2008-01-14 at 13:29 -0800, Roland Dreier wrote: > Did you find this from code review or is it fixing a real problem on > some platform? I found this by code review but was not able to see that it fixes a specific problem. From bart.vanassche at gmail.com Mon Jan 14 23:39:22 2008 From: bart.vanassche at gmail.com (Bart Van Assche) Date: Tue, 15 Jan 2008 08:39:22 +0100 Subject: [ofa-general] Discovering new SRP targets and SRP userspace tools Message-ID: Hello, I have succeeded setting up an SRP initiator and SRP target by installing Linux + SCST + ib_srpt. Performance is great: dd reports about 500 MB/s for both reading and writing data that is buffered on the target in RAM (InfiniBand network with throughput of 8 Gbit/s). It took me some time however to find out the location of the source code of the userspace SRP tools. Can these tools either be included in OFED or can a link be added to the SRPT installation wiki page ? 
References: http://lists.openfabrics.org/pipermail/iwg/2007-March/000378.html https://wiki.openfabrics.org/tiki-index.php?page=SRPT+Installation https://svn.openfabrics.org/svn/openib/gen2/branches/1.1/src/userspace/srptools/ https://wiki.openfabrics.org/tiki-index.php?page=SRPT+Installation Regards, Bart Van Assche.
From bart.vanassche at gmail.com Tue Jan 15 00:28:53 2008 From: bart.vanassche at gmail.com (Bart Van Assche) Date: Tue, 15 Jan 2008 09:28:53 +0100 Subject: [ofa-general] [PATCH] Make srptools 1.1 compile with recent automake Message-ID: Recent enough versions of automake only generate a Makefile.in file if the AUTHORS file is present. Can such a file please be added to srptools (https://svn.openfabrics.org/svn/openib/gen2/branches/1.1/src/userspace/srptools/) ? Index: AUTHORS =================================================================== --- AUTHORS (revision 0) +++ AUTHORS (revision 0) @@ -0,0 +1,6 @@ +Authors that have worked on srptools, in alphabetical order: + +Ishai Rabinovitz +Roland Dreier +Vladimir Sokolovsky +Yael Shenhav Regards, Bart Van Assche.
From bart.vanassche at gmail.com Tue Jan 15 00:36:52 2008 From: bart.vanassche at gmail.com (Bart Van Assche) Date: Tue, 15 Jan 2008 09:36:52 +0100 Subject: [ofa-general] [PATCH] Suppress autoconf / automake warnings printed during compilation of srptools 1.1 Message-ID: When compiling srptools 1.1, autoconf and automake print several warnings (https://svn.openfabrics.org/svn/openib/gen2/branches/1.1/src/userspace/srptools/). The patch below fixes this.
How I ran autoconf and automake: aclocal && autoconf && automake && ./configure Patch: diff -ur ../orig/srptools/configure.in ./configure.in --- ../orig/srptools/configure.in 2008-01-15 09:32:21.000000000 +0100 +++ ./configure.in 2008-01-15 09:19:23.000000000 +0100 @@ -5,7 +5,6 @@ AC_INIT(srptools, 0.0.4, openib-general at openib.org) AC_CONFIG_SRCDIR(src/srp-dm.c) AC_CONFIG_AUX_DIR(config) -AM_CONFIG_HEADER(config.h) AM_INIT_AUTOMAKE(srptools, 0.0.4) AC_ARG_ENABLE(libcheck, [ --disable-libcheck do not test for presence of ib libraries], @@ -16,6 +15,7 @@ # Checks for programs. AC_PROG_CC +AM_PROG_CC_C_O # Checks for libraries. if test "$disable_libcheck" != "yes" Only in .: configure.in~ diff -ur ../orig/srptools/Makefile.am ./Makefile.am --- ../orig/srptools/Makefile.am 2008-01-15 09:32:21.000000000 +0100 +++ ./Makefile.am 2008-01-15 09:35:12.000000000 +0100 @@ -1,5 +1,7 @@ # $Id: Makefile.am 9269 2006-09-05 15:48:06Z vlad $ +AUTOMAKE_OPTIONS = foreign + sbin_PROGRAMS = src/ibsrpdm srp_daemon/srp_daemon man_MANS = man/ibsrpdm.1 man/srp_daemon.1 -- Regards, Bart Van Assche.
From jackm at dev.mellanox.co.il Tue Jan 15 01:46:29 2008 From: jackm at dev.mellanox.co.il (Jack Morgenstein) Date: Tue, 15 Jan 2008 11:46:29 +0200 Subject: [ofa-general] Re: [PATCH] ib/core: Add creation flags to create QP In-Reply-To: References: <1199980899.11174.91.camel@mtls03> Message-ID: <200801151146.30095.jackm@dev.mellanox.co.il> On Tuesday 15 January 2008 00:18, Roland Dreier wrote: > Not sure if this approach is a good one...
would it make sense to > create a new QP type like IB_QPT_UD_LSO to handle LSO instead?  Are > there other flags we're going to want to add too? > I'm already using this flags field for XRC receive QPs (These are XRC qps for receiving only, which are created in kernel space via a userspace call). (I've not yet posted this to the list, since I'm still writing the code). The usespace app has NO access to the qp -- its only point-of-reference is the qp number, which may be sent to a remote process to use as a target XRC QP (for establishing the RC connection so that the remote can send packets to local XRC SRQs). For these QPs, I will need to distribute an XRC_QP_LAST_WQE_REACHED event to all processes which receive packets on XRC SRQs via this QP (i.e. have registered with this QP). In order to do this, I need to know that the QP is an "xrc rcv" qp. (see my post: http://lists.openfabrics.org/pipermail/general/2007-December/044477.html) - Jack From vlad at dev.mellanox.co.il Tue Jan 15 02:12:21 2008 From: vlad at dev.mellanox.co.il (Vladimir Sokolovsky) Date: Tue, 15 Jan 2008 12:12:21 +0200 Subject: [ofa-general] Discovering new SRP targets and SRP userspace tools In-Reply-To: References: Message-ID: <478C8705.4050601@dev.mellanox.co.il> Bart Van Assche wrote: > Hello, > > I have succeeded setting up an SRP initiator and SRP target by > installing Linux + SCST + ib_srpt. Performance is great: dd reports > about 500 MB/s for both reading and writing data that is buffered on > the target in RAM (InfiniBand network with throughput of 8 Gbit/s). It > took me some time however to find out the location of the source code > of the userspace SRP tools. Can these tools either be included in OFED > or can a link be added to the SRPT installation wiki page ? 
>
> References:
> http://lists.openfabrics.org/pipermail/iwg/2007-March/000378.html
> https://wiki.openfabrics.org/tiki-index.php?page=SRPT+Installation
> https://svn.openfabrics.org/svn/openib/gen2/branches/1.1/src/userspace/srptools/
>
> https://wiki.openfabrics.org/tiki-index.php?page=SRPT+Installation
>
> Regards,
>
> Bart Van Assche.

srptools RPM is included in OFED-1.3.

Regards,
Vladimir

From vlad at lists.openfabrics.org Tue Jan 15 03:13:02 2008
From: vlad at lists.openfabrics.org (Vladimir Sokolovsky Mellanox)
Date: Tue, 15 Jan 2008 03:13:02 -0800 (PST)
Subject: [ofa-general] ofa_1_3_kernel 20080115-0200 daily build status
Message-ID: <20080115111302.5A1BFE6004D@openfabrics.org>

This email was generated automatically, please do not reply

git_url: git://git.openfabrics.org/ofed_1_3/linux-2.6.git
git_branch: ofed_kernel

Common build parameters: --with-ipoib-mod --with-sdp-mod --with-srp-mod --with-user_mad-mod --with-user_access-mod --with-mthca-mod --with-mlx4-mod --with-core-mod --with-addr_trans-mod --with-rds-mod --with-cxgb3-mod --with-nes-mod

Passed:
Passed on i686 with 2.6.15-23-server
Passed on i686 with linux-2.6.21.1
Passed on i686 with linux-2.6.22
Passed on i686 with linux-2.6.18
Passed on i686 with linux-2.6.17
Passed on i686 with linux-2.6.19
Passed on i686 with linux-2.6.16
Passed on i686 with linux-2.6.13
Passed on i686 with linux-2.6.14
Passed on i686 with linux-2.6.15
Passed on i686 with linux-2.6.12
Passed on x86_64 with linux-2.6.22
Passed on x86_64 with linux-2.6.22.5-31-default
Passed on x86_64 with linux-2.6.21.1
Passed on ia64 with linux-2.6.22
Passed on x86_64 with linux-2.6.16
Passed on ia64 with linux-2.6.21.1
Passed on x86_64 with linux-2.6.18
Passed on x86_64 with linux-2.6.18-8.el5
Passed on ia64 with linux-2.6.18
Passed on ppc64 with linux-2.6.16
Passed on x86_64 with linux-2.6.14
Passed on x86_64 with linux-2.6.18-53.el5
Passed on powerpc with linux-2.6.15
Passed on powerpc with linux-2.6.14
Passed on ia64 with linux-2.6.17
Passed on ia64 with linux-2.6.23
Passed on ia64 with linux-2.6.14
Passed on ia64 with linux-2.6.12
Passed on ia64 with linux-2.6.19
Passed on ia64 with linux-2.6.15
Passed on x86_64 with linux-2.6.19
Passed on ppc64 with linux-2.6.15
Passed on ppc64 with linux-2.6.13
Passed on ppc64 with linux-2.6.12
Passed on ppc64 with linux-2.6.17
Passed on powerpc with linux-2.6.13
Passed on powerpc with linux-2.6.12
Passed on ppc64 with linux-2.6.14
Passed on ia64 with linux-2.6.13
Passed on x86_64 with linux-2.6.12
Passed on x86_64 with linux-2.6.17
Passed on ppc64 with linux-2.6.18
Passed on ppc64 with linux-2.6.19
Passed on ia64 with linux-2.6.16
Passed on x86_64 with linux-2.6.15
Passed on x86_64 with linux-2.6.20
Passed on x86_64 with linux-2.6.16.43-0.3-smp
Passed on x86_64 with linux-2.6.16.21-0.8-smp
Passed on x86_64 with linux-2.6.13
Passed on x86_64 with linux-2.6.9-42.ELsmp
Passed on ia64 with linux-2.6.16.21-0.8-default
Passed on x86_64 with linux-2.6.9-55.ELsmp
Passed on ppc64 with linux-2.6.18-8.el5
Passed on x86_64 with linux-2.6.18-1.2798.fc6

Failed:

From dotanb at dev.mellanox.co.il Tue Jan 15 04:22:36 2008
From: dotanb at dev.mellanox.co.il (Dotan Barak)
Date: Tue, 15 Jan 2008 14:22:36 +0200
Subject: [ofa-general] [PATCH] libmlx4: prevent seg fault when sending bigmessages as inline
In-Reply-To: <000301c856d2$ec87e8f0$c0d0180a@amr.corp.intel.com>
References: <200801131743.24014.dotanb@dev.mellanox.co.il> <000301c856d2$ec87e8f0$c0d0180a@amr.corp.intel.com>
Message-ID: <478CA58C.90209@dev.mellanox.co.il>

Sean Hefty wrote:
>> Fix the type of the variable that hold the number of bytes sent
>> as inline message so far.
>> Without the patch, If the user will try to use very big messages (total > 2^31)
>> there will be a seg fault.
>>
>
> 2^31 is the max message size supported by IB.
>
This is true, but if one would try to send this message not as inline he
will get completion with error not a seg fault ...

Dotan From kliteyn at dev.mellanox.co.il Tue Jan 15 05:04:40 2008 From: kliteyn at dev.mellanox.co.il (Yevgeny Kliteynik) Date: Tue, 15 Jan 2008 15:04:40 +0200 Subject: [ofa-general] [PATCH] opensm/osm_ucast_ftree.c: fixing coredump in fat-tree routing Message-ID: <478CAF68.8020609@dev.mellanox.co.il> Hi Sasha, Fat-tree routing wasn't clearing the data structure at the beginning of the new routing recalculation, which was causing wrong routing and sometimes coredump. Please apply this patch to master and ofed_1_3. If it's possible, please apply to ofed_1_3 ASAP - we're doing RC2 today, and it's important to have this fix included. -- Yevgeny Signed-off-by: Yevgeny Kliteynik --- opensm/opensm/osm_ucast_ftree.c | 11 ++++++++++- 1 files changed, 10 insertions(+), 1 deletions(-) diff --git a/opensm/opensm/osm_ucast_ftree.c b/opensm/opensm/osm_ucast_ftree.c index 948129c..51685c5 100644 --- a/opensm/opensm/osm_ucast_ftree.c +++ b/opensm/opensm/osm_ucast_ftree.c @@ -1054,13 +1054,17 @@ static void __osm_ftree_fabric_clear(ftree_fabric_t * p_ftree) while ((p_guid = (uint64_t *) cl_list_remove_head(&p_ftree->root_guid_list))) free(p_guid); - cl_list_destroy(&p_ftree->root_guid_list); /* free the leaf switches array */ if ((p_ftree->leaf_switches_num > 0) && (p_ftree->leaf_switches)) free(p_ftree->leaf_switches); p_ftree->leaf_switches_num = 0; + p_ftree->cn_num = 0; + p_ftree->leaf_switch_rank = 0; + p_ftree->max_switch_rank = 0; + p_ftree->max_cn_per_leaf = 0; + p_ftree->lft_max_lid_ho = 0; p_ftree->leaf_switches = NULL; p_ftree->fabric_built = FALSE; @@ -1073,6 +1077,7 @@ static void __osm_ftree_fabric_destroy(ftree_fabric_t * p_ftree) if (!p_ftree) return; __osm_ftree_fabric_clear(p_ftree); + cl_list_destroy(&p_ftree->root_guid_list); cl_pool_destroy(&p_ftree->sw_fwd_tbl_pool); free(p_ftree); } @@ -1245,6 +1250,8 @@ static void __osm_ftree_fabric_dump_general_info(IN ftree_fabric_t * p_ftree) cl_qmap_count(&p_ftree->hca_tbl), p_ftree->cn_num, 
cl_qmap_count(&p_ftree->sw_tbl)); + CL_ASSERT(cl_qmap_count(&p_ftree->hca_tbl) >= p_ftree->cn_num); + for (i = 0; i <= p_ftree->max_switch_rank; i++) { j = 0; for (p_sw = (ftree_sw_t *) cl_qmap_head(&p_ftree->sw_tbl); @@ -3599,6 +3606,8 @@ static int __osm_ftree_construct_fabric(IN void *context) OSM_LOG_ENTER(&p_ftree->p_osm->log, __osm_ftree_construct_fabric); + __osm_ftree_fabric_clear(p_ftree); + if (p_ftree->p_osm->subn.opt.lmc > 0) { osm_log(&p_ftree->p_osm->log, OSM_LOG_SYS, "LMC > 0 is not supported by fat-tree routing.\n" -- 1.5.1.4 From kliteyn at dev.mellanox.co.il Tue Jan 15 05:10:26 2008 From: kliteyn at dev.mellanox.co.il (Yevgeny Kliteynik) Date: Tue, 15 Jan 2008 15:10:26 +0200 Subject: [ofa-general] [PATCH] opensm/osm_ucast_ftree.c: cosmetics in log messages Message-ID: <478CB0C2.2040109@dev.mellanox.co.il> Hi Sasha, Cosmetics in log messages of fat-tree routing - printing lid in 4 digits. Please apply to master and ofed_1_3 -- Yevgeny Signed-off-by: Yevgeny Kliteynik --- opensm/opensm/osm_ucast_ftree.c | 50 +++++++++++++++++++------------------- 1 files changed, 25 insertions(+), 25 deletions(-) diff --git a/opensm/opensm/osm_ucast_ftree.c b/opensm/opensm/osm_ucast_ftree.c index 51685c5..dcbdc44 100644 --- a/opensm/opensm/osm_ucast_ftree.c +++ b/opensm/opensm/osm_ucast_ftree.c @@ -550,7 +550,7 @@ __osm_ftree_port_group_dump(IN ftree_fabric_t * p_ftree, "__osm_ftree_port_group_dump:" " Port Group of size %u, port(s): %s, direction: %s\n" " Local <--> Remote GUID (LID):" - "0x%016" PRIx64 " (0x%x) <--> 0x%016" PRIx64 " (0x%x)\n", + "0x%016" PRIx64 " (0x%04x) <--> 0x%016" PRIx64 " (0x%04x)\n", size, buff, (direction == FTREE_DIRECTION_DOWN) ? 
"DOWN" : "UP", @@ -1288,7 +1288,7 @@ static void __osm_ftree_fabric_dump_general_info(IN ftree_fabric_t * p_ftree) osm_log(&p_ftree->p_osm->log, OSM_LOG_VERBOSE, "__osm_ftree_fabric_dump_general_info: " " GUID: 0x%016" PRIx64 - ", LID: 0x%x, Index %s\n", + ", LID: 0x%04x, Index %s\n", __osm_ftree_sw_get_guid_ho(p_sw), cl_ntoh16(p_sw->base_lid), __osm_ftree_tuple_to_str(p_sw->tuple)); @@ -1301,7 +1301,7 @@ static void __osm_ftree_fabric_dump_general_info(IN ftree_fabric_t * p_ftree) osm_log(&p_ftree->p_osm->log, OSM_LOG_VERBOSE, "__osm_ftree_fabric_dump_general_info: " " GUID: 0x%016" PRIx64 - ", LID: 0x%x, Index %s\n", + ", LID: 0x%04x, Index %s\n", __osm_ftree_sw_get_guid_ho(p_ftree-> leaf_switches[i]), cl_ntoh16(p_ftree->leaf_switches[i]->base_lid), @@ -1362,7 +1362,7 @@ static void __osm_ftree_fabric_dump_hca_ordering(IN ftree_fabric_t * p_ftree) if (!p_group_on_hca->is_cn) continue; - fprintf(p_hca_ordering_file, "0x%x\t%s\n", + fprintf(p_hca_ordering_file, "0x%04x\t%s\n", cl_ntoh16(p_group_on_hca->base_lid), p_hca->p_osm_node->print_desc); @@ -1566,7 +1566,7 @@ static void __osm_ftree_fabric_make_indexing(IN ftree_fabric_t * p_ftree) "__osm_ftree_fabric_make_indexing: Indexing starting point:\n" " - Switch rank : %u\n" " - Switch index : %s\n" - " - Node LID : 0x%x\n" + " - Node LID : 0x%04x\n" " - Node GUID : 0x%016" PRIx64 "\n", p_sw->rank, __osm_ftree_tuple_to_str(p_sw->tuple), cl_ntoh16(p_sw->base_lid), __osm_ftree_sw_get_guid_ho(p_sw)); @@ -1859,9 +1859,9 @@ static boolean_t __osm_ftree_fabric_validate_topology(IN ftree_fabric_t * "__osm_ftree_fabric_validate_topology: " "ERR AB09: Different number of upward port groups on switches:\n" " GUID 0x%016" PRIx64 - ", LID 0x%x, Index %s - %u groups\n" + ", LID 0x%04x, Index %s - %u groups\n" " GUID 0x%016" PRIx64 - ", LID 0x%x, Index %s - %u groups\n", + ", LID 0x%04x, Index %s - %u groups\n", __osm_ftree_sw_get_guid_ho (reference_sw_arr[p_sw->rank]), cl_ntoh16(reference_sw_arr[p_sw->rank]-> @@ -1887,9 +1887,9 
@@ static boolean_t __osm_ftree_fabric_validate_topology(IN ftree_fabric_t * "__osm_ftree_fabric_validate_topology: " "ERR AB0A: Different number of downward port groups on switches:\n" " GUID 0x%016" PRIx64 - ", LID 0x%x, Index %s - %u port groups\n" + ", LID 0x%04x, Index %s - %u port groups\n" " GUID 0x%016" PRIx64 - ", LID 0x%x, Index %s - %u port groups\n", + ", LID 0x%04x, Index %s - %u port groups\n", __osm_ftree_sw_get_guid_ho (reference_sw_arr[p_sw->rank]), cl_ntoh16(reference_sw_arr[p_sw->rank]-> @@ -1923,10 +1923,10 @@ static boolean_t __osm_ftree_fabric_validate_topology(IN ftree_fabric_t * "ERR AB0B: Different number of ports in an upward port group on switches:\n" " GUID 0x%016" PRIx64 - ", LID 0x%x, Index %s - %u ports\n" + ", LID 0x%04x, Index %s - %u ports\n" " GUID 0x%016" PRIx64 - ", LID 0x%x, Index %s - %u ports\n", + ", LID 0x%04x, Index %s - %u ports\n", __osm_ftree_sw_get_guid_ho (reference_sw_arr [p_sw->rank]), @@ -1971,10 +1971,10 @@ static boolean_t __osm_ftree_fabric_validate_topology(IN ftree_fabric_t * "ERR AB0C: Different number of ports in an downward port group on switches:\n" " GUID 0x%016" PRIx64 - ", LID 0x%x, Index %s - %u ports\n" + ", LID 0x%04x, Index %s - %u ports\n" " GUID 0x%016" PRIx64 - ", LID 0x%x, Index %s - %u ports\n", + ", LID 0x%04x, Index %s - %u ports\n", __osm_ftree_sw_get_guid_ho (reference_sw_arr [p_sw->rank]), @@ -2137,7 +2137,7 @@ __osm_ftree_fabric_route_upgoing_by_going_down(IN ftree_fabric_t * p_ftree, osm_log(&p_ftree->p_osm->log, OSM_LOG_DEBUG, "__osm_ftree_fabric_route_upgoing_by_going_down: " "Loop of lenght %d in the fabric:\n " - "Switch %s (LID 0x%x) closes loop through switch %s (LID 0x%x)\n", + "Switch %s (LID 0x%04x) closes loop through switch %s (LID 0x%04x)\n", (p_remote_sw->rank - highest_rank_in_route) * 2, __osm_ftree_tuple_to_str(p_remote_sw->tuple), cl_ntoh16(p_group->base_lid), @@ -2190,7 +2190,7 @@ __osm_ftree_fabric_route_upgoing_by_going_down(IN ftree_fabric_t * p_ftree, 
remote_port_num); osm_log(&p_ftree->p_osm->log, OSM_LOG_DEBUG, "__osm_ftree_fabric_route_upgoing_by_going_down: " - "Switch %s: set path to CA LID 0x%x through port %u\n", + "Switch %s: set path to CA LID 0x%04x through port %u\n", __osm_ftree_tuple_to_str(p_remote_sw->tuple), cl_ntoh16(target_lid), p_min_port->remote_port_num); @@ -2355,7 +2355,7 @@ __osm_ftree_fabric_route_downgoing_by_going_up(IN ftree_fabric_t * p_ftree, if (p_sw->is_leaf) { osm_log(&p_ftree->p_osm->log, OSM_LOG_DEBUG, "__osm_ftree_fabric_route_downgoing_by_going_up: " - " - Routing MAIN path for %s CA LID 0x%x: %s --> %s\n", + " - Routing MAIN path for %s CA LID 0x%04x: %s --> %s\n", (is_real_lid) ? "real" : "DUMMY", cl_ntoh16(target_lid), __osm_ftree_tuple_to_str(p_sw->tuple), @@ -2375,7 +2375,7 @@ __osm_ftree_fabric_route_downgoing_by_going_up(IN ftree_fabric_t * p_ftree, remote_port_num); osm_log(&p_ftree->p_osm->log, OSM_LOG_DEBUG, "__osm_ftree_fabric_route_downgoing_by_going_up: " - "Switch %s: set path to CA LID 0x%x through port %u\n", + "Switch %s: set path to CA LID 0x%04x through port %u\n", __osm_ftree_tuple_to_str(p_remote_sw->tuple), cl_ntoh16(target_lid), p_min_port->remote_port_num); @@ -2456,7 +2456,7 @@ __osm_ftree_fabric_route_downgoing_by_going_up(IN ftree_fabric_t * p_ftree, if (p_sw->is_leaf) { osm_log(&p_ftree->p_osm->log, OSM_LOG_DEBUG, "__osm_ftree_fabric_route_downgoing_by_going_up: " - " - Routing SECONDARY path for LID 0x%x: %s --> %s\n", + " - Routing SECONDARY path for LID 0x%04x: %s --> %s\n", cl_ntoh16(target_lid), __osm_ftree_tuple_to_str(p_sw->tuple), __osm_ftree_tuple_to_str(p_remote_sw->tuple)); @@ -2562,7 +2562,7 @@ static void __osm_ftree_fabric_route_to_cns(IN ftree_fabric_t * p_ftree) p_port->port_num); osm_log(&p_ftree->p_osm->log, OSM_LOG_DEBUG, "__osm_ftree_fabric_route_to_cns: " - "Switch %s: set path to CN LID 0x%x through port %u\n", + "Switch %s: set path to CN LID 0x%04x through port %u\n", __osm_ftree_tuple_to_str(p_sw->tuple), 
cl_ntoh16(hca_lid), p_port->port_num); @@ -2673,7 +2673,7 @@ static void __osm_ftree_fabric_route_to_non_cns(IN ftree_fabric_t * p_ftree) osm_log(&p_ftree->p_osm->log, OSM_LOG_DEBUG, "__osm_ftree_fabric_route_to_non_cns: " - "Switch %s: set path to non-CN HCA LID 0x%x through port %u\n", + "Switch %s: set path to non-CN HCA LID 0x%04x through port %u\n", __osm_ftree_tuple_to_str(p_sw->tuple), cl_ntoh16(hca_lid), port_num_on_switch); @@ -2732,7 +2732,7 @@ static void __osm_ftree_fabric_route_to_switches(IN ftree_fabric_t * p_ftree) osm_log(&p_ftree->p_osm->log, OSM_LOG_DEBUG, "__osm_ftree_fabric_route_to_switches: " - "Switch %s (LID 0x%x): routing switch-to-switch pathes\n", + "Switch %s (LID 0x%04x): routing switch-to-switch pathes\n", __osm_ftree_tuple_to_str(p_sw->tuple), cl_ntoh16(p_sw->base_lid)); @@ -2947,7 +2947,7 @@ __osm_ftree_rank_leaf_switches(IN ftree_fabric_t * p_ftree, PRIx64 "\n" " - Switch guid: 0x%016" PRIx64 "\n" - " - Switch LID : 0x%x\n", + " - Switch LID : 0x%04x\n", __osm_ftree_hca_get_guid_ho(p_hca), __osm_ftree_sw_get_guid_ho(p_sw), cl_ntoh16(p_sw->base_lid)); @@ -3165,9 +3165,9 @@ static int __osm_ftree_fabric_construct_sw_ports(IN ftree_fabric_t * p_ftree, "__osm_ftree_fabric_construct_sw_ports: ERR AB16: " "Illegal link between switches with ranks %u and %u:\n" " GUID 0x%016" PRIx64 - ", LID 0x%x, rank %u\n" + ", LID 0x%04x, rank %u\n" " GUID 0x%016" PRIx64 - ", LID 0x%x, rank %u\n", p_sw->rank, + ", LID 0x%04x, rank %u\n", p_sw->rank, p_remote_sw->rank, __osm_ftree_sw_get_guid_ho(p_sw), cl_ntoh16(p_sw->base_lid), p_sw->rank, @@ -3778,7 +3778,7 @@ static int __osm_ftree_construct_fabric(IN void *context) osm_log(&p_ftree->p_osm->log, OSM_LOG_VERBOSE, "__osm_ftree_construct_fabric: " - "Max LID in switch LFTs (in host order): 0x%x\n", + "Max LID in switch LFTs (in host order): 0x%04x\n", p_ftree->lft_max_lid_ho); Exit: -- 1.5.1.4 From eli at mellanox.co.il Tue Jan 15 05:13:18 2008 From: eli at mellanox.co.il (Eli Cohen) Date: Tue, 15 Jan 
2008 15:13:18 +0200 Subject: [ofa-general] Re: [PATCH] ib/core: Add creation flags to create QP In-Reply-To: References: <1199980899.11174.91.camel@mtls03> Message-ID: <1200402798.11174.279.camel@mtls03> On Mon, 2008-01-14 at 14:18 -0800, Roland Dreier wrote: > > +enum qp_create_flags { > > + QP_CREATE_LSO = 1 << 0, > > +}; > > + > > struct ib_qp_init_attr { > > void (*event_handler)(struct ib_event *, void *); > > void *qp_context; > > @@ -496,6 +500,7 @@ struct ib_qp_init_attr { > > enum ib_sig_type sq_sig_type; > > enum ib_qp_type qp_type; > > u8 port_num; /* special QP types only */ > > + enum qp_create_flags create_flags; > > }; > > Not sure if this approach is a good one... would it make sense to > create a new QP type like IB_QPT_UD_LSO to handle LSO instead? Are > there other flags we're going to want to add too? As Jack replied already, we do need this also for his XRC code. Not shown in the patch is that the flags representation at the hw level is different from the verbs (QP_CREATE_LSO at the verbs layer is MLX4_QP_LSO ). > > Also this patch doesn't make much sense without the rest of the LSO > stuff really. Finally, I think you need to audit all the places where > struct ib_qp_init_attr is used to make sure the flags are set > correctly; for example the uverbs_cmd.c create QP function seems like > it would end up passing a random stack value into create_flags. I missed that one. There is one more place that might be a problem and that is rdma_create_qp which is an exported function which accepts struct ib_qp_init_attr * as an argument. This means that we need to either clear the create_flags field or require to the caller to put a valid value. What do you think? From eli at dev.mellanox.co.il Tue Jan 15 05:16:52 2008 From: eli at dev.mellanox.co.il (Eli Cohen) Date: Tue, 15 Jan 2008 15:16:52 +0200 Subject: [ofa-general] IPoIB, OFED 1.2.5, and multicast groups. 
In-Reply-To: <1200333056.8962.61.camel@hrosenstock-ws.xsigo.com> References: <20080111193657.58477fb0.weiny2@llnl.gov> <1200211526.11174.128.camel@mtls03> <1200333056.8962.61.camel@hrosenstock-ws.xsigo.com> Message-ID: <1200403012.11174.280.camel@mtls03> On Mon, 2008-01-14 at 09:50 -0800, Hal Rosenstock wrote: > On Sun, 2008-01-13 at 10:05 +0200, Eli Cohen wrote: > > IPOIB does not initiate a join to a mulitcast group (except for the > > broadcast group). > > IPv6 does indeed do this on an IPoIB interface for solicited node > multicast. > Yes I saw that after I sent that email. Thanks. From jackm at mellanox.co.il Tue Jan 15 05:28:34 2008 From: jackm at mellanox.co.il (Jack Morgenstein) Date: Tue, 15 Jan 2008 15:28:34 +0200 Subject: [ofa-general] RE: [PATCH] ib/core: Add creation flags to create QP In-Reply-To: <1200402798.11174.279.camel@mtls03> References: <1199980899.11174.91.camel@mtls03> <1200402798.11174.279.camel@mtls03> Message-ID: <6C2C79E72C305246B504CBA17B5500C9031E5D23@mtlexch01.mtl.com> At this time, ALL kernel callers of rdma_create_qp either: - memset the qp_init_attr structure to zero - define it with an initializer (which places zero in all unspecified fields). Therefore, currently there is no problem. We might put in a note that callers should always cause unused fields of the ib_qp_init_attr structure to be zero. 
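
The two zero-initialization patterns mentioned above can be sketched as follows. This is only an illustration under stated assumptions: `sketch_qp_init_attr` is a hypothetical, cut-down stand-in for `struct ib_qp_init_attr`, and the `sketch_*` names are not part of any kernel API. It shows that both a `memset` and a designated initializer leave an unmentioned `create_flags` member at zero.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for the flags enum proposed in the patch. */
enum sketch_qp_create_flags {
	SKETCH_QP_CREATE_LSO = 1 << 0,
};

/* Hypothetical, cut-down mirror of struct ib_qp_init_attr. */
struct sketch_qp_init_attr {
	void *qp_context;
	int qp_type;
	unsigned char port_num;
	enum sketch_qp_create_flags create_flags;
};

/* Pattern 1: clear the whole structure, then set only the needed fields. */
static struct sketch_qp_init_attr sketch_attr_memset(int qp_type)
{
	struct sketch_qp_init_attr attr;

	memset(&attr, 0, sizeof(attr));
	attr.qp_type = qp_type;
	return attr;
}

/* Pattern 2: designated initializer; members not named are zero-initialized. */
static struct sketch_qp_init_attr sketch_attr_initializer(int qp_type)
{
	struct sketch_qp_init_attr attr = { .qp_type = qp_type };

	return attr;
}

/* Both patterns leave create_flags at zero unless the caller sets it. */
static void sketch_check(void)
{
	assert(sketch_attr_memset(2).create_flags == 0);
	assert(sketch_attr_initializer(2).create_flags == 0);
}
```

A caller that declares the structure on the stack and uses neither pattern would hand an indeterminate `create_flags` to the provider, which is the case raised for the uverbs path.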
- Jack > -----Original Message----- > From: Eli Cohen > Sent: Tuesday, January 15, 2008 3:13 PM > To: Roland Dreier; Jack Morgenstein > Cc: openfabrics > Subject: Re: [PATCH] ib/core: Add creation flags to create QP > > > On Mon, 2008-01-14 at 14:18 -0800, Roland Dreier wrote: > > > +enum qp_create_flags { > > > + QP_CREATE_LSO = 1 << 0, > > > +}; > > > + > > > struct ib_qp_init_attr { > > > void (*event_handler)(struct > ib_event *, void *); > > > void *qp_context; > > > @@ -496,6 +500,7 @@ struct ib_qp_init_attr { > > > enum ib_sig_type sq_sig_type; > > > enum ib_qp_type qp_type; > > > u8 port_num; /* special QP > types only */ > > > + enum qp_create_flags create_flags; > > > }; > > > > Not sure if this approach is a good one... would it make sense to > > create a new QP type like IB_QPT_UD_LSO to handle LSO instead? Are > > there other flags we're going to want to add too? > As Jack replied already, we do need this also for his XRC code. Not > shown in the patch is that the flags representation at the hw level is > different from the verbs (QP_CREATE_LSO at the verbs layer is > MLX4_QP_LSO ). > > > > Also this patch doesn't make much sense without the rest of the LSO > > stuff really. Finally, I think you need to audit all the > places where > > struct ib_qp_init_attr is used to make sure the flags are set > > correctly; for example the uverbs_cmd.c create QP function > seems like > > it would end up passing a random stack value into create_flags. > > I missed that one. There is one more place that might be a problem and > that is rdma_create_qp which is an exported function which accepts > struct ib_qp_init_attr * as an argument. This means that we need to > either clear the create_flags field or require to the caller to put a > valid value. What do you think? 
> > > From tziporet at dev.mellanox.co.il Tue Jan 15 05:51:59 2008 From: tziporet at dev.mellanox.co.il (Tziporet Koren) Date: Tue, 15 Jan 2008 15:51:59 +0200 Subject: [ofa-general] Discovering new SRP targets and SRP userspace tools In-Reply-To: <478C8705.4050601@dev.mellanox.co.il> References: <478C8705.4050601@dev.mellanox.co.il> Message-ID: <478CBA7F.8050308@mellanox.co.il> Vladimir Sokolovsky wrote: > Bart Van Assche wrote: >> Hello, >> >> I have succeeded setting up an SRP initiator and SRP target by >> installing Linux + SCST + ib_srpt. Performance is great: dd reports >> about 500 MB/s for both reading and writing data that is buffered on >> the target in RAM (InfiniBand network with throughput of 8 Gbit/s). >> It took me some time however to find out the location of the source >> code of the userspace SRP tools. Can these tools either be included >> in OFED or can a link be added to the SRPT installation wiki page ? >> >> References: >> http://lists.openfabrics.org/pipermail/iwg/2007-March/000378.html >> https://wiki.openfabrics.org/tiki-index.php?page=SRPT+Installation >> https://svn.openfabrics.org/svn/openib/gen2/branches/1.1/src/userspace/srptools/ >> >> >> https://wiki.openfabrics.org/tiki-index.php?page=SRPT+Installation >> >> Regards, >> >> Bart Van Assche. > > srptools RPM is included in OFED-1.3. 
> > Also in OFED 1.2 and 1.2.5 Tziporet > > From sashak at voltaire.com Tue Jan 15 06:10:13 2008 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Tue, 15 Jan 2008 14:10:13 +0000 Subject: [ofa-general] [PATCH] libibumad: umad_get_pkey() function In-Reply-To: References: <200801061801.25386.dotanb@dev.mellanox.co.il> <4781F18F.1070506@voltaire.com> <4781FB41.6040204@dev.mellanox.co.il> <4782160B.1080709@dev.mellanox.co.il> <478217A6.80307@dev.mellanox.co.il> <47821FF3.7020705@dev.mellanox.co.il> <1199718187.20870.102.camel@hrosenstock-ws.xsigo.com> <20080113193435.GG10650@sashak.voltaire.com> <20080113193559.GH10650@sashak.voltaire.com> Message-ID: <20080115141013.GM16009@sashak.voltaire.com> On 21:06 Mon 14 Jan , Hal Rosenstock wrote: > On 1/13/08, Sasha Khapyorsky wrote: > > > > This returns value of pkey_index in network byte order from user_mad > > header. If we are running with kernel where pkey_index is not supported > > yet it will return 0. > > > > Signed-off-by: Sasha Khapyorsky > > --- > > libibumad/include/infiniband/umad.h | 1 + > > libibumad/src/libibumad.map | 1 + > > libibumad/src/umad.c | 13 ++++++++++++- > > 3 files changed, 14 insertions(+), 1 deletions(-) > > > > diff --git a/libibumad/include/infiniband/umad.h b/libibumad/include/infiniband/umad.h > > index 681b440..742c7b0 100644 > > --- a/libibumad/include/infiniband/umad.h > > +++ b/libibumad/include/infiniband/umad.h > > @@ -174,6 +174,7 @@ int umad_set_grh(void *umad, void *mad_addr); > > int umad_set_addr_net(void *umad, int dlid, int dqp, int sl, int qkey); > > int umad_set_addr(void *umad, int dlid, int dqp, int sl, int qkey); > > int umad_set_pkey(void *umad, int pkey_index); > > +int umad_get_pkey(void *umad); > > > > int umad_send(int portid, int agentid, void *umad, int length, > > int timeout_ms, int retries); > > diff --git a/libibumad/src/libibumad.map b/libibumad/src/libibumad.map > > index 9444aa9..0154b7f 100644 > > --- a/libibumad/src/libibumad.map > > +++ 
b/libibumad/src/libibumad.map > > @@ -15,6 +15,7 @@ IBUMAD_1.0 { > > umad_size; > > umad_set_grh; > > umad_set_pkey; > > + umad_get_pkey; > > Shouldn't running rev in libibumad.ver be updated to go along with > this added API ? Yes, I think it should. Also we will need to add man page for this new function. Sasha From sashak at voltaire.com Tue Jan 15 06:16:47 2008 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Tue, 15 Jan 2008 14:16:47 +0000 Subject: [ofa-general] Re: [PATCH] opensm/osm_ucast_ftree.c: fixing coredump in fat-tree routing In-Reply-To: <478CAF68.8020609@dev.mellanox.co.il> References: <478CAF68.8020609@dev.mellanox.co.il> Message-ID: <20080115141647.GN16009@sashak.voltaire.com> On 15:04 Tue 15 Jan , Yevgeny Kliteynik wrote: > Hi Sasha, > > Fat-tree routing wasn't clearing the data structure at the > beginning of the new routing recalculation, which was causing > wrong routing and sometimes coredump. > > Please apply this patch to master and ofed_1_3. > If it's possible, please apply to ofed_1_3 ASAP - we're doing > RC2 today, and it's important to have this fix included. > > -- Yevgeny > > Signed-off-by: Yevgeny Kliteynik Applied. Thanks. Sasha From sashak at voltaire.com Tue Jan 15 06:17:05 2008 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Tue, 15 Jan 2008 14:17:05 +0000 Subject: [ofa-general] Re: [PATCH] opensm/osm_ucast_ftree.c: cosmetics in log messages In-Reply-To: <478CB0C2.2040109@dev.mellanox.co.il> References: <478CB0C2.2040109@dev.mellanox.co.il> Message-ID: <20080115141705.GO16009@sashak.voltaire.com> On 15:10 Tue 15 Jan , Yevgeny Kliteynik wrote: > Hi Sasha, > > Cosmetics in log messages of fat-tree routing - printing lid in 4 digits. > Please apply to master and ofed_1_3 > > -- Yevgeny > > Signed-off-by: Yevgeny Kliteynik Applied. Thanks. 
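
As a small illustration of what the format change in the cosmetics patch does, the sketch below formats the same host-order LID with the old `0x%x` conversion and the new `0x%04x` conversion. The helper names are hypothetical and exist only for this example; they are not part of OpenSM.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: format a host-order LID with the old "0x%x" style. */
static void sketch_format_lid_plain(char *buf, size_t len, uint16_t lid_ho)
{
	assert(buf != NULL);
	snprintf(buf, len, "0x%x", (unsigned int)lid_ho);
}

/* Hypothetical helper: format a host-order LID zero-padded to 4 hex digits. */
static void sketch_format_lid_padded(char *buf, size_t len, uint16_t lid_ho)
{
	assert(buf != NULL);
	snprintf(buf, len, "0x%04x", (unsigned int)lid_ho);
}
```

Since a LID is 16 bits wide, four hex digits always suffice, so the padded form gives fixed-width columns in the logs and in the HCA ordering file.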
Sasha From bart.vanassche at gmail.com Tue Jan 15 06:36:36 2008 From: bart.vanassche at gmail.com (Bart Van Assche) Date: Tue, 15 Jan 2008 15:36:36 +0100 Subject: [ofa-general] Discovering new SRP targets and SRP userspace tools In-Reply-To: <478CBA7F.8050308@mellanox.co.il> References: <478C8705.4050601@dev.mellanox.co.il> <478CBA7F.8050308@mellanox.co.il> Message-ID: On Jan 15, 2008 2:51 PM, Tziporet Koren wrote: > > Vladimir Sokolovsky wrote: > > Bart Van Assche wrote: > >> ... > >> It took me some time however to find out the location of the source > >> code of the userspace SRP tools. Can these tools either be included > >> in OFED or can a link be added to the SRPT installation wiki page ? > > > > srptools RPM is included in OFED-1.3. > > > > Also in OFED 1.2 and 1.2.5 Are you sure ? I only found the SRP release notes in OFED-1.2.5.4, but not the source code of the SRP tools. I downloaded OFED-1.2.5.4 from http://www.openfabrics.org/builds/ofed-1.2.5/release/. Bart. 
From vlad at dev.mellanox.co.il Tue Jan 15 06:44:55 2008
From: vlad at dev.mellanox.co.il (Vladimir Sokolovsky)
Date: Tue, 15 Jan 2008 16:44:55 +0200
Subject: [ofa-general] Discovering new SRP targets and SRP userspace tools
In-Reply-To:
References: <478C8705.4050601@dev.mellanox.co.il> <478CBA7F.8050308@mellanox.co.il>
Message-ID: <478CC6E7.9040106@dev.mellanox.co.il>

Bart Van Assche wrote:
> On Jan 15, 2008 2:51 PM, Tziporet Koren wrote:
>
>> Vladimir Sokolovsky wrote:
>>
>>> Bart Van Assche wrote:
>>>
>>>> ...
>>>> It took me some time however to find out the location of the source
>>>> code of the userspace SRP tools. Can these tools either be included
>>>> in OFED or can a link be added to the SRPT installation wiki page ?
>>>>
>>> srptools RPM is included in OFED-1.3.
>>>
>>> Also in OFED 1.2 and 1.2.5
>>>
>
> Are you sure ? I only found the SRP release notes in OFED-1.2.5.4, but
> not the source code of the SRP tools. I downloaded OFED-1.2.5.4 from
> http://www.openfabrics.org/builds/ofed-1.2.5/release/.
>
> Bart.
>
In the OFED-1.2.5.4 srptools is a part of the ofa_user-1.2.5.4-0.src.rpm.
After OFED installation you should see srptools RPM installed.

Sources are taken from /Ishai Rabinovitz:
/http://www.openfabrics.org/git/?p=~ishai/srptools.git;a=summary

In the OFED-1.3 srptools is a stand alone src.rpm

Regards,
Vladimir

From tziporet at mellanox.co.il Tue Jan 15 07:23:11 2008
From: tziporet at mellanox.co.il (Tziporet Koren)
Date: Tue, 15 Jan 2008 17:23:11 +0200
Subject: [ofa-general] OFED Jan 14 meeting summary on RC2 readiness
Message-ID: <6C2C79E72C305246B504CBA17B5500C9031E5E56@mtlexch01.mtl.com>

OFED Jan-14 meeting summary on OFED 1.3-rc2 readiness

Meeting decisions:
------------------
* RC2 will be released on Wednesday
* XRC will be added after this - meaning we must have RC4

Meeting details:
----------------
1. Review release status
   * Qlogic - status is good - still work on some bugs
   * Intel - RC1 is OK, will test the mvapich fix for ia64
   * Mellanox - regression is good and stable
   * IBM - see several issues on PPC
   * Neteffect - new maintainer Glenn Streiff
   * Chelsio - no one participated
   * Cisco - no one participated
   * Voltaire - no one participated

2. Update on tasks that should be completed for RC2:
   * XRC - enhanced API - will be ready on Wed - decided it will not delay RC2 so it will be merged next week
   * IPoIB performance improvements for small messages - not a must for RC2
   * Open MPI 1.2.5-rc2 - done

3. Bugs review: Critical and major

bug_id  bug_severity  assigned_to                    short_short_desc
858     critical      bugzilla at openib.org          ibv_cmd_query_device fails - Jack answered
846     critical      jim at mellanox.com             SDP crash on RHEL5 ppc64 running netserver - will be debugged next week
849     critical      vlad at mellanox.co.il          IB_core is not compiling on ppc64 with rhel-4.6 - no issue (checked it in Mellanox)
760     major         eli at mellanox.co.il           UDP performance on Rx is lower than Tx
761     major         eli at mellanox.co.il           Poor and jittery UDP performance at small messages
736     major         rolandd at cisco.com            IBV_WC_RETRY_EXC_ERR errors with local rdma_reads
767     major         swise at opengridcomputing.com  Non backport Kernels that don't build in genalloc cause compile errors for cxgb3

Tziporet

From bart.vanassche at gmail.com Tue Jan 15 07:28:47 2008
From: bart.vanassche at gmail.com (Bart Van Assche)
Date: Tue, 15 Jan 2008 16:28:47 +0100
Subject: [ofa-general] Discovering new SRP targets and SRP userspace tools
In-Reply-To: <478CC6E7.9040106@dev.mellanox.co.il>
References: <478C8705.4050601@dev.mellanox.co.il> <478CBA7F.8050308@mellanox.co.il> <478CC6E7.9040106@dev.mellanox.co.il>
Message-ID:

> In the OFED-1.2.5.4 srptools is a part of the ofa_user-1.2.5.4-0.src.rpm.
> After OFED installation you should see srptools RPM installed.
>
> Sources are taken from /Ishai Rabinovitz:
> /http://www.openfabrics.org/git/?p=~ishai/srptools.git;a=summary
>
> In the OFED-1.3 srptools is a stand alone src.rpm

I had another look at the menus displayed by the build.sh script. In
my opinion it would be convenient if the SRP tools could be built
without having to build the MPI RPMs and without having to select '4)
Customize'.

Bart.

From mschlining at datadirectnet.com Tue Jan 15 07:31:29 2008
From: mschlining at datadirectnet.com (Martin W. Schlining III)
Date: Tue, 15 Jan 2008 10:31:29 -0500
Subject: [ofa-general] srp_daemon not picking up all parameter from srp_daemon.conf
Message-ID: <478CD1D1.30706@datadirectnet.com>

The srp_daemon used to log in to SRP targets is not picking up more
than one module parameter from /etc/srp_daemon.conf. I've seen this
behavior with OFED 1.2, 1.2.5.3, and the IB software in 2.6.24-rc6.

Is this a bug or is the srp_daemon.conf not being used properly?

Here are a couple of examples:

-----------------------------------------------------------------

With one parameter on each line:

/etc/srp_daemon.conf

## This is an example rules configuration file for srp_daemon.
##
#This is a comment
## disallow the following dgid
#d dgid=fe800000000000000002c90200402bd5
## allow target with the following ioc_guid
#a ioc_guid=00a0b80200402bd7
## allow target with the following id_ext and ioc_guid
#a id_ext=200500A0B81146A1,ioc_guid=00a0b80200402bef
## disallow all the rest
#d
a max_sect=65535
a max_cmd_per_lun=4

fury:~ # modprobe -r ib_srp
fury:~ # modprobe ib_srp
fury:~ # srp_daemon -o -e /dev/infiniband/umad0
fury:~ # lsscsi
[0:0:0:0] disk SEAGATE ST373455SS S513 /dev/sda
[26:0:0:0] disk DDN S2A 9550 3.10 /dev/sdb
fury:~ # cat /sys/block/sdb/queue/max_hw_sectors_kb
32767
fury:~ # cat /sys/class/infiniband_srp/srp-mthca0-1/device/host26/scsi_host\:host26/cmd_per_lun
63
fury:~ #

max_sect was picked up, but max_cmd_per_lun is still the default.

------------------------------------------------------------------

With both parameters on one line separated by a comma:

## This is an example rules configuration file for srp_daemon.
##
#This is a comment
## disallow the following dgid
#d dgid=fe800000000000000002c90200402bd5
## allow target with the following ioc_guid
#a ioc_guid=00a0b80200402bd7
## allow target with the following id_ext and ioc_guid
#a id_ext=200500A0B81146A1,ioc_guid=00a0b80200402bef
## disallow all the rest
#d
a max_sect=65535,max_cmd_per_lun=4

fury:~ # modprobe -r ib_srp
fury:~ # modprobe ib_srp
fury:~ # srp_daemon -o -e /dev/infiniband/umad0
fury:~ # lsscsi
[0:0:0:0] disk SEAGATE ST373455SS S513 /dev/sda
[27:0:0:0] disk DDN S2A 9550 3.10 /dev/sdb
fury:~ # cat /sys/block/sdb/queue/max_hw_sectors_kb
512
fury:~ # cat /sys/class/infiniband_srp/srp-mthca0-1/device/host27/scsi_host\:host27/cmd_per_lun
4
fury:~ #

max_cmd_per_lun was picked up, but max_sect was not.

------------------------------------------------------------------

The original method for adding SRP targets still allows more than one
module parameter to be added. This method works fine.

fury:~/tools/ib # ibsrpdm -d /dev/infiniband/umad0 -c | \
awk '{ORS="";print $1",max_sect=65535,max_cmd_per_lun=4"}' \
> /sys/class/infiniband_srp/srp-mthca0-1/add_target
fury:~/tools/ib # lsscsi
[0:0:0:0] disk SEAGATE ST373455SS S513 /dev/sda
[31:0:0:0] disk DDN S2A 9550 3.10 /dev/sdb
fury:~/tools/ib # cat /sys/block/sdb/queue/max_hw_sectors_kb
32767
fury:~/tools/ib # cat /sys/class/infiniband_srp/srp-mthca0-1/device/host31/scsi_host\:host31/cmd_per_lun
4
fury:~/tools/ib #

From vlad at dev.mellanox.co.il Tue Jan 15 07:33:18 2008
From: vlad at dev.mellanox.co.il (Vladimir Sokolovsky)
Date: Tue, 15 Jan 2008 17:33:18 +0200
Subject: [ofa-general] Discovering new SRP targets and SRP userspace tools
In-Reply-To:
References: <478C8705.4050601@dev.mellanox.co.il> <478CBA7F.8050308@mellanox.co.il> <478CC6E7.9040106@dev.mellanox.co.il>
Message-ID: <478CD23E.6070909@dev.mellanox.co.il>

Bart Van Assche wrote:
>> In the OFED-1.2.5.4 srptools is a part of the ofa_user-1.2.5.4-0.src.rpm.
>> After OFED installation you should see srptools RPM installed.
>>
>> Sources are taken from /Ishai Rabinovitz:
>> /http://www.openfabrics.org/git/?p=~ishai/srptools.git;a=summary
>>
>> In the OFED-1.3 srptools is a stand alone src.rpm
>>
>
> I had another look at the menus displayed by the build.sh script. In
> my opinion it would be convenient if the SRP tools could be built
> without having to build the MPI RPMs and without having to select '4)
> Customize'.
>
> Bart.
> _______________________________________________
>
To build srptools RPM only:
echo srptools=y > srptools.conf
./build.sh -c srptools.conf

Vladimir

From harms at alcf.anl.gov Tue Jan 15 07:33:18 2008
From: harms at alcf.anl.gov (Kevin Harms)
Date: Tue, 15 Jan 2008 09:33:18 -0600
Subject: [ofa-general] srp_daemon not picking up all parameter from srp_daemon.conf
In-Reply-To: <478CD1D1.30706@datadirectnet.com>
References: <478CD1D1.30706@datadirectnet.com>
Message-ID: <6450F1A4-175B-44DD-ABD2-94E9017C6923@alcf.anl.gov>

martin,

these parameters are all supposed to go on one line, but yes there is
a bug where only the last parameter takes effect. i believe that bug is
captured in the database along with a suggested code fix.

> ## disallow all the rest
> #d
> a max_sect=65535,max_cmd_per_lun=4

kevin

On Jan 15, 2008, at 9:31 AM, Martin W. Schlining III wrote:

> The srp_daemon used to login to SRP targets is not picking up more
> than one module parameter from /etc/srp_daemon.conf. I've seen this
> behavior with OFED 1.2, 1.2.5.3, and the IB software in 2.6.24-rc6.
>
> Is this a bug or is the srp_daemon.conf not being used properly?
>
> Here are a couple of examples:
>
> -----------------------------------------------------------------
>
> With one parameter on each line:
>
> /etc/srp_daemon.conf
>
> ## This is an example rules configuration file for srp_daemon.
> ## > #This is a comment > ## disallow the following dgid > #d dgid=fe800000000000000002c90200402bd5 > ## allow target with the following ioc_guid > #a ioc_guid=00a0b80200402bd7 > ## allow target with the following id_ext and ioc_guid > #a id_ext=200500A0B81146A1,ioc_guid=00a0b80200402bef > ## disallow all the rest > #d > a max_sect=65535 > a max_cmd_per_lun=4 > > fury:~ # modprobe -r ib_srp > fury:~ # modprobe ib_srp > fury:~ # srp_daemon -o -e /dev/infiniband/umad0 > fury:~ # lsscsi > [0:0:0:0] disk SEAGATE ST373455SS S513 /dev/sda > [26:0:0:0] disk DDN S2A 9550 3.10 /dev/sdb > fury:~ # cat /sys/block/sdb/queue/max_hw_sectors_kb > 32767 > fury:~ # cat /sys/class/infiniband_srp/srp-mthca0-1/device/host26/ > scsi_host\:host26/cmd_per_lun > 63 > fury:~ # > > max_sect was picked up, but max_cmd_per_lun is still the default. > > ------------------------------------------------------------------ > > With both parameters on one line seperated by a comma: > > ## This is an example rules configuration file for srp_daemon. > ## > #This is a comment > ## disallow the following dgid > #d dgid=fe800000000000000002c90200402bd5 > ## allow target with the following ioc_guid > #a ioc_guid=00a0b80200402bd7 > ## allow target with the following id_ext and ioc_guid > #a id_ext=200500A0B81146A1,ioc_guid=00a0b80200402bef > ## disallow all the rest > #d > a max_sect=65535,max_cmd_per_lun=4 > > fury:~ # modprobe -r ib_srp > fury:~ # modprobe ib_srp > fury:~ # srp_daemon -o -e /dev/infiniband/umad0 > fury:~ # lsscsi > [0:0:0:0] disk SEAGATE ST373455SS S513 /dev/sda > [27:0:0:0] disk DDN S2A 9550 3.10 /dev/sdb > fury:~ # cat /sys/block/sdb/queue/max_hw_sectors_kb > 512 > fury:~ # cat /sys/class/infiniband_srp/srp-mthca0-1/device/host27/ > scsi_host\:host27/cmd_p > er_lun > 4 > fury:~ # > > max_cmd_per_lun was picked up, but max_sect was not. 
> > ------------------------------------------------------------------ > > The original method for adding SRP targets still allows for more > than module parameter to be added. This method works fine. > > fury:~/tools/ib # ibsrpdm -d /dev/infiniband/umad0 -c | \ > awk '{ORS="";print > $1",max_sect=65535,max_cmd_per_lun=4"}' \ > > /sys/class/infiniband_srp/srp-mthca0-1/add_target > fury:~/tools/ib # lsscsi > [0:0:0:0] disk SEAGATE ST373455SS S513 /dev/sda > [31:0:0:0] disk DDN S2A 9550 3.10 /dev/sdb > fury:~/tools/ib # cat /sys/block/sdb/queue/max_hw_sectors_kb > 32767 > fury:~/tools/ib # cat /sys/class/infiniband_srp/srp-mthca0-1/device/ > host31/scsi_host\:hos > t31/cmd_per_lun > 4 > fury:~/tools/ib # > > _______________________________________________ > general mailing list > general at lists.openfabrics.org > http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general > > To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general > From bart.vanassche at gmail.com Tue Jan 15 07:36:33 2008 From: bart.vanassche at gmail.com (Bart Van Assche) Date: Tue, 15 Jan 2008 16:36:33 +0100 Subject: [ofa-general] Discovering new SRP targets and SRP userspace tools In-Reply-To: <478CD23E.6070909@dev.mellanox.co.il> References: <478C8705.4050601@dev.mellanox.co.il> <478CBA7F.8050308@mellanox.co.il> <478CC6E7.9040106@dev.mellanox.co.il> <478CD23E.6070909@dev.mellanox.co.il> Message-ID: On Jan 15, 2008 4:33 PM, Vladimir Sokolovsky wrote: > > To build srptools RPM only: > echo srptools=y > srptools.conf > ./build.sh -c srptools.conf Thanks ! Bart. From weiny2 at llnl.gov Tue Jan 15 08:47:00 2008 From: weiny2 at llnl.gov (Ira Weiny) Date: Tue, 15 Jan 2008 08:47:00 -0800 Subject: [PATCH] opensm: Special Case the IPv6 Solicited Node Multicast address to use a single Mcast (WAS: Re: [ofa-general] IPoIB, OFED 1.2.5, and multicast groups.) 
In-Reply-To: <1200361233.8962.206.camel@hrosenstock-ws.xsigo.com> References: <20080111193657.58477fb0.weiny2@llnl.gov> <20080111220456.0d62de97.weiny2@llnl.gov> <20080112000117.6b52b53c.weiny2@llnl.gov> <20080114105132.19eafcee.weiny2@llnl.gov> <1200342214.8962.75.camel@hrosenstock-ws.xsigo.com> <20080114153546.73d43d6b.weiny2@llnl.gov> <1200355500.8962.148.camel@hrosenstock-ws.xsigo.com> <20080115005045.GG16009@sashak.voltaire.com> <1200360160.8962.196.camel@hrosenstock-ws.xsigo.com> <20080115014355.GK16009@sashak.voltaire.com> <1200361233.8962.206.camel@hrosenstock-ws.xsigo.com> Message-ID: <20080115084700.1d46407c.weiny2@llnl.gov> On Mon, 14 Jan 2008 17:40:33 -0800 Hal Rosenstock wrote: > On Tue, 2008-01-15 at 01:43 +0000, Sasha Khapyorsky wrote: > > On 17:22 Mon 14 Jan , Hal Rosenstock wrote: > > > On Tue, 2008-01-15 at 00:50 +0000, Sasha Khapyorsky wrote: > > > > On 16:05 Mon 14 Jan , Hal Rosenstock wrote: > > > > > On Mon, 2008-01-14 at 15:35 -0800, Ira Weiny wrote: > > > > > > On Mon, 14 Jan 2008 12:23:34 -0800 > > > > > > Hal Rosenstock wrote: > > > > > > > > > > > > > On Mon, 2008-01-14 at 10:51 -0800, Ira Weiny wrote: > > > > > > > > Hey Hal, thanks for the response. Comments below. > > > > > > > > > > > > > > > > On Mon, 14 Jan 2008 12:57:45 -0500 > > > > > > > > "Hal Rosenstock" wrote: > > > > > > > > > > > > > > > > > > > > > > Didn't you just change that in that many MGIDs go to one MLID ? > > > > > > > > > > > > Ah, this is where the confusion has been. No, this is _not_ what I did... I > > > > > > see now; that is what was proposed in the thread a year ago, however, I don't > > > > > > think mapping many MGIDs to 1 MLID will work well. > > > > > > > > > > Why not ? > > > > > > > > > > It appears to be what you did (multiple MGIDs are mapped onto MLID (in > > > > > the case below 0xc002)). Am I mistaken ? > > > > > > > > As far as I understand this patch it is the different. 
Here multiple > > > > ports which match ipv6 solicited node multicast address will try to > > > > join a single MC group (with single MGID and unique MLID). > > > > > > I don't think you are using the IBA defined terminology. > > > > > > A MC group is an MGID in terms of the IBA spec. Also, the SA GetTable > > > with MGIDs wildcarded shows all the MGIDs. (Does it show that "special" > > > MGID ?) > > > > Yes, it has MLID 0xc002 in Ira's 'saquery -g' example. > > No, MLID is not the group (at least in IBA terms); I was referring to > the base SNM MGID (with partition and low 24 bits masked off). You are right the MLID is not the group. But the patch only creates 1 MGID as well. I think I see where you are coming from but... Let me ask this. (I think I know the answer but I will ask anyway.) If you have 3 MGID's (0xFF...1, 0xFF...2, 0xFF...3, 3 actual mgrp structures in opensm) and you map them all to MLID 0xC001 will a message to 0xC001 reach all 3 nodes? > > > > I would phrase this differently: > > > All IPv6 SNM groups are mapped to a single MLID (when this feature is > > > enabled). > > > > No, all ports join single IPv6 SNM MC group, and yes, it has single MGID > > (and single MLID). > > It does not have a single MGID; it has many MGIDs including the base one > (just look at the group dump). Are you mainly objecting to the fact that the IPoIB client nodes request joining of MGID 0xFF...X and we are joining them to 0xFF...Y? Ira > > -- Hal > > > Sasha > > > > > It so happens that OpenSM internally does the accounting on > > > membership by treating them all as members of the same "base" or > > > "masked" group by masking off partition and the low 24 bits (port GUID). > > > > > > -- Hal > > > > > > > Sasha > > > > > > > > > > > > > > > What I did was to allow the first IPv6 request to create the group and then all > > > > > > other requests were added to this group. 
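The "masked group" accounting described here follows from the special-case comparison in the patch quoted in this thread, which can be modelled in a few lines. This is a rough sketch only, not OpenSM code: SPEC_PREFIX and INT_ID_MASK are copied from the patch, while the function names and the representation of an MGID as a (prefix, interface_id) pair of host-order integers are assumptions made for illustration.

```python
# Constants as they appear in the patch quoted in this thread.
SPEC_PREFIX = 0xFF12601BFFFF0000  # upper 64 bits of the special-cased MGIDs
INT_ID_MASK = 0x00000001FF000000  # fixed bits of 0x00000001ffXXXXXX


def is_special_mgid(prefix, interface_id):
    """True if a received MGID falls under the patch's special case."""
    return prefix == SPEC_PREFIX and (interface_id & INT_ID_MASK) == INT_ID_MASK


def special_case_match(group_mgid, received_mgid):
    """Model of the fallback taken when the full 128-bit compare fails.

    Both arguments are hypothetical (prefix, interface_id) tuples.
    """
    g_prefix, g_id = group_mgid
    r_prefix, r_id = received_mgid
    if not is_special_mgid(r_prefix, r_id):
        return False
    return g_prefix == r_prefix and (g_id & INT_ID_MASK) == (r_id & INT_ID_MASK)


if __name__ == "__main__":
    first_joiner = (0xFF12601BFFFF0000, 0x00000001FF0021E9)
    later_joiner = (0xFF12601BFFFF0000, 0x00000001FF2265ED)
    all_nodes = (0xFF12601BFFFF0000, 0x0000000000000001)
    # A later request is folded into the group created by the first one.
    print(special_case_match(first_joiner, later_joiner))
    # The ...0001 group has none of the masked bits set, so it is left alone.
    print(special_case_match(all_nodes, later_joiner))
```

In this model only the low 24 bits of the interface ID differ between requests that end up in the same group, which is the masking the discussion refers to.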
> > > > > > > > > > You are using the word group loosely here and that is the source of the > > > > > confusion IMO. I think by group you mean MLID. > > > > > > > > > > > This sends all the neighbor discovery messages to all nodes on the network. > > > > > > > > > > All nodes part of that MLID tree. > > > > > > > > > > > This might seem inefficient but should work. (... and seems to.) > > > > > > > > > > Sure; the hosts will filter based on MGID. The tradeoff is MLID > > > > > utilization versus fabric utilization. > > > > > > > > > > > > > All of the requests for this type of MGRP join are routed to > > > > > > > > one group. Therefore, I thought the same rules for deleting the group would > > > > > > > > apply; when all the members are gone it is removed? > > > > > > > > > > > > > > Yes, the group may go but not the underlying MLID as there are other > > > > > > > groups which are sharing this. That's not what happens now. > > > > > > > > > > > > No, since there is only 1 group in this implementation it should work like > > > > > > others. The first node of this "mgid type" will create the group. Others will > > > > > > join it and will continue to use it even if the creator leaves. > > > > > > > > > > Are you saying all these groups appear as 1 "group" to OpenSM (as the > > > > > real groups are masked to the same value) ? > > > > > > > > > > -- Hal > > > > > > > > > > > Does this make more sense? 
> > > > > > > > > > > > Ira > > > > > > > > > > > > > > > > > > > > > Just to be clear, after > > > > > > > > this patch the mgroups are: > > > > > > > > > > > > > > > > 09:36:40 > saquery -g > > > > > > > > MCMemberRecord group dump: > > > > > > > > MGID....................0xff12401bffff0000 : 0x00000000ffffffff > > > > > > > > Mlid....................0xC000 > > > > > > > > Mtu.....................0x84 > > > > > > > > pkey....................0xFFFF > > > > > > > > Rate....................0x83 > > > > > > > > MCMemberRecord group dump: > > > > > > > > MGID....................0xff12401bffff0000 : 0x0000000000000001 > > > > > > > > Mlid....................0xC001 > > > > > > > > Mtu.....................0x84 > > > > > > > > pkey....................0xFFFF > > > > > > > > Rate....................0x83 > > > > > > > > MCMemberRecord group dump: > > > > > > > > MGID....................0xff12601bffff0000 : 0x00000001ff0021e9 > > > > > > > > Mlid....................0xC002 > > > > > > > > Mtu.....................0x84 > > > > > > > > pkey....................0xFFFF > > > > > > > > Rate....................0x83 > > > > > > > > MCMemberRecord group dump: > > > > > > > > MGID....................0xff12601bffff0000 : 0x0000000000000001 > > > > > > > > Mlid....................0xC003 > > > > > > > > Mtu.....................0x84 > > > > > > > > pkey....................0xFFFF > > > > > > > > Rate....................0x83 > > > > > > > > > > > > > > > > All of these requests are added to the > > > > > > > > MGID....................0xff12601bffff0000 : 0x00000001ff0021e9 > > > > > > > > Mlid....................0xC002 > > > > > > > > group. But as you say, how do we determine that the pkey, mtu, and rate are > > > > > > > > valid? :-/ > > > > > > > > > > > > > > > > But here is a question: > > > > > > > > > > > > > > > > What happens if someone with an incorrect MTU tries to join the > > > > > > > > MGID....................0xff12401bffff0000 : 0x0000000000000001 > > > > > > > > group? 
Wouldn't this code return this mgrp pointer and the subsequent MTU and > > > > > > > > rate checks fail? I seem to recall a thread discussing this before. I don't > > > > > > > > remember what the outcome was. I seem to remember the question was if OpenSM > > > > > > > > should create/modify a group to the "lowest common" MTU/Rate, and succeed all > > > > > > > > the joins, vs enforcing the faster MTU/Rate and failing the joins. > > > > > > > > > > > > > > Yes, the join would fail, but I don't think that's what we would want. > > > > > > > The alternative with the patch is to make it the lowest rate but there > > > > > > > is a minimum MTU which might not be right. > > > > > > > > > > > > > > > > I think this is a policy and rather than this always being the case, > > > > > > > > > there should be a policy parameter added to OpenSM for this. IMO > > > > > > > > > default should be to not do this. > > > > > > > > > > > > > > > > Yes, for sure there needs to be some options to control the behavior. > > > > > > > > > > > > > > > > > > > > > > > > > > Maybe more later... > > > > > > > > > > > > > > > > Thanks again, > > > > > > > > Ira > > > > > > > > > > > > > > > > > > > > > > > > > > -- Hal > > > > > > > > > > > > > > > > > > > Thanks, > > > > > > > > > > Ira > > > > > > > > > > > > > > > > > > > > [*] Again I apologize for the spam but we were in a bit of a panic as we only > > > > > > > > > > have the big system for the weekend and IB was not part of the test... ;-) > > > > > > > > > > > > > > > > > > > > >From 35e35a9534bd49147886ac93ab1601acadcdbe26 Mon Sep 17 00:00:00 2001 > > > > > > > > > > From: Ira K. Weiny > > > > > > > > > > Date: Fri, 11 Jan 2008 22:58:19 -0800 > > > > > > > > > > Subject: [PATCH] Special Case the IPv6 Solicited Node Multicast address to use a single Mcast > > > > > > > > > > Group. 
> > > > > > > > > > > > > > > > > > > > Signed-off-by: root > > > > > > > > > > --- > > > > > > > > > > opensm/opensm/osm_sa_mcmember_record.c | 30 +++++++++++++++++++++++++++++- > > > > > > > > > > opensm/opensm/osm_sa_path_record.c | 31 ++++++++++++++++++++++++++++++- > > > > > > > > > > 2 files changed, 59 insertions(+), 2 deletions(-) > > > > > > > > > > > > > > > > > > > > diff --git a/opensm/opensm/osm_sa_mcmember_record.c b/opensm/opensm/osm_sa_mcmember_record.c > > > > > > > > > > index 8eb97ad..6bcc124 100644 > > > > > > > > > > --- a/opensm/opensm/osm_sa_mcmember_record.c > > > > > > > > > > +++ b/opensm/opensm/osm_sa_mcmember_record.c > > > > > > > > > > @@ -124,9 +124,37 @@ __search_mgrp_by_mgid(IN cl_map_item_t * const p_map_item, IN void *context) > > > > > > > > > > /* compare entire MGID so different scope will not sneak in for > > > > > > > > > > the same MGID */ > > > > > > > > > > if (memcmp(&p_mgrp->mcmember_rec.mgid, > > > > > > > > > > - &p_recvd_mcmember_rec->mgid, sizeof(ib_gid_t))) > > > > > > > > > > + &p_recvd_mcmember_rec->mgid, sizeof(ib_gid_t))) { > > > > > > > > > > + > > > > > > > > > > + /* Special Case IPV6 Multicast Loopback addresses */ > > > > > > > > > > + /* 0xff12601bffff0000 : 0x00000001ffXXXXXX */ > > > > > > > > > > +#define SPEC_PREFIX (0xff12601bffff0000) > > > > > > > > > > +#define INT_ID_MASK (0x00000001ff000000) > > > > > > > > > > + uint64_t g_prefix = cl_ntoh64(p_mgrp->mcmember_rec.mgid.unicast.prefix); > > > > > > > > > > + uint64_t g_interface_id = cl_ntoh64(p_mgrp->mcmember_rec.mgid.unicast.interface_id); > > > > > > > > > > + uint64_t rcv_prefix = cl_ntoh64(p_recvd_mcmember_rec->mgid.unicast.prefix); > > > > > > > > > > + uint64_t rcv_interface_id = cl_ntoh64(p_recvd_mcmember_rec->mgid.unicast.interface_id); > > > > > > > > > > + > > > > > > > > > > + if (rcv_prefix == SPEC_PREFIX > > > > > > > > > > + && > > > > > > > > > > + (rcv_interface_id & INT_ID_MASK) == INT_ID_MASK) { > > > > > > > > > > + > > > > > > 
> > > > + if ((g_prefix == rcv_prefix) > > > > > > > > > > + && > > > > > > > > > > + (g_interface_id & INT_ID_MASK) == > > > > > > > > > > + (rcv_interface_id & INT_ID_MASK) > > > > > > > > > > + ) { > > > > > > > > > > + osm_log(sa->p_log, OSM_LOG_INFO, > > > > > > > > > > + "Special Case Mcast Join for MGID " > > > > > > > > > > + " MGID 0x%016"PRIx64" : 0x%016"PRIx64"\n", > > > > > > > > > > + rcv_prefix, rcv_interface_id); > > > > > > > > > > + goto match; > > > > > > > > > > + } > > > > > > > > > > + } > > > > > > > > > > return; > > > > > > > > > > + } > > > > > > > > > > > > > > > > > > > > +match: > > > > > > > > > > if (p_ctxt->p_mgrp) { > > > > > > > > > > osm_log(sa->p_log, OSM_LOG_ERROR, > > > > > > > > > > "__search_mgrp_by_mgid: ERR 1B03: " > > > > > > > > > > diff --git a/opensm/opensm/osm_sa_path_record.c b/opensm/opensm/osm_sa_path_record.c > > > > > > > > > > index 749a936..469773a 100644 > > > > > > > > > > --- a/opensm/opensm/osm_sa_path_record.c > > > > > > > > > > +++ b/opensm/opensm/osm_sa_path_record.c > > > > > > > > > > @@ -1536,8 +1536,37 @@ __search_mgrp_by_mgid(IN cl_map_item_t * const p_map_item, IN void *context) > > > > > > > > > > > > > > > > > > > > /* compare entire MGID so different scope will not sneak in for > > > > > > > > > > the same MGID */ > > > > > > > > > > - if (memcmp(&p_mgrp->mcmember_rec.mgid, p_recvd_mgid, sizeof(ib_gid_t))) > > > > > > > > > > + if (memcmp(&p_mgrp->mcmember_rec.mgid, p_recvd_mgid, sizeof(ib_gid_t))) { > > > > > > > > > > + > > > > > > > > > > + /* Special Case IPV6 Multicast Loopback addresses */ > > > > > > > > > > + /* 0xff12601bffff0000 : 0x00000001ffXXXXXX */ > > > > > > > > > > +#define SPEC_PREFIX (0xff12601bffff0000) > > > > > > > > > > +#define INT_ID_MASK (0x00000001ff000000) > > > > > > > > > > + uint64_t g_prefix = cl_ntoh64(p_mgrp->mcmember_rec.mgid.unicast.prefix); > > > > > > > > > > + uint64_t g_interface_id = cl_ntoh64(p_mgrp->mcmember_rec.mgid.unicast.interface_id); > > > > > > > 
> > > + uint64_t rcv_prefix = cl_ntoh64(p_recvd_mgid->unicast.prefix); > > > > > > > > > > + uint64_t rcv_interface_id = cl_ntoh64(p_recvd_mgid->unicast.interface_id); > > > > > > > > > > + > > > > > > > > > > + if (rcv_prefix == SPEC_PREFIX > > > > > > > > > > + && > > > > > > > > > > + (rcv_interface_id & INT_ID_MASK) == INT_ID_MASK) { > > > > > > > > > > + > > > > > > > > > > + if ((g_prefix == rcv_prefix) > > > > > > > > > > + && > > > > > > > > > > + (g_interface_id & INT_ID_MASK) == > > > > > > > > > > + (rcv_interface_id & INT_ID_MASK) > > > > > > > > > > + ) { > > > > > > > > > > + osm_log(sa->p_log, OSM_LOG_INFO, > > > > > > > > > > + "Special Case Mcast Join for MGID " > > > > > > > > > > + " MGID 0x%016"PRIx64" : 0x%016"PRIx64"\n", > > > > > > > > > > + rcv_prefix, rcv_interface_id); > > > > > > > > > > + goto match; > > > > > > > > > > + } > > > > > > > > > > + } > > > > > > > > > > return; > > > > > > > > > > + } > > > > > > > > > > + > > > > > > > > > > +match: > > > > > > > > > > > > > > > > > > > > #if 0 > > > > > > > > > > for (i = 0; > > > > > > > > > > -- > > > > > > > > > > 1.5.1 > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Fri, 11 Jan 2008 22:04:56 -0800 > > > > > > > > > > Ira Weiny wrote: > > > > > > > > > > > > > > > > > > > > > Ok, > > > > > > > > > > > > > > > > > > > > > > I found my own answer. Sorry for the spam. 
> > > > > > > > > > > > > > > > > > > > > > http://lists.openfabrics.org/pipermail/general/2006-November/029617.html > > > > > > > > > > > > > > > > > > > > > > Sorry, > > > > > > > > > > > Ira > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Fri, 11 Jan 2008 19:36:57 -0800 > > > > > > > > > > > Ira Weiny wrote: > > > > > > > > > > > > > > > > > > > > > > > I don't really understand the innerworkings of IPoIB so forgive me if this is a > > > > > > > > > > > > really stupid question but: > > > > > > > > > > > > > > > > > > > > > > > > Is it a bug that there is a Multicast group created for every node in our > > > > > > > > > > > > clusters? > > > > > > > > > > > > > > > > > > > > > > > > If not a bug why is this done? We just tried to boot on a 1151 node cluster > > > > > > > > > > > > and opensm is complaining there are not enough multicast groups. > > > > > > > > > > > > > > > > > > > > > > > > Jan 11 18:30:42 728984 [40C05960] -> __get_new_mlid: ERR 1B23: All available:1024 mlids are taken > > > > > > > > > > > > Jan 11 18:30:42 729050 [40C05960] -> osm_mcmr_rcv_create_new_mgrp: ERR 1B19: __get_new_mlid failed > > > > > > > > > > > > Jan 11 18:30:42 730647 [40401960] -> __get_new_mlid: ERR 1B23: All available:1024 mlids are taken > > > > > > > > > > > > Jan 11 18:30:42 730691 [40401960] -> osm_mcmr_rcv_create_new_mgrp: ERR 1B19: __get_new_mlid failed > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Here is the output from my small test cluster: (ibnodesinmcast uses saquery a > > > > > > > > > > > > couple of times to print this nice report.) 
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > 19:17:24 > whatsup > > > > > > > > > > > > up: 9: wopr[0-7],wopri > > > > > > > > > > > > down: 0: > > > > > > > > > > > > root at wopri:/tftpboot/images > > > > > > > > > > > > 19:25:03 > ibnodesinmcast -g > > > > > > > > > > > > 0xC000 (0xff12401bffff0000 : 0x00000000ffffffff) > > > > > > > > > > > > In 9: wopr[0-7],wopri > > > > > > > > > > > > Out 0: 0 > > > > > > > > > > > > 0xC001 (0xff12401bffff0000 : 0x0000000000000001) > > > > > > > > > > > > In 9: wopr[0-7],wopri > > > > > > > > > > > > Out 0: 0 > > > > > > > > > > > > 0xC002 (0xff12601bffff0000 : 0x00000001ff2265ed) > > > > > > > > > > > > In 1: wopr3 > > > > > > > > > > > > Out 8: wopr[0-2,4-7],wopri > > > > > > > > > > > > 0xC003 (0xff12601bffff0000 : 0x0000000000000001) > > > > > > > > > > > > In 9: wopr[0-7],wopri > > > > > > > > > > > > Out 0: 0 > > > > > > > > > > > > 0xC004 (0xff12601bffff0000 : 0x00000001ff222729) > > > > > > > > > > > > In 1: wopr4 > > > > > > > > > > > > Out 8: wopr[0-3,5-7],wopri > > > > > > > > > > > > 0xC005 (0xff12601bffff0000 : 0x00000001ff219e65) > > > > > > > > > > > > In 1: wopri > > > > > > > > > > > > Out 8: wopr[0-7] > > > > > > > > > > > > 0xC006 (0xff12601bffff0000 : 0x00000001ff00232d) > > > > > > > > > > > > In 1: wopr6 > > > > > > > > > > > > Out 8: wopr[0-5,7],wopri > > > > > > > > > > > > 0xC007 (0xff12601bffff0000 : 0x00000001ff002325) > > > > > > > > > > > > In 1: wopr7 > > > > > > > > > > > > Out 8: wopr[0-6],wopri > > > > > > > > > > > > 0xC008 (0xff12601bffff0000 : 0x00000001ff228d35) > > > > > > > > > > > > In 1: wopr1 > > > > > > > > > > > > Out 8: wopr[0,2-7],wopri > > > > > > > > > > > > 0xC009 (0xff12601bffff0000 : 0x00000001ff2227f1) > > > > > > > > > > > > In 1: wopr2 > > > > > > > > > > > > Out 8: wopr[0-1,3-7],wopri > > > > > > > > > > > > 0xC00A (0xff12601bffff0000 : 0x00000001ff219ef1) > > > > > > > > > > > > In 1: wopr0 > > > > > > > > > > > > Out 8: wopr[1-7],wopri > > > 
> > > > > > > > > 0xC00B (0xff12601bffff0000 : 0x00000001ff0021e9) > > > > > > > > > > > > In 1: wopr5 > > > > > > > > > > > > Out 8: wopr[0-4,6-7],wopri > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Each of these MGIDS of the prefix (0xff12601bffff0000) have just one node in > > > > > > > > > > > > them and represent an ipv6 address. Could you turn off ipv6 with the latest > > > > > > > > > > > > IPoIB? > > > > > > > > > > > > > > > > > > > > > > > > In a bind, > > > > > > > > > > > > Ira > > > > > > > > > > > > _______________________________________________ > > > > > > > > > > > > general mailing list > > > > > > > > > > > > general at lists.openfabrics.org > > > > > > > > > > > > http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general > > > > > > > > > > > > > > > > > > > > > > > > To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general > > > > > > > > > > > > > > > > > > > > _______________________________________________ > > > > > > > > > > general mailing list > > > > > > > > > > general at lists.openfabrics.org > > > > > > > > > > http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general > > > > > > > > > > > > > > > > > > > > To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general > > > > > > > > > > > > > > > > > > > > > > > > > > > > _______________________________________________ > > > > > > > > general mailing list > > > > > > > > general at lists.openfabrics.org > > > > > > > > http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general > > > > > > > > > > > > > > > > To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general > > > > > _______________________________________________ > > > > > general mailing list > > > > > general at lists.openfabrics.org > > > > > http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general > > > > > > > > > > To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general > > 

From eli at dev.mellanox.co.il Tue Jan 15 08:55:03 2008
From: eli at dev.mellanox.co.il (Eli Cohen)
Date: Tue, 15 Jan 2008 18:55:03 +0200
Subject: [ofa-general] Re: [PATCH] ib/core: Add creation flags to create QP
In-Reply-To: <1200402798.11174.279.camel@mtls03>
References: <1199980899.11174.91.camel@mtls03> <1200402798.11174.279.camel@mtls03>
Message-ID: <1200416103.11174.303.camel@mtls03>

On second thought, we could add - in addition to the creation flags - a
special QP type, IB_QPT_UD_IPOIB for example, where each low level
driver could add special enhancements for the sake of achieving higher
performance. What do you think?

On Tue, 2008-01-15 at 15:13 +0200, Eli Cohen wrote:
> On Mon, 2008-01-14 at 14:18 -0800, Roland Dreier wrote:
> > > +enum qp_create_flags {
> > > + QP_CREATE_LSO = 1 << 0,
> > > +};
> > > +
> > > struct ib_qp_init_attr {
> > > void (*event_handler)(struct ib_event *, void *);
> > > void *qp_context;
> > > @@ -496,6 +500,7 @@ struct ib_qp_init_attr {
> > > enum ib_sig_type sq_sig_type;
> > > enum ib_qp_type qp_type;
> > > u8 port_num; /* special QP types only */
> > > + enum qp_create_flags create_flags;
> > > };
> >
> > Not sure if this approach is a good one... would it make sense to
> > create a new QP type like IB_QPT_UD_LSO to handle LSO instead? Are
> > there other flags we're going to want to add too?
> As Jack replied already, we do need this also for his XRC code.
Not > shown in the patch is that the flags representation at the hw level is > different from the verbs (QP_CREATE_LSO at the verbs layer is > MLX4_QP_LSO ). > > > > Also this patch doesn't make much sense without the rest of the LSO > > stuff really. Finally, I think you need to audit all the places where > > struct ib_qp_init_attr is used to make sure the flags are set > > correctly; for example the uverbs_cmd.c create QP function seems like > > it would end up passing a random stack value into create_flags. > > I missed that one. There is one more place that might be a problem and > that is rdma_create_qp which is an exported function which accepts > struct ib_qp_init_attr * as an argument. This means that we need to > either clear the create_flags field or require to the caller to put a > valid value. What do you think? > > > _______________________________________________ > general mailing list > general at lists.openfabrics.org > http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general > > To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general From sashak at voltaire.com Tue Jan 15 09:21:01 2008 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Tue, 15 Jan 2008 17:21:01 +0000 Subject: [ofa-general] [PATCH] infiniband-diags/configure.in: complib doesn't have opensm dependencies anymore Message-ID: <20080115172101.GY16009@sashak.voltaire.com> complib doesn't have dependency from opensm libs (libopensm and libosmvendor), so cleanup AC_CHECK_LIB flags. Signed-off-by: Sasha Khapyorsky --- infiniband-diags/configure.in | 2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/infiniband-diags/configure.in b/infiniband-diags/configure.in index 1baa6cb..5cf72ec 100644 --- a/infiniband-diags/configure.in +++ b/infiniband-diags/configure.in @@ -41,7 +41,7 @@ AC_CHECK_LIB(ibmad, mad_dump_int, [], AC_CHECK_LIB(ibmad, port_performance_ext_query, [], AC_MSG_ERROR([port_performance_ext_query() not found. 
diags require more recent libibmad.])) AC_CHECK_LIB(osmcomp, cl_thread_init, [], - AC_MSG_ERROR([cl_thread_init() not found. diags require libosmcomp.]), [-lopensm -losmvendor]) + AC_MSG_ERROR([cl_thread_init() not found. diags require libosmcomp.])) AC_CHECK_LIB(osmvendor, osmv_query_sa, [], AC_MSG_ERROR([osmv_query_sa() not found. diags require libosmvendor.]), [-lopensm]) AC_CHECK_LIB(opensm, osm_log_init_v2, [], -- 1.5.4.rc2.60.gb2e62 From hrosenstock at xsigo.com Tue Jan 15 09:37:18 2008 From: hrosenstock at xsigo.com (Hal Rosenstock) Date: Tue, 15 Jan 2008 09:37:18 -0800 Subject: [PATCH] opensm: Special Case the IPv6 Solicited Node Multicast address to use a single Mcast (WAS: Re: [ofa-general] IPoIB, OFED 1.2.5, and multicast groups.) In-Reply-To: <20080115084700.1d46407c.weiny2@llnl.gov> References: <20080111193657.58477fb0.weiny2@llnl.gov> <20080111220456.0d62de97.weiny2@llnl.gov> <20080112000117.6b52b53c.weiny2@llnl.gov> <20080114105132.19eafcee.weiny2@llnl.gov> <1200342214.8962.75.camel@hrosenstock-ws.xsigo.com> <20080114153546.73d43d6b.weiny2@llnl.gov> <1200355500.8962.148.camel@hrosenstock-ws.xsigo.com> <20080115005045.GG16009@sashak.voltaire.com> <1200360160.8962.196.camel@hrosenstock-ws.xsigo.com> <20080115014355.GK16009@sashak.voltaire.com> <1200361233.8962.206.camel@hrosenstock-ws.xsigo.com> <20080115084700.1d46407c.weiny2@llnl.gov> Message-ID: <1200418638.8962.328.camel@hrosenstock-ws.xsigo.com> On Tue, 2008-01-15 at 08:47 -0800, Ira Weiny wrote: > On Mon, 14 Jan 2008 17:40:33 -0800 > Hal Rosenstock wrote: > > > On Tue, 2008-01-15 at 01:43 +0000, Sasha Khapyorsky wrote: > > > On 17:22 Mon 14 Jan , Hal Rosenstock wrote: > > > > On Tue, 2008-01-15 at 00:50 +0000, Sasha Khapyorsky wrote: > > > > > On 16:05 Mon 14 Jan , Hal Rosenstock wrote: > > > > > > On Mon, 2008-01-14 at 15:35 -0800, Ira Weiny wrote: > > > > > > > On Mon, 14 Jan 2008 12:23:34 -0800 > > > > > > > Hal Rosenstock wrote: > > > > > > > > > > > > > > > On Mon, 2008-01-14 at 10:51 -0800, 
Ira Weiny wrote: > > > > > > > > > Hey Hal, thanks for the response. Comments below. > > > > > > > > > > > > > > > > > > On Mon, 14 Jan 2008 12:57:45 -0500 > > > > > > > > > "Hal Rosenstock" wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > > > Didn't you just change that in that many MGIDs go to one MLID ? > > > > > > > > > > > > > > Ah, this is where the confusion has been. No, this is _not_ what I did... I > > > > > > > see now; that is what was proposed in the thread a year ago, however, I don't > > > > > > > think mapping many MGIDs to 1 MLID will work well. > > > > > > > > > > > > Why not ? > > > > > > > > > > > > It appears to be what you did (multiple MGIDs are mapped onto MLID (in > > > > > > the case below 0xc002)). Am I mistaken ? > > > > > > > > > > As far as I understand this patch it is the different. Here multiple > > > > > ports which match ipv6 solicited node multicast address will try to > > > > > join a single MC group (with single MGID and unique MLID). > > > > > > > > I don't think you are using the IBA defined terminology. > > > > > > > > A MC group is an MGID in terms of the IBA spec. Also, the SA GetTable > > > > with MGIDs wildcarded shows all the MGIDs. (Does it show that "special" > > > > MGID ?) > > > > > > Yes, it has MLID 0xc002 in Ira's 'saquery -g' example. > > > > No, MLID is not the group (at least in IBA terms); I was referring to > > the base SNM MGID (with partition and low 24 bits masked off). > > You are right the MLID is not the group. But the patch only creates 1 MGID as > well. I think you are confusing internal implementation with the outside view of what is going on. > I think I see where you are coming from but... > > Let me ask this. (I think I know the answer but I will ask anyway.) If you > have 3 MGID's (0xFF...1, 0xFF...2, 0xFF...3, 3 actual mgrp structures in > opensm) and you map them all to MLID 0xC001 will a message to 0xC001 reach all > 3 nodes? 
Ignoring the partitioning (and assuming the rates and MTUs are all the
same), then yes.

Let me ask you this: Do all IPv6 SNM MGIDs show up when you do SA
GetTable for groups ?

> > > > I would phrase this differently:
> > > > All IPv6 SNM groups are mapped to a single MLID (when this feature is
> > > > enabled).
> > >
> > > No, all ports join single IPv6 SNM MC group, and yes, it has single MGID
> > > (and single MLID).
> >
> > It does not have a single MGID; it has many MGIDs including the base one
> > (just look at the group dump).
>
> Are you mainly objecting to the fact that the IPoIB client nodes request
> joining of MGID 0xFF...X and we are joining them to 0xFF...Y?

No; my concerns are:
1. Your and Sasha's categorization is wrong in that many MGIDs are
mapped to a single MLID (it is MLID overloading which is allowed by the
IBA spec)
2. I think that the rate and MTU issue will rear its ugly head just like
has occurred and still occurs for the IP broadcast group but this one is
even more subtle
3. It is a spec violation in terms of the current (and previous versions
of the) IBA spec.

-- Hal

> Ira
>
> >
> > -- Hal
> >
> > > Sasha
> > >
> > > > It so happens that OpenSM internally does the accounting on
> > > > membership by treating them all as members of the same "base" or
> > > > "masked" group by masking off partition and the low 24 bits (port GUID).
> > > >
> > > > -- Hal
> > > >
> > > > > Sasha
> > > > >
> > > > > >
> > > > > > > What I did was to allow the first IPv6 request to create the group and then all
> > > > > > > other requests were added to this group.
> > > > > >
> > > > > > You are using the word group loosely here and that is the source of the
> > > > > > confusion IMO. I think by group you mean MLID.
> > > > > >
> > > > > > > This sends all the neighbor discovery messages to all nodes on the network.
> > > > > >
> > > > > > All nodes part of that MLID tree.
> > > > > >
> > > > > > > This might seem inefficient but should work. (...
and seems to.) > > > > > > > > > > > > Sure; the hosts will filter based on MGID. The tradeoff is MLID > > > > > > utilization versus fabric utilization. > > > > > > > > > > > > > > > All of the requests for this type of MGRP join are routed to > > > > > > > > > one group. Therefore, I thought the same rules for deleting the group would > > > > > > > > > apply; when all the members are gone it is removed? > > > > > > > > > > > > > > > > Yes, the group may go but not the underlying MLID as there are other > > > > > > > > groups which are sharing this. That's not what happens now. > > > > > > > > > > > > > > No, since there is only 1 group in this implementation it should work like > > > > > > > others. The first node of this "mgid type" will create the group. Others will > > > > > > > join it and will continue to use it even if the creator leaves. > > > > > > > > > > > > Are you saying all these groups appear as 1 "group" to OpenSM (as the > > > > > > real groups are masked to the same value) ? > > > > > > > > > > > > -- Hal > > > > > > > > > > > > > Does this make more sense? 
> > > > > > > > > > > > > > Ira > > > > > > > > > > > > > > > > > > > > > > > > Just to be clear, after > > > > > > > > > this patch the mgroups are: > > > > > > > > > > > > > > > > > > 09:36:40 > saquery -g > > > > > > > > > MCMemberRecord group dump: > > > > > > > > > MGID....................0xff12401bffff0000 : 0x00000000ffffffff > > > > > > > > > Mlid....................0xC000 > > > > > > > > > Mtu.....................0x84 > > > > > > > > > pkey....................0xFFFF > > > > > > > > > Rate....................0x83 > > > > > > > > > MCMemberRecord group dump: > > > > > > > > > MGID....................0xff12401bffff0000 : 0x0000000000000001 > > > > > > > > > Mlid....................0xC001 > > > > > > > > > Mtu.....................0x84 > > > > > > > > > pkey....................0xFFFF > > > > > > > > > Rate....................0x83 > > > > > > > > > MCMemberRecord group dump: > > > > > > > > > MGID....................0xff12601bffff0000 : 0x00000001ff0021e9 > > > > > > > > > Mlid....................0xC002 > > > > > > > > > Mtu.....................0x84 > > > > > > > > > pkey....................0xFFFF > > > > > > > > > Rate....................0x83 > > > > > > > > > MCMemberRecord group dump: > > > > > > > > > MGID....................0xff12601bffff0000 : 0x0000000000000001 > > > > > > > > > Mlid....................0xC003 > > > > > > > > > Mtu.....................0x84 > > > > > > > > > pkey....................0xFFFF > > > > > > > > > Rate....................0x83 > > > > > > > > > > > > > > > > > > All of these requests are added to the > > > > > > > > > MGID....................0xff12601bffff0000 : 0x00000001ff0021e9 > > > > > > > > > Mlid....................0xC002 > > > > > > > > > group. But as you say, how do we determine that the pkey, mtu, and rate are > > > > > > > > > valid? 
:-/ > > > > > > > > > > > > > > > > > > But here is a question: > > > > > > > > > > > > > > > > > > What happens if someone with an incorrect MTU tries to join the > > > > > > > > > MGID....................0xff12401bffff0000 : 0x0000000000000001 > > > > > > > > > group? Wouldn't this code return this mgrp pointer and the subsequent MTU and > > > > > > > > > rate checks fail? I seem to recall a thread discussing this before. I don't > > > > > > > > > remember what the outcome was. I seem to remember the question was if OpenSM > > > > > > > > > should create/modify a group to the "lowest common" MTU/Rate, and succeed all > > > > > > > > > the joins, vs enforcing the faster MTU/Rate and failing the joins. > > > > > > > > > > > > > > > > Yes, the join would fail, but I don't think that's what we would want. > > > > > > > > The alternative with the patch is to make it the lowest rate but there > > > > > > > > is a minimum MTU which might not be right. > > > > > > > > > > > > > > > > > > I think this is a policy and rather than this always being the case, > > > > > > > > > > there should be a policy parameter added to OpenSM for this. IMO > > > > > > > > > > default should be to not do this. > > > > > > > > > > > > > > > > > > Yes, for sure there needs to be some options to control the behavior. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Maybe more later... > > > > > > > > > > > > > > > > > > Thanks again, > > > > > > > > > Ira > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -- Hal > > > > > > > > > > > > > > > > > > > > > Thanks, > > > > > > > > > > > Ira > > > > > > > > > > > > > > > > > > > > > > [*] Again I apologize for the spam but we were in a bit of a panic as we only > > > > > > > > > > > have the big system for the weekend and IB was not part of the test... ;-) > > > > > > > > > > > > > > > > > > > > > > >From 35e35a9534bd49147886ac93ab1601acadcdbe26 Mon Sep 17 00:00:00 2001 > > > > > > > > > > > From: Ira K. 
Weiny > > > > > > > > > > > Date: Fri, 11 Jan 2008 22:58:19 -0800 > > > > > > > > > > > Subject: [PATCH] Special Case the IPv6 Solicited Node Multicast address to use a single Mcast > > > > > > > > > > > Group. > > > > > > > > > > > > > > > > > > > > > > Signed-off-by: root > > > > > > > > > > > --- > > > > > > > > > > > opensm/opensm/osm_sa_mcmember_record.c | 30 +++++++++++++++++++++++++++++- > > > > > > > > > > > opensm/opensm/osm_sa_path_record.c | 31 ++++++++++++++++++++++++++++++- > > > > > > > > > > > 2 files changed, 59 insertions(+), 2 deletions(-) > > > > > > > > > > > > > > > > > > > > > > diff --git a/opensm/opensm/osm_sa_mcmember_record.c b/opensm/opensm/osm_sa_mcmember_record.c > > > > > > > > > > > index 8eb97ad..6bcc124 100644 > > > > > > > > > > > --- a/opensm/opensm/osm_sa_mcmember_record.c > > > > > > > > > > > +++ b/opensm/opensm/osm_sa_mcmember_record.c > > > > > > > > > > > @@ -124,9 +124,37 @@ __search_mgrp_by_mgid(IN cl_map_item_t * const p_map_item, IN void *context) > > > > > > > > > > > /* compare entire MGID so different scope will not sneak in for > > > > > > > > > > > the same MGID */ > > > > > > > > > > > if (memcmp(&p_mgrp->mcmember_rec.mgid, > > > > > > > > > > > - &p_recvd_mcmember_rec->mgid, sizeof(ib_gid_t))) > > > > > > > > > > > + &p_recvd_mcmember_rec->mgid, sizeof(ib_gid_t))) { > > > > > > > > > > > + > > > > > > > > > > > + /* Special Case IPV6 Multicast Loopback addresses */ > > > > > > > > > > > + /* 0xff12601bffff0000 : 0x00000001ffXXXXXX */ > > > > > > > > > > > +#define SPEC_PREFIX (0xff12601bffff0000) > > > > > > > > > > > +#define INT_ID_MASK (0x00000001ff000000) > > > > > > > > > > > + uint64_t g_prefix = cl_ntoh64(p_mgrp->mcmember_rec.mgid.unicast.prefix); > > > > > > > > > > > + uint64_t g_interface_id = cl_ntoh64(p_mgrp->mcmember_rec.mgid.unicast.interface_id); > > > > > > > > > > > + uint64_t rcv_prefix = cl_ntoh64(p_recvd_mcmember_rec->mgid.unicast.prefix); > > > > > > > > > > > + uint64_t rcv_interface_id = 
cl_ntoh64(p_recvd_mcmember_rec->mgid.unicast.interface_id); > > > > > > > > > > > + > > > > > > > > > > > + if (rcv_prefix == SPEC_PREFIX > > > > > > > > > > > + && > > > > > > > > > > > + (rcv_interface_id & INT_ID_MASK) == INT_ID_MASK) { > > > > > > > > > > > + > > > > > > > > > > > + if ((g_prefix == rcv_prefix) > > > > > > > > > > > + && > > > > > > > > > > > + (g_interface_id & INT_ID_MASK) == > > > > > > > > > > > + (rcv_interface_id & INT_ID_MASK) > > > > > > > > > > > + ) { > > > > > > > > > > > + osm_log(sa->p_log, OSM_LOG_INFO, > > > > > > > > > > > + "Special Case Mcast Join for MGID " > > > > > > > > > > > + " MGID 0x%016"PRIx64" : 0x%016"PRIx64"\n", > > > > > > > > > > > + rcv_prefix, rcv_interface_id); > > > > > > > > > > > + goto match; > > > > > > > > > > > + } > > > > > > > > > > > + } > > > > > > > > > > > return; > > > > > > > > > > > + } > > > > > > > > > > > > > > > > > > > > > > +match: > > > > > > > > > > > if (p_ctxt->p_mgrp) { > > > > > > > > > > > osm_log(sa->p_log, OSM_LOG_ERROR, > > > > > > > > > > > "__search_mgrp_by_mgid: ERR 1B03: " > > > > > > > > > > > diff --git a/opensm/opensm/osm_sa_path_record.c b/opensm/opensm/osm_sa_path_record.c > > > > > > > > > > > index 749a936..469773a 100644 > > > > > > > > > > > --- a/opensm/opensm/osm_sa_path_record.c > > > > > > > > > > > +++ b/opensm/opensm/osm_sa_path_record.c > > > > > > > > > > > @@ -1536,8 +1536,37 @@ __search_mgrp_by_mgid(IN cl_map_item_t * const p_map_item, IN void *context) > > > > > > > > > > > > > > > > > > > > > > /* compare entire MGID so different scope will not sneak in for > > > > > > > > > > > the same MGID */ > > > > > > > > > > > - if (memcmp(&p_mgrp->mcmember_rec.mgid, p_recvd_mgid, sizeof(ib_gid_t))) > > > > > > > > > > > + if (memcmp(&p_mgrp->mcmember_rec.mgid, p_recvd_mgid, sizeof(ib_gid_t))) { > > > > > > > > > > > + > > > > > > > > > > > + /* Special Case IPV6 Multicast Loopback addresses */ > > > > > > > > > > > + /* 0xff12601bffff0000 : 0x00000001ffXXXXXX */ 
> > > > > > > > > > > +#define SPEC_PREFIX (0xff12601bffff0000) > > > > > > > > > > > +#define INT_ID_MASK (0x00000001ff000000) > > > > > > > > > > > + uint64_t g_prefix = cl_ntoh64(p_mgrp->mcmember_rec.mgid.unicast.prefix); > > > > > > > > > > > + uint64_t g_interface_id = cl_ntoh64(p_mgrp->mcmember_rec.mgid.unicast.interface_id); > > > > > > > > > > > + uint64_t rcv_prefix = cl_ntoh64(p_recvd_mgid->unicast.prefix); > > > > > > > > > > > + uint64_t rcv_interface_id = cl_ntoh64(p_recvd_mgid->unicast.interface_id); > > > > > > > > > > > + > > > > > > > > > > > + if (rcv_prefix == SPEC_PREFIX > > > > > > > > > > > + && > > > > > > > > > > > + (rcv_interface_id & INT_ID_MASK) == INT_ID_MASK) { > > > > > > > > > > > + > > > > > > > > > > > + if ((g_prefix == rcv_prefix) > > > > > > > > > > > + && > > > > > > > > > > > + (g_interface_id & INT_ID_MASK) == > > > > > > > > > > > + (rcv_interface_id & INT_ID_MASK) > > > > > > > > > > > + ) { > > > > > > > > > > > + osm_log(sa->p_log, OSM_LOG_INFO, > > > > > > > > > > > + "Special Case Mcast Join for MGID " > > > > > > > > > > > + " MGID 0x%016"PRIx64" : 0x%016"PRIx64"\n", > > > > > > > > > > > + rcv_prefix, rcv_interface_id); > > > > > > > > > > > + goto match; > > > > > > > > > > > + } > > > > > > > > > > > + } > > > > > > > > > > > return; > > > > > > > > > > > + } > > > > > > > > > > > + > > > > > > > > > > > +match: > > > > > > > > > > > > > > > > > > > > > > #if 0 > > > > > > > > > > > for (i = 0; > > > > > > > > > > > -- > > > > > > > > > > > 1.5.1 > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Fri, 11 Jan 2008 22:04:56 -0800 > > > > > > > > > > > Ira Weiny wrote: > > > > > > > > > > > > > > > > > > > > > > > Ok, > > > > > > > > > > > > > > > > > > > > > > > > I found my own answer. Sorry for the spam. 
> > > > > > > > > > > > > > > > > > > > > > > > http://lists.openfabrics.org/pipermail/general/2006-November/029617.html > > > > > > > > > > > > > > > > > > > > > > > > Sorry, > > > > > > > > > > > > Ira > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Fri, 11 Jan 2008 19:36:57 -0800 > > > > > > > > > > > > Ira Weiny wrote: > > > > > > > > > > > > > > > > > > > > > > > > > I don't really understand the innerworkings of IPoIB so forgive me if this is a > > > > > > > > > > > > > really stupid question but: > > > > > > > > > > > > > > > > > > > > > > > > > > Is it a bug that there is a Multicast group created for every node in our > > > > > > > > > > > > > clusters? > > > > > > > > > > > > > > > > > > > > > > > > > > If not a bug why is this done? We just tried to boot on a 1151 node cluster > > > > > > > > > > > > > and opensm is complaining there are not enough multicast groups. > > > > > > > > > > > > > > > > > > > > > > > > > > Jan 11 18:30:42 728984 [40C05960] -> __get_new_mlid: ERR 1B23: All available:1024 mlids are taken > > > > > > > > > > > > > Jan 11 18:30:42 729050 [40C05960] -> osm_mcmr_rcv_create_new_mgrp: ERR 1B19: __get_new_mlid failed > > > > > > > > > > > > > Jan 11 18:30:42 730647 [40401960] -> __get_new_mlid: ERR 1B23: All available:1024 mlids are taken > > > > > > > > > > > > > Jan 11 18:30:42 730691 [40401960] -> osm_mcmr_rcv_create_new_mgrp: ERR 1B19: __get_new_mlid failed > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Here is the output from my small test cluster: (ibnodesinmcast uses saquery a > > > > > > > > > > > > > couple of times to print this nice report.) 
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > 19:17:24 > whatsup > > > > > > > > > > > > > up: 9: wopr[0-7],wopri > > > > > > > > > > > > > down: 0: > > > > > > > > > > > > > root at wopri:/tftpboot/images > > > > > > > > > > > > > 19:25:03 > ibnodesinmcast -g > > > > > > > > > > > > > 0xC000 (0xff12401bffff0000 : 0x00000000ffffffff) > > > > > > > > > > > > > In 9: wopr[0-7],wopri > > > > > > > > > > > > > Out 0: 0 > > > > > > > > > > > > > 0xC001 (0xff12401bffff0000 : 0x0000000000000001) > > > > > > > > > > > > > In 9: wopr[0-7],wopri > > > > > > > > > > > > > Out 0: 0 > > > > > > > > > > > > > 0xC002 (0xff12601bffff0000 : 0x00000001ff2265ed) > > > > > > > > > > > > > In 1: wopr3 > > > > > > > > > > > > > Out 8: wopr[0-2,4-7],wopri > > > > > > > > > > > > > 0xC003 (0xff12601bffff0000 : 0x0000000000000001) > > > > > > > > > > > > > In 9: wopr[0-7],wopri > > > > > > > > > > > > > Out 0: 0 > > > > > > > > > > > > > 0xC004 (0xff12601bffff0000 : 0x00000001ff222729) > > > > > > > > > > > > > In 1: wopr4 > > > > > > > > > > > > > Out 8: wopr[0-3,5-7],wopri > > > > > > > > > > > > > 0xC005 (0xff12601bffff0000 : 0x00000001ff219e65) > > > > > > > > > > > > > In 1: wopri > > > > > > > > > > > > > Out 8: wopr[0-7] > > > > > > > > > > > > > 0xC006 (0xff12601bffff0000 : 0x00000001ff00232d) > > > > > > > > > > > > > In 1: wopr6 > > > > > > > > > > > > > Out 8: wopr[0-5,7],wopri > > > > > > > > > > > > > 0xC007 (0xff12601bffff0000 : 0x00000001ff002325) > > > > > > > > > > > > > In 1: wopr7 > > > > > > > > > > > > > Out 8: wopr[0-6],wopri > > > > > > > > > > > > > 0xC008 (0xff12601bffff0000 : 0x00000001ff228d35) > > > > > > > > > > > > > In 1: wopr1 > > > > > > > > > > > > > Out 8: wopr[0,2-7],wopri > > > > > > > > > > > > > 0xC009 (0xff12601bffff0000 : 0x00000001ff2227f1) > > > > > > > > > > > > > In 1: wopr2 > > > > > > > > > > > > > Out 8: wopr[0-1,3-7],wopri > > > > > > > > > > > > > 0xC00A (0xff12601bffff0000 : 0x00000001ff219ef1) > > > > > > > 
> > > > > > In 1: wopr0 > > > > > > > > > > > > > Out 8: wopr[1-7],wopri > > > > > > > > > > > > > 0xC00B (0xff12601bffff0000 : 0x00000001ff0021e9) > > > > > > > > > > > > > In 1: wopr5 > > > > > > > > > > > > > Out 8: wopr[0-4,6-7],wopri > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Each of these MGIDS of the prefix (0xff12601bffff0000) have just one node in > > > > > > > > > > > > > them and represent an ipv6 address. Could you turn off ipv6 with the latest > > > > > > > > > > > > > IPoIB? > > > > > > > > > > > > > > > > > > > > > > > > > > In a bind, > > > > > > > > > > > > > Ira > > > > > > > > > > > > > _______________________________________________ > > > > > > > > > > > > > general mailing list > > > > > > > > > > > > > general at lists.openfabrics.org > > > > > > > > > > > > > http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general > > > > > > > > > > > > > > > > > > > > > > > > > > To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general > > > > > > > > > > > > > > > > > > > > > > _______________________________________________ > > > > > > > > > > > general mailing list > > > > > > > > > > > general at lists.openfabrics.org > > > > > > > > > > > http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general > > > > > > > > > > > > > > > > > > > > > > To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > _______________________________________________ > > > > > > > > > general mailing list > > > > > > > > > general at lists.openfabrics.org > > > > > > > > > http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general > > > > > > > > > > > > > > > > > > To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general > > > > > > _______________________________________________ > > > > > > general mailing list > > > > > > general at lists.openfabrics.org > > > > > > 
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general > > > > > > > > > > > > To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general > > > _______________________________________________ > > > general mailing list > > > general at lists.openfabrics.org > > > http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general > > > > > > To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general > > _______________________________________________ > > general mailing list > > general at lists.openfabrics.org > > http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general > > > > To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general From weiny2 at llnl.gov Tue Jan 15 09:37:23 2008 From: weiny2 at llnl.gov (Ira Weiny) Date: Tue, 15 Jan 2008 09:37:23 -0800 Subject: [ofa-general] [PATCH 3/3] Add option to Special Case the IPv6 Solicited Node Multicast address into a single Mcast Group In-Reply-To: <1200364303.8962.211.camel@hrosenstock-ws.xsigo.com> References: <20080114114530.22b15f58.weiny2@llnl.gov> <1200364303.8962.211.camel@hrosenstock-ws.xsigo.com> Message-ID: <20080115093723.09a688fd.weiny2@llnl.gov> On Mon, 14 Jan 2008 18:31:43 -0800 Hal Rosenstock wrote: > On Mon, 2008-01-14 at 11:45 -0800, Ira Weiny wrote: > > >From a1d38895e7e34e9fec297b1dbdb0637ed858d6f0 Mon Sep 17 00:00:00 2001 > > From: Ira K. Weiny > > Date: Sun, 13 Jan 2008 16:03:31 -0800 > > Subject: [PATCH] Add option to Special Case the IPv6 Solicited Node Multicast address into a single Mcast Group > > > > > > Signed-off-by: Ira K. 
Weiny > > --- > > opensm/include/opensm/osm_subnet.h | 1 + > > opensm/man/opensm.8 | 4 +++ > > opensm/opensm/main.c | 4 +++ > > opensm/opensm/osm_sa_mcmember_record.c | 35 +++++++++++++++++++++++++++++++- > > opensm/opensm/osm_subnet.c | 9 ++++++++ > > 5 files changed, 52 insertions(+), 1 deletions(-) > > > > diff --git a/opensm/include/opensm/osm_subnet.h b/opensm/include/opensm/osm_subnet.h > > index 2a28045..558b34e 100644 > > --- a/opensm/include/opensm/osm_subnet.h > > +++ b/opensm/include/opensm/osm_subnet.h > > @@ -283,6 +283,7 @@ typedef struct _osm_subn_opt { > > char *event_plugin_name; > > char *node_name_map_name; > > char *prefix_routes_file; > > + boolean_t consolodate_ipv6_snm_req; > > Nit: in all of the this, consolodate -> consolidate Oops... I will fix... :-) > > > } osm_subn_opt_t; > > /* > > * FIELDS > > diff --git a/opensm/man/opensm.8 b/opensm/man/opensm.8 > > index 475eeec..9c7b371 100644 > > --- a/opensm/man/opensm.8 > > +++ b/opensm/man/opensm.8 > > @@ -239,6 +239,10 @@ Specify the sweep time for the performance manager in seconds > > (default is 180 seconds). Only takes > > effect if --enable-perfmgr was specified at configure time. > > .TP > > +.BI --consolodate_ipv6_snm_reqests > > +Consolodate IPv6 Solicited Node Multicast group joins into 1 IB multicast > > +group. > > +.TP > > \fB\-v\fR, \fB\-\-verbose\fR > > This option increases the log verbosity level. 
> > The -v option may be specified multiple times > > diff --git a/opensm/opensm/main.c b/opensm/opensm/main.c > > index 4d0d51d..a84f6c2 100644 > > --- a/opensm/opensm/main.c > > +++ b/opensm/opensm/main.c > > @@ -615,6 +615,7 @@ int main(int argc, char *argv[]) > > {"perfmgr_sweep_time_s", 1, NULL, 2}, > > #endif > > {"prefix_routes_file", 1, NULL, 3}, > > + {"consolodate_ipv6_snm_reqests", 0, NULL, 4}, > > {NULL, 0, NULL, 0} /* Required at the end of the array */ > > }; > > > > @@ -916,6 +917,9 @@ int main(int argc, char *argv[]) > > case 3: > > opt.prefix_routes_file = optarg; > > break; > > + case 4: > > + opt.consolodate_ipv6_snm_req = TRUE; > > + break; > > case 'h': > > case '?': > > case ':': > > diff --git a/opensm/opensm/osm_sa_mcmember_record.c b/opensm/opensm/osm_sa_mcmember_record.c > > index d37a655..bfa5d2d 100644 > > --- a/opensm/opensm/osm_sa_mcmember_record.c > > +++ b/opensm/opensm/osm_sa_mcmember_record.c > > @@ -1167,9 +1167,42 @@ __search_mgrp_by_mgid(IN cl_map_item_t * const p_map_item, IN void *context) > > > > /* compare entire MGID so different scope will not sneak in for > > the same MGID */ > > - if (memcmp(&p_mgrp->mcmember_rec.mgid, p_recvd_mgid, sizeof(ib_gid_t))) > > + if (memcmp(&p_mgrp->mcmember_rec.mgid, p_recvd_mgid, sizeof(ib_gid_t))) { > > + > > + if (sa->p_subn->opt.consolodate_ipv6_snm_req) { > > + /* Special Case IPV6 Multicast Loopback addresses */ > > + /* 0xff12601bXXXX0000 : 0x00000001ffYYYYYY */ > > + /* Where XXXX is the partition and YYYYYY is the last 24 bits > > + * of the port guid */ > > Masking off the partition is counter to IBA 1.2.1 vol 1 p. 151 10) which > states: > > "When a multicast LID is overloaded, the multicast groups > sharing the same MLID must have the same P_Key. This simplification > is required to allow switches and routers that implement optional > P_Key enforcement for multicast operations." I believe this will create a different group (MGID, group, and MLID) for each partition which comes in. 
The first check ignores the partition to see if the MGID is "special". But the second. "if ((g_prefix == rcv_prefix)" will separate out the partitions. I should test this. Ira > > -- Hal > > > +#define PREFIX_MASK (0xff12601b00000000) > > +#define INT_ID_MASK (0x00000001ff000000) > > + uint64_t g_prefix = cl_ntoh64(p_mgrp->mcmember_rec.mgid.unicast.prefix); > > + uint64_t g_interface_id = cl_ntoh64(p_mgrp->mcmember_rec.mgid.unicast.interface_id); > > + uint64_t rcv_prefix = cl_ntoh64(p_recvd_mgid->unicast.prefix); > > + uint64_t rcv_interface_id = cl_ntoh64(p_recvd_mgid->unicast.interface_id); > > + > > + if (((rcv_prefix & PREFIX_MASK) == PREFIX_MASK) > > + && > > + (rcv_interface_id & INT_ID_MASK) == INT_ID_MASK) { > > + > > + if ((g_prefix == rcv_prefix) > > + && > > + (g_interface_id & INT_ID_MASK) == > > + (rcv_interface_id & INT_ID_MASK) > > + ) { > > + osm_log(sa->p_log, OSM_LOG_INFO, > > + "Special Case Mcast Join for MGID " > > + " MGID 0x%016"PRIx64" : 0x%016"PRIx64"\n", > > + rcv_prefix, rcv_interface_id); > > + goto match; > > + } > > + } > > + } > > + > > return; > > + } > > > > +match: > > if (p_ctxt->p_mgrp) { > > osm_log(sa->p_log, OSM_LOG_ERROR, > > "__search_mgrp_by_mgid: ERR 1F08: " > > diff --git a/opensm/opensm/osm_subnet.c b/opensm/opensm/osm_subnet.c > > index 0103940..558ea68 100644 > > --- a/opensm/opensm/osm_subnet.c > > +++ b/opensm/opensm/osm_subnet.c > > @@ -481,6 +481,7 @@ void osm_subn_set_default_opt(IN osm_subn_opt_t * const p_opt) > > p_opt->enable_quirks = FALSE; > > p_opt->no_clients_rereg = FALSE; > > p_opt->prefix_routes_file = OSM_DEFAULT_PREFIX_ROUTES_FILE; > > + p_opt->consolodate_ipv6_snm_req = FALSE; > > subn_set_default_qos_options(&p_opt->qos_options); > > subn_set_default_qos_options(&p_opt->qos_ca_options); > > subn_set_default_qos_options(&p_opt->qos_sw0_options); > > @@ -1394,6 +1395,9 @@ ib_api_status_t osm_subn_parse_conf_file(IN osm_subn_opt_t * const p_opts) > > > > opts_unpack_charp("prefix_routes_file", > > 
p_key, p_val, &p_opts->prefix_routes_file); > > + > > + opts_unpack_boolean("consolodate_ipv6_snm_req", > > + p_key, p_val, &p_opts->consolodate_ipv6_snm_req); > > } > > fclose(opts_file); > > > > @@ -1721,6 +1725,11 @@ ib_api_status_t osm_subn_write_conf_file(IN osm_subn_opt_t * const p_opts) > > "prefix_routes_file %s\n\n", > > p_opts->prefix_routes_file); > > > > + fprintf(opts_file, > > + "#\n# IPv6 MCast Options\n#\n" > > + "consolodate_ipv6_snm_req %s\n\n", > > + p_opts->consolodate_ipv6_snm_req ? "TRUE" : "FALSE"); > > + > > /* optional string attributes ... */ > > > > fclose(opts_file); > > _______________________________________________ > > general mailing list > > general at lists.openfabrics.org > > http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general > > > > To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general From hrosenstock at xsigo.com Tue Jan 15 09:42:40 2008 From: hrosenstock at xsigo.com (Hal Rosenstock) Date: Tue, 15 Jan 2008 09:42:40 -0800 Subject: [ofa-general] [PATCH 3/3] Add option to Special Case the IPv6 Solicited Node Multicast address into a single Mcast Group In-Reply-To: <20080115093723.09a688fd.weiny2@llnl.gov> References: <20080114114530.22b15f58.weiny2@llnl.gov> <1200364303.8962.211.camel@hrosenstock-ws.xsigo.com> <20080115093723.09a688fd.weiny2@llnl.gov> Message-ID: <1200418960.8962.334.camel@hrosenstock-ws.xsigo.com> On Tue, 2008-01-15 at 09:37 -0800, Ira Weiny wrote: > On Mon, 14 Jan 2008 18:31:43 -0800 > Hal Rosenstock wrote: > > > On Mon, 2008-01-14 at 11:45 -0800, Ira Weiny wrote: > > > >From a1d38895e7e34e9fec297b1dbdb0637ed858d6f0 Mon Sep 17 00:00:00 2001 > > > From: Ira K. Weiny > > > Date: Sun, 13 Jan 2008 16:03:31 -0800 > > > Subject: [PATCH] Add option to Special Case the IPv6 Solicited Node Multicast address into a single Mcast Group > > > > > > > > > Signed-off-by: Ira K. 
Weiny > > > --- > > > opensm/include/opensm/osm_subnet.h | 1 + > > > opensm/man/opensm.8 | 4 +++ > > > opensm/opensm/main.c | 4 +++ > > > opensm/opensm/osm_sa_mcmember_record.c | 35 +++++++++++++++++++++++++++++++- > > > opensm/opensm/osm_subnet.c | 9 ++++++++ > > > 5 files changed, 52 insertions(+), 1 deletions(-) > > > > > > diff --git a/opensm/include/opensm/osm_subnet.h b/opensm/include/opensm/osm_subnet.h > > > index 2a28045..558b34e 100644 > > > --- a/opensm/include/opensm/osm_subnet.h > > > +++ b/opensm/include/opensm/osm_subnet.h > > > @@ -283,6 +283,7 @@ typedef struct _osm_subn_opt { > > > char *event_plugin_name; > > > char *node_name_map_name; > > > char *prefix_routes_file; > > > + boolean_t consolodate_ipv6_snm_req; > > > > Nit: in all of the this, consolodate -> consolidate > > Oops... I will fix... :-) > > > > > > } osm_subn_opt_t; > > > /* > > > * FIELDS > > > diff --git a/opensm/man/opensm.8 b/opensm/man/opensm.8 > > > index 475eeec..9c7b371 100644 > > > --- a/opensm/man/opensm.8 > > > +++ b/opensm/man/opensm.8 > > > @@ -239,6 +239,10 @@ Specify the sweep time for the performance manager in seconds > > > (default is 180 seconds). Only takes > > > effect if --enable-perfmgr was specified at configure time. > > > .TP > > > +.BI --consolodate_ipv6_snm_reqests > > > +Consolodate IPv6 Solicited Node Multicast group joins into 1 IB multicast > > > +group. > > > +.TP > > > \fB\-v\fR, \fB\-\-verbose\fR > > > This option increases the log verbosity level. 
> > > The -v option may be specified multiple times > > > diff --git a/opensm/opensm/main.c b/opensm/opensm/main.c > > > index 4d0d51d..a84f6c2 100644 > > > --- a/opensm/opensm/main.c > > > +++ b/opensm/opensm/main.c > > > @@ -615,6 +615,7 @@ int main(int argc, char *argv[]) > > > {"perfmgr_sweep_time_s", 1, NULL, 2}, > > > #endif > > > {"prefix_routes_file", 1, NULL, 3}, > > > + {"consolodate_ipv6_snm_reqests", 0, NULL, 4}, > > > {NULL, 0, NULL, 0} /* Required at the end of the array */ > > > }; > > > > > > @@ -916,6 +917,9 @@ int main(int argc, char *argv[]) > > > case 3: > > > opt.prefix_routes_file = optarg; > > > break; > > > + case 4: > > > + opt.consolodate_ipv6_snm_req = TRUE; > > > + break; > > > case 'h': > > > case '?': > > > case ':': > > > diff --git a/opensm/opensm/osm_sa_mcmember_record.c b/opensm/opensm/osm_sa_mcmember_record.c > > > index d37a655..bfa5d2d 100644 > > > --- a/opensm/opensm/osm_sa_mcmember_record.c > > > +++ b/opensm/opensm/osm_sa_mcmember_record.c > > > @@ -1167,9 +1167,42 @@ __search_mgrp_by_mgid(IN cl_map_item_t * const p_map_item, IN void *context) > > > > > > /* compare entire MGID so different scope will not sneak in for > > > the same MGID */ > > > - if (memcmp(&p_mgrp->mcmember_rec.mgid, p_recvd_mgid, sizeof(ib_gid_t))) > > > + if (memcmp(&p_mgrp->mcmember_rec.mgid, p_recvd_mgid, sizeof(ib_gid_t))) { > > > + > > > + if (sa->p_subn->opt.consolodate_ipv6_snm_req) { > > > + /* Special Case IPV6 Multicast Loopback addresses */ > > > + /* 0xff12601bXXXX0000 : 0x00000001ffYYYYYY */ > > > + /* Where XXXX is the partition and YYYYYY is the last 24 bits > > > + * of the port guid */ > > > > Masking off the partition is counter to IBA 1.2.1 vol 1 p. 151 10) which > > states: > > > > "When a multicast LID is overloaded, the multicast groups > > sharing the same MLID must have the same P_Key. This simplification > > is required to allow switches and routers that implement optional > > P_Key enforcement for multicast operations." 
> > I believe this will create a different group (MGID, group, and MLID) for each > partition which comes in. The first check ignores the partition to see if the > MGID is "special". But the second. "if ((g_prefix == rcv_prefix)" will > separate out the partitions. Even if it doesn't, making it do so is a better approach and removes this as an issue as far as I am concerned. -- Hal > I should test this. > > Ira > > > > > -- Hal > > > > > +#define PREFIX_MASK (0xff12601b00000000) > > > +#define INT_ID_MASK (0x00000001ff000000) > > > + uint64_t g_prefix = cl_ntoh64(p_mgrp->mcmember_rec.mgid.unicast.prefix); > > > + uint64_t g_interface_id = cl_ntoh64(p_mgrp->mcmember_rec.mgid.unicast.interface_id); > > > + uint64_t rcv_prefix = cl_ntoh64(p_recvd_mgid->unicast.prefix); > > > + uint64_t rcv_interface_id = cl_ntoh64(p_recvd_mgid->unicast.interface_id); > > > + > > > + if (((rcv_prefix & PREFIX_MASK) == PREFIX_MASK) > > > + && > > > + (rcv_interface_id & INT_ID_MASK) == INT_ID_MASK) { > > > + > > > + if ((g_prefix == rcv_prefix) > > > + && > > > + (g_interface_id & INT_ID_MASK) == > > > + (rcv_interface_id & INT_ID_MASK) > > > + ) { > > > + osm_log(sa->p_log, OSM_LOG_INFO, > > > + "Special Case Mcast Join for MGID " > > > + " MGID 0x%016"PRIx64" : 0x%016"PRIx64"\n", > > > + rcv_prefix, rcv_interface_id); > > > + goto match; > > > + } > > > + } > > > + } > > > + > > > return; > > > + } > > > > > > +match: > > > if (p_ctxt->p_mgrp) { > > > osm_log(sa->p_log, OSM_LOG_ERROR, > > > "__search_mgrp_by_mgid: ERR 1F08: " > > > diff --git a/opensm/opensm/osm_subnet.c b/opensm/opensm/osm_subnet.c > > > index 0103940..558ea68 100644 > > > --- a/opensm/opensm/osm_subnet.c > > > +++ b/opensm/opensm/osm_subnet.c > > > @@ -481,6 +481,7 @@ void osm_subn_set_default_opt(IN osm_subn_opt_t * const p_opt) > > > p_opt->enable_quirks = FALSE; > > > p_opt->no_clients_rereg = FALSE; > > > p_opt->prefix_routes_file = OSM_DEFAULT_PREFIX_ROUTES_FILE; > > > + p_opt->consolodate_ipv6_snm_req = FALSE; 
> > > subn_set_default_qos_options(&p_opt->qos_options); > > > subn_set_default_qos_options(&p_opt->qos_ca_options); > > > subn_set_default_qos_options(&p_opt->qos_sw0_options); > > > @@ -1394,6 +1395,9 @@ ib_api_status_t osm_subn_parse_conf_file(IN osm_subn_opt_t * const p_opts) > > > > > > opts_unpack_charp("prefix_routes_file", > > > p_key, p_val, &p_opts->prefix_routes_file); > > > + > > > + opts_unpack_boolean("consolodate_ipv6_snm_req", > > > + p_key, p_val, &p_opts->consolodate_ipv6_snm_req); > > > } > > > fclose(opts_file); > > > > > > @@ -1721,6 +1725,11 @@ ib_api_status_t osm_subn_write_conf_file(IN osm_subn_opt_t * const p_opts) > > > "prefix_routes_file %s\n\n", > > > p_opts->prefix_routes_file); > > > > > > + fprintf(opts_file, > > > + "#\n# IPv6 MCast Options\n#\n" > > > + "consolodate_ipv6_snm_req %s\n\n", > > > + p_opts->consolodate_ipv6_snm_req ? "TRUE" : "FALSE"); > > > + > > > /* optional string attributes ... */ > > > > > > fclose(opts_file); > > > _______________________________________________ > > > general mailing list > > > general at lists.openfabrics.org > > > http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general > > > > > > To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general From sashak at voltaire.com Tue Jan 15 10:21:42 2008 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Tue, 15 Jan 2008 18:21:42 +0000 Subject: [ofa-general] [PATCH] opensm/perfmgr: use pkey at index 0 In-Reply-To: <20080113193559.GH10650@sashak.voltaire.com> References: <200801061801.25386.dotanb@dev.mellanox.co.il> <4781F18F.1070506@voltaire.com> <4781FB41.6040204@dev.mellanox.co.il> <4782160B.1080709@dev.mellanox.co.il>
<478217A6.80307@dev.mellanox.co.il> <47821FF3.7020705@dev.mellanox.co.il> <1199718187.20870.102.camel@hrosenstock-ws.xsigo.com> <20080113193435.GG10650@sashak.voltaire.com> <20080113193559.GH10650@sashak.voltaire.com>
Message-ID: <20080115182142.GC16009@sashak.voltaire.com>

Use pkey at index 0 of port's pkey table, now this value is passed to
user_mad. For some reason the old code (where 0xffff was passed) worked
too.

Signed-off-by: Sasha Khapyorsky
---
 opensm/opensm/osm_perfmgr.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/opensm/opensm/osm_perfmgr.c b/opensm/opensm/osm_perfmgr.c
index 76ef080..66d919d 100644
--- a/opensm/opensm/osm_perfmgr.c
+++ b/opensm/opensm/osm_perfmgr.c
@@ -399,7 +399,7 @@ osm_perfmgr_send_pc_mad(osm_perfmgr_t * perfmgr, ib_net16_t dest_lid,
 	p_madw->mad_addr.addr_type.gsi.remote_qkey =
 	    cl_hton32(IB_QP1_WELL_KNOWN_Q_KEY);
 	/* FIXME what about other partitions */
-	p_madw->mad_addr.addr_type.gsi.pkey = cl_hton16(0xFFFF);
+	p_madw->mad_addr.addr_type.gsi.pkey = 0;
 	p_madw->mad_addr.addr_type.gsi.service_level = 0;
 	p_madw->mad_addr.addr_type.gsi.global_route = FALSE;
 	p_madw->resp_expected = TRUE;
--
1.5.4.rc2.60.gb2e62

From sean.hefty at intel.com Tue Jan 15 10:03:27 2008
From: sean.hefty at intel.com (Sean Hefty)
Date: Tue, 15 Jan 2008 10:03:27 -0800
Subject: [ofa-general] OFED Jan 14 meeting summary on RC2 readiness
In-Reply-To: <6C2C79E72C305246B504CBA17B5500C9031E5E56@mtlexch01.mtl.com>
References: <6C2C79E72C305246B504CBA17B5500C9031E5E56@mtlexch01.mtl.com>
Message-ID: <000801c857a0$ee083bc0$a937170a@amr.corp.intel.com>

>Meeting decisions:
>------------------
>* RC2 will be released on Wednesday
>* XRC will be added after this - meaning we must have RC4

If XRC hasn't been added, then this isn't an 'RC', it's simply an alpha
release, with another alpha release to follow. IMO, XRC should simply be
pushed out to the next release. I don't see the reason to delay the
release for a non-standard feature that likely has only 1-2 users.

Maybe the EWG needs to ask themselves if OFED is trying to distribute
production ready, enterprise tested software, or software that simply
provides the latest and greatest features.

- Sean

From sean.hefty at intel.com Tue Jan 15 10:06:18 2008
From: sean.hefty at intel.com (Sean Hefty)
Date: Tue, 15 Jan 2008 10:06:18 -0800
Subject: [ofa-general] RE: [PATCH] ib/core: Add creation flags to create QP
In-Reply-To: <6C2C79E72C305246B504CBA17B5500C9031E5D23@mtlexch01.mtl.com>
References: <1199980899.11174.91.camel@mtls03> <1200402798.11174.279.camel@mtls03> <6C2C79E72C305246B504CBA17B5500C9031E5D23@mtlexch01.mtl.com>
Message-ID: <000901c857a1$538315b0$a937170a@amr.corp.intel.com>

>- define it with an initializer (which places zero in all unspecified fields).
>
>Therefore, currently there is no problem.
>
>We might put in a note that callers should always cause unused fields of the
>ib_qp_init_attr
>structure to be zero.

This doesn't help when adding fields...
From sashak at voltaire.com Tue Jan 15 11:05:02 2008
From: sashak at voltaire.com (Sasha Khapyorsky)
Date: Tue, 15 Jan 2008 19:05:02 +0000
Subject: [ofa-general] [PATCH] libibumad: increase the version of the library
In-Reply-To: <20080115141013.GM16009@sashak.voltaire.com>
References: <4781F18F.1070506@voltaire.com> <4781FB41.6040204@dev.mellanox.co.il> <4782160B.1080709@dev.mellanox.co.il> <478217A6.80307@dev.mellanox.co.il> <47821FF3.7020705@dev.mellanox.co.il> <1199718187.20870.102.camel@hrosenstock-ws.xsigo.com> <20080113193435.GG10650@sashak.voltaire.com> <20080113193559.GH10650@sashak.voltaire.com> <20080115141013.GM16009@sashak.voltaire.com>
Message-ID: <20080115190502.GE16009@sashak.voltaire.com>

Signed-off-by: Sasha Khapyorsky
---
 libibumad/libibumad.ver | 2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/libibumad/libibumad.ver b/libibumad/libibumad.ver
index 7bdd1a7..21cf1ed 100644
--- a/libibumad/libibumad.ver
+++ b/libibumad/libibumad.ver
@@ -6,4 +6,4 @@
 # API_REV - advance on any added API
 # RUNNING_REV - advance any change to the vendor files
 # AGE - number of backward versions the API still supports
-LIBVERSION=1:2:0
+LIBVERSION=1:3:0
-- 
1.5.4.rc2.60.gb2e62

From sashak at voltaire.com Tue Jan 15 11:05:40 2008
From: sashak at voltaire.com (Sasha Khapyorsky)
Date: Tue, 15 Jan 2008 19:05:40 +0000
Subject: [ofa-general] [PATCH] libibumad/man: umad_get_pkey man page
In-Reply-To: <20080115190502.GE16009@sashak.voltaire.com>
References: <4781FB41.6040204@dev.mellanox.co.il> <4782160B.1080709@dev.mellanox.co.il> <478217A6.80307@dev.mellanox.co.il> <47821FF3.7020705@dev.mellanox.co.il> <1199718187.20870.102.camel@hrosenstock-ws.xsigo.com> <20080113193435.GG10650@sashak.voltaire.com> <20080113193559.GH10650@sashak.voltaire.com> <20080115141013.GM16009@sashak.voltaire.com> <20080115190502.GE16009@sashak.voltaire.com>
Message-ID: <20080115190540.GF16009@sashak.voltaire.com>

Signed-off-by: Sasha Khapyorsky
---
 libibumad/Makefile.am | 1 +
 libibumad/man/umad_get_pkey.3 | 23 +++++++++++++++++++++++
 2 files changed, 24 insertions(+), 0 deletions(-)
 create mode 100644 libibumad/man/umad_get_pkey.3

diff --git a/libibumad/Makefile.am b/libibumad/Makefile.am
index 5b8a69a..f014f57 100644
--- a/libibumad/Makefile.am
+++ b/libibumad/Makefile.am
@@ -12,6 +12,7 @@ man_MANS = man/umad_debug.3 man/umad_get_ca.3 \
 	man/umad_get_mad.3 man/umad_get_mad_addr.3 \
 	man/umad_set_grh_net.3 man/umad_set_grh.3 \
 	man/umad_set_addr_net.3 man/umad_set_addr.3 man/umad_set_pkey.3 \
+	man/umad_get_pkey.3 \
 	man/umad_register.3 man/umad_register_oui.3 man/umad_unregister.3 \
 	man/umad_send.3 man/umad_recv.3 man/umad_poll.3 \
 	man/umad_get_issm_path.3
diff --git a/libibumad/man/umad_get_pkey.3 b/libibumad/man/umad_get_pkey.3
new file mode 100644
index 0000000..a03c0a6
--- /dev/null
+++ b/libibumad/man/umad_get_pkey.3
@@ -0,0 +1,23 @@
+.\" -*- nroff -*-
+.\"
+.TH UMAD_GET_PKEY 3 "Jan 15, 2008" "OpenIB" "OpenIB Programmer\'s Manual"
+.SH "NAME"
+umad_get_pkey \- get pkey index from umad buffer
+.SH "SYNOPSIS"
+.nf
+.B #include
+.sp
+.BI "int umad_get_pkey(void " "*umad");
+.fi
+.SH "DESCRIPTION"
+.B umad_get_pkey()
+gets the pkey index from the specified
+.I umad\fR
+buffer.
+.SH "RETURN VALUE"
+.B umad_get_pkey()
+returns value of pkey index (or zero if pkey index is not supported by
+user_mad interface).
+.SH "AUTHOR"
+.TP
+Sasha Khapyorsky
-- 
1.5.4.rc2.60.gb2e62

From sashak at voltaire.com Tue Jan 15 11:20:10 2008
From: sashak at voltaire.com (Sasha Khapyorsky)
Date: Tue, 15 Jan 2008 19:20:10 +0000
Subject: [PATCH] opensm: Special Case the IPv6 Solicited Node Multicast address to use a single Mcast (WAS: Re: [ofa-general] IPoIB, OFED 1.2.5, and multicast groups.)
In-Reply-To: <1200418638.8962.328.camel@hrosenstock-ws.xsigo.com> References: <20080114105132.19eafcee.weiny2@llnl.gov> <1200342214.8962.75.camel@hrosenstock-ws.xsigo.com> <20080114153546.73d43d6b.weiny2@llnl.gov> <1200355500.8962.148.camel@hrosenstock-ws.xsigo.com> <20080115005045.GG16009@sashak.voltaire.com> <1200360160.8962.196.camel@hrosenstock-ws.xsigo.com> <20080115014355.GK16009@sashak.voltaire.com> <1200361233.8962.206.camel@hrosenstock-ws.xsigo.com> <20080115084700.1d46407c.weiny2@llnl.gov> <1200418638.8962.328.camel@hrosenstock-ws.xsigo.com> Message-ID: <20080115192010.GG16009@sashak.voltaire.com> On 09:37 Tue 15 Jan , Hal Rosenstock wrote: > On Tue, 2008-01-15 at 08:47 -0800, Ira Weiny wrote: > > On Mon, 14 Jan 2008 17:40:33 -0800 > > Hal Rosenstock wrote: > > > > > On Tue, 2008-01-15 at 01:43 +0000, Sasha Khapyorsky wrote: > > > > On 17:22 Mon 14 Jan , Hal Rosenstock wrote: > > > > > On Tue, 2008-01-15 at 00:50 +0000, Sasha Khapyorsky wrote: > > > > > > On 16:05 Mon 14 Jan , Hal Rosenstock wrote: > > > > > > > On Mon, 2008-01-14 at 15:35 -0800, Ira Weiny wrote: > > > > > > > > On Mon, 14 Jan 2008 12:23:34 -0800 > > > > > > > > Hal Rosenstock wrote: > > > > > > > > > > > > > > > > > On Mon, 2008-01-14 at 10:51 -0800, Ira Weiny wrote: > > > > > > > > > > Hey Hal, thanks for the response. Comments below. > > > > > > > > > > > > > > > > > > > > On Mon, 14 Jan 2008 12:57:45 -0500 > > > > > > > > > > "Hal Rosenstock" wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Didn't you just change that in that many MGIDs go to one MLID ? > > > > > > > > > > > > > > > > Ah, this is where the confusion has been. No, this is _not_ what I did... I > > > > > > > > see now; that is what was proposed in the thread a year ago, however, I don't > > > > > > > > think mapping many MGIDs to 1 MLID will work well. > > > > > > > > > > > > > > Why not ? 
> > > > > > > > > > > > > > It appears to be what you did (multiple MGIDs are mapped onto MLID (in > > > > > > > the case below 0xc002)). Am I mistaken ? > > > > > > > > > > > > As far as I understand this patch it is the different. Here multiple > > > > > > ports which match ipv6 solicited node multicast address will try to > > > > > > join a single MC group (with single MGID and unique MLID). > > > > > > > > > > I don't think you are using the IBA defined terminology. > > > > > > > > > > A MC group is an MGID in terms of the IBA spec. Also, the SA GetTable > > > > > with MGIDs wildcarded shows all the MGIDs. (Does it show that "special" > > > > > MGID ?) > > > > > > > > Yes, it has MLID 0xc002 in Ira's 'saquery -g' example. > > > > > > No, MLID is not the group (at least in IBA terms); I was referring to > > > the base SNM MGID (with partition and low 24 bits masked off). > > > > You are right the MLID is not the group. But the patch only creates 1 MGID as > > well. > > I think you are confusing internal implementation with the outside view > of what is going on. > > > I think I see where you are coming from but... > > > > Let me ask this. (I think I know the answer but I will ask anyway.) If you > > have 3 MGID's (0xFF...1, 0xFF...2, 0xFF...3, 3 actual mgrp structures in > > opensm) and you map them all to MLID 0xC001 will a message to 0xC001 reach all > > 3 nodes? > > Ignoring the partitioning (and assuming the rates and MTUs are all the > same), then yes. > > Let me ask you this: > Do all IPv6 SNM MGIDs show up when you do SA GetTable for groups ? Let me answer. No. Only first MGID will be shown. I guess it is wrong in terms of IBTA, but since whole feature is optional I don't think it is a disaster (rather known limitation), and ipv6 not working with big clusters is. Sasha > > > > > > I would phrase this differently: > > > > > All IPv6 SNM groups are mapped to a single MLID (when this feature is > > > > > enabled). 
> > > > > > > > No, all ports join single IPv6 SNM MC group, and yes, it has single MGID > > > > (and single MLID). > > > > > > It does not have a single MGID; it has many MGIDs including the base one > > > (just look at the group dump). > > > > Are you mainly objecting to the fact that the IPoIB client nodes request > > joining of MGID 0xFF...X and we are joining them to 0xFF...Y? > > No; my concerns are: > 1. Your's and Sasha's categorization is wrong in that many MGIDs are > mapped to a single MLID (it is MLID overloading which is allowed by the > IBA spec) > 2. I think that the rate and MTU issue will rear it's ugly head just > like has occurred and still occurs for the IP broadcast group but this > one is even more subtle > 3. It is a spec violation in terms of the current (and previous versions > of the) IBA spec. > > -- Hal > > > Ira > > > > > > > > -- Hal > > > > > > > Sasha > > > > > > > > > It so happens that OpenSM internally does the accounting on > > > > > membership by treating them all as members of the same "base" or > > > > > "masked" group by masking off partition and the low 24 bits (port GUID). > > > > > > > > > > -- Hal > > > > > > > > > > > Sasha > > > > > > > > > > > > > > > > > > > > > What I did was to allow the first IPv6 request to create the group and then all > > > > > > > > other requests were added to this group. > > > > > > > > > > > > > > You are using the word group loosely here and that is the source of the > > > > > > > confusion IMO. I think by group you mean MLID. > > > > > > > > > > > > > > > This sends all the neighbor discovery messages to all nodes on the network. > > > > > > > > > > > > > > All nodes part of that MLID tree. > > > > > > > > > > > > > > > This might seem inefficient but should work. (... and seems to.) > > > > > > > > > > > > > > Sure; the hosts will filter based on MGID. The tradeoff is MLID > > > > > > > utilization versus fabric utilization. 
> > > > > > > > > > > > > > > > > All of the requests for this type of MGRP join are routed to > > > > > > > > > > one group. Therefore, I thought the same rules for deleting the group would > > > > > > > > > > apply; when all the members are gone it is removed? > > > > > > > > > > > > > > > > > > Yes, the group may go but not the underlying MLID as there are other > > > > > > > > > groups which are sharing this. That's not what happens now. > > > > > > > > > > > > > > > > No, since there is only 1 group in this implementation it should work like > > > > > > > > others. The first node of this "mgid type" will create the group. Others will > > > > > > > > join it and will continue to use it even if the creator leaves. > > > > > > > > > > > > > > Are you saying all these groups appear as 1 "group" to OpenSM (as the > > > > > > > real groups are masked to the same value) ? > > > > > > > > > > > > > > -- Hal > > > > > > > > > > > > > > > Does this make more sense? > > > > > > > > > > > > > > > > Ira > > > > > > > > > > > > > > > > > > > > > > > > > > > Just to be clear, after > > > > > > > > > > this patch the mgroups are: > > > > > > > > > > > > > > > > > > > > 09:36:40 > saquery -g > > > > > > > > > > MCMemberRecord group dump: > > > > > > > > > > MGID....................0xff12401bffff0000 : 0x00000000ffffffff > > > > > > > > > > Mlid....................0xC000 > > > > > > > > > > Mtu.....................0x84 > > > > > > > > > > pkey....................0xFFFF > > > > > > > > > > Rate....................0x83 > > > > > > > > > > MCMemberRecord group dump: > > > > > > > > > > MGID....................0xff12401bffff0000 : 0x0000000000000001 > > > > > > > > > > Mlid....................0xC001 > > > > > > > > > > Mtu.....................0x84 > > > > > > > > > > pkey....................0xFFFF > > > > > > > > > > Rate....................0x83 > > > > > > > > > > MCMemberRecord group dump: > > > > > > > > > > MGID....................0xff12601bffff0000 : 0x00000001ff0021e9 > > > > 
> > > > > > Mlid....................0xC002 > > > > > > > > > > Mtu.....................0x84 > > > > > > > > > > pkey....................0xFFFF > > > > > > > > > > Rate....................0x83 > > > > > > > > > > MCMemberRecord group dump: > > > > > > > > > > MGID....................0xff12601bffff0000 : 0x0000000000000001 > > > > > > > > > > Mlid....................0xC003 > > > > > > > > > > Mtu.....................0x84 > > > > > > > > > > pkey....................0xFFFF > > > > > > > > > > Rate....................0x83 > > > > > > > > > > > > > > > > > > > > All of these requests are added to the > > > > > > > > > > MGID....................0xff12601bffff0000 : 0x00000001ff0021e9 > > > > > > > > > > Mlid....................0xC002 > > > > > > > > > > group. But as you say, how do we determine that the pkey, mtu, and rate are > > > > > > > > > > valid? :-/ > > > > > > > > > > > > > > > > > > > > But here is a question: > > > > > > > > > > > > > > > > > > > > What happens if someone with an incorrect MTU tries to join the > > > > > > > > > > MGID....................0xff12401bffff0000 : 0x0000000000000001 > > > > > > > > > > group? Wouldn't this code return this mgrp pointer and the subsequent MTU and > > > > > > > > > > rate checks fail? I seem to recall a thread discussing this before. I don't > > > > > > > > > > remember what the outcome was. I seem to remember the question was if OpenSM > > > > > > > > > > should create/modify a group to the "lowest common" MTU/Rate, and succeed all > > > > > > > > > > the joins, vs enforcing the faster MTU/Rate and failing the joins. > > > > > > > > > > > > > > > > > > Yes, the join would fail, but I don't think that's what we would want. > > > > > > > > > The alternative with the patch is to make it the lowest rate but there > > > > > > > > > is a minimum MTU which might not be right. 
> > > > > > > > > > > > > > > > > > > > I think this is a policy and rather than this always being the case, > > > > > > > > > > > there should be a policy parameter added to OpenSM for this. IMO > > > > > > > > > > > default should be to not do this. > > > > > > > > > > > > > > > > > > > > Yes, for sure there needs to be some options to control the behavior. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Maybe more later... > > > > > > > > > > > > > > > > > > > > Thanks again, > > > > > > > > > > Ira > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -- Hal > > > > > > > > > > > > > > > > > > > > > > > Thanks, > > > > > > > > > > > > Ira > > > > > > > > > > > > > > > > > > > > > > > > [*] Again I apologize for the spam but we were in a bit of a panic as we only > > > > > > > > > > > > have the big system for the weekend and IB was not part of the test... ;-) > > > > > > > > > > > > > > > > > > > > > > > > >From 35e35a9534bd49147886ac93ab1601acadcdbe26 Mon Sep 17 00:00:00 2001 > > > > > > > > > > > > From: Ira K. Weiny > > > > > > > > > > > > Date: Fri, 11 Jan 2008 22:58:19 -0800 > > > > > > > > > > > > Subject: [PATCH] Special Case the IPv6 Solicited Node Multicast address to use a single Mcast > > > > > > > > > > > > Group. 
> > > > > > > > > > > > > > > > > > > > > > > > Signed-off-by: root > > > > > > > > > > > > --- > > > > > > > > > > > > opensm/opensm/osm_sa_mcmember_record.c | 30 +++++++++++++++++++++++++++++- > > > > > > > > > > > > opensm/opensm/osm_sa_path_record.c | 31 ++++++++++++++++++++++++++++++- > > > > > > > > > > > > 2 files changed, 59 insertions(+), 2 deletions(-) > > > > > > > > > > > > > > > > > > > > > > > > diff --git a/opensm/opensm/osm_sa_mcmember_record.c b/opensm/opensm/osm_sa_mcmember_record.c > > > > > > > > > > > > index 8eb97ad..6bcc124 100644 > > > > > > > > > > > > --- a/opensm/opensm/osm_sa_mcmember_record.c > > > > > > > > > > > > +++ b/opensm/opensm/osm_sa_mcmember_record.c > > > > > > > > > > > > @@ -124,9 +124,37 @@ __search_mgrp_by_mgid(IN cl_map_item_t * const p_map_item, IN void *context) > > > > > > > > > > > > /* compare entire MGID so different scope will not sneak in for > > > > > > > > > > > > the same MGID */ > > > > > > > > > > > > if (memcmp(&p_mgrp->mcmember_rec.mgid, > > > > > > > > > > > > - &p_recvd_mcmember_rec->mgid, sizeof(ib_gid_t))) > > > > > > > > > > > > + &p_recvd_mcmember_rec->mgid, sizeof(ib_gid_t))) { > > > > > > > > > > > > + > > > > > > > > > > > > + /* Special Case IPV6 Multicast Loopback addresses */ > > > > > > > > > > > > + /* 0xff12601bffff0000 : 0x00000001ffXXXXXX */ > > > > > > > > > > > > +#define SPEC_PREFIX (0xff12601bffff0000) > > > > > > > > > > > > +#define INT_ID_MASK (0x00000001ff000000) > > > > > > > > > > > > + uint64_t g_prefix = cl_ntoh64(p_mgrp->mcmember_rec.mgid.unicast.prefix); > > > > > > > > > > > > + uint64_t g_interface_id = cl_ntoh64(p_mgrp->mcmember_rec.mgid.unicast.interface_id); > > > > > > > > > > > > + uint64_t rcv_prefix = cl_ntoh64(p_recvd_mcmember_rec->mgid.unicast.prefix); > > > > > > > > > > > > + uint64_t rcv_interface_id = cl_ntoh64(p_recvd_mcmember_rec->mgid.unicast.interface_id); > > > > > > > > > > > > + > > > > > > > > > > > > + if (rcv_prefix == SPEC_PREFIX > > > > > > > > > > 
> > + && > > > > > > > > > > > > + (rcv_interface_id & INT_ID_MASK) == INT_ID_MASK) { > > > > > > > > > > > > + > > > > > > > > > > > > + if ((g_prefix == rcv_prefix) > > > > > > > > > > > > + && > > > > > > > > > > > > + (g_interface_id & INT_ID_MASK) == > > > > > > > > > > > > + (rcv_interface_id & INT_ID_MASK) > > > > > > > > > > > > + ) { > > > > > > > > > > > > + osm_log(sa->p_log, OSM_LOG_INFO, > > > > > > > > > > > > + "Special Case Mcast Join for MGID " > > > > > > > > > > > > + " MGID 0x%016"PRIx64" : 0x%016"PRIx64"\n", > > > > > > > > > > > > + rcv_prefix, rcv_interface_id); > > > > > > > > > > > > + goto match; > > > > > > > > > > > > + } > > > > > > > > > > > > + } > > > > > > > > > > > > return; > > > > > > > > > > > > + } > > > > > > > > > > > > > > > > > > > > > > > > +match: > > > > > > > > > > > > if (p_ctxt->p_mgrp) { > > > > > > > > > > > > osm_log(sa->p_log, OSM_LOG_ERROR, > > > > > > > > > > > > "__search_mgrp_by_mgid: ERR 1B03: " > > > > > > > > > > > > diff --git a/opensm/opensm/osm_sa_path_record.c b/opensm/opensm/osm_sa_path_record.c > > > > > > > > > > > > index 749a936..469773a 100644 > > > > > > > > > > > > --- a/opensm/opensm/osm_sa_path_record.c > > > > > > > > > > > > +++ b/opensm/opensm/osm_sa_path_record.c > > > > > > > > > > > > @@ -1536,8 +1536,37 @@ __search_mgrp_by_mgid(IN cl_map_item_t * const p_map_item, IN void *context) > > > > > > > > > > > > > > > > > > > > > > > > /* compare entire MGID so different scope will not sneak in for > > > > > > > > > > > > the same MGID */ > > > > > > > > > > > > - if (memcmp(&p_mgrp->mcmember_rec.mgid, p_recvd_mgid, sizeof(ib_gid_t))) > > > > > > > > > > > > + if (memcmp(&p_mgrp->mcmember_rec.mgid, p_recvd_mgid, sizeof(ib_gid_t))) { > > > > > > > > > > > > + > > > > > > > > > > > > + /* Special Case IPV6 Multicast Loopback addresses */ > > > > > > > > > > > > + /* 0xff12601bffff0000 : 0x00000001ffXXXXXX */ > > > > > > > > > > > > +#define SPEC_PREFIX (0xff12601bffff0000) > > > > > > > > > > > 
> +#define INT_ID_MASK (0x00000001ff000000) > > > > > > > > > > > > + uint64_t g_prefix = cl_ntoh64(p_mgrp->mcmember_rec.mgid.unicast.prefix); > > > > > > > > > > > > + uint64_t g_interface_id = cl_ntoh64(p_mgrp->mcmember_rec.mgid.unicast.interface_id); > > > > > > > > > > > > + uint64_t rcv_prefix = cl_ntoh64(p_recvd_mgid->unicast.prefix); > > > > > > > > > > > > + uint64_t rcv_interface_id = cl_ntoh64(p_recvd_mgid->unicast.interface_id); > > > > > > > > > > > > + > > > > > > > > > > > > + if (rcv_prefix == SPEC_PREFIX > > > > > > > > > > > > + && > > > > > > > > > > > > + (rcv_interface_id & INT_ID_MASK) == INT_ID_MASK) { > > > > > > > > > > > > + > > > > > > > > > > > > + if ((g_prefix == rcv_prefix) > > > > > > > > > > > > + && > > > > > > > > > > > > + (g_interface_id & INT_ID_MASK) == > > > > > > > > > > > > + (rcv_interface_id & INT_ID_MASK) > > > > > > > > > > > > + ) { > > > > > > > > > > > > + osm_log(sa->p_log, OSM_LOG_INFO, > > > > > > > > > > > > + "Special Case Mcast Join for MGID " > > > > > > > > > > > > + " MGID 0x%016"PRIx64" : 0x%016"PRIx64"\n", > > > > > > > > > > > > + rcv_prefix, rcv_interface_id); > > > > > > > > > > > > + goto match; > > > > > > > > > > > > + } > > > > > > > > > > > > + } > > > > > > > > > > > > return; > > > > > > > > > > > > + } > > > > > > > > > > > > + > > > > > > > > > > > > +match: > > > > > > > > > > > > > > > > > > > > > > > > #if 0 > > > > > > > > > > > > for (i = 0; > > > > > > > > > > > > -- > > > > > > > > > > > > 1.5.1 > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Fri, 11 Jan 2008 22:04:56 -0800 > > > > > > > > > > > > Ira Weiny wrote: > > > > > > > > > > > > > > > > > > > > > > > > > Ok, > > > > > > > > > > > > > > > > > > > > > > > > > > I found my own answer. Sorry for the spam. 
> > > > > > > > > > > > > > > > > > > > > > > > > > http://lists.openfabrics.org/pipermail/general/2006-November/029617.html > > > > > > > > > > > > > > > > > > > > > > > > > > Sorry, > > > > > > > > > > > > > Ira > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Fri, 11 Jan 2008 19:36:57 -0800 > > > > > > > > > > > > > Ira Weiny wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > > I don't really understand the innerworkings of IPoIB so forgive me if this is a > > > > > > > > > > > > > > really stupid question but: > > > > > > > > > > > > > > > > > > > > > > > > > > > > Is it a bug that there is a Multicast group created for every node in our > > > > > > > > > > > > > > clusters? > > > > > > > > > > > > > > > > > > > > > > > > > > > > If not a bug why is this done? We just tried to boot on a 1151 node cluster > > > > > > > > > > > > > > and opensm is complaining there are not enough multicast groups. > > > > > > > > > > > > > > > > > > > > > > > > > > > > Jan 11 18:30:42 728984 [40C05960] -> __get_new_mlid: ERR 1B23: All available:1024 mlids are taken > > > > > > > > > > > > > > Jan 11 18:30:42 729050 [40C05960] -> osm_mcmr_rcv_create_new_mgrp: ERR 1B19: __get_new_mlid failed > > > > > > > > > > > > > > Jan 11 18:30:42 730647 [40401960] -> __get_new_mlid: ERR 1B23: All available:1024 mlids are taken > > > > > > > > > > > > > > Jan 11 18:30:42 730691 [40401960] -> osm_mcmr_rcv_create_new_mgrp: ERR 1B19: __get_new_mlid failed > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Here is the output from my small test cluster: (ibnodesinmcast uses saquery a > > > > > > > > > > > > > > couple of times to print this nice report.) 
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > 19:17:24 > whatsup > > > > > > > > > > > > > > up: 9: wopr[0-7],wopri > > > > > > > > > > > > > > down: 0: > > > > > > > > > > > > > > root at wopri:/tftpboot/images > > > > > > > > > > > > > > 19:25:03 > ibnodesinmcast -g > > > > > > > > > > > > > > 0xC000 (0xff12401bffff0000 : 0x00000000ffffffff) > > > > > > > > > > > > > > In 9: wopr[0-7],wopri > > > > > > > > > > > > > > Out 0: 0 > > > > > > > > > > > > > > 0xC001 (0xff12401bffff0000 : 0x0000000000000001) > > > > > > > > > > > > > > In 9: wopr[0-7],wopri > > > > > > > > > > > > > > Out 0: 0 > > > > > > > > > > > > > > 0xC002 (0xff12601bffff0000 : 0x00000001ff2265ed) > > > > > > > > > > > > > > In 1: wopr3 > > > > > > > > > > > > > > Out 8: wopr[0-2,4-7],wopri > > > > > > > > > > > > > > 0xC003 (0xff12601bffff0000 : 0x0000000000000001) > > > > > > > > > > > > > > In 9: wopr[0-7],wopri > > > > > > > > > > > > > > Out 0: 0 > > > > > > > > > > > > > > 0xC004 (0xff12601bffff0000 : 0x00000001ff222729) > > > > > > > > > > > > > > In 1: wopr4 > > > > > > > > > > > > > > Out 8: wopr[0-3,5-7],wopri > > > > > > > > > > > > > > 0xC005 (0xff12601bffff0000 : 0x00000001ff219e65) > > > > > > > > > > > > > > In 1: wopri > > > > > > > > > > > > > > Out 8: wopr[0-7] > > > > > > > > > > > > > > 0xC006 (0xff12601bffff0000 : 0x00000001ff00232d) > > > > > > > > > > > > > > In 1: wopr6 > > > > > > > > > > > > > > Out 8: wopr[0-5,7],wopri > > > > > > > > > > > > > > 0xC007 (0xff12601bffff0000 : 0x00000001ff002325) > > > > > > > > > > > > > > In 1: wopr7 > > > > > > > > > > > > > > Out 8: wopr[0-6],wopri > > > > > > > > > > > > > > 0xC008 (0xff12601bffff0000 : 0x00000001ff228d35) > > > > > > > > > > > > > > In 1: wopr1 > > > > > > > > > > > > > > Out 8: wopr[0,2-7],wopri > > > > > > > > > > > > > > 0xC009 (0xff12601bffff0000 : 0x00000001ff2227f1) > > > > > > > > > > > > > > In 1: wopr2 > > > > > > > > > > > > > > Out 8: wopr[0-1,3-7],wopri > > > > > > > 
> > > > > > > 0xC00A (0xff12601bffff0000 : 0x00000001ff219ef1) > > > > > > > > > > > > > > In 1: wopr0 > > > > > > > > > > > > > > Out 8: wopr[1-7],wopri > > > > > > > > > > > > > > 0xC00B (0xff12601bffff0000 : 0x00000001ff0021e9) > > > > > > > > > > > > > > In 1: wopr5 > > > > > > > > > > > > > > Out 8: wopr[0-4,6-7],wopri > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Each of these MGIDS of the prefix (0xff12601bffff0000) have just one node in > > > > > > > > > > > > > > them and represent an ipv6 address. Could you turn off ipv6 with the latest > > > > > > > > > > > > > > IPoIB? > > > > > > > > > > > > > > > > > > > > > > > > > > > > In a bind, > > > > > > > > > > > > > > Ira > > > > > > > > > > > > > > _______________________________________________ > > > > > > > > > > > > > > general mailing list > > > > > > > > > > > > > > general at lists.openfabrics.org > > > > > > > > > > > > > > http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general > > > > > > > > > > > > > > > > > > > > > > > > > > > > To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general > > > > > > > > > > > > > > > > > > > > > > > > _______________________________________________ > > > > > > > > > > > > general mailing list > > > > > > > > > > > > general at lists.openfabrics.org > > > > > > > > > > > > http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general > > > > > > > > > > > > > > > > > > > > > > > > To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > _______________________________________________ > > > > > > > > > > general mailing list > > > > > > > > > > general at lists.openfabrics.org > > > > > > > > > > http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general > > > > > > > > > > > > > > > > > > > > To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general > > > > > > > 
From hrosenstock at xsigo.com Tue Jan 15 11:17:53 2008 From: hrosenstock at xsigo.com (Hal Rosenstock) Date: Tue, 15 Jan 2008 11:17:53 -0800 Subject: [PATCH] opensm: Special Case the IPv6 Solicited Node Multicast address to use a single Mcast (WAS: Re: [ofa-general] IPoIB, OFED 1.2.5, and multicast groups.) 
In-Reply-To: <20080115192010.GG16009@sashak.voltaire.com> References: <20080114105132.19eafcee.weiny2@llnl.gov> <1200342214.8962.75.camel@hrosenstock-ws.xsigo.com> <20080114153546.73d43d6b.weiny2@llnl.gov> <1200355500.8962.148.camel@hrosenstock-ws.xsigo.com> <20080115005045.GG16009@sashak.voltaire.com> <1200360160.8962.196.camel@hrosenstock-ws.xsigo.com> <20080115014355.GK16009@sashak.voltaire.com> <1200361233.8962.206.camel@hrosenstock-ws.xsigo.com> <20080115084700.1d46407c.weiny2@llnl.gov> <1200418638.8962.328.camel@hrosenstock-ws.xsigo.com> <20080115192010.GG16009@sashak.voltaire.com> Message-ID: <1200424673.8962.364.camel@hrosenstock-ws.xsigo.com> On Tue, 2008-01-15 at 19:20 +0000, Sasha Khapyorsky wrote: > On 09:37 Tue 15 Jan , Hal Rosenstock wrote: > > On Tue, 2008-01-15 at 08:47 -0800, Ira Weiny wrote: > > > On Mon, 14 Jan 2008 17:40:33 -0800 > > > Hal Rosenstock wrote: > > > > > > > On Tue, 2008-01-15 at 01:43 +0000, Sasha Khapyorsky wrote: > > > > > On 17:22 Mon 14 Jan , Hal Rosenstock wrote: > > > > > > On Tue, 2008-01-15 at 00:50 +0000, Sasha Khapyorsky wrote: > > > > > > > On 16:05 Mon 14 Jan , Hal Rosenstock wrote: > > > > > > > > On Mon, 2008-01-14 at 15:35 -0800, Ira Weiny wrote: > > > > > > > > > On Mon, 14 Jan 2008 12:23:34 -0800 > > > > > > > > > Hal Rosenstock wrote: > > > > > > > > > > > > > > > > > > > On Mon, 2008-01-14 at 10:51 -0800, Ira Weiny wrote: > > > > > > > > > > > Hey Hal, thanks for the response. Comments below. > > > > > > > > > > > > > > > > > > > > > > On Mon, 14 Jan 2008 12:57:45 -0500 > > > > > > > > > > > "Hal Rosenstock" wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Didn't you just change that in that many MGIDs go to one MLID ? > > > > > > > > > > > > > > > > > > Ah, this is where the confusion has been. No, this is _not_ what I did... 
I > > > > > > > > > see now; that is what was proposed in the thread a year ago, however, I don't > > > > > > > > > think mapping many MGIDs to 1 MLID will work well. > > > > > > > > > > > > > > > > Why not ? > > > > > > > > > > > > > > > > It appears to be what you did (multiple MGIDs are mapped onto MLID (in > > > > > > > > the case below 0xc002)). Am I mistaken ? > > > > > > > > > > > > > > As far as I understand this patch it is the different. Here multiple > > > > > > > ports which match ipv6 solicited node multicast address will try to > > > > > > > join a single MC group (with single MGID and unique MLID). > > > > > > > > > > > > I don't think you are using the IBA defined terminology. > > > > > > > > > > > > A MC group is an MGID in terms of the IBA spec. Also, the SA GetTable > > > > > > with MGIDs wildcarded shows all the MGIDs. (Does it show that "special" > > > > > > MGID ?) > > > > > > > > > > Yes, it has MLID 0xc002 in Ira's 'saquery -g' example. > > > > > > > > No, MLID is not the group (at least in IBA terms); I was referring to > > > > the base SNM MGID (with partition and low 24 bits masked off). > > > > > > You are right the MLID is not the group. But the patch only creates 1 MGID as > > > well. > > > > I think you are confusing internal implementation with the outside view > > of what is going on. > > > > > I think I see where you are coming from but... > > > > > > Let me ask this. (I think I know the answer but I will ask anyway.) If you > > > have 3 MGID's (0xFF...1, 0xFF...2, 0xFF...3, 3 actual mgrp structures in > > > opensm) and you map them all to MLID 0xC001 will a message to 0xC001 reach all > > > 3 nodes? > > > > Ignoring the partitioning (and assuming the rates and MTUs are all the > > same), then yes. > > > > Let me ask you this: > > Do all IPv6 SNM MGIDs show up when you do SA GetTable for groups ? > > Let me answer. No. Only first MGID will be shown. 
> > I guess it is wrong in terms of IBTA, Yup; all those MGIDs need to be able to be queried. > but since whole feature is optional > I don't think it is a disaster (rather known limitation), and ipv6 not > working with big clusters is. Yes, but IMO we can/should do better than this. It's not much more work, is it ? -- Hal > Sasha > > > > > > > > > I would phrase this differently: > > > > > > All IPv6 SNM groups are mapped to a single MLID (when this feature is > > > > > > enabled). > > > > > > > > > > No, all ports join single IPv6 SNM MC group, and yes, it has single MGID > > > > > (and single MLID). > > > > > > > > It does not have a single MGID; it has many MGIDs including the base one > > > > (just look at the group dump). > > > > > > Are you mainly objecting to the fact that the IPoIB client nodes request > > > joining of MGID 0xFF...X and we are joining them to 0xFF...Y? > > > > No; my concerns are: > > 1. Your's and Sasha's categorization is wrong in that many MGIDs are > > mapped to a single MLID (it is MLID overloading which is allowed by the > > IBA spec) > > 2. I think that the rate and MTU issue will rear it's ugly head just > > like has occurred and still occurs for the IP broadcast group but this > > one is even more subtle > > 3. It is a spec violation in terms of the current (and previous versions > > of the) IBA spec. > > > > -- Hal > > > > > Ira > > > > > > > > > > > -- Hal > > > > > > > > > Sasha > > > > > > > > > > > It so happens that OpenSM internally does the accounting on > > > > > > membership by treating them all as members of the same "base" or > > > > > > "masked" group by masking off partition and the low 24 bits (port GUID). > > > > > > > > > > > > -- Hal > > > > > > > > > > > > > Sasha > > > > > > > > > > > > > > > > > > > > > > > > What I did was to allow the first IPv6 request to create the group and then all > > > > > > > > > other requests were added to this group. 
> > > > > > > > > > > > > > > > You are using the word group loosely here and that is the source of the > > > > > > > > confusion IMO. I think by group you mean MLID. > > > > > > > > > > > > > > > > > This sends all the neighbor discovery messages to all nodes on the network. > > > > > > > > > > > > > > > > All nodes part of that MLID tree. > > > > > > > > > > > > > > > > > This might seem inefficient but should work. (... and seems to.) > > > > > > > > > > > > > > > > Sure; the hosts will filter based on MGID. The tradeoff is MLID > > > > > > > > utilization versus fabric utilization. > > > > > > > > > > > > > > > > > > > All of the requests for this type of MGRP join are routed to > > > > > > > > > > > one group. Therefore, I thought the same rules for deleting the group would > > > > > > > > > > > apply; when all the members are gone it is removed? > > > > > > > > > > > > > > > > > > > > Yes, the group may go but not the underlying MLID as there are other > > > > > > > > > > groups which are sharing this. That's not what happens now. > > > > > > > > > > > > > > > > > > No, since there is only 1 group in this implementation it should work like > > > > > > > > > others. The first node of this "mgid type" will create the group. Others will > > > > > > > > > join it and will continue to use it even if the creator leaves. > > > > > > > > > > > > > > > > Are you saying all these groups appear as 1 "group" to OpenSM (as the > > > > > > > > real groups are masked to the same value) ? > > > > > > > > > > > > > > > > -- Hal > > > > > > > > > > > > > > > > > Does this make more sense? 
> > > > > > > > > > > > > > > > > > Ira > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Just to be clear, after > > > > > > > > > > > this patch the mgroups are: > > > > > > > > > > > > > > > > > > > > > > 09:36:40 > saquery -g > > > > > > > > > > > MCMemberRecord group dump: > > > > > > > > > > > MGID....................0xff12401bffff0000 : 0x00000000ffffffff > > > > > > > > > > > Mlid....................0xC000 > > > > > > > > > > > Mtu.....................0x84 > > > > > > > > > > > pkey....................0xFFFF > > > > > > > > > > > Rate....................0x83 > > > > > > > > > > > MCMemberRecord group dump: > > > > > > > > > > > MGID....................0xff12401bffff0000 : 0x0000000000000001 > > > > > > > > > > > Mlid....................0xC001 > > > > > > > > > > > Mtu.....................0x84 > > > > > > > > > > > pkey....................0xFFFF > > > > > > > > > > > Rate....................0x83 > > > > > > > > > > > MCMemberRecord group dump: > > > > > > > > > > > MGID....................0xff12601bffff0000 : 0x00000001ff0021e9 > > > > > > > > > > > Mlid....................0xC002 > > > > > > > > > > > Mtu.....................0x84 > > > > > > > > > > > pkey....................0xFFFF > > > > > > > > > > > Rate....................0x83 > > > > > > > > > > > MCMemberRecord group dump: > > > > > > > > > > > MGID....................0xff12601bffff0000 : 0x0000000000000001 > > > > > > > > > > > Mlid....................0xC003 > > > > > > > > > > > Mtu.....................0x84 > > > > > > > > > > > pkey....................0xFFFF > > > > > > > > > > > Rate....................0x83 > > > > > > > > > > > > > > > > > > > > > > All of these requests are added to the > > > > > > > > > > > MGID....................0xff12601bffff0000 : 0x00000001ff0021e9 > > > > > > > > > > > Mlid....................0xC002 > > > > > > > > > > > group. But as you say, how do we determine that the pkey, mtu, and rate are > > > > > > > > > > > valid? 
:-/ > > > > > > > > > > > > > > > > > > > > > > But here is a question: > > > > > > > > > > > > > > > > > > > > > > What happens if someone with an incorrect MTU tries to join the > > > > > > > > > > > MGID....................0xff12401bffff0000 : 0x0000000000000001 > > > > > > > > > > > group? Wouldn't this code return this mgrp pointer and the subsequent MTU and > > > > > > > > > > > rate checks fail? I seem to recall a thread discussing this before. I don't > > > > > > > > > > > remember what the outcome was. I seem to remember the question was if OpenSM > > > > > > > > > > > should create/modify a group to the "lowest common" MTU/Rate, and succeed all > > > > > > > > > > > the joins, vs enforcing the faster MTU/Rate and failing the joins. > > > > > > > > > > > > > > > > > > > > Yes, the join would fail, but I don't think that's what we would want. > > > > > > > > > > The alternative with the patch is to make it the lowest rate but there > > > > > > > > > > is a minimum MTU which might not be right. > > > > > > > > > > > > > > > > > > > > > > I think this is a policy and rather than this always being the case, > > > > > > > > > > > > there should be a policy parameter added to OpenSM for this. IMO > > > > > > > > > > > > default should be to not do this. > > > > > > > > > > > > > > > > > > > > > > Yes, for sure there needs to be some options to control the behavior. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Maybe more later... > > > > > > > > > > > > > > > > > > > > > > Thanks again, > > > > > > > > > > > Ira > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -- Hal > > > > > > > > > > > > > > > > > > > > > > > > > Thanks, > > > > > > > > > > > > > Ira > > > > > > > > > > > > > > > > > > > > > > > > > > [*] Again I apologize for the spam but we were in a bit of a panic as we only > > > > > > > > > > > > > have the big system for the weekend and IB was not part of the test... 
;-) > > > > > > > > > > > > > > > > > > > > > > > > > > >From 35e35a9534bd49147886ac93ab1601acadcdbe26 Mon Sep 17 00:00:00 2001 > > > > > > > > > > > > > From: Ira K. Weiny > > > > > > > > > > > > > Date: Fri, 11 Jan 2008 22:58:19 -0800 > > > > > > > > > > > > > Subject: [PATCH] Special Case the IPv6 Solicited Node Multicast address to use a single Mcast > > > > > > > > > > > > > Group. > > > > > > > > > > > > > > > > > > > > > > > > > > Signed-off-by: root > > > > > > > > > > > > > --- > > > > > > > > > > > > > opensm/opensm/osm_sa_mcmember_record.c | 30 +++++++++++++++++++++++++++++- > > > > > > > > > > > > > opensm/opensm/osm_sa_path_record.c | 31 ++++++++++++++++++++++++++++++- > > > > > > > > > > > > > 2 files changed, 59 insertions(+), 2 deletions(-) > > > > > > > > > > > > > > > > > > > > > > > > > > diff --git a/opensm/opensm/osm_sa_mcmember_record.c b/opensm/opensm/osm_sa_mcmember_record.c > > > > > > > > > > > > > index 8eb97ad..6bcc124 100644 > > > > > > > > > > > > > --- a/opensm/opensm/osm_sa_mcmember_record.c > > > > > > > > > > > > > +++ b/opensm/opensm/osm_sa_mcmember_record.c > > > > > > > > > > > > > @@ -124,9 +124,37 @@ __search_mgrp_by_mgid(IN cl_map_item_t * const p_map_item, IN void *context) > > > > > > > > > > > > > /* compare entire MGID so different scope will not sneak in for > > > > > > > > > > > > > the same MGID */ > > > > > > > > > > > > > if (memcmp(&p_mgrp->mcmember_rec.mgid, > > > > > > > > > > > > > - &p_recvd_mcmember_rec->mgid, sizeof(ib_gid_t))) > > > > > > > > > > > > > + &p_recvd_mcmember_rec->mgid, sizeof(ib_gid_t))) { > > > > > > > > > > > > > + > > > > > > > > > > > > > + /* Special Case IPV6 Multicast Loopback addresses */ > > > > > > > > > > > > > + /* 0xff12601bffff0000 : 0x00000001ffXXXXXX */ > > > > > > > > > > > > > +#define SPEC_PREFIX (0xff12601bffff0000) > > > > > > > > > > > > > +#define INT_ID_MASK (0x00000001ff000000) > > > > > > > > > > > > > + uint64_t g_prefix = 
cl_ntoh64(p_mgrp->mcmember_rec.mgid.unicast.prefix); > > > > > > > > > > > > > + uint64_t g_interface_id = cl_ntoh64(p_mgrp->mcmember_rec.mgid.unicast.interface_id); > > > > > > > > > > > > > + uint64_t rcv_prefix = cl_ntoh64(p_recvd_mcmember_rec->mgid.unicast.prefix); > > > > > > > > > > > > > + uint64_t rcv_interface_id = cl_ntoh64(p_recvd_mcmember_rec->mgid.unicast.interface_id); > > > > > > > > > > > > > + > > > > > > > > > > > > > + if (rcv_prefix == SPEC_PREFIX > > > > > > > > > > > > > + && > > > > > > > > > > > > > + (rcv_interface_id & INT_ID_MASK) == INT_ID_MASK) { > > > > > > > > > > > > > + > > > > > > > > > > > > > + if ((g_prefix == rcv_prefix) > > > > > > > > > > > > > + && > > > > > > > > > > > > > + (g_interface_id & INT_ID_MASK) == > > > > > > > > > > > > > + (rcv_interface_id & INT_ID_MASK) > > > > > > > > > > > > > + ) { > > > > > > > > > > > > > + osm_log(sa->p_log, OSM_LOG_INFO, > > > > > > > > > > > > > + "Special Case Mcast Join for MGID " > > > > > > > > > > > > > + " MGID 0x%016"PRIx64" : 0x%016"PRIx64"\n", > > > > > > > > > > > > > + rcv_prefix, rcv_interface_id); > > > > > > > > > > > > > + goto match; > > > > > > > > > > > > > + } > > > > > > > > > > > > > + } > > > > > > > > > > > > > return; > > > > > > > > > > > > > + } > > > > > > > > > > > > > > > > > > > > > > > > > > +match: > > > > > > > > > > > > > if (p_ctxt->p_mgrp) { > > > > > > > > > > > > > osm_log(sa->p_log, OSM_LOG_ERROR, > > > > > > > > > > > > > "__search_mgrp_by_mgid: ERR 1B03: " > > > > > > > > > > > > > diff --git a/opensm/opensm/osm_sa_path_record.c b/opensm/opensm/osm_sa_path_record.c > > > > > > > > > > > > > index 749a936..469773a 100644 > > > > > > > > > > > > > --- a/opensm/opensm/osm_sa_path_record.c > > > > > > > > > > > > > +++ b/opensm/opensm/osm_sa_path_record.c > > > > > > > > > > > > > @@ -1536,8 +1536,37 @@ __search_mgrp_by_mgid(IN cl_map_item_t * const p_map_item, IN void *context) > > > > > > > > > > > > > > > > > > > > > > > > > > /* compare entire 
MGID so different scope will not sneak in for > > > > > > > > > > > > > the same MGID */ > > > > > > > > > > > > > - if (memcmp(&p_mgrp->mcmember_rec.mgid, p_recvd_mgid, sizeof(ib_gid_t))) > > > > > > > > > > > > > + if (memcmp(&p_mgrp->mcmember_rec.mgid, p_recvd_mgid, sizeof(ib_gid_t))) { > > > > > > > > > > > > > + > > > > > > > > > > > > > + /* Special Case IPV6 Multicast Loopback addresses */ > > > > > > > > > > > > > + /* 0xff12601bffff0000 : 0x00000001ffXXXXXX */ > > > > > > > > > > > > > +#define SPEC_PREFIX (0xff12601bffff0000) > > > > > > > > > > > > > +#define INT_ID_MASK (0x00000001ff000000) > > > > > > > > > > > > > + uint64_t g_prefix = cl_ntoh64(p_mgrp->mcmember_rec.mgid.unicast.prefix); > > > > > > > > > > > > > + uint64_t g_interface_id = cl_ntoh64(p_mgrp->mcmember_rec.mgid.unicast.interface_id); > > > > > > > > > > > > > + uint64_t rcv_prefix = cl_ntoh64(p_recvd_mgid->unicast.prefix); > > > > > > > > > > > > > + uint64_t rcv_interface_id = cl_ntoh64(p_recvd_mgid->unicast.interface_id); > > > > > > > > > > > > > + > > > > > > > > > > > > > + if (rcv_prefix == SPEC_PREFIX > > > > > > > > > > > > > + && > > > > > > > > > > > > > + (rcv_interface_id & INT_ID_MASK) == INT_ID_MASK) { > > > > > > > > > > > > > + > > > > > > > > > > > > > + if ((g_prefix == rcv_prefix) > > > > > > > > > > > > > + && > > > > > > > > > > > > > + (g_interface_id & INT_ID_MASK) == > > > > > > > > > > > > > + (rcv_interface_id & INT_ID_MASK) > > > > > > > > > > > > > + ) { > > > > > > > > > > > > > + osm_log(sa->p_log, OSM_LOG_INFO, > > > > > > > > > > > > > + "Special Case Mcast Join for MGID " > > > > > > > > > > > > > + " MGID 0x%016"PRIx64" : 0x%016"PRIx64"\n", > > > > > > > > > > > > > + rcv_prefix, rcv_interface_id); > > > > > > > > > > > > > + goto match; > > > > > > > > > > > > > + } > > > > > > > > > > > > > + } > > > > > > > > > > > > > return; > > > > > > > > > > > > > + } > > > > > > > > > > > > > + > > > > > > > > > > > > > +match: > > > > > > > > > > > > > > > > > 
> > > > > > > > > #if 0 > > > > > > > > > > > > > for (i = 0; > > > > > > > > > > > > > -- > > > > > > > > > > > > > 1.5.1 > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Fri, 11 Jan 2008 22:04:56 -0800 > > > > > > > > > > > > > Ira Weiny wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > > Ok, > > > > > > > > > > > > > > > > > > > > > > > > > > > > I found my own answer. Sorry for the spam. > > > > > > > > > > > > > > > > > > > > > > > > > > > > http://lists.openfabrics.org/pipermail/general/2006-November/029617.html > > > > > > > > > > > > > > > > > > > > > > > > > > > > Sorry, > > > > > > > > > > > > > > Ira > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Fri, 11 Jan 2008 19:36:57 -0800 > > > > > > > > > > > > > > Ira Weiny wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > I don't really understand the innerworkings of IPoIB so forgive me if this is a > > > > > > > > > > > > > > > really stupid question but: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Is it a bug that there is a Multicast group created for every node in our > > > > > > > > > > > > > > > clusters? > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > If not a bug why is this done? We just tried to boot on a 1151 node cluster > > > > > > > > > > > > > > > and opensm is complaining there are not enough multicast groups. 
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Jan 11 18:30:42 728984 [40C05960] -> __get_new_mlid: ERR 1B23: All available:1024 mlids are taken > > > > > > > > > > > > > > > Jan 11 18:30:42 729050 [40C05960] -> osm_mcmr_rcv_create_new_mgrp: ERR 1B19: __get_new_mlid failed > > > > > > > > > > > > > > > Jan 11 18:30:42 730647 [40401960] -> __get_new_mlid: ERR 1B23: All available:1024 mlids are taken > > > > > > > > > > > > > > > Jan 11 18:30:42 730691 [40401960] -> osm_mcmr_rcv_create_new_mgrp: ERR 1B19: __get_new_mlid failed > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Here is the output from my small test cluster: (ibnodesinmcast uses saquery a > > > > > > > > > > > > > > > couple of times to print this nice report.) > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > 19:17:24 > whatsup > > > > > > > > > > > > > > > up: 9: wopr[0-7],wopri > > > > > > > > > > > > > > > down: 0: > > > > > > > > > > > > > > > root at wopri:/tftpboot/images > > > > > > > > > > > > > > > 19:25:03 > ibnodesinmcast -g > > > > > > > > > > > > > > > 0xC000 (0xff12401bffff0000 : 0x00000000ffffffff) > > > > > > > > > > > > > > > In 9: wopr[0-7],wopri > > > > > > > > > > > > > > > Out 0: 0 > > > > > > > > > > > > > > > 0xC001 (0xff12401bffff0000 : 0x0000000000000001) > > > > > > > > > > > > > > > In 9: wopr[0-7],wopri > > > > > > > > > > > > > > > Out 0: 0 > > > > > > > > > > > > > > > 0xC002 (0xff12601bffff0000 : 0x00000001ff2265ed) > > > > > > > > > > > > > > > In 1: wopr3 > > > > > > > > > > > > > > > Out 8: wopr[0-2,4-7],wopri > > > > > > > > > > > > > > > 0xC003 (0xff12601bffff0000 : 0x0000000000000001) > > > > > > > > > > > > > > > In 9: wopr[0-7],wopri > > > > > > > > > > > > > > > Out 0: 0 > > > > > > > > > > > > > > > 0xC004 (0xff12601bffff0000 : 0x00000001ff222729) > > > > > > > > > > > > > > > In 1: wopr4 > > > > > > > > > > > > > > > Out 8: wopr[0-3,5-7],wopri > > > > > > > > > > > > > 
> > 0xC005 (0xff12601bffff0000 : 0x00000001ff219e65) > > > > > > > > > > > > > > > In 1: wopri > > > > > > > > > > > > > > > Out 8: wopr[0-7] > > > > > > > > > > > > > > > 0xC006 (0xff12601bffff0000 : 0x00000001ff00232d) > > > > > > > > > > > > > > > In 1: wopr6 > > > > > > > > > > > > > > > Out 8: wopr[0-5,7],wopri > > > > > > > > > > > > > > > 0xC007 (0xff12601bffff0000 : 0x00000001ff002325) > > > > > > > > > > > > > > > In 1: wopr7 > > > > > > > > > > > > > > > Out 8: wopr[0-6],wopri > > > > > > > > > > > > > > > 0xC008 (0xff12601bffff0000 : 0x00000001ff228d35) > > > > > > > > > > > > > > > In 1: wopr1 > > > > > > > > > > > > > > > Out 8: wopr[0,2-7],wopri > > > > > > > > > > > > > > > 0xC009 (0xff12601bffff0000 : 0x00000001ff2227f1) > > > > > > > > > > > > > > > In 1: wopr2 > > > > > > > > > > > > > > > Out 8: wopr[0-1,3-7],wopri > > > > > > > > > > > > > > > 0xC00A (0xff12601bffff0000 : 0x00000001ff219ef1) > > > > > > > > > > > > > > > In 1: wopr0 > > > > > > > > > > > > > > > Out 8: wopr[1-7],wopri > > > > > > > > > > > > > > > 0xC00B (0xff12601bffff0000 : 0x00000001ff0021e9) > > > > > > > > > > > > > > > In 1: wopr5 > > > > > > > > > > > > > > > Out 8: wopr[0-4,6-7],wopri > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Each of these MGIDS of the prefix (0xff12601bffff0000) have just one node in > > > > > > > > > > > > > > > them and represent an ipv6 address. Could you turn off ipv6 with the latest > > > > > > > > > > > > > > > IPoIB? 
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > In a bind, > > > > > > > > > > > > > > > Ira > > > > > > > > > > > > > > > _______________________________________________ > > > > > > > > > > > > > > > general mailing list > > > > > > > > > > > > > > > general at lists.openfabrics.org > > > > > > > > > > > > > > > http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
From sashak at voltaire.com Tue Jan 15 11:33:47 2008 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Tue, 15 Jan 2008 19:33:47 +0000 Subject: [PATCH] opensm: Special Case the IPv6 Solicited Node Multicast address to use a single Mcast (WAS: Re: [ofa-general] IPoIB, OFED 1.2.5, and multicast groups.) In-Reply-To: <1200424673.8962.364.camel@hrosenstock-ws.xsigo.com> References: <20080114153546.73d43d6b.weiny2@llnl.gov> <1200355500.8962.148.camel@hrosenstock-ws.xsigo.com> <20080115005045.GG16009@sashak.voltaire.com> <1200360160.8962.196.camel@hrosenstock-ws.xsigo.com> <20080115014355.GK16009@sashak.voltaire.com> <1200361233.8962.206.camel@hrosenstock-ws.xsigo.com> <20080115084700.1d46407c.weiny2@llnl.gov> <1200418638.8962.328.camel@hrosenstock-ws.xsigo.com> <20080115192010.GG16009@sashak.voltaire.com> <1200424673.8962.364.camel@hrosenstock-ws.xsigo.com> Message-ID: <20080115193347.GI16009@sashak.voltaire.com> On 11:17 Tue 15 Jan , Hal Rosenstock wrote: > On Tue, 2008-01-15 at 19:20 +0000, Sasha Khapyorsky wrote: > > On 09:37 Tue 15 Jan , Hal Rosenstock wrote: > > > On Tue, 2008-01-15 at 08:47 -0800, Ira Weiny wrote: > > > > On Mon, 14 Jan 2008 17:40:33 -0800 > > > > Hal Rosenstock wrote: > > > > > > > > > On Tue, 2008-01-15 at 01:43 +0000, Sasha Khapyorsky wrote: > > > > > > On 17:22 Mon 14 Jan , Hal Rosenstock wrote: > > > > > > > On Tue, 2008-01-15 at 00:50 +0000, Sasha
Khapyorsky wrote: > > > > > > > > On 16:05 Mon 14 Jan , Hal Rosenstock wrote: > > > > > > > > > On Mon, 2008-01-14 at 15:35 -0800, Ira Weiny wrote: > > > > > > > > > > On Mon, 14 Jan 2008 12:23:34 -0800 > > > > > > > > > > Hal Rosenstock wrote: > > > > > > > > > > > > > > > > > > > > > On Mon, 2008-01-14 at 10:51 -0800, Ira Weiny wrote: > > > > > > > > > > > > Hey Hal, thanks for the response. Comments below. > > > > > > > > > > > > > > > > > > > > > > > > On Mon, 14 Jan 2008 12:57:45 -0500 > > > > > > > > > > > > "Hal Rosenstock" wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Didn't you just change that in that many MGIDs go to one MLID ? > > > > > > > > > > > > > > > > > > > > Ah, this is where the confusion has been. No, this is _not_ what I did... I > > > > > > > > > > see now; that is what was proposed in the thread a year ago, however, I don't > > > > > > > > > > think mapping many MGIDs to 1 MLID will work well. > > > > > > > > > > > > > > > > > > Why not ? > > > > > > > > > > > > > > > > > > It appears to be what you did (multiple MGIDs are mapped onto MLID (in > > > > > > > > > the case below 0xc002)). Am I mistaken ? > > > > > > > > > > > > > > > > As far as I understand this patch it is the different. Here multiple > > > > > > > > ports which match ipv6 solicited node multicast address will try to > > > > > > > > join a single MC group (with single MGID and unique MLID). > > > > > > > > > > > > > > I don't think you are using the IBA defined terminology. > > > > > > > > > > > > > > A MC group is an MGID in terms of the IBA spec. Also, the SA GetTable > > > > > > > with MGIDs wildcarded shows all the MGIDs. (Does it show that "special" > > > > > > > MGID ?) > > > > > > > > > > > > Yes, it has MLID 0xc002 in Ira's 'saquery -g' example. > > > > > > > > > > No, MLID is not the group (at least in IBA terms); I was referring to > > > > > the base SNM MGID (with partition and low 24 bits masked off). 
> > > > > > > > You are right the MLID is not the group. But the patch only creates 1 MGID as > > > > well. > > > > > > I think you are confusing internal implementation with the outside view > > > of what is going on. > > > > > > > I think I see where you are coming from but... > > > > > > > > Let me ask this. (I think I know the answer but I will ask anyway.) If you > > > > have 3 MGID's (0xFF...1, 0xFF...2, 0xFF...3, 3 actual mgrp structures in > > > > opensm) and you map them all to MLID 0xC001 will a message to 0xC001 reach all > > > > 3 nodes? > > > > > > Ignoring the partitioning (and assuming the rates and MTUs are all the > > > same), then yes. > > > > > > Let me ask you this: > > > Do all IPv6 SNM MGIDs show up when you do SA GetTable for groups ? > > > > Let me answer. No. Only first MGID will be shown. > > > > I guess it is wrong in terms of IBTA, > > Yup; all those MGIDs need to be able to be queried. > > > but since whole feature is optional > > I don't think it is a disaster (rather known limitation), and ipv6 not > > working with big clusters is. > > Yes, but IMO we can/should do better than this. Nobody said anything against it. > It's not much more work, > is it ? I guess it is more work, don't know for sure yet. Sasha > > -- Hal > > > Sasha > > > > > > > > > > > > I would phrase this differently: > > > > > > > All IPv6 SNM groups are mapped to a single MLID (when this feature is > > > > > > > enabled). > > > > > > > > > > > > No, all ports join single IPv6 SNM MC group, and yes, it has single MGID > > > > > > (and single MLID). > > > > > > > > > > It does not have a single MGID; it has many MGIDs including the base one > > > > > (just look at the group dump). > > > > > > > > Are you mainly objecting to the fact that the IPoIB client nodes request > > > > joining of MGID 0xFF...X and we are joining them to 0xFF...Y? > > > > > > No; my concerns are: > > > 1. 
Yours and Sasha's categorization is wrong in that many MGIDs are > > > mapped to a single MLID (it is MLID overloading which is allowed by the > > > IBA spec) > > > 2. I think that the rate and MTU issue will rear its ugly head just > > > like has occurred and still occurs for the IP broadcast group but this > > > one is even more subtle > > > 3. It is a spec violation in terms of the current (and previous versions > > > of the) IBA spec. > > > > > > -- Hal > > > > > > > Ira > > > > > > > > > > > > > > -- Hal > > > > > > > > > > > Sasha > > > > > > > > > > > > > It so happens that OpenSM internally does the accounting on > > > > > > > membership by treating them all as members of the same "base" or > > > > > > > "masked" group by masking off partition and the low 24 bits (port GUID). > > > > > > > > > > > > > > -- Hal > > > > > > > > > > > > > > > Sasha > > > > > > > > > > > > > > > > > > > > > > > > > > > What I did was to allow the first IPv6 request to create the group and then all > > > > > > > > > > other requests were added to this group. > > > > > > > > > > > > > > > > > > You are using the word group loosely here and that is the source of the > > > > > > > > > confusion IMO. I think by group you mean MLID. > > > > > > > > > > > > > > > > > > > This sends all the neighbor discovery messages to all nodes on the network. > > > > > > > > > > > > > > > > > > All nodes part of that MLID tree. > > > > > > > > > > > > > > > > > > > This might seem inefficient but should work. (... and seems to.) > > > > > > > > > > > > > > > > > > Sure; the hosts will filter based on MGID. The tradeoff is MLID > > > > > > > > > utilization versus fabric utilization. > > > > > > > > > > > > > > > > > > > > > All of the requests for this type of MGRP join are routed to > > > > > > > > > > > > one group. Therefore, I thought the same rules for deleting the group would > > > > > > > > > > > > apply; when all the members are gone it is removed?
> > > > > > > > > > > > > > > > > > > > > > Yes, the group may go but not the underlying MLID as there are other > > > > > > > > > > > groups which are sharing this. That's not what happens now. > > > > > > > > > > > > > > > > > > > > No, since there is only 1 group in this implementation it should work like > > > > > > > > > > others. The first node of this "mgid type" will create the group. Others will > > > > > > > > > > join it and will continue to use it even if the creator leaves. > > > > > > > > > > > > > > > > > > Are you saying all these groups appear as 1 "group" to OpenSM (as the > > > > > > > > > real groups are masked to the same value) ? > > > > > > > > > > > > > > > > > > -- Hal > > > > > > > > > > > > > > > > > > > Does this make more sense? > > > > > > > > > > > > > > > > > > > > Ira > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Just to be clear, after > > > > > > > > > > > > this patch the mgroups are: > > > > > > > > > > > > > > > > > > > > > > > > 09:36:40 > saquery -g > > > > > > > > > > > > MCMemberRecord group dump: > > > > > > > > > > > > MGID....................0xff12401bffff0000 : 0x00000000ffffffff > > > > > > > > > > > > Mlid....................0xC000 > > > > > > > > > > > > Mtu.....................0x84 > > > > > > > > > > > > pkey....................0xFFFF > > > > > > > > > > > > Rate....................0x83 > > > > > > > > > > > > MCMemberRecord group dump: > > > > > > > > > > > > MGID....................0xff12401bffff0000 : 0x0000000000000001 > > > > > > > > > > > > Mlid....................0xC001 > > > > > > > > > > > > Mtu.....................0x84 > > > > > > > > > > > > pkey....................0xFFFF > > > > > > > > > > > > Rate....................0x83 > > > > > > > > > > > > MCMemberRecord group dump: > > > > > > > > > > > > MGID....................0xff12601bffff0000 : 0x00000001ff0021e9 > > > > > > > > > > > > Mlid....................0xC002 > > > > > > > > > > > > Mtu.....................0x84 > > > > > > > > > 
> > > pkey....................0xFFFF > > > > > > > > > > > > Rate....................0x83 > > > > > > > > > > > > MCMemberRecord group dump: > > > > > > > > > > > > MGID....................0xff12601bffff0000 : 0x0000000000000001 > > > > > > > > > > > > Mlid....................0xC003 > > > > > > > > > > > > Mtu.....................0x84 > > > > > > > > > > > > pkey....................0xFFFF > > > > > > > > > > > > Rate....................0x83 > > > > > > > > > > > > > > > > > > > > > > > > All of these requests are added to the > > > > > > > > > > > > MGID....................0xff12601bffff0000 : 0x00000001ff0021e9 > > > > > > > > > > > > Mlid....................0xC002 > > > > > > > > > > > > group. But as you say, how do we determine that the pkey, mtu, and rate are > > > > > > > > > > > > valid? :-/ > > > > > > > > > > > > > > > > > > > > > > > > But here is a question: > > > > > > > > > > > > > > > > > > > > > > > > What happens if someone with an incorrect MTU tries to join the > > > > > > > > > > > > MGID....................0xff12401bffff0000 : 0x0000000000000001 > > > > > > > > > > > > group? Wouldn't this code return this mgrp pointer and the subsequent MTU and > > > > > > > > > > > > rate checks fail? I seem to recall a thread discussing this before. I don't > > > > > > > > > > > > remember what the outcome was. I seem to remember the question was if OpenSM > > > > > > > > > > > > should create/modify a group to the "lowest common" MTU/Rate, and succeed all > > > > > > > > > > > > the joins, vs enforcing the faster MTU/Rate and failing the joins. > > > > > > > > > > > > > > > > > > > > > > Yes, the join would fail, but I don't think that's what we would want. > > > > > > > > > > > The alternative with the patch is to make it the lowest rate but there > > > > > > > > > > > is a minimum MTU which might not be right. 
> > > > > > > > > > > > > > > > > > > > > > > > I think this is a policy and rather than this always being the case, > > > > > > > > > > > > > there should be a policy parameter added to OpenSM for this. IMO > > > > > > > > > > > > > default should be to not do this. > > > > > > > > > > > > > > > > > > > > > > > > Yes, for sure there needs to be some options to control the behavior. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Maybe more later... > > > > > > > > > > > > > > > > > > > > > > > > Thanks again, > > > > > > > > > > > > Ira > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -- Hal > > > > > > > > > > > > > > > > > > > > > > > > > > > Thanks, > > > > > > > > > > > > > > Ira > > > > > > > > > > > > > > > > > > > > > > > > > > > > [*] Again I apologize for the spam but we were in a bit of a panic as we only > > > > > > > > > > > > > > have the big system for the weekend and IB was not part of the test... ;-) > > > > > > > > > > > > > > > > > > > > > > > > > > > > >From 35e35a9534bd49147886ac93ab1601acadcdbe26 Mon Sep 17 00:00:00 2001 > > > > > > > > > > > > > > From: Ira K. Weiny > > > > > > > > > > > > > > Date: Fri, 11 Jan 2008 22:58:19 -0800 > > > > > > > > > > > > > > Subject: [PATCH] Special Case the IPv6 Solicited Node Multicast address to use a single Mcast > > > > > > > > > > > > > > Group. 
> > > > > > > > > > > > > > > > > > > > > > > > > > > > Signed-off-by: root > > > > > > > > > > > > > > --- > > > > > > > > > > > > > > opensm/opensm/osm_sa_mcmember_record.c | 30 +++++++++++++++++++++++++++++- > > > > > > > > > > > > > > opensm/opensm/osm_sa_path_record.c | 31 ++++++++++++++++++++++++++++++- > > > > > > > > > > > > > > 2 files changed, 59 insertions(+), 2 deletions(-) > > > > > > > > > > > > > > > > > > > > > > > > > > > > diff --git a/opensm/opensm/osm_sa_mcmember_record.c b/opensm/opensm/osm_sa_mcmember_record.c > > > > > > > > > > > > > > index 8eb97ad..6bcc124 100644 > > > > > > > > > > > > > > --- a/opensm/opensm/osm_sa_mcmember_record.c > > > > > > > > > > > > > > +++ b/opensm/opensm/osm_sa_mcmember_record.c > > > > > > > > > > > > > > @@ -124,9 +124,37 @@ __search_mgrp_by_mgid(IN cl_map_item_t * const p_map_item, IN void *context) > > > > > > > > > > > > > > /* compare entire MGID so different scope will not sneak in for > > > > > > > > > > > > > > the same MGID */ > > > > > > > > > > > > > > if (memcmp(&p_mgrp->mcmember_rec.mgid, > > > > > > > > > > > > > > - &p_recvd_mcmember_rec->mgid, sizeof(ib_gid_t))) > > > > > > > > > > > > > > + &p_recvd_mcmember_rec->mgid, sizeof(ib_gid_t))) { > > > > > > > > > > > > > > + > > > > > > > > > > > > > > + /* Special Case IPV6 Multicast Loopback addresses */ > > > > > > > > > > > > > > + /* 0xff12601bffff0000 : 0x00000001ffXXXXXX */ > > > > > > > > > > > > > > +#define SPEC_PREFIX (0xff12601bffff0000) > > > > > > > > > > > > > > +#define INT_ID_MASK (0x00000001ff000000) > > > > > > > > > > > > > > + uint64_t g_prefix = cl_ntoh64(p_mgrp->mcmember_rec.mgid.unicast.prefix); > > > > > > > > > > > > > > + uint64_t g_interface_id = cl_ntoh64(p_mgrp->mcmember_rec.mgid.unicast.interface_id); > > > > > > > > > > > > > > + uint64_t rcv_prefix = cl_ntoh64(p_recvd_mcmember_rec->mgid.unicast.prefix); > > > > > > > > > > > > > > + uint64_t rcv_interface_id = 
cl_ntoh64(p_recvd_mcmember_rec->mgid.unicast.interface_id); > > > > > > > > > > > > > > + > > > > > > > > > > > > > > + if (rcv_prefix == SPEC_PREFIX > > > > > > > > > > > > > > + && > > > > > > > > > > > > > > + (rcv_interface_id & INT_ID_MASK) == INT_ID_MASK) { > > > > > > > > > > > > > > + > > > > > > > > > > > > > > + if ((g_prefix == rcv_prefix) > > > > > > > > > > > > > > + && > > > > > > > > > > > > > > + (g_interface_id & INT_ID_MASK) == > > > > > > > > > > > > > > + (rcv_interface_id & INT_ID_MASK) > > > > > > > > > > > > > > + ) { > > > > > > > > > > > > > > + osm_log(sa->p_log, OSM_LOG_INFO, > > > > > > > > > > > > > > + "Special Case Mcast Join for MGID " > > > > > > > > > > > > > > + " MGID 0x%016"PRIx64" : 0x%016"PRIx64"\n", > > > > > > > > > > > > > > + rcv_prefix, rcv_interface_id); > > > > > > > > > > > > > > + goto match; > > > > > > > > > > > > > > + } > > > > > > > > > > > > > > + } > > > > > > > > > > > > > > return; > > > > > > > > > > > > > > + } > > > > > > > > > > > > > > > > > > > > > > > > > > > > +match: > > > > > > > > > > > > > > if (p_ctxt->p_mgrp) { > > > > > > > > > > > > > > osm_log(sa->p_log, OSM_LOG_ERROR, > > > > > > > > > > > > > > "__search_mgrp_by_mgid: ERR 1B03: " > > > > > > > > > > > > > > diff --git a/opensm/opensm/osm_sa_path_record.c b/opensm/opensm/osm_sa_path_record.c > > > > > > > > > > > > > > index 749a936..469773a 100644 > > > > > > > > > > > > > > --- a/opensm/opensm/osm_sa_path_record.c > > > > > > > > > > > > > > +++ b/opensm/opensm/osm_sa_path_record.c > > > > > > > > > > > > > > @@ -1536,8 +1536,37 @@ __search_mgrp_by_mgid(IN cl_map_item_t * const p_map_item, IN void *context) > > > > > > > > > > > > > > > > > > > > > > > > > > > > /* compare entire MGID so different scope will not sneak in for > > > > > > > > > > > > > > the same MGID */ > > > > > > > > > > > > > > - if (memcmp(&p_mgrp->mcmember_rec.mgid, p_recvd_mgid, sizeof(ib_gid_t))) > > > > > > > > > > > > > > + if (memcmp(&p_mgrp->mcmember_rec.mgid, 
p_recvd_mgid, sizeof(ib_gid_t))) { > > > > > > > > > > > > > > + > > > > > > > > > > > > > > + /* Special Case IPV6 Multicast Loopback addresses */ > > > > > > > > > > > > > > + /* 0xff12601bffff0000 : 0x00000001ffXXXXXX */ > > > > > > > > > > > > > > +#define SPEC_PREFIX (0xff12601bffff0000) > > > > > > > > > > > > > > +#define INT_ID_MASK (0x00000001ff000000) > > > > > > > > > > > > > > + uint64_t g_prefix = cl_ntoh64(p_mgrp->mcmember_rec.mgid.unicast.prefix); > > > > > > > > > > > > > > + uint64_t g_interface_id = cl_ntoh64(p_mgrp->mcmember_rec.mgid.unicast.interface_id); > > > > > > > > > > > > > > + uint64_t rcv_prefix = cl_ntoh64(p_recvd_mgid->unicast.prefix); > > > > > > > > > > > > > > + uint64_t rcv_interface_id = cl_ntoh64(p_recvd_mgid->unicast.interface_id); > > > > > > > > > > > > > > + > > > > > > > > > > > > > > + if (rcv_prefix == SPEC_PREFIX > > > > > > > > > > > > > > + && > > > > > > > > > > > > > > + (rcv_interface_id & INT_ID_MASK) == INT_ID_MASK) { > > > > > > > > > > > > > > + > > > > > > > > > > > > > > + if ((g_prefix == rcv_prefix) > > > > > > > > > > > > > > + && > > > > > > > > > > > > > > + (g_interface_id & INT_ID_MASK) == > > > > > > > > > > > > > > + (rcv_interface_id & INT_ID_MASK) > > > > > > > > > > > > > > + ) { > > > > > > > > > > > > > > + osm_log(sa->p_log, OSM_LOG_INFO, > > > > > > > > > > > > > > + "Special Case Mcast Join for MGID " > > > > > > > > > > > > > > + " MGID 0x%016"PRIx64" : 0x%016"PRIx64"\n", > > > > > > > > > > > > > > + rcv_prefix, rcv_interface_id); > > > > > > > > > > > > > > + goto match; > > > > > > > > > > > > > > + } > > > > > > > > > > > > > > + } > > > > > > > > > > > > > > return; > > > > > > > > > > > > > > + } > > > > > > > > > > > > > > + > > > > > > > > > > > > > > +match: > > > > > > > > > > > > > > > > > > > > > > > > > > > > #if 0 > > > > > > > > > > > > > > for (i = 0; > > > > > > > > > > > > > > -- > > > > > > > > > > > > > > 1.5.1 > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > > > > > > > > > > > On Fri, 11 Jan 2008 22:04:56 -0800 > > > > > > > > > > > > > > Ira Weiny wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Ok, > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > I found my own answer. Sorry for the spam. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > http://lists.openfabrics.org/pipermail/general/2006-November/029617.html > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Sorry, > > > > > > > > > > > > > > > Ira > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Fri, 11 Jan 2008 19:36:57 -0800 > > > > > > > > > > > > > > > Ira Weiny wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > I don't really understand the innerworkings of IPoIB so forgive me if this is a > > > > > > > > > > > > > > > > really stupid question but: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Is it a bug that there is a Multicast group created for every node in our > > > > > > > > > > > > > > > > clusters? > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > If not a bug why is this done? We just tried to boot on a 1151 node cluster > > > > > > > > > > > > > > > > and opensm is complaining there are not enough multicast groups. 
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Jan 11 18:30:42 728984 [40C05960] -> __get_new_mlid: ERR 1B23: All available:1024 mlids are taken > > > > > > > > > > > > > > > > Jan 11 18:30:42 729050 [40C05960] -> osm_mcmr_rcv_create_new_mgrp: ERR 1B19: __get_new_mlid failed > > > > > > > > > > > > > > > > Jan 11 18:30:42 730647 [40401960] -> __get_new_mlid: ERR 1B23: All available:1024 mlids are taken > > > > > > > > > > > > > > > > Jan 11 18:30:42 730691 [40401960] -> osm_mcmr_rcv_create_new_mgrp: ERR 1B19: __get_new_mlid failed > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Here is the output from my small test cluster: (ibnodesinmcast uses saquery a > > > > > > > > > > > > > > > > couple of times to print this nice report.) > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > 19:17:24 > whatsup > > > > > > > > > > > > > > > > up: 9: wopr[0-7],wopri > > > > > > > > > > > > > > > > down: 0: > > > > > > > > > > > > > > > > root at wopri:/tftpboot/images > > > > > > > > > > > > > > > > 19:25:03 > ibnodesinmcast -g > > > > > > > > > > > > > > > > 0xC000 (0xff12401bffff0000 : 0x00000000ffffffff) > > > > > > > > > > > > > > > > In 9: wopr[0-7],wopri > > > > > > > > > > > > > > > > Out 0: 0 > > > > > > > > > > > > > > > > 0xC001 (0xff12401bffff0000 : 0x0000000000000001) > > > > > > > > > > > > > > > > In 9: wopr[0-7],wopri > > > > > > > > > > > > > > > > Out 0: 0 > > > > > > > > > > > > > > > > 0xC002 (0xff12601bffff0000 : 0x00000001ff2265ed) > > > > > > > > > > > > > > > > In 1: wopr3 > > > > > > > > > > > > > > > > Out 8: wopr[0-2,4-7],wopri > > > > > > > > > > > > > > > > 0xC003 (0xff12601bffff0000 : 0x0000000000000001) > > > > > > > > > > > > > > > > In 9: wopr[0-7],wopri > > > > > > > > > > > > > > > > Out 0: 0 > > > > > > > > > > > > > > > > 0xC004 (0xff12601bffff0000 : 0x00000001ff222729) > > > > > > > > > > > > > > > > In 1: wopr4 > > > > > > > > > > > > 
> > > > Out 8: wopr[0-3,5-7],wopri > > > > > > > > > > > > > > > > 0xC005 (0xff12601bffff0000 : 0x00000001ff219e65) > > > > > > > > > > > > > > > > In 1: wopri > > > > > > > > > > > > > > > > Out 8: wopr[0-7] > > > > > > > > > > > > > > > > 0xC006 (0xff12601bffff0000 : 0x00000001ff00232d) > > > > > > > > > > > > > > > > In 1: wopr6 > > > > > > > > > > > > > > > > Out 8: wopr[0-5,7],wopri > > > > > > > > > > > > > > > > 0xC007 (0xff12601bffff0000 : 0x00000001ff002325) > > > > > > > > > > > > > > > > In 1: wopr7 > > > > > > > > > > > > > > > > Out 8: wopr[0-6],wopri > > > > > > > > > > > > > > > > 0xC008 (0xff12601bffff0000 : 0x00000001ff228d35) > > > > > > > > > > > > > > > > In 1: wopr1 > > > > > > > > > > > > > > > > Out 8: wopr[0,2-7],wopri > > > > > > > > > > > > > > > > 0xC009 (0xff12601bffff0000 : 0x00000001ff2227f1) > > > > > > > > > > > > > > > > In 1: wopr2 > > > > > > > > > > > > > > > > Out 8: wopr[0-1,3-7],wopri > > > > > > > > > > > > > > > > 0xC00A (0xff12601bffff0000 : 0x00000001ff219ef1) > > > > > > > > > > > > > > > > In 1: wopr0 > > > > > > > > > > > > > > > > Out 8: wopr[1-7],wopri > > > > > > > > > > > > > > > > 0xC00B (0xff12601bffff0000 : 0x00000001ff0021e9) > > > > > > > > > > > > > > > > In 1: wopr5 > > > > > > > > > > > > > > > > Out 8: wopr[0-4,6-7],wopri > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Each of these MGIDS of the prefix (0xff12601bffff0000) have just one node in > > > > > > > > > > > > > > > > them and represent an ipv6 address. Could you turn off ipv6 with the latest > > > > > > > > > > > > > > > > IPoIB? 
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > In a bind, > > > > > > > > > > > > > > > > Ira > > > > > > > > > > > > > > > > _______________________________________________ > > > > > > > > > > > > > > > > general mailing list > > > > > > > > > > > > > > > > general at lists.openfabrics.org > > > > > > > > > > > > > > > > http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
From weiny2 at llnl.gov Tue Jan 15 11:35:33 2008 From: weiny2 at llnl.gov (Ira Weiny) Date: Tue, 15 Jan 2008 11:35:33 -0800 Subject: [ofa-general] Re: [PATCH] opensm/perfmgr: use pkey at index 0 In-Reply-To: <20080115182142.GC16009@sashak.voltaire.com> References: <200801061801.25386.dotanb@dev.mellanox.co.il> <4781F18F.1070506@voltaire.com> <4781FB41.6040204@dev.mellanox.co.il> <4782160B.1080709@dev.mellanox.co.il> <478217A6.80307@dev.mellanox.co.il> <47821FF3.7020705@dev.mellanox.co.il> <1199718187.20870.102.camel@hrosenstock-ws.xsigo.com> <20080113193435.GG10650@sashak.voltaire.com> <20080113193559.GH10650@sashak.voltaire.com> <20080115182142.GC16009@sashak.voltaire.com> Message-ID: <20080115113533.10c3c7c9.weiny2@llnl.gov> Perhaps the variable name should be pkey_index? Ira On Tue, 15 Jan 2008 18:21:42 +0000 Sasha Khapyorsky wrote: > > Use pkey at index 0 of port's pkey table, now this value is passed to > user_mad. For some reason the old code (where 0xffff was passed) worked > too.
> > Signed-off-by: Sasha Khapyorsky > --- > opensm/opensm/osm_perfmgr.c | 2 +- > 1 files changed, 1 insertions(+), 1 deletions(-) > > diff --git a/opensm/opensm/osm_perfmgr.c b/opensm/opensm/osm_perfmgr.c > index 76ef080..66d919d 100644 > --- a/opensm/opensm/osm_perfmgr.c > +++ b/opensm/opensm/osm_perfmgr.c > @@ -399,7 +399,7 @@ osm_perfmgr_send_pc_mad(osm_perfmgr_t * perfmgr, ib_net16_t dest_lid, > p_madw->mad_addr.addr_type.gsi.remote_qkey = > cl_hton32(IB_QP1_WELL_KNOWN_Q_KEY); > /* FIXME what about other partitions */ > - p_madw->mad_addr.addr_type.gsi.pkey = cl_hton16(0xFFFF); > + p_madw->mad_addr.addr_type.gsi.pkey = 0; > p_madw->mad_addr.addr_type.gsi.service_level = 0; > p_madw->mad_addr.addr_type.gsi.global_route = FALSE; > p_madw->resp_expected = TRUE; > -- > 1.5.4.rc2.60.gb2e62 From sashak at voltaire.com Tue Jan 15 11:50:02 2008 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Tue, 15 Jan 2008 19:50:02 +0000 Subject: [ofa-general] Re: [PATCH] opensm/perfmgr: use pkey at index 0 In-Reply-To: <20080115113533.10c3c7c9.weiny2@llnl.gov> References: <4781F18F.1070506@voltaire.com> <4781FB41.6040204@dev.mellanox.co.il> <4782160B.1080709@dev.mellanox.co.il> <478217A6.80307@dev.mellanox.co.il> <47821FF3.7020705@dev.mellanox.co.il> <1199718187.20870.102.camel@hrosenstock-ws.xsigo.com> <20080113193435.GG10650@sashak.voltaire.com> <20080113193559.GH10650@sashak.voltaire.com> <20080115182142.GC16009@sashak.voltaire.com> <20080115113533.10c3c7c9.weiny2@llnl.gov> Message-ID: <20080115195002.GL16009@sashak.voltaire.com> On 11:35 Tue 15 Jan , Ira Weiny wrote: > Perhaps the variable name should be pkey_index? Yes, I thought about this too. There are some mess in this area this other old ones osmvendor libraries, I'm reviewing this yet. Sasha > > Ira > > On Tue, 15 Jan 2008 18:21:42 +0000 > Sasha Khapyorsky wrote: > > > > > Use pkey at index 0 of port's pkey table, now this value is passed to > > user_mad. 
For some reason the old code (where 0xffff was passed) worked > > too. > > > > Signed-off-by: Sasha Khapyorsky > > --- > > opensm/opensm/osm_perfmgr.c | 2 +- > > 1 files changed, 1 insertions(+), 1 deletions(-) > > > > diff --git a/opensm/opensm/osm_perfmgr.c b/opensm/opensm/osm_perfmgr.c > > index 76ef080..66d919d 100644 > > --- a/opensm/opensm/osm_perfmgr.c > > +++ b/opensm/opensm/osm_perfmgr.c > > @@ -399,7 +399,7 @@ osm_perfmgr_send_pc_mad(osm_perfmgr_t * perfmgr, ib_net16_t dest_lid, > > p_madw->mad_addr.addr_type.gsi.remote_qkey = > > cl_hton32(IB_QP1_WELL_KNOWN_Q_KEY); > > /* FIXME what about other partitions */ > > - p_madw->mad_addr.addr_type.gsi.pkey = cl_hton16(0xFFFF); > > + p_madw->mad_addr.addr_type.gsi.pkey = 0; > > p_madw->mad_addr.addr_type.gsi.service_level = 0; > > p_madw->mad_addr.addr_type.gsi.global_route = FALSE; > > p_madw->resp_expected = TRUE; > > -- > > 1.5.4.rc2.60.gb2e62 From tziporet at dev.mellanox.co.il Tue Jan 15 12:40:41 2008 From: tziporet at dev.mellanox.co.il (Tziporet Koren) Date: Tue, 15 Jan 2008 22:40:41 +0200 Subject: [ewg] RE: [ofa-general] OFED Jan 14 meeting summary on RC2 readiness In-Reply-To: <000801c857a0$ee083bc0$a937170a@amr.corp.intel.com> References: <6C2C79E72C305246B504CBA17B5500C9031E5E56@mtlexch01.mtl.com> <000801c857a0$ee083bc0$a937170a@amr.corp.intel.com> Message-ID: <478D1A49.1080807@mellanox.co.il> Sean Hefty wrote: > If XRC hasn't been added, then this isn't an 'RC', it's simply an alpha release, > with another alpha release to follow. IMO, XRC should simply be pushed out to > the next release. I don't see the reason to delay the release for a > non-standard feature that likely has only 1-2 users. > > We actually decided not to delay the release even if XRC is not ready RC4 was planned in advanced so it does not change our plans XRC will be in alpha stage in RC3 will be tested with OpenMPI 1.3. 
> Maybe the EWG needs to ask themselves if OFED trying to distribute production > ready, enterprise tested software, or software that simply provides the latest > and greatest features. > > When we started OFED we decided to enable new features that can be in lower stability level, in case they do not harm the overall stability of the OFED release. I think XRC fulfill this criteria. Tziporet From sashak at voltaire.com Tue Jan 15 13:11:04 2008 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Tue, 15 Jan 2008 21:11:04 +0000 Subject: [ofa-general] Re: [PATCH] opensm/perfmgr: use pkey at index 0 In-Reply-To: <20080115195002.GL16009@sashak.voltaire.com> References: <4781FB41.6040204@dev.mellanox.co.il> <4782160B.1080709@dev.mellanox.co.il> <478217A6.80307@dev.mellanox.co.il> <47821FF3.7020705@dev.mellanox.co.il> <1199718187.20870.102.camel@hrosenstock-ws.xsigo.com> <20080113193435.GG10650@sashak.voltaire.com> <20080113193559.GH10650@sashak.voltaire.com> <20080115182142.GC16009@sashak.voltaire.com> <20080115113533.10c3c7c9.weiny2@llnl.gov> <20080115195002.GL16009@sashak.voltaire.com> Message-ID: <20080115211104.GN16009@sashak.voltaire.com> On 19:50 Tue 15 Jan , Sasha Khapyorsky wrote: > On 11:35 Tue 15 Jan , Ira Weiny wrote: > > Perhaps the variable name should be pkey_index? > > Yes, I thought about this too. There are some mess in this area this > other old ones osmvendor libraries, I'm reviewing this yet. Also such rename breaks ibutils build. The fix seems easy however and likely needed anyway because the value 0xffff is used currently. 
Sasha From sean.hefty at intel.com Tue Jan 15 13:02:12 2008 From: sean.hefty at intel.com (Sean Hefty) Date: Tue, 15 Jan 2008 13:02:12 -0800 Subject: [ewg] RE: [ofa-general] OFED Jan 14 meeting summary on RC2readiness In-Reply-To: <478D1A49.1080807@mellanox.co.il> References: <6C2C79E72C305246B504CBA17B5500C9031E5E56@mtlexch01.mtl.com><000801c857a0$ee083bc0$a937170a@amr.corp.intel.com> <478D1A49.1080807@mellanox.co.il> Message-ID: <000201c857b9$e67ee020$a937170a@amr.corp.intel.com> >When we started OFED we decided to enable new features that can be in >lower stability level, in case they do not harm the overall stability of >the OFED release. >I think XRC fulfill this criteria. XRC changes the verbs interfaces and code. It increases the risk of instability. Changes to IPoIB increase the risk of instability. Something like SDP doesn't, since it's an entirely separate module. >From my view, OFED is enabling new features over stability. I would rather see OFED pull code from upstream with patches added on only for backports and fixes. - Sean
From weiny2 at llnl.gov Tue Jan 15 13:20:12 2008 From: weiny2 at llnl.gov (Ira Weiny) Date: Tue, 15 Jan 2008 13:20:12 -0800 Subject: [ofa-general] Re: [PATCH] opensm/perfmgr: use pkey at index 0 In-Reply-To: <20080115211104.GN16009@sashak.voltaire.com> References: <4781FB41.6040204@dev.mellanox.co.il> <4782160B.1080709@dev.mellanox.co.il> <478217A6.80307@dev.mellanox.co.il> <47821FF3.7020705@dev.mellanox.co.il> <1199718187.20870.102.camel@hrosenstock-ws.xsigo.com> <20080113193435.GG10650@sashak.voltaire.com> <20080113193559.GH10650@sashak.voltaire.com> <20080115182142.GC16009@sashak.voltaire.com> <20080115113533.10c3c7c9.weiny2@llnl.gov> <20080115195002.GL16009@sashak.voltaire.com> <20080115211104.GN16009@sashak.voltaire.com> Message-ID: <20080115132012.46a7d512.weiny2@llnl.gov> On Tue, 15 Jan 2008 21:11:04 +0000 Sasha Khapyorsky wrote: > On 19:50 Tue 15 Jan , Sasha Khapyorsky wrote: > > On 11:35 Tue 15 Jan , Ira Weiny wrote: > > > Perhaps the variable name should be pkey_index? > > > > Yes, I thought about this too. There are some mess in this area this > > other old ones osmvendor libraries, I'm reviewing this yet. > > Also such rename breaks ibutils build. The fix seems easy however and > likely needed anyway because the value 0xffff is used currently. Yes, absolutely the fix is needed. Thanks, Ira From weiny2 at llnl.gov Tue Jan 15 14:16:55 2008 From: weiny2 at llnl.gov (Ira Weiny) Date: Tue, 15 Jan 2008 14:16:55 -0800 Subject: [PATCH] opensm: Special Case the IPv6 Solicited Node Multicast address to use a single Mcast (WAS: Re: [ofa-general] IPoIB, OFED 1.2.5, and multicast groups.) 
In-Reply-To: <20080115193347.GI16009@sashak.voltaire.com> References: <20080114153546.73d43d6b.weiny2@llnl.gov> <1200355500.8962.148.camel@hrosenstock-ws.xsigo.com> <20080115005045.GG16009@sashak.voltaire.com> <1200360160.8962.196.camel@hrosenstock-ws.xsigo.com> <20080115014355.GK16009@sashak.voltaire.com> <1200361233.8962.206.camel@hrosenstock-ws.xsigo.com> <20080115084700.1d46407c.weiny2@llnl.gov> <1200418638.8962.328.camel@hrosenstock-ws.xsigo.com> <20080115192010.GG16009@sashak.voltaire.com> <1200424673.8962.364.camel@hrosenstock-ws.xsigo.com> <20080115193347.GI16009@sashak.voltaire.com> Message-ID: <20080115141655.1e2c4c04.weiny2@llnl.gov> On Tue, 15 Jan 2008 19:33:47 +0000 Sasha Khapyorsky wrote: > On 11:17 Tue 15 Jan , Hal Rosenstock wrote: > > On Tue, 2008-01-15 at 19:20 +0000, Sasha Khapyorsky wrote: > > > On 09:37 Tue 15 Jan , Hal Rosenstock wrote: > > > > On Tue, 2008-01-15 at 08:47 -0800, Ira Weiny wrote: > > > > > On Mon, 14 Jan 2008 17:40:33 -0800 > > > > > Hal Rosenstock wrote: > > > > > > > > > > > On Tue, 2008-01-15 at 01:43 +0000, Sasha Khapyorsky wrote: > > > > > > > On 17:22 Mon 14 Jan , Hal Rosenstock wrote: > > > > > > > > On Tue, 2008-01-15 at 00:50 +0000, Sasha Khapyorsky wrote: > > > > > > > > > On 16:05 Mon 14 Jan , Hal Rosenstock wrote: > > > > > > > > > > On Mon, 2008-01-14 at 15:35 -0800, Ira Weiny wrote: > > > > > > > > > > > On Mon, 14 Jan 2008 12:23:34 -0800 > > > > > > > > > > > Hal Rosenstock wrote: > > > > > > > > > > > > > > > > > > > > > > > On Mon, 2008-01-14 at 10:51 -0800, Ira Weiny wrote: > > > > > > > > > > > > > Hey Hal, thanks for the response. Comments below. > > > > > > > > > > > > > > > > > > > > > > > > > > On Mon, 14 Jan 2008 12:57:45 -0500 > > > > > > > > > > > > > "Hal Rosenstock" wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > No, MLID is not the group (at least in IBA terms); I was referring to > > > > > > the base SNM MGID (with partition and low 24 bits masked off). 
> > > > > > > > > > You are right the MLID is not the group. But the patch only creates 1 MGID as > > > > > well. > > > > > > > > I think you are confusing internal implementation with the outside view > > > > of what is going on. > > > > > > > > > I think I see where you are coming from but... > > > > > > > > > > Let me ask this. (I think I know the answer but I will ask anyway.) If you > > > > > have 3 MGID's (0xFF...1, 0xFF...2, 0xFF...3, 3 actual mgrp structures in > > > > > opensm) and you map them all to MLID 0xC001 will a message to 0xC001 reach all > > > > > 3 nodes? > > > > > > > > Ignoring the partitioning (and assuming the rates and MTUs are all the > > > > same), then yes. > > > > > > > > Let me ask you this: > > > > Do all IPv6 SNM MGIDs show up when you do SA GetTable for groups ? > > > > > > Let me answer. No. Only first MGID will be shown. > > > > > > I guess it is wrong in terms of IBTA, > > > > Yup; all those MGIDs need to be able to be queried. I _will_ take your word for this, but I am still curious as to who is going to know these MGIDs to be queried? > > > > > but since whole feature is optional > > > I don't think it is a disaster (rather known limitation), and ipv6 not > > > working with big clusters is. > > > > Yes, but IMO we can/should do better than this. > > Nobody said anything against it. Totally agreed. Ira > > > It's not much more work, > > is it ? > > I guess it is more work, don't know for sure yet. 
> > Sasha > From rdreier at cisco.com Tue Jan 15 14:27:57 2008 From: rdreier at cisco.com (Roland Dreier) Date: Tue, 15 Jan 2008 14:27:57 -0800 Subject: [ofa-general] Re: [PATCH] ib/core: Add creation flags to create QP In-Reply-To: <1200416103.11174.303.camel@mtls03> (Eli Cohen's message of "Tue, 15 Jan 2008 18:55:03 +0200") References: <1199980899.11174.91.camel@mtls03> <1200402798.11174.279.camel@mtls03> <1200416103.11174.303.camel@mtls03> Message-ID: > On a second thought, we could add - in addition to the creation flags - > a special qp type, IB_QPT_UD_IPOIB for example, where each low level > driver could add special enhancements for the sake of achieving higher > performance. What do you thinks? Did you have some specific use for this in mind? I don't think it's a good idea to add this unless there is a really big win we can get. - R. From rdreier at cisco.com Tue Jan 15 14:28:58 2008 From: rdreier at cisco.com (Roland Dreier) Date: Tue, 15 Jan 2008 14:28:58 -0800 Subject: [ofa-general] Re: [PATCH] ib/core: Add creation flags to create QP In-Reply-To: <6C2C79E72C305246B504CBA17B5500C9031E5D23@mtlexch01.mtl.com> (Jack Morgenstein's message of "Tue, 15 Jan 2008 15:28:34 +0200") References: <1199980899.11174.91.camel@mtls03> <1200402798.11174.279.camel@mtls03> <6C2C79E72C305246B504CBA17B5500C9031E5D23@mtlexch01.mtl.com> Message-ID: > We might put in a note that callers should always cause unused > fields of the ib_qp_init_attr structure to be zero. I don't think there's any need to do this as long as the flags are only in the kernel-level API. - R. 
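A minimal sketch of the zero-initialization point raised in the message above, assuming a simplified stand-in structure. The names qp_init_attr_sketch, make_default_attr and wants_special_qp are hypothetical and exist only for illustration; this is not the kernel's struct ib_qp_init_attr. The idea it tries to show: if a caller clears the whole structure before filling in the fields it knows about, a flags field added later reads as 0 instead of leftover stack contents.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for an init-attribute struct; NOT the real
 * struct ib_qp_init_attr. create_flags plays the role of a field that
 * was added after existing callers were written. */
struct qp_init_attr_sketch {
	uint32_t max_send_wr;
	uint32_t max_recv_wr;
	int qp_type;
	uint32_t create_flags;	/* newly added; 0 means "no special behavior" */
};

/* Clear every field first, then set only what this caller knows about. */
static struct qp_init_attr_sketch make_default_attr(uint32_t send_wr,
						     uint32_t recv_wr)
{
	struct qp_init_attr_sketch attr;

	memset(&attr, 0, sizeof(attr));
	attr.max_send_wr = send_wr;
	attr.max_recv_wr = recv_wr;
	return attr;
}

/* A consumer that would misbehave if create_flags held garbage. */
static int wants_special_qp(const struct qp_init_attr_sketch *attr)
{
	return attr->create_flags != 0;
}
```

With this pattern a caller written before create_flags existed still hands the consumer a well-defined value, which is presumably why auditing callers for a missing memset came up in the thread.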
From rdreier at cisco.com Tue Jan 15 14:29:44 2008 From: rdreier at cisco.com (Roland Dreier) Date: Tue, 15 Jan 2008 14:29:44 -0800 Subject: [ofa-general] Re: [PATCH] ib/core: Add creation flags to create QP In-Reply-To: <1200402798.11174.279.camel@mtls03> (Eli Cohen's message of "Tue, 15 Jan 2008 15:13:18 +0200") References: <1199980899.11174.91.camel@mtls03> <1200402798.11174.279.camel@mtls03> Message-ID: > I missed that one. There is one more place that might be a problem and > that is rdma_create_qp which is an exported function which accepts > struct ib_qp_init_attr * as an argument. This means that we need to > either clear the create_flags field or require to the caller to put a > valid value. What do you think? Yes, that's right, you would need to audit all the callers there too. - R. From swise at opengridcomputing.com Tue Jan 15 14:32:11 2008 From: swise at opengridcomputing.com (Steve Wise) Date: Tue, 15 Jan 2008 16:32:11 -0600 Subject: [ofa-general] new API for the rdma-cma Message-ID: <478D346B.8070506@opengridcomputing.com> Hey Sean, What do you think about adding an API to the rdma-cma allowing applications to get a list of ip addresses associated with a particular rdma device/port? It seems useful, and I know of two applications wanting this: 1) lamprey testing tools 2) OMPI over iWARP. From sean.hefty at intel.com Tue Jan 15 15:12:08 2008 From: sean.hefty at intel.com (Sean Hefty) Date: Tue, 15 Jan 2008 15:12:08 -0800 Subject: [ofa-general] RE: new API for the rdma-cma In-Reply-To: <478D346B.8070506@opengridcomputing.com> References: <478D346B.8070506@opengridcomputing.com> Message-ID: <000801c857cc$0d6c5ed0$a937170a@amr.corp.intel.com> >What do you think about adding an API to the rdma-cma allowing >applications to get a list of ip addresses associated with a particular >rdma device/port? It seems kind of backwards from the design of the rdma_cm, but if it's useful to end users, I don't have any objections to having it.
Do you know if there would be a way to obtain this data entirely from userspace, or would kernel support be needed? - Sean From arlin.r.davis at intel.com Tue Jan 15 15:12:10 2008 From: arlin.r.davis at intel.com (Arlin Davis) Date: Tue, 15 Jan 2008 15:12:10 -0800 Subject: [ofa-general] [PATCH] uDAPL v2: openib_cma: cleanup destroy/accept, add SID to PORT macros, fix init/resp settings in accept Message-ID: <000001c857cc$0e93a0c0$ff0da8c0@amr.corp.intel.com> openib_cma: fix cleanup issues in destroy_conn and accept macros to convert SID and PORT, network order fix init/responder settings in accept Signed-off by: Arlin Davis diff --git a/dapl/openib_cma/dapl_ib_cm.c b/dapl/openib_cma/dapl_ib_cm.c index e8c33f2..f7d83e0 100755 --- a/dapl/openib_cma/dapl_ib_cm.c +++ b/dapl/openib_cma/dapl_ib_cm.c @@ -69,13 +69,15 @@ static inline uint64_t cpu_to_be64(uint64_t x) { return bswap_64(x); } static inline uint64_t cpu_to_be64(uint64_t x) { return x; } #endif -/* cma requires 16 bit SID */ +/* cma requires 16 bit SID, in network order */ #define IB_PORT_MOD 32001 #define IB_PORT_BASE (65535 - IB_PORT_MOD) -#define MAKE_PORT(SID) \ +#define SID_TO_PORT(SID) \ (SID > 0xffff ? 
\ - (unsigned short)((SID % IB_PORT_MOD) + IB_PORT_BASE) :\ - (unsigned short)SID) + htons((unsigned short)((SID % IB_PORT_MOD) + IB_PORT_BASE)) :\ + htons((unsigned short)SID)) + +#define PORT_TO_SID(p) ntohs(p) static void dapli_addr_resolve(struct dapl_cm_id *conn) { @@ -173,8 +175,10 @@ void dapli_destroy_conn(struct dapl_cm_id *conn) dapl_os_lock(&conn->lock); conn->destroy = 1; - if (conn->ep) + if (conn->ep) { conn->ep->cm_handle = IB_INVALID_HANDLE; + conn->ep->qp_handle = IB_INVALID_HANDLE; + } cm_id = conn->cm_id; conn->cm_id = NULL; @@ -220,10 +224,10 @@ static struct dapl_cm_id * dapli_req_recv(struct dapl_cm_id *conn, /* Get requesters connect data, setup for accept */ new_conn->params.responder_resources = - DAPL_MIN(event->param.conn.initiator_depth, + DAPL_MIN(event->param.conn.responder_resources, conn->hca->ib_trans.max_rdma_rd_in); new_conn->params.initiator_depth = - DAPL_MIN(event->param.conn.responder_resources, + DAPL_MIN(event->param.conn.initiator_depth, conn->hca->ib_trans.max_rdma_rd_out); new_conn->params.flow_control = event->param.conn.flow_control; @@ -348,8 +352,8 @@ static void dapli_cm_active_cb(struct dapl_cm_id *conn, &conn->cm_id->route.addr.dst_addr)->sin_addr.s_addr)); /* setup local and remote ports for ep query */ - conn->ep->param.remote_port_qual = rdma_get_dst_port(conn->cm_id); - conn->ep->param.local_port_qual = rdma_get_src_port(conn->cm_id); + conn->ep->param.remote_port_qual = PORT_TO_SID(rdma_get_dst_port(conn->cm_id)); + conn->ep->param.local_port_qual = PORT_TO_SID(rdma_get_src_port(conn->cm_id)); dapl_evd_connection_callback(conn, IB_CME_CONNECTED, event->param.conn.private_data, conn->ep); @@ -526,8 +530,9 @@ DAT_RETURN dapls_ib_connect(IN DAT_EP_HANDLE ep_handle, if (NULL == ep_ptr) return DAT_SUCCESS; - dapl_dbg_log(DAPL_DBG_TYPE_CM, " connect: rSID %d, pdata %p, ln %d\n", - r_qual,p_data,p_size); + dapl_dbg_log(DAPL_DBG_TYPE_CM, + " connect: rSID 0x%llx rPort %d, pdata %p, ln %d\n", + 
r_qual,ntohs(SID_TO_PORT(r_qual)),p_data,p_size); /* rdma conn and cm_id pre-bound; reference via qp_handle */ conn = ep_ptr->cm_handle = ep_ptr->qp_handle; @@ -549,7 +554,7 @@ DAT_RETURN dapls_ib_connect(IN DAT_EP_HANDLE ep_handle, dapl_os_memcpy(&conn->r_addr, r_addr, sizeof(*r_addr)); /* Resolve remote address, src already bound during QP create */ - ((struct sockaddr_in*)&conn->r_addr)->sin_port = htons(MAKE_PORT(r_qual)); + ((struct sockaddr_in*)&conn->r_addr)->sin_port = SID_TO_PORT(r_qual); ((struct sockaddr_in*)&conn->r_addr)->sin_family = AF_INET; if (rdma_resolve_addr(conn->cm_id, NULL, @@ -593,7 +598,7 @@ dapls_ib_disconnect(IN DAPL_EP *ep_ptr, " disconnect(ep %p, conn %p, id %d flags %x)\n", ep_ptr,conn, (conn?conn->cm_id:0),close_flags); - if (conn == IB_INVALID_HANDLE) + if ((conn == IB_INVALID_HANDLE) || (conn->cm_id == NULL)) return DAT_SUCCESS; /* no graceful half-pipe disconnect option */ @@ -682,8 +687,7 @@ dapls_ib_setup_conn_listener(IN DAPL_IA *ia_ptr, /* open identifies the local device; per DAT specification */ /* Get family and address then set port to consumer's ServiceID */ dapl_os_memcpy(&addr, &ia_ptr->hca_ptr->hca_address, sizeof(addr)); - ((struct sockaddr_in *)&addr)->sin_port = htons(MAKE_PORT(ServiceID)); - + ((struct sockaddr_in *)&addr)->sin_port = SID_TO_PORT(ServiceID); if (rdma_bind_addr(conn->cm_id,(struct sockaddr *)&addr)) { if (errno == EBUSY) @@ -695,8 +699,8 @@ dapls_ib_setup_conn_listener(IN DAPL_IA *ia_ptr, } dapl_dbg_log(DAPL_DBG_TYPE_CM, - " listen(ia_ptr %p SID %d sp %p conn %p id %d)\n", - ia_ptr, MAKE_PORT(ServiceID), + " listen(ia_ptr %p SID 0x%llx Port %d sp %p conn %p id %d)\n", + ia_ptr, ServiceID, ntohs(SID_TO_PORT(ServiceID)), sp_ptr, conn, conn->cm_id); sp_ptr->cm_srvc_handle = conn; @@ -841,9 +845,6 @@ dapls_ib_accept_connection(IN DAT_CR_HANDLE cr_handle, } cr_ptr->param.local_ep_handle = ep_handle; - ep_ptr->qp_handle = cr_conn; - ep_ptr->cm_handle = cr_conn; - cr_conn->ep = ep_ptr; 
cr_conn->params.private_data = p_data; cr_conn->params.private_data_len = p_size; @@ -854,9 +855,15 @@ dapls_ib_accept_connection(IN DAT_CR_HANDLE cr_handle, goto bail; } + /* save accepted conn and EP reference */ + ep_ptr->qp_handle = cr_conn; + ep_ptr->cm_handle = cr_conn; + cr_conn->ep = ep_ptr; + /* setup local and remote ports for ep query */ - ep_ptr->param.remote_port_qual = rdma_get_dst_port(cr_conn->cm_id); - ep_ptr->param.local_port_qual = rdma_get_src_port(cr_conn->cm_id); + /* Note: port qual in network order */ + ep_ptr->param.remote_port_qual = PORT_TO_SID(rdma_get_dst_port(cr_conn->cm_id)); + ep_ptr->param.local_port_qual = PORT_TO_SID(rdma_get_src_port(cr_conn->cm_id)); return DAT_SUCCESS; bail: From weiny2 at llnl.gov Tue Jan 15 15:34:46 2008 From: weiny2 at llnl.gov (Ira Weiny) Date: Tue, 15 Jan 2008 15:34:46 -0800 Subject: [ofa-general] [PATCH] Fix spelling of "consolidate" Message-ID: <20080115153446.6065e2e0.weiny2@llnl.gov> >From 8415e746a545526fa88f30d4b77534279fe4aec6 Mon Sep 17 00:00:00 2001 From: Ira K. Weiny Date: Tue, 15 Jan 2008 15:17:26 -0800 Subject: [PATCH] Fix spelling of "consolidate" Signed-off-by: Ira K. 
Weiny --- opensm/include/opensm/osm_subnet.h | 2 +- opensm/man/opensm.8 | 4 ++-- opensm/opensm/main.c | 6 ++++-- opensm/opensm/osm_sa_mcmember_record.c | 2 +- opensm/opensm/osm_subnet.c | 10 +++++----- 5 files changed, 13 insertions(+), 11 deletions(-) diff --git a/opensm/include/opensm/osm_subnet.h b/opensm/include/opensm/osm_subnet.h index 558b34e..e60cf91 100644 --- a/opensm/include/opensm/osm_subnet.h +++ b/opensm/include/opensm/osm_subnet.h @@ -283,7 +283,7 @@ typedef struct _osm_subn_opt { char *event_plugin_name; char *node_name_map_name; char *prefix_routes_file; - boolean_t consolodate_ipv6_snm_req; + boolean_t consolidate_ipv6_snm_req; } osm_subn_opt_t; /* * FIELDS diff --git a/opensm/man/opensm.8 b/opensm/man/opensm.8 index 9c7b371..ab7fb8e 100644 --- a/opensm/man/opensm.8 +++ b/opensm/man/opensm.8 @@ -239,8 +239,8 @@ Specify the sweep time for the performance manager in seconds (default is 180 seconds). Only takes effect if --enable-perfmgr was specified at configure time. .TP -.BI --consolodate_ipv6_snm_reqests -Consolodate IPv6 Solicited Node Multicast group joins into 1 IB multicast +.BI --consolidate_ipv6_snm_reqests +Consolidate IPv6 Solicited Node Multicast group joins into 1 IB multicast group. .TP \fB\-v\fR, \fB\-\-verbose\fR diff --git a/opensm/opensm/main.c b/opensm/opensm/main.c index a84f6c2..de69f68 100644 --- a/opensm/opensm/main.c +++ b/opensm/opensm/main.c @@ -296,6 +296,8 @@ static void show_usage(void) " Prefix routes control how the SA responds to path record\n" " queries for off-subnet DGIDs. 
Default file is:\n" " "OSM_DEFAULT_PREFIX_ROUTES_FILE"\n\n"); + printf("--consolidate_ipv6_snm_req\n" + "Consolidate IPv6 Solicited Node Multicast group joins into 1 IB multicast group.\n"); printf("-v\n" "--verbose\n" " This option increases the log verbosity level.\n" @@ -615,7 +617,7 @@ int main(int argc, char *argv[]) {"perfmgr_sweep_time_s", 1, NULL, 2}, #endif {"prefix_routes_file", 1, NULL, 3}, - {"consolodate_ipv6_snm_reqests", 0, NULL, 4}, + {"consolidate_ipv6_snm_reqests", 0, NULL, 4}, {NULL, 0, NULL, 0} /* Required at the end of the array */ }; @@ -918,7 +920,7 @@ int main(int argc, char *argv[]) opt.prefix_routes_file = optarg; break; case 4: - opt.consolodate_ipv6_snm_req = TRUE; + opt.consolidate_ipv6_snm_req = TRUE; break; case 'h': case '?': diff --git a/opensm/opensm/osm_sa_mcmember_record.c b/opensm/opensm/osm_sa_mcmember_record.c index 12c5483..d057207 100644 --- a/opensm/opensm/osm_sa_mcmember_record.c +++ b/opensm/opensm/osm_sa_mcmember_record.c @@ -1169,7 +1169,7 @@ __search_mgrp_by_mgid(IN cl_map_item_t * const p_map_item, IN void *context) the same MGID */ if (memcmp(&p_mgrp->mcmember_rec.mgid, p_recvd_mgid, sizeof(ib_gid_t))) { - if (sa->p_subn->opt.consolodate_ipv6_snm_req) { + if (sa->p_subn->opt.consolidate_ipv6_snm_req) { /* Special Case IPV6 Multicast Loopback addresses */ /* 0xff12601bXXXX0000 : 0x00000001ffYYYYYY */ /* Where XXXX is the partition and YYYYYY is the last 24 bits diff --git a/opensm/opensm/osm_subnet.c b/opensm/opensm/osm_subnet.c index 558ea68..c9b4d57 100644 --- a/opensm/opensm/osm_subnet.c +++ b/opensm/opensm/osm_subnet.c @@ -481,7 +481,7 @@ void osm_subn_set_default_opt(IN osm_subn_opt_t * const p_opt) p_opt->enable_quirks = FALSE; p_opt->no_clients_rereg = FALSE; p_opt->prefix_routes_file = OSM_DEFAULT_PREFIX_ROUTES_FILE; - p_opt->consolodate_ipv6_snm_req = FALSE; + p_opt->consolidate_ipv6_snm_req = FALSE; subn_set_default_qos_options(&p_opt->qos_options); subn_set_default_qos_options(&p_opt->qos_ca_options); 
subn_set_default_qos_options(&p_opt->qos_sw0_options); @@ -1396,8 +1396,8 @@ ib_api_status_t osm_subn_parse_conf_file(IN osm_subn_opt_t * const p_opts) opts_unpack_charp("prefix_routes_file", p_key, p_val, &p_opts->prefix_routes_file); - opts_unpack_boolean("consolodate_ipv6_snm_req", - p_key, p_val, &p_opts->consolodate_ipv6_snm_req); + opts_unpack_boolean("consolidate_ipv6_snm_req", + p_key, p_val, &p_opts->consolidate_ipv6_snm_req); } fclose(opts_file); @@ -1727,8 +1727,8 @@ ib_api_status_t osm_subn_write_conf_file(IN osm_subn_opt_t * const p_opts) fprintf(opts_file, "#\n# IPv6 MCast Options\n#\n" - "consolodate_ipv6_snm_req %s\n\n", - p_opts->consolodate_ipv6_snm_req ? "TRUE" : "FALSE"); + "consolidate_ipv6_snm_req %s\n\n", + p_opts->consolidate_ipv6_snm_req ? "TRUE" : "FALSE"); /* optional string attributes ... */ -- 1.5.1 -------------- next part -------------- A non-text attachment was scrubbed... Name: 0001-Fix-spelling-of-consolidate.patch Type: application/octet-stream Size: 4919 bytes Desc: not available URL: From swise at opengridcomputing.com Tue Jan 15 15:38:05 2008 From: swise at opengridcomputing.com (Steve Wise) Date: Tue, 15 Jan 2008 17:38:05 -0600 Subject: [ofa-general] Re: new API for the rdma-cma In-Reply-To: <000801c857cc$0d6c5ed0$a937170a@amr.corp.intel.com> References: <478D346B.8070506@opengridcomputing.com> <000801c857cc$0d6c5ed0$a937170a@amr.corp.intel.com> Message-ID: <478D43DD.1090007@opengridcomputing.com> Sean Hefty wrote: >> What do you think about adding an API to the rdma-cma allowing >> applications to get a list of ip addresses associated with a particular >> rdma device/port? > > It seems kind of backwards from the design of the rdma_cm, but if it's useful to > end users, I don't have any objections to having it. > Consider OMPI, which looks at each device on the system that can be used to connect to the other nodes. Each device is analyzed to see what method of communication should be used (tcp, ib, iwarp, whatever). 
Then these interfaces and their attributes are conveyed to all the nodes and the desired communication mesh is determined. Well, to take this type of approach and support the rdma-cm, one really needs to know which ip addresses are bound to which rdma devices so those ip addresses (and the chosen ip port number) can be advertised.

BTW: mvapich2 punted on this issue by putting the local rdma ipaddr in a file called /etc/mv2.conf. Thus they cannot easily handle multiple rdma devices since the file is global to each host. OMPI is trying to be more dynamic and discover these addresses at run time on a per-device/port basis.

> Do you know if there would be a way to obtain this data entirely from userspace,
> or would kernel support be needed?
>

It can be done in userspace, but it will need to issue an ioctl() to get the ip address. The ioctl would be on an AF_INET/SOCK_DGRAM socket file descriptor allocated just for this query. (This is how ifconfig works, by the way.)

Here's a simple ipv4-only way to do this:

Prototype:

u32 rdma_get_ipv4addr(struct ibv_context *verbs, int dev_port_num)

Description:

Returns the main IPv4 address associated with the rdma device/port in Network Byte Order. This is suitable for stuffing in a sockaddr_in and issuing an rdma_connect(), for instance.

Implementation:

This function can use the /sys/class/infiniband_verbs/uverbs*/device/net:* directories to derive the netdev interface name associated with this device/port. Like this:

[root@vic20 ~]# ls -d /sys/class/infiniband_verbs/uverbs*/device/net:*
/sys/class/infiniband_verbs/uverbs0/device/net:ib0
/sys/class/infiniband_verbs/uverbs0/device/net:ib1
/sys/class/infiniband_verbs/uverbs1/device/net:eth1

Then the code will do SIOCGIFADDR on the appropriate netdev name, like "ib0", "ib1", or "eth1". SIOCGIFADDR returns a sockaddr with the IPv4 address bound to that interface.

This is simple. It doesn't cover all the possibilities. But it might be enough for what folks need...
I'm prototyping this now (as part of our OMPI/rdma-cm/iwarp work). Steve. From jon at opengridcomputing.com Tue Jan 15 15:50:27 2008 From: jon at opengridcomputing.com (Jon Mason) Date: Tue, 15 Jan 2008 17:50:27 -0600 Subject: [ofa-general] [PATCH] Non-supported functions should return NULL when returning pointers Message-ID: <20080115235027.GB31543@opengridcomputing.com> From: Jon Mason Non-supported functions should return NULL when returning pointers. Some/Most user space programs will not check for a (void *) to -ENOSYS, which can look like a real address until referenced. This patch converts the uses of (void *) -ENOSYS to NULL. Signed-off-by: Jon Mason --- src/verbs.c | 4 ++-- 1 files changed, 2 insertions(+), 2 deletions(-) diff --git a/src/verbs.c b/src/verbs.c index c6c1356..447dde3 100644 --- a/src/verbs.c +++ b/src/verbs.c @@ -263,7 +263,7 @@ int iwch_destroy_cq(struct ibv_cq *ibcq) struct ibv_srq *iwch_create_srq(struct ibv_pd *pd, struct ibv_srq_init_attr *attr) { - return (void *) -ENOSYS; + return NULL; } int iwch_modify_srq(struct ibv_srq *srq, struct ibv_srq_attr *attr, @@ -407,7 +407,7 @@ int iwch_destroy_qp(struct ibv_qp *ibqp) struct ibv_ah *iwch_create_ah(struct ibv_pd *pd, struct ibv_ah_attr *attr) { - return (void *) -ENOSYS; + return NULL; } int iwch_destroy_ah(struct ibv_ah *ah) -- 1.5.3.3 From sashak at voltaire.com Tue Jan 15 16:04:31 2008 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Wed, 16 Jan 2008 00:04:31 +0000 Subject: [ofa-general] Re: [PATCH] Fix spelling of "consolidate" In-Reply-To: <20080115153446.6065e2e0.weiny2@llnl.gov> References: <20080115153446.6065e2e0.weiny2@llnl.gov> Message-ID: <20080116000431.GP16009@sashak.voltaire.com> On 15:34 Tue 15 Jan , Ira Weiny wrote: > From 8415e746a545526fa88f30d4b77534279fe4aec6 Mon Sep 17 00:00:00 2001 > From: Ira K. Weiny > Date: Tue, 15 Jan 2008 15:17:26 -0800 > Subject: [PATCH] Fix spelling of "consolidate" > > > Signed-off-by: Ira K. Weiny Applied. Thanks. 
Sasha From arthur.jones at qlogic.com Tue Jan 15 15:58:08 2008 From: arthur.jones at qlogic.com (Arthur Jones) Date: Tue, 15 Jan 2008 15:58:08 -0800 Subject: [ofa-general] [PATCH] IB/ipath - bugfixes for 2.6.24 Message-ID: <20080115235808.7794.67134.stgit@eng-46.internal.keyresearch.com> hi roland, these fix a long-sought bug with ipath hardware and the MVAPICH over verbs software stack and a use after free bug. i'd like to squeeze this in to 2.6.24 if there's still a chance... these changes are avail for git pull from (note the new branch name for 2.6.24): git://git.qlogic.com/ipath-linux-2.6 for-roland-2.6.24 arthur From arthur.jones at qlogic.com Tue Jan 15 15:58:13 2008 From: arthur.jones at qlogic.com (Arthur Jones) Date: Tue, 15 Jan 2008 15:58:13 -0800 Subject: [ofa-general] [PATCH 1/2] IB/ipath - fix UD send with immediate In-Reply-To: <20080115235808.7794.67134.stgit@eng-46.internal.keyresearch.com> References: <20080115235808.7794.67134.stgit@eng-46.internal.keyresearch.com> Message-ID: <20080115235813.7794.4890.stgit@eng-46.internal.keyresearch.com> From: Ralph Campbell This fixes a small bug which incorrectly calculated the header size for UD send with immediate and therefore dropped packets. Signed-off-by: Ralph Campbell --- drivers/infiniband/hw/ipath/ipath_ud.c | 47 ++++++++++++++++---------------- 1 files changed, 23 insertions(+), 24 deletions(-) diff --git a/drivers/infiniband/hw/ipath/ipath_ud.c b/drivers/infiniband/hw/ipath/ipath_ud.c index 16a2a93..de67eed 100644 --- a/drivers/infiniband/hw/ipath/ipath_ud.c +++ b/drivers/infiniband/hw/ipath/ipath_ud.c @@ -301,8 +301,6 @@ int ipath_make_ud_req(struct ipath_qp *qp) /* header size in 32-bit words LRH+BTH+DETH = (8+12+8)/4. 
*/ qp->s_hdrwords = 7; - if (wqe->wr.opcode == IB_WR_SEND_WITH_IMM) - qp->s_hdrwords++; qp->s_cur_size = wqe->length; qp->s_cur_sge = &qp->s_sge; qp->s_wqe = wqe; @@ -327,6 +325,7 @@ int ipath_make_ud_req(struct ipath_qp *qp) ohdr = &qp->s_hdr.u.oth; } if (wqe->wr.opcode == IB_WR_SEND_WITH_IMM) { + qp->s_hdrwords++; ohdr->u.ud.imm_data = wqe->wr.imm_data; bth0 = IB_OPCODE_UD_SEND_ONLY_WITH_IMMEDIATE << 24; } else @@ -455,6 +454,28 @@ void ipath_ud_rcv(struct ipath_ibdev *dev, struct ipath_ib_header *hdr, } } + /* + * The opcode is in the low byte when its in network order + * (top byte when in host order). + */ + opcode = be32_to_cpu(ohdr->bth[0]) >> 24; + if (qp->ibqp.qp_num > 1 && + opcode == IB_OPCODE_UD_SEND_ONLY_WITH_IMMEDIATE) { + if (header_in_data) { + wc.imm_data = *(__be32 *) data; + data += sizeof(__be32); + } else + wc.imm_data = ohdr->u.ud.imm_data; + wc.wc_flags = IB_WC_WITH_IMM; + hdrsize += sizeof(u32); + } else if (opcode == IB_OPCODE_UD_SEND_ONLY) { + wc.imm_data = 0; + wc.wc_flags = 0; + } else { + dev->n_pkt_drops++; + goto bail; + } + /* Get the number of bytes the message was padded by. */ pad = (be32_to_cpu(ohdr->bth[0]) >> 20) & 3; if (unlikely(tlen < (hdrsize + pad + 4))) { @@ -482,28 +503,6 @@ void ipath_ud_rcv(struct ipath_ibdev *dev, struct ipath_ib_header *hdr, wc.byte_len = tlen + sizeof(struct ib_grh); /* - * The opcode is in the low byte when its in network order - * (top byte when in host order). - */ - opcode = be32_to_cpu(ohdr->bth[0]) >> 24; - if (qp->ibqp.qp_num > 1 && - opcode == IB_OPCODE_UD_SEND_ONLY_WITH_IMMEDIATE) { - if (header_in_data) { - wc.imm_data = *(__be32 *) data; - data += sizeof(__be32); - } else - wc.imm_data = ohdr->u.ud.imm_data; - wc.wc_flags = IB_WC_WITH_IMM; - hdrsize += sizeof(u32); - } else if (opcode == IB_OPCODE_UD_SEND_ONLY) { - wc.imm_data = 0; - wc.wc_flags = 0; - } else { - dev->n_pkt_drops++; - goto bail; - } - - /* * Get the next work request entry to find where to put the data. 
*/ if (qp->r_reuse_sge) From arthur.jones at qlogic.com Tue Jan 15 15:58:18 2008 From: arthur.jones at qlogic.com (Arthur Jones) Date: Tue, 15 Jan 2008 15:58:18 -0800 Subject: [ofa-general] [PATCH 2/2] IB/ipath - fix QP use after free bug In-Reply-To: <20080115235808.7794.67134.stgit@eng-46.internal.keyresearch.com> References: <20080115235808.7794.67134.stgit@eng-46.internal.keyresearch.com> Message-ID: <20080115235818.7794.96013.stgit@eng-46.internal.keyresearch.com> From: Ralph Campbell When calling ipath_destory_qp() while send WQEs are queued, it is possible for the ipath driver to schedule the send tasklet after tasklet_kill() which leads to the QP structure being used after it is freed. Signed-off-by: Ralph Campbell --- drivers/infiniband/hw/ipath/ipath_qp.c | 3 ++- drivers/infiniband/hw/ipath/ipath_rc.c | 34 +++++++++++++++-------------- drivers/infiniband/hw/ipath/ipath_verbs.c | 6 +++-- drivers/infiniband/hw/ipath/ipath_verbs.h | 7 ++++++ 4 files changed, 30 insertions(+), 20 deletions(-) diff --git a/drivers/infiniband/hw/ipath/ipath_qp.c b/drivers/infiniband/hw/ipath/ipath_qp.c index b997ff8..fbeec63 100644 --- a/drivers/infiniband/hw/ipath/ipath_qp.c +++ b/drivers/infiniband/hw/ipath/ipath_qp.c @@ -942,6 +942,7 @@ int ipath_destroy_qp(struct ib_qp *ibqp) spin_lock(&dev->n_qps_lock); dev->n_qps_allocated--; spin_unlock(&dev->n_qps_lock); + set_bit(IPATH_S_DESTROYING, &qp->s_busy); /* Stop the sending tasklet. 
*/ tasklet_kill(&qp->s_task); @@ -1077,5 +1078,5 @@ void ipath_get_credit(struct ipath_qp *qp, u32 aeth) (qp->s_lsn == (u32) -1 || ipath_cmp24(get_swqe_ptr(qp, qp->s_cur)->ssn, qp->s_lsn + 1) <= 0)) - tasklet_hi_schedule(&qp->s_task); + ipath_schedule_send(qp); } diff --git a/drivers/infiniband/hw/ipath/ipath_rc.c b/drivers/infiniband/hw/ipath/ipath_rc.c index 120a61b..9215dad 100644 --- a/drivers/infiniband/hw/ipath/ipath_rc.c +++ b/drivers/infiniband/hw/ipath/ipath_rc.c @@ -652,8 +652,8 @@ queue_ack: qp->s_ack_psn = qp->r_ack_psn; spin_unlock_irqrestore(&qp->s_lock, flags); - /* Call ipath_do_rc_send() in another thread. */ - tasklet_hi_schedule(&qp->s_task); + /* Schedule the send tasklet. */ + ipath_schedule_send(qp); done: return; @@ -713,7 +713,7 @@ static void reset_psn(struct ipath_qp *qp, u32 psn) /* * Set the state to restart in the middle of a request. * Don't change the s_sge, s_cur_sge, or s_cur_size. - * See ipath_do_rc_send(). + * See ipath_make_rc_req(). */ switch (opcode) { case IB_WR_SEND: @@ -790,7 +790,7 @@ void ipath_restart_rc(struct ipath_qp *qp, u32 psn, struct ib_wc *wc) dev->n_rc_resends += (qp->s_psn - psn) & IPATH_PSN_MASK; reset_psn(qp, psn); - tasklet_hi_schedule(&qp->s_task); + ipath_schedule_send(qp); bail: return; @@ -798,11 +798,13 @@ bail: static inline void update_last_psn(struct ipath_qp *qp, u32 psn) { - if (qp->s_wait_credit) { - qp->s_wait_credit = 0; - tasklet_hi_schedule(&qp->s_task); + if (qp->s_last_psn != psn) { + qp->s_last_psn = psn; + if (qp->s_wait_credit) { + qp->s_wait_credit = 0; + ipath_schedule_send(qp); + } } - qp->s_last_psn = psn; } /** @@ -904,10 +906,10 @@ static int do_rc_ack(struct ipath_qp *qp, u32 aeth, u32 psn, int opcode, if ((qp->s_flags & IPATH_S_FENCE_PENDING) && !qp->s_num_rd_atomic) { qp->s_flags &= ~IPATH_S_FENCE_PENDING; - tasklet_hi_schedule(&qp->s_task); + ipath_schedule_send(qp); } else if (qp->s_flags & IPATH_S_RDMAR_PENDING) { qp->s_flags &= ~IPATH_S_RDMAR_PENDING; - 
tasklet_hi_schedule(&qp->s_task); + ipath_schedule_send(qp); } } /* Post a send completion queue entry if requested. */ @@ -970,7 +972,7 @@ static int do_rc_ack(struct ipath_qp *qp, u32 aeth, u32 psn, int opcode, */ if (ipath_cmp24(qp->s_psn, psn) <= 0) { reset_psn(qp, psn + 1); - tasklet_hi_schedule(&qp->s_task); + ipath_schedule_send(qp); } } else if (ipath_cmp24(qp->s_psn, psn) <= 0) { qp->s_state = OP(SEND_LAST); @@ -1484,7 +1486,7 @@ static inline int ipath_rc_rcv_error(struct ipath_ibdev *dev, break; } qp->r_nak_state = 0; - tasklet_hi_schedule(&qp->s_task); + ipath_schedule_send(qp); unlock_done: spin_unlock_irqrestore(&qp->s_lock, flags); @@ -1847,8 +1849,8 @@ void ipath_rc_rcv(struct ipath_ibdev *dev, struct ipath_ib_header *hdr, barrier(); qp->r_head_ack_queue = next; - /* Call ipath_do_rc_send() in another thread. */ - tasklet_hi_schedule(&qp->s_task); + /* Schedule the send tasklet. */ + ipath_schedule_send(qp); goto done; } @@ -1907,8 +1909,8 @@ void ipath_rc_rcv(struct ipath_ibdev *dev, struct ipath_ib_header *hdr, barrier(); qp->r_head_ack_queue = next; - /* Call ipath_do_rc_send() in another thread. */ - tasklet_hi_schedule(&qp->s_task); + /* Schedule the send tasklet. */ + ipath_schedule_send(qp); goto done; } diff --git a/drivers/infiniband/hw/ipath/ipath_verbs.c b/drivers/infiniband/hw/ipath/ipath_verbs.c index c4c9984..dc779e0 100644 --- a/drivers/infiniband/hw/ipath/ipath_verbs.c +++ b/drivers/infiniband/hw/ipath/ipath_verbs.c @@ -616,7 +616,7 @@ static void ipath_ib_timer(struct ipath_ibdev *dev) if (--qp->s_rnr_timeout == 0) { do { list_del_init(&qp->timerwait); - tasklet_hi_schedule(&qp->s_task); + ipath_schedule_send(qp); if (list_empty(last)) break; qp = list_entry(last->next, struct ipath_qp, @@ -1060,7 +1060,7 @@ bail: * This is called from ipath_intr() at interrupt level when a PIO buffer is * available after ipath_verbs_send() returned an error that no buffers were * available. 
Return 1 if we consumed all the PIO buffers and we still have - * QPs waiting for buffers (for now, just do a tasklet_hi_schedule and + * QPs waiting for buffers (for now, just restart the send tasklet and * return zero). */ int ipath_ib_piobufavail(struct ipath_ibdev *dev) @@ -1077,7 +1077,7 @@ int ipath_ib_piobufavail(struct ipath_ibdev *dev) piowait); list_del_init(&qp->piowait); clear_bit(IPATH_S_BUSY, &qp->s_busy); - tasklet_hi_schedule(&qp->s_task); + ipath_schedule_send(qp); } spin_unlock_irqrestore(&dev->pending_lock, flags); diff --git a/drivers/infiniband/hw/ipath/ipath_verbs.h b/drivers/infiniband/hw/ipath/ipath_verbs.h index 6ccb54f..ebaf6f4 100644 --- a/drivers/infiniband/hw/ipath/ipath_verbs.h +++ b/drivers/infiniband/hw/ipath/ipath_verbs.h @@ -422,6 +422,7 @@ struct ipath_qp { /* Bit definition for s_busy. */ #define IPATH_S_BUSY 0 +#define IPATH_S_DESTROYING 1 /* * Bit definitions for s_flags. @@ -635,6 +636,12 @@ static inline struct ipath_ibdev *to_idev(struct ib_device *ibdev) return container_of(ibdev, struct ipath_ibdev, ibdev); } +static inline void ipath_schedule_send(struct ipath_qp *qp) +{ + if (!test_bit(IPATH_S_DESTROYING, &qp->s_busy)) + tasklet_hi_schedule(&qp->s_task); +} + int ipath_process_mad(struct ib_device *ibdev, int mad_flags, u8 port_num, From sashak at voltaire.com Tue Jan 15 16:08:55 2008 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Wed, 16 Jan 2008 00:08:55 +0000 Subject: [ofa-general] [PATCH] ibutils/ibis: use pkey index value In-Reply-To: <20080115132012.46a7d512.weiny2@llnl.gov> References: <478217A6.80307@dev.mellanox.co.il> <47821FF3.7020705@dev.mellanox.co.il> <1199718187.20870.102.camel@hrosenstock-ws.xsigo.com> <20080113193435.GG10650@sashak.voltaire.com> <20080113193559.GH10650@sashak.voltaire.com> <20080115182142.GC16009@sashak.voltaire.com> <20080115113533.10c3c7c9.weiny2@llnl.gov> <20080115195002.GL16009@sashak.voltaire.com> <20080115211104.GN16009@sashak.voltaire.com> 
<20080115132012.46a7d512.weiny2@llnl.gov> Message-ID: <20080116000855.GQ16009@sashak.voltaire.com> Use pkey index value rather than pkey - it is used now by osm_vendor_ibumad. Signed-off-by: Sasha Khapyorsky --- ibis/src/ibbbm.c | 2 +- ibis/src/ibcr.c | 2 +- ibis/src/ibpm.c | 2 +- ibis/src/ibvs.c | 2 +- 4 files changed, 4 insertions(+), 4 deletions(-) diff --git a/ibis/src/ibbbm.c b/ibis/src/ibbbm.c index 1f4ce20..e60c861 100644 --- a/ibis/src/ibbbm.c +++ b/ibis/src/ibbbm.c @@ -226,7 +226,7 @@ __ibbbm_vpd( mad_addr.static_rate = 0; mad_addr.addr_type.gsi.remote_qp=cl_hton32(1); mad_addr.addr_type.gsi.remote_qkey = cl_hton32(0x80010000); - mad_addr.addr_type.gsi.pkey = cl_hton16(0xffff); + mad_addr.addr_type.gsi.pkey = 0; mad_addr.addr_type.gsi.service_level = 0; mad_addr.addr_type.gsi.global_route = FALSE; diff --git a/ibis/src/ibcr.c b/ibis/src/ibcr.c index 873ae38..18405ad 100644 --- a/ibis/src/ibcr.c +++ b/ibis/src/ibcr.c @@ -185,7 +185,7 @@ __ibcr_prep_cr_mad( mad_addr.static_rate = 0; mad_addr.addr_type.gsi.remote_qp=cl_hton32(1); mad_addr.addr_type.gsi.remote_qkey = cl_hton32(0x80010000); - mad_addr.addr_type.gsi.pkey = cl_hton16(0xffff); + mad_addr.addr_type.gsi.pkey = 0; mad_addr.addr_type.gsi.service_level = 0; mad_addr.addr_type.gsi.global_route = FALSE; diff --git a/ibis/src/ibpm.c b/ibis/src/ibpm.c index ae57143..fce144e 100644 --- a/ibis/src/ibpm.c +++ b/ibis/src/ibpm.c @@ -176,7 +176,7 @@ __ibpm_prep_port_counter_mad( mad_addr.static_rate = 0; mad_addr.addr_type.gsi.remote_qp=cl_hton32(1); mad_addr.addr_type.gsi.remote_qkey = cl_hton32(0x80010000); - mad_addr.addr_type.gsi.pkey = cl_hton16(0xffff); + mad_addr.addr_type.gsi.pkey = 0; mad_addr.addr_type.gsi.service_level = 0; mad_addr.addr_type.gsi.global_route = FALSE; diff --git a/ibis/src/ibvs.c b/ibis/src/ibvs.c index 55701b8..e581d0f 100644 --- a/ibis/src/ibvs.c +++ b/ibis/src/ibvs.c @@ -242,7 +242,7 @@ __ibvs_init_mad_addr( p_mad_addr->static_rate = 0; 
p_mad_addr->addr_type.gsi.remote_qp=cl_hton32(1); p_mad_addr->addr_type.gsi.remote_qkey = cl_hton32(0x80010000); - p_mad_addr->addr_type.gsi.pkey = cl_hton16(0); + p_mad_addr->addr_type.gsi.pkey = 0; p_mad_addr->addr_type.gsi.service_level = 0; p_mad_addr->addr_type.gsi.global_route = FALSE; } -- 1.5.4.rc2.38.gd6da3 From arthur.jones at qlogic.com Tue Jan 15 16:18:08 2008 From: arthur.jones at qlogic.com (Arthur Jones) Date: Tue, 15 Jan 2008 16:18:08 -0800 Subject: [ofa-general] [PATCH] IB/ipath - second prep series for iba7220 Message-ID: <20080116001808.12687.13839.stgit@eng-46.internal.keyresearch.com> hi roland, these are changes which help prepare for the upcoming iba7220 driver series. these are changes to the existing code only, and do not support the iba7220 directly, but should help to make reviewing the iba7220 code easier when it finally arrives. it looks like we will prob not be able to get the iba7220 driver into 2.6.25, as i'm still staring at quite a jumble of patches in front of it and time is getting very short. but i don't think anything is made worse by these cleanups... these changes are avail for git pull from: git://git.qlogic.com/ipath-linux-2.6 for-roland arthur From jgunthorpe at obsidianresearch.com Tue Jan 15 16:18:08 2008 From: jgunthorpe at obsidianresearch.com (Jason Gunthorpe) Date: Tue, 15 Jan 2008 17:18:08 -0700 Subject: [ofa-general] Re: new API for the rdma-cma In-Reply-To: <478D43DD.1090007@opengridcomputing.com> References: <478D346B.8070506@opengridcomputing.com> <000801c857cc$0d6c5ed0$a937170a@amr.corp.intel.com> <478D43DD.1090007@opengridcomputing.com> Message-ID: <20080116001808.GJ28360@obsidianresearch.com> On Tue, Jan 15, 2008 at 05:38:05PM -0600, Steve Wise wrote: > Here's a simple ipv4-only way to do this: Maybe a better approach is to return the underlying interface index (ie if_nametoindex) and let the caller worry about properly querying the kernel. 
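
A rough sketch of what a caller might do with the interface-index idea, under stated assumptions: the helper name netdev_addr_counts() is hypothetical, and getifaddrs() from libc is swapped in as a compact way to enumerate addresses. It is not a netlink implementation and not something proposed in this thread.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <net/if.h>
#include <ifaddrs.h>

/*
 * Hypothetical helper: count the IPv4 and IPv6 addresses configured on
 * one netdev. Returns the interface index (> 0) on success, or 0 if the
 * name is unknown or the address list cannot be read.
 */
unsigned int netdev_addr_counts(const char *ifname, int *n_v4, int *n_v6)
{
	struct ifaddrs *list, *ifa;
	unsigned int idx;

	if (!ifname || !n_v4 || !n_v6)
		return 0;
	*n_v4 = *n_v6 = 0;

	/* Resolve the name to the kernel's interface index. */
	idx = if_nametoindex(ifname);
	if (idx == 0)
		return 0;

	if (getifaddrs(&list) < 0)
		return 0;

	/* Walk every configured address and keep those on our netdev. */
	for (ifa = list; ifa; ifa = ifa->ifa_next) {
		if (!ifa->ifa_addr || strcmp(ifa->ifa_name, ifname) != 0)
			continue;
		if (ifa->ifa_addr->sa_family == AF_INET)
			(*n_v4)++;
		else if (ifa->ifa_addr->sa_family == AF_INET6)
			(*n_v6)++;
	}
	freeifaddrs(list);
	return idx;
}
```

Returning the index rather than an address leaves the choice of address family, and of which address to advertise, to the caller.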
The right way these days is to use netlink and get the full IPv4/v6 list for the interface and that is a fair chunk of code if you can't use the netlink library.. The ioctl really only works in a very limited case and is certainly no good if you are doing ipv6. This would also make it easier for the caller to do more advanced things like base connection decisions on the routing tables. Jason From arthur.jones at qlogic.com Tue Jan 15 16:18:14 2008 From: arthur.jones at qlogic.com (Arthur Jones) Date: Tue, 15 Jan 2008 16:18:14 -0800 Subject: [ofa-general] [PATCH 1/6] IB/ipath - remove unused MDIO interface code In-Reply-To: <20080116001808.12687.13839.stgit@eng-46.internal.keyresearch.com> References: <20080116001808.12687.13839.stgit@eng-46.internal.keyresearch.com> Message-ID: <20080116001814.12687.38673.stgit@eng-46.internal.keyresearch.com> From: Dave Olson This code has been unused for some time, but still had leftovers from when it was used. Signed-off-by: Dave Olson ipath_kregs->kr_mdio); - if (!(val & IPATH_MDIO_CMDVALID)) { - ret = 0; - break; - } - cond_resched(); - if (time_after(jiffies, timeout)) { - ipath_dbg("CMDVALID stuck in mdio reg? (%llx)\n", - (unsigned long long) val); - ret = -ENODEV; - break; - } - } while (1); - - return ret; -} - /* * Flush all sends that might be in the ready to send state, as well as any diff --git a/drivers/infiniband/hw/ipath/ipath_iba6110.c b/drivers/infiniband/hw/ipath/ipath_iba6110.c index 6976d96..ac436c6 100644 --- a/drivers/infiniband/hw/ipath/ipath_iba6110.c +++ b/drivers/infiniband/hw/ipath/ipath_iba6110.c @@ -1274,8 +1274,7 @@ static void ipath_ht_init_hwerrors(struct ipath_devdata *dd) val &= ~INFINIPATH_HWE_HTCMISCERR4; /* - * PLL ignored because MDIO interface has a logic problem - * for reads, on Comstock and Ponderosa. 
BRINGUP + * PLL ignored because unused MDIO interface has a logic problem */ if (dd->ipath_boardrev == 4 || dd->ipath_boardrev == 9) val &= ~INFINIPATH_HWE_SERDESPLLFAILED; @@ -1353,16 +1352,6 @@ static int ipath_ht_bringup_serdes(struct ipath_devdata *dd) } val = ipath_read_kreg64(dd, dd->ipath_kregs->kr_xgxsconfig); - if (((val >> INFINIPATH_XGXS_MDIOADDR_SHIFT) & - INFINIPATH_XGXS_MDIOADDR_MASK) != 3) { - val &= ~(INFINIPATH_XGXS_MDIOADDR_MASK << - INFINIPATH_XGXS_MDIOADDR_SHIFT); - /* - * we use address 3 - */ - val |= 3ULL << INFINIPATH_XGXS_MDIOADDR_SHIFT; - change = 1; - } if (val & INFINIPATH_XGXS_RESET) { /* normally true after boot */ val &= ~INFINIPATH_XGXS_RESET; @@ -1398,21 +1387,6 @@ static int ipath_ht_bringup_serdes(struct ipath_devdata *dd) (unsigned long long) ipath_read_kreg64(dd, dd->ipath_kregs->kr_xgxsconfig)); - if (!ipath_waitfor_mdio_cmdready(dd)) { - ipath_write_kreg(dd, dd->ipath_kregs->kr_mdio, - ipath_mdio_req(IPATH_MDIO_CMD_READ, 31, - IPATH_MDIO_CTRL_XGXS_REG_8, - 0)); - if (ipath_waitfor_complete(dd, dd->ipath_kregs->kr_mdio, - IPATH_MDIO_DATAVALID, &val)) - ipath_dbg("Never got MDIO data for XGXS status " - "read\n"); - else - ipath_cdbg(VERBOSE, "MDIO Read reg8, " - "'bank' 31 %x\n", (u32) val); - } else - ipath_dbg("Never got MDIO cmdready for XGXS status read\n"); - return ret; /* for now, say we always succeeded */ } diff --git a/drivers/infiniband/hw/ipath/ipath_iba6120.c b/drivers/infiniband/hw/ipath/ipath_iba6120.c index 066a8ea..57915fd 100644 --- a/drivers/infiniband/hw/ipath/ipath_iba6120.c +++ b/drivers/infiniband/hw/ipath/ipath_iba6120.c @@ -725,17 +725,8 @@ static int ipath_pe_bringup_serdes(struct ipath_devdata *dd) val = ipath_read_kreg64(dd, dd->ipath_kregs->kr_xgxsconfig); prev_val = val; - if (((val >> INFINIPATH_XGXS_MDIOADDR_SHIFT) & - INFINIPATH_XGXS_MDIOADDR_MASK) != 3) { - val &= - ~(INFINIPATH_XGXS_MDIOADDR_MASK << - INFINIPATH_XGXS_MDIOADDR_SHIFT); - /* MDIO address 3 */ - val |= 3ULL << 
INFINIPATH_XGXS_MDIOADDR_SHIFT; - } - if (val & INFINIPATH_XGXS_RESET) { + if (val & INFINIPATH_XGXS_RESET) val &= ~INFINIPATH_XGXS_RESET; - } if (((val >> INFINIPATH_XGXS_RX_POL_SHIFT) & INFINIPATH_XGXS_RX_POL_MASK) != dd->ipath_rx_pol_inv ) { /* need to compensate for Tx inversion in partner */ @@ -765,21 +756,6 @@ static int ipath_pe_bringup_serdes(struct ipath_devdata *dd) (unsigned long long) ipath_read_kreg64(dd, dd->ipath_kregs->kr_xgxsconfig)); - if (!ipath_waitfor_mdio_cmdready(dd)) { - ipath_write_kreg( - dd, dd->ipath_kregs->kr_mdio, - ipath_mdio_req(IPATH_MDIO_CMD_READ, 31, - IPATH_MDIO_CTRL_XGXS_REG_8, 0)); - if (ipath_waitfor_complete(dd, dd->ipath_kregs->kr_mdio, - IPATH_MDIO_DATAVALID, &val)) - ipath_dbg("Never got MDIO data for XGXS " - "status read\n"); - else - ipath_cdbg(VERBOSE, "MDIO Read reg8, " - "'bank' 31 %x\n", (u32) val); - } else - ipath_dbg("Never got MDIO cmdready for XGXS status read\n"); - return ret; } diff --git a/drivers/infiniband/hw/ipath/ipath_kernel.h b/drivers/infiniband/hw/ipath/ipath_kernel.h index b84039c..07971a2 100644 --- a/drivers/infiniband/hw/ipath/ipath_kernel.h +++ b/drivers/infiniband/hw/ipath/ipath_kernel.h @@ -777,8 +777,6 @@ int ipath_set_rx_pol_inv(struct ipath_devdata *dd, u8 new_pol_inv); /* free up any allocated data at closes */ void ipath_free_data(struct ipath_portdata *dd); -int ipath_waitfor_mdio_cmdready(struct ipath_devdata *); -int ipath_waitfor_complete(struct ipath_devdata *, ipath_kreg, u64, u64 *); u32 __iomem *ipath_getpiobuf(struct ipath_devdata *, u32 *); void ipath_init_iba6120_funcs(struct ipath_devdata *); void ipath_init_iba6110_funcs(struct ipath_devdata *); @@ -802,33 +800,6 @@ void ipath_set_led_override(struct ipath_devdata *dd, unsigned int val); */ #define IPATH_DFLT_RCVHDRSIZE 9 -#define IPATH_MDIO_CMD_WRITE 1 -#define IPATH_MDIO_CMD_READ 2 -#define IPATH_MDIO_CLD_DIV 25 /* to get 2.5 Mhz mdio clock */ -#define IPATH_MDIO_CMDVALID 0x40000000 /* bit 30 */ -#define 
IPATH_MDIO_DATAVALID 0x80000000 /* bit 31 */ -#define IPATH_MDIO_CTRL_STD 0x0 - -static inline u64 ipath_mdio_req(int cmd, int dev, int reg, int data) -{ - return (((u64) IPATH_MDIO_CLD_DIV) << 32) | - (cmd << 26) | - (dev << 21) | - (reg << 16) | - (data & 0xFFFF); -} - - /* signal and fifo status, in bank 31 */ -#define IPATH_MDIO_CTRL_XGXS_REG_8 0x8 - /* controls loopback, redundancy */ -#define IPATH_MDIO_CTRL_8355_REG_1 0x10 - /* premph, encdec, etc. */ -#define IPATH_MDIO_CTRL_8355_REG_2 0x11 - /* Kchars, etc. */ -#define IPATH_MDIO_CTRL_8355_REG_6 0x15 -#define IPATH_MDIO_CTRL_8355_REG_9 0x18 -#define IPATH_MDIO_CTRL_8355_REG_10 0x1D - int ipath_get_user_pages(unsigned long, size_t, struct page **); void ipath_release_user_pages(struct page **, size_t); void ipath_release_user_pages_on_close(struct page **, size_t); diff --git a/drivers/infiniband/hw/ipath/ipath_registers.h b/drivers/infiniband/hw/ipath/ipath_registers.h index 156ef14..6d2a17f 100644 --- a/drivers/infiniband/hw/ipath/ipath_registers.h +++ b/drivers/infiniband/hw/ipath/ipath_registers.h @@ -271,20 +271,6 @@ #define INFINIPATH_EXTC_LEDGBLOK_ON 0x00000002ULL #define INFINIPATH_EXTC_LEDGBLERR_OFF 0x00000001ULL -/* kr_mdio bits */ -#define INFINIPATH_MDIO_CLKDIV_MASK 0x7FULL -#define INFINIPATH_MDIO_CLKDIV_SHIFT 32 -#define INFINIPATH_MDIO_COMMAND_MASK 0x7ULL -#define INFINIPATH_MDIO_COMMAND_SHIFT 26 -#define INFINIPATH_MDIO_DEVADDR_MASK 0x1FULL -#define INFINIPATH_MDIO_DEVADDR_SHIFT 21 -#define INFINIPATH_MDIO_REGADDR_MASK 0x1FULL -#define INFINIPATH_MDIO_REGADDR_SHIFT 16 -#define INFINIPATH_MDIO_DATA_MASK 0xFFFFULL -#define INFINIPATH_MDIO_DATA_SHIFT 0 -#define INFINIPATH_MDIO_CMDVALID 0x0000000040000000ULL -#define INFINIPATH_MDIO_RDDATAVALID 0x0000000080000000ULL - /* kr_partitionkey bits */ #define INFINIPATH_PKEY_SIZE 16 #define INFINIPATH_PKEY_MASK 0xFFFF @@ -302,8 +288,6 @@ /* kr_xgxsconfig bits */ #define INFINIPATH_XGXS_RESET 0x7ULL -#define INFINIPATH_XGXS_MDIOADDR_MASK 0xfULL -#define 
INFINIPATH_XGXS_MDIOADDR_SHIFT 4 #define INFINIPATH_XGXS_RX_POL_SHIFT 19 #define INFINIPATH_XGXS_RX_POL_MASK 0xfULL From arthur.jones at qlogic.com Tue Jan 15 16:18:19 2008 From: arthur.jones at qlogic.com (Arthur Jones) Date: Tue, 15 Jan 2008 16:18:19 -0800 Subject: [ofa-general] [PATCH 2/6] IB/ipath - add new chip-specific functions to older chips, consistent init In-Reply-To: <20080116001808.12687.13839.stgit@eng-46.internal.keyresearch.com> References: <20080116001808.12687.13839.stgit@eng-46.internal.keyresearch.com> Message-ID: <20080116001819.12687.31914.stgit@eng-46.internal.keyresearch.com> From: Dave Olson This adds the new (sometimes empty) chip-specific functions to the older chips, and makes the initialization and related functions consistent across all 3 chips. Signed-off-by: Dave Olson --- drivers/infiniband/hw/ipath/ipath_common.h | 10 ++ drivers/infiniband/hw/ipath/ipath_iba6110.c | 172 ++++++++++++++++++++++++--- drivers/infiniband/hw/ipath/ipath_iba6120.c | 163 ++++++++++++++++++++++++-- drivers/infiniband/hw/ipath/ipath_kernel.h | 88 ++++++++++++++ 4 files changed, 407 insertions(+), 26 deletions(-) diff --git a/drivers/infiniband/hw/ipath/ipath_common.h b/drivers/infiniband/hw/ipath/ipath_common.h index aa780e7..0fa43ba 100644 --- a/drivers/infiniband/hw/ipath/ipath_common.h +++ b/drivers/infiniband/hw/ipath/ipath_common.h @@ -82,6 +82,16 @@ #define IPATH_IB_LINK_EXTERNAL 7 /* normal, disable local loopback */ /* + * These 3 values (SDR and DDR may be ORed for auto-speed + * negotiation) are used for the 3rd argument to ipath_f_set_ib_cfg + * with cmd IPATH_IB_CFG_SPD_ENB, by direct calls or via sysfs. They + * are also the possible values for ipath_link_speed_enabled and active. + * The values were chosen to match values used within the IB spec. + */ +#define IPATH_IB_SDR 1 +#define IPATH_IB_DDR 2 + +/* * stats maintained by the driver. For now, at least, this is global * to all minor devices.
*/ diff --git a/drivers/infiniband/hw/ipath/ipath_iba6110.c b/drivers/infiniband/hw/ipath/ipath_iba6110.c index ac436c6..9e2ced3 100644 --- a/drivers/infiniband/hw/ipath/ipath_iba6110.c +++ b/drivers/infiniband/hw/ipath/ipath_iba6110.c @@ -329,6 +329,9 @@ static const struct ipath_cregs ipath_ht_cregs = { #define INFINIPATH_HWE_HTAPLL_RFSLIP 0x1000000000000000ULL #define INFINIPATH_HWE_SERDESPLLFAILED 0x2000000000000000ULL +#define IBA6110_IBCS_LINKTRAININGSTATE_MASK 0xf +#define IBA6110_IBCS_LINKSTATE_SHIFT 4 + /* kr_extstatus bits */ #define INFINIPATH_EXTS_FREQSEL 0x2 #define INFINIPATH_EXTS_SERDESSEL 0x4 @@ -705,7 +708,6 @@ static int ipath_ht_boardname(struct ipath_devdata *dd, char *name, "with ID %u\n", boardrev); snprintf(name, namelen, "Unknown_InfiniPath_QHT7xxx_%u", boardrev); - ret = 1; break; } if (n) @@ -1137,11 +1139,49 @@ static void ipath_setup_ht_setextled(struct ipath_devdata *dd, static void ipath_init_ht_variables(struct ipath_devdata *dd) { + /* + * setup the register offsets, since they are different for each + * chip + */ + dd->ipath_kregs = &ipath_ht_kregs; + dd->ipath_cregs = &ipath_ht_cregs; + dd->ipath_gpio_sda_num = _IPATH_GPIO_SDA_NUM; dd->ipath_gpio_scl_num = _IPATH_GPIO_SCL_NUM; dd->ipath_gpio_sda = IPATH_GPIO_SDA; dd->ipath_gpio_scl = IPATH_GPIO_SCL; + /* + * Fill in data for field-values that change in newer chips. + * We dynamically specify only the mask for LINKTRAININGSTATE + * and only the shift for LINKSTATE, as they are the only ones + * that change. Also precalculate the 3 link states of interest + * and the combined mask. 
+ */ + dd->ibcs_ls_shift = IBA6110_IBCS_LINKSTATE_SHIFT; + dd->ibcs_lts_mask = IBA6110_IBCS_LINKTRAININGSTATE_MASK; + dd->ibcs_mask = (INFINIPATH_IBCS_LINKSTATE_MASK << + dd->ibcs_ls_shift) | dd->ibcs_lts_mask; + dd->ib_init = (INFINIPATH_IBCS_LT_STATE_LINKUP << + INFINIPATH_IBCS_LINKTRAININGSTATE_SHIFT) | + (INFINIPATH_IBCS_L_STATE_INIT << dd->ibcs_ls_shift); + dd->ib_arm = (INFINIPATH_IBCS_LT_STATE_LINKUP << + INFINIPATH_IBCS_LINKTRAININGSTATE_SHIFT) | + (INFINIPATH_IBCS_L_STATE_ARM << dd->ibcs_ls_shift); + dd->ib_active = (INFINIPATH_IBCS_LT_STATE_LINKUP << + INFINIPATH_IBCS_LINKTRAININGSTATE_SHIFT) | + (INFINIPATH_IBCS_L_STATE_ACTIVE << dd->ibcs_ls_shift); + + /* + * Fill in data for ibcc field-values that change in newer chips. + * We dynamically specify only the mask for LINKINITCMD + * and only the shift for LINKCMD and MAXPKTLEN, as they are + * the only ones that change. + */ + dd->ibcc_lic_mask = INFINIPATH_IBCC_LINKINITCMD_MASK; + dd->ibcc_lc_shift = INFINIPATH_IBCC_LINKCMD_SHIFT; + dd->ibcc_mpl_shift = INFINIPATH_IBCC_MAXPKTLEN_SHIFT; + /* Fill in shifts for RcvCtrl. */ dd->ipath_r_portenable_shift = INFINIPATH_R_PORTENABLE_SHIFT; dd->ipath_r_intravail_shift = INFINIPATH_R_INTRAVAIL_SHIFT; @@ -1204,6 +1244,8 @@ static void ipath_init_ht_variables(struct ipath_devdata *dd) dd->ipath_i_rcvavail_mask = INFINIPATH_I_RCVAVAIL_MASK; dd->ipath_i_rcvurg_mask = INFINIPATH_I_RCVURG_MASK; + dd->ipath_i_rcvavail_shift = INFINIPATH_I_RCVAVAIL_SHIFT; + dd->ipath_i_rcvurg_shift = INFINIPATH_I_RCVURG_SHIFT; /* * EEPROM error log 0 is TXE Parity errors. 1 is RXE Parity. 
@@ -1217,9 +1259,17 @@ static void ipath_init_ht_variables(struct ipath_devdata *dd) INFINIPATH_HWE_RXEMEMPARITYERR_MASK << INFINIPATH_HWE_RXEMEMPARITYERR_SHIFT; - dd->ipath_eep_st_masks[2].errs_to_log = - INFINIPATH_E_INVALIDADDR | INFINIPATH_E_RESET; + dd->ipath_eep_st_masks[2].errs_to_log = INFINIPATH_E_RESET; + dd->delay_mult = 2; /* SDR, 4X, can't change */ + + dd->ipath_link_width_supported = IB_WIDTH_1X | IB_WIDTH_4X; + dd->ipath_link_speed_supported = IPATH_IB_SDR; + dd->ipath_link_width_enabled = IB_WIDTH_4X; + dd->ipath_link_speed_enabled = dd->ipath_link_speed_supported; + /* these can't change for this chip, so set once */ + dd->ipath_link_width_active = dd->ipath_link_width_enabled; + dd->ipath_link_speed_active = dd->ipath_link_speed_enabled; } /** @@ -1281,6 +1331,9 @@ static void ipath_ht_init_hwerrors(struct ipath_devdata *dd) dd->ipath_hwerrmask = val; } + + + /** * ipath_ht_bringup_serdes - bring up the serdes * @dd: the infinipath device @@ -1439,6 +1492,7 @@ static void ipath_ht_put_tid(struct ipath_devdata *dd, pa |= lenvalid | INFINIPATH_RT_VALID; } } + writeq(pa, tidptr); } @@ -1644,6 +1698,13 @@ static void ipath_ht_free_irq(struct ipath_devdata *dd) dd->ipath_intconfig = 0; } +static struct ipath_message_header * +ipath_ht_get_msgheader(struct ipath_devdata *dd, __le32 *rhf_addr) +{ + return (struct ipath_message_header *) + &rhf_addr[sizeof(u64) / sizeof(u32)]; +} + static void ipath_ht_config_ports(struct ipath_devdata *dd, ushort cfgports) { dd->ipath_portcnt = @@ -1757,6 +1818,90 @@ static void ipath_ht_read_counters(struct ipath_devdata *dd, cntrs->RxDlidFltrCnt = 0; } + +/* no interrupt fallback for these chips */ +static int ipath_ht_nointr_fallback(struct ipath_devdata *dd) +{ + return 0; +} + + +/* + * reset the XGXS (between serdes and IBC). Slightly less intrusive + * than resetting the IBC or external link state, and useful in some + * cases to cause some retraining. To do this right, we reset IBC + * as well. 
+ */ +static void ipath_ht_xgxs_reset(struct ipath_devdata *dd) +{ + u64 val, prev_val; + + prev_val = ipath_read_kreg64(dd, dd->ipath_kregs->kr_xgxsconfig); + val = prev_val | INFINIPATH_XGXS_RESET; + prev_val &= ~INFINIPATH_XGXS_RESET; /* be sure */ + ipath_write_kreg(dd, dd->ipath_kregs->kr_control, + dd->ipath_control & ~INFINIPATH_C_LINKENABLE); + ipath_write_kreg(dd, dd->ipath_kregs->kr_xgxsconfig, val); + ipath_read_kreg32(dd, dd->ipath_kregs->kr_scratch); + ipath_write_kreg(dd, dd->ipath_kregs->kr_xgxsconfig, prev_val); + ipath_write_kreg(dd, dd->ipath_kregs->kr_control, + dd->ipath_control); +} + + +static int ipath_ht_get_ib_cfg(struct ipath_devdata *dd, int which) +{ + int ret; + + switch (which) { + case IPATH_IB_CFG_LWID: + ret = dd->ipath_link_width_active; + break; + case IPATH_IB_CFG_SPD: + ret = dd->ipath_link_speed_active; + break; + case IPATH_IB_CFG_LWID_ENB: + ret = dd->ipath_link_width_enabled; + break; + case IPATH_IB_CFG_SPD_ENB: + ret = dd->ipath_link_speed_enabled; + break; + default: + ret = -ENOTSUPP; + break; + } + return ret; +} + + +/* we assume range checking is already done, if needed */ +static int ipath_ht_set_ib_cfg(struct ipath_devdata *dd, int which, u32 val) +{ + int ret = 0; + + if (which == IPATH_IB_CFG_LWID_ENB) + dd->ipath_link_width_enabled = val; + else if (which == IPATH_IB_CFG_SPD_ENB) + dd->ipath_link_speed_enabled = val; + else + ret = -ENOTSUPP; + return ret; +} + + +static void ipath_ht_config_jint(struct ipath_devdata *dd, u16 a, u16 b) +{ +} + + +static int ipath_ht_ib_updown(struct ipath_devdata *dd, int ibup, u64 ibcs) +{ + ipath_setup_ht_setextled(dd, ipath_ib_linkstate(dd, ibcs), + ipath_ib_linktrstate(dd, ibcs)); + return 0; +} + + /** * ipath_init_iba6110_funcs - set up the chip-specific function pointers * @dd: the infinipath device @@ -1781,24 +1926,19 @@ void ipath_init_iba6110_funcs(struct ipath_devdata *dd) dd->ipath_f_setextled = ipath_setup_ht_setextled; dd->ipath_f_get_base_info = 
ipath_ht_get_base_info; dd->ipath_f_free_irq = ipath_ht_free_irq; + dd->ipath_f_tidtemplate = ipath_ht_tidtemplate; + dd->ipath_f_intr_fallback = ipath_ht_nointr_fallback; + dd->ipath_f_get_msgheader = ipath_ht_get_msgheader; dd->ipath_f_config_ports = ipath_ht_config_ports; dd->ipath_f_read_counters = ipath_ht_read_counters; + dd->ipath_f_xgxs_reset = ipath_ht_xgxs_reset; + dd->ipath_f_get_ib_cfg = ipath_ht_get_ib_cfg; + dd->ipath_f_set_ib_cfg = ipath_ht_set_ib_cfg; + dd->ipath_f_config_jint = ipath_ht_config_jint; + dd->ipath_f_ib_updown = ipath_ht_ib_updown; /* * initialize chip-specific variables */ - dd->ipath_f_tidtemplate = ipath_ht_tidtemplate; - - /* - * setup the register offsets, since they are different for each - * chip - */ - dd->ipath_kregs = &ipath_ht_kregs; - dd->ipath_cregs = &ipath_ht_cregs; - - /* - * do very early init that is needed before ipath_f_bus is - * called - */ ipath_init_ht_variables(dd); } diff --git a/drivers/infiniband/hw/ipath/ipath_iba6120.c b/drivers/infiniband/hw/ipath/ipath_iba6120.c index 57915fd..597192e 100644 --- a/drivers/infiniband/hw/ipath/ipath_iba6120.c +++ b/drivers/infiniband/hw/ipath/ipath_iba6120.c @@ -329,6 +329,9 @@ static const struct ipath_cregs ipath_pe_cregs = { #define INFINIPATH_HWE_PCIE0PLLFAILED 0x0800000000000000ULL #define INFINIPATH_HWE_SERDESPLLFAILED 0x1000000000000000ULL +#define IBA6120_IBCS_LINKTRAININGSTATE_MASK 0xf +#define IBA6120_IBCS_LINKSTATE_SHIFT 4 + /* kr_extstatus bits */ #define INFINIPATH_EXTS_FREQSEL 0x2 #define INFINIPATH_EXTS_SERDESSEL 0x4 @@ -936,12 +939,27 @@ static int ipath_setup_pe_config(struct ipath_devdata *dd, else ipath_dev_err(dd, "Can't find PCI Express " "capability!\n"); + + dd->ipath_link_width_supported = IB_WIDTH_1X | IB_WIDTH_4X; + dd->ipath_link_speed_supported = IPATH_IB_SDR; + dd->ipath_link_width_enabled = IB_WIDTH_4X; + dd->ipath_link_speed_enabled = dd->ipath_link_speed_supported; + /* these can't change for this chip, so set once */ + 
dd->ipath_link_width_active = dd->ipath_link_width_enabled; + dd->ipath_link_speed_active = dd->ipath_link_speed_enabled; return 0; } static void ipath_init_pe_variables(struct ipath_devdata *dd) { /* + * setup the register offsets, since they are different for each + * chip + */ + dd->ipath_kregs = &ipath_pe_kregs; + dd->ipath_cregs = &ipath_pe_cregs; + + /* * bits for selecting i2c direction and values, * used for I2C serial flash */ @@ -950,6 +968,37 @@ static void ipath_init_pe_variables(struct ipath_devdata *dd) dd->ipath_gpio_sda = IPATH_GPIO_SDA; dd->ipath_gpio_scl = IPATH_GPIO_SCL; + /* + * Fill in data for field-values that change in newer chips. + * We dynamically specify only the mask for LINKTRAININGSTATE + * and only the shift for LINKSTATE, as they are the only ones + * that change. Also precalculate the 3 link states of interest + * and the combined mask. + */ + dd->ibcs_ls_shift = IBA6120_IBCS_LINKSTATE_SHIFT; + dd->ibcs_lts_mask = IBA6120_IBCS_LINKTRAININGSTATE_MASK; + dd->ibcs_mask = (INFINIPATH_IBCS_LINKSTATE_MASK << + dd->ibcs_ls_shift) | dd->ibcs_lts_mask; + dd->ib_init = (INFINIPATH_IBCS_LT_STATE_LINKUP << + INFINIPATH_IBCS_LINKTRAININGSTATE_SHIFT) | + (INFINIPATH_IBCS_L_STATE_INIT << dd->ibcs_ls_shift); + dd->ib_arm = (INFINIPATH_IBCS_LT_STATE_LINKUP << + INFINIPATH_IBCS_LINKTRAININGSTATE_SHIFT) | + (INFINIPATH_IBCS_L_STATE_ARM << dd->ibcs_ls_shift); + dd->ib_active = (INFINIPATH_IBCS_LT_STATE_LINKUP << + INFINIPATH_IBCS_LINKTRAININGSTATE_SHIFT) | + (INFINIPATH_IBCS_L_STATE_ACTIVE << dd->ibcs_ls_shift); + + /* + * Fill in data for ibcc field-values that change in newer chips. + * We dynamically specify only the mask for LINKINITCMD + * and only the shift for LINKCMD and MAXPKTLEN, as they are + * the only ones that change. + */ + dd->ibcc_lic_mask = INFINIPATH_IBCC_LINKINITCMD_MASK; + dd->ibcc_lc_shift = INFINIPATH_IBCC_LINKCMD_SHIFT; + dd->ibcc_mpl_shift = INFINIPATH_IBCC_MAXPKTLEN_SHIFT; + /* Fill in shifts for RcvCtrl. 
*/ dd->ipath_r_portenable_shift = INFINIPATH_R_PORTENABLE_SHIFT; dd->ipath_r_intravail_shift = INFINIPATH_R_INTRAVAIL_SHIFT; @@ -1003,6 +1052,8 @@ static void ipath_init_pe_variables(struct ipath_devdata *dd) dd->ipath_i_rcvavail_mask = INFINIPATH_I_RCVAVAIL_MASK; dd->ipath_i_rcvurg_mask = INFINIPATH_I_RCVURG_MASK; + dd->ipath_i_rcvavail_shift = INFINIPATH_I_RCVAVAIL_SHIFT; + dd->ipath_i_rcvurg_shift = INFINIPATH_I_RCVURG_SHIFT; /* * EEPROM error log 0 is TXE Parity errors. 1 is RXE Parity. @@ -1024,6 +1075,7 @@ static void ipath_init_pe_variables(struct ipath_devdata *dd) INFINIPATH_E_INVALIDADDR | INFINIPATH_E_RESET; + dd->delay_mult = 2; /* SDR, 4X, can't change */ } /* setup the MSI stuff again after a reset. I'd like to just call @@ -1329,6 +1381,9 @@ static int ipath_pe_early_init(struct ipath_devdata *dd) */ dd->ipath_rcvhdrentsize = 24; dd->ipath_rcvhdrsize = IPATH_DFLT_RCVHDRSIZE; + dd->ipath_rhf_offset = 0; + dd->ipath_egrtidbase = (u64 __iomem *) + ((char __iomem *) dd->ipath_kregbase + dd->ipath_rcvegrbase); /* * To truly support a 4KB MTU (for usermode), we need to @@ -1399,6 +1454,14 @@ static void ipath_pe_free_irq(struct ipath_devdata *dd) dd->ipath_irq = 0; } + +static struct ipath_message_header * +ipath_pe_get_msgheader(struct ipath_devdata *dd, __le32 *rhf_addr) +{ + return (struct ipath_message_header *) + &rhf_addr[sizeof(u64) / sizeof(u32)]; +} + static void ipath_pe_config_ports(struct ipath_devdata *dd, ushort cfgports) { dd->ipath_portcnt = @@ -1534,6 +1597,88 @@ static int ipath_pe_txe_recover(struct ipath_devdata *dd) return 1; } +/* no interrupt fallback for these chips */ +static int ipath_pe_nointr_fallback(struct ipath_devdata *dd) +{ + return 0; +} + + +/* + * reset the XGXS (between serdes and IBC). Slightly less intrusive + * than resetting the IBC or external link state, and useful in some + * cases to cause some retraining. To do this right, we reset IBC + * as well. 
+ */ +static void ipath_pe_xgxs_reset(struct ipath_devdata *dd) +{ + u64 val, prev_val; + + prev_val = ipath_read_kreg64(dd, dd->ipath_kregs->kr_xgxsconfig); + val = prev_val | INFINIPATH_XGXS_RESET; + prev_val &= ~INFINIPATH_XGXS_RESET; /* be sure */ + ipath_write_kreg(dd, dd->ipath_kregs->kr_control, + dd->ipath_control & ~INFINIPATH_C_LINKENABLE); + ipath_write_kreg(dd, dd->ipath_kregs->kr_xgxsconfig, val); + ipath_read_kreg32(dd, dd->ipath_kregs->kr_scratch); + ipath_write_kreg(dd, dd->ipath_kregs->kr_xgxsconfig, prev_val); + ipath_write_kreg(dd, dd->ipath_kregs->kr_control, + dd->ipath_control); +} + + +static int ipath_pe_get_ib_cfg(struct ipath_devdata *dd, int which) +{ + int ret; + + switch (which) { + case IPATH_IB_CFG_LWID: + ret = dd->ipath_link_width_active; + break; + case IPATH_IB_CFG_SPD: + ret = dd->ipath_link_speed_active; + break; + case IPATH_IB_CFG_LWID_ENB: + ret = dd->ipath_link_width_enabled; + break; + case IPATH_IB_CFG_SPD_ENB: + ret = dd->ipath_link_speed_enabled; + break; + default: + ret = -ENOTSUPP; + break; + } + return ret; +} + + +/* we assume range checking is already done, if needed */ +static int ipath_pe_set_ib_cfg(struct ipath_devdata *dd, int which, u32 val) +{ + int ret = 0; + + if (which == IPATH_IB_CFG_LWID_ENB) + dd->ipath_link_width_enabled = val; + else if (which == IPATH_IB_CFG_SPD_ENB) + dd->ipath_link_speed_enabled = val; + else + ret = -ENOTSUPP; + return ret; +} + +static void ipath_pe_config_jint(struct ipath_devdata *dd, u16 a, u16 b) +{ +} + + +static int ipath_pe_ib_updown(struct ipath_devdata *dd, int ibup, u64 ibcs) +{ + ipath_setup_pe_setextled(dd, ipath_ib_linkstate(dd, ibcs), + ipath_ib_linktrstate(dd, ibcs)); + return 0; +} + + /** * ipath_init_iba6120_funcs - set up the chip-specific function pointers * @dd: the infinipath device @@ -1554,7 +1699,7 @@ void ipath_init_iba6120_funcs(struct ipath_devdata *dd) dd->ipath_f_bringup_serdes = ipath_pe_bringup_serdes; dd->ipath_f_clear_tids = ipath_pe_clear_tids; 
/* - * this may get changed after we read the chip revision, + * _f_put_tid may get changed after we read the chip revision, * but we start with the safe version for all revs */ dd->ipath_f_put_tid = ipath_pe_put_tid; @@ -1562,19 +1707,19 @@ void ipath_init_iba6120_funcs(struct ipath_devdata *dd) dd->ipath_f_setextled = ipath_setup_pe_setextled; dd->ipath_f_get_base_info = ipath_pe_get_base_info; dd->ipath_f_free_irq = ipath_pe_free_irq; - - /* initialize chip-specific variables */ dd->ipath_f_tidtemplate = ipath_pe_tidtemplate; + dd->ipath_f_intr_fallback = ipath_pe_nointr_fallback; + dd->ipath_f_xgxs_reset = ipath_pe_xgxs_reset; + dd->ipath_f_get_msgheader = ipath_pe_get_msgheader; dd->ipath_f_config_ports = ipath_pe_config_ports; dd->ipath_f_read_counters = ipath_pe_read_counters; + dd->ipath_f_get_ib_cfg = ipath_pe_get_ib_cfg; + dd->ipath_f_set_ib_cfg = ipath_pe_set_ib_cfg; + dd->ipath_f_config_jint = ipath_pe_config_jint; + dd->ipath_f_ib_updown = ipath_pe_ib_updown; - /* - * setup the register offsets, since they are different for each - * chip - */ - dd->ipath_kregs = &ipath_pe_kregs; - dd->ipath_cregs = &ipath_pe_cregs; + /* initialize chip-specific variables */ ipath_init_pe_variables(dd); } diff --git a/drivers/infiniband/hw/ipath/ipath_kernel.h b/drivers/infiniband/hw/ipath/ipath_kernel.h index 07971a2..8691a1c 100644 --- a/drivers/infiniband/hw/ipath/ipath_kernel.h +++ b/drivers/infiniband/hw/ipath/ipath_kernel.h @@ -191,6 +191,22 @@ struct ipath_skbinfo { dma_addr_t phys; }; +/* + * Possible IB config parameters for ipath_f_get/set_ib_cfg() + */ +#define IPATH_IB_CFG_LIDLMC 0 /* Get/set LID (LS16b) and Mask (MS16b) */ +#define IPATH_IB_CFG_HRTBT 1 /* Get/set Heartbeat off/enable/auto */ +#define IPATH_IB_HRTBT_ON 3 /* Heartbeat enabled, sent every 100msec */ +#define IPATH_IB_HRTBT_OFF 0 /* Heartbeat off */ +#define IPATH_IB_CFG_LWID_ENB 2 /* Get/set allowed Link-width */ +#define IPATH_IB_CFG_LWID 3 /* Get currently active Link-width */ +#define 
IPATH_IB_CFG_SPD_ENB 4 /* Get/set allowed Link speeds */ +#define IPATH_IB_CFG_SPD 5 /* Get current Link speed */ +#define IPATH_IB_CFG_RXPOL_ENB 6 /* Get/set Auto-RX-polarity enable */ +#define IPATH_IB_CFG_LREV_ENB 7 /* Get/set Auto-Lane-reversal enable */ +#define IPATH_IB_CFG_LINKLATENCY 8 /* Get link latency */ + + struct ipath_devdata { struct list_head ipath_list; @@ -231,6 +247,8 @@ struct ipath_devdata { struct _ipath_layer ipath_layer; /* setup intr */ int (*ipath_f_intrsetup)(struct ipath_devdata *); + /* fallback to alternate interrupt type if possible */ + int (*ipath_f_intr_fallback)(struct ipath_devdata *); /* setup on-chip bus config */ int (*ipath_f_bus)(struct ipath_devdata *, struct pci_dev *); /* hard reset chip */ @@ -253,9 +271,18 @@ struct ipath_devdata { int (*ipath_f_get_base_info)(struct ipath_portdata *, void *); /* free irq */ void (*ipath_f_free_irq)(struct ipath_devdata *); + struct ipath_message_header *(*ipath_f_get_msgheader) + (struct ipath_devdata *, __le32 *); void (*ipath_f_config_ports)(struct ipath_devdata *, ushort); + int (*ipath_f_get_ib_cfg)(struct ipath_devdata *, int); + int (*ipath_f_set_ib_cfg)(struct ipath_devdata *, int, u32); + void (*ipath_f_config_jint)(struct ipath_devdata *, u16 , u16); void (*ipath_f_read_counters)(struct ipath_devdata *, - struct infinipath_counters *); + struct infinipath_counters *); + void (*ipath_f_xgxs_reset)(struct ipath_devdata *); + /* per chip actions needed for IB Link up/down changes */ + int (*ipath_f_ib_updown)(struct ipath_devdata *, int, u64); + struct ipath_ibdev *verbs_dev; struct timer_list verbs_timer; /* total dwords sent (summed from counter) */ @@ -375,6 +402,7 @@ struct ipath_devdata { struct page **ipath_pageshadow; /* shadow copy of dma handles for exp tid pages */ dma_addr_t *ipath_physshadow; + u64 __iomem *ipath_egrtidbase; /* lock to workaround chip bug 9437 */ spinlock_t ipath_tid_lock; spinlock_t ipath_sendctrl_lock; @@ -565,6 +593,14 @@ struct
ipath_devdata { u8 ipath_pci_cacheline; /* LID mask control */ u8 ipath_lmc; + /* link width supported */ + u8 ipath_link_width_supported; + /* link speed supported */ + u8 ipath_link_speed_supported; + u8 ipath_link_width_enabled; + u8 ipath_link_speed_enabled; + u8 ipath_link_width_active; + u8 ipath_link_speed_active; /* Rx Polarity inversion (compensate for ~tx on partner) */ u8 ipath_rx_pol_inv; @@ -599,6 +635,8 @@ struct ipath_devdata { */ u32 ipath_i_rcvavail_mask; u32 ipath_i_rcvurg_mask; + u16 ipath_i_rcvurg_shift; + u16 ipath_i_rcvavail_shift; /* * Register bits for selecting i2c direction and values, used for @@ -612,6 +650,29 @@ struct ipath_devdata { /* lock for doing RMW of shadows/regs for ExtCtrl and GPIO */ spinlock_t ipath_gpio_lock; + /* + * IB link and linktraining states and masks that vary per chip in + * some way. Set at init, to avoid recomputing on each IB status change + */ + u8 ibcs_ls_shift; + u8 ibcs_lts_mask; + u32 ibcs_mask; + u32 ib_init; + u32 ib_arm; + u32 ib_active; + + u16 ipath_rhf_offset; /* offset of RHF within receive header entry */ + + /* + * shift/mask for linkcmd, linkinitcmd, maxpktlen in ibccontrol + * reg. Changes for IBA7220 + */ + u8 ibcc_lic_mask; /* LinkInitCmd */ + u8 ibcc_lc_shift; /* LinkCmd */ + u8 ibcc_mpl_shift; /* Maxpktlen */ + + u8 delay_mult; + /* used to override LED behavior */ u8 ipath_led_override; /* Substituted for normal value, if non-zero */ u16 ipath_led_override_timeoff; /* delta to next timer event */ @@ -639,6 +700,10 @@ struct ipath_devdata { * each of the counters to increment.
*/ struct ipath_eep_log_mask ipath_eep_st_masks[IPATH_EEP_LOG_CNT]; + + /* interrupt mitigation reload register info */ + u16 ipath_jint_idle_ticks; /* idle clock ticks */ + u16 ipath_jint_max_packets; /* max packets across all ports */ }; /* Private data for file operations */ @@ -938,6 +1003,27 @@ static inline u64 ipath_read_ireg(const struct ipath_devdata *dd, ipath_kreg r) } /* + * from contents of IBCStatus (or a saved copy), return linkstate + * Report ACTIVE_DEFER as ACTIVE, because we treat them the same + * everywhere, anyway (and should be, for almost all purposes). + */ +static inline u32 ipath_ib_linkstate(struct ipath_devdata *dd, u64 ibcs) +{ + u32 state = (u32)(ibcs >> dd->ibcs_ls_shift) & + INFINIPATH_IBCS_LINKSTATE_MASK; + if (state == INFINIPATH_IBCS_L_STATE_ACT_DEFER) + state = INFINIPATH_IBCS_L_STATE_ACTIVE; + return state; +} + +/* from contents of IBCStatus (or a saved copy), return linktrainingstate */ +static inline u32 ipath_ib_linktrstate(struct ipath_devdata *dd, u64 ibcs) +{ + return (u32)(ibcs >> INFINIPATH_IBCS_LINKTRAININGSTATE_SHIFT) & + dd->ibcs_lts_mask; +} + +/* * sysfs interface. */ From arthur.jones at qlogic.com Tue Jan 15 16:18:24 2008 From: arthur.jones at qlogic.com (Arthur Jones) Date: Tue, 15 Jan 2008 16:18:24 -0800 Subject: [ofa-general] [PATCH 3/6] IB/ipath - new sysfs entries to control 7220 features In-Reply-To: <20080116001808.12687.13839.stgit@eng-46.internal.keyresearch.com> References: <20080116001808.12687.13839.stgit@eng-46.internal.keyresearch.com> Message-ID: <20080116001824.12687.97534.stgit@eng-46.internal.keyresearch.com> From: Michael Albaugh IBA7220 includes many more configurable IB settings. Getting/Setting these is now grouped into a pair of chip specific functions accessed via function pointers. Provide sysfs access to these settings. 
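For readers following the series, the get/set pattern described above can be sketched in user space roughly as follows. This is only an illustration under stated assumptions, not the driver code: struct devdata, the CFG_* selectors, the WIDTH_4X/SPEED_SDR values and show_cfg() are hypothetical stand-ins for the ipath_devdata fields, IPATH_IB_CFG_* constants and sysfs show routines in the patches, and ENOTSUP stands in for the kernel-only ENOTSUPP.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical selectors, mirroring the numbering of IPATH_IB_CFG_*. */
enum { CFG_LWID_ENB = 2, CFG_LWID = 3, CFG_SPD_ENB = 4, CFG_SPD = 5 };

/* Hypothetical encodings for link width and speed. */
enum { WIDTH_4X = 2, SPEED_SDR = 1, SPEED_DDR = 2 };

/* Cut-down stand-in for struct ipath_devdata. */
struct devdata {
	uint8_t link_width_enabled, link_width_active;
	uint8_t link_speed_enabled, link_speed_active;
	/* chip-specific accessors, filled in once at init time */
	int (*f_get_ib_cfg)(struct devdata *, int);
	int (*f_set_ib_cfg)(struct devdata *, int, uint32_t);
};

/* Older-chip getter: only the four width/speed selectors are known. */
static int old_chip_get_ib_cfg(struct devdata *dd, int which)
{
	switch (which) {
	case CFG_LWID:     return dd->link_width_active;
	case CFG_SPD:      return dd->link_speed_active;
	case CFG_LWID_ENB: return dd->link_width_enabled;
	case CFG_SPD_ENB:  return dd->link_speed_enabled;
	default:           return -ENOTSUP; /* kernel code uses -ENOTSUPP */
	}
}

/* Older-chip setter: only the "enabled" values can be stored. */
static int old_chip_set_ib_cfg(struct devdata *dd, int which, uint32_t val)
{
	if (which == CFG_LWID_ENB)
		dd->link_width_enabled = (uint8_t)val;
	else if (which == CFG_SPD_ENB)
		dd->link_speed_enabled = (uint8_t)val;
	else
		return -ENOTSUP;
	return 0;
}

/* Install the accessors and fixed defaults, as an init routine would. */
static void init_old_chip_funcs(struct devdata *dd)
{
	assert(dd != NULL);
	dd->link_width_enabled = dd->link_width_active = WIDTH_4X;
	dd->link_speed_enabled = dd->link_speed_active = SPEED_SDR;
	dd->f_get_ib_cfg = old_chip_get_ib_cfg;
	dd->f_set_ib_cfg = old_chip_set_ib_cfg;
}

/* sysfs-style "show": chip-independent, goes through the pointer. */
static int show_cfg(struct devdata *dd, int which, char *buf, size_t len)
{
	int ret = dd->f_get_ib_cfg(dd, which);

	if (ret >= 0)
		ret = snprintf(buf, len, "%d\n", ret);
	return ret;
}
```

The point of the indirection is that show_cfg() never needs to know which chip it is talking to; a chip that does not implement a selector simply returns a negative errno, which the caller passes back unchanged.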
Signed-off-by: Michael Albaugh --- drivers/infiniband/hw/ipath/ipath_kernel.h | 3 drivers/infiniband/hw/ipath/ipath_sysfs.c | 370 ++++++++++++++++++++++++++++ 2 files changed, 371 insertions(+), 2 deletions(-) diff --git a/drivers/infiniband/hw/ipath/ipath_kernel.h b/drivers/infiniband/hw/ipath/ipath_kernel.h index 8691a1c..41514fd 100644 --- a/drivers/infiniband/hw/ipath/ipath_kernel.h +++ b/drivers/infiniband/hw/ipath/ipath_kernel.h @@ -824,6 +824,9 @@ int ipath_set_rx_pol_inv(struct ipath_devdata *dd, u8 new_pol_inv); /* Use GPIO interrupts for new counters */ #define IPATH_GPIO_ERRINTRS 0x100000 #define IPATH_SWAP_PIOBUFS 0x200000 + /* Suppress heartbeat, even if turning off loopback */ +#define IPATH_NO_HRTBT 0x1000000 +#define IPATH_HAS_MULT_IB_SPEED 0x8000000 /* Bits in GPIO for the added interrupts */ #define IPATH_GPIO_PORT0_BIT 2 diff --git a/drivers/infiniband/hw/ipath/ipath_sysfs.c b/drivers/infiniband/hw/ipath/ipath_sysfs.c index e1ad7cf..2ec2727 100644 --- a/drivers/infiniband/hw/ipath/ipath_sysfs.c +++ b/drivers/infiniband/hw/ipath/ipath_sysfs.c @@ -363,6 +363,60 @@ static ssize_t show_unit(struct device *dev, return scnprintf(buf, PAGE_SIZE, "%u\n", dd->ipath_unit); } +static ssize_t show_jint_max_packets(struct device *dev, + struct device_attribute *attr, + char *buf) +{ + struct ipath_devdata *dd = dev_get_drvdata(dev); + + return scnprintf(buf, PAGE_SIZE, "%hu\n", dd->ipath_jint_max_packets); +} + +static ssize_t store_jint_max_packets(struct device *dev, + struct device_attribute *attr, + const char *buf, + size_t count) +{ + struct ipath_devdata *dd = dev_get_drvdata(dev); + u16 v = 0; + int ret; + + ret = ipath_parse_ushort(buf, &v); + if (ret < 0) + ipath_dev_err(dd, "invalid jint_max_packets.\n"); + else + dd->ipath_f_config_jint(dd, dd->ipath_jint_idle_ticks, v); + + return ret; +} + +static ssize_t show_jint_idle_ticks(struct device *dev, + struct device_attribute *attr, + char *buf) +{ + struct ipath_devdata *dd = dev_get_drvdata(dev); + 
+ return scnprintf(buf, PAGE_SIZE, "%hu\n", dd->ipath_jint_idle_ticks); +} + +static ssize_t store_jint_idle_ticks(struct device *dev, + struct device_attribute *attr, + const char *buf, + size_t count) +{ + struct ipath_devdata *dd = dev_get_drvdata(dev); + u16 v = 0; + int ret; + + ret = ipath_parse_ushort(buf, &v); + if (ret < 0) + ipath_dev_err(dd, "invalid jint_idle_ticks.\n"); + else + dd->ipath_f_config_jint(dd, v, dd->ipath_jint_max_packets); + + return ret; +} + #define DEVICE_COUNTER(name, attr) \ static ssize_t show_counter_##name(struct device *dev, \ struct device_attribute *attr, \ @@ -670,6 +724,257 @@ static ssize_t show_logged_errs(struct device *dev, return count; } +/* + * New sysfs entries to control various IB config. These all turn into + * accesses via ipath_f_get/set_ib_cfg. + * + * Get/Set heartbeat enable. Or of 1=enabled, 2=auto + */ +static ssize_t show_hrtbt_enb(struct device *dev, + struct device_attribute *attr, + char *buf) +{ + struct ipath_devdata *dd = dev_get_drvdata(dev); + int ret; + + ret = dd->ipath_f_get_ib_cfg(dd, IPATH_IB_CFG_HRTBT); + if (ret >= 0) + ret = scnprintf(buf, PAGE_SIZE, "%d\n", ret); + return ret; +} + +static ssize_t store_hrtbt_enb(struct device *dev, + struct device_attribute *attr, + const char *buf, + size_t count) +{ + struct ipath_devdata *dd = dev_get_drvdata(dev); + int ret, r; + u16 val; + + ret = ipath_parse_ushort(buf, &val); + if (ret >= 0 && val > 3) + ret = -EINVAL; + if (ret < 0) { + ipath_dev_err(dd, "attempt to set invalid Heartbeat enable\n"); + goto bail; + } + + /* + * Set the "intentional" heartbeat enable per either of + * "Enable" and "Auto", as these are normally set together. + * This bit is consulted when leaving loopback mode, + * because entering loopback mode overrides it and automatically + * disables heartbeat. 
+ */ + r = dd->ipath_f_set_ib_cfg(dd, IPATH_IB_CFG_HRTBT, val); + if (r < 0) + ret = r; + else if (val == IPATH_IB_HRTBT_OFF) + dd->ipath_flags |= IPATH_NO_HRTBT; + else + dd->ipath_flags &= ~IPATH_NO_HRTBT; + +bail: + return ret; +} + +/* + * Get/Set Link-widths enabled. Or of 1=1x, 2=4x (this is human/IB centric, + * _not_ the particular encoding of any given chip) + */ +static ssize_t show_lwid_enb(struct device *dev, + struct device_attribute *attr, + char *buf) +{ + struct ipath_devdata *dd = dev_get_drvdata(dev); + int ret; + + ret = dd->ipath_f_get_ib_cfg(dd, IPATH_IB_CFG_LWID_ENB); + if (ret >= 0) + ret = scnprintf(buf, PAGE_SIZE, "%d\n", ret); + return ret; +} + +static ssize_t store_lwid_enb(struct device *dev, + struct device_attribute *attr, + const char *buf, + size_t count) +{ + struct ipath_devdata *dd = dev_get_drvdata(dev); + int ret, r; + u16 val; + + ret = ipath_parse_ushort(buf, &val); + if (ret >= 0 && (val == 0 || val > 3)) + ret = -EINVAL; + if (ret < 0) { + ipath_dev_err(dd, + "attempt to set invalid Link Width (enable)\n"); + goto bail; + } + + r = dd->ipath_f_set_ib_cfg(dd, IPATH_IB_CFG_LWID_ENB, val); + if (r < 0) + ret = r; + +bail: + return ret; +} + +/* Get current link width */ +static ssize_t show_lwid(struct device *dev, + struct device_attribute *attr, + char *buf) + +{ + struct ipath_devdata *dd = dev_get_drvdata(dev); + int ret; + + ret = dd->ipath_f_get_ib_cfg(dd, IPATH_IB_CFG_LWID); + if (ret >= 0) + ret = scnprintf(buf, PAGE_SIZE, "%d\n", ret); + return ret; +} + +/* + * Get/Set Link-speeds enabled. Or of 1=SDR 2=DDR. 
+ */ +static ssize_t show_spd_enb(struct device *dev, + struct device_attribute *attr, + char *buf) +{ + struct ipath_devdata *dd = dev_get_drvdata(dev); + int ret; + + ret = dd->ipath_f_get_ib_cfg(dd, IPATH_IB_CFG_SPD_ENB); + if (ret >= 0) + ret = scnprintf(buf, PAGE_SIZE, "%d\n", ret); + return ret; +} + +static ssize_t store_spd_enb(struct device *dev, + struct device_attribute *attr, + const char *buf, + size_t count) +{ + struct ipath_devdata *dd = dev_get_drvdata(dev); + int ret, r; + u16 val; + + ret = ipath_parse_ushort(buf, &val); + if (ret >= 0 && (val == 0 || val > (IPATH_IB_SDR | IPATH_IB_DDR))) + ret = -EINVAL; + if (ret < 0) { + ipath_dev_err(dd, + "attempt to set invalid Link Speed (enable)\n"); + goto bail; + } + + r = dd->ipath_f_set_ib_cfg(dd, IPATH_IB_CFG_SPD_ENB, val); + if (r < 0) + ret = r; + +bail: + return ret; +} + +/* Get current link speed */ +static ssize_t show_spd(struct device *dev, + struct device_attribute *attr, + char *buf) +{ + struct ipath_devdata *dd = dev_get_drvdata(dev); + int ret; + + ret = dd->ipath_f_get_ib_cfg(dd, IPATH_IB_CFG_SPD); + if (ret >= 0) + ret = scnprintf(buf, PAGE_SIZE, "%d\n", ret); + return ret; +} + +/* + * Get/Set RX polarity-invert enable. 0=no, 1=yes. 
+ */ +static ssize_t show_rx_polinv_enb(struct device *dev, + struct device_attribute *attr, + char *buf) +{ + struct ipath_devdata *dd = dev_get_drvdata(dev); + int ret; + + ret = dd->ipath_f_get_ib_cfg(dd, IPATH_IB_CFG_RXPOL_ENB); + if (ret >= 0) + ret = scnprintf(buf, PAGE_SIZE, "%d\n", ret); + return ret; +} + +static ssize_t store_rx_polinv_enb(struct device *dev, + struct device_attribute *attr, + const char *buf, + size_t count) +{ + struct ipath_devdata *dd = dev_get_drvdata(dev); + int ret, r; + u16 val; + + ret = ipath_parse_ushort(buf, &val); + if (ret < 0 || val > 1) + goto invalid; + + r = dd->ipath_f_set_ib_cfg(dd, IPATH_IB_CFG_RXPOL_ENB, val); + if (r < 0) { + ret = r; + goto bail; + } + + goto bail; +invalid: + ipath_dev_err(dd, "attempt to set invalid Rx Polarity (enable)\n"); +bail: + return ret; +} +/* + * Get/Set RX lane-reversal enable. 0=no, 1=yes. + */ +static ssize_t show_lanerev_enb(struct device *dev, + struct device_attribute *attr, + char *buf) +{ + struct ipath_devdata *dd = dev_get_drvdata(dev); + int ret; + + ret = dd->ipath_f_get_ib_cfg(dd, IPATH_IB_CFG_LREV_ENB); + if (ret >= 0) + ret = scnprintf(buf, PAGE_SIZE, "%d\n", ret); + return ret; +} + +static ssize_t store_lanerev_enb(struct device *dev, + struct device_attribute *attr, + const char *buf, + size_t count) +{ + struct ipath_devdata *dd = dev_get_drvdata(dev); + int ret, r; + u16 val; + + ret = ipath_parse_ushort(buf, &val); + if (ret >= 0 && val > 1) { + ret = -EINVAL; + ipath_dev_err(dd, + "attempt to set invalid Lane reversal (enable)\n"); + goto bail; + } + + r = dd->ipath_f_set_ib_cfg(dd, IPATH_IB_CFG_LREV_ENB, val); + if (r < 0) + ret = r; + +bail: + return ret; +} + static DRIVER_ATTR(num_units, S_IRUGO, show_num_units, NULL); static DRIVER_ATTR(version, S_IRUGO, show_version, NULL); @@ -701,6 +1006,10 @@ static DEVICE_ATTR(unit, S_IRUGO, show_unit, NULL); static DEVICE_ATTR(rx_pol_inv, S_IWUSR, NULL, store_rx_pol_inv); static DEVICE_ATTR(led_override, S_IWUSR, NULL, 
store_led_override); static DEVICE_ATTR(logged_errors, S_IRUGO, show_logged_errs, NULL); +static DEVICE_ATTR(jint_max_packets, S_IWUSR | S_IRUGO, + show_jint_max_packets, store_jint_max_packets); +static DEVICE_ATTR(jint_idle_ticks, S_IWUSR | S_IRUGO, + show_jint_idle_ticks, store_jint_idle_ticks); static struct attribute *dev_attributes[] = { &dev_attr_guid.attr, @@ -727,6 +1036,34 @@ static struct attribute_group dev_attr_group = { .attrs = dev_attributes }; +static DEVICE_ATTR(hrtbt_enable, S_IWUSR | S_IRUGO, show_hrtbt_enb, + store_hrtbt_enb); +static DEVICE_ATTR(link_width_enable, S_IWUSR | S_IRUGO, show_lwid_enb, + store_lwid_enb); +static DEVICE_ATTR(link_width, S_IRUGO, show_lwid, NULL); +static DEVICE_ATTR(link_speed_enable, S_IWUSR | S_IRUGO, show_spd_enb, + store_spd_enb); +static DEVICE_ATTR(link_speed, S_IRUGO, show_spd, NULL); +static DEVICE_ATTR(rx_pol_inv_enable, S_IWUSR | S_IRUGO, show_rx_polinv_enb, + store_rx_polinv_enb); +static DEVICE_ATTR(rx_lane_rev_enable, S_IWUSR | S_IRUGO, show_lanerev_enb, + store_lanerev_enb); + +static struct attribute *dev_ibcfg_attributes[] = { + &dev_attr_hrtbt_enable.attr, + &dev_attr_link_width_enable.attr, + &dev_attr_link_width.attr, + &dev_attr_link_speed_enable.attr, + &dev_attr_link_speed.attr, + &dev_attr_rx_pol_inv_enable.attr, + &dev_attr_rx_lane_rev_enable.attr, + NULL +}; + +static struct attribute_group dev_ibcfg_attr_group = { + .attrs = dev_ibcfg_attributes +}; + /** * ipath_expose_reset - create a device reset file * @dev: the device structure @@ -782,9 +1119,31 @@ int ipath_device_create_group(struct device *dev, struct ipath_devdata *dd) snprintf(unit, sizeof(unit), "%02d", dd->ipath_unit); ret = sysfs_create_link(&dev->driver->kobj, &dev->kobj, unit); - if (ret == 0) - goto bail; + if (ret) + goto bail_counter; + + if (dd->ipath_flags & IPATH_HAS_MULT_IB_SPEED) { + ret = device_create_file(dev, &dev_attr_jint_idle_ticks); + if (ret) + goto bail_unit; + ret = device_create_file(dev, 
&dev_attr_jint_max_packets); + if (ret) + goto bail_idle; + + ret = sysfs_create_group(&dev->kobj, &dev_ibcfg_attr_group); + if (ret) + goto bail_max; + } + + goto bail; +bail_max: + device_remove_file(dev, &dev_attr_jint_max_packets); +bail_idle: + device_remove_file(dev, &dev_attr_jint_idle_ticks); +bail_unit: + sysfs_remove_link(&dev->driver->kobj, unit); +bail_counter: sysfs_remove_group(&dev->kobj, &dev_counter_attr_group); bail_attrs: sysfs_remove_group(&dev->kobj, &dev_attr_group); @@ -800,6 +1159,13 @@ void ipath_device_remove_group(struct device *dev, struct ipath_devdata *dd) sysfs_remove_link(&dev->driver->kobj, unit); sysfs_remove_group(&dev->kobj, &dev_counter_attr_group); + + if (dd->ipath_flags & IPATH_HAS_MULT_IB_SPEED) { + sysfs_remove_group(&dev->kobj, &dev_ibcfg_attr_group); + device_remove_file(dev, &dev_attr_jint_idle_ticks); + device_remove_file(dev, &dev_attr_jint_max_packets); + } + sysfs_remove_group(&dev->kobj, &dev_attr_group); device_remove_file(dev, &dev_attr_reset); From arthur.jones at qlogic.com Tue Jan 15 16:18:29 2008 From: arthur.jones at qlogic.com (Arthur Jones) Date: Tue, 15 Jan 2008 16:18:29 -0800 Subject: [ofa-general] [PATCH 4/6] IB/ipath - minor cleanup of unused fields, and chip-specific errors In-Reply-To: <20080116001808.12687.13839.stgit@eng-46.internal.keyresearch.com> References: <20080116001808.12687.13839.stgit@eng-46.internal.keyresearch.com> Message-ID: <20080116001829.12687.96664.stgit@eng-46.internal.keyresearch.com> From: Dave Olson clean up some unused header fields, minor related cleanup Signed-off-by: Dave Olson --- drivers/infiniband/hw/ipath/ipath_iba6120.c | 79 ++++++++------------------- drivers/infiniband/hw/ipath/ipath_kernel.h | 5 +- 2 files changed, 25 insertions(+), 59 deletions(-) diff --git a/drivers/infiniband/hw/ipath/ipath_iba6120.c b/drivers/infiniband/hw/ipath/ipath_iba6120.c index 597192e..c7a2f50 100644 --- a/drivers/infiniband/hw/ipath/ipath_iba6120.c +++ 
b/drivers/infiniband/hw/ipath/ipath_iba6120.c @@ -373,10 +373,28 @@ static const struct ipath_hwerror_msgs ipath_6120_hwerror_msgs[] = { INFINIPATH_HWE_TXEMEMPARITYERR_PIOPBC) \ << INFINIPATH_HWE_TXEMEMPARITYERR_SHIFT) -static int ipath_pe_txe_recover(struct ipath_devdata *); static void ipath_pe_put_tid_2(struct ipath_devdata *, u64 __iomem *, u32, unsigned long); +/* + * On platforms using this chip, and not having ordered WC stores, we + * can get TXE parity errors due to speculative reads to the PIO buffers, + * and this, due to a chip bug can result in (many) false parity error + * reports. So it's a debug print on those, and an info print on systems + * where the speculative reads don't occur. + */ +static void ipath_pe_txe_recover(struct ipath_devdata *dd) +{ + if (ipath_unordered_wc()) + ipath_dbg("Recovering from TXE PIO parity error\n"); + else { + ++ipath_stats.sps_txeparity; + dev_info(&dd->pcidev->dev, + "Recovering from TXE PIO parity error\n"); + } +} + + /** * ipath_pe_handle_hwerrors - display hardware errors. * @dd: the infinipath device @@ -456,35 +474,11 @@ static void ipath_pe_handle_hwerrors(struct ipath_devdata *dd, char *msg, * occur if a processor speculative read is done to the PIO * buffer while we are sending a packet, for example. 
*/ - if ((hwerrs & TXE_PIO_PARITY) && ipath_pe_txe_recover(dd)) + if (hwerrs & TXE_PIO_PARITY) { + ipath_pe_txe_recover(dd); hwerrs &= ~TXE_PIO_PARITY; - if (hwerrs) { - /* - * if any set that we aren't ignoring only make the - * complaint once, in case it's stuck or recurring, - * and we get here multiple times - * Force link down, so switch knows, and - * LEDs are turned off - */ - if (dd->ipath_flags & IPATH_INITTED) { - ipath_set_linkstate(dd, IPATH_IB_LINKDOWN); - ipath_setup_pe_setextled(dd, - INFINIPATH_IBCS_L_STATE_DOWN, - INFINIPATH_IBCS_LT_STATE_DISABLED); - ipath_dev_err(dd, "Fatal Hardware Error (freeze " - "mode), no longer usable, SN %.16s\n", - dd->ipath_serial); - isfatal = 1; - } - /* - * Mark as having had an error for driver, and also - * for /sys and status word mapped to user programs. - * This marks unit as not usable, until reset - */ - *dd->ipath_statusp &= ~IPATH_STATUS_IB_READY; - *dd->ipath_statusp |= IPATH_STATUS_HWERROR; - dd->ipath_flags &= ~IPATH_INITTED; - } else { + } + if (!hwerrs) { static u32 freeze_cnt; freeze_cnt++; @@ -1569,33 +1563,6 @@ static void ipath_pe_read_counters(struct ipath_devdata *dd, cntrs->RxDlidFltrCnt = 0; } -/* - * On platforms using this chip, and not having ordered WC stores, we - * can get TXE parity errors due to speculative reads to the PIO buffers, - * and this, due to a chip bug can result in (many) false parity error - * reports. So it's a debug print on those, and an info print on systems - * where the speculative reads don't occur. - * Because we can get lots of false errors, we have no upper limit - * on recovery attempts on those platforms. 
- */ -static int ipath_pe_txe_recover(struct ipath_devdata *dd) -{ - if (ipath_unordered_wc()) - ipath_dbg("Recovering from TXE PIO parity error\n"); - else { - int cnt = ++ipath_stats.sps_txeparity; - if (cnt >= IPATH_MAX_PARITY_ATTEMPTS) { - if (cnt == IPATH_MAX_PARITY_ATTEMPTS) - ipath_dev_err(dd, - "Too many attempts to recover from " - "TXE parity, giving up\n"); - return 0; - } - dev_info(&dd->pcidev->dev, - "Recovering from TXE PIO parity error\n"); - } - return 1; -} /* no interrupt fallback for these chips */ static int ipath_pe_nointr_fallback(struct ipath_devdata *dd) diff --git a/drivers/infiniband/hw/ipath/ipath_kernel.h b/drivers/infiniband/hw/ipath/ipath_kernel.h index 41514fd..8351eec 100644 --- a/drivers/infiniband/hw/ipath/ipath_kernel.h +++ b/drivers/infiniband/hw/ipath/ipath_kernel.h @@ -396,14 +396,13 @@ struct ipath_devdata { unsigned long ipath_wc_len; /* ref count for each pkey */ atomic_t ipath_pkeyrefs[4]; - /* shadow copy of all exptids physaddr; used only by funcsim */ - u64 *ipath_tidsimshadow; /* shadow copy of struct page *'s for exp tid pages */ struct page **ipath_pageshadow; /* shadow copy of dma handles for exp tid pages */ dma_addr_t *ipath_physshadow; u64 __iomem *ipath_egrtidbase; - /* lock to workaround chip bug 9437 */ + /* lock to workaround chip bug 9437 and others */ + spinlock_t ipath_kernel_tid_lock; spinlock_t ipath_tid_lock; spinlock_t ipath_sendctrl_lock; From arthur.jones at qlogic.com Tue Jan 15 16:18:34 2008 From: arthur.jones at qlogic.com (Arthur Jones) Date: Tue, 15 Jan 2008 16:18:34 -0800 Subject: [ofa-general] [PATCH 5/6] IB/ipath - changes to support PIO bandwidth check on IBA7220 In-Reply-To: <20080116001808.12687.13839.stgit@eng-46.internal.keyresearch.com> References: <20080116001808.12687.13839.stgit@eng-46.internal.keyresearch.com> Message-ID: <20080116001834.12687.85457.stgit@eng-46.internal.keyresearch.com> From: Dave Olson The IBA7220 uses a count-based triggering mechanism, and therefore can't use 
the same bandwidth verification mechanism as older chips. To support the 7220, allow enabling and disabling armlaunch errors on application request. Minor robustness improvements as well. Signed-off-by: Dave Olson --- drivers/infiniband/hw/ipath/ipath_common.h | 5 +++- drivers/infiniband/hw/ipath/ipath_driver.c | 32 ++++++++++++++++++++++++++ drivers/infiniband/hw/ipath/ipath_file_ops.c | 11 +++++++++ drivers/infiniband/hw/ipath/ipath_kernel.h | 2 ++ 4 files changed, 49 insertions(+), 1 deletions(-) diff --git a/drivers/infiniband/hw/ipath/ipath_common.h b/drivers/infiniband/hw/ipath/ipath_common.h index 0fa43ba..4146210 100644 --- a/drivers/infiniband/hw/ipath/ipath_common.h +++ b/drivers/infiniband/hw/ipath/ipath_common.h @@ -443,8 +443,9 @@ struct ipath_user_info { #define IPATH_CMD_UNUSED_2 26 #define IPATH_CMD_PIOAVAILUPD 27 /* force an update of PIOAvail reg */ #define IPATH_CMD_POLL_TYPE 28 /* set the kind of polling we want */ +#define IPATH_CMD_ARMLAUNCH_CTRL 29 /* armlaunch detection control */ -#define IPATH_CMD_MAX 28 +#define IPATH_CMD_MAX 29 /* * Poll types @@ -487,6 +488,8 @@ struct ipath_cmd { __u64 port_info; /* enable/disable receipt of packets */ __u32 recv_ctrl; + /* enable/disable armlaunch errors (non-zero to enable) */ + __u32 armlaunch_ctrl; /* partition key to set */ __u16 part_key; /* user address of __u32 bitmask of active slaves */ diff --git a/drivers/infiniband/hw/ipath/ipath_driver.c b/drivers/infiniband/hw/ipath/ipath_driver.c index 7e0966e..99f2d25 100644 --- a/drivers/infiniband/hw/ipath/ipath_driver.c +++ b/drivers/infiniband/hw/ipath/ipath_driver.c @@ -331,6 +331,8 @@ static void ipath_verify_pioperf(struct ipath_devdata *dd) udelay(1); } + ipath_disable_armlaunch(dd); + writeq(0, piobuf); /* length 0, no dwords actually sent */ ipath_flush_wc(); @@ -362,6 +364,7 @@ static void ipath_verify_pioperf(struct ipath_devdata *dd) done: /* disarm piobuf, so it's available again */ ipath_disarm_piobufs(dd, pbnum, 1); + 
ipath_enable_armlaunch(dd); } static int __devinit ipath_init_one(struct pci_dev *pdev, @@ -2280,5 +2283,34 @@ int ipath_set_rx_pol_inv(struct ipath_devdata *dd, u8 new_pol_inv) } return 0; } + +/* + * Disable and enable the armlaunch error. Used for PIO bandwidth testing on + * the 7220, which is count-based, rather than trigger-based. Safe for the + * driver check, since it's at init. Not completely safe when used for + * user-mode checking, since some error checking can be lost, but not + * particularly risky, and only has problematic side-effects in the face of + * very buggy user code. There is no reference counting, but that's also + * fine, given the intended use. + */ +void ipath_enable_armlaunch(struct ipath_devdata *dd) +{ + dd->ipath_lasterror &= ~INFINIPATH_E_SPIOARMLAUNCH; + ipath_write_kreg(dd, dd->ipath_kregs->kr_errorclear, + INFINIPATH_E_SPIOARMLAUNCH); + dd->ipath_errormask |= INFINIPATH_E_SPIOARMLAUNCH; + ipath_write_kreg(dd, dd->ipath_kregs->kr_errormask, + dd->ipath_errormask); +} + +void ipath_disable_armlaunch(struct ipath_devdata *dd) +{ + /* so don't re-enable if already set */ + dd->ipath_maskederrs &= ~INFINIPATH_E_SPIOARMLAUNCH; + dd->ipath_errormask &= ~INFINIPATH_E_SPIOARMLAUNCH; + ipath_write_kreg(dd, dd->ipath_kregs->kr_errormask, + dd->ipath_errormask); +} + module_init(infinipath_init); module_exit(infinipath_cleanup); diff --git a/drivers/infiniband/hw/ipath/ipath_file_ops.c b/drivers/infiniband/hw/ipath/ipath_file_ops.c index 7b2f59a..7e025c8 100644 --- a/drivers/infiniband/hw/ipath/ipath_file_ops.c +++ b/drivers/infiniband/hw/ipath/ipath_file_ops.c @@ -2224,6 +2224,11 @@ static ssize_t ipath_write(struct file *fp, const char __user *data, dest = &cmd.cmd.poll_type; src = &ucmd->cmd.poll_type; break; + case IPATH_CMD_ARMLAUNCH_CTRL: + copy = sizeof(cmd.cmd.armlaunch_ctrl); + dest = &cmd.cmd.armlaunch_ctrl; + src = &ucmd->cmd.armlaunch_ctrl; + break; default: ret = -EINVAL; goto bail; @@ -2299,6 +2304,12 @@ static ssize_t 
ipath_write(struct file *fp, const char __user *data, case IPATH_CMD_POLL_TYPE: pd->poll_type = cmd.cmd.poll_type; break; + case IPATH_CMD_ARMLAUNCH_CTRL: + if (cmd.cmd.armlaunch_ctrl) + ipath_enable_armlaunch(pd->port_dd); + else + ipath_disable_armlaunch(pd->port_dd); + break; } if (ret >= 0) diff --git a/drivers/infiniband/hw/ipath/ipath_kernel.h b/drivers/infiniband/hw/ipath/ipath_kernel.h index 8351eec..1fd872f 100644 --- a/drivers/infiniband/hw/ipath/ipath_kernel.h +++ b/drivers/infiniband/hw/ipath/ipath_kernel.h @@ -771,6 +771,8 @@ int ipath_set_linkstate(struct ipath_devdata *, u8); int ipath_set_mtu(struct ipath_devdata *, u16); int ipath_set_lid(struct ipath_devdata *, u32, u8); int ipath_set_rx_pol_inv(struct ipath_devdata *dd, u8 new_pol_inv); +void ipath_enable_armlaunch(struct ipath_devdata *); +void ipath_disable_armlaunch(struct ipath_devdata *); /* for use in system calls, where we want to know device type, etc. */ #define port_fp(fp) ((struct ipath_filedata *)(fp)->private_data)->pd From arthur.jones at qlogic.com Tue Jan 15 16:18:39 2008 From: arthur.jones at qlogic.com (Arthur Jones) Date: Tue, 15 Jan 2008 16:18:39 -0800 Subject: [ofa-general] [PATCH 6/6] IB/ipath - add mappings from hw register to PortInfo port physical state In-Reply-To: <20080116001808.12687.13839.stgit@eng-46.internal.keyresearch.com> References: <20080116001808.12687.13839.stgit@eng-46.internal.keyresearch.com> Message-ID: <20080116001839.12687.78755.stgit@eng-46.internal.keyresearch.com> From: Ralph Campbell Add new mappings from port physical state (a HW register value) to the IB SubnGet(PortInfo) port physical state. 
Signed-off-by: Ralph Campbell --- drivers/infiniband/hw/ipath/ipath_verbs.c | 47 +++++++++++++++++++---------- drivers/infiniband/hw/ipath/ipath_verbs.h | 10 ++++++ 2 files changed, 41 insertions(+), 16 deletions(-) diff --git a/drivers/infiniband/hw/ipath/ipath_verbs.c b/drivers/infiniband/hw/ipath/ipath_verbs.c index 904ff15..32d8f88 100644 --- a/drivers/infiniband/hw/ipath/ipath_verbs.c +++ b/drivers/infiniband/hw/ipath/ipath_verbs.c @@ -1133,20 +1133,34 @@ static int ipath_query_device(struct ib_device *ibdev, return 0; } -const u8 ipath_cvt_physportstate[16] = { - [INFINIPATH_IBCS_LT_STATE_DISABLED] = 3, - [INFINIPATH_IBCS_LT_STATE_LINKUP] = 5, - [INFINIPATH_IBCS_LT_STATE_POLLACTIVE] = 2, - [INFINIPATH_IBCS_LT_STATE_POLLQUIET] = 2, - [INFINIPATH_IBCS_LT_STATE_SLEEPDELAY] = 1, - [INFINIPATH_IBCS_LT_STATE_SLEEPQUIET] = 1, - [INFINIPATH_IBCS_LT_STATE_CFGDEBOUNCE] = 4, - [INFINIPATH_IBCS_LT_STATE_CFGRCVFCFG] = 4, - [INFINIPATH_IBCS_LT_STATE_CFGWAITRMT] = 4, - [INFINIPATH_IBCS_LT_STATE_CFGIDLE] = 4, - [INFINIPATH_IBCS_LT_STATE_RECOVERRETRAIN] = 6, - [INFINIPATH_IBCS_LT_STATE_RECOVERWAITRMT] = 6, - [INFINIPATH_IBCS_LT_STATE_RECOVERIDLE] = 6, +const u8 ipath_cvt_physportstate[32] = { + [INFINIPATH_IBCS_LT_STATE_DISABLED] = IB_PHYSPORTSTATE_DISABLED, + [INFINIPATH_IBCS_LT_STATE_LINKUP] = IB_PHYSPORTSTATE_LINKUP, + [INFINIPATH_IBCS_LT_STATE_POLLACTIVE] = IB_PHYSPORTSTATE_POLL, + [INFINIPATH_IBCS_LT_STATE_POLLQUIET] = IB_PHYSPORTSTATE_POLL, + [INFINIPATH_IBCS_LT_STATE_SLEEPDELAY] = IB_PHYSPORTSTATE_SLEEP, + [INFINIPATH_IBCS_LT_STATE_SLEEPQUIET] = IB_PHYSPORTSTATE_SLEEP, + [INFINIPATH_IBCS_LT_STATE_CFGDEBOUNCE] = + IB_PHYSPORTSTATE_CFG_TRAIN, + [INFINIPATH_IBCS_LT_STATE_CFGRCVFCFG] = + IB_PHYSPORTSTATE_CFG_TRAIN, + [INFINIPATH_IBCS_LT_STATE_CFGWAITRMT] = + IB_PHYSPORTSTATE_CFG_TRAIN, + [INFINIPATH_IBCS_LT_STATE_CFGIDLE] = IB_PHYSPORTSTATE_CFG_TRAIN, + [INFINIPATH_IBCS_LT_STATE_RECOVERRETRAIN] = + IB_PHYSPORTSTATE_LINK_ERR_RECOVER, + 
[INFINIPATH_IBCS_LT_STATE_RECOVERWAITRMT] = + IB_PHYSPORTSTATE_LINK_ERR_RECOVER, + [INFINIPATH_IBCS_LT_STATE_RECOVERIDLE] = + IB_PHYSPORTSTATE_LINK_ERR_RECOVER, + [0x10] = IB_PHYSPORTSTATE_CFG_TRAIN, + [0x11] = IB_PHYSPORTSTATE_CFG_TRAIN, + [0x12] = IB_PHYSPORTSTATE_CFG_TRAIN, + [0x13] = IB_PHYSPORTSTATE_CFG_TRAIN, + [0x14] = IB_PHYSPORTSTATE_CFG_TRAIN, + [0x15] = IB_PHYSPORTSTATE_CFG_TRAIN, + [0x16] = IB_PHYSPORTSTATE_CFG_TRAIN, + [0x17] = IB_PHYSPORTSTATE_CFG_TRAIN }; u32 ipath_get_cr_errpkey(struct ipath_devdata *dd) @@ -1171,8 +1185,9 @@ static int ipath_query_port(struct ib_device *ibdev, ibcstat = dd->ipath_lastibcstat; props->state = ((ibcstat >> 4) & 0x3) + 1; /* See phys_state_show() */ - props->phys_state = ipath_cvt_physportstate[ - dd->ipath_lastibcstat & 0xf]; + props->phys_state = /* MEA: assumes shift == 0 */ + ipath_cvt_physportstate[dd->ipath_lastibcstat & + dd->ibcs_lts_mask]; props->port_cap_flags = dev->port_cap_flags; props->gid_tbl_len = 1; props->max_msg_sz = 0x80000000; diff --git a/drivers/infiniband/hw/ipath/ipath_verbs.h b/drivers/infiniband/hw/ipath/ipath_verbs.h index 1c89850..3d59736 100644 --- a/drivers/infiniband/hw/ipath/ipath_verbs.h +++ b/drivers/infiniband/hw/ipath/ipath_verbs.h @@ -832,7 +832,17 @@ unsigned ipath_get_pkey(struct ipath_devdata *, unsigned); extern const enum ib_wc_opcode ib_ipath_wc_opcode[]; +/* + * Below converts HCA-specific LinkTrainingState to IB PhysPortState + * values. 
+ */ extern const u8 ipath_cvt_physportstate[]; +#define IB_PHYSPORTSTATE_SLEEP 1 +#define IB_PHYSPORTSTATE_POLL 2 +#define IB_PHYSPORTSTATE_DISABLED 3 +#define IB_PHYSPORTSTATE_CFG_TRAIN 4 +#define IB_PHYSPORTSTATE_LINKUP 5 +#define IB_PHYSPORTSTATE_LINK_ERR_RECOVER 6 extern const int ib_ipath_state_ops[]; From arthur.jones at qlogic.com Tue Jan 15 16:25:07 2008 From: arthur.jones at qlogic.com (Arthur Jones) Date: Tue, 15 Jan 2008 16:25:07 -0800 Subject: [ofa-general] [PATCH] IB/ipath - bugfixes for 2.6.24 In-Reply-To: <20080115235808.7794.67134.stgit@eng-46.internal.keyresearch.com> References: <20080115235808.7794.67134.stgit@eng-46.internal.keyresearch.com> Message-ID: <20080116002507.GA12792@bauxite.pathscale.com> hmm, it looks like PATCH 1/2 didn't make it at least to open-fabrics, so i've attached it... arthur On Tue, Jan 15, 2008 at 03:58:08PM -0800, Arthur Jones wrote: > hi roland, these fix a long-sought bug with > ipath hardware and the MVAPICH over verbs > software stack and a use after free bug. > i'd like to squeeze this in to 2.6.24 if > there's still a chance... > > these changes are avail for git pull from (note > the new branch name for 2.6.24): > > git://git.qlogic.com/ipath-linux-2.6 for-roland-2.6.24 > > arthur > _______________________________________________ > general mailing list > general at lists.openfabrics.org > http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general > > To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general -------------- next part -------------- IB/ipath - fix UD send with immediate From: Ralph Campbell This fixes a small bug which incorrectly calculated the header size for UD send with immediate and therefore dropped packets. 
Signed-off-by: Ralph Campbell --- drivers/infiniband/hw/ipath/ipath_ud.c | 47 ++++++++++++++++---------------- 1 files changed, 23 insertions(+), 24 deletions(-) diff --git a/drivers/infiniband/hw/ipath/ipath_ud.c b/drivers/infiniband/hw/ipath/ipath_ud.c index 16a2a93..de67eed 100644 --- a/drivers/infiniband/hw/ipath/ipath_ud.c +++ b/drivers/infiniband/hw/ipath/ipath_ud.c @@ -301,8 +301,6 @@ int ipath_make_ud_req(struct ipath_qp *qp) /* header size in 32-bit words LRH+BTH+DETH = (8+12+8)/4. */ qp->s_hdrwords = 7; - if (wqe->wr.opcode == IB_WR_SEND_WITH_IMM) - qp->s_hdrwords++; qp->s_cur_size = wqe->length; qp->s_cur_sge = &qp->s_sge; qp->s_wqe = wqe; @@ -327,6 +325,7 @@ int ipath_make_ud_req(struct ipath_qp *qp) ohdr = &qp->s_hdr.u.oth; } if (wqe->wr.opcode == IB_WR_SEND_WITH_IMM) { + qp->s_hdrwords++; ohdr->u.ud.imm_data = wqe->wr.imm_data; bth0 = IB_OPCODE_UD_SEND_ONLY_WITH_IMMEDIATE << 24; } else @@ -455,6 +454,28 @@ void ipath_ud_rcv(struct ipath_ibdev *dev, struct ipath_ib_header *hdr, } } + /* + * The opcode is in the low byte when its in network order + * (top byte when in host order). + */ + opcode = be32_to_cpu(ohdr->bth[0]) >> 24; + if (qp->ibqp.qp_num > 1 && + opcode == IB_OPCODE_UD_SEND_ONLY_WITH_IMMEDIATE) { + if (header_in_data) { + wc.imm_data = *(__be32 *) data; + data += sizeof(__be32); + } else + wc.imm_data = ohdr->u.ud.imm_data; + wc.wc_flags = IB_WC_WITH_IMM; + hdrsize += sizeof(u32); + } else if (opcode == IB_OPCODE_UD_SEND_ONLY) { + wc.imm_data = 0; + wc.wc_flags = 0; + } else { + dev->n_pkt_drops++; + goto bail; + } + /* Get the number of bytes the message was padded by. */ pad = (be32_to_cpu(ohdr->bth[0]) >> 20) & 3; if (unlikely(tlen < (hdrsize + pad + 4))) { @@ -482,28 +503,6 @@ void ipath_ud_rcv(struct ipath_ibdev *dev, struct ipath_ib_header *hdr, wc.byte_len = tlen + sizeof(struct ib_grh); /* - * The opcode is in the low byte when its in network order - * (top byte when in host order). 
- */ - opcode = be32_to_cpu(ohdr->bth[0]) >> 24; - if (qp->ibqp.qp_num > 1 && - opcode == IB_OPCODE_UD_SEND_ONLY_WITH_IMMEDIATE) { - if (header_in_data) { - wc.imm_data = *(__be32 *) data; - data += sizeof(__be32); - } else - wc.imm_data = ohdr->u.ud.imm_data; - wc.wc_flags = IB_WC_WITH_IMM; - hdrsize += sizeof(u32); - } else if (opcode == IB_OPCODE_UD_SEND_ONLY) { - wc.imm_data = 0; - wc.wc_flags = 0; - } else { - dev->n_pkt_drops++; - goto bail; - } - - /* * Get the next work request entry to find where to put the data. */ if (qp->r_reuse_sge) From sean.hefty at intel.com Tue Jan 15 16:28:16 2008 From: sean.hefty at intel.com (Sean Hefty) Date: Tue, 15 Jan 2008 16:28:16 -0800 Subject: [ofa-general] RE: new API for the rdma-cma In-Reply-To: <478D43DD.1090007@opengridcomputing.com> References: <478D346B.8070506@opengridcomputing.com> <000801c857cc$0d6c5ed0$a937170a@amr.corp.intel.com> <478D43DD.1090007@opengridcomputing.com> Message-ID: <000a01c857d6$b046b1a0$a937170a@amr.corp.intel.com> >I'm prototyping this now (as part of our OMPI/rdma-cm/iwarp work). I'm not sure about what the interface should be, since there could be multiple addresses (IPv4 and IPv6) on a port. As a generality, my preference is to use sockaddr where possible. 
The only ideas I can come up with are APIs such as:

struct sockaddr **rdma_get_addr_list(struct ibv_context *verbs,
                                     uint8_t port_num);
void rdma_free_add_list(struct sockaddr **list);

or, treating this more like a verbs call:

int rdma_query_addr(struct ibv_context *verbs, uint8_t port_num,
                    int index, struct sockaddr *addr, int addr_len);

- Sean

From jgunthorpe at obsidianresearch.com Tue Jan 15 17:13:08 2008
From: jgunthorpe at obsidianresearch.com (Jason Gunthorpe)
Date: Tue, 15 Jan 2008 18:13:08 -0700
Subject: [ofa-general] RE: new API for the rdma-cma
In-Reply-To: <000a01c857d6$b046b1a0$a937170a@amr.corp.intel.com>
References: <478D346B.8070506@opengridcomputing.com> <000801c857cc$0d6c5ed0$a937170a@amr.corp.intel.com> <478D43DD.1090007@opengridcomputing.com> <000a01c857d6$b046b1a0$a937170a@amr.corp.intel.com>
Message-ID: <20080116011308.GK28360@obsidianresearch.com>

On Tue, Jan 15, 2008 at 04:28:16PM -0800, Sean Hefty wrote:
> >I'm prototyping this now (as part of our OMPI/rdma-cm/iwarp work).
>
> I'm not sure about what the interface should be, since there could
> be multiple addresses (IPv4 and IPv6) on a port. As a generality,
> my preference is to use sockaddr where possible.

Generally sockaddr alone for this kind of purpose is frowned on in my
view since there is no portable way to get the address length, and all
downstream functions require the length..

The best interface is something like getaddrinfo that returns a new
type that has the family, address and length in the structure.

Also, your second version would require the address array to be of
type sockaddr_storage, since sockaddr can only point to, not store
addresses..

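A minimal sketch of the getaddrinfo-like shape described above, as one way it could look. The type rdma_addr_entry and the helper rdma_addr_entry_set are hypothetical names invented for illustration and are not part of librdmacm; the only real interfaces used are struct sockaddr, struct sockaddr_storage and socklen_t from the sockets API. The entry keeps family and length next to storage that is large enough for either IPv4 or IPv6.

```c
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>

/* Hypothetical entry type (not a librdmacm type): family and length
 * travel together with storage big enough for any address family. */
struct rdma_addr_entry {
	sa_family_t             family;  /* AF_INET or AF_INET6 */
	socklen_t               addrlen; /* number of valid bytes in addr */
	struct sockaddr_storage addr;    /* stores the address itself */
};

/* Hypothetical helper: copy a caller-supplied address into an entry.
 * Returns 0 on success, -1 if the length is implausible. */
static int rdma_addr_entry_set(struct rdma_addr_entry *e,
			       const struct sockaddr *sa, socklen_t len)
{
	if (len < sizeof(sa->sa_family) || len > sizeof(e->addr))
		return -1;
	memset(e, 0, sizeof(*e));
	memcpy(&e->addr, sa, len);
	e->family = sa->sa_family;
	e->addrlen = len;
	return 0;
}
```

A caller that only had a plain struct sockaddr buffer could not hold an IPv6 address in it, since on typical Linux systems struct sockaddr is smaller than struct sockaddr_in6; carrying sockaddr_storage plus an explicit length avoids both problems.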
Jason

From kliteyn at mellanox.co.il Tue Jan 15 18:00:07 2008
From: kliteyn at mellanox.co.il (kliteyn at mellanox.co.il)
Date: 16 Jan 2008 04:00:07 +0200
Subject: [ofa-general] nightly osm_sim report 2008-01-16:normal completion
Message-ID:

OSM Simulation Regression Summary

[Generated mail - please do NOT reply]

OpenSM binary date = 2008-01-15
OpenSM git rev = Tue_Jan_15_15:10:26_2008 [6806bc4ba006cafd7431e85fd62e2d6d6f4c182d]
ibutils git rev = Mon_Dec_24_10:42:01_2007 [675bec82306d6920555dd0b5e2f664983e27e60f]

Total=400  Pass=396  Fail=4

Pass:
30 Stability IS1-16.topo
30 Pkey IS1-16.topo
30 OsmTest IS1-16.topo
30 OsmStress IS1-16.topo
30 Multicast IS1-16.topo
30 LidMgr IS1-16.topo
10 Stability IS3-loop.topo
10 Stability IS3-128.topo
10 Pkey IS3-128.topo
10 OsmTest IS3-loop.topo
10 OsmTest IS3-128.topo
10 OsmStress IS3-128.topo
10 Multicast IS3-loop.topo
10 Multicast IS3-128.topo
10 FatTree merge-roots-4-ary-2-tree.topo
10 FatTree merge-root-4-ary-3-tree.topo
10 FatTree gnu-stallion-64.topo
10 FatTree blend-4-ary-2-tree.topo
10 FatTree RhinoDDR.topo
10 FatTree FullGnu.topo
10 FatTree 4-ary-2-tree.topo
10 FatTree 2-ary-4-tree.topo
10 FatTree 12-node-spaced.topo
10 FTreeFail 4-ary-2-tree-missing-sw-link.topo
10 FTreeFail 4-ary-2-tree-links-at-same-rank-2.topo
10 FTreeFail 4-ary-2-tree-links-at-same-rank-1.topo
10 FTreeFail 4-ary-2-tree-diff-num-pgroups.topo
6 LidMgr IS3-128.topo

Failures:
4 LidMgr IS3-128.topo

From rdreier at cisco.com Tue Jan 15 19:54:35 2008
From: rdreier at cisco.com (Roland Dreier)
Date: Tue, 15 Jan 2008 19:54:35 -0800
Subject: [ofa-general] Re: [PATCH 2/2] IB/ipath - fix QP use after free bug
In-Reply-To: <20080115235818.7794.96013.stgit@eng-46.internal.keyresearch.com> (Arthur Jones's message of "Tue, 15 Jan 2008 15:58:18 -0800")
References: <20080115235808.7794.67134.stgit@eng-46.internal.keyresearch.com> <20080115235818.7794.96013.stgit@eng-46.internal.keyresearch.com>
Message-ID:

Am I missing something, or is this still racy, just with a
smaller window? Couldn't the following still happen?

CPU #1                                  CPU #2

static inline void
ipath_schedule_send(struct ipath_qp *qp)
{
    if (!test_bit(IPATH_S_DESTROYING,
                  &qp->s_busy))
    // bit not set yet, continue
    // into if statement...
                                        // in ipath_destroy_qp() on other CPU:
                                        set_bit(IPATH_S_DESTROYING, &qp->s_busy);
                                        /* Stop the sending tasklet. */
                                        tasklet_kill(&qp->s_task);
                                        // tasklet_kill does nothing,
                                        // not scheduled yet...
        tasklet_hi_schedule(&qp->s_task);
        // uh-oh...

In fact testing qp->s_busy is potentially just as much use-after-free
as scheduling the tasklet itself...

 - R.

From rdreier at cisco.com Tue Jan 15 20:06:10 2008
From: rdreier at cisco.com (Roland Dreier)
Date: Tue, 15 Jan 2008 20:06:10 -0800
Subject: [ofa-general] [PATCH 1/2] IB/ipath - fix UD send with immediate
In-Reply-To: <20080115235813.7794.4890.stgit@eng-46.internal.keyresearch.com> (Arthur Jones's message of "Tue, 15 Jan 2008 15:58:13 -0800")
References: <20080115235808.7794.67134.stgit@eng-46.internal.keyresearch.com> <20080115235813.7794.4890.stgit@eng-46.internal.keyresearch.com>
Message-ID:

 > This fixes a small bug which incorrectly calculated the header size
 > for UD send with immediate and therefore dropped packets.

What's the bug? It's not clear how this patch fixes anything...

 > @@ -301,8 +301,6 @@ int ipath_make_ud_req(struct ipath_qp *qp)
 >
 >      /* header size in 32-bit words LRH+BTH+DETH = (8+12+8)/4. */
 >      qp->s_hdrwords = 7;
 > -    if (wqe->wr.opcode == IB_WR_SEND_WITH_IMM)
 > -            qp->s_hdrwords++;
 >      qp->s_cur_size = wqe->length;
 >      qp->s_cur_sge = &qp->s_sge;
 >      qp->s_wqe = wqe;
 > @@ -327,6 +325,7 @@ int ipath_make_ud_req(struct ipath_qp *qp)
 >              ohdr = &qp->s_hdr.u.oth;
 >      }
 >      if (wqe->wr.opcode == IB_WR_SEND_WITH_IMM) {
 > +            qp->s_hdrwords++;
 >              ohdr->u.ud.imm_data = wqe->wr.imm_data;

This looks like it doesn't make any difference, since I don't see any
place qp->s_hdrwords is used in between the old location and the new
location of the increment...

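The point about the increment can be sketched in isolation. This is only an illustration under the assumption stated above, namely that nothing reads the word count between the old and the new position; the function names hdrwords_early and hdrwords_late are made up here and do not exist in the ipath driver. The byte sizes come from the quoted comment, LRH+BTH+DETH = (8+12+8)/4.

```c
/* Byte sizes from the comment in ipath_make_ud_req():
 * LRH + BTH + DETH = 8 + 12 + 8; immediate data adds 4 bytes. */
enum { LRH_BYTES = 8, BTH_BYTES = 12, DETH_BYTES = 8, IMM_BYTES = 4 };

/* Old placement: count the immediate word right after the base size. */
static unsigned hdrwords_early(int with_imm)
{
	unsigned words = (LRH_BYTES + BTH_BYTES + DETH_BYTES) / 4;

	if (with_imm)
		words++;
	/* ... further setup that never reads 'words' (assumed) ... */
	return words;
}

/* New placement: count it later, where imm_data is filled in. */
static unsigned hdrwords_late(int with_imm)
{
	unsigned words = (LRH_BYTES + BTH_BYTES + DETH_BYTES) / 4;

	/* ... further setup that never reads 'words' (assumed) ... */
	if (with_imm)
		words += IMM_BYTES / 4;
	return words;
}
```

Under that assumption both orderings give 7 words without immediate data and 8 words with it, so the send-side hunk alone would not change behaviour.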
> + /* > + * The opcode is in the low byte when its in network order > + * (top byte when in host order). > + */ > + opcode = be32_to_cpu(ohdr->bth[0]) >> 24; > + if (qp->ibqp.qp_num > 1 && > + opcode == IB_OPCODE_UD_SEND_ONLY_WITH_IMMEDIATE) { > + if (header_in_data) { > + wc.imm_data = *(__be32 *) data; > + data += sizeof(__be32); > + } else > + wc.imm_data = ohdr->u.ud.imm_data; > + wc.wc_flags = IB_WC_WITH_IMM; > + hdrsize += sizeof(u32); > + } else if (opcode == IB_OPCODE_UD_SEND_ONLY) { > + wc.imm_data = 0; > + wc.wc_flags = 0; > + } else { > + dev->n_pkt_drops++; > + goto bail; > + } > + > /* Get the number of bytes the message was padded by. */ > pad = (be32_to_cpu(ohdr->bth[0]) >> 20) & 3; > if (unlikely(tlen < (hdrsize + pad + 4))) { I guess the bug is that you need to set hdrsize to the length including the immediate data before this last test? And this bug causes problems because MVAPICH uses UD sends with immediate data?? How bad is the impact of this -- do we really need it for 2.6.24? By the way, is there anything in the IB spec that forbids sends with immediate data on QP1? I'm wondering why your code treats QP1 differently here. - R. From jsquyres at cisco.com Tue Jan 15 20:05:57 2008 From: jsquyres at cisco.com (Jeff Squyres (jsquyres)) Date: Tue, 15 Jan 2008 23:05:57 -0500 Subject: [ofa-general] OFED weekly teleconference Message-ID: <70E5158644264146A09B8EC2CCC25CC9035DF5@xmb-rtp-215.amer.cisco.com> When: Monday, January 28, 2008 12:00 PM-1:00 PM (GMT-05:00) Eastern Time (US & Canada). 
Where: ID: 210020028 *~*~*~*~*~*~*~*~*~* ______________________________________________________________________________ Jeffrey Squyres has invited you to a Cisco Unified MeetingPlace Conference Date/Time: JAN 28, 2008 at 12:00PM America/New_York Length: 60 Frequency: 1 Meeting ID: 210020028 Meeting Password: Global Access Numbers: http://cisco.com/en/US/about/doing_business/conferencing/index.html US/Canada: +1.866.432.9903 United Kingdom: +44.20.8824.0117 India: +91.80.4103.3979 Germany: +49.619.6773.9002 Japan: +81.3.5763.9394 China: +86.10.8515.5666 TO ATTEND A WEB AND VOICE CONFERENCE: CISCO INTRANET ATTENDEES Join the Web & Voice Conference* 1. Go to http://meetingplaceinternal.cisco.com/join.asp?210020028 2. Enter your CEC User ID & Password then click OK - Accept any security warnings you receive and wait for the Meeting Room to initialize 3. Click on CONNECT from the Meeting Room to join the Voice Conference portion of the meeting EXTERNAL ATTENDEES - Outside the Cisco Intranet** Join the Web & Voice Conference* 1. Go to http://meetingplace.cisco.com/join.asp?210020028 2. Fill in the My Name is field then click Attend Meeting - If you have a CEC User ID, click on the Cisco icon - Accept any security warnings you receive and wait for the Meeting Room to initialize 3. Click on CONNECT from the Meeting Room to join the Voice Conference portion of the meeting - Note: Guest users will see a link to the Global Access Numbers. *If this is your first time attending a Web Conference, disable any pop-up blockers and visit http://meetingplace.cisco.com/mpweb/scripts/browsertestupper.asp to test your web browser for compatibility with the Web Conference. **Not all meetings are scheduled to allow external attendees into the Web Conference portion of the meeting, if the URL does not work, please follow the Voice only Conference instructions below to attend. TO ATTEND A VOICE ONLY CONFERENCE 1. 
Dial into Cisco Unified MeetingPlace (view the Access Numbers and link above) 2. Press 1 to attend the meeting 3. Follow the prompts to enter the Meeting ID 210020028 and join the meeting SUPPORT Information about this Conference: Contact Jeffrey Squyres, 85250971 Cisco IT Support Center: Attend the Voice Conference and then press #0 on your phone keypad GLOBAL ACCESS NUMBERS COUNTRY LOCATION LOCAL NUMBER TOLL FREE-FREEFONE Algeria Algiers +213.21.98.9047 Argentina Buenos Aires +54.11.4341.0101 Australia Canberra +61.2.6216.0643 Melbourne +61.3.9659.4173 North Sydney +61.2.8446.5260 Austria Vienna +43.12.4030.6022 Azerbaijan Baku +994.12.437.4829 Belgium Brussels +32.2.704.5072 Bosnia & Herzegovina Sarajevo +387.33.56.2898 Brazil Brasilia +55.613.424.0220 Rio de Janeiro +55.21.2483.6302 Sao Paulo +55.11.5508.6311 Bulgaria Sofia +359.2.937.5938 Canada Calgary +1.403.514.2435 Edmonton +1.780.441.3715 Halifax +1.902.474.0214 Kanata +1.613.254.0005 Markham +1.905.470.4810 Montreal +1.514.847.6875 Ottawa +1.613.788.7250 Quebec +1.418.634.5645 Regina +1.306.566.6410 Toronto +1.416.306.7230 Vancouver +1.604.647.2350 Winnipeg +1.204.336.6610 Chile Santiago +56.2.431.4936 China Beijing +86.10.8515.5666 Chengdu +86.28.8696.1333 Guangzhou +86.20.8519.3333 Shanghai +86.21.2302.4333 Colombia Bogota +57.1.325.6065 Costa Rica San Jose +506.201.3617 Croatia Zagreb +385.1.462.8908 Czech Republic Prague +420.22.143.5100 Denmark Aabyhoj +45.8.939.7131 Copenhagen +45.3.958.5010 Dominican Republic Santo Domingo +45.8.939.7131 Egypt Cairo +20.22.488.5377 Estonia Tallinn +372.6.67.5998 Finland Espoo +358.204.70.6227 France Paris +33.15.804.3116 Germany Eschborn +49.619.6773.9002 Hallbergmoos +49.811.554.3016 Greece Athens +30.210.638.1303 Hong Kong Hong Kong +852.3406.1000 Hungary Budapest +36.1.225.4621 India Bangalore +91.80.4103.3979 Chennai +1.800.102.3979 Mumbai IL & FS +91.22.4043.4030 New Delhi +91.11.4261.1088 Indonesia Jakarta +62.21.7854.7476 Ireland Dublin +353.1.819.2717 
Israel Netanya +972.9.892.7026 Italy Milan +39.039.629.5068 Rome +39.06.5164.4006 Japan Tokyo Akasaka +81.3.5763.9394 +0120.312271 Kazakhstan Almaty +7.327.244.2198 Korea Seoul Asem +82.2.3429.8102 Latvia Please see Finland Number Lebanon Beirut +961.1.97.7011 Malaysia Kuala Lumpur +60.3.2081.1515 Penang +60.4.631.5125 Mexico Mexico City +52.55.5267.1800 Monterrey +52.81.8221.2462 Guadalajara +52.33.3770.1206 Morocco Casablanca +212.2.242.4088 Netherlands Amsterdam +31.20.357.1487 New Zealand Auckland +64.9.355.1968 Wellington +64.4.496.5554 Nigeria Lagos +234.1.462.1048 Norway Oslo +47.23.27.3647 Pakistan Islamabad +92.51.209.7994 Peru Lima +51.1.215.5101 Philippines Makati (Manila) +63.2.750.5886 Poland Warsaw +48.22.572.2615 Portugal Lisbon +351.21.446.8756 Puerto Rico San Juan +1.787.620.1865 Romania Bucharest +40.21.302.3511 Russia Moscow +7.495.230.5612 Saudi Arabia Dhahran +966.3.865.7998 Jeddah +966.2.653.6555 Riyadh +966.1.218.2666 Serbia Belgrade +381.11.209.2098 Singapore Singapore Capital +65.6317.7088 Slovakia Bratislava +421.2.5825.5309 Slovenia Ljubljana +386.1.582.3158 South Africa Cape Town +27.21.413.4502 Johannesburg +27.11.267.1011 Pretoria +27.12.844.7401 Spain Barcelona +34.93.393.4037 Madrid +34.91.201.2149 Sweden Gothenburg +46.31.63.4409 Stockholm +46.8.685.9035 Switzerland Glattzentrum +41.44.878.7335 Taiwan Taipei +886.2.8758.7088 Thailand Bangkok +66.2.263.7008 Turkey Istanbul +90.212.335.0208 Ukraine Kiev +380.44.391.3698 United Arab Emirates(UAE) Dubai +971.4.390.7840 United Kingdom Bedfont Lakes +44.20.8824.0117 Edinburgh +44.131.561.3643 London City +44.20.7496.3743 United States East +1.919.392.3330 +1.866.349.3520 West +1.408.525.6800 +1.866.432.9903 Venezuela Caracas +58.212.902.0210 Vietnam Ho Chi Minh +84.8.823.3418 City (Saigon) Hanoi +84.4.974.6250 _______________________________________________________________________________ -------------- next part -------------- An HTML attachment was scrubbed... 
URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: meeting.ics Type: text/calendar Size: 10182 bytes Desc: not available URL: From jackm at dev.mellanox.co.il Tue Jan 15 22:55:01 2008 From: jackm at dev.mellanox.co.il (Jack Morgenstein) Date: Wed, 16 Jan 2008 08:55:01 +0200 Subject: [ofa-general] Re: [PATCH] ib/core: Add creation flags to create QP In-Reply-To: <1200416103.11174.303.camel@mtls03> References: <1199980899.11174.91.camel@mtls03> <1200402798.11174.279.camel@mtls03> <1200416103.11174.303.camel@mtls03> Message-ID: <200801160855.01774.jackm@dev.mellanox.co.il> On Tuesday 15 January 2008 18:55, Eli Cohen wrote: > On a second thought, we could add - in addition to the creation flags - > a special qp type, IB_QPT_UD_IPOIB for example, where each low level > driver could add special enhancements for the sake of achieving higher > performance. What do you thinks? > You would get the same effect by just adding a create flag bit: IPOIB_QP This way, we do not need to add a new QP type every time we have need to know who the QP creator is, or if there is a sub-type.
- Jack From jackm at dev.mellanox.co.il Tue Jan 15 23:00:17 2008 From: jackm at dev.mellanox.co.il (Jack Morgenstein) Date: Wed, 16 Jan 2008 09:00:17 +0200 Subject: [ofa-general] [PATCH] Non-supported functions should return NULL when returning pointers In-Reply-To: <20080115235027.GB31543@opengridcomputing.com> References: <20080115235027.GB31543@opengridcomputing.com> Message-ID: <200801160900.17270.jackm@dev.mellanox.co.il> On Wednesday 16 January 2008 01:50, Jon Mason wrote: > Non-supported functions should return NULL when returning pointers. > Some/Most user space programs will not check for a (void *) to -ENOSYS, > which can look like a real address until referenced. > This patch breaks consistency with all the other drivers, and with libibverbs. During the debugging cycle, user programs will get segmentation faults, and their authors will fix the code. - Jack From glebn at voltaire.com Tue Jan 15 23:34:59 2008 From: glebn at voltaire.com (Gleb Natapov) Date: Wed, 16 Jan 2008 09:34:59 +0200 Subject: [ewg] RE: [ofa-general] OFED Jan 14 meeting summary on RC2readiness In-Reply-To: <000201c857b9$e67ee020$a937170a@amr.corp.intel.com> References: <478D1A49.1080807@mellanox.co.il> <000201c857b9$e67ee020$a937170a@amr.corp.intel.com> Message-ID: <20080116073459.GA20554@minantech.com> On Tue, Jan 15, 2008 at 01:02:12PM -0800, Sean Hefty wrote: > I would rather see OFED pull code from upstream with patches added > on only for backports and fixes. This is a very important point, actually. Is there any guarantee that the XRC API will be pushed to the kernel as is? What if the kernel maintainers refuse to accept it in its present form? Will applications using XRC from OFED have to support two different XRC APIs as a result? Roland, you said that the XRC API is ugly; are you going to push it upstream in its present form? -- Gleb.
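On the NULL versus encoded -ENOSYS question in the Jon Mason / Jack Morgenstein exchange above, the following is a minimal user-space sketch of why an errno cast to a pointer slips past a plain NULL check. The helpers demo_err_ptr, demo_is_err and demo_ptr_err are hypothetical names modeled on the kernel's ERR_PTR/IS_ERR/PTR_ERR idea; the two demo_create_* functions are made up for illustration and are not taken from any driver or from libibverbs.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: errno values fit in the top 4095 addresses, as in the kernel. */
#define DEMO_MAX_ERRNO 4095

/* Encode a negative errno as a pointer value. */
static inline void *demo_err_ptr(long error)
{
	return (void *)(intptr_t)error;
}

/* True if the pointer value lies in the range reserved for encoded errnos. */
static inline int demo_is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-DEMO_MAX_ERRNO;
}

/* Recover the negative errno from an encoded pointer. */
static inline long demo_ptr_err(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

/* Hypothetical unsupported call, style A: returns an encoded -ENOSYS. */
static void *demo_create_errptr(void)
{
	return demo_err_ptr(-ENOSYS);
}

/* Hypothetical unsupported call, style B: returns NULL and sets errno. */
static void *demo_create_null(void)
{
	errno = ENOSYS;
	return NULL;
}
```

A caller that only tests the result against NULL treats the style A return value as a valid object and faults when it dereferences it; a caller that uses a range check such as demo_is_err catches it. Which convention a given library follows is a matter for that library's documentation, not something this sketch settles.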
From eli at dev.mellanox.co.il Tue Jan 15 23:39:21 2008 From: eli at dev.mellanox.co.il (Eli Cohen) Date: Wed, 16 Jan 2008 09:39:21 +0200 Subject: [ofa-general] Re: [PATCH] ib/core: Add creation flags to create QP In-Reply-To: References: <1199980899.11174.91.camel@mtls03> <1200402798.11174.279.camel@mtls03> <1200416103.11174.303.camel@mtls03> Message-ID: <1200469161.11174.322.camel@mtls03> On Tue, 2008-01-15 at 14:27 -0800, Roland Dreier wrote: > > On a second thought, we could add - in addition to the creation flags - > > a special qp type, IB_QPT_UD_IPOIB for example, where each low level > > driver could add special enhancements for the sake of achieving higher > > performance. What do you thinks? > > Did you have some specific use for this in mind? I don't think it's a > good idea to add this unless there is a really big win we can get. > > - R. The idea is that by being able to identify this as a special kind of QP, the low level driver can provide, for example, a dedicated send function which can avoid some logic in the data path and thus provide better results. For example, when the message rate is high (small UDP messages) and the CPU is fully loaded, reducing the amount of work done in the send function contributes to increasing the rate. I have made some changes against the OFED tree and was able to improve 128-byte UDP throughput from ~280 Mbps to ~500 Mbps. This was measured on RH4U5, and a special send function was one of the changes I made to achieve this. I will post these changes to the list soon.
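The idea discussed in this thread, a creation flag that lets a low-level driver pick a dedicated send path, could look roughly like the sketch below. It is a self-contained user-space model under stated assumptions: every demo_* type, flag and function is hypothetical and stands in for the real verbs structures; it is not the posted patch or any driver's code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical creation flag bits; the name is illustrative only. */
enum demo_qp_create_flags {
	DEMO_QP_CREATE_IPOIB = 1u << 0,	/* creator is IPoIB */
};

enum demo_qp_type { DEMO_QPT_RC, DEMO_QPT_UD };

struct demo_qp;
typedef int (*demo_send_fn)(struct demo_qp *qp, size_t len);

/* Hypothetical init attributes: the QP type stays generic, the flag refines it. */
struct demo_qp_init_attr {
	enum demo_qp_type qp_type;
	uint32_t create_flags;
};

struct demo_qp {
	enum demo_qp_type qp_type;
	demo_send_fn post_send;	/* chosen once, at creation time */
};

/* Generic path: stands in for the full-featured send function. Returns 0. */
static int demo_generic_send(struct demo_qp *qp, size_t len)
{
	(void)qp;
	(void)len;
	return 0;
}

/* Dedicated path: stands in for a trimmed-down IPoIB send. Returns 1. */
static int demo_ipoib_send(struct demo_qp *qp, size_t len)
{
	(void)qp;
	(void)len;
	return 1;
}

/* Select the send function from the flag instead of from a new QP type. */
static int demo_create_qp(struct demo_qp *qp,
			  const struct demo_qp_init_attr *attr)
{
	assert(qp != NULL && attr != NULL);

	qp->qp_type = attr->qp_type;
	qp->post_send = demo_generic_send;
	if (attr->qp_type == DEMO_QPT_UD &&
	    (attr->create_flags & DEMO_QP_CREATE_IPOIB))
		qp->post_send = demo_ipoib_send;
	return 0;
}
```

Because the choice is made once in demo_create_qp, the per-packet path pays only an indirect call, and adding another creator later means one more flag bit rather than one more QP type.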
From eli at dev.mellanox.co.il Tue Jan 15 23:45:29 2008 From: eli at dev.mellanox.co.il (Eli Cohen) Date: Wed, 16 Jan 2008 09:45:29 +0200 Subject: [ofa-general] Re: [PATCH] ib/core: Add creation flags to create QP In-Reply-To: <200801160855.01774.jackm@dev.mellanox.co.il> References: <1199980899.11174.91.camel@mtls03> <1200402798.11174.279.camel@mtls03> <1200416103.11174.303.camel@mtls03> <200801160855.01774.jackm@dev.mellanox.co.il> Message-ID: <1200469529.11174.327.camel@mtls03> On Wed, 2008-01-16 at 08:55 +0200, Jack Morgenstein wrote: > On Tuesday 15 January 2008 18:55, Eli Cohen wrote: > > On a second thought, we could add - in addition to the creation flags - > > a special qp type, IB_QPT_UD_IPOIB for example, where each low level > > driver could add special enhancements for the sake of achieving higher > > performance. What do you thinks? > > > You would get the same effect by just adding a create flag bit: > IPOIB_QP Yes, I guess you're right. Adding a new QP type was how I did it in the first place, but a creation flag will do as well. > > This way, we do not need to add a new QP type every time we have need > to know who the QP creator is, or if there is a sub-type. > > - Jack From ogerlitz at voltaire.com Wed Jan 16 01:21:05 2008 From: ogerlitz at voltaire.com (Or Gerlitz) Date: Wed, 16 Jan 2008 11:21:05 +0200 (IST) Subject: [ofa-general] ipoib_start_xmit Gratuitous ARP / bonding failover handling not applied on connected mode neighbours?! Message-ID: Hi Roland, Looking at ipoib_start_xmit, it seems that both the check that handles a gratuitous ARP (i.e. a difference between the remote GID kept in the ipoib_neigh and the one present in the network stack neighbour) and the check that handles the case where we attempt to xmit on an ipoib_neigh created by another ipoib device (i.e. following a bonding failover) do not come into play for connected mode neighbours. Is this a bug, or am I missing something? Or.
+static int ipoib_start_xmit(struct sk_buff *skb, struct net_device *dev) +{ + struct ipoib_dev_priv *priv = netdev_priv(dev); + struct ipoib_neigh *neigh; + unsigned long flags; + ... + if (likely(skb->dst && skb->dst->neighbour)) { ... + neigh = *to_ipoib_neigh(skb->dst->neighbour); + + if (ipoib_cm_get(neigh)) { + if (ipoib_cm_up(neigh)) { + ipoib_cm_send(dev, skb, ipoib_cm_get(neigh)); + goto out; + } + } else if (neigh->ah) { + if (unlikely((memcmp(&neigh->dgid.raw, + skb->dst->neighbour->ha + 4, + sizeof(union ib_gid))) || + (neigh->dev != dev))) { any reason not to apply these two checks on connected mode neighbours? + spin_lock(&priv->lock); ... + ipoib_send(dev, skb, neigh->ah, IPOIB_QPN(skb->dst->neighbour->ha)); + goto out; + }
From vlad at lists.openfabrics.org Wed Jan 16 03:12:13 2008 From: vlad at lists.openfabrics.org (Vladimir Sokolovsky Mellanox) Date: Wed, 16 Jan 2008 03:12:13 -0800 (PST) Subject: [ofa-general] ofa_1_3_kernel 20080116-0200 daily build status Message-ID: <20080116111213.0AD42E60074@openfabrics.org> This email was generated automatically, please do not reply git_url: git://git.openfabrics.org/ofed_1_3/linux-2.6.git git_branch: ofed_kernel Common build parameters: --with-ipoib-mod --with-sdp-mod --with-srp-mod --with-user_mad-mod --with-user_access-mod --with-mthca-mod --with-mlx4-mod --with-core-mod --with-addr_trans-mod --with-rds-mod --with-cxgb3-mod --with-nes-mod Passed: Passed on i686 with 2.6.15-23-server Passed on i686 with linux-2.6.22 Passed on i686 with linux-2.6.21.1 Passed on i686 with linux-2.6.18 Passed on i686 with linux-2.6.17 Passed on i686 with linux-2.6.16 Passed on i686 with linux-2.6.14 Passed on i686 with linux-2.6.15 Passed on i686 with linux-2.6.13 Passed on i686 with linux-2.6.12 Passed on i686 with linux-2.6.19 Passed on x86_64 with linux-2.6.22.5-31-default Passed on x86_64 with linux-2.6.22 Passed on ia64 with linux-2.6.22 Passed on ia64 with linux-2.6.23 Passed on x86_64 with linux-2.6.16 Passed on powerpc with linux-2.6.15 Passed on x86_64 with linux-2.6.19 Passed on x86_64 with linux-2.6.18 Passed on ppc64 with linux-2.6.12 Passed on powerpc with linux-2.6.13 Passed on x86_64 with linux-2.6.12 Passed on ia64 with linux-2.6.18 Passed on ppc64 with linux-2.6.14 Passed on ia64 with linux-2.6.19 Passed on powerpc with linux-2.6.12 Passed on ppc64 with linux-2.6.15 Passed on x86_64 with linux-2.6.21.1 Passed on ia64 with linux-2.6.17 Passed on ppc64 with linux-2.6.17 Passed on ia64 with linux-2.6.12 Passed on powerpc with linux-2.6.14 Passed on ia64 with linux-2.6.14 Passed on ppc64 with linux-2.6.16 Passed on x86_64 with linux-2.6.13 Passed on x86_64 with linux-2.6.17 Passed on ia64 with linux-2.6.15 Passed on ia64 with linux-2.6.16
Passed on x86_64 with linux-2.6.9-55.ELsmp Passed on x86_64 with linux-2.6.15 Passed on x86_64 with linux-2.6.16.21-0.8-smp Passed on ia64 with linux-2.6.21.1 Passed on ia64 with linux-2.6.13 Passed on x86_64 with linux-2.6.20 Passed on x86_64 with linux-2.6.14 Passed on ppc64 with linux-2.6.19 Passed on x86_64 with linux-2.6.16.43-0.3-smp Passed on x86_64 with linux-2.6.9-42.ELsmp Passed on x86_64 with linux-2.6.18-8.el5 Passed on ppc64 with linux-2.6.13 Passed on ppc64 with linux-2.6.18 Passed on ia64 with linux-2.6.16.21-0.8-default Passed on ppc64 with linux-2.6.18-8.el5 Passed on x86_64 with linux-2.6.18-1.2798.fc6 Passed on x86_64 with linux-2.6.18-53.el5 Failed: From hrosenstock at xsigo.com Wed Jan 16 04:21:12 2008 From: hrosenstock at xsigo.com (Hal Rosenstock) Date: Wed, 16 Jan 2008 04:21:12 -0800 Subject: [PATCH] opensm: Special Case the IPv6 Solicited Node Multicast address to use a single Mcast (WAS: Re: [ofa-general] IPoIB, OFED 1.2.5, and multicast groups.) In-Reply-To: <20080115141655.1e2c4c04.weiny2@llnl.gov> References: <20080114153546.73d43d6b.weiny2@llnl.gov> <1200355500.8962.148.camel@hrosenstock-ws.xsigo.com> <20080115005045.GG16009@sashak.voltaire.com> <1200360160.8962.196.camel@hrosenstock-ws.xsigo.com> <20080115014355.GK16009@sashak.voltaire.com> <1200361233.8962.206.camel@hrosenstock-ws.xsigo.com> <20080115084700.1d46407c.weiny2@llnl.gov> <1200418638.8962.328.camel@hrosenstock-ws.xsigo.com> <20080115192010.GG16009@sashak.voltaire.com> <1200424673.8962.364.camel@hrosenstock-ws.xsigo.com> <20080115193347.GI16009@sashak.voltaire.com> <20080115141655.1e2c4c04.weiny2@llnl.gov> Message-ID: <1200486072.8962.414.camel@hrosenstock-ws.xsigo.com> On Tue, 2008-01-15 at 14:16 -0800, Ira Weiny wrote: > I _will_ take your word for this, but I am still curious as to who is going > to know these MGIDs to be queried? Why do they need to be known to be queried ? An SA GetTable with MGID wildcarded (which is what saquery -g does) _should_ get these. 
-- Hal From hnguyen at linux.vnet.ibm.com Wed Jan 16 05:44:35 2008 From: hnguyen at linux.vnet.ibm.com (Hoang-Nam Nguyen) Date: Wed, 16 Jan 2008 14:44:35 +0100 Subject: [ofa-general] [PATCH] IB/ipoib: Fix undefined symbol (priv->cm) if ipoib_cm disabled Message-ID: <200801161444.35365.hnguyen@linux.vnet.ibm.com> Signed-off-by: Hoang-Nam Nguyen --- drivers/infiniband/ulp/ipoib/ipoib_main.c | 2 ++ 1 files changed, 2 insertions(+), 0 deletions(-) diff --git a/drivers/infiniband/ulp/ipoib/ipoib_main.c b/drivers/infiniband/ulp/ipoib/ipoib_main.c index e499626..0a58ac4 100644 --- a/drivers/infiniband/ulp/ipoib/ipoib_main.c +++ b/drivers/infiniband/ulp/ipoib/ipoib_main.c @@ -181,6 +181,7 @@ static int ipoib_change_mtu(struct net_device *dev, int new_mtu) { struct ipoib_dev_priv *priv = netdev_priv(dev); +#ifdef CONFIG_INFINIBAND_IPOIB_CM /* dev->mtu > 2K ==> connected mode */ if (ipoib_cm_admin_enabled(dev)) { if (new_mtu > priv->cm.max_cm_mtu) @@ -193,6 +194,7 @@ static int ipoib_change_mtu(struct net_device *dev, int new_mtu) dev->mtu = new_mtu; return 0; } +#endif if (new_mtu > IPOIB_PACKET_SIZE - IPOIB_ENCAP_LEN) return -EINVAL; -- 1.5.2 From kliteyn at dev.mellanox.co.il Wed Jan 16 05:57:26 2008 From: kliteyn at dev.mellanox.co.il (Yevgeny Kliteynik) Date: Wed, 16 Jan 2008 15:57:26 +0200 Subject: [ofa-general] [PATCH] opensm/scripts: removing trailing blanks Message-ID: <478E0D46.7050805@dev.mellanox.co.il> Hi Sasha. Cosmetics - removing trailing blanks. 
Signed-off-by: Yevgeny Kliteynik --- opensm/scripts/opensm.conf | 6 ++-- opensm/scripts/opensmd | 54 ++++++++++++++++++------------------ opensm/scripts/redhat-opensm.init | 54 ++++++++++++++++++------------------ opensm/scripts/sldd.sh | 30 ++++++++++---------- 4 files changed, 72 insertions(+), 72 deletions(-) diff --git a/opensm/scripts/opensm.conf b/opensm/scripts/opensm.conf index 3c5dcdf..2ec63d4 100644 --- a/opensm/scripts/opensm.conf +++ b/opensm/scripts/opensm.conf @@ -14,7 +14,7 @@ # none, no debug options are enabled. DEBUG=none -# LMC +# LMC # This option specifies the subnet's LMC value. # The number of LIDs assigned to each port is 2^LMC. # The LMC value must be in the range 0-7. @@ -60,7 +60,7 @@ TIMEOUT=200 # This option defines the log to be the given file. # By default the log goes to /var/log/opensm.log # For the log to go to standard output use OSM_LOG=stdout. -OSM_LOG=/var/log/opensm.log +OSM_LOG=/var/log/opensm.log # VERBOSE # This option increases the log verbosity level. # The "-v" option may be specified multiple times @@ -103,7 +103,7 @@ OSM_CACHE_DIR=/var/cache/opensm # Set to '--cache-options' or '-c' in order to enable CACHE_OPTIONS="none" -# HONORE_GUID2LID +# HONORE_GUID2LID # This option forces OpenSM to honor the guid2lid file, # when it comes out of Standby state, if such file exists # under OSM_CACHE_DIR, and is valid. diff --git a/opensm/scripts/opensmd b/opensm/scripts/opensmd index 0a5fe9b..276a76f 100755 --- a/opensm/scripts/opensmd +++ b/opensm/scripts/opensmd @@ -62,7 +62,7 @@ shift if [ ! 
-x $prog ]; then echo "OpenSM not installed" exit 1 -fi +fi # Check if OpenSM configured to start automatically if [[ -z $ONBOOT || "$ONBOOT" != "yes" ]]; then @@ -248,7 +248,7 @@ echo_failure() { [ "$BOOTUP" = "color" ] && $MOVE_TO_COL echo -n "[" [ "$BOOTUP" = "color" ] && $SETCOLOR_FAILURE - echo -n $"FAILED" + echo -n $"FAILED" [ "$BOOTUP" = "color" ] && $SETCOLOR_NORMAL echo -n "]" echo -e "\r" @@ -257,7 +257,7 @@ echo_failure() { ######################################################################### - + # Check if $pid (could be plural) are running checkpid() { local i @@ -321,10 +321,10 @@ start() echo "Please load Infiniband driver first" echo return 2 - fi - + fi + local OSM_PID= - + if [ -f $PID_FILE ]; then local line p read line < $PID_FILE @@ -332,14 +332,14 @@ start() [ -z "${p//[0-9]/}" -a -d "/proc/$p" ] && pid="$pid $p" done fi - + if [ -z "$pid" ]; then pid=`pidof -o $$ -o $PPID -o %PPID -x $bin` fi - + if [ -n "${pid:-}" ] ; then echo $"${bin} (pid $pid) is already running..." - else + else if [ -n "${HONORE_GUID2LID_FLAG}" ]; then # Run sldd daemod @@ -352,17 +352,17 @@ start() do [ ! -z "$flag" ] && START_FLAGS="$START_FLAGS $flag" done - + echo $PORT_FLAG | $prog $START_FLAGS > /dev/null 2>&1 & OSM_PID=$! echo $OSM_PID > $PID_FILE - sleep 1 + sleep 1 checkpid $OSM_PID RC=$? [ $RC -eq 0 ] && echo_success "$bin start" || echo_failure "$bin start" - + fi -return $RC +return $RC } stop() @@ -381,11 +381,11 @@ stop() [ -z "${p//[0-9]/}" -a -d "/proc/$p" ] && pid1="$pid1 $p" done fi - + pid2=`pidof -o $$ -o $PPID -o %PPID -x $bin` - - pid=`echo "$pid1 $pid2" | sed -e 's/\ /\n/g' | sort -n | uniq | sed -e 's/\n/\ /g'` - + + pid=`echo "$pid1 $pid2" | sed -e 's/\ /\n/g' | sort -n | uniq | sed -e 's/\n/\ /g'` + if [ -n "${pid:-}" ] ; then # Kill opensm kill -15 $pid > /dev/null 2>&1 @@ -398,7 +398,7 @@ stop() kill -KILL $p > /dev/null 2>&1 echo -n "." 
sleep 1 - done + done done echo checkpid $pid @@ -408,24 +408,24 @@ stop() else echo_failure "$bin shutdown" RC=1 - fi - + fi + # Remove pid file if any. rm -f $PID_FILE -return $RC +return $RC } status() { local pid - + # First try "pidof" pid=`pidof -o $$ -o $PPID -o %PPID -x ${bin}` if [ -n "$pid" ]; then echo $"${bin} (pid $pid) is running..." - return 0 + return 0 fi - + # Next try "/var/run/opensm.pid" files if [ -f $PID_FILE ] ; then read pid < $PID_FILE @@ -435,7 +435,7 @@ status() fi fi echo $"${bin} is stopped" - return 3 + return 3 } @@ -445,7 +445,7 @@ case $ACTION in start ;; stop) - stop + stop ;; restart) stop @@ -455,7 +455,7 @@ case $ACTION in status ;; *) - echo + echo echo "Usage: `basename $0` {start|stop|restart|status}" echo exit 1 diff --git a/opensm/scripts/redhat-opensm.init b/opensm/scripts/redhat-opensm.init index 34739f3..9780f4f 100755 --- a/opensm/scripts/redhat-opensm.init +++ b/opensm/scripts/redhat-opensm.init @@ -155,7 +155,7 @@ fi ######################################################################### - + start_sldd() { if [ -f $sldd_pid_file ]; then @@ -205,9 +205,9 @@ stop_sldd() start() { local OSM_PID= - + pid="" - + if [ -f $PID_FILE ]; then local line p read line < $PID_FILE @@ -215,14 +215,14 @@ start() [ -z "${p//[0-9]/}" -a -d "/proc/$p" ] && pid="$pid $p" done fi - + if [ -z "$pid" ]; then pid=`pidof -o $$ -o $PPID -o %PPID -x $bin` fi - + if [ -n "${pid:-}" ] ; then echo $"${bin} (pid $pid) is already running..." - else + else if [ -n "${HONORE_GUID2LID_FLAG}" ]; then # Run sldd daemod @@ -235,20 +235,20 @@ start() do [ ! -z "$flag" ] && START_FLAGS="$START_FLAGS $flag" done - - echo -n "Starting IB Subnet Manager" + + echo -n "Starting IB Subnet Manager" echo $PORT_FLAG | $prog $START_FLAGS > /dev/null 2>&1 & OSM_PID=$! echo $OSM_PID > $PID_FILE - sleep 1 + sleep 1 checkpid $OSM_PID RC=$? 
[ $RC -eq 0 ] && echo_success || echo_failure [ $RC -eq 0 ] && touch /var/lock/subsys/opensm echo - + fi -return $RC +return $RC } stop() @@ -267,11 +267,11 @@ stop() [ -z "${p//[0-9]/}" -a -d "/proc/$p" ] && pid1="$pid1 $p" done fi - + pid2=`pidof -o $$ -o $PPID -o %PPID -x $bin` - - pid=`echo "$pid1 $pid2" | sed -e 's/\ /\n/g' | sort -n | uniq | sed -e 's/\n/\ /g'` - + + pid=`echo "$pid1 $pid2" | sed -e 's/\ /\n/g' | sort -n | uniq | sed -e 's/\n/\ /g'` + if [ -n "${pid:-}" ] ; then # Kill opensm echo -n "Stopping IB Subnet Manager." @@ -280,7 +280,7 @@ stop() while [ $cnt -lt 6 -a $alive -ne 0 ]; do echo -n "."; alive=0 - for p in $pid; do + for p in $pid; do if checkpid $p ; then alive=1; echo -n "-"; fi done let cnt++; @@ -293,11 +293,11 @@ stop() kill -KILL $p > /dev/null 2>&1 echo -n "+" sleep 1 - done + done done checkpid $pid RC=$? - [ $RC -eq 0 ] && echo_failure || echo_success + [ $RC -eq 0 ] && echo_failure || echo_success echo RC=$((! $RC)) else @@ -305,25 +305,25 @@ stop() echo_failure echo RC=1 - fi - + fi + # Remove pid file if any. rm -f $PID_FILE rm -f /var/lock/subsys/opensm - return $RC + return $RC } status() { local pid - + # First try "pidof" pid=`pidof -o $$ -o $PPID -o %PPID -x ${bin}` if [ -n "$pid" ]; then echo $"${bin} (pid $pid) is running..." 
- return 0 + return 0 fi - + # Next try "/var/run/opensm.pid" files if [ -f $PID_FILE ] ; then read pid < $PID_FILE @@ -333,7 +333,7 @@ status() fi fi echo $"${bin} is stopped" - return 3 + return 3 } @@ -343,7 +343,7 @@ case $ACTION in start ;; stop) - stop + stop ;; restart) stop @@ -361,7 +361,7 @@ case $ACTION in fi ;; *) - echo + echo echo "Usage: `basename $0` {start|stop|restart|status}" echo exit 1 diff --git a/opensm/scripts/sldd.sh b/opensm/scripts/sldd.sh index 496e74d..cf844a9 100755 --- a/opensm/scripts/sldd.sh +++ b/opensm/scripts/sldd.sh @@ -27,16 +27,16 @@ # # -# OpenSM found to have the following problem -# when handover is performed: +# OpenSM found to have the following problem +# when handover is performed: # If some of the cluster nodes are rebooted during the handover they loose their LID assignment. -# The reason for it is that the standby SM does not obey its own Guid to LID table -# and simply uses the discovered LIDs. If some nodes are not available for it +# The reason for it is that the standby SM does not obey its own Guid to LID table +# and simply uses the discovered LIDs. If some nodes are not available for it # their previous LID assignment is lost forever. # The idea is to use an external daemon that will distribute # the semi-static LID assignment table from the master SM to all standby SMs. -# A standby SM, becoming a master . needs to obey the copied semi static LID assignment table. +# A standby SM, becoming a master . needs to obey the copied semi static LID assignment table. # config: /etc/opensm.conf @@ -66,7 +66,7 @@ declare -i SLDD_DEBUG RESCAN_TIME=${RESCAN_TIME:-60} if [ -z "${OSM_HOSTS}" ]; then - [ $SLDD_DEBUG -eq 1 ] && + [ $SLDD_DEBUG -eq 1 ] && echo "No OpenSM servers (OSM_HOSTS) configured for the IB subnet." 
exit 0 fi @@ -78,7 +78,7 @@ arr_OSM_HOSTS=(${OSM_HOSTS}) num_of_osm_hosts=${#arr_OSM_HOSTS[@]} if [ ${num_of_osm_hosts} -eq 1 ]; then - [ $SLDD_DEBUG -eq 1 ] && + [ $SLDD_DEBUG -eq 1 ] && echo "One OpenSM server configured in the IB subnet." && echo "Nothing to be done for SLDD" @@ -123,19 +123,19 @@ update_remote_cache() if is_alive $host; then stat=$($RSH $host "/bin/mkdir -p ${CACHE_DIR} > /dev/null 2>&1; /bin/rm -f ${CACHE_FILE}.${local_host} > /dev/null 2>&1; echo \$?" | tr -d '[:space:]') if [ "X${stat}" == "X0" ]; then - [ $SLDD_DEBUG -eq 1 ] && + [ $SLDD_DEBUG -eq 1 ] && echo "Updating $host" logger -i "SLDD: updating $host with ${CACHE_FILE}" $RCP ${CACHE_FILE}.upd ${host}:${CACHE_FILE}.${local_host} /bin/cp ${CACHE_FILE}.upd ${CACHE_FILE}.${host} else - [ $SLDD_DEBUG -eq 1 ] && + [ $SLDD_DEBUG -eq 1 ] && echo "$RSH to $host failed." logger -i "SLDD: Failed to update $host with ${CACHE_FILE}. $RSH without password should be enabled" exit 5 fi else - [ $SLDD_DEBUG -eq 1 ] && + [ $SLDD_DEBUG -eq 1 ] && echo "$host is down." continue fi @@ -203,7 +203,7 @@ do # Check if local cache file larger than remote chache file if [ ${new_size} -gt ${largest_remote_cache_size} ]; then - [ $SLDD_DEBUG -eq 1 ] && + [ $SLDD_DEBUG -eq 1 ] && echo "Local cache file larger then remote. Update remote cache files" last_size=${new_size} update_remote_cache @@ -220,7 +220,7 @@ do # Update local cache file from remote if [ ${largest_remote_cache_size} -gt ${new_size} ]; then - [ $SLDD_DEBUG -eq 1 ] && + [ $SLDD_DEBUG -eq 1 ] && echo "Local cache file shorter then remote. 
Use ${largest_remote_cache}" logger -i "SLDD: updating local cache file with ${largest_remote_cache}" swap_cache_files @@ -228,13 +228,13 @@ do fi else # The local cache file is empty - [ $SLDD_DEBUG -eq 1 ] && + [ $SLDD_DEBUG -eq 1 ] && echo "${CACHE_FILE} is empty" largest_remote_cache=$(get_largest_remote_cache) if [[ -n "${largest_remote_cache}" && -s "${largest_remote_cache}" ]]; then # Copy it to the current cache - [ $SLDD_DEBUG -eq 1 ] && + [ $SLDD_DEBUG -eq 1 ] && echo "Local cache file is empty. Use ${largest_remote_cache}" logger -i "SLDD: updating local cache file with ${largest_remote_cache}" swap_cache_files @@ -242,7 +242,7 @@ do fi - [ $SLDD_DEBUG -eq 1 ] && + [ $SLDD_DEBUG -eq 1 ] && echo "Sleeping ${RESCAN_TIME} seconds." sleep ${RESCAN_TIME} -- 1.5.1.4 From kliteyn at dev.mellanox.co.il Wed Jan 16 06:01:50 2008 From: kliteyn at dev.mellanox.co.il (Yevgeny Kliteynik) Date: Wed, 16 Jan 2008 16:01:50 +0200 Subject: [ofa-general] [PATCH] opensm/scripts: fixing MAXSMPS values to the right default Message-ID: <478E0E4E.3010206@dev.mellanox.co.il> Hi Sasha, OpenSM has a maxsmps default value of 4, but the scripts have default of 0. Fixing the defaults in startup scripts. Please apply to ofed_1_3 and master. -- Yevgeny Signed-off-by: Yevgeny Kliteynik --- opensm/scripts/opensm.conf | 4 ++-- opensm/scripts/opensmd | 2 +- opensm/scripts/redhat-opensm.init | 2 +- 3 files changed, 4 insertions(+), 4 deletions(-) diff --git a/opensm/scripts/opensm.conf b/opensm/scripts/opensm.conf index 2ec63d4..11c7047 100644 --- a/opensm/scripts/opensm.conf +++ b/opensm/scripts/opensm.conf @@ -31,8 +31,8 @@ LMC=0 # allowed on the wire at any one time. # Specifying -maxsmps 0 allows unlimited outstanding SMPs. # Without -maxsmps, OpenSM defaults to a maximum of -# one outstanding SMP. -MAXSMPS=0 +# four outstanding SMP. 
+MAXSMPS=4 # REASSIGN_LIDS # This option causes OpenSM to reassign LIDs to all diff --git a/opensm/scripts/opensmd b/opensm/scripts/opensmd index 276a76f..e5c734e 100755 --- a/opensm/scripts/opensmd +++ b/opensm/scripts/opensmd @@ -98,7 +98,7 @@ else LMC_FLAG="-l ${LMC}" fi -if [[ -z $MAXSMPS || "$MAXSMPS" == "0" ]]; then +if [[ -z $MAXSMPS || "$MAXSMPS" == "4" ]]; then MAXSMPS_FLAG="" else MAXSMPS_FLAG="-maxsmps ${MAXSMPS}" diff --git a/opensm/scripts/redhat-opensm.init b/opensm/scripts/redhat-opensm.init index 9780f4f..56dbb7c 100755 --- a/opensm/scripts/redhat-opensm.init +++ b/opensm/scripts/redhat-opensm.init @@ -75,7 +75,7 @@ else LMC_FLAG="-l ${LMC}" fi -if [[ -z $MAXSMPS ]]; then +if [[ -z $MAXSMPS || "$MAXSMPS" == "4" ]]; then MAXSMPS_FLAG="" else MAXSMPS_FLAG="-maxsmps ${MAXSMPS}" -- 1.5.1.4 From ogerlitz at voltaire.com Wed Jan 16 06:11:43 2008 From: ogerlitz at voltaire.com (Or Gerlitz) Date: Wed, 16 Jan 2008 16:11:43 +0200 Subject: [ofa-general] Re: new API for the rdma-cma In-Reply-To: <478D43DD.1090007@opengridcomputing.com> References: <478D346B.8070506@opengridcomputing.com> <000801c857cc$0d6c5ed0$a937170a@amr.corp.intel.com> <478D43DD.1090007@opengridcomputing.com> Message-ID: <478E109F.3000909@voltaire.com> Steve Wise wrote: > Sean Hefty wrote: >>> What do you think about adding an API to the rdma-cma allowing >>> applications to get a list of ip addresses associated with a particular >>> rdma device/port? >> It seems kind of backwards from the design of the rdma_cm, but if it's >> useful to end users, I don't have any objections to having it. > Consider OMPI, which looks at each device on the system that can be used > to connect to the other nodes. Each device is analyzed to see what > method of communication should be used (tcp, ib, iwarp, whatever). Then > these interfaces and their attributes are conveyed to all the nodes and > the desired communication mesh is determined. 
Steve, Sean, this approach assumes the MPI job scheduler associate a rank with HW using a scheme, where I would like to let a scheme be supported, since I believe that what you suggest is covered by such design. How about a different approach that complies better to the nature of the rdma-cm and seems to support the requirement: have an API that would let apps to get a list of {interface name, ip addresses, device-attributes} containing all the "RDMA" interfaces, that is those whose ether-type is ARPHRD_INFINIBAND and what-ever key that identified iwarp interfaces. This can easily be implemented at user space using netlink calls as done by the ip(8) command, for example for the following device > $ ip addr show ib0 > 25: ib0: mtu 1500 qdisc pfifo_fast qlen 128 > link/[32] 80:00:04:04:fe:80:00:00:00:00:00:00:00:08:f1:04:03:97:08:dd > brd 00:ff:ff:ff:ff:12:40:1b:ff:ff:00:00:00:00:00:00:ff:ff:ff:ff > inet 192.168.3.61/24 brd 192.168.3.255 scope global ib0 > inet 192.168.3.71/32 scope global ib0 > inet6 fe80::208:f104:397:8dd/64 scope link valid_lft forever preferred_lft forever the user space script/code would note that its an IPoIB device, with the IPv4 addresses being 192.168.3.61/24 and 192.168.3.71/32 Now, if you want to go deeper and expose the to the job scheduler or to the rank, you can implement this code that uses netlink in librdmacm (no need for kernel changes) and it would reuse the code present now in rdma_resolve_addr for the local resolution, that is resolve the from an local ip address. Or. From eli at mellanox.co.il Wed Jan 16 07:14:25 2008 From: eli at mellanox.co.il (Eli Cohen) Date: Wed, 16 Jan 2008 17:14:25 +0200 Subject: [ofa-general] [PATCH] ib/mlx4_ib: Optimize stamping in mlx4 Message-ID: <1200496465.13546.8.camel@mtls03> Optimize stamping in mlx4 We can check how much of the WQE was used in the previous time and stamp only what was used. Signed-off-by: Eli Cohen The same fix can be done also in userspace. 
--- drivers/infiniband/hw/mlx4/qp.c | 6 +++++- 1 files changed, 5 insertions(+), 1 deletions(-) diff --git a/drivers/infiniband/hw/mlx4/qp.c b/drivers/infiniband/hw/mlx4/qp.c index fc4811c..3138c5e 100644 --- a/drivers/infiniband/hw/mlx4/qp.c +++ b/drivers/infiniband/hw/mlx4/qp.c @@ -123,8 +123,11 @@ static void stamp_send_wqe(struct mlx4_ib_qp *qp, int n) { u32 *wqe = get_send_wqe(qp, n); int i; + struct mlx4_wqe_ctrl_seg *ctrl = wqe; + int stamp_limit; - for (i = 16; i < 1 << (qp->sq.wqe_shift - 2); i += 16) + stamp_limit = (ctrl->fence_size & 0x3f) << 2; + for (i = 16; i < stamp_limit; i += 16) wqe[i] = 0xffffffff; } @@ -928,6 +931,7 @@ static int __mlx4_ib_modify_qp(struct ib_qp *ibqp, for (i = 0; i < qp->sq.wqe_cnt; ++i) { ctrl = get_send_wqe(qp, i); ctrl->owner_opcode = cpu_to_be32(1 << 31); + ctrl->fence_size = 1 << (qp->sq.wqe_shift - 4); stamp_send_wqe(qp, i); } -- 1.5.3.8 From swise at opengridcomputing.com Wed Jan 16 07:33:22 2008 From: swise at opengridcomputing.com (Steve Wise) Date: Wed, 16 Jan 2008 09:33:22 -0600 Subject: [ofa-general] RE: new API for the rdma-cma In-Reply-To: <20080116011308.GK28360@obsidianresearch.com> References: <478D346B.8070506@opengridcomputing.com> <000801c857cc$0d6c5ed0$a937170a@amr.corp.intel.com> <478D43DD.1090007@opengridcomputing.com> <000a01c857d6$b046b1a0$a937170a@amr.corp.intel.com> <20080116011308.GK28360@obsidianresearch.com> Message-ID: <478E23C2.2040901@opengridcomputing.com> Jason Gunthorpe wrote: > On Tue, Jan 15, 2008 at 04:28:16PM -0800, Sean Hefty wrote: >>> I'm prototyping this now (as part of our OMPI/rdma-cm/iwarp work). >> I'm not sure about what the interface should be, since there could >> be multiple addresses (IPv4 and IPv6) on a port. As a generality, >> my preference is to use sockaddr where possible. > > Generally sockaddr alone for this kind of purpose is frowned on in my > view since there is no portable way to get the address length, and all > downstream functions require the length.. 
>
> The best interface is something like getaddrinfo that returns a new
> type that has the family, address and length in the structure.
>
> Also, your second version would require the address array to be of
> type sockaddr_storage, since sockaddr can only point to, not store
> addresses..
>
> Jason

Jason, maybe the if_index is a good start like you said.

Regardless, I don't see any way to know which if_index (or even
if_name) maps to which port on a multi-port rdma device. The
/sys/class/.../net:* entries show both port device if_names, but there
is really no way to know which one is port 1 and which one is port 2.

With interface renaming, it is not necessarily true, for example, given
an rnic with 2 ports set up as eth1 and eth2, that eth1 maps to port 1
of a device and eth2 maps to port 2. They could be renamed to anything.

Does if_index remain constant? I.e., maybe I can assume that the
if_index ordering is the same as the port ordering. So if a 2-port
rnic, for example, has if_indexes 2 and 3, then I can assume if_index 2
is port 1 and if_index 3 is port 2. Is that a valid assumption?

The other way to do this, I guess, is to look at port gids and map them
back to the nic mac address. But I was hoping for something easier.

Thoughts?
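
For reference, the user-space enumeration discussed in this thread could
be sketched roughly as below. This is only an illustration, assuming
Linux with glibc's getifaddrs(3) rather than raw netlink; the helper
names is_ib_hatype() and list_ib_links() are made up for the example and
are not part of librdmacm. It lists the if_index and if_name of every
interface whose link type is ARPHRD_INFINIBAND. It does not answer the
if_index-to-port question above, and it does not detect iWARP
interfaces, which show up as ordinary Ethernet links.

```c
/* Sketch only: list IPoIB interfaces with their if_index (Linux, glibc). */
#include <stdio.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <ifaddrs.h>          /* getifaddrs(), freeifaddrs() */
#include <net/if_arp.h>       /* ARPHRD_INFINIBAND, ARPHRD_ETHER */
#include <netpacket/packet.h> /* struct sockaddr_ll */

/* Hypothetical helper: nonzero if the ARP hardware type is InfiniBand. */
int is_ib_hatype(unsigned short hatype)
{
	return hatype == ARPHRD_INFINIBAND;
}

/*
 * Hypothetical helper: print "<if_index> <if_name>" for each interface
 * whose link-layer type is ARPHRD_INFINIBAND.
 * Returns the number of interfaces printed, or -1 if getifaddrs() fails.
 */
int list_ib_links(FILE *out)
{
	struct ifaddrs *ifa_list, *ifa;
	int count = 0;

	if (getifaddrs(&ifa_list) < 0)
		return -1;

	for (ifa = ifa_list; ifa; ifa = ifa->ifa_next) {
		const struct sockaddr_ll *ll;

		/* One AF_PACKET entry per interface carries the link-layer info. */
		if (!ifa->ifa_addr || ifa->ifa_addr->sa_family != AF_PACKET)
			continue;

		ll = (const struct sockaddr_ll *)ifa->ifa_addr;
		if (!is_ib_hatype(ll->sll_hatype))
			continue;

		fprintf(out, "%d %s\n", ll->sll_ifindex, ifa->ifa_name);
		count++;
	}

	freeifaddrs(ifa_list);
	return count;
}
```

Calling list_ib_links(stdout) from a small main() prints one line per
IPoIB interface. The AF_INET and AF_INET6 entries that getifaddrs()
returns for the same ifa_name would supply the IP addresses for the
{interface name, ip addresses} list.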
From tziporet at mellanox.co.il Wed Jan 16 08:22:45 2008
From: tziporet at mellanox.co.il (Tziporet Koren)
Date: Wed, 16 Jan 2008 18:22:45 +0200
Subject: [ofa-general] OFED 1.3 RC2 release is available
Message-ID: <6C2C79E72C305246B504CBA17B5500C9031E655B@mtlexch01.mtl.com>

Hi,
OFED 1.3 RC2 release is available on
http://www.openfabrics.org/builds/ofed-1.3/release/OFED-1.3-rc2.tgz

To get BUILD_ID run ofed_info

Please report any issues in bugzilla https://bugs.openfabrics.org/
The RC3 release is expected on January 30

Tziporet & Vlad

========================================================================

Release information:
--------------------
OS support:
  Novell:
    - SLES10
    - SLES10 SP1 and up1
  Redhat:
    - Redhat EL4 up4 and up5
    - Redhat EL5 and up1
  kernel.org:
    - 2.6.23 and 2.6.24-rc5

Compilation only checks:
    - Fedora Core 6
    - openSuSE 10.3
    - Redhat EL4 up6

Systems:
    * x86_64
    * x86
    * ia64
    * ppc64

Main Changes from OFED 1.3-RC1
===============================
* Fixed 21 Bugs (see attachment)
* Added support for RHEL4.6 and openSuSE10.3
* Install: Added vendor's pre/post install scripts support
* MPI packages update:
    * openmpi-1.2.5-1
    * mvapich-1.0.0-1844
    * mvapich2-1.0.1-2
* Added support for Qlogic new HCA:

Specific module changes:
========================
ULPs:
-----
SDP:
    * Executing netperf with TCP_CORK enabled never ends
    * poll() always returns POLLOUT on non-blocking socket
    * SDP connect() only allows AF_INET (2), not AF_INET_SDP (27)
iSER:
    * Separate open-iscsi and iSER patches for different distros
IPoIB:
    * Fix IPOIB LSO support: turn on the QP_CREATE_LSO flag to let the
      hw layer know and take proper actions

Libraries:
----------
libibverbs:
    * Preserve backwards binary compatibility.
librdmacm:
    * Release 1.0.5

Utilities:
----------
Opensm:
    * Fixing core dump in fat-tree routing
    * Use valid pkey index value for gsi mads
    * osm_sa_slvl_record: fix overflow crash
    * Fixing a seg. fault in processing mcast groups
    * mcast mgr improvements
    * QoS policy - increased stability
mstflint:
    * Convert project to autoconf tools
Performance tests:
    * Fix bug rdma_lat.c. Messages up to 400 bytes will be sent Inline
    * Added multicast support to ib_send_bw and ib_send_lat tests
Diagnostic tools:
    * Enhanced saquery to support:
        VLArb and PKey Table Records
        Ports with LinkRecord query
        SL2VLTableRecord attribute
        Attribute names support
    * checkerrors: fix port errors count and query only single ports
      in CAs
ibutils:
    * vsGetGeneralInfo function now dumps the correct data
    * Fixed stack-smashing bug in ibis gid typemaps, which could cause
      crashes on ppc64

Low level drivers:
------------------
mlx4:
    * max_recv_wr must be > 0 for non-SRQ QPs.
    * Fix the value of the pkey_index in the completion to get a valid
      value for GSI QPs.
    * Do not use memcpy when copying to the BlueFlame buffer
    * Fix pkey_index processing in cq polling
mthca:
    * Ensure an Rx WQE is in memory before linking
cxgb:
    * library release 1.1.2
ipath:
    * Added support for the new HCA iba7220
Nes:
    * fix virtual WQ mapping and size

Tasks that should be completed for RC3:
==============================
1. XRC enhanced API
2. Fix bugs

-------------- next part --------------
A non-text attachment was scrubbed...
Name: rc2-fixed-bugs.csv
Type: application/octet-stream
Size: 2158 bytes
Desc: rc2-fixed-bugs.csv
URL:

From sweitzen at cisco.com Wed Jan 16 08:27:24 2008
From: sweitzen at cisco.com (Scott Weitzenkamp (sweitzen))
Date: Wed, 16 Jan 2008 08:27:24 -0800
Subject: [ofa-general] OFED 1.3 RC2 release is available
In-Reply-To: <6C2C79E72C305246B504CBA17B5500C9031E655B@mtlexch01.mtl.com>
References: <6C2C79E72C305246B504CBA17B5500C9031E655B@mtlexch01.mtl.com>
Message-ID:

Isn't RHEL4 up6 supported, too?

I have added Version 1.3rc2 to bugzilla.
Scott Weitzenkamp SQA and Release Manager Server Virtualization Business Unit Cisco Systems > -----Original Message----- > From: general-bounces at lists.openfabrics.org > [mailto:general-bounces at lists.openfabrics.org] On Behalf Of > Tziporet Koren > Sent: Wednesday, January 16, 2008 8:23 AM > To: ewg at lists.openfabrics.org > Cc: general at lists.openfabrics.org > Subject: [ofa-general] OFED 1.3 RC2 release is available > > > Hi, > OFED 1.3 RC2 release is available on > http://www.openfabrics.org/builds/ofed-1.3/release/OFED-1.3-rc2.tgz > > To get BUILD_ID run ofed_info > > Please report any issues in bugzilla https://bugs.openfabrics.org/ > The RC3 release is expected on January 30 > > Tziporet & Vlad > > > > ============================================================== > ========== > > Release information: > -------------------- > OS support: > Novell: > - SLES10 > - SLES10 SP1 and up1 > Redhat: > - Redhat EL4 up4 and up5 > - Redhat EL5 and up1 > kernel.org: > - 2.6.23 and 2.6.24-rc5 > > Compilation only checks: > - Fedora Core 6 > - openSuSE 10.3 > - Redhat EL4 up6 > > Systems: > * x86_64 > * x86 > * ia64 > * ppc64 > > Main Changes from OFED 1.3-RC1 > =============================== > * Fixed 21 Bugs (see attachment) > * Added support for RHEL4.6 and openSuSE10.3 > * Install: Added vendor's pre/post install scripts support > * MPI packages update: > * openmpi-1.2.5-1 > * mvapich-1.0.0-1844 > * mvapich2-1.0.1-2 > * Added support for Qlogic new HCA: > > Specific module changes: > ======================== > ULPs: > ----- > SDP: > * Executing netperf with TCP_CORK enabled never ends > * poll() always returns POLLOUT on non-blocking socket > * SDP connect() only allows AF_INET (2), not AF_INET_SDP (27) > iSER: > * Separate open-iscsi and iSER patches for different distros > IPoIB: > * Fix IPOIB LSO support: turn on the QP_CREATE_LSO flag to let the > hw layer know and take proper actions > > Libraries: > ---------- > libibverbs: > * Preserve backwards binary 
compatibility. > librdmacm: > * Release 1.0.5 > > Utilities: > ---------- > Opensm: > * Fixing core dump in fat-tree routing > * Use valid pkey index value for gsi mads > * osm_sa_slvl_record: fix overflow crash > * Fixing a seg. fault in processing mcast groups > * mcast mgr improvements > * QoS policy - increased stability > mstflint: > * Convert project to autoconf tools > Performance tests: > * Fix bug rdma_lat.c. Messages up to 400 bytes will be sent Inline > * Added multicast support to ib_send_bw and ib_send_lat tests > Diagnostic tools: > * Enhanced saquery to support: > VLArb and PKey Table Records > Ports with LinkRecord query > SL2VLTableRecord attribute > Attribute names support > * checkerrors: fix port errors count and query only single ports > in CAs > ibutils: > * vsGetGeneralInfo function now dumps the correct data > * Fixed stack-smashing bug in ibis gid typemaps, which could cause > crashes on ppc64 > > Low level drivers: > ------------------ > mlx4: > * max_recv_wr must be > 0 for non-SRQ QPs. > * Fix the value of the pkey_index in the completion to get a valid > value for GSI QPs. > * Do not use memcpy when copying to the BlueFlame buffer > * Fix pkey_index processing in cq polling > mthca: > * Ensure an Rx WQE is in memory before linking > cxgb: > * library release 1.1.2 > ipath: > * Added support for the new HCA iba7220 > Nes: > * fix virtual WQ mapping and size > > > Tasks that should be completed for RC3: > ============================== > 1. XRC enhanced API > 2. Fix bugs > > From eli at mellanox.co.il Wed Jan 16 08:37:19 2008 From: eli at mellanox.co.il (Eli Cohen) Date: Wed, 16 Jan 2008 18:37:19 +0200 Subject: [ofa-general] statelss offload patches Message-ID: <1200501439.13546.69.camel@mtls03> Following this email is a list of stateless offload patches. This series was posted to the list in the past and now has been revised. 
It has also been reviewed, some more and some less, by Or Gerlitz from Voltaire (thanks) though not all his suggestions made it to the code. I hope they are reviewed by the community and will eventually get into 2.6.25. From eli at mellanox.co.il Wed Jan 16 08:37:28 2008 From: eli at mellanox.co.il (Eli Cohen) Date: Wed, 16 Jan 2008 18:37:28 +0200 Subject: [ofa-general] [PATCH 1/16]: ib/ipoib: Add high dma support to ipoib Message-ID: <1200501448.13546.70.camel@mtls03> Add high dma support to ipoib This patch assumes all IB devices support dma-ing from high memory. Signed-off-by: Eli Cohen --- drivers/infiniband/ulp/ipoib/ipoib_main.c | 2 ++ 1 files changed, 2 insertions(+), 0 deletions(-) diff --git a/drivers/infiniband/ulp/ipoib/ipoib_main.c b/drivers/infiniband/ulp/ipoib/ipoib_main.c index c9f6077..b40e0f7 100644 --- a/drivers/infiniband/ulp/ipoib/ipoib_main.c +++ b/drivers/infiniband/ulp/ipoib/ipoib_main.c @@ -1118,6 +1118,8 @@ static struct net_device *ipoib_add_port(const char *format, SET_NETDEV_DEV(priv->dev, hca->dma_device); + priv->dev->features |= NETIF_F_HIGHDMA; + result = ib_query_pkey(hca, port, 0, &priv->pkey); if (result) { printk(KERN_WARNING "%s: ib_query_pkey port %d failed (ret = %d)\n", -- 1.5.3.8 From eli at mellanox.co.il Wed Jan 16 08:37:39 2008 From: eli at mellanox.co.il (Eli Cohen) Date: Wed, 16 Jan 2008 18:37:39 +0200 Subject: [ofa-general] [PATCH 3/16] ib/core: Add checksum support to ib core Message-ID: <1200501459.13546.72.camel@mtls03> Add checksum support to ib core Signed-off-by: Eli Cohen --- include/rdma/ib_verbs.h | 13 +++++++++++-- 1 files changed, 11 insertions(+), 2 deletions(-) diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h index 11f3960..afd4d71 100644 --- a/include/rdma/ib_verbs.h +++ b/include/rdma/ib_verbs.h @@ -95,7 +95,14 @@ enum ib_device_cap_flags { IB_DEVICE_N_NOTIFY_CQ = (1<<14), IB_DEVICE_ZERO_STAG = (1<<15), IB_DEVICE_SEND_W_INV = (1<<16), - IB_DEVICE_MEM_WINDOW = (1<<17) + IB_DEVICE_MEM_WINDOW 
= (1<<17), + /* + * devices which publish this capability must support insertion of UDP + * and TCP checksum on outgoing packets and can verify the validity of + * checksum for incoming packets. Setting this flag implies the driver + * may set NETIF_F_IP_CSUM or NETIF_F_IPV6_CSUM. + */ + IB_DEVICE_IP_CSUM = (1<<18), }; enum ib_atomic_cap { @@ -431,6 +438,7 @@ struct ib_wc { u8 sl; u8 dlid_path_bits; u8 port_num; /* valid only for DR SMPs on switches */ + int csum_ok; }; enum ib_cq_notify_flags { @@ -615,7 +623,8 @@ enum ib_send_flags { IB_SEND_FENCE = 1, IB_SEND_SIGNALED = (1<<1), IB_SEND_SOLICITED = (1<<2), - IB_SEND_INLINE = (1<<3) + IB_SEND_INLINE = (1<<3), + IB_SEND_IP_CSUM = (1<<4) }; struct ib_sge { -- 1.5.3.8 From eli at mellanox.co.il Wed Jan 16 08:37:34 2008 From: eli at mellanox.co.il (Eli Cohen) Date: Wed, 16 Jan 2008 18:37:34 +0200 Subject: [ofa-general] [PATCH 2/16] ib/ipoib: Add s/g support for IPOIB Message-ID: <1200501454.13546.71.camel@mtls03> Add s/g support for IPOIB This patch is a preparation for using checksum offload for IB devices capable of inserting/verifying checksum in IP packets. The patch does not actually turn on NETIF_F_SG but rather defers that to the patches adding checksum offload capabilities. Support is added only for datagram mode since Mellanox HW does not support checksum offload on connected QPs. Signed-off-by: Michael S.
Tsirkin Signed-off-by: Eli Cohen --- drivers/infiniband/ulp/ipoib/ipoib.h | 56 +++++++++++++++++++++++++++- drivers/infiniband/ulp/ipoib/ipoib_cm.c | 10 ++-- drivers/infiniband/ulp/ipoib/ipoib_ib.c | 41 ++++++++++---------- drivers/infiniband/ulp/ipoib/ipoib_verbs.c | 10 ++-- 4 files changed, 85 insertions(+), 32 deletions(-) diff --git a/drivers/infiniband/ulp/ipoib/ipoib.h b/drivers/infiniband/ulp/ipoib/ipoib.h index eb7edab..6729c14 100644 --- a/drivers/infiniband/ulp/ipoib/ipoib.h +++ b/drivers/infiniband/ulp/ipoib/ipoib.h @@ -142,9 +142,61 @@ struct ipoib_rx_buf { struct ipoib_tx_buf { struct sk_buff *skb; - u64 mapping; + u64 mapping[MAX_SKB_FRAGS + 1]; }; +static inline int ipoib_dma_map_tx(struct ib_device *ca, + struct ipoib_tx_buf *tx_req) +{ + struct sk_buff *skb = tx_req->skb; + u64 *mapping = tx_req->mapping; + int frags; + int i; + + mapping[0] = ib_dma_map_single(ca, skb->data, skb_headlen(skb), + DMA_TO_DEVICE); + if (unlikely(ib_dma_mapping_error(ca, mapping[0]))) + return -EIO; + + frags = skb_shinfo(skb)->nr_frags; + for (i = 0; i < frags; ++i) { + skb_frag_t *frag = &skb_shinfo(skb)->frags[i]; + mapping[i + 1] = ib_dma_map_page(ca, frag->page, + frag->page_offset, frag->size, + DMA_TO_DEVICE); + if (unlikely(ib_dma_mapping_error(ca, mapping[i + 1]))) + goto partial_error; + } + return 0; + +partial_error: + ib_dma_unmap_single(ca, mapping[0], skb_headlen(skb), DMA_TO_DEVICE); + + for (; i > 0; --i) { + skb_frag_t *frag = &skb_shinfo(skb)->frags[i - 1]; + ib_dma_unmap_page(ca, mapping[i], frag->size, DMA_TO_DEVICE); + } + return -EIO; +} + +static inline void ipoib_dma_unmap_tx(struct ib_device *ca, + struct ipoib_tx_buf *tx_req) +{ + struct sk_buff *skb = tx_req->skb; + u64 *mapping = tx_req->mapping; + int frags; + int i; + + ib_dma_unmap_single(ca, mapping[0], skb_headlen(skb), DMA_TO_DEVICE); + + frags = skb_shinfo(skb)->nr_frags; + for (i = 0; i < frags; ++i) { + skb_frag_t *frag = &skb_shinfo(skb)->frags[i]; + ib_dma_unmap_page(ca, 
mapping[i + 1], frag->size, + DMA_TO_DEVICE); + } +} + struct ib_cm_id; struct ipoib_cm_data { @@ -290,7 +342,7 @@ struct ipoib_dev_priv { struct ipoib_tx_buf *tx_ring; unsigned tx_head; unsigned tx_tail; - struct ib_sge tx_sge; + struct ib_sge tx_sge[MAX_SKB_FRAGS + 1]; struct ib_send_wr tx_wr; unsigned tx_outstanding; diff --git a/drivers/infiniband/ulp/ipoib/ipoib_cm.c b/drivers/infiniband/ulp/ipoib/ipoib_cm.c index 059cf92..8485fde 100644 --- a/drivers/infiniband/ulp/ipoib/ipoib_cm.c +++ b/drivers/infiniband/ulp/ipoib/ipoib_cm.c @@ -495,8 +495,8 @@ static inline int post_send(struct ipoib_dev_priv *priv, { struct ib_send_wr *bad_wr; - priv->tx_sge.addr = addr; - priv->tx_sge.length = len; + priv->tx_sge[0].addr = addr; + priv->tx_sge[0].length = len; priv->tx_wr.wr_id = wr_id | IPOIB_OP_CM; @@ -537,7 +537,7 @@ void ipoib_cm_send(struct net_device *dev, struct sk_buff *skb, struct ipoib_cm_ return; } - tx_req->mapping = addr; + tx_req->mapping[0] = addr; if (unlikely(post_send(priv, tx, tx->tx_head & (ipoib_sendq_size - 1), addr, skb->len))) { @@ -576,7 +576,7 @@ void ipoib_cm_handle_tx_wc(struct net_device *dev, struct ib_wc *wc) tx_req = &tx->tx_ring[wr_id]; - ib_dma_unmap_single(priv->ca, tx_req->mapping, tx_req->skb->len, DMA_TO_DEVICE); + ib_dma_unmap_single(priv->ca, tx_req->mapping[0], tx_req->skb->len, DMA_TO_DEVICE); /* FIXME: is this right? Shouldn't we only increment on success? 
*/ ++dev->stats.tx_packets; @@ -954,7 +954,7 @@ timeout: while ((int) p->tx_tail - (int) p->tx_head < 0) { tx_req = &p->tx_ring[p->tx_tail & (ipoib_sendq_size - 1)]; - ib_dma_unmap_single(priv->ca, tx_req->mapping, tx_req->skb->len, + ib_dma_unmap_single(priv->ca, tx_req->mapping[0], tx_req->skb->len, DMA_TO_DEVICE); dev_kfree_skb_any(tx_req->skb); ++p->tx_tail; diff --git a/drivers/infiniband/ulp/ipoib/ipoib_ib.c b/drivers/infiniband/ulp/ipoib/ipoib_ib.c index 5063dd5..680c27f 100644 --- a/drivers/infiniband/ulp/ipoib/ipoib_ib.c +++ b/drivers/infiniband/ulp/ipoib/ipoib_ib.c @@ -257,8 +257,7 @@ static void ipoib_ib_handle_tx_wc(struct net_device *dev, struct ib_wc *wc) tx_req = &priv->tx_ring[wr_id]; - ib_dma_unmap_single(priv->ca, tx_req->mapping, - tx_req->skb->len, DMA_TO_DEVICE); + ipoib_dma_unmap_tx(priv->ca, tx_req); ++dev->stats.tx_packets; dev->stats.tx_bytes += tx_req->skb->len; @@ -341,16 +340,23 @@ void ipoib_ib_completion(struct ib_cq *cq, void *dev_ptr) static inline int post_send(struct ipoib_dev_priv *priv, unsigned int wr_id, struct ib_ah *address, u32 qpn, - u64 addr, int len) + u64 *mapping, int headlen, + skb_frag_t *frags, + int nr_frags) { struct ib_send_wr *bad_wr; + int i; - priv->tx_sge.addr = addr; - priv->tx_sge.length = len; - - priv->tx_wr.wr_id = wr_id; - priv->tx_wr.wr.ud.remote_qpn = qpn; - priv->tx_wr.wr.ud.ah = address; + priv->tx_sge[0].addr = mapping[0]; + priv->tx_sge[0].length = headlen; + for (i = 0; i < nr_frags; ++i) { + priv->tx_sge[i + 1].addr = mapping[i + 1]; + priv->tx_sge[i + 1].length = frags[i].size; + } + priv->tx_wr.num_sge = nr_frags + 1; + priv->tx_wr.wr_id = wr_id; + priv->tx_wr.wr.ud.remote_qpn = qpn; + priv->tx_wr.wr.ud.ah = address; return ib_post_send(priv->qp, &priv->tx_wr, &bad_wr); } @@ -360,7 +366,6 @@ void ipoib_send(struct net_device *dev, struct sk_buff *skb, { struct ipoib_dev_priv *priv = netdev_priv(dev); struct ipoib_tx_buf *tx_req; - u64 addr; if (unlikely(skb->len > priv->mcast_mtu + 
IPOIB_ENCAP_LEN)) { ipoib_warn(priv, "packet len %d (> %d) too long to send, dropping\n", @@ -383,20 +388,19 @@ void ipoib_send(struct net_device *dev, struct sk_buff *skb, */ tx_req = &priv->tx_ring[priv->tx_head & (ipoib_sendq_size - 1)]; tx_req->skb = skb; - addr = ib_dma_map_single(priv->ca, skb->data, skb->len, - DMA_TO_DEVICE); - if (unlikely(ib_dma_mapping_error(priv->ca, addr))) { + if (unlikely(ipoib_dma_map_tx(priv->ca, tx_req))) { ++dev->stats.tx_errors; dev_kfree_skb_any(skb); return; } - tx_req->mapping = addr; if (unlikely(post_send(priv, priv->tx_head & (ipoib_sendq_size - 1), - address->ah, qpn, addr, skb->len))) { + address->ah, qpn, + tx_req->mapping, skb_headlen(skb), + skb_shinfo(skb)->frags, skb_shinfo(skb)->nr_frags))) { ipoib_warn(priv, "post_send failed\n"); ++dev->stats.tx_errors; - ib_dma_unmap_single(priv->ca, addr, skb->len, DMA_TO_DEVICE); + ipoib_dma_unmap_tx(priv->ca, tx_req); dev_kfree_skb_any(skb); } else { dev->trans_start = jiffies; @@ -615,10 +619,7 @@ int ipoib_ib_dev_stop(struct net_device *dev, int flush) while ((int) priv->tx_tail - (int) priv->tx_head < 0) { tx_req = &priv->tx_ring[priv->tx_tail & (ipoib_sendq_size - 1)]; - ib_dma_unmap_single(priv->ca, - tx_req->mapping, - tx_req->skb->len, - DMA_TO_DEVICE); + ipoib_dma_unmap_tx(priv->ca, tx_req); dev_kfree_skb_any(tx_req->skb); ++priv->tx_tail; --priv->tx_outstanding; diff --git a/drivers/infiniband/ulp/ipoib/ipoib_verbs.c b/drivers/infiniband/ulp/ipoib/ipoib_verbs.c index 3c6e45d..a6f5f65 100644 --- a/drivers/infiniband/ulp/ipoib/ipoib_verbs.c +++ b/drivers/infiniband/ulp/ipoib/ipoib_verbs.c @@ -149,14 +149,14 @@ int ipoib_transport_dev_init(struct net_device *dev, struct ib_device *ca) .cap = { .max_send_wr = ipoib_sendq_size, .max_recv_wr = ipoib_recvq_size, - .max_send_sge = 1, + .max_send_sge = dev->features & NETIF_F_SG ? 
MAX_SKB_FRAGS + 1 : 1, .max_recv_sge = 1 }, .sq_sig_type = IB_SIGNAL_ALL_WR, .qp_type = IB_QPT_UD }; - int ret, size; + int i, ret, size; priv->pd = ib_alloc_pd(priv->ca); if (IS_ERR(priv->pd)) { @@ -197,11 +197,11 @@ int ipoib_transport_dev_init(struct net_device *dev, struct ib_device *ca) priv->dev->dev_addr[2] = (priv->qp->qp_num >> 8) & 0xff; priv->dev->dev_addr[3] = (priv->qp->qp_num ) & 0xff; - priv->tx_sge.lkey = priv->mr->lkey; + for (i = 0; i < MAX_SKB_FRAGS + 1; ++i) + priv->tx_sge[i].lkey = priv->mr->lkey; priv->tx_wr.opcode = IB_WR_SEND; - priv->tx_wr.sg_list = &priv->tx_sge; - priv->tx_wr.num_sge = 1; + priv->tx_wr.sg_list = priv->tx_sge; priv->tx_wr.send_flags = IB_SEND_SIGNALED; return 0; -- 1.5.3.8 From eli at mellanox.co.il Wed Jan 16 08:37:43 2008 From: eli at mellanox.co.il (Eli Cohen) Date: Wed, 16 Jan 2008 18:37:43 +0200 Subject: [ofa-general] [PATCH 4/16] Add checksum offload support for ipoib Message-ID: <1200501463.13546.73.camel@mtls03> Add checksum offload support for ipoib Signed-off-by: Eli Cohen --- drivers/infiniband/ulp/ipoib/ipoib.h | 1 + drivers/infiniband/ulp/ipoib/ipoib_cm.c | 9 +++++++++ drivers/infiniband/ulp/ipoib/ipoib_ib.c | 19 +++++++++++++++++++ drivers/infiniband/ulp/ipoib/ipoib_main.c | 15 +++++++++++++++ 4 files changed, 44 insertions(+), 0 deletions(-) diff --git a/drivers/infiniband/ulp/ipoib/ipoib.h b/drivers/infiniband/ulp/ipoib/ipoib.h index 6729c14..f0876dc 100644 --- a/drivers/infiniband/ulp/ipoib/ipoib.h +++ b/drivers/infiniband/ulp/ipoib/ipoib.h @@ -86,6 +86,7 @@ enum { IPOIB_MCAST_STARTED = 8, IPOIB_FLAG_ADMIN_CM = 9, IPOIB_FLAG_UMCAST = 10, + IPOIB_FLAG_CSUM = 11, IPOIB_MAX_BACKOFF_SECONDS = 16, diff --git a/drivers/infiniband/ulp/ipoib/ipoib_cm.c b/drivers/infiniband/ulp/ipoib/ipoib_cm.c index 8485fde..435ec58 100644 --- a/drivers/infiniband/ulp/ipoib/ipoib_cm.c +++ b/drivers/infiniband/ulp/ipoib/ipoib_cm.c @@ -1234,6 +1234,11 @@ static ssize_t set_mode(struct device *d, struct device_attribute *attr, 
set_bit(IPOIB_FLAG_ADMIN_CM, &priv->flags); ipoib_warn(priv, "enabling connected mode " "will cause multicast packet drops\n"); + + dev->features &= ~(NETIF_F_IP_CSUM | NETIF_F_IPV6_CSUM); + + priv->tx_wr.send_flags &= ~IB_SEND_IP_CSUM; + ipoib_flush_paths(dev); return count; } @@ -1242,6 +1247,10 @@ static ssize_t set_mode(struct device *d, struct device_attribute *attr, clear_bit(IPOIB_FLAG_ADMIN_CM, &priv->flags); dev->mtu = min(priv->mcast_mtu, dev->mtu); ipoib_flush_paths(dev); + + if (priv->ca->flags & IB_DEVICE_IP_CSUM) + dev->features |= NETIF_F_IP_CSUM | NETIF_F_IPV6_CSUM; + return count; } diff --git a/drivers/infiniband/ulp/ipoib/ipoib_ib.c b/drivers/infiniband/ulp/ipoib/ipoib_ib.c index 680c27f..ca11469 100644 --- a/drivers/infiniband/ulp/ipoib/ipoib_ib.c +++ b/drivers/infiniband/ulp/ipoib/ipoib_ib.c @@ -37,6 +37,7 @@ #include #include +#include #include @@ -231,6 +232,18 @@ static void ipoib_ib_handle_rx_wc(struct net_device *dev, struct ib_wc *wc) skb->dev = dev; /* XXX get correct PACKET_ type here */ skb->pkt_type = PACKET_HOST; + + /* check rx csum */ + if (test_bit(IPOIB_FLAG_CSUM, &priv->flags) && likely(wc->csum_ok)) { + /* + * Note: this is a specific requirement for Mellanox + * HW but since it is the only HW currently supporting + * checksum offload I put it here + */ + if (ip_hdr(skb)->ihl == 5) + skb->ip_summed = CHECKSUM_UNNECESSARY; + } + netif_receive_skb(skb); repost: @@ -394,6 +407,12 @@ void ipoib_send(struct net_device *dev, struct sk_buff *skb, return; } + if (priv->ca->flags & IB_DEVICE_IP_CSUM && + skb->ip_summed == CHECKSUM_PARTIAL) + priv->tx_wr.send_flags |= IB_SEND_IP_CSUM; + else + priv->tx_wr.send_flags &= ~IB_SEND_IP_CSUM; + if (unlikely(post_send(priv, priv->tx_head & (ipoib_sendq_size - 1), address->ah, qpn, tx_req->mapping, skb_headlen(skb), diff --git a/drivers/infiniband/ulp/ipoib/ipoib_main.c b/drivers/infiniband/ulp/ipoib/ipoib_main.c index b40e0f7..58fc5b3 100644 --- a/drivers/infiniband/ulp/ipoib/ipoib_main.c +++
b/drivers/infiniband/ulp/ipoib/ipoib_main.c @@ -1106,6 +1106,20 @@ int ipoib_add_pkey_attr(struct net_device *dev) return device_create_file(&dev->dev, &dev_attr_pkey); } +static void set_csum(struct net_device *dev, struct ib_device *hca) +{ + struct ipoib_dev_priv *priv = netdev_priv(dev); + + if (test_bit(IPOIB_FLAG_ADMIN_CM, &priv->flags)) + return; + + if (!(hca->flags & IB_DEVICE_IP_CSUM)) + return; + + dev->features |= NETIF_F_SG | NETIF_F_IP_CSUM | NETIF_F_IPV6_CSUM; + set_bit(IPOIB_FLAG_CSUM, &priv->flags); +} + static struct net_device *ipoib_add_port(const char *format, struct ib_device *hca, u8 port) { @@ -1144,6 +1158,7 @@ static struct net_device *ipoib_add_port(const char *format, } else memcpy(priv->dev->dev_addr + 4, priv->local_gid.raw, sizeof (union ib_gid)); + set_csum(priv->dev, hca); result = ipoib_dev_init(priv->dev, hca, port); if (result < 0) { -- 1.5.3.8 From eli at mellanox.co.il Wed Jan 16 08:37:48 2008 From: eli at mellanox.co.il (Eli Cohen) Date: Wed, 16 Jan 2008 18:37:48 +0200 Subject: [ofa-general] [PATCH 5/16] ib/mlx4: Add checksum offload support to mlx4 Message-ID: <1200501468.13546.74.camel@mtls03> Add checksum offload support to mlx4 Signed-off-by: Eli Cohen Signed-off-by: Ali Ayub --- drivers/infiniband/hw/mlx4/cq.c | 2 ++ drivers/infiniband/hw/mlx4/main.c | 5 +++++ drivers/infiniband/hw/mlx4/qp.c | 3 +++ drivers/net/mlx4/fw.c | 3 +++ include/linux/mlx4/cq.h | 4 ++-- include/linux/mlx4/qp.h | 2 ++ 6 files changed, 17 insertions(+), 2 deletions(-) diff --git a/drivers/infiniband/hw/mlx4/cq.c b/drivers/infiniband/hw/mlx4/cq.c index 9d32c49..8b49895 100644 --- a/drivers/infiniband/hw/mlx4/cq.c +++ b/drivers/infiniband/hw/mlx4/cq.c @@ -431,6 +431,8 @@ static int mlx4_ib_poll_one(struct mlx4_ib_cq *cq, wc->wc_flags |= be32_to_cpu(cqe->g_mlpath_rqpn) & 0x80000000 ? 
IB_WC_GRH : 0; wc->pkey_index = be32_to_cpu(cqe->immed_rss_invalid) & 0x7f; + wc->csum_ok = be32_to_cpu(cqe->ipoib_status) & 0x10000000 && + be16_to_cpu(cqe->checksum) == 0xffff; } return 0; diff --git a/drivers/infiniband/hw/mlx4/main.c b/drivers/infiniband/hw/mlx4/main.c index d8287d9..8ce94a1 100644 --- a/drivers/infiniband/hw/mlx4/main.c +++ b/drivers/infiniband/hw/mlx4/main.c @@ -99,6 +99,8 @@ static int mlx4_ib_query_device(struct ib_device *ibdev, props->device_cap_flags |= IB_DEVICE_AUTO_PATH_MIG; if (dev->dev->caps.flags & MLX4_DEV_CAP_FLAG_UD_AV_PORT) props->device_cap_flags |= IB_DEVICE_UD_AV_PORT_ENFORCE; + if (dev->dev->caps.flags & MLX4_DEV_CAP_FLAG_IPOIB_CSUM) + props->device_cap_flags |= IB_DEVICE_IP_CSUM; props->vendor_id = be32_to_cpup((__be32 *) (out_mad->data + 36)) & 0xffffff; @@ -612,6 +614,9 @@ static void *mlx4_ib_add(struct mlx4_dev *dev) ibdev->ib_dev.unmap_fmr = mlx4_ib_unmap_fmr; ibdev->ib_dev.dealloc_fmr = mlx4_ib_fmr_dealloc; + if (ibdev->dev->caps.flags & MLX4_DEV_CAP_FLAG_IPOIB_CSUM) + ibdev->ib_dev.flags |= IB_DEVICE_IP_CSUM; + if (init_node_data(ibdev)) goto err_map; diff --git a/drivers/infiniband/hw/mlx4/qp.c b/drivers/infiniband/hw/mlx4/qp.c index 8cba9c5..ca7cd04 100644 --- a/drivers/infiniband/hw/mlx4/qp.c +++ b/drivers/infiniband/hw/mlx4/qp.c @@ -1307,6 +1307,9 @@ int mlx4_ib_post_send(struct ib_qp *ibqp, struct ib_send_wr *wr, cpu_to_be32(MLX4_WQE_CTRL_CQ_UPDATE) : 0) | (wr->send_flags & IB_SEND_SOLICITED ? cpu_to_be32(MLX4_WQE_CTRL_SOLICITED) : 0) | + ((wr->send_flags & IB_SEND_IP_CSUM) ? 
+ cpu_to_be32(MLX4_WQE_CTRL_IP_CSUM | + MLX4_WQE_CTRL_TCP_UDP_CSUM) : 0) | qp->sq_signal_bits; if (wr->opcode == IB_WR_SEND_WITH_IMM || diff --git a/drivers/net/mlx4/fw.c b/drivers/net/mlx4/fw.c index 5064873..d6c2851 100644 --- a/drivers/net/mlx4/fw.c +++ b/drivers/net/mlx4/fw.c @@ -736,6 +736,9 @@ int mlx4_INIT_HCA(struct mlx4_dev *dev, struct mlx4_init_hca_param *param) MLX4_PUT(inbox, (u8) (PAGE_SHIFT - 12), INIT_HCA_UAR_PAGE_SZ_OFFSET); MLX4_PUT(inbox, param->log_uar_sz, INIT_HCA_LOG_UAR_SZ_OFFSET); + if (dev->caps.flags & MLX4_DEV_CAP_FLAG_IPOIB_CSUM) + *(inbox + INIT_HCA_FLAGS_OFFSET / 4) |= cpu_to_be32(1 << 3); + err = mlx4_cmd(dev, mailbox->dma, 0, 0, MLX4_CMD_INIT_HCA, 10000); if (err) diff --git a/include/linux/mlx4/cq.h b/include/linux/mlx4/cq.h index 0181e0a..5fdc859 100644 --- a/include/linux/mlx4/cq.h +++ b/include/linux/mlx4/cq.h @@ -45,11 +45,11 @@ struct mlx4_cqe { u8 sl; u8 reserved1; __be16 rlid; - u32 reserved2; + __be32 ipoib_status; __be32 byte_cnt; __be16 wqe_index; __be16 checksum; - u8 reserved3[3]; + u8 reserved2[3]; u8 owner_sr_opcode; }; diff --git a/include/linux/mlx4/qp.h b/include/linux/mlx4/qp.h index 3968b94..b4eb921 100644 --- a/include/linux/mlx4/qp.h +++ b/include/linux/mlx4/qp.h @@ -158,6 +158,8 @@ enum { MLX4_WQE_CTRL_FENCE = 1 << 6, MLX4_WQE_CTRL_CQ_UPDATE = 3 << 2, MLX4_WQE_CTRL_SOLICITED = 1 << 1, + MLX4_WQE_CTRL_IP_CSUM = 1 << 4, + MLX4_WQE_CTRL_TCP_UDP_CSUM = 1 << 5, }; struct mlx4_wqe_ctrl_seg { -- 1.5.3.8 From eli at mellanox.co.il Wed Jan 16 08:37:52 2008 From: eli at mellanox.co.il (Eli Cohen) Date: Wed, 16 Jan 2008 18:37:52 +0200 Subject: [ofa-general] [PATCH 6/16] Add checksum offload support in mthca Message-ID: <1200501472.13546.75.camel@mtls03> Add checksum offload support in mthca Signed-off-by: Eli Cohen --- drivers/infiniband/hw/mthca/mthca_cmd.c | 3 +++ drivers/infiniband/hw/mthca/mthca_cmd.h | 1 + drivers/infiniband/hw/mthca/mthca_cq.c | 14 +++++++++----- drivers/infiniband/hw/mthca/mthca_main.c | 6 ++++++ 
drivers/infiniband/hw/mthca/mthca_qp.c | 2 ++ drivers/infiniband/hw/mthca/mthca_wqe.h | 17 +++++++++-------- 6 files changed, 30 insertions(+), 13 deletions(-) diff --git a/drivers/infiniband/hw/mthca/mthca_cmd.c b/drivers/infiniband/hw/mthca/mthca_cmd.c index 6966f94..2a38926 100644 --- a/drivers/infiniband/hw/mthca/mthca_cmd.c +++ b/drivers/infiniband/hw/mthca/mthca_cmd.c @@ -1383,6 +1383,9 @@ int mthca_INIT_HCA(struct mthca_dev *dev, MTHCA_PUT(inbox, param->uarc_base, INIT_HCA_UAR_CTX_BASE_OFFSET); } + if (dev->device_cap_flags & IB_DEVICE_IP_CSUM) + *(inbox + INIT_HCA_FLAGS2_OFFSET / 4) |= cpu_to_be32(7 << 3); + err = mthca_cmd(dev, mailbox->dma, 0, 0, CMD_INIT_HCA, HZ, status); mthca_free_mailbox(dev, mailbox); diff --git a/drivers/infiniband/hw/mthca/mthca_cmd.h b/drivers/infiniband/hw/mthca/mthca_cmd.h index 2f976f2..8928ca4 100644 --- a/drivers/infiniband/hw/mthca/mthca_cmd.h +++ b/drivers/infiniband/hw/mthca/mthca_cmd.h @@ -103,6 +103,7 @@ enum { DEV_LIM_FLAG_RAW_IPV6 = 1 << 4, DEV_LIM_FLAG_RAW_ETHER = 1 << 5, DEV_LIM_FLAG_SRQ = 1 << 6, + DEV_LIM_FLAG_IPOIB_CSUM = 1 << 7, DEV_LIM_FLAG_BAD_PKEY_CNTR = 1 << 8, DEV_LIM_FLAG_BAD_QKEY_CNTR = 1 << 9, DEV_LIM_FLAG_MW = 1 << 16, diff --git a/drivers/infiniband/hw/mthca/mthca_cq.c b/drivers/infiniband/hw/mthca/mthca_cq.c index 6bd9f13..4e6c75c 100644 --- a/drivers/infiniband/hw/mthca/mthca_cq.c +++ b/drivers/infiniband/hw/mthca/mthca_cq.c @@ -119,7 +119,8 @@ struct mthca_cqe { __be32 my_qpn; __be32 my_ee; __be32 rqpn; - __be16 sl_g_mlpath; + u8 sl_ipok; + u8 g_mlpath; __be16 rlid; __be32 imm_etype_pkey_eec; __be32 byte_cnt; @@ -493,6 +494,7 @@ static inline int mthca_poll_one(struct mthca_dev *dev, int is_send; int free_cqe = 1; int err = 0; + u16 checksum; cqe = next_cqe_sw(cq); if (!cqe) @@ -635,12 +637,14 @@ static inline int mthca_poll_one(struct mthca_dev *dev, break; } entry->slid = be16_to_cpu(cqe->rlid); - entry->sl = be16_to_cpu(cqe->sl_g_mlpath) >> 12; + entry->sl = cqe->sl_ipok >> 4; entry->src_qp = 
be32_to_cpu(cqe->rqpn) & 0xffffff; - entry->dlid_path_bits = be16_to_cpu(cqe->sl_g_mlpath) & 0x7f; + entry->dlid_path_bits = cqe->g_mlpath & 0x7f; entry->pkey_index = be32_to_cpu(cqe->imm_etype_pkey_eec) >> 16; - entry->wc_flags |= be16_to_cpu(cqe->sl_g_mlpath) & 0x80 ? - IB_WC_GRH : 0; + entry->wc_flags |= cqe->g_mlpath & 0x80 ? IB_WC_GRH : 0; + checksum = (be32_to_cpu(cqe->rqpn) >> 24) | + ((be32_to_cpu(cqe->my_ee) >> 16) & 0xff00); + entry->csum_ok = (cqe->sl_ipok & 1 && checksum == 0xffff); } entry->status = IB_WC_SUCCESS; diff --git a/drivers/infiniband/hw/mthca/mthca_main.c b/drivers/infiniband/hw/mthca/mthca_main.c index 60de6f9..0257932 100644 --- a/drivers/infiniband/hw/mthca/mthca_main.c +++ b/drivers/infiniband/hw/mthca/mthca_main.c @@ -272,6 +272,10 @@ static int mthca_dev_lim(struct mthca_dev *mdev, struct mthca_dev_lim *dev_lim) if (dev_lim->flags & DEV_LIM_FLAG_SRQ) mdev->mthca_flags |= MTHCA_FLAG_SRQ; + if (mthca_is_memfree(mdev)) + if (dev_lim->flags & DEV_LIM_FLAG_IPOIB_CSUM) + mdev->device_cap_flags |= IB_DEVICE_IP_CSUM; + return 0; } @@ -1116,6 +1120,8 @@ static int __mthca_init_one(struct pci_dev *pdev, int hca_type) if (err) goto err_cmd; + mdev->ib_dev.flags = mdev->device_cap_flags; + if (mdev->fw_ver < mthca_hca_table[hca_type].latest_fw) { mthca_warn(mdev, "HCA FW version %d.%d.%03d is old (%d.%d.%03d is current).\n", (int) (mdev->fw_ver >> 32), (int) (mdev->fw_ver >> 16) & 0xffff, diff --git a/drivers/infiniband/hw/mthca/mthca_qp.c b/drivers/infiniband/hw/mthca/mthca_qp.c index 0e5461c..86aa732 100644 --- a/drivers/infiniband/hw/mthca/mthca_qp.c +++ b/drivers/infiniband/hw/mthca/mthca_qp.c @@ -2012,6 +2012,8 @@ int mthca_arbel_post_send(struct ib_qp *ibqp, struct ib_send_wr *wr, cpu_to_be32(MTHCA_NEXT_CQ_UPDATE) : 0) | ((wr->send_flags & IB_SEND_SOLICITED) ? cpu_to_be32(MTHCA_NEXT_SOLICIT) : 0) | + ((wr->send_flags & IB_SEND_IP_CSUM) ? 
+ cpu_to_be32(MTHCA_NEXT_IP_CSUM | MTHCA_NEXT_TCP_UDP_CSUM) : 0) | cpu_to_be32(1); if (wr->opcode == IB_WR_SEND_WITH_IMM || wr->opcode == IB_WR_RDMA_WRITE_WITH_IMM) diff --git a/drivers/infiniband/hw/mthca/mthca_wqe.h b/drivers/infiniband/hw/mthca/mthca_wqe.h index f6a66fe..0e3a0e4 100644 --- a/drivers/infiniband/hw/mthca/mthca_wqe.h +++ b/drivers/infiniband/hw/mthca/mthca_wqe.h @@ -38,14 +38,15 @@ #include enum { - MTHCA_NEXT_DBD = 1 << 7, - MTHCA_NEXT_FENCE = 1 << 6, - MTHCA_NEXT_CQ_UPDATE = 1 << 3, - MTHCA_NEXT_EVENT_GEN = 1 << 2, - MTHCA_NEXT_SOLICIT = 1 << 1, - - MTHCA_MLX_VL15 = 1 << 17, - MTHCA_MLX_SLR = 1 << 16 + MTHCA_NEXT_DBD = 1 << 7, + MTHCA_NEXT_FENCE = 1 << 6, + MTHCA_NEXT_CQ_UPDATE = 1 << 3, + MTHCA_NEXT_EVENT_GEN = 1 << 2, + MTHCA_NEXT_SOLICIT = 1 << 1, + MTHCA_NEXT_IP_CSUM = 1 << 4, + MTHCA_NEXT_TCP_UDP_CSUM = 1 << 5, + MTHCA_MLX_VL15 = 1 << 17, + MTHCA_MLX_SLR = 1 << 16 }; enum { -- 1.5.3.8 From eli at mellanox.co.il Wed Jan 16 08:37:57 2008 From: eli at mellanox.co.il (Eli Cohen) Date: Wed, 16 Jan 2008 18:37:57 +0200 Subject: [ofa-general] [PATCH 7/16] ib/core: Add creation flags to QPs Message-ID: <1200501477.13546.76.camel@mtls03> Add creation flags to QPs This will allow a kernel verbs consumer to create a QP and pass special flags to the hw layer. This patch also defines one such flag for LSO support. Signed-off-by: Eli Cohen --- drivers/infiniband/core/uverbs_cmd.c | 1 + include/rdma/ib_verbs.h | 5 +++++ 2 files changed, 6 insertions(+), 0 deletions(-) diff --git a/drivers/infiniband/core/uverbs_cmd.c b/drivers/infiniband/core/uverbs_cmd.c index 495c803..9e98cec 100644 --- a/drivers/infiniband/core/uverbs_cmd.c +++ b/drivers/infiniband/core/uverbs_cmd.c @@ -1065,6 +1065,7 @@ ssize_t ib_uverbs_create_qp(struct ib_uverbs_file *file, attr.srq = srq; attr.sq_sig_type = cmd.sq_sig_all ? 
IB_SIGNAL_ALL_WR : IB_SIGNAL_REQ_WR; attr.qp_type = cmd.qp_type; + attr.create_flags = 0; attr.cap.max_send_wr = cmd.max_send_wr; attr.cap.max_recv_wr = cmd.max_recv_wr; diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h index afd4d71..2498083 100644 --- a/include/rdma/ib_verbs.h +++ b/include/rdma/ib_verbs.h @@ -494,6 +494,10 @@ enum ib_qp_type { IB_QPT_RAW_ETY }; +enum qp_create_flags { + QP_CREATE_LSO = 1 << 0, +}; + struct ib_qp_init_attr { void (*event_handler)(struct ib_event *, void *); void *qp_context; @@ -504,6 +508,7 @@ struct ib_qp_init_attr { enum ib_sig_type sq_sig_type; enum ib_qp_type qp_type; u8 port_num; /* special QP types only */ + enum qp_create_flags create_flags; }; enum ib_rnr_timeout { -- 1.5.3.8 From eli at mellanox.co.il Wed Jan 16 08:38:01 2008 From: eli at mellanox.co.il (Eli Cohen) Date: Wed, 16 Jan 2008 18:38:01 +0200 Subject: [ofa-general] [PATCH 8/16] ib/core: Add core support for LSO Message-ID: <1200501481.13546.77.camel@mtls03> Add core support for LSO LSO allows SKBs with data larger than the MTU to be passed to the network driver, letting the HW fragment the packet into mss-sized segments. Signed-off-by: Eli Cohen --- include/rdma/ib_verbs.h | 11 +++++++++-- 1 files changed, 9 insertions(+), 2 deletions(-) diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h index 2498083..8412566 100644 --- a/include/rdma/ib_verbs.h +++ b/include/rdma/ib_verbs.h @@ -103,6 +103,7 @@ enum ib_device_cap_flags { * may set NETIF_F_IP_CSUM or NETIF_F_IPV6_CSUM. */ IB_DEVICE_IP_CSUM = (1<<18), + IB_DEVICE_TCP_TSO = (1<<19), }; enum ib_atomic_cap { @@ -410,6 +411,7 @@ enum ib_wc_opcode { IB_WC_COMP_SWAP, IB_WC_FETCH_ADD, IB_WC_BIND_MW, + IB_WC_LSO, /* * Set value of IB_WC_RECV so consumers can test if a completion is a * receive by testing (opcode & IB_WC_RECV). 
@@ -621,7 +623,8 @@ enum ib_wr_opcode { IB_WR_SEND_WITH_IMM, IB_WR_RDMA_READ, IB_WR_ATOMIC_CMP_AND_SWP, - IB_WR_ATOMIC_FETCH_AND_ADD + IB_WR_ATOMIC_FETCH_AND_ADD, + IB_WR_LSO }; enum ib_send_flags { @@ -629,7 +632,8 @@ enum ib_send_flags { IB_SEND_SIGNALED = (1<<1), IB_SEND_SOLICITED = (1<<2), IB_SEND_INLINE = (1<<3), - IB_SEND_IP_CSUM = (1<<4) + IB_SEND_IP_CSUM = (1<<4), + IB_SEND_UDP_LSO = (1<<5) }; struct ib_sge { @@ -659,6 +663,9 @@ struct ib_send_wr { } atomic; struct { struct ib_ah *ah; + void *header; + int hlen; + int mss; u32 remote_qpn; u32 remote_qkey; u16 pkey_index; /* valid for GSI only */ -- 1.5.3.8 From eli at mellanox.co.il Wed Jan 16 08:38:06 2008 From: eli at mellanox.co.il (Eli Cohen) Date: Wed, 16 Jan 2008 18:38:06 +0200 Subject: [ofa-general] [PATCH 9/16] ib/ipoib: Add LSO support to ipoib Message-ID: <1200501486.13546.78.camel@mtls03> Add LSO support to ipoib Signed-off-by: Eli Cohen --- drivers/infiniband/ulp/ipoib/ipoib.h | 54 ++++++++++++------- drivers/infiniband/ulp/ipoib/ipoib_cm.c | 7 ++- drivers/infiniband/ulp/ipoib/ipoib_ib.c | 80 +++++++++++++++++++++------ drivers/infiniband/ulp/ipoib/ipoib_main.c | 8 +++- drivers/infiniband/ulp/ipoib/ipoib_verbs.c | 3 +- 5 files changed, 111 insertions(+), 41 deletions(-) diff --git a/drivers/infiniband/ulp/ipoib/ipoib.h b/drivers/infiniband/ulp/ipoib/ipoib.h index f0876dc..e15884c 100644 --- a/drivers/infiniband/ulp/ipoib/ipoib.h +++ b/drivers/infiniband/ulp/ipoib/ipoib.h @@ -151,31 +151,40 @@ static inline int ipoib_dma_map_tx(struct ib_device *ca, { struct sk_buff *skb = tx_req->skb; u64 *mapping = tx_req->mapping; - int frags; int i; - - mapping[0] = ib_dma_map_single(ca, skb->data, skb_headlen(skb), - DMA_TO_DEVICE); - if (unlikely(ib_dma_mapping_error(ca, mapping[0]))) - return -EIO; - - frags = skb_shinfo(skb)->nr_frags; - for (i = 0; i < frags; ++i) { + int nfrags; + int off; + + if (skb_headlen(skb)) { + mapping[0] = ib_dma_map_single(ca, skb->data, skb_headlen(skb), + DMA_TO_DEVICE); + 
if (unlikely(ib_dma_mapping_error(ca, mapping[0]))) + return -EIO; + off = 1; + } else + off = 0; + + nfrags = skb_shinfo(skb)->nr_frags; + for (i = 0; i < nfrags; ++i) { skb_frag_t *frag = &skb_shinfo(skb)->frags[i]; - mapping[i + 1] = ib_dma_map_page(ca, frag->page, - frag->page_offset, frag->size, - DMA_TO_DEVICE); - if (unlikely(ib_dma_mapping_error(ca, mapping[i + 1]))) + mapping[i + off] = ib_dma_map_page(ca, frag->page, frag->page_offset, + frag->size, DMA_TO_DEVICE); + if (unlikely(ib_dma_mapping_error(ca, mapping[i + off]))) goto partial_error; } return 0; partial_error: - ib_dma_unmap_single(ca, mapping[0], skb_headlen(skb), DMA_TO_DEVICE); + if (skb_headlen(skb)) { + ib_dma_unmap_single(ca, mapping[0], skb_headlen(skb), DMA_TO_DEVICE); + off = 0; + } else + off = 1; for (; i > 0; --i) { skb_frag_t *frag = &skb_shinfo(skb)->frags[i - 1]; - ib_dma_unmap_page(ca, mapping[i], frag->size, DMA_TO_DEVICE); + ib_dma_unmap_page(ca, mapping[i - off], frag->size, + DMA_TO_DEVICE); } return -EIO; } @@ -185,15 +194,20 @@ static inline void ipoib_dma_unmap_tx(struct ib_device *ca, { struct sk_buff *skb = tx_req->skb; u64 *mapping = tx_req->mapping; - int frags; int i; + int nfrags; + int off; - ib_dma_unmap_single(ca, mapping[0], skb_headlen(skb), DMA_TO_DEVICE); + if (skb_headlen(skb)) { + ib_dma_unmap_single(ca, mapping[0], skb_headlen(skb), DMA_TO_DEVICE); + off = 1; + } else + off = 0; - frags = skb_shinfo(skb)->nr_frags; - for (i = 0; i < frags; ++i) { + nfrags = skb_shinfo(skb)->nr_frags; + for (i = 0; i < nfrags; ++i) { skb_frag_t *frag = &skb_shinfo(skb)->frags[i]; - ib_dma_unmap_page(ca, mapping[i + 1], frag->size, + ib_dma_unmap_page(ca, mapping[i + off], frag->size, DMA_TO_DEVICE); } } diff --git a/drivers/infiniband/ulp/ipoib/ipoib_cm.c b/drivers/infiniband/ulp/ipoib/ipoib_cm.c index 435ec58..c35b2e2 100644 --- a/drivers/infiniband/ulp/ipoib/ipoib_cm.c +++ b/drivers/infiniband/ulp/ipoib/ipoib_cm.c @@ -1235,7 +1235,7 @@ static ssize_t set_mode(struct device 
*d, struct device_attribute *attr, ipoib_warn(priv, "enabling connected mode " "will cause multicast packet drops\n"); - dev->features &= ~(NETIF_F_IP_CSUM | NETIF_F_IPV6_CSUM); + dev->features &= ~(NETIF_F_IP_CSUM | NETIF_F_IPV6_CSUM | NETIF_F_TSO); priv->tx_wr.send_flags &= ~IB_SEND_IP_CSUM; @@ -1251,6 +1251,11 @@ static ssize_t set_mode(struct device *d, struct device_attribute *attr, if (priv->ca->flags & IB_DEVICE_IP_CSUM) dev->features |= NETIF_F_IP_CSUM | NETIF_F_IPV6_CSUM; + + if (priv->dev->features & NETIF_F_SG && + priv->ca->flags & IB_DEVICE_TCP_TSO) + priv->dev->features |= NETIF_F_TSO; + return count; } diff --git a/drivers/infiniband/ulp/ipoib/ipoib_ib.c b/drivers/infiniband/ulp/ipoib/ipoib_ib.c index ca11469..cdd8486 100644 --- a/drivers/infiniband/ulp/ipoib/ipoib_ib.c +++ b/drivers/infiniband/ulp/ipoib/ipoib_ib.c @@ -38,6 +38,7 @@ #include #include #include +#include #include @@ -353,24 +354,40 @@ void ipoib_ib_completion(struct ib_cq *cq, void *dev_ptr) static inline int post_send(struct ipoib_dev_priv *priv, unsigned int wr_id, struct ib_ah *address, u32 qpn, - u64 *mapping, int headlen, - skb_frag_t *frags, - int nr_frags) + struct ipoib_tx_buf *tx_req, + void *head, int hlen) { struct ib_send_wr *bad_wr; - int i; + int i, off; + struct sk_buff *skb = tx_req->skb; + skb_frag_t *frags = skb_shinfo(skb)->frags; + int nr_frags = skb_shinfo(skb)->nr_frags; + u64 *mapping = tx_req->mapping; + + if (skb_headlen(skb)) { + priv->tx_sge[0].addr = mapping[0]; + priv->tx_sge[0].length = skb_headlen(skb); + off = 1; + } else + off = 0; - priv->tx_sge[0].addr = mapping[0]; - priv->tx_sge[0].length = headlen; for (i = 0; i < nr_frags; ++i) { - priv->tx_sge[i + 1].addr = mapping[i + 1]; - priv->tx_sge[i + 1].length = frags[i].size; + priv->tx_sge[i + off].addr = mapping[i + off]; + priv->tx_sge[i + off].length = frags[i].size; } - priv->tx_wr.num_sge = nr_frags + 1; + priv->tx_wr.num_sge = nr_frags + off; priv->tx_wr.wr_id = wr_id; priv->tx_wr.wr.ud.remote_qpn 
= qpn; priv->tx_wr.wr.ud.ah = address; + if (head) { + priv->tx_wr.wr.ud.mss = skb_shinfo(skb)->gso_size; + priv->tx_wr.wr.ud.header = head; + priv->tx_wr.wr.ud.hlen = hlen; + priv->tx_wr.opcode = IB_WR_LSO; + } else + priv->tx_wr.opcode = IB_WR_SEND; + return ib_post_send(priv->qp, &priv->tx_wr, &bad_wr); } @@ -379,14 +396,36 @@ void ipoib_send(struct net_device *dev, struct sk_buff *skb, { struct ipoib_dev_priv *priv = netdev_priv(dev); struct ipoib_tx_buf *tx_req; + int hlen; + void *phead; + + if (!skb_is_gso(skb)) { + if (unlikely(skb->len > priv->mcast_mtu + IPOIB_ENCAP_LEN)) { + ipoib_warn(priv, "packet len %d (> %d) too long to send, dropping\n", + skb->len, priv->mcast_mtu + IPOIB_ENCAP_LEN); + ++dev->stats.tx_dropped; + ++dev->stats.tx_errors; + ipoib_cm_skb_too_long(dev, skb, priv->mcast_mtu); + return; + } + phead = 0; + hlen = 0; + } else { + /* + * LSO header is limited to max 60 bytes + */ + if (unlikely((ip_hdr(skb)->ihl + tcp_hdr(skb)->doff) > 15)) { + ipoib_warn(priv, "ip(%d) and tcp(%d) headers too long, dropping skb\n", + ip_hdr(skb)->ihl << 2, tcp_hdr(skb)->doff << 2); + goto drop; + } - if (unlikely(skb->len > priv->mcast_mtu + IPOIB_ENCAP_LEN)) { - ipoib_warn(priv, "packet len %d (> %d) too long to send, dropping\n", - skb->len, priv->mcast_mtu + IPOIB_ENCAP_LEN); - ++dev->stats.tx_dropped; - ++dev->stats.tx_errors; - ipoib_cm_skb_too_long(dev, skb, priv->mcast_mtu); - return; + hlen = ((ip_hdr(skb)->ihl + tcp_hdr(skb)->doff) << 2) + IPOIB_ENCAP_LEN; + phead = skb->data; + if (unlikely(!skb_pull(skb, hlen))) { + ipoib_warn(priv, "linear data too small\n"); + goto drop; + } } ipoib_dbg_data(priv, "sending packet, length=%d address=%p qpn=0x%06x\n", @@ -415,8 +454,7 @@ void ipoib_send(struct net_device *dev, struct sk_buff *skb, if (unlikely(post_send(priv, priv->tx_head & (ipoib_sendq_size - 1), address->ah, qpn, - tx_req->mapping, skb_headlen(skb), - skb_shinfo(skb)->frags, skb_shinfo(skb)->nr_frags))) { + tx_req, phead, hlen))) { 
ipoib_warn(priv, "post_send failed\n"); ++dev->stats.tx_errors; ipoib_dma_unmap_tx(priv->ca, tx_req); @@ -432,6 +470,12 @@ void ipoib_send(struct net_device *dev, struct sk_buff *skb, netif_stop_queue(dev); } } + return; + +drop: + ++dev->stats.tx_errors; + dev_kfree_skb_any(skb); + return; } static void __ipoib_reap_ah(struct net_device *dev) diff --git a/drivers/infiniband/ulp/ipoib/ipoib_main.c b/drivers/infiniband/ulp/ipoib/ipoib_main.c index 58fc5b3..fa93eaf 100644 --- a/drivers/infiniband/ulp/ipoib/ipoib_main.c +++ b/drivers/infiniband/ulp/ipoib/ipoib_main.c @@ -713,7 +713,9 @@ static int ipoib_start_xmit(struct sk_buff *skb, struct net_device *dev) goto out; } - ipoib_send(dev, skb, neigh->ah, IPOIB_QPN(skb->dst->neighbour->ha)); + ipoib_send(dev, skb, neigh->ah, + IPOIB_QPN(skb->dst->neighbour->ha)); + goto out; } @@ -1177,6 +1179,10 @@ static struct net_device *ipoib_add_port(const char *format, goto event_failed; } + if (priv->dev->features & NETIF_F_SG && priv->ca->flags & IB_DEVICE_TCP_TSO) + priv->dev->features |= NETIF_F_TSO; + + result = register_netdev(priv->dev); if (result) { printk(KERN_WARNING "%s: couldn't register ipoib port %d; error %d\n", diff --git a/drivers/infiniband/ulp/ipoib/ipoib_verbs.c b/drivers/infiniband/ulp/ipoib/ipoib_verbs.c index a6f5f65..f2289c6 100644 --- a/drivers/infiniband/ulp/ipoib/ipoib_verbs.c +++ b/drivers/infiniband/ulp/ipoib/ipoib_verbs.c @@ -153,7 +153,8 @@ int ipoib_transport_dev_init(struct net_device *dev, struct ib_device *ca) .max_recv_sge = 1 }, .sq_sig_type = IB_SIGNAL_ALL_WR, - .qp_type = IB_QPT_UD + .qp_type = IB_QPT_UD, + .create_flags = QP_CREATE_LSO, }; int i, ret, size; -- 1.5.3.8 From eli at mellanox.co.il Wed Jan 16 08:38:09 2008 From: eli at mellanox.co.il (Eli Cohen) Date: Wed, 16 Jan 2008 18:38:09 +0200 Subject: [ofa-general] [PATCH 10/16] ib/mlx4: Add creation flags to mlx4 QPs Message-ID: <1200501489.13546.79.camel@mtls03> Add creation flags to mlx4 QPs The core passes creation flags and mlx4 
saves them for later reference. Signed-off-by: Eli Cohen --- drivers/infiniband/hw/mlx4/mlx4_ib.h | 5 +++++ drivers/infiniband/hw/mlx4/qp.c | 12 +++++++++--- 2 files changed, 14 insertions(+), 3 deletions(-) diff --git a/drivers/infiniband/hw/mlx4/mlx4_ib.h b/drivers/infiniband/hw/mlx4/mlx4_ib.h index 2869765..39bc060 100644 --- a/drivers/infiniband/hw/mlx4/mlx4_ib.h +++ b/drivers/infiniband/hw/mlx4/mlx4_ib.h @@ -110,6 +110,10 @@ struct mlx4_ib_wq { unsigned tail; }; +enum qp_flags { + MLX4_QP_LSO = 1 << 0 +}; + struct mlx4_ib_qp { struct ib_qp ibqp; struct mlx4_qp mqp; @@ -133,6 +137,7 @@ struct mlx4_ib_qp { u8 resp_depth; u8 sq_no_prefetch; u8 state; + u32 flags; }; struct mlx4_ib_srq { diff --git a/drivers/infiniband/hw/mlx4/qp.c b/drivers/infiniband/hw/mlx4/qp.c index ca7cd04..a04e931 100644 --- a/drivers/infiniband/hw/mlx4/qp.c +++ b/drivers/infiniband/hw/mlx4/qp.c @@ -238,9 +238,12 @@ static int set_rq_size(struct mlx4_ib_dev *dev, struct ib_qp_cap *cap, return 0; } -static int set_kernel_sq_size(struct mlx4_ib_dev *dev, struct ib_qp_cap *cap, - enum ib_qp_type type, struct mlx4_ib_qp *qp) +static int set_kernel_sq_size(struct mlx4_ib_dev *dev, struct ib_qp_init_attr *init_attr, + struct mlx4_ib_qp *qp) { + struct ib_qp_cap *cap = &init_attr->cap; + enum ib_qp_type type = init_attr->qp_type; + /* Sanity check SQ size before proceeding */ if (cap->max_send_wr > dev->dev->caps.max_wqes || cap->max_send_sge > dev->dev->caps.max_sq_sg || @@ -256,6 +259,9 @@ static int set_kernel_sq_size(struct mlx4_ib_dev *dev, struct ib_qp_cap *cap, cap->max_send_sge + 2 > dev->dev->caps.max_sq_sg) return -EINVAL; + if (init_attr->create_flags & QP_CREATE_LSO) + qp->flags |= MLX4_QP_LSO; + qp->sq.wqe_shift = ilog2(roundup_pow_of_two(max(cap->max_send_sge * sizeof (struct mlx4_wqe_data_seg), cap->max_inline_data + @@ -371,7 +377,7 @@ static int create_qp_common(struct mlx4_ib_dev *dev, struct ib_pd *pd, } else { qp->sq_no_prefetch = 0; - err = set_kernel_sq_size(dev, 
&init_attr->cap, init_attr->qp_type, qp); + err = set_kernel_sq_size(dev, init_attr, qp); if (err) goto err; -- 1.5.3.8 From eli at mellanox.co.il Wed Jan 16 08:38:15 2008 From: eli at mellanox.co.il (Eli Cohen) Date: Wed, 16 Jan 2008 18:38:15 +0200 Subject: [ofa-general] [PATCH 11/16] ib/ipoib: Add LSO support to mlx4 Message-ID: <1200501495.13546.80.camel@mtls03> Add LSO support to mlx4 Signed-off-by: Eli Cohen --- drivers/infiniband/hw/mlx4/cq.c | 3 ++ drivers/infiniband/hw/mlx4/main.c | 4 +++ drivers/infiniband/hw/mlx4/qp.c | 52 +++++++++++++++++++++++++++++++++--- drivers/net/mlx4/fw.c | 9 ++++++ drivers/net/mlx4/fw.h | 1 + drivers/net/mlx4/main.c | 1 + include/linux/mlx4/device.h | 1 + include/linux/mlx4/qp.h | 5 +++ 8 files changed, 71 insertions(+), 5 deletions(-) diff --git a/drivers/infiniband/hw/mlx4/cq.c b/drivers/infiniband/hw/mlx4/cq.c index 8b49895..7258c88 100644 --- a/drivers/infiniband/hw/mlx4/cq.c +++ b/drivers/infiniband/hw/mlx4/cq.c @@ -403,6 +403,9 @@ static int mlx4_ib_poll_one(struct mlx4_ib_cq *cq, case MLX4_OPCODE_BIND_MW: wc->opcode = IB_WC_BIND_MW; break; + case MLX4_OPCODE_LSO: + wc->opcode = IB_WC_LSO; + break; } } else { wc->byte_len = be32_to_cpu(cqe->byte_cnt); diff --git a/drivers/infiniband/hw/mlx4/main.c b/drivers/infiniband/hw/mlx4/main.c index 8ce94a1..2dd0de3 100644 --- a/drivers/infiniband/hw/mlx4/main.c +++ b/drivers/infiniband/hw/mlx4/main.c @@ -101,6 +101,8 @@ static int mlx4_ib_query_device(struct ib_device *ibdev, props->device_cap_flags |= IB_DEVICE_UD_AV_PORT_ENFORCE; if (dev->dev->caps.flags & MLX4_DEV_CAP_FLAG_IPOIB_CSUM) props->device_cap_flags |= IB_DEVICE_IP_CSUM; + if (dev->dev->caps.max_gso_sz) + props->device_cap_flags |= IB_DEVICE_TCP_TSO; props->vendor_id = be32_to_cpup((__be32 *) (out_mad->data + 36)) & 0xffffff; @@ -616,6 +618,8 @@ static void *mlx4_ib_add(struct mlx4_dev *dev) if (ibdev->dev->caps.flags & MLX4_DEV_CAP_FLAG_IPOIB_CSUM) ibdev->ib_dev.flags |= IB_DEVICE_IP_CSUM; + if 
(ibdev->dev->caps.max_gso_sz) + ibdev->ib_dev.flags |= IB_DEVICE_TCP_TSO; if (init_node_data(ibdev)) goto err_map; diff --git a/drivers/infiniband/hw/mlx4/qp.c b/drivers/infiniband/hw/mlx4/qp.c index a04e931..fc4811c 100644 --- a/drivers/infiniband/hw/mlx4/qp.c +++ b/drivers/infiniband/hw/mlx4/qp.c @@ -69,6 +69,7 @@ enum { static const __be32 mlx4_ib_opcode[] = { [IB_WR_SEND] = __constant_cpu_to_be32(MLX4_OPCODE_SEND), + [IB_WR_LSO] = __constant_cpu_to_be32(MLX4_OPCODE_LSO), [IB_WR_SEND_WITH_IMM] = __constant_cpu_to_be32(MLX4_OPCODE_SEND_IMM), [IB_WR_RDMA_WRITE] = __constant_cpu_to_be32(MLX4_OPCODE_RDMA_WRITE), [IB_WR_RDMA_WRITE_WITH_IMM] = __constant_cpu_to_be32(MLX4_OPCODE_RDMA_WRITE_IMM), @@ -243,6 +244,7 @@ static int set_kernel_sq_size(struct mlx4_ib_dev *dev, struct ib_qp_init_attr *i { struct ib_qp_cap *cap = &init_attr->cap; enum ib_qp_type type = init_attr->qp_type; + int reserve = 0; /* Sanity check SQ size before proceeding */ if (cap->max_send_wr > dev->dev->caps.max_wqes || @@ -259,15 +261,18 @@ static int set_kernel_sq_size(struct mlx4_ib_dev *dev, struct ib_qp_init_attr *i cap->max_send_sge + 2 > dev->dev->caps.max_sq_sg) return -EINVAL; - if (init_attr->create_flags & QP_CREATE_LSO) + if (init_attr->create_flags & QP_CREATE_LSO) { qp->flags |= MLX4_QP_LSO; + reserve = 64; + } qp->sq.wqe_shift = ilog2(roundup_pow_of_two(max(cap->max_send_sge * - sizeof (struct mlx4_wqe_data_seg), + sizeof (struct mlx4_wqe_data_seg) + + reserve, cap->max_inline_data + sizeof (struct mlx4_wqe_inline_seg)) + send_wqe_overhead(type))); - qp->sq.max_gs = ((1 << qp->sq.wqe_shift) - send_wqe_overhead(type)) / + qp->sq.max_gs = ((1 << qp->sq.wqe_shift) - reserve - send_wqe_overhead(type)) / sizeof (struct mlx4_wqe_data_seg); /* @@ -755,9 +760,11 @@ static int __mlx4_ib_modify_qp(struct ib_qp *ibqp, } } - if (ibqp->qp_type == IB_QPT_GSI || ibqp->qp_type == IB_QPT_SMI || - ibqp->qp_type == IB_QPT_UD) + if (ibqp->qp_type == IB_QPT_GSI || ibqp->qp_type == IB_QPT_SMI) 
context->mtu_msgmax = (IB_MTU_4096 << 5) | 11; + else if (ibqp->qp_type == IB_QPT_UD) + context->mtu_msgmax = (IB_MTU_4096 << 5) | + ilog2(dev->dev->caps.max_gso_sz); else if (attr_mask & IB_QP_PATH_MTU) { if (attr->path_mtu < IB_MTU_256 || attr->path_mtu > IB_MTU_4096) { printk(KERN_ERR "path MTU (%u) is invalid\n", @@ -1274,6 +1281,28 @@ static void __set_data_seg(struct mlx4_wqe_data_seg *dseg, struct ib_sge *sg) dseg->addr = cpu_to_be64(sg->addr); } +static int build_lso_seg(struct mlx4_lso_seg *wqe, struct ib_send_wr *wr, + struct mlx4_ib_qp *qp, int *lso_seg_len) +{ + int halign; + + halign = ALIGN(wr->wr.ud.hlen, 16); + if (unlikely(!(qp->flags & MLX4_QP_LSO) && wr->num_sge > qp->sq.max_gs - (halign >> 4))) + return -EINVAL; + + memcpy(wqe->header, wr->wr.ud.header, wr->wr.ud.hlen); + + /* make sure LSO header is written before + overwriting stamping */ + wmb(); + + wqe->mss_hdr_size = cpu_to_be32(((wr->wr.ud.mss - wr->wr.ud.hlen) + << 16) | wr->wr.ud.hlen); + + *lso_seg_len = halign; + return 0; +} + int mlx4_ib_post_send(struct ib_qp *ibqp, struct ib_send_wr *wr, struct ib_send_wr **bad_wr) { @@ -1364,6 +1393,19 @@ int mlx4_ib_post_send(struct ib_qp *ibqp, struct ib_send_wr *wr, set_datagram_seg(wqe, wr); wqe += sizeof (struct mlx4_wqe_datagram_seg); size += sizeof (struct mlx4_wqe_datagram_seg) / 16; + + if (wr->opcode == IB_WR_LSO) { + int hlen; + + err = build_lso_seg(wqe, wr, qp, &hlen); + if (err) { + *bad_wr = wr; + goto out; + } + wqe += hlen; + size += hlen >> 4; + } + break; case IB_QPT_SMI: diff --git a/drivers/net/mlx4/fw.c b/drivers/net/mlx4/fw.c index d6c2851..d243db5 100644 --- a/drivers/net/mlx4/fw.c +++ b/drivers/net/mlx4/fw.c @@ -133,6 +133,7 @@ int mlx4_QUERY_DEV_CAP(struct mlx4_dev *dev, struct mlx4_dev_cap *dev_cap) #define QUERY_DEV_CAP_MAX_AV_OFFSET 0x27 #define QUERY_DEV_CAP_MAX_REQ_QP_OFFSET 0x29 #define QUERY_DEV_CAP_MAX_RES_QP_OFFSET 0x2b +#define QUERY_DEV_CAP_MAX_GSO_OFFSET 0x2d #define QUERY_DEV_CAP_MAX_RDMA_OFFSET 0x2f #define 
QUERY_DEV_CAP_RSZ_SRQ_OFFSET 0x33 #define QUERY_DEV_CAP_ACK_DELAY_OFFSET 0x35 @@ -215,6 +216,13 @@ int mlx4_QUERY_DEV_CAP(struct mlx4_dev *dev, struct mlx4_dev_cap *dev_cap) dev_cap->max_requester_per_qp = 1 << (field & 0x3f); MLX4_GET(field, outbox, QUERY_DEV_CAP_MAX_RES_QP_OFFSET); dev_cap->max_responder_per_qp = 1 << (field & 0x3f); + MLX4_GET(field, outbox, QUERY_DEV_CAP_MAX_GSO_OFFSET); + field &= 0x1f; + if (!field) + dev_cap->max_gso_sz = 0; + else + dev_cap->max_gso_sz = 1 << field; + MLX4_GET(field, outbox, QUERY_DEV_CAP_MAX_RDMA_OFFSET); dev_cap->max_rdma_global = 1 << (field & 0x3f); MLX4_GET(field, outbox, QUERY_DEV_CAP_ACK_DELAY_OFFSET); @@ -377,6 +385,7 @@ int mlx4_QUERY_DEV_CAP(struct mlx4_dev *dev, struct mlx4_dev_cap *dev_cap) dev_cap->max_sq_desc_sz, dev_cap->max_sq_sg); mlx4_dbg(dev, "Max RQ desc size: %d, max RQ S/G: %d\n", dev_cap->max_rq_desc_sz, dev_cap->max_rq_sg); + mlx4_dbg(dev, "Max GSO size: %d\n", dev_cap->max_gso_sz); dump_dev_cap_flags(dev, dev_cap->flags); diff --git a/drivers/net/mlx4/fw.h b/drivers/net/mlx4/fw.h index 7e1dd9e..ad5abf3 100644 --- a/drivers/net/mlx4/fw.h +++ b/drivers/net/mlx4/fw.h @@ -96,6 +96,7 @@ struct mlx4_dev_cap { u8 bmme_flags; u32 reserved_lkey; u64 max_icm_sz; + int max_gso_sz; }; struct mlx4_adapter { diff --git a/drivers/net/mlx4/main.c b/drivers/net/mlx4/main.c index 89b3f0b..ed2c648 100644 --- a/drivers/net/mlx4/main.c +++ b/drivers/net/mlx4/main.c @@ -159,6 +159,7 @@ static int mlx4_dev_cap(struct mlx4_dev *dev, struct mlx4_dev_cap *dev_cap) dev->caps.page_size_cap = ~(u32) (dev_cap->min_page_sz - 1); dev->caps.flags = dev_cap->flags; dev->caps.stat_rate_support = dev_cap->stat_rate_support; + dev->caps.max_gso_sz = dev_cap->max_gso_sz; return 0; } diff --git a/include/linux/mlx4/device.h b/include/linux/mlx4/device.h index 222815d..856570f 100644 --- a/include/linux/mlx4/device.h +++ b/include/linux/mlx4/device.h @@ -181,6 +181,7 @@ struct mlx4_caps { u32 flags; u16 stat_rate_support; u8 
port_width_cap[MLX4_MAX_PORTS + 1]; + int max_gso_sz; }; struct mlx4_buf_list { diff --git a/include/linux/mlx4/qp.h b/include/linux/mlx4/qp.h index b4eb921..0bac8e8 100644 --- a/include/linux/mlx4/qp.h +++ b/include/linux/mlx4/qp.h @@ -215,6 +215,11 @@ struct mlx4_wqe_datagram_seg { __be32 reservd[2]; }; +struct mlx4_lso_seg { + __be32 mss_hdr_size; + __be32 header[0]; +}; + struct mlx4_wqe_bind_seg { __be32 flags1; __be32 flags2; -- 1.5.3.8 From eli at mellanox.co.il Wed Jan 16 08:38:20 2008 From: eli at mellanox.co.il (Eli Cohen) Date: Wed, 16 Jan 2008 18:38:20 +0200 Subject: [ofa-general] [PATCH 12/16] ib/ipoib: Add ethtool support to IPOIB Message-ID: <1200501500.13546.81.camel@mtls03> Add ethtool support to IPOIB Just add the infrastructure to add functionality later. Signed-off-by: Eli Cohen --- drivers/infiniband/ulp/ipoib/Makefile | 3 +- drivers/infiniband/ulp/ipoib/ipoib.h | 2 + drivers/infiniband/ulp/ipoib/ipoib_etool.c | 55 ++++++++++++++++++++++++++++ drivers/infiniband/ulp/ipoib/ipoib_main.c | 2 + 4 files changed, 61 insertions(+), 1 deletions(-) create mode 100644 drivers/infiniband/ulp/ipoib/ipoib_etool.c diff --git a/drivers/infiniband/ulp/ipoib/Makefile b/drivers/infiniband/ulp/ipoib/Makefile index 98ee38e..83488ee 100644 --- a/drivers/infiniband/ulp/ipoib/Makefile +++ b/drivers/infiniband/ulp/ipoib/Makefile @@ -4,7 +4,8 @@ ib_ipoib-y := ipoib_main.o \ ipoib_ib.o \ ipoib_multicast.o \ ipoib_verbs.o \ - ipoib_vlan.o + ipoib_vlan.o \ + ipoib_etool.o ib_ipoib-$(CONFIG_INFINIBAND_IPOIB_CM) += ipoib_cm.o ib_ipoib-$(CONFIG_INFINIBAND_IPOIB_DEBUG) += ipoib_fs.o diff --git a/drivers/infiniband/ulp/ipoib/ipoib.h b/drivers/infiniband/ulp/ipoib/ipoib.h index e15884c..6783936 100644 --- a/drivers/infiniband/ulp/ipoib/ipoib.h +++ b/drivers/infiniband/ulp/ipoib/ipoib.h @@ -520,6 +520,8 @@ void ipoib_pkey_poll(struct work_struct *work); int ipoib_pkey_dev_delay_open(struct net_device *dev); void ipoib_drain_cq(struct net_device *dev); +void 
ipoib_set_ethtool_ops(struct net_device *dev); + #ifdef CONFIG_INFINIBAND_IPOIB_CM #define IPOIB_FLAGS_RC 0x80 diff --git a/drivers/infiniband/ulp/ipoib/ipoib_etool.c b/drivers/infiniband/ulp/ipoib/ipoib_etool.c new file mode 100644 index 0000000..913aea0 --- /dev/null +++ b/drivers/infiniband/ulp/ipoib/ipoib_etool.c @@ -0,0 +1,55 @@ +/* + * Copyright (c) 2007 Mellanox Technologies. All rights reserved. + * + * This software is available to you under a choice of one of two + * licenses. You may choose to be licensed under the terms of the GNU + * General Public License (GPL) Version 2, available from the file + * COPYING in the main directory of this source tree, or the + * OpenIB.org BSD license below: + * + * Redistribution and use in source and binary forms, with or + * without modification, are permitted provided that the following + * conditions are met: + * + * - Redistributions of source code must retain the above + * copyright notice, this list of conditions and the following + * disclaimer. + * + * - Redistributions in binary form must reproduce the above + * copyright notice, this list of conditions and the following + * disclaimer in the documentation and/or other materials + * provided with the distribution. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, + * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF + * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND + * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS + * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN + * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN + * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE + * SOFTWARE. 
+ * + * $Id: ipoib_etool.c $ + */ + +#include +#include +#include + +#include "ipoib.h" + +static void ipoib_get_drvinfo(struct net_device *netdev, + struct ethtool_drvinfo *drvinfo) +{ + strncpy(drvinfo->driver, "ipoib", sizeof(drvinfo->driver) - 1); +} + +static const struct ethtool_ops ipoib_ethtool_ops = { + .get_drvinfo = ipoib_get_drvinfo, + .get_tso = ethtool_op_get_tso, +}; + +void ipoib_set_ethtool_ops(struct net_device *dev) +{ + SET_ETHTOOL_OPS(dev, &ipoib_ethtool_ops); +} diff --git a/drivers/infiniband/ulp/ipoib/ipoib_main.c b/drivers/infiniband/ulp/ipoib/ipoib_main.c index fa93eaf..0192ed5 100644 --- a/drivers/infiniband/ulp/ipoib/ipoib_main.c +++ b/drivers/infiniband/ulp/ipoib/ipoib_main.c @@ -961,6 +961,8 @@ static void ipoib_setup(struct net_device *dev) dev->set_multicast_list = ipoib_set_mcast_list; dev->neigh_setup = ipoib_neigh_setup_dev; + ipoib_set_ethtool_ops(dev); + netif_napi_add(dev, &priv->napi, ipoib_poll, 100); dev->watchdog_timeo = HZ; -- 1.5.3.8 From eli at mellanox.co.il Wed Jan 16 08:38:24 2008 From: eli at mellanox.co.il (Eli Cohen) Date: Wed, 16 Jan 2008 18:38:24 +0200 Subject: [ofa-general] [PATCH 13/16] ib/core: Add support for modify CQ Message-ID: <1200501504.13546.82.camel@mtls03> Add support for modify CQ Add support for modifying CQ parameters for controlling event generation moderation. Signed-off-by: Eli Cohen --- drivers/infiniband/core/verbs.c | 7 +++++++ include/rdma/ib_verbs.h | 12 ++++++++++++ 2 files changed, 19 insertions(+), 0 deletions(-) diff --git a/drivers/infiniband/core/verbs.c b/drivers/infiniband/core/verbs.c index 86ed8af..84709ed 100644 --- a/drivers/infiniband/core/verbs.c +++ b/drivers/infiniband/core/verbs.c @@ -628,6 +628,13 @@ struct ib_cq *ib_create_cq(struct ib_device *device, } EXPORT_SYMBOL(ib_create_cq); +int ib_modify_cq(struct ib_cq *cq, u16 cq_count, u16 cq_period) +{ + return cq->device->modify_cq ? 
+ cq->device->modify_cq(cq, cq_count, cq_period) : -ENOSYS; +} +EXPORT_SYMBOL(ib_modify_cq); + int ib_destroy_cq(struct ib_cq *cq) { if (atomic_read(&cq->usecnt)) diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h index 8412566..fa6e32e 100644 --- a/include/rdma/ib_verbs.h +++ b/include/rdma/ib_verbs.h @@ -984,6 +984,8 @@ struct ib_device { int comp_vector, struct ib_ucontext *context, struct ib_udata *udata); + int (*modify_cq)(struct ib_cq *cq, u16 cq_count, + u16 cq_period); int (*destroy_cq)(struct ib_cq *cq); int (*resize_cq)(struct ib_cq *cq, int cqe, struct ib_udata *udata); @@ -1389,6 +1391,16 @@ struct ib_cq *ib_create_cq(struct ib_device *device, int ib_resize_cq(struct ib_cq *cq, int cqe); /** + * ib_modify_cq - Modifies moderation params of the CQ + * @cq: The CQ to modify. + * @cq_count: number of CQEs that will trigger an event + * @cq_period: max period of time before triggering an event + * + * Users can examine the cq structure to determine the actual CQ size. + */ +int ib_modify_cq(struct ib_cq *cq, u16 cq_count, u16 cq_period); + +/** * ib_destroy_cq - Destroys the specified CQ. * @cq: The CQ to destroy. */ -- 1.5.3.8 From eli at mellanox.co.il Wed Jan 16 08:38:28 2008 From: eli at mellanox.co.il (Eli Cohen) Date: Wed, 16 Jan 2008 18:38:28 +0200 Subject: [ofa-general] [PATCH 14/16] ib/ipoib: Support modifying IPOIB CQ moderation params Message-ID: <1200501508.13546.83.camel@mtls03> Support modifying IPOIB CQ moderation params This can be used to tune at run time the parameters controlling the event (interrupt) generation rate and thus reduce the overhead incurred by handling interrupts, resulting in better throughput. 
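As a rough sketch, and not part of this patch: ib_modify_cq() takes the moderation count and period as 16-bit values, while ethtool coalescing fields are 32 bits wide, so a handler presumably has to reject out-of-range requests rather than truncate them. The struct and function names below (coalesce_request, cq_moderation, coalesce_to_moderation) are made up for illustration and do not exist in the kernel.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical stand-in for the 32-bit fields ethtool hands to a driver. */
struct coalesce_request {
	uint32_t usecs;      /* max time to wait before raising an event */
	uint32_t max_frames; /* completions to accumulate before an event */
};

/* Hypothetical stand-in for the 16-bit values a CQ moderation call takes. */
struct cq_moderation {
	uint16_t cq_period;
	uint16_t cq_count;
};

/*
 * Reject values that do not fit in 16 bits instead of silently
 * truncating them, then narrow the request.
 * Returns 0 on success, -EINVAL if either value is out of range.
 */
static int coalesce_to_moderation(const struct coalesce_request *req,
				  struct cq_moderation *mod)
{
	if (req->usecs > 0xffff || req->max_frames > 0xffff)
		return -EINVAL;

	mod->cq_period = (uint16_t)req->usecs;
	mod->cq_count = (uint16_t)req->max_frames;
	return 0;
}
```

A caller would fill a coalesce_request from the user's settings, and pass the resulting cq_count and cq_period on to the device only when the function returns 0.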
Signed-off-by: Eli Cohen --- drivers/infiniband/ulp/ipoib/ipoib.h | 6 ++++ drivers/infiniband/ulp/ipoib/ipoib_etool.c | 42 ++++++++++++++++++++++++++++ 2 files changed, 48 insertions(+), 0 deletions(-) diff --git a/drivers/infiniband/ulp/ipoib/ipoib.h b/drivers/infiniband/ulp/ipoib/ipoib.h index 6783936..b22b0c7 100644 --- a/drivers/infiniband/ulp/ipoib/ipoib.h +++ b/drivers/infiniband/ulp/ipoib/ipoib.h @@ -303,6 +303,11 @@ struct ipoib_cm_dev_priv { struct ib_recv_wr rx_wr; }; +struct ipoib_ethtool_st { + u16 coalesce_usecs; + u16 max_coalesced_frames; +}; + /* * Device private locking: tx_lock protects members used in TX fast * path (and we use LLTX so upper layers don't do extra locking). @@ -380,6 +385,7 @@ struct ipoib_dev_priv { struct dentry *mcg_dentry; struct dentry *path_dentry; #endif + struct ipoib_ethtool_st etool; }; struct ipoib_ah { diff --git a/drivers/infiniband/ulp/ipoib/ipoib_etool.c b/drivers/infiniband/ulp/ipoib/ipoib_etool.c index 913aea0..513140c 100644 --- a/drivers/infiniband/ulp/ipoib/ipoib_etool.c +++ b/drivers/infiniband/ulp/ipoib/ipoib_etool.c @@ -44,9 +44,51 @@ static void ipoib_get_drvinfo(struct net_device *netdev, strncpy(drvinfo->driver, "ipoib", sizeof(drvinfo->driver) - 1); } +static int ipoib_get_coalesce(struct net_device *dev, + struct ethtool_coalesce *coal) +{ + struct ipoib_dev_priv *priv = netdev_priv(dev); + + coal->rx_coalesce_usecs = priv->etool.coalesce_usecs; + coal->tx_coalesce_usecs = priv->etool.coalesce_usecs; + coal->rx_max_coalesced_frames = priv->etool.max_coalesced_frames; + coal->tx_max_coalesced_frames = priv->etool.max_coalesced_frames; + + return 0; +} + +static int ipoib_set_coalesce(struct net_device *dev, + struct ethtool_coalesce *coal) +{ + struct ipoib_dev_priv *priv = netdev_priv(dev); + int ret; + + if (coal->rx_coalesce_usecs > 0xffff || + coal->tx_coalesce_usecs > 0xffff || + coal->rx_max_coalesced_frames > 0xffff || + coal->tx_max_coalesced_frames > 0xffff) + return -EINVAL; + + ret = 
ib_modify_cq(priv->cq, coal->rx_max_coalesced_frames,
+			   coal->rx_coalesce_usecs);
+	if (ret) {
+		ipoib_dbg(priv, "failed modifying CQ\n");
+		return ret;
+	}
+
+	coal->tx_coalesce_usecs = coal->rx_coalesce_usecs;
+	priv->etool.coalesce_usecs = coal->rx_coalesce_usecs;
+	coal->tx_max_coalesced_frames = coal->rx_max_coalesced_frames;
+	priv->etool.max_coalesced_frames = coal->rx_max_coalesced_frames;
+
+	return 0;
+}
+
 static const struct ethtool_ops ipoib_ethtool_ops = {
 	.get_drvinfo = ipoib_get_drvinfo,
 	.get_tso = ethtool_op_get_tso,
+	.get_coalesce = ipoib_get_coalesce,
+	.set_coalesce = ipoib_set_coalesce,
 };
 
 void ipoib_set_ethtool_ops(struct net_device *dev)
-- 
1.5.3.8

From eli at mellanox.co.il Wed Jan 16 08:38:32 2008
From: eli at mellanox.co.il (Eli Cohen)
Date: Wed, 16 Jan 2008 18:38:32 +0200
Subject: [ofa-general] [PATCH 15/16] ib/mlx4: mlx4 add support for modifying CQ parameters
Message-ID: <1200501512.13546.84.camel@mtls03>

mlx4 add support for modifying CQ parameters.

Signed-off-by: Eli Cohen
---
 drivers/infiniband/hw/mlx4/cq.c      |   19 +++++++++++++
 drivers/infiniband/hw/mlx4/main.c    |    1 +
 drivers/infiniband/hw/mlx4/mlx4_ib.h |    1 +
 drivers/net/mlx4/cq.c                |   49 ++++++++++++++++++----------------
 include/linux/mlx4/cmd.h             |    2 +-
 include/linux/mlx4/cq.h              |   25 +++++++++++++++++
 6 files changed, 73 insertions(+), 24 deletions(-)

diff --git a/drivers/infiniband/hw/mlx4/cq.c b/drivers/infiniband/hw/mlx4/cq.c
index 7258c88..15e5e2b 100644
--- a/drivers/infiniband/hw/mlx4/cq.c
+++ b/drivers/infiniband/hw/mlx4/cq.c
@@ -91,6 +91,25 @@ static struct mlx4_cqe *next_cqe_sw(struct mlx4_ib_cq *cq)
 	return get_sw_cqe(cq, cq->mcq.cons_index);
 }
 
+int mlx4_ib_modify_cq(struct ib_cq *cq, u16 cq_count, u16 cq_period)
+{
+	struct mlx4_ib_cq *mcq = to_mcq(cq);
+	struct mlx4_ib_dev *dev = to_mdev(cq->device);
+	struct mlx4_cq_context *context;
+	int err;
+
+	context = kzalloc(sizeof *context, GFP_KERNEL);
+	if (!context)
+		return -ENOMEM;
+
+	context->cq_period = cpu_to_be16(cq_period);
+ context->cq_max_count = cpu_to_be16(cq_count); + err = mlx4_cq_modify(dev->dev, &mcq->mcq, context, 1); + + kfree(context); + return err; +} + struct ib_cq *mlx4_ib_create_cq(struct ib_device *ibdev, int entries, int vector, struct ib_ucontext *context, struct ib_udata *udata) diff --git a/drivers/infiniband/hw/mlx4/main.c b/drivers/infiniband/hw/mlx4/main.c index 2dd0de3..6b00a81 100644 --- a/drivers/infiniband/hw/mlx4/main.c +++ b/drivers/infiniband/hw/mlx4/main.c @@ -601,6 +601,7 @@ static void *mlx4_ib_add(struct mlx4_dev *dev) ibdev->ib_dev.post_send = mlx4_ib_post_send; ibdev->ib_dev.post_recv = mlx4_ib_post_recv; ibdev->ib_dev.create_cq = mlx4_ib_create_cq; + ibdev->ib_dev.modify_cq = mlx4_ib_modify_cq; ibdev->ib_dev.destroy_cq = mlx4_ib_destroy_cq; ibdev->ib_dev.poll_cq = mlx4_ib_poll_cq; ibdev->ib_dev.req_notify_cq = mlx4_ib_arm_cq; diff --git a/drivers/infiniband/hw/mlx4/mlx4_ib.h b/drivers/infiniband/hw/mlx4/mlx4_ib.h index 39bc060..211eb56 100644 --- a/drivers/infiniband/hw/mlx4/mlx4_ib.h +++ b/drivers/infiniband/hw/mlx4/mlx4_ib.h @@ -252,6 +252,7 @@ struct ib_mr *mlx4_ib_reg_user_mr(struct ib_pd *pd, u64 start, u64 length, struct ib_udata *udata); int mlx4_ib_dereg_mr(struct ib_mr *mr); +int mlx4_ib_modify_cq(struct ib_cq *cq, u16 cq_count, u16 cq_period); struct ib_cq *mlx4_ib_create_cq(struct ib_device *ibdev, int entries, int vector, struct ib_ucontext *context, struct ib_udata *udata); diff --git a/drivers/net/mlx4/cq.c b/drivers/net/mlx4/cq.c index d4441fe..39004c5 100644 --- a/drivers/net/mlx4/cq.c +++ b/drivers/net/mlx4/cq.c @@ -38,33 +38,11 @@ #include #include +#include #include "mlx4.h" #include "icm.h" -struct mlx4_cq_context { - __be32 flags; - u16 reserved1[3]; - __be16 page_offset; - __be32 logsize_usrpage; - u8 reserved2; - u8 cq_period; - u8 reserved3; - u8 cq_max_count; - u8 reserved4[3]; - u8 comp_eqn; - u8 log_page_size; - u8 reserved5[2]; - u8 mtt_base_addr_h; - __be32 mtt_base_addr_l; - __be32 last_notified_index; - __be32 
solicit_producer_index; - __be32 consumer_index; - __be32 producer_index; - u32 reserved6[2]; - __be64 db_rec_addr; -}; - #define MLX4_CQ_STATUS_OK ( 0 << 28) #define MLX4_CQ_STATUS_OVERFLOW ( 9 << 28) #define MLX4_CQ_STATUS_WRITE_FAIL (10 << 28) @@ -121,6 +99,13 @@ static int mlx4_SW2HW_CQ(struct mlx4_dev *dev, struct mlx4_cmd_mailbox *mailbox, MLX4_CMD_TIME_CLASS_A); } +static int mlx4_MODIFY_CQ(struct mlx4_dev *dev, struct mlx4_cmd_mailbox *mailbox, + int cq_num, u32 opmod) +{ + return mlx4_cmd(dev, mailbox->dma, cq_num, opmod, MLX4_CMD_MODIFY_CQ, + MLX4_CMD_TIME_CLASS_A); +} + static int mlx4_HW2SW_CQ(struct mlx4_dev *dev, struct mlx4_cmd_mailbox *mailbox, int cq_num) { @@ -206,6 +191,24 @@ err_out: } EXPORT_SYMBOL_GPL(mlx4_cq_alloc); +int mlx4_cq_modify(struct mlx4_dev *dev, struct mlx4_cq *cq, + struct mlx4_cq_context *context, int modify) +{ + struct mlx4_cmd_mailbox *mailbox; + int err; + + mailbox = mlx4_alloc_cmd_mailbox(dev); + if (IS_ERR(mailbox)) + return PTR_ERR(mailbox); + + memcpy(mailbox->buf, context, sizeof *context); + err = mlx4_MODIFY_CQ(dev, mailbox, cq->cqn, modify); + + mlx4_free_cmd_mailbox(dev, mailbox); + return err; +} +EXPORT_SYMBOL_GPL(mlx4_cq_modify); + void mlx4_cq_free(struct mlx4_dev *dev, struct mlx4_cq *cq) { struct mlx4_priv *priv = mlx4_priv(dev); diff --git a/include/linux/mlx4/cmd.h b/include/linux/mlx4/cmd.h index 7d1eaa9..77323a7 100644 --- a/include/linux/mlx4/cmd.h +++ b/include/linux/mlx4/cmd.h @@ -81,7 +81,7 @@ enum { MLX4_CMD_SW2HW_CQ = 0x16, MLX4_CMD_HW2SW_CQ = 0x17, MLX4_CMD_QUERY_CQ = 0x18, - MLX4_CMD_RESIZE_CQ = 0x2c, + MLX4_CMD_MODIFY_CQ = 0x2c, /* SRQ commands */ MLX4_CMD_SW2HW_SRQ = 0x35, diff --git a/include/linux/mlx4/cq.h b/include/linux/mlx4/cq.h index 5fdc859..5d8625e 100644 --- a/include/linux/mlx4/cq.h +++ b/include/linux/mlx4/cq.h @@ -38,6 +38,27 @@ #include #include +struct mlx4_cq_context { + __be32 flags; + u16 reserved1[3]; + __be16 page_offset; + __be32 logsize_usrpage; + u16 cq_period; + u16 
cq_max_count; + u8 reserved4[3]; + u8 comp_eqn; + u8 log_page_size; + u8 reserved5[2]; + u8 mtt_base_addr_h; + __be32 mtt_base_addr_l; + __be32 last_notified_index; + __be32 solicit_producer_index; + __be32 consumer_index; + __be32 producer_index; + u32 reserved6[2]; + __be64 db_rec_addr; +}; + struct mlx4_cqe { __be32 my_qpn; __be32 immed_rss_invalid; @@ -120,4 +141,8 @@ enum { MLX4_CQ_DB_REQ_NOT = 2 << 24 }; + +int mlx4_cq_modify(struct mlx4_dev *dev, struct mlx4_cq *cq, + struct mlx4_cq_context *context, int resize); + #endif /* MLX4_CQ_H */ -- 1.5.3.8 From eli at mellanox.co.il Wed Jan 16 08:38:37 2008 From: eli at mellanox.co.il (Eli Cohen) Date: Wed, 16 Jan 2008 18:38:37 +0200 Subject: [ofa-general] [PATCH 16/16] ib/ipoib: Set default CQ moderation parameters Message-ID: <1200501517.13546.85.camel@mtls03> Set default CQ moderation parameters Set the default params to make sure they are applied. This params give better performance. Signed-off-by: Eli Cohen --- drivers/infiniband/ulp/ipoib/ipoib_verbs.c | 12 ++++++++++++ 1 files changed, 12 insertions(+), 0 deletions(-) diff --git a/drivers/infiniband/ulp/ipoib/ipoib_verbs.c b/drivers/infiniband/ulp/ipoib/ipoib_verbs.c index f2289c6..f7defe9 100644 --- a/drivers/infiniband/ulp/ipoib/ipoib_verbs.c +++ b/drivers/infiniband/ulp/ipoib/ipoib_verbs.c @@ -34,6 +34,7 @@ */ #include "ipoib.h" +#include int ipoib_mcast_attach(struct net_device *dev, u16 mlid, union ib_gid *mgid) { @@ -158,6 +159,7 @@ int ipoib_transport_dev_init(struct net_device *dev, struct ib_device *ca) }; int i, ret, size; + struct ethtool_coalesce *coal; priv->pd = ib_alloc_pd(priv->ca); if (IS_ERR(priv->pd)) { @@ -182,6 +184,16 @@ int ipoib_transport_dev_init(struct net_device *dev, struct ib_device *ca) goto out_free_mr; } + coal = kzalloc(sizeof *coal, GFP_KERNEL); + if (coal) { + coal->rx_coalesce_usecs = 10; + coal->tx_coalesce_usecs = 10; + coal->rx_max_coalesced_frames = 16; + coal->tx_max_coalesced_frames = 16; + 
dev->ethtool_ops->set_coalesce(dev, coal); + kfree(coal); + } + if (ib_req_notify_cq(priv->cq, IB_CQ_NEXT_COMP)) goto out_free_cq; -- 1.5.3.8 From sean.hefty at intel.com Wed Jan 16 09:24:24 2008 From: sean.hefty at intel.com (Sean Hefty) Date: Wed, 16 Jan 2008 09:24:24 -0800 Subject: [ofa-general] [PATCH 8/16] ib/core: Add core support for LSO In-Reply-To: <1200501481.13546.77.camel@mtls03> References: <1200501481.13546.77.camel@mtls03> Message-ID: <000301c85864$a3cd6520$20fc070a@amr.corp.intel.com> > struct ib_sge { >@@ -659,6 +663,9 @@ struct ib_send_wr { > } atomic; > struct { > struct ib_ah *ah; >+ void *header; >+ int hlen; >+ int mss; Can this be done on a per QP basis, versus every send? - Sean From sean.hefty at intel.com Wed Jan 16 09:30:40 2008 From: sean.hefty at intel.com (Sean Hefty) Date: Wed, 16 Jan 2008 09:30:40 -0800 Subject: [ofa-general] [PATCH 13/16] ib/core: Add support for modify CQ In-Reply-To: <1200501504.13546.82.camel@mtls03> References: <1200501504.13546.82.camel@mtls03> Message-ID: <000401c85865$840db090$20fc070a@amr.corp.intel.com> >+ * @cq_count: number of CQEs that will tirgger an event >+ * @cq_period: max period of time beofre triggering an event Spelling nits: tirgger -> trigger beofre -> before From mschlining at datadirectnet.com Wed Jan 16 09:31:14 2008 From: mschlining at datadirectnet.com (Martin W. Schlining III) Date: Wed, 16 Jan 2008 12:31:14 -0500 Subject: [ofa-general] OFED 1.3 RC2 release is available In-Reply-To: <6C2C79E72C305246B504CBA17B5500C9031E655B@mtlexch01.mtl.com> References: <6C2C79E72C305246B504CBA17B5500C9031E655B@mtlexch01.mtl.com> Message-ID: <478E3F62.1040307@datadirectnet.com> What about the recent patches to SRP to use the request_limit_delta field in the SRP Login response? Are those changes destined for OFED 1.3? 
Tziporet Koren wrote:
> Hi,
> OFED 1.3 RC2 release is available on
> http://www.openfabrics.org/builds/ofed-1.3/release/OFED-1.3-rc2.tgz
>
> To get BUILD_ID run ofed_info
>
> Please report any issues in bugzilla https://bugs.openfabrics.org/
> The RC3 release is expected on January 30
>
> Tziporet & Vlad
>
>
>
> ========================================================================
>
> Release information:
> --------------------
> OS support:
>   Novell:
>     - SLES10
>     - SLES10 SP1 and up1
>   Redhat:
>     - Redhat EL4 up4 and up5
>     - Redhat EL5 and up1
>   kernel.org:
>     - 2.6.23 and 2.6.24-rc5
>
> Compilation only checks:
>     - Fedora Core 6
>     - openSuSE 10.3
>     - Redhat EL4 up6
>
> Systems:
>     * x86_64
>     * x86
>     * ia64
>     * ppc64
>
> Main Changes from OFED 1.3-RC1
> ===============================
> * Fixed 21 Bugs (see attachment)
> * Added support for RHEL4.6 and openSuSE10.3
> * Install: Added vendor's pre/post install scripts support
> * MPI packages update:
>     * openmpi-1.2.5-1
>     * mvapich-1.0.0-1844
>     * mvapich2-1.0.1-2
> * Added support for Qlogic new HCA:
>
> Specific module changes:
> ========================
> ULPs:
> -----
> SDP:
>     * Executing netperf with TCP_CORK enabled never ends
>     * poll() always returns POLLOUT on non-blocking socket
>     * SDP connect() only allows AF_INET (2), not AF_INET_SDP (27)
> iSER:
>     * Separate open-iscsi and iSER patches for different distros
> IPoIB:
>     * Fix IPOIB LSO support: turn on the QP_CREATE_LSO flag to let the
>       hw layer know and take proper actions
>
> Libraries:
> ----------
> libibverbs:
>     * Preserve backwards binary compatibility.
> librdmacm:
>     * Release 1.0.5
>
> Utilities:
> ----------
> Opensm:
>     * Fixing core dump in fat-tree routing
>     * Use valid pkey index value for gsi mads
>     * osm_sa_slvl_record: fix overflow crash
>     * Fixing a seg. fault in processing mcast groups
>     * mcast mgr improvements
>     * QoS policy - increased stability
> mstflint:
>     * Convert project to autoconf tools
> Performance tests:
>     * Fix bug rdma_lat.c. Messages up to 400 bytes will be sent Inline
>     * Added multicast support to ib_send_bw and ib_send_lat tests
> Diagnostic tools:
>     * Enhanced saquery to support:
>         VLArb and PKey Table Records
>         Ports with LinkRecord query
>         SL2VLTableRecord attribute
>         Attribute names support
>     * checkerrors: fix port errors count and query only single ports
>       in CAs
> ibutils:
>     * vsGetGeneralInfo function now dumps the correct data
>     * Fixed stack-smashing bug in ibis gid typemaps, which could cause
>       crashes on ppc64
>
> Low level drivers:
> ------------------
> mlx4:
>     * max_recv_wr must be > 0 for non-SRQ QPs.
>     * Fix the value of the pkey_index in the completion to get a valid
>       value for GSI QPs.
>     * Do not use memcpy when copying to the BlueFlame buffer
>     * Fix pkey_index processing in cq polling
> mthca:
>     * Ensure an Rx WQE is in memory before linking
> cxgb:
>     * library release 1.1.2
> ipath:
>     * Added support for the new HCA iba7220
> Nes:
>     * fix virtual WQ mapping and size
>
>
> Tasks that should be completed for RC3:
> ==============================
> 1. XRC enhanced API
> 2. Fix bugs
>
>
> ------------------------------------------------------------------------
>
> _______________________________________________
> general mailing list
> general at lists.openfabrics.org
> http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general
>
> To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general

From rdreier at cisco.com Wed Jan 16 09:32:21 2008
From: rdreier at cisco.com (Roland Dreier)
Date: Wed, 16 Jan 2008 09:32:21 -0800
Subject: [ofa-general] Re: [PATCH 13/16] ib/core: Add support for modify CQ
In-Reply-To: <1200501504.13546.82.camel@mtls03> (Eli Cohen's message of "Wed, 16 Jan 2008 18:38:24 +0200")
References: <1200501504.13546.82.camel@mtls03>
Message-ID: 

 > + * @cq_period: max period of time beofre triggering an event

What are the units for this? That needs to be part of the documentation.

From jgunthorpe at obsidianresearch.com Wed Jan 16 09:56:03 2008
From: jgunthorpe at obsidianresearch.com (Jason Gunthorpe)
Date: Wed, 16 Jan 2008 10:56:03 -0700
Subject: [ofa-general] RE: new API for the rdma-cma
In-Reply-To: <478E23C2.2040901@opengridcomputing.com>
References: <478D346B.8070506@opengridcomputing.com> <000801c857cc$0d6c5ed0$a937170a@amr.corp.intel.com> <478D43DD.1090007@opengridcomputing.com> <000a01c857d6$b046b1a0$a937170a@amr.corp.intel.com> <20080116011308.GK28360@obsidianresearch.com> <478E23C2.2040901@opengridcomputing.com>
Message-ID: <20080116175603.GL28360@obsidianresearch.com>

On Wed, Jan 16, 2008 at 09:33:22AM -0600, Steve Wise wrote:

> Regardless, I don't see any way to know which if_index (or even if_name)
> maps to which port on a multi-port rdma device. The /sys/class/.../net:*
> entries show both port device if_names, but there is really no way
> to know

Well, I suppose no matter what you do, you have to start there :)
Isn't this information available through sysfs?

It looks pretty easy to find out which ipoib devices are on the same
PCI device through sysfs, but I don't immediately see a way to get the
port number it is bound to (this should probably be added..)

> which one is port 1 and which one is port 2. With interface renaming, it is
> not necessarily true, for example, given an rnic with 2 ports setup as eth1
> and eth2, that eth1 maps to port 1 of a device and eth2 maps to port 2.
> They could be renamed to anything. Does if_index remain constant?
> IE maybe

if_index is constant once the device is created until it is
destroyed. Renames do not affect if_index. I suppose it is possible
that with enough device creation/destruction if_index will wrap, and
then it won't have port1 < port2 anymore..

Jason

From ralph.campbell at qlogic.com Wed Jan 16 11:00:00 2008
From: ralph.campbell at qlogic.com (Ralph Campbell)
Date: Wed, 16 Jan 2008 11:00:00 -0800
Subject: [ofa-general] [PATCH 1/2] IB/ipath - fix UD send with immediate
In-Reply-To: 
References: <20080115235808.7794.67134.stgit@eng-46.internal.keyresearch.com> <20080115235813.7794.4890.stgit@eng-46.internal.keyresearch.com>
Message-ID: <1200510000.18075.78.camel@brick.pathscale.com>

On Tue, 2008-01-15 at 20:06 -0800, Roland Dreier wrote:
> > This fixes a small bug which incorrectly calculated the header size
> > for UD send with immediate and therefore dropped packets.
>
> What's the bug? It's not clear how this patch fixes anything...
>
> > @@ -301,8 +301,6 @@ int ipath_make_ud_req(struct ipath_qp *qp)
> >
> > 	/* header size in 32-bit words LRH+BTH+DETH = (8+12+8)/4.
*/ > > qp->s_hdrwords = 7; > > - if (wqe->wr.opcode == IB_WR_SEND_WITH_IMM) > > - qp->s_hdrwords++; > > qp->s_cur_size = wqe->length; > > qp->s_cur_sge = &qp->s_sge; > > qp->s_wqe = wqe; > > @@ -327,6 +325,7 @@ int ipath_make_ud_req(struct ipath_qp *qp) > > ohdr = &qp->s_hdr.u.oth; > > } > > if (wqe->wr.opcode == IB_WR_SEND_WITH_IMM) { > > + qp->s_hdrwords++; > > ohdr->u.ud.imm_data = wqe->wr.imm_data; This is just eliminating an "if". A trivial optimization. > This looks like it doesn't make any difference, since I don't see any > place qp->s_hdrwords is used in between the old location and the new > location of the increment... > > > + /* > > + * The opcode is in the low byte when its in network order > > + * (top byte when in host order). > > + */ > > + opcode = be32_to_cpu(ohdr->bth[0]) >> 24; > > + if (qp->ibqp.qp_num > 1 && > > + opcode == IB_OPCODE_UD_SEND_ONLY_WITH_IMMEDIATE) { > > + if (header_in_data) { > > + wc.imm_data = *(__be32 *) data; > > + data += sizeof(__be32); > > + } else > > + wc.imm_data = ohdr->u.ud.imm_data; > > + wc.wc_flags = IB_WC_WITH_IMM; > > + hdrsize += sizeof(u32); > > + } else if (opcode == IB_OPCODE_UD_SEND_ONLY) { > > + wc.imm_data = 0; > > + wc.wc_flags = 0; > > + } else { > > + dev->n_pkt_drops++; > > + goto bail; > > + } > > + > > /* Get the number of bytes the message was padded by. */ > > pad = (be32_to_cpu(ohdr->bth[0]) >> 20) & 3; > > if (unlikely(tlen < (hdrsize + pad + 4))) { hdrsize is used here so the code to initialize it has to happen before this point. > I guess the bug is that you need to set hdrsize to the length > including the immediate data before this last test? Correct. > And this bug causes problems because MVAPICH uses UD sends with > immediate data?? How bad is the impact of this -- do we really need > it for 2.6.24? MVAPICH-UD doesn't work at all without this change. HP MPI also uses UD with immediate to do the initial discovery. 
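
For readers following the header-size discussion, here is a toy userspace model of the ordering issue, not the ipath code. The 28-byte base header (LRH+BTH+DETH), the 2-bit pad count and the "hdrsize + pad + 4" length test come from the quoted hunks; ud_payload_len(), its arguments and the payload subtraction are assumptions made purely for illustration. It only shows that the 4 bytes of immediate data have to be added to the header size before the received length is checked and used.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* LRH + BTH + DETH in bytes, per the comment in ipath_make_ud_req(). */
#define UD_BASE_HDR_BYTES (8 + 12 + 8)
/* Trailing bytes counted by the "+ 4" in the quoted length test. */
#define UD_TRAILER_BYTES 4

/*
 * Hypothetical helper (not a driver function): returns the payload
 * length of a received UD packet of total length tlen, or -1 if the
 * packet is too short for its headers.
 */
static int ud_payload_len(uint32_t tlen, uint32_t pad, bool has_imm)
{
	uint32_t hdrsize = UD_BASE_HDR_BYTES;

	assert(pad <= 3);	/* the pad count is a 2-bit field */

	/* Account for the immediate data first ... */
	if (has_imm)
		hdrsize += sizeof(uint32_t);

	/* ... so that the length test sees the real header size. */
	if (tlen < hdrsize + pad + UD_TRAILER_BYTES)
		return -1;

	return (int)(tlen - (hdrsize + pad + UD_TRAILER_BYTES));
}
```

If has_imm is left out for a packet that does carry immediate data, the same 36-byte packet is reported as having 4 bytes of payload instead of 0, which is the kind of mismatch the reordering avoids.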
> By the way, is there anything in the IB spec that forbids sends with
> immediate data on QP1? I'm wondering why your code treats QP1
> differently here.
>
> - R.
> _______________________________________________
> general mailing list
> general at lists.openfabrics.org
> http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general
>
> To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general

From rdreier at cisco.com Wed Jan 16 11:13:05 2008
From: rdreier at cisco.com (Roland Dreier)
Date: Wed, 16 Jan 2008 11:13:05 -0800
Subject: [ofa-general] ipoib_start_xmit Gratuitous ARP / bonding failover handling not applied on connected mode neighbours?!
In-Reply-To: (Or Gerlitz's message of "Wed, 16 Jan 2008 11:21:05 +0200 (IST)")
References: 
Message-ID: 

 > Looking at ipoib_start_xmit, it seems that both the check that
 > comes to handle a gratuitous ARP (ie a difference between the
 > remote GID as kept in the ipoib_neigh to the one present in the
 > network stack neighbour) and the check that comes to handle a
 > situation where we attempt to xmit an ipoib_neigh created by
 > another ipoib device (ie following a bonding failover) - do not
 > come into play for the connected mode neighbours.
 >
 > Isn't it a bug, or am I missing something?

Good question. The device test came straight from Moni's patch -- how
much have you guys tested bonding of IPoIB CM?

The GID comparison seems a little trickier to handle -- it seems on a
neighbour GID change we need to tear down any connection we might have
in the CM case...

From swise at opengridcomputing.com Wed Jan 16 11:18:51 2008 From: swise at opengridcomputing.com (Steve Wise) Date: Wed, 16 Jan 2008 13:18:51 -0600 Subject: [ofa-general] RE: new API for the rdma-cma In-Reply-To: <20080116175603.GL28360@obsidianresearch.com> References: <478D346B.8070506@opengridcomputing.com> <000801c857cc$0d6c5ed0$a937170a@amr.corp.intel.com> <478D43DD.1090007@opengridcomputing.com> <000a01c857d6$b046b1a0$a937170a@amr.corp.intel.com> <20080116011308.GK28360@obsidianresearch.com> <478E23C2.2040901@opengridcomputing.com> <20080116175603.GL28360@obsidianresearch.com> Message-ID: <478E589B.2060703@opengridcomputing.com> Jason Gunthorpe wrote: > On Wed, Jan 16, 2008 at 09:33:22AM -0600, Steve Wise wrote: > >> Regardless, I don't see any way to know which if_index (or even if_name) >> maps to which port on a multi-port rdma device. The /sys/class/.../net:* >> entries show both port device if_names, but there is really no way >> to know > > Well, I suppose no matter what you do, you have to start there :) > Isn't this information available through sysfs? > > It looks pretty easy to find out which ipoib devices are on the same > PCI device through sysfs, but I don't immediately see a way to get the > port number it is bound too (this should probably be added..) > >> which one is port 1 and which one is port 2. With interface renaming, it is >> not necessarily true, for example, given an rnic with 2 ports setup as eth1 >> and eth2, that eth1 maps to port 1 of a device and eth2 maps to port 2. >> They could be renamed to anything. Does if_index remain constant? >> IE maybe > > if_index is constant once the device is created until it is > destroyed. Renames do not affect if_index. I suppose it is possible > that with enough device creation/destruction if_index will wrap, and > then it won't have port1 < port2 anymore.. > > Jason Perhaps the best way is to add a new sysfs entry that makes it explicit which if_index maps to which rdma device port. 
Or maybe add if_index to the ibv_port_attr structure... Steve. From swise at opengridcomputing.com Wed Jan 16 11:26:58 2008 From: swise at opengridcomputing.com (Steve Wise) Date: Wed, 16 Jan 2008 13:26:58 -0600 Subject: [ofa-general] RE: new API for the rdma-cma In-Reply-To: <478E589B.2060703@opengridcomputing.com> References: <478D346B.8070506@opengridcomputing.com> <000801c857cc$0d6c5ed0$a937170a@amr.corp.intel.com> <478D43DD.1090007@opengridcomputing.com> <000a01c857d6$b046b1a0$a937170a@amr.corp.intel.com> <20080116011308.GK28360@obsidianresearch.com> <478E23C2.2040901@opengridcomputing.com> <20080116175603.GL28360@obsidianresearch.com> <478E589B.2060703@opengridcomputing.com> Message-ID: <478E5A82.9040602@opengridcomputing.com> Steve Wise wrote: >> >> if_index is constant once the device is created until it is >> destroyed. Renames do not affect if_index. I suppose it is possible >> that with enough device creation/destruction if_index will wrap, and >> then it won't have port1 < port2 anymore.. >> >> Jason > > Perhaps the best way is to add a new sysfs entry that makes it explicit > which if_index maps to which rdma device port. Or maybe add if_index to > the ibv_port_attr structure... > So we could add a net:if_name link in /sys/class/infiniband//ports// Just like the net:if_name link that's currently in: /sys/class/infiniband_verbs/uverbs_/device/ Steve. From rdreier at cisco.com Wed Jan 16 13:22:07 2008 From: rdreier at cisco.com (Roland Dreier) Date: Wed, 16 Jan 2008 13:22:07 -0800 Subject: [ofa-general] [PATCH] IB/ipoib: Fix undefined symbol (priv->cm) if ipoib_cm disabled In-Reply-To: <200801161444.35365.hnguyen@linux.vnet.ibm.com> (Hoang-Nam Nguyen's message of "Wed, 16 Jan 2008 14:44:35 +0100") References: <200801161444.35365.hnguyen@linux.vnet.ibm.com> Message-ID: Thanks a lot for pointing this out! I rolled the following into the offending patch in my tree instead (I preferred avoiding #ifdefs in .c files...) 
diff --git a/drivers/infiniband/ulp/ipoib/ipoib.h b/drivers/infiniband/ulp/ipoib/ipoib.h index 545c5a3..fe250c6 100644 --- a/drivers/infiniband/ulp/ipoib/ipoib.h +++ b/drivers/infiniband/ulp/ipoib/ipoib.h @@ -505,6 +505,12 @@ static inline int ipoib_cm_has_srq(struct net_device *dev) return !!priv->cm.srq; } +static inline unsigned int ipoib_cm_max_mtu(struct net_device *dev) +{ + struct ipoib_dev_priv *priv = netdev_priv(dev); + return priv->cm.max_cm_mtu; +} + void ipoib_cm_send(struct net_device *dev, struct sk_buff *skb, struct ipoib_cm_tx *tx); int ipoib_cm_dev_open(struct net_device *dev); void ipoib_cm_dev_stop(struct net_device *dev); @@ -554,6 +560,11 @@ static inline int ipoib_cm_has_srq(struct net_device *dev) return 0; } +static inline unsigned int ipoib_cm_max_mtu(struct net_device *dev) +{ + return 0; +} + static inline void ipoib_cm_send(struct net_device *dev, struct sk_buff *skb, struct ipoib_cm_tx *tx) { diff --git a/drivers/infiniband/ulp/ipoib/ipoib_main.c b/drivers/infiniband/ulp/ipoib/ipoib_main.c index e499626..d733045 100644 --- a/drivers/infiniband/ulp/ipoib/ipoib_main.c +++ b/drivers/infiniband/ulp/ipoib/ipoib_main.c @@ -183,7 +183,7 @@ static int ipoib_change_mtu(struct net_device *dev, int new_mtu) /* dev->mtu > 2K ==> connected mode */ if (ipoib_cm_admin_enabled(dev)) { - if (new_mtu > priv->cm.max_cm_mtu) + if (new_mtu > ipoib_cm_max_mtu(dev)) return -EINVAL; if (new_mtu > priv->mcast_mtu) From eli at mellanox.co.il Wed Jan 16 13:34:51 2008 From: eli at mellanox.co.il (Eli Cohen) Date: Wed, 16 Jan 2008 23:34:51 +0200 Subject: [ofa-general] RE: [PATCH 13/16] ib/core: Add support for modify CQ In-Reply-To: References: <1200501504.13546.82.camel@mtls03> Message-ID: <6C2C79E72C305246B504CBA17B5500C9031E66E9@mtlexch01.mtl.com> The units are micro seconds. I will fix this. Thanks. 
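
For reference, a small userspace sketch of how ethtool-style coalescing values could be narrowed to the u16 arguments of ib_modify_cq(), with cq_period taken to be in microseconds as stated in the reply. The 0xffff limit mirrors the check in the ipoib_set_coalesce() patch; struct cq_moderation and coalesce_to_cq_moderation() are hypothetical names and are not part of any posted patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical container for the two ib_modify_cq() arguments. */
struct cq_moderation {
	uint16_t cq_count;	/* number of CQEs that trigger an event */
	uint16_t cq_period;	/* max microseconds before an event */
};

/*
 * ethtool carries both values as 32-bit numbers; reject anything that
 * does not fit in 16 bits instead of silently truncating it.
 */
static int coalesce_to_cq_moderation(uint32_t max_frames, uint32_t usecs,
				     struct cq_moderation *out)
{
	assert(out);

	if (max_frames > 0xffff || usecs > 0xffff)
		return -EINVAL;

	out->cq_count = (uint16_t)max_frames;
	out->cq_period = (uint16_t)usecs;
	return 0;
}
```

The defaults proposed in patch 16/16 (16 frames, 10 usecs) pass this check; a period of 0x10000 usecs would be rejected with -EINVAL.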

-----Original Message-----
From: Roland Dreier [mailto:rdreier at cisco.com]
Sent: Wednesday, January 16, 2008 7:32 PM
To: Eli Cohen
Cc: openfabrics
Subject: Re: [PATCH 13/16] ib/core: Add support for modify CQ

 > + * @cq_period: max period of time beofre triggering an event

What are the units for this? That needs to be part of the documentation.

From eli at mellanox.co.il Wed Jan 16 13:36:13 2008
From: eli at mellanox.co.il (Eli Cohen)
Date: Wed, 16 Jan 2008 23:36:13 +0200
Subject: [ofa-general] [PATCH 13/16] ib/core: Add support for modify CQ
In-Reply-To: <000401c85865$840db090$20fc070a@amr.corp.intel.com>
References: <1200501504.13546.82.camel@mtls03> <000401c85865$840db090$20fc070a@amr.corp.intel.com>
Message-ID: <6C2C79E72C305246B504CBA17B5500C9031E66EA@mtlexch01.mtl.com>

Will fix this, thanks.

-----Original Message-----
From: Sean Hefty [mailto:sean.hefty at intel.com]
Sent: Wednesday, January 16, 2008 7:31 PM
To: Eli Cohen; Roland Dreier
Cc: openfabrics
Subject: RE: [ofa-general] [PATCH 13/16] ib/core: Add support for modify CQ

>+ * @cq_count: number of CQEs that will tirgger an event
>+ * @cq_period: max period of time beofre triggering an event

Spelling nits:

tirgger -> trigger
beofre -> before

From eli at mellanox.co.il Wed Jan 16 13:43:53 2008
From: eli at mellanox.co.il (Eli Cohen)
Date: Wed, 16 Jan 2008 23:43:53 +0200
Subject: [ofa-general] [PATCH 8/16] ib/core: Add core support for LSO
In-Reply-To: <000301c85864$a3cd6520$20fc070a@amr.corp.intel.com>
References: <1200501481.13546.77.camel@mtls03> <000301c85864$a3cd6520$20fc070a@amr.corp.intel.com>
Message-ID: <6C2C79E72C305246B504CBA17B5500C9031E66EE@mtlexch01.mtl.com>

I am not sure I follow you. This is how I pass to the QP the pointer to the
header, the size of the header and the mss. These values change with
every post.

-----Original Message-----
From: Sean Hefty [mailto:sean.hefty at intel.com]
Sent: Wednesday, January 16, 2008 7:24 PM
To: Eli Cohen; Roland Dreier
Cc: openfabrics
Subject: RE: [ofa-general] [PATCH 8/16] ib/core: Add core support for LSO

> struct ib_sge {
>@@ -659,6 +663,9 @@ struct ib_send_wr {
> 		} atomic;
> 		struct {
> 			struct ib_ah *ah;
>+			void *header;
>+			int hlen;
>+			int mss;

Can this be done on a per QP basis, versus every send?

- Sean

From sean.hefty at intel.com Wed Jan 16 14:02:07 2008
From: sean.hefty at intel.com (Sean Hefty)
Date: Wed, 16 Jan 2008 14:02:07 -0800
Subject: [ofa-general] [PATCH 8/16] ib/core: Add core support for LSO
In-Reply-To: <6C2C79E72C305246B504CBA17B5500C9031E66EE@mtlexch01.mtl.com>
References: <1200501481.13546.77.camel@mtls03> <000301c85864$a3cd6520$20fc070a@amr.corp.intel.com> <6C2C79E72C305246B504CBA17B5500C9031E66EE@mtlexch01.mtl.com>
Message-ID: <000001c8588b$6f987520$ff0da8c0@amr.corp.intel.com>

>I am not sure I follow you. This is how I pass to the QP the pointer to the
>header, the size of the header and the mss. These values change with
>every post.

I was wondering if the header size and mss changed on each post.

From ralph.campbell at qlogic.com Wed Jan 16 14:26:51 2008
From: ralph.campbell at qlogic.com (Ralph Campbell)
Date: Wed, 16 Jan 2008 14:26:51 -0800
Subject: [ofa-general] Re: [PATCH 2/2] IB/ipath - fix QP use after free bug
In-Reply-To: 
References: <20080115235808.7794.67134.stgit@eng-46.internal.keyresearch.com> <20080115235818.7794.96013.stgit@eng-46.internal.keyresearch.com>
Message-ID: <1200522412.18075.85.camel@brick.pathscale.com>

On Tue, 2008-01-15 at 19:54 -0800, Roland Dreier wrote:
> Am I missing something, or is this still racy, just with a smaller
> window? Couldn't the following still happen?
>
>     CPU #1                                  CPU #2
>
>     static inline void ipath_schedule_send(struct ipath_qp *qp)
>     {
>         if (!test_bit(IPATH_S_DESTROYING, &qp->s_busy))
>         // bit not set yet, continue into if statement...
>
>                                             // in ipath_destroy_qp() on other CPU:
>
>                                             set_bit(IPATH_S_DESTROYING, &qp->s_busy);
>
>                                             /* Stop the sending tasklet. */
>                                             tasklet_kill(&qp->s_task);
>                                             // tasklet_kill does nothing,
>                                             // not scheduled yet...
>
>             tasklet_hi_schedule(&qp->s_task);
>             // uh-oh...

I think you are right. I will have to think about this some more.

> In fact testing qp->s_busy is potentially just as much use-after-free
> as scheduling the tasklet itself...

This should be safe in the receive interrupt handling since it keeps a
reference to the QP but there might be some other races possible
with posting sends and timeouts. I will think some more...

From rdreier at cisco.com Wed Jan 16 14:38:21 2008
From: rdreier at cisco.com (Roland Dreier)
Date: Wed, 16 Jan 2008 14:38:21 -0800
Subject: [ofa-general] Re: [PATCH 2/2] IB/ipath - fix QP use after free bug
In-Reply-To: <1200522412.18075.85.camel@brick.pathscale.com> (Ralph Campbell's message of "Wed, 16 Jan 2008 14:26:51 -0800")
References: <20080115235808.7794.67134.stgit@eng-46.internal.keyresearch.com> <20080115235818.7794.96013.stgit@eng-46.internal.keyresearch.com> <1200522412.18075.85.camel@brick.pathscale.com>
Message-ID: 

 > This should be safe in the receive interrupt handling since it keeps a
 > reference to the QP but there might be some other races possible
 > with posting sends and timeouts. I will think some more...

If you are sure that the QP structure will be around in
ipath_schedule_send() then you could just add a spinlock to make sure
that the test_bit/tasklet_hi_schedule and set_bit/tasklet_kill are
atomic against each other.

 - R.

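
The locking idea in the reply above can be sketched in userspace C. This is only a model under stated assumptions: struct fake_qp, fake_schedule_send() and fake_destroy_begin() are hypothetical stand-ins, a C11 atomic_flag plays the role of the spinlock, and a counter stands in for tasklet_hi_schedule(). Where tasklet_kill() would sit relative to the lock in the real driver is not settled in this thread, so the sketch leaves it out.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for the QP state discussed above. */
struct fake_qp {
	atomic_flag lock;	/* toy spinlock */
	bool destroying;	/* models the IPATH_S_DESTROYING bit */
	int scheduled;		/* times the "tasklet" was scheduled */
};

static void fake_qp_init(struct fake_qp *qp)
{
	atomic_flag_clear(&qp->lock);
	qp->destroying = false;
	qp->scheduled = 0;
}

static void qp_lock(struct fake_qp *qp)
{
	/* Spin until the flag was previously clear. */
	while (atomic_flag_test_and_set_explicit(&qp->lock,
						 memory_order_acquire))
		;
}

static void qp_unlock(struct fake_qp *qp)
{
	atomic_flag_clear_explicit(&qp->lock, memory_order_release);
}

/*
 * Test the flag and "schedule" under one lock, so a concurrent destroy
 * cannot slip in between the two steps. Returns true if scheduled.
 */
static bool fake_schedule_send(struct fake_qp *qp)
{
	bool ok;

	assert(qp);
	qp_lock(qp);
	ok = !qp->destroying;
	if (ok)
		qp->scheduled++;	/* stands in for tasklet_hi_schedule() */
	qp_unlock(qp);
	return ok;
}

/* Mark the QP as going away; later schedule attempts are refused. */
static void fake_destroy_begin(struct fake_qp *qp)
{
	assert(qp);
	qp_lock(qp);
	qp->destroying = true;
	qp_unlock(qp);
}
```

With this arrangement a caller that took the lock before the destroy path has finished scheduling by the time the flag is set, and any caller that takes it afterwards sees the flag and backs off. As noted in the thread, this only helps if the QP structure itself is guaranteed to still exist when fake_schedule_send() runs.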
From rdreier at cisco.com Wed Jan 16 14:46:21 2008 From: rdreier at cisco.com (Roland Dreier) Date: Wed, 16 Jan 2008 14:46:21 -0800 Subject: [ofa-general] [GIT PULL] please pull infiniband.git Message-ID: Linus, please pull from master.kernel.org:/pub/scm/linux/kernel/git/roland/infiniband.git for-linus This tree is also available from kernel.org mirrors at: git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband.git for-linus This will get the commit below, which moves a little code around to make a couple of major MPI implementations work on ipath devices: drivers/infiniband/hw/ipath/ipath_ud.c | 44 ++++++++++++++++---------------- 1 files changed, 22 insertions(+), 22 deletions(-) commit 0a69631b2869093d7306e8f66cca8eb0a05aa919 Author: Ralph Campbell Date: Tue Jan 15 15:58:13 2008 -0800 IB/ipath: Fix receiving UD messages with immediate data This fixes a small bug in ipath_ud_rcv()'s handling of UD messages with immediate data. We need to test whether immediate data is present and update the header size accordingly *before* testing the packet size from the header against the actual received length. Otherwise the wrong header size will be used and all messages with immediate data will be dropped. This bug keeps MVAPICH-UD and HP MPI from working at all on ipath devices. Signed-off-by: Ralph Campbell Signed-off-by: Roland Dreier --- diff --git a/drivers/infiniband/hw/ipath/ipath_ud.c b/drivers/infiniband/hw/ipath/ipath_ud.c index 16a2a93..b3df6f3 100644 --- a/drivers/infiniband/hw/ipath/ipath_ud.c +++ b/drivers/infiniband/hw/ipath/ipath_ud.c @@ -455,6 +455,28 @@ void ipath_ud_rcv(struct ipath_ibdev *dev, struct ipath_ib_header *hdr, } } + /* + * The opcode is in the low byte when its in network order + * (top byte when in host order). 
+ */ + opcode = be32_to_cpu(ohdr->bth[0]) >> 24; + if (qp->ibqp.qp_num > 1 && + opcode == IB_OPCODE_UD_SEND_ONLY_WITH_IMMEDIATE) { + if (header_in_data) { + wc.imm_data = *(__be32 *) data; + data += sizeof(__be32); + } else + wc.imm_data = ohdr->u.ud.imm_data; + wc.wc_flags = IB_WC_WITH_IMM; + hdrsize += sizeof(u32); + } else if (opcode == IB_OPCODE_UD_SEND_ONLY) { + wc.imm_data = 0; + wc.wc_flags = 0; + } else { + dev->n_pkt_drops++; + goto bail; + } + /* Get the number of bytes the message was padded by. */ pad = (be32_to_cpu(ohdr->bth[0]) >> 20) & 3; if (unlikely(tlen < (hdrsize + pad + 4))) { @@ -482,28 +504,6 @@ void ipath_ud_rcv(struct ipath_ibdev *dev, struct ipath_ib_header *hdr, wc.byte_len = tlen + sizeof(struct ib_grh); /* - * The opcode is in the low byte when its in network order - * (top byte when in host order). - */ - opcode = be32_to_cpu(ohdr->bth[0]) >> 24; - if (qp->ibqp.qp_num > 1 && - opcode == IB_OPCODE_UD_SEND_ONLY_WITH_IMMEDIATE) { - if (header_in_data) { - wc.imm_data = *(__be32 *) data; - data += sizeof(__be32); - } else - wc.imm_data = ohdr->u.ud.imm_data; - wc.wc_flags = IB_WC_WITH_IMM; - hdrsize += sizeof(u32); - } else if (opcode == IB_OPCODE_UD_SEND_ONLY) { - wc.imm_data = 0; - wc.wc_flags = 0; - } else { - dev->n_pkt_drops++; - goto bail; - } - - /* * Get the next work request entry to find where to put the data. 
*/ if (qp->r_reuse_sge)
From kliteyn at mellanox.co.il Wed Jan 16 17:27:27 2008 From: kliteyn at mellanox.co.il (kliteyn at mellanox.co.il) Date: 17 Jan 2008 03:27:27 +0200 Subject: [ofa-general] nightly osm_sim report 2008-01-17:normal completion Message-ID: OSM Simulation Regression Summary [Generated mail - please do NOT reply] OpenSM binary date = 2008-01-16 OpenSM git rev = Tue_Jan_15_15:34:46_2008 [82b13a3b06289e434ce35534cf74f15211b3e4d4] ibutils git rev = Mon_Dec_24_10:42:01_2007 [675bec82306d6920555dd0b5e2f664983e27e60f] Total=400 Pass=399 Fail=1 Pass: 30 Stability IS1-16.topo 30 Pkey IS1-16.topo 30 OsmTest IS1-16.topo 30 OsmStress IS1-16.topo 30 Multicast IS1-16.topo 30 LidMgr IS1-16.topo 10 Stability IS3-loop.topo 10 Stability IS3-128.topo 10 Pkey IS3-128.topo 10 OsmTest IS3-loop.topo 10 OsmTest IS3-128.topo 10 OsmStress IS3-128.topo 10 Multicast IS3-loop.topo 10 Multicast IS3-128.topo 10 FatTree merge-roots-4-ary-2-tree.topo 10 FatTree merge-root-4-ary-3-tree.topo 10 FatTree gnu-stallion-64.topo 10 FatTree blend-4-ary-2-tree.topo 10 FatTree RhinoDDR.topo 10 FatTree FullGnu.topo 10 FatTree 4-ary-2-tree.topo 10 FatTree 2-ary-4-tree.topo 10 FatTree 12-node-spaced.topo 10 FTreeFail 4-ary-2-tree-missing-sw-link.topo 10 FTreeFail 4-ary-2-tree-links-at-same-rank-2.topo 10 FTreeFail 4-ary-2-tree-links-at-same-rank-1.topo 10 FTreeFail 4-ary-2-tree-diff-num-pgroups.topo 9 LidMgr IS3-128.topo Failures: 1 LidMgr IS3-128.topo
From rdreier at cisco.com Wed Jan 16 21:35:39 2008 From: rdreier at cisco.com (Roland Dreier) Date: Wed, 16 Jan 2008 21:35:39 -0800 Subject: [ewg] RE: [ofa-general] OFED Jan 14 meeting summary on RC2readiness In-Reply-To: <20080116073459.GA20554@minantech.com> (Gleb Natapov's message of "Wed, 16 Jan 2008 09:34:59 +0200") References: <478D1A49.1080807@mellanox.co.il> <000201c857b9$e67ee020$a937170a@amr.corp.intel.com> <20080116073459.GA20554@minantech.com> Message-ID: > Roland, you said that XRC API is ugly, are you going to push it upstream > in its present form? That's a good question. Since there is no 'present form' for XRC as far as I can tell, it's hard to make a definitive answer. Certainly I haven't made up my mind in advance one way or another. In addition to seeing how the code ends up, I think the other big piece of the puzzle is to hear from the Open MPI team and other consumers of the API and find out how big the benefit is. - R.
From erezz at voltaire.com Thu Jan 17 01:48:44 2008 From: erezz at voltaire.com (Erez Zilber) Date: Thu, 17 Jan 2008 11:48:44 +0200 Subject: [ofa-general] [PATCH 0/2] IB/iser: fixes for 2.6.25 Message-ID: <478F247C.9010306@voltaire.com> Roland, The following patch set contains some iSER fixes for 2.6.25. Thanks, Erez
From erezz at voltaire.com Thu Jan 17 01:51:58 2008 From: erezz at voltaire.com (Erez Zilber) Date: Thu, 17 Jan 2008 11:51:58 +0200 Subject: [ofa-general] [PATCH 1/2] IB/iser: Print information about unhandled RDMA CM events In-Reply-To: <478F247C.9010306@voltaire.com> References: <478F247C.9010306@voltaire.com> Message-ID: <478F253E.2090800@voltaire.com> Some RDMA CM events are not supported or not handled in iSER. This patch adds some info (printk) for the user about them.
Signed-off-by: Erez Zilber --- drivers/infiniband/ulp/iser/iser_verbs.c | 6 ++---- 1 files changed, 2 insertions(+), 4 deletions(-) diff --git a/drivers/infiniband/ulp/iser/iser_verbs.c b/drivers/infiniband/ulp/iser/iser_verbs.c index 654a4dc..675d00b 100644 --- a/drivers/infiniband/ulp/iser/iser_verbs.c +++ b/drivers/infiniband/ulp/iser/iser_verbs.c @@ -475,13 +475,11 @@ static int iser_cma_handler(struct rdma_cm_id *cma_id, struct rdma_cm_event *eve iser_disconnected_handler(cma_id); break; case RDMA_CM_EVENT_DEVICE_REMOVAL: + iser_err("Device removal is currently unsupported\n"); BUG(); break; - case RDMA_CM_EVENT_CONNECT_RESPONSE: - BUG(); - break; - case RDMA_CM_EVENT_CONNECT_REQUEST: default: + iser_err("Unexpected RDMA CM event (%d)\n", event->event); break; } return ret; -- 1.5.3.7 From erezz at voltaire.com Thu Jan 17 01:53:17 2008 From: erezz at voltaire.com (Erez Zilber) Date: Thu, 17 Jan 2008 11:53:17 +0200 Subject: [ofa-general] [PATCH 2/2] IB/iser: lower queue depth In-Reply-To: <478F247C.9010306@voltaire.com> References: <478F247C.9010306@voltaire.com> Message-ID: <478F258D.3080500@voltaire.com> Add change_queue_depth handler to scsi_host_template in the iSER driver. This handler was added to iscsi_tcp in order to solve the problem of queue depth which was too high for some targets. It is also applicable for iSER. Signed-off-by: Erez Zilber --- drivers/infiniband/ulp/iser/iscsi_iser.c | 1 + 1 files changed, 1 insertions(+), 0 deletions(-) diff --git a/drivers/infiniband/ulp/iser/iscsi_iser.c b/drivers/infiniband/ulp/iser/iscsi_iser.c index bad8dac..dfa5a45 100644 --- a/drivers/infiniband/ulp/iser/iscsi_iser.c +++ b/drivers/infiniband/ulp/iser/iscsi_iser.c @@ -551,6 +551,7 @@ static struct scsi_host_template iscsi_iser_sht = { .module = THIS_MODULE, .name = "iSCSI Initiator over iSER, v." 
DRV_VER, .queuecommand = iscsi_queuecommand, + .change_queue_depth = iscsi_change_queue_depth, .can_queue = ISCSI_DEF_XMIT_CMDS_MAX - 1, .sg_tablesize = ISCSI_ISER_SG_TABLESIZE, .max_sectors = 1024, -- 1.5.3.7
From vlad at lists.openfabrics.org Thu Jan 17 03:11:44 2008 From: vlad at lists.openfabrics.org (Vladimir Sokolovsky Mellanox) Date: Thu, 17 Jan 2008 03:11:44 -0800 (PST) Subject: [ofa-general] ofa_1_3_kernel 20080117-0200 daily build status Message-ID: <20080117111144.6374CE60216@openfabrics.org> This email was generated automatically, please do not reply git_url: git://git.openfabrics.org/ofed_1_3/linux-2.6.git git_branch: ofed_kernel Common build parameters: --with-ipoib-mod --with-sdp-mod --with-srp-mod --with-user_mad-mod --with-user_access-mod --with-mthca-mod --with-mlx4-mod --with-core-mod --with-addr_trans-mod --with-rds-mod --with-cxgb3-mod --with-nes-mod Passed: Passed on i686 with 2.6.15-23-server Passed on i686 with linux-2.6.21.1 Passed on i686 with linux-2.6.22 Passed on i686 with linux-2.6.18 Passed on i686 with linux-2.6.17 Passed on i686 with linux-2.6.13 Passed on i686 with linux-2.6.16 Passed on i686 with linux-2.6.14 Passed on i686 with linux-2.6.19 Passed on i686 with linux-2.6.15 Passed on i686 with linux-2.6.12 Passed on x86_64 with linux-2.6.22 Passed on x86_64 with
linux-2.6.22.5-31-default Passed on x86_64 with linux-2.6.18 Passed on ia64 with linux-2.6.19 Passed on x86_64 with linux-2.6.16 Passed on x86_64 with linux-2.6.20 Passed on ppc64 with linux-2.6.18 Passed on powerpc with linux-2.6.14 Passed on x86_64 with linux-2.6.19 Passed on ia64 with linux-2.6.23 Passed on ppc64 with linux-2.6.19 Passed on x86_64 with linux-2.6.18-8.el5 Passed on ppc64 with linux-2.6.16 Passed on ppc64 with linux-2.6.15 Passed on ia64 with linux-2.6.22 Passed on ia64 with linux-2.6.17 Passed on powerpc with linux-2.6.13 Passed on x86_64 with linux-2.6.18-53.el5 Passed on ia64 with linux-2.6.18 Passed on ia64 with linux-2.6.13 Passed on powerpc with linux-2.6.15 Passed on ppc64 with linux-2.6.12 Passed on ppc64 with linux-2.6.14 Passed on x86_64 with linux-2.6.12 Passed on ia64 with linux-2.6.12 Passed on ia64 with linux-2.6.14 Passed on ia64 with linux-2.6.15 Passed on x86_64 with linux-2.6.17 Passed on ppc64 with linux-2.6.13 Passed on ppc64 with linux-2.6.17 Passed on ia64 with linux-2.6.16 Passed on x86_64 with linux-2.6.14 Passed on x86_64 with linux-2.6.9-42.ELsmp Passed on x86_64 with linux-2.6.13 Passed on powerpc with linux-2.6.12 Passed on x86_64 with linux-2.6.9-55.ELsmp Passed on x86_64 with linux-2.6.15 Passed on x86_64 with linux-2.6.16.43-0.3-smp Passed on ia64 with linux-2.6.21.1 Passed on ppc64 with linux-2.6.18-8.el5 Passed on x86_64 with linux-2.6.21.1 Passed on ia64 with linux-2.6.16.21-0.8-default Passed on x86_64 with linux-2.6.18-1.2798.fc6 Passed on x86_64 with linux-2.6.16.21-0.8-smp Failed: From ogerlitz at voltaire.com Thu Jan 17 05:06:22 2008 From: ogerlitz at voltaire.com (Or Gerlitz) Date: Thu, 17 Jan 2008 15:06:22 +0200 Subject: [ofa-general] statelss offload patches In-Reply-To: <1200501439.13546.69.camel@mtls03> References: <1200501439.13546.69.camel@mtls03> Message-ID: <478F52CE.1000804@voltaire.com> Eli Cohen wrote: > Following this email is a list of stateless offload patches. 
This series > was posted to the list in the past and now has been revised. It has also > been reviewed, some more and some less, by Or Gerlitz from Voltaire > (thanks) though not all his suggestions made it to the code. I hope they > are reviewed by the community and will eventually get into 2.6.25. Just to be precise... I have reviewed patches 1-4 and provided feedback, only part of which was implemented for this posting. Or.
From ogerlitz at voltaire.com Thu Jan 17 05:11:35 2008 From: ogerlitz at voltaire.com (Or Gerlitz) Date: Thu, 17 Jan 2008 15:11:35 +0200 Subject: [ofa-general] [PATCH 3/16] ib/core: Add checksum support to ib core In-Reply-To: <1200501459.13546.72.camel@mtls03> References: <1200501459.13546.72.camel@mtls03> Message-ID: <478F5407.2040004@voltaire.com> Eli Cohen wrote: > --- a/include/rdma/ib_verbs.h > +++ b/include/rdma/ib_verbs.h > @@ -95,7 +95,14 @@ enum ib_device_cap_flags { > + * devices which publish this capability must support insertion of UDP > + * and TCP checksum on outgoing packets and can verify the validity of > + * checksum for incoming packets. Setting this flag implies the driver > + * may set NETIF_F_IP_CSUM or NETIF_F_IPV6_CSUM. > + IB_DEVICE_IP_CSUM = (1<<18), > enum ib_atomic_cap { > @@ -431,6 +438,7 @@ struct ib_wc { > + int csum_ok; Hi Eli, With the comment in patch #4 at ipoib_ib_handle_rx_wc at hand, the IB_DEVICE_IP_CSUM capability and the csum_ok bit are not well defined: there are some cases where the HW cannot do tx csum, and some cases where the hw driver reported csum_ok but some more validation needs to be done in order to make sure it's really ok. I suggest you guys discuss this internally and come up with some solution. Or.
From ogerlitz at voltaire.com Thu Jan 17 05:18:28 2008 From: ogerlitz at voltaire.com (Or Gerlitz) Date: Thu, 17 Jan 2008 15:18:28 +0200 Subject: [ofa-general] [PATCH 4/16] Add checksum offload support for ipoib In-Reply-To: <1200501463.13546.73.camel@mtls03> References: <1200501463.13546.73.camel@mtls03> Message-ID: <478F55A4.7070706@voltaire.com> Eli Cohen wrote: > --- a/drivers/infiniband/ulp/ipoib/ipoib_ib.c > +++ b/drivers/infiniband/ulp/ipoib/ipoib_ib.c > @@ -231,6 +232,18 @@ static void ipoib_ib_handle_rx_wc(struct net_device *dev, struct ib_wc *wc) > skb->dev = dev; > /* XXX get correct PACKET_ type here */ > skb->pkt_type = PACKET_HOST; > + > + /* check rx csum */ > + if (test_bit(IPOIB_FLAG_CSUM, &priv->flags) && likely(wc->csum_ok)) { First, since the device IB_DEVICE_IP_CSUM capability means that "devices which publish this capability must support insertion of UDP and TCP checksum on outgoing packets and can verify the validity of checksum for incoming packets" the IPOIB_FLAG_CSUM bit is redundant, I suggest to remove it. 
Second, the csum_ok bit is not well defined, etc as I commented on patch #4 > @@ -394,6 +407,12 @@ void ipoib_send(struct net_device *dev, struct sk_buff *skb, > return; > } > > + if (priv->ca->flags & IB_DEVICE_IP_CSUM && > + skb->ip_summed == CHECKSUM_PARTIAL) > + priv->tx_wr.send_flags |= IB_SEND_IP_CSUM; > + else > + priv->tx_wr.send_flags &= IB_SEND_IP_CSUM; I think that the code would be somehow clearer if you use dev->features and not priv->ca->flags Or From ogerlitz at voltaire.com Thu Jan 17 05:22:53 2008 From: ogerlitz at voltaire.com (Or Gerlitz) Date: Thu, 17 Jan 2008 15:22:53 +0200 Subject: [ofa-general] [PATCH 14/16] ib/ipoib: Support modifying IPOIB CQ moderation params In-Reply-To: <1200501508.13546.83.camel@mtls03> References: <1200501508.13546.83.camel@mtls03> Message-ID: <478F56AD.1030006@voltaire.com> Eli Cohen wrote: > Support modifying IPOIB CQ moderation params > > This can be used to tune at run time the paramters controlling > the event (interrupt) generation rate and thus reduce the overhead > incurred by handling interrupts resulting in better throughput. IPoIB has one CQ. As I see it, this means that you should either let the user specify only one of rx or tx coalescing params, or make sure that the user did not provide something that the driver can not deploy, eg rx_usecs 12 tx_usecs 1 > +static int ipoib_set_coalesce(struct net_device *dev, > + struct ethtool_coalesce *coal) > + coal->tx_coalesce_usecs = coal->rx_coalesce_usecs; > + priv->etool.coalesce_usecs = coal->rx_coalesce_usecs; > + coal->rx_max_coalesced_frames = coal->rx_max_coalesced_frames; I guess you wanted to say here coal->tx_max_coalesced_frames = Or From ogerlitz at voltaire.com Thu Jan 17 05:47:55 2008 From: ogerlitz at voltaire.com (Or Gerlitz) Date: Thu, 17 Jan 2008 15:47:55 +0200 Subject: [ofa-general] ipoib_start_xmit Gratuitous ARP / bonding failover handling not applied on connected mode neighbours?! 
In-Reply-To: References: Message-ID: <478F5C8B.8040601@voltaire.com> Roland Dreier wrote: > Good question. The device test came straight from Moni's patch -- how > much have you guys tested bonding of IPoIB CM? The test for neigh->dev != dev is there to handle a possible race where a fail-over occurs under a high xmit rate: the deletion of the ipoib_neigh portion of the neighbour caused by the bonding fail-over has not happened yet, but since the fail-over the bonding is already xmitting through a device which is not the one that created the ipoib_neigh. We have never managed to reproduce a hit on this check... anyway, I will double check on how much testing was done with the bonding and connected mode. > The GID comparison seems a little trickier to handle -- it seems on a > neighbour GID change we need to tear down any connection we might have > in the CM case... not really: when there is a hit on the GID comparison ipoib_neigh_free() is called which for a connected mode neighbour will invoke ipoib_cm_destroy_tx() which will disconnect etc. Or
From simonx at email.cz Thu Jan 17 01:23:02 2008 From: simonx at email.cz (=?us-ascii?Q?=20?=) Date: Thu, 17 Jan 2008 10:23:02 +0100 (CET) Subject: [ofa-general] ***SPAM*** WinOF - MT21308 - win2k3 problem in MS network Message-ID: <51.144-6063-660232640-1200561782@email.cz>
I have a problem with an SFS Topspin 90 switch in an MS network.
1. LAN topology: remote workstation (fresh win2k3 installation, Ethernet NIC, IP 192.168.1.2) <-> Ethernet router/DHCP (Zywall 5, IP 192.168.1.1) <-> Topspin 90/6p eth. gateway <-> HCA/server (fresh win2k3 installation)
2. As HCAs I have tried a PCI-E Voltaire 410EX-D and a Mellanox PCI-X MT23108: the first with IPoIB from Voltaire, the second with IPoIB from Mellanox and also with WinOF. Both firmwares were updated.
3. All LEDs (Topspin, HCAs) indicate that all devices are in a correct state.
4. The Topspin was configured with the simplest setting, "no VAN + default PKEY": set the correct suffix (192.168.1.0) and length (24), chose the correct ports for bridge mode (2/1 + 2/1(gw)), and enabled broadcast forwarding. Everything else is default.
5. Problem description:
5.1 HCA 410EX-D and IPoIB from Voltaire:
5.1.1 ping (from the HCA-based server) to 192.168.1.2 works fine
5.1.2 iperf works very well from both sides
5.1.3 remote desktop works from both sides
5.1.4 ping (from the HCA-based server) to any WAN IP does not work
5.1.5 ping (from the HCA-based server) to a WAN web server by domain name does not work (the IP is not shown)
5.1.6 no web browser (IE6) connection to any WAN web server (not even by the web server's IP!)
5.1.7 computers in the local MS workgroup cannot be browsed (only the server with the HCA is seen). Browsing by the workstations' IPs works correctly.
5.2 Mellanox (PCI-X) MT23108 and IPoIB from WinOF:
5.2.1 DHCP cannot assign an IP - no MAC available for the HCA ports (-> no network communication)
5.2.2 IP/subnet mask/default gateway assigned manually -> no network communication
Any ideas? Thanks, Simon
From fenkes at de.ibm.com Thu Jan 17 06:02:33 2008 From: fenkes at de.ibm.com (Joachim Fenkes) Date: Thu, 17 Jan 2008 15:02:33 +0100 Subject: [ofa-general] [PATCH 0/4] IB/ehca: fixes, port connectivity autodetection, problem workaround Message-ID: <200801171502.34287.fenkes@de.ibm.com> This patchset will fix a minor issue, introduce port connectivity autodetection and work around an RDMA-related problem in eHCA2. [1/4] fixes an error path in create_qp() [2/4] stores the SMI/GSI QPs in a per-port array [3/4] adds port connectivity autodetection [4/4] adds the aforementioned workaround The patches will apply, in order, on top of Roland's for-2.6.25 branch. Please review them and apply for 2.6.25 if you think they're okay. Thanks and regards, Joachim -- Joachim Fenkes -- eHCA Linux Driver Developer and Hardware Tamer IBM Deutschland Entwicklung GmbH -- Dept. 3627 (I/O Firmware Dev.
2) Schoenaicher Strasse 220 -- 71032 Boeblingen -- Germany eMail: fenkes at de.ibm.com From fenkes at de.ibm.com Thu Jan 17 06:03:55 2008 From: fenkes at de.ibm.com (Joachim Fenkes) Date: Thu, 17 Jan 2008 15:03:55 +0100 Subject: [ofa-general] [PATCH 1/4] IB/ehca: Remove CQ-QP-link before destroying QP in error path of create_qp() In-Reply-To: <200801171502.34287.fenkes@de.ibm.com> References: <200801171502.34287.fenkes@de.ibm.com> Message-ID: <200801171503.56348.fenkes@de.ibm.com> From: Hoang-Nam Nguyen Signed-off-by: Hoang-Nam Nguyen --- drivers/infiniband/hw/ehca/ehca_qp.c | 5 ++++- 1 files changed, 4 insertions(+), 1 deletions(-) diff --git a/drivers/infiniband/hw/ehca/ehca_qp.c b/drivers/infiniband/hw/ehca/ehca_qp.c index f116eb7..26c6a94 100644 --- a/drivers/infiniband/hw/ehca/ehca_qp.c +++ b/drivers/infiniband/hw/ehca/ehca_qp.c @@ -769,12 +769,15 @@ static struct ehca_qp *internal_create_qp( if (ib_copy_to_udata(udata, &resp, sizeof resp)) { ehca_err(pd->device, "Copy to udata failed"); ret = -EINVAL; - goto create_qp_exit4; + goto create_qp_exit5; } } return my_qp; +create_qp_exit5: + ehca_cq_unassign_qp(my_qp->send_cq, my_qp->real_qp_num); + create_qp_exit4: if (HAS_RQ(my_qp)) ipz_queue_dtor(my_pd, &my_qp->ipz_rqueue); -- 1.5.2 From fenkes at de.ibm.com Thu Jan 17 06:04:32 2008 From: fenkes at de.ibm.com (Joachim Fenkes) Date: Thu, 17 Jan 2008 15:04:32 +0100 Subject: [ofa-general] [PATCH 2/4] IB/ehca: Define array to store SMI/GSI QPs In-Reply-To: <200801171502.34287.fenkes@de.ibm.com> References: <200801171502.34287.fenkes@de.ibm.com> Message-ID: <200801171504.33426.fenkes@de.ibm.com> From: Hoang-Nam Nguyen Signed-off-by: Hoang-Nam Nguyen --- drivers/infiniband/hw/ehca/ehca_classes.h | 2 +- drivers/infiniband/hw/ehca/ehca_main.c | 6 +++--- 2 files changed, 4 insertions(+), 4 deletions(-) diff --git a/drivers/infiniband/hw/ehca/ehca_classes.h b/drivers/infiniband/hw/ehca/ehca_classes.h index 74d2b72..936580d 100644 --- 
a/drivers/infiniband/hw/ehca/ehca_classes.h +++ b/drivers/infiniband/hw/ehca/ehca_classes.h @@ -94,7 +94,7 @@ struct ehca_sma_attr { struct ehca_sport { struct ib_cq *ibcq_aqp1; - struct ib_qp *ibqp_aqp1; + struct ib_qp *ibqp_sqp[2]; enum ib_port_state port_state; struct ehca_sma_attr saved_attr; }; diff --git a/drivers/infiniband/hw/ehca/ehca_main.c b/drivers/infiniband/hw/ehca/ehca_main.c index 6a56d86..cde486c 100644 --- a/drivers/infiniband/hw/ehca/ehca_main.c +++ b/drivers/infiniband/hw/ehca/ehca_main.c @@ -511,7 +511,7 @@ static int ehca_create_aqp1(struct ehca_shca *shca, u32 port) } sport->ibcq_aqp1 = ibcq; - if (sport->ibqp_aqp1) { + if (sport->ibqp_sqp[IB_QPT_GSI]) { ehca_err(&shca->ib_device, "AQP1 QP is already created."); ret = -EPERM; goto create_aqp1; @@ -537,7 +537,7 @@ static int ehca_create_aqp1(struct ehca_shca *shca, u32 port) ret = PTR_ERR(ibqp); goto create_aqp1; } - sport->ibqp_aqp1 = ibqp; + sport->ibqp_sqp[IB_QPT_GSI] = ibqp; return 0; @@ -550,7 +550,7 @@ static int ehca_destroy_aqp1(struct ehca_sport *sport) { int ret; - ret = ib_destroy_qp(sport->ibqp_aqp1); + ret = ib_destroy_qp(sport->ibqp_sqp[IB_QPT_GSI]); if (ret) { ehca_gen_err("Cannot destroy AQP1 QP. ret=%i", ret); return ret; -- 1.5.2 From fenkes at de.ibm.com Thu Jan 17 06:05:45 2008 From: fenkes at de.ibm.com (Joachim Fenkes) Date: Thu, 17 Jan 2008 15:05:45 +0100 Subject: [ofa-general] [PATCH 3/4] IB/ehca: Add "port connection autodetect mode" In-Reply-To: <200801171502.34287.fenkes@de.ibm.com> References: <200801171502.34287.fenkes@de.ibm.com> Message-ID: <200801171505.45612.fenkes@de.ibm.com> From: Hoang-Nam Nguyen This patch enhances ehca with a capability to "autodetect" the ports being connected physically. In order to utilize that function the module option nr_ports must be set to -1 (default is 2 - two ports). This feature is experimental and will made the default later. 
More detail: If the user connects only one port to the switch, current code requires 1) port one to be connected and 2) module option nr_ports=1 to be given. If autodetect is enabled, ehca will not wait at creation of the GSI QP for the respective port to become active. Since firmware does not accept modify_qp() while the port is down at initialization, we need to cache all calls to modify_qp() for the SMI/GSI QP and just return a good return code. When a port is activated and we get a PORT_ACTIVE event, we replay the cached modify-qp() parms and re-trigger any posted recv WRs. Only then do we forward the PORT_ACTIVE event to registered clients. The result of this autodetect patch is that all ports will be accessible by the users. Depending on their respective cabling only those ports that are connected properly will become operable. If a user tries to modify a regular QP of a non-connected port, modify_qp() will fail. Furthermore, ibv_devinfo should show the port state accordingly. Note that this patch primarily improves the loading behaviour of ehca. If the cable is removed while the driver is operating and plugged in again, firmware will handle that properly by sending an appropriate async event. 
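The cache-and-replay behaviour described above can be sketched in plain C. This is a simplified userspace model under stated assumptions, not the ehca implementation: `struct model_sqp`, `model_modify_sqp()`, `model_port_active()` and `model_fw_modify()` are hypothetical names, the cached parameters are reduced to two integers, and the locking that the patch adds around the special QPs is left out.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_PARM_MAX 4 /* same bound as EHCA_MOD_QP_PARM_MAX in the patch */

/* Hypothetical, reduced form of one cached modify_qp() request. */
struct model_parm {
	int mask;
	int state;
};

/* Hypothetical model of an SMI/GSI QP in autodetect mode. */
struct model_sqp {
	int port_active;                          /* set once PORT_ACTIVE arrives */
	struct model_parm cached[MODEL_PARM_MAX]; /* requests seen while port is down */
	int cached_cnt;
	int applied_cnt;                          /* requests that reached "firmware" */
	int last_state;
};

/* Stand-in for the real firmware call: records what was applied. */
static int model_fw_modify(struct model_sqp *qp, const struct model_parm *p)
{
	qp->applied_cnt++;
	qp->last_state = p->state;
	return 0;
}

/*
 * While the port is down, remember the request and report success;
 * once the port is up, pass the request straight through.
 */
int model_modify_sqp(struct model_sqp *qp, int mask, int state)
{
	struct model_parm p = { mask, state };

	assert(qp != NULL);
	if (qp->port_active)
		return model_fw_modify(qp, &p);
	if (qp->cached_cnt >= MODEL_PARM_MAX)
		return -1; /* cache full: more requests than expected */
	qp->cached[qp->cached_cnt++] = p;
	return 0;
}

/* On the first PORT_ACTIVE event, replay the cached requests in order. */
int model_port_active(struct model_sqp *qp)
{
	int i, ret;

	assert(qp != NULL);
	qp->port_active = 1;
	for (i = 0; i < qp->cached_cnt; i++) {
		ret = model_fw_modify(qp, &qp->cached[i]);
		if (ret)
			return ret;
	}
	qp->cached_cnt = 0;
	return 0;
}
```

The caller sees a successful return in both phases, which is the point of the scheme: the consumer that initializes the special QPs does not need to know whether the port was already active.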
Signed-off-by: Hoang-Nam Nguyen --- drivers/infiniband/hw/ehca/ehca_classes.h | 16 +++ drivers/infiniband/hw/ehca/ehca_irq.c | 26 ++++- drivers/infiniband/hw/ehca/ehca_iverbs.h | 2 + drivers/infiniband/hw/ehca/ehca_main.c | 7 +- drivers/infiniband/hw/ehca/ehca_qp.c | 162 ++++++++++++++++++++++++++++- drivers/infiniband/hw/ehca/ehca_sqp.c | 6 +- 6 files changed, 204 insertions(+), 15 deletions(-) diff --git a/drivers/infiniband/hw/ehca/ehca_classes.h b/drivers/infiniband/hw/ehca/ehca_classes.h index 936580d..2502366 100644 --- a/drivers/infiniband/hw/ehca/ehca_classes.h +++ b/drivers/infiniband/hw/ehca/ehca_classes.h @@ -95,6 +95,10 @@ struct ehca_sma_attr { struct ehca_sport { struct ib_cq *ibcq_aqp1; struct ib_qp *ibqp_sqp[2]; + /* lock to serialze modify_qp() calls for sqp in normal + * and irq path (when event PORT_ACTIVE is received first time) + */ + spinlock_t mod_sqp_lock; enum ib_port_state port_state; struct ehca_sma_attr saved_attr; }; @@ -141,6 +145,14 @@ enum ehca_ext_qp_type { EQPT_SRQ = 3, }; +/* struct to cache modify_qp()'s parms for GSI/SMI qp */ +struct ehca_mod_qp_parm { + int mask; + struct ib_qp_attr attr; +}; + +#define EHCA_MOD_QP_PARM_MAX 4 + struct ehca_qp { union { struct ib_qp ib_qp; @@ -164,6 +176,9 @@ struct ehca_qp { struct ehca_cq *recv_cq; unsigned int sqerr_purgeflag; struct hlist_node list_entries; + /* array to cache modify_qp()'s parms for GSI/SMI qp */ + struct ehca_mod_qp_parm *mod_qp_parm; + int mod_qp_parm_idx; /* mmap counter for resources mapped into user space */ u32 mm_count_squeue; u32 mm_count_rqueue; @@ -323,6 +338,7 @@ extern int ehca_port_act_time; extern int ehca_use_hp_mr; extern int ehca_scaling_code; extern int ehca_lock_hcalls; +extern int ehca_nr_ports; struct ipzu_queue_resp { u32 qe_size; /* queue entry size */ diff --git a/drivers/infiniband/hw/ehca/ehca_irq.c b/drivers/infiniband/hw/ehca/ehca_irq.c index 4c734ec..863b34f 100644 --- a/drivers/infiniband/hw/ehca/ehca_irq.c +++ 
b/drivers/infiniband/hw/ehca/ehca_irq.c @@ -356,17 +356,33 @@ static void parse_ec(struct ehca_shca *shca, u64 eqe) u8 ec = EHCA_BMASK_GET(NEQE_EVENT_CODE, eqe); u8 port = EHCA_BMASK_GET(NEQE_PORT_NUMBER, eqe); u8 spec_event; + struct ehca_sport *sport = &shca->sport[port - 1]; + unsigned long flags; switch (ec) { case 0x30: /* port availability change */ if (EHCA_BMASK_GET(NEQE_PORT_AVAILABILITY, eqe)) { - shca->sport[port - 1].port_state = IB_PORT_ACTIVE; + int suppress_event; + /* replay modify_qp for sqps */ + spin_lock_irqsave(&sport->mod_sqp_lock, flags); + suppress_event = !sport->ibqp_sqp[IB_QPT_GSI]; + if (sport->ibqp_sqp[IB_QPT_SMI]) + ehca_recover_sqp(sport->ibqp_sqp[IB_QPT_SMI]); + if (!suppress_event) + ehca_recover_sqp(sport->ibqp_sqp[IB_QPT_GSI]); + spin_unlock_irqrestore(&sport->mod_sqp_lock, flags); + + /* AQP1 was destroyed, ignore this event */ + if (suppress_event) + break; + + sport->port_state = IB_PORT_ACTIVE; dispatch_port_event(shca, port, IB_EVENT_PORT_ACTIVE, "is active"); ehca_query_sma_attr(shca, port, - &shca->sport[port - 1].saved_attr); + &sport->saved_attr); } else { - shca->sport[port - 1].port_state = IB_PORT_DOWN; + sport->port_state = IB_PORT_DOWN; dispatch_port_event(shca, port, IB_EVENT_PORT_ERR, "is inactive"); } @@ -380,11 +396,11 @@ static void parse_ec(struct ehca_shca *shca, u64 eqe) ehca_warn(&shca->ib_device, "disruptive port " "%d configuration change", port); - shca->sport[port - 1].port_state = IB_PORT_DOWN; + sport->port_state = IB_PORT_DOWN; dispatch_port_event(shca, port, IB_EVENT_PORT_ERR, "is inactive"); - shca->sport[port - 1].port_state = IB_PORT_ACTIVE; + sport->port_state = IB_PORT_ACTIVE; dispatch_port_event(shca, port, IB_EVENT_PORT_ACTIVE, "is active"); } else diff --git a/drivers/infiniband/hw/ehca/ehca_iverbs.h b/drivers/infiniband/hw/ehca/ehca_iverbs.h index 5485799..c469bfd 100644 --- a/drivers/infiniband/hw/ehca/ehca_iverbs.h +++ b/drivers/infiniband/hw/ehca/ehca_iverbs.h @@ -200,4 +200,6 @@ void 
ehca_free_fw_ctrlblock(void *ptr); #define ehca_free_fw_ctrlblock(ptr) free_page((unsigned long)(ptr)) #endif +void ehca_recover_sqp(struct ib_qp *sqp); + #endif diff --git a/drivers/infiniband/hw/ehca/ehca_main.c b/drivers/infiniband/hw/ehca/ehca_main.c index cde486c..74a4592 100644 --- a/drivers/infiniband/hw/ehca/ehca_main.c +++ b/drivers/infiniband/hw/ehca/ehca_main.c @@ -90,7 +90,8 @@ MODULE_PARM_DESC(hw_level, "hardware level" " (0: autosensing (default), 1: v. 0.20, 2: v. 0.21)"); MODULE_PARM_DESC(nr_ports, - "number of connected ports (default: 2)"); + "number of connected ports (-1: autodetect, 1: port one only, " + "2: two ports (default)"); MODULE_PARM_DESC(use_hp_mr, "high performance MRs (0: no (default), 1: yes)"); MODULE_PARM_DESC(port_act_time, @@ -688,7 +689,7 @@ static int __devinit ehca_probe(struct of_device *dev, struct ehca_shca *shca; const u64 *handle; struct ib_pd *ibpd; - int ret; + int ret, i; handle = of_get_property(dev->node, "ibm,hca-handle", NULL); if (!handle) { @@ -709,6 +710,8 @@ static int __devinit ehca_probe(struct of_device *dev, return -ENOMEM; } mutex_init(&shca->modify_mutex); + for (i = 0; i < ARRAY_SIZE(shca->sport); i++) + spin_lock_init(&shca->sport[i].mod_sqp_lock); shca->ofdev = dev; shca->ipz_hca_handle.handle = *handle; diff --git a/drivers/infiniband/hw/ehca/ehca_qp.c b/drivers/infiniband/hw/ehca/ehca_qp.c index 26c6a94..bb7ccef 100644 --- a/drivers/infiniband/hw/ehca/ehca_qp.c +++ b/drivers/infiniband/hw/ehca/ehca_qp.c @@ -729,12 +729,31 @@ static struct ehca_qp *internal_create_qp( init_attr->cap.max_send_wr = parms.squeue.act_nr_wqes; my_qp->init_attr = *init_attr; + if (qp_type == IB_QPT_SMI || qp_type == IB_QPT_GSI) { + shca->sport[init_attr->port_num - 1].ibqp_sqp[qp_type] = + &my_qp->ib_qp; + if (ehca_nr_ports < 0) { + /* alloc array to cache subsequent modify qp parms + * for autodetect mode + */ + my_qp->mod_qp_parm = + kzalloc(EHCA_MOD_QP_PARM_MAX * + sizeof(*my_qp->mod_qp_parm), + GFP_KERNEL); + if 
(!my_qp->mod_qp_parm) { + ehca_err(pd->device, + "Could not alloc mod_qp_parm"); + goto create_qp_exit4; + } + } + } + /* NOTE: define_apq0() not supported yet */ if (qp_type == IB_QPT_GSI) { h_ret = ehca_define_sqp(shca, my_qp, init_attr); if (h_ret != H_SUCCESS) { ret = ehca2ib_return_code(h_ret); - goto create_qp_exit4; + goto create_qp_exit5; } } @@ -743,7 +762,7 @@ static struct ehca_qp *internal_create_qp( if (ret) { ehca_err(pd->device, "Couldn't assign qp to send_cq ret=%i", ret); - goto create_qp_exit4; + goto create_qp_exit5; } } @@ -769,15 +788,19 @@ static struct ehca_qp *internal_create_qp( if (ib_copy_to_udata(udata, &resp, sizeof resp)) { ehca_err(pd->device, "Copy to udata failed"); ret = -EINVAL; - goto create_qp_exit5; + goto create_qp_exit6; } } return my_qp; -create_qp_exit5: +create_qp_exit6: ehca_cq_unassign_qp(my_qp->send_cq, my_qp->real_qp_num); +create_qp_exit5: + if (my_qp->mod_qp_parm) + kfree(my_qp->mod_qp_parm); + create_qp_exit4: if (HAS_RQ(my_qp)) ipz_queue_dtor(my_pd, &my_qp->ipz_rqueue); @@ -995,7 +1018,7 @@ static int internal_modify_qp(struct ib_qp *ibqp, unsigned long flags = 0; /* do query_qp to obtain current attr values */ - mqpcb = ehca_alloc_fw_ctrlblock(GFP_KERNEL); + mqpcb = ehca_alloc_fw_ctrlblock(GFP_ATOMIC); if (!mqpcb) { ehca_err(ibqp->device, "Could not get zeroed page for mqpcb " "ehca_qp=%p qp_num=%x ", my_qp, ibqp->qp_num); @@ -1183,6 +1206,8 @@ static int internal_modify_qp(struct ib_qp *ibqp, update_mask |= EHCA_BMASK_SET(MQPCB_MASK_PRIM_P_KEY_IDX, 1); } if (attr_mask & IB_QP_PORT) { + struct ehca_sport *sport; + struct ehca_qp *aqp1; if (attr->port_num < 1 || attr->port_num > shca->num_ports) { ret = -EINVAL; ehca_err(ibqp->device, "Invalid port=%x. 
" @@ -1191,6 +1216,29 @@ static int internal_modify_qp(struct ib_qp *ibqp, shca->num_ports); goto modify_qp_exit2; } + sport = &shca->sport[attr->port_num - 1]; + if (!sport->ibqp_sqp[IB_QPT_GSI]) { + /* should not occur */ + ret = -EFAULT; + ehca_err(ibqp->device, "AQP1 was not created for " + "port=%x", attr->port_num); + goto modify_qp_exit2; + } + aqp1 = container_of(sport->ibqp_sqp[IB_QPT_GSI], + struct ehca_qp, ib_qp); + if (ibqp->qp_type != IB_QPT_GSI && + ibqp->qp_type != IB_QPT_SMI && + aqp1->mod_qp_parm) { + /* + * firmware will reject this modify_qp() because + * port is not activated/initialized fully + */ + ret = -EFAULT; + ehca_warn(ibqp->device, "Couldn't modify qp port=%x: " + "either port is being activated (try again) " + "or cabling issue", attr->port_num); + goto modify_qp_exit2; + } mqpcb->prim_phys_port = attr->port_num; update_mask |= EHCA_BMASK_SET(MQPCB_MASK_PRIM_PHYS_PORT, 1); } @@ -1470,6 +1518,8 @@ modify_qp_exit1: int ehca_modify_qp(struct ib_qp *ibqp, struct ib_qp_attr *attr, int attr_mask, struct ib_udata *udata) { + struct ehca_shca *shca = container_of(ibqp->device, struct ehca_shca, + ib_device); struct ehca_qp *my_qp = container_of(ibqp, struct ehca_qp, ib_qp); struct ehca_pd *my_pd = container_of(my_qp->ib_qp.pd, struct ehca_pd, ib_pd); @@ -1482,9 +1532,100 @@ int ehca_modify_qp(struct ib_qp *ibqp, struct ib_qp_attr *attr, int attr_mask, return -EINVAL; } + /* The if-block below caches qp_attr to be modified for GSI and SMI + * qps during the initialization by ib_mad. When the respective port + * is activated, ie we got an event PORT_ACTIVE, we'll replay the + * cached modify calls sequence, see ehca_recover_sqs() below. + * Why that is required: + * 1) If one port is connected, older code requires that port one + * to be connected and module option nr_ports=1 to be given by + * user, which is very inconvenient for end user. + * 2) Firmware accepts modify_qp() only if respective port has become + * active. 
Older code had a wait loop of 30sec create_qp()/ + * define_aqp1(), which is not appropriate in practice. This + * code now removes that wait loop, see define_aqp1(), and always + * reports all ports to ib_mad resp. users. Only activated ports + * will then usable for the users. + */ + if (ibqp->qp_type == IB_QPT_GSI || ibqp->qp_type == IB_QPT_SMI) { + int port = my_qp->init_attr.port_num; + struct ehca_sport *sport = &shca->sport[port - 1]; + unsigned long flags; + spin_lock_irqsave(&sport->mod_sqp_lock, flags); + /* cache qp_attr only during init */ + if (my_qp->mod_qp_parm) { + struct ehca_mod_qp_parm *p; + if (my_qp->mod_qp_parm_idx >= EHCA_MOD_QP_PARM_MAX) { + ehca_err(&shca->ib_device, + "mod_qp_parm overflow state=%x port=%x" + " type=%x", attr->qp_state, + my_qp->init_attr.port_num, + ibqp->qp_type); + spin_unlock_irqrestore(&sport->mod_sqp_lock, + flags); + return -EINVAL; + } + p = &my_qp->mod_qp_parm[my_qp->mod_qp_parm_idx]; + p->mask = attr_mask; + p->attr = *attr; + my_qp->mod_qp_parm_idx++; + ehca_dbg(&shca->ib_device, + "Saved qp_attr for state=%x port=%x type=%x", + attr->qp_state, my_qp->init_attr.port_num, + ibqp->qp_type); + spin_unlock_irqrestore(&sport->mod_sqp_lock, flags); + return 0; + } + spin_unlock_irqrestore(&sport->mod_sqp_lock, flags); + } + return internal_modify_qp(ibqp, attr, attr_mask, 0); } +void ehca_recover_sqp(struct ib_qp *sqp) +{ + struct ehca_qp *my_sqp = container_of(sqp, struct ehca_qp, ib_qp); + int port = my_sqp->init_attr.port_num; + struct ib_qp_attr attr; + struct ehca_mod_qp_parm *qp_parm; + int i, qp_parm_idx, ret; + unsigned long flags, wr_cnt; + + if (!my_sqp->mod_qp_parm) + return; + ehca_dbg(sqp->device, "SQP port=%x qp_num=%x", port, sqp->qp_num); + + qp_parm = my_sqp->mod_qp_parm; + qp_parm_idx = my_sqp->mod_qp_parm_idx; + for (i = 0; i < qp_parm_idx; i++) { + attr = qp_parm[i].attr; + ret = internal_modify_qp(sqp, &attr, qp_parm[i].mask, 0); + if (ret) { + ehca_err(sqp->device, "Could not modify SQP port=%x " 
+ "qp_num=%x ret=%x", port, sqp->qp_num, ret); + goto free_qp_parm; + } + ehca_dbg(sqp->device, "SQP port=%x qp_num=%x in state=%x", + port, sqp->qp_num, attr.qp_state); + } + + /* re-trigger posted recv wrs */ + wr_cnt = my_sqp->ipz_rqueue.current_q_offset / + my_sqp->ipz_rqueue.qe_size; + if (wr_cnt) { + spin_lock_irqsave(&my_sqp->spinlock_r, flags); + hipz_update_rqa(my_sqp, wr_cnt); + spin_unlock_irqrestore(&my_sqp->spinlock_r, flags); + ehca_dbg(sqp->device, "doorbell port=%x qp_num=%x wr_cnt=%lx", + port, sqp->qp_num, wr_cnt); + } + +free_qp_parm: + kfree(qp_parm); + /* this prevents subsequent calls to modify_qp() to cache qp_attr */ + my_sqp->mod_qp_parm = NULL; +} + int ehca_query_qp(struct ib_qp *qp, struct ib_qp_attr *qp_attr, int qp_attr_mask, struct ib_qp_init_attr *qp_init_attr) @@ -1772,6 +1913,7 @@ static int internal_destroy_qp(struct ib_device *dev, struct ehca_qp *my_qp, struct ehca_shca *shca = container_of(dev, struct ehca_shca, ib_device); struct ehca_pd *my_pd = container_of(my_qp->ib_qp.pd, struct ehca_pd, ib_pd); + struct ehca_sport *sport = &shca->sport[my_qp->init_attr.port_num - 1]; u32 cur_pid = current->tgid; u32 qp_num = my_qp->real_qp_num; int ret; @@ -1818,6 +1960,16 @@ static int internal_destroy_qp(struct ib_device *dev, struct ehca_qp *my_qp, port_num = my_qp->init_attr.port_num; qp_type = my_qp->init_attr.qp_type; + if (qp_type == IB_QPT_SMI || qp_type == IB_QPT_GSI) { + spin_lock_irqsave(&sport->mod_sqp_lock, flags); + if (my_qp->mod_qp_parm) { + kfree(my_qp->mod_qp_parm); + my_qp->mod_qp_parm = NULL; + } + shca->sport[port_num - 1].ibqp_sqp[qp_type] = NULL; + spin_unlock_irqrestore(&sport->mod_sqp_lock, flags); + } + /* no support for IB_QPT_SMI yet */ if (qp_type == IB_QPT_GSI) { struct ib_event event; diff --git a/drivers/infiniband/hw/ehca/ehca_sqp.c b/drivers/infiniband/hw/ehca/ehca_sqp.c index f0792e5..79e72b2 100644 --- a/drivers/infiniband/hw/ehca/ehca_sqp.c +++ b/drivers/infiniband/hw/ehca/ehca_sqp.c @@ -40,11 +40,8 @@ 
*/ -#include -#include #include "ehca_classes.h" #include "ehca_tools.h" -#include "ehca_qes.h" #include "ehca_iverbs.h" #include "hcp_if.h" @@ -93,6 +90,9 @@ u64 ehca_define_sqp(struct ehca_shca *shca, return H_PARAMETER; } + if (ehca_nr_ports < 0) /* autodetect mode */ + return H_SUCCESS; + for (counter = 0; shca->sport[port - 1].port_state != IB_PORT_ACTIVE && counter < ehca_port_act_time; -- 1.5.2 From fenkes at de.ibm.com Thu Jan 17 06:07:24 2008 From: fenkes at de.ibm.com (Joachim Fenkes) Date: Thu, 17 Jan 2008 15:07:24 +0100 Subject: [ofa-general] [PATCH 4/4] IB/ehca: Prevent RDMA-related connection failures In-Reply-To: <200801171502.34287.fenkes@de.ibm.com> References: <200801171502.34287.fenkes@de.ibm.com> Message-ID: <200801171507.25145.fenkes@de.ibm.com> Some HW revisions of eHCA2 may cause an RC connection to break if they received RDMA Reads over that connection before. This can be prevented by assuring that, after the first RDMA Read, the QP receives a new RDMA Read every few million link packets. Include code into the driver that inserts an empty (size 0) RDMA Read into the message stream every now and then if the consumer doesn't post them frequently enough. 
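As a rough illustration of the bookkeeping described above (not the driver code itself), the heuristic can be sketched in plain C. The struct and function names used here (`circ_state`, `circ_account`, `circ_needed`) are made up for this sketch; the threshold value and the `(len >> mtu_shift) + 1` packet estimate follow the patch below.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names throughout; an illustration only, not ehca code. */
#define CIRC_THRESHOLD 2000000u  /* same value as ACK_CIRC_THRESHOLD in the patch */

struct circ_state {
	int      rdma_read_seen; /* set once the consumer has posted an RDMA Read */
	int      mtu_shift;      /* log2 of the path MTU in bytes (IB MTU enum + 7) */
	uint32_t max_send_wr;    /* send queue depth as seen by the consumer */
	uint32_t message_count;  /* work requests posted since the last RDMA Read */
	uint32_t packet_count;   /* estimated link packets since the last RDMA Read */
};

/* Account for one posted work request carrying len bytes of payload. */
static void circ_account(struct circ_state *s, int is_rdma_read, uint64_t len)
{
	assert(s->mtu_shift >= 8 && s->mtu_shift <= 12); /* 256..4096 byte MTU */
	if (is_rdma_read) {
		/* a real RDMA Read arms the workaround and resets both counters */
		s->rdma_read_seen = 1;
		s->message_count = 0;
		s->packet_count = 0;
	} else {
		/* otherwise estimate how many link packets this request needs */
		s->packet_count += (uint32_t)(len >> s->mtu_shift) + 1;
	}
	s->message_count++;
}

/* Nonzero when an empty RDMA Read should be slipped into the send queue. */
static int circ_needed(const struct circ_state *s)
{
	return s->rdma_read_seen &&
	       s->packet_count > CIRC_THRESHOLD &&
	       s->message_count > s->max_send_wr;
}
```

The `+ 1` overestimates by one packet whenever the payload is an exact multiple of the MTU, so the extra read may fire slightly early; that is presumably harmless. The `message_count > max_send_wr` condition corresponds to point 3 in the patch comment: it is only true once enough requests have been posted that any earlier extra read must already have left the queue.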
Signed-off-by: Joachim Fenkes --- drivers/infiniband/hw/ehca/ehca_classes.h | 5 ++ drivers/infiniband/hw/ehca/ehca_qp.c | 14 +++- drivers/infiniband/hw/ehca/ehca_reqs.c | 112 ++++++++++++++++++++-------- 3 files changed, 95 insertions(+), 36 deletions(-) diff --git a/drivers/infiniband/hw/ehca/ehca_classes.h b/drivers/infiniband/hw/ehca/ehca_classes.h index 2502366..f281d16 100644 --- a/drivers/infiniband/hw/ehca/ehca_classes.h +++ b/drivers/infiniband/hw/ehca/ehca_classes.h @@ -183,6 +183,11 @@ struct ehca_qp { u32 mm_count_squeue; u32 mm_count_rqueue; u32 mm_count_galpa; + /* unsolicited ack circumvention */ + int unsol_ack_circ; + int mtu_shift; + u32 message_count; + u32 packet_count; }; #define IS_SRQ(qp) (qp->ext_type == EQPT_SRQ) diff --git a/drivers/infiniband/hw/ehca/ehca_qp.c b/drivers/infiniband/hw/ehca/ehca_qp.c index bb7ccef..6c050e0 100644 --- a/drivers/infiniband/hw/ehca/ehca_qp.c +++ b/drivers/infiniband/hw/ehca/ehca_qp.c @@ -592,10 +592,8 @@ static struct ehca_qp *internal_create_qp( goto create_qp_exit1; } - if (init_attr->sq_sig_type == IB_SIGNAL_ALL_WR) - parms.sigtype = HCALL_SIGT_EVERY; - else - parms.sigtype = HCALL_SIGT_BY_WQE; + /* Always signal by WQE so we can hide circ. 
WQEs */ + parms.sigtype = HCALL_SIGT_BY_WQE; /* UD_AV CIRCUMVENTION */ max_send_sge = init_attr->cap.max_send_sge; @@ -618,6 +616,10 @@ static struct ehca_qp *internal_create_qp( parms.squeue.max_sge = max_send_sge; parms.rqueue.max_sge = max_recv_sge; + /* RC QPs need one more SWQE for unsolicited ack circumvention */ + if (qp_type == IB_QPT_RC) + parms.squeue.max_wr++; + if (EHCA_BMASK_GET(HCA_CAP_MINI_QP, shca->hca_cap)) { if (HAS_SQ(my_qp)) ehca_determine_small_queue( @@ -650,6 +652,8 @@ static struct ehca_qp *internal_create_qp( parms.squeue.act_nr_sges = 1; parms.rqueue.act_nr_sges = 1; } + /* hide the extra WQE */ + parms.squeue.act_nr_wqes--; break; case IB_QPT_UD: case IB_QPT_GSI: @@ -1295,6 +1299,8 @@ static int internal_modify_qp(struct ib_qp *ibqp, } if (attr_mask & IB_QP_PATH_MTU) { + /* store ld(MTU) */ + my_qp->mtu_shift = attr->path_mtu + 7; mqpcb->path_mtu = attr->path_mtu; update_mask |= EHCA_BMASK_SET(MQPCB_MASK_PATH_MTU, 1); } diff --git a/drivers/infiniband/hw/ehca/ehca_reqs.c b/drivers/infiniband/hw/ehca/ehca_reqs.c index ea91360..3aacc8c 100644 --- a/drivers/infiniband/hw/ehca/ehca_reqs.c +++ b/drivers/infiniband/hw/ehca/ehca_reqs.c @@ -50,6 +50,9 @@ #include "hcp_if.h" #include "hipz_fns.h" +/* in RC traffic, insert an empty RDMA READ every this many packets */ +#define ACK_CIRC_THRESHOLD 2000000 + static inline int ehca_write_rwqe(struct ipz_queue *ipz_rqueue, struct ehca_wqe *wqe_p, struct ib_recv_wr *recv_wr) @@ -81,7 +84,7 @@ static inline int ehca_write_rwqe(struct ipz_queue *ipz_rqueue, if (ehca_debug_level) { ehca_gen_dbg("RECEIVE WQE written into ipz_rqueue=%p", ipz_rqueue); - ehca_dmp( wqe_p, 16*(6 + wqe_p->nr_of_data_seg), "recv wqe"); + ehca_dmp(wqe_p, 16*(6 + wqe_p->nr_of_data_seg), "recv wqe"); } return 0; @@ -135,7 +138,8 @@ static void trace_send_wr_ud(const struct ib_send_wr *send_wr) static inline int ehca_write_swqe(struct ehca_qp *qp, struct ehca_wqe *wqe_p, - const struct ib_send_wr *send_wr) + const struct ib_send_wr 
*send_wr, + int hidden) { u32 idx; u64 dma_length; @@ -176,7 +180,9 @@ static inline int ehca_write_swqe(struct ehca_qp *qp, wqe_p->wr_flag = 0; - if (send_wr->send_flags & IB_SEND_SIGNALED) + if ((send_wr->send_flags & IB_SEND_SIGNALED || + qp->init_attr.sq_sig_type == IB_SIGNAL_ALL_WR) + && !hidden) wqe_p->wr_flag |= WQE_WRFLAG_REQ_SIGNAL_COM; if (send_wr->opcode == IB_WR_SEND_WITH_IMM || @@ -199,7 +205,7 @@ static inline int ehca_write_swqe(struct ehca_qp *qp, wqe_p->destination_qp_number = send_wr->wr.ud.remote_qpn << 8; wqe_p->local_ee_context_qkey = remote_qkey; - if (!send_wr->wr.ud.ah) { + if (unlikely(!send_wr->wr.ud.ah)) { ehca_gen_err("wr.ud.ah is NULL. qp=%p", qp); return -EINVAL; } @@ -255,6 +261,15 @@ static inline int ehca_write_swqe(struct ehca_qp *qp, } /* eof idx */ wqe_p->u.nud.atomic_1st_op_dma_len = dma_length; + /* unsolicited ack circumvention */ + if (send_wr->opcode == IB_WR_RDMA_READ) { + /* on RDMA read, switch on and reset counters */ + qp->message_count = qp->packet_count = 0; + qp->unsol_ack_circ = 1; + } else + /* else estimate #packets */ + qp->packet_count += (dma_length >> qp->mtu_shift) + 1; + break; default: @@ -355,13 +370,49 @@ static inline void map_ib_wc_status(u32 cqe_status, *wc_status = IB_WC_SUCCESS; } +static inline int post_one_send(struct ehca_qp *my_qp, + struct ib_send_wr *cur_send_wr, + struct ib_send_wr **bad_send_wr, + int hidden) +{ + struct ehca_wqe *wqe_p; + int ret; + u64 start_offset = my_qp->ipz_squeue.current_q_offset; + + /* get pointer next to free WQE */ + wqe_p = ipz_qeit_get_inc(&my_qp->ipz_squeue); + if (unlikely(!wqe_p)) { + /* too many posted work requests: queue overflow */ + if (bad_send_wr) + *bad_send_wr = cur_send_wr; + ehca_err(my_qp->ib_qp.device, "Too many posted WQEs " + "qp_num=%x", my_qp->ib_qp.qp_num); + return -ENOMEM; + } + /* write a SEND WQE into the QUEUE */ + ret = ehca_write_swqe(my_qp, wqe_p, cur_send_wr, hidden); + /* + * if something failed, + * reset the free entry pointer to 
the start value + */ + if (unlikely(ret)) { + my_qp->ipz_squeue.current_q_offset = start_offset; + if (bad_send_wr) + *bad_send_wr = cur_send_wr; + ehca_err(my_qp->ib_qp.device, "Could not write WQE " + "qp_num=%x", my_qp->ib_qp.qp_num); + return -EINVAL; + } + + return 0; +} + int ehca_post_send(struct ib_qp *qp, struct ib_send_wr *send_wr, struct ib_send_wr **bad_send_wr) { struct ehca_qp *my_qp = container_of(qp, struct ehca_qp, ib_qp); struct ib_send_wr *cur_send_wr; - struct ehca_wqe *wqe_p; int wqe_cnt = 0; int ret = 0; unsigned long flags; @@ -369,37 +420,33 @@ int ehca_post_send(struct ib_qp *qp, /* LOCK the QUEUE */ spin_lock_irqsave(&my_qp->spinlock_s, flags); + /* Send an empty extra RDMA read if: + * 1) there has been an RDMA read on this connection before + * 2) no RDMA read occurred for ACK_CIRC_THRESHOLD link packets + * 3) we can be sure that any previous extra RDMA read has been + * processed so we don't overflow the SQ + */ + if (unlikely(my_qp->unsol_ack_circ && + my_qp->packet_count > ACK_CIRC_THRESHOLD && + my_qp->message_count > my_qp->init_attr.cap.max_send_wr)) { + /* insert an empty RDMA READ to fix up the remote QP state */ + struct ib_send_wr circ_wr; + memset(&circ_wr, 0, sizeof(circ_wr)); + circ_wr.opcode = IB_WR_RDMA_READ; + post_one_send(my_qp, &circ_wr, NULL, 1); /* ignore retcode */ + wqe_cnt++; + ehca_dbg(qp->device, "posted circ wr qp_num=%x", qp->qp_num); + my_qp->message_count = my_qp->packet_count = 0; + } + /* loop processes list of send reqs */ for (cur_send_wr = send_wr; cur_send_wr != NULL; cur_send_wr = cur_send_wr->next) { - u64 start_offset = my_qp->ipz_squeue.current_q_offset; - /* get pointer next to free WQE */ - wqe_p = ipz_qeit_get_inc(&my_qp->ipz_squeue); - if (unlikely(!wqe_p)) { - /* too many posted work requests: queue overflow */ - if (bad_send_wr) - *bad_send_wr = cur_send_wr; - if (wqe_cnt == 0) { - ret = -ENOMEM; - ehca_err(qp->device, "Too many posted WQEs " - "qp_num=%x", qp->qp_num); - } - goto 
post_send_exit0; - } - /* write a SEND WQE into the QUEUE */ - ret = ehca_write_swqe(my_qp, wqe_p, cur_send_wr); - /* - * if something failed, - * reset the free entry pointer to the start value - */ + ret = post_one_send(my_qp, cur_send_wr, bad_send_wr, 0); if (unlikely(ret)) { - my_qp->ipz_squeue.current_q_offset = start_offset; - *bad_send_wr = cur_send_wr; - if (wqe_cnt == 0) { - ret = -EINVAL; - ehca_err(qp->device, "Could not write WQE " - "qp_num=%x", qp->qp_num); - } + /* if one or more WQEs were successful, don't fail */ + if (wqe_cnt) + ret = 0; goto post_send_exit0; } wqe_cnt++; @@ -410,6 +457,7 @@ int ehca_post_send(struct ib_qp *qp, post_send_exit0: iosync(); /* serialize GAL register access */ hipz_update_sqa(my_qp, wqe_cnt); + my_qp->message_count += wqe_cnt; spin_unlock_irqrestore(&my_qp->spinlock_s, flags); return ret; } -- 1.5.2 From moshek at voltaire.com Thu Jan 17 06:33:02 2008 From: moshek at voltaire.com (Moshe Kazir) Date: Thu, 17 Jan 2008 16:33:02 +0200 Subject: [ofa-general] ofed against lustre kernel In-Reply-To: <97a7c7ed0801101216p16ad7f55padac2b7200c3a4be@mail.gmail.com> References: <97a7c7ed0801101216p16ad7f55padac2b7200c3a4be@mail.gmail.com> Message-ID: <39C75744D164D948A170E9792AF8E7CAC5AD17@exil.voltaire.com> Try the attached file. Moshe ____________________________________________________________ Moshe Katzir | +972-9971-8639 (o) | +972-52-860-6042 (m) Voltaire - The Grid Backbone www.voltaire.com -----Original Message----- From: general-bounces at lists.openfabrics.org [mailto:general-bounces at lists.openfabrics.org] On Behalf Of Michael Di Domenico Sent: Thursday, January 10, 2008 10:17 PM To: general at lists.openfabrics.org Subject: [ofa-general] ofed against lustre kernel Is there a trick to getting OFED 1.2.5.1 to compile against the lustre kernel on RedHat 5 x86_64? The first time i tried i got the below error, which looks like a problem with the lustre kernel source tree. 
I'm trying to work through it, but I have a feeling I'm starting to wander down a rabbit hole... Does anyone know if there is a step guide for installing redhat 5, ofed 1.2.5.1, and lustre 1.6.4.1? gcc -Wp,-MD,/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2.5.1/net/rds/.ib_sysctl.o.d -nostdinc -isystem /usr/lib/gcc-I/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2.5.1/kernel_addons/backport/2.6.18_FC6/include/ \ -I/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2.5.1/include \ -I/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2.5.1/drivers/infiniband/include \ -Iinclude \ -Iinclude2 -I/usr/src/linux-2.6.18-8.1.14.el5_lustre.1.6.4.1/include \ -include include/linux/autoconf.h \ -include /var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2.5.1/include/linux/autoconf.h \ -I/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2.5.1/net/rds -Wall -Wundef -Wstrict-prototypes -Wno-trigraphs -fno-st ld -m elf_x86_64 -r -o /var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2.5.1/net/rds/rds.o /var/tmp/OFEDRPM/BUILD/ofa_ke Building modules, stage 2. make -rR -f /usr/src/linux-2.6.18-8.1.14.el5_lustre.1.6.4.1/scripts/Makefile.modpost /usr/src/linux-2.6.18-8.1.14.el5_lustre.1.6.4.1/scripts/Makefile.modpost:38: include/config/auto.conf: No such make[4]: *** No rule to make target `include/config/auto.conf'. Stop. make[3]: *** [modules] Error 2 make[2]: *** [modules] Error 2 make[1]: *** [modules] Error 2 make[1]: Leaving directory `/usr/src/linux-2.6.18-8.1.14.el5_lustre.1.6.4.1-obj/x86_64/smp' make: *** [kernel] Error 2 error: Bad exit status from /var/tmp/rpm-tmp.65645 (%install) _______________________________________________ general mailing list general at lists.openfabrics.org http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general -------------- next part -------------- A non-text attachment was scrubbed...
Name: zzz_0070_2.6.9-55.0.2.EL_lustre.1.6.2smp_backport.diff Type: application/octet-stream Size: 435 bytes Desc: zzz_0070_2.6.9-55.0.2.EL_lustre.1.6.2smp_backport.diff URL: From ahubbe at iol.unh.edu Thu Jan 17 06:34:36 2008 From: ahubbe at iol.unh.edu (Allen Hubbe) Date: Thu, 17 Jan 2008 09:34:36 -0500 (EST) Subject: [ofa-general] Support for Ammasso in OFED 1.3? Message-ID: Hello, Are there plans to include support for Ammasso hardware with OFED 1.3? Is there anything that prevents the iw_c2 module or libamso from being distributed and installed with the rest of OFED? Allen Hubbe From moshek at voltaire.com Thu Jan 17 06:37:07 2008 From: moshek at voltaire.com (Moshe Kazir) Date: Thu, 17 Jan 2008 16:37:07 +0200 Subject: [ofa-general] build & install error In-Reply-To: <200801101838.26820.slava@auto.ru> References: <200801101838.26820.slava@auto.ru> Message-ID: <39C75744D164D948A170E9792AF8E7CAC5AD18@exil.voltaire.com> For CentOS 5.1 you need the RH 5.1 backporting included in OFED-1.2.5.5-rc2 you may download it from http://www.openfabrics.org/builds/connectx/OFED-1.2.5.5-rc2.tgz Moshe ____________________________________________________________ Moshe Katzir | +972-9971-8639 (o) | +972-52-860-6042 (m) Voltaire - The Grid Backbone www.voltaire.com -----Original Message----- From: general-bounces at lists.openfabrics.org [mailto:general-bounces at lists.openfabrics.org] On Behalf Of Viatcheslav E. Kouznetsov Sent: Thursday, January 10, 2008 5:38 PM To: general at lists.openfabrics.org Subject: [ofa-general] build & install error Hi All! 
I have some trouble building & installing OFED software. If I try to build OFED-1.2.5.4, I get the following error ---- make -f scripts/Makefile.build obj=/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2.5.4/net/rds gcc -Wp,-MD,/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2.5.4/net/rds/.af_rds.o.d -nostdinc -isystem /usr/lib/gcc/x86_64-redhat-linux/4.1.2/include -D__KERNEL__ \ -I/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2.5.4/kernel_addons/backport/2.6.18_RH_5.1/include/ \ -I/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2.5.4/include \ -I/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2.5.4/drivers/infiniband/include \ -Iinclude \ \ -include include/linux/autoconf.h \ -include /var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2.5.4/include/linux/autoconf.h \ -Wall -Wundef -Wstrict-prototypes -Wno-trigraphs -fno-strict-aliasing -fno-common -Wstrict-prototypes -Wundef -Werror-implicit-function-declaration -Os -mtune=generic -m64 -mno-red-zone -mcmodel=kernel -pipe -fno-reorder-blocks -Wno-sign-compare -fno-asynchronous-unwind-tables -funit-at-a-time -mno-sse -mno-mmx -mno-sse2 -mno-3dnow -fomit-frame-pointer -g -fno-stack-protector -Wdeclaration-after-statement -Wno-pointer-sign -I/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2.5.4/include -I/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2.5.4/drivers/infiniband/include -I/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2.5.4/drivers/infiniband/ulp/ipoib -I/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2.5.4/drivers/infiniband/debug -I/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2.5.4/drivers/infiniband/hw/cxgb3/core -I/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2.5.4/drivers/net/cxgb3 -I/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2.5.4/net/rds -I/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2.5.4/drivers/net/mlx4 -I/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2.5.4/drivers/infiniband/hw/mlx4 -DMODULE -D"KBUILD_STR(s)=#s" -D"KBUILD_BASENAME=KBUILD_STR(af_rds)" -D"KBUILD_MODNAME=KBUILD_STR(rds)" -c -o /var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2.5.4/net/rds/.tmp_af_rds.o /var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2.5.4/net/rds/af_rds.c In file included from
/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2.5.4/net/rds/af_rds.c:39: /var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2.5.4/net/rds/rds.h:167: error: expected specifier-qualifier-list before '__sum16' /var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2.5.4/net/rds/rds.h: In function 'rds_message_make_checksum': /var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2.5.4/net/rds/rds.h:480: error: 'struct rds_header' has no member named 'h_csum' /var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2.5.4/net/rds/rds.h:481: error: 'struct rds_header' has no member named 'h_csum' /var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2.5.4/net/rds/rds.h: In function 'rds_message_verify_checksum': /var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2.5.4/net/rds/rds.h:486: error: 'const struct rds_header' has no member named 'h_csum' make[3]: *** [/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2.5.4/net/rds/af_rds.o] Error 1 make[2]: *** [/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2.5.4/net/rds] Error 2 make[1]: *** [_module_/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2.5.4] Error 2 make[1]: Leaving directory `/usr/src/redhat/BUILD/kernel-2.6.18/linux-2.6.18.x86_64' make: *** [kernel] Error 2 When I build OFED-1.2.5.4-20080107-0713, the build is OK, but when I run the modprobe command, I get the following error ---- [root at blade02 ~]# modprobe ib_ipoib WARNING: Error inserting ib_core (/lib/modules/2.6.18-53.el5/updates/kernel/drivers/infiniband/core/ib_core.ko): Invalid module format WARNING: Error inserting ib_mad (/lib/modules/2.6.18-53.el5/updates/kernel/drivers/infiniband/core/ib_mad.ko): Invalid module format WARNING: Error inserting ib_sa (/lib/modules/2.6.18-53.el5/updates/kernel/drivers/infiniband/core/ib_sa.ko): Invalid module format WARNING: Error inserting ib_cm (/lib/modules/2.6.18-53.el5/updates/kernel/drivers/infiniband/core/ib_cm.ko): Invalid module format FATAL: Error inserting ib_ipoib (/lib/modules/2.6.18-53.el5/updates/kernel/drivers/infiniband/ulp/ipoib/ib_ipoib.ko): Invalid module format OS - CentOS 5.1 Kernel 2.6.18-53.el5 Hardware Supermicro SuperBlade with
http://supermicro.com/products/superblade/module/SBI-7125B-T1.cfm modules. lspci -vv: InfiniBand: Mellanox Technologies MT25208 InfiniHost III Ex (rev 20) Subsystem: Mellanox Technologies MT25208 InfiniHost III Ex
From ogerlitz at voltaire.com Thu Jan 17 07:03:45 2008 From: ogerlitz at voltaire.com (Or Gerlitz) Date: Thu, 17 Jan 2008 17:03:45 +0200 (IST) Subject: [ofa-general] [PATCH] ib/ipoib: handle Gratuitous ARP & bonding failover race also for connected mode neighbours In-Reply-To: References: Message-ID: move a little up the code that checks for a situation where the remote GID stored in the ipoib_neigh is different than the one present in the neighbour (handle Gratuitous ARP) or that a bonding fail over has happened but the neighbour still has a pointer to an ipoib_neigh created not by the current slave. This will cause the driver to apply the check also for connected mode neighbours. Signed-off-by: Or Gerlitz I have tested this patch on 2.6.24-rc1 (and its now in progress for 2.6.24-rc8) things are basically working fine, but I do want to play more with bonding fail-overs to make sure nothing was broken wrt to Gratuitous ARP etc, will let you know. ----- Index: linux-2.6.24-rc8/drivers/infiniband/ulp/ipoib/ipoib_main.c =================================================================== --- linux-2.6.24-rc8.orig/drivers/infiniband/ulp/ipoib/ipoib_main.c 2008-01-17 16:37:10.000000000 +0200 +++ linux-2.6.24-rc8/drivers/infiniband/ulp/ipoib/ipoib_main.c 2008-01-17 16:46:51.000000000 +0200 @@ -686,13 +686,8 @@ static int ipoib_start_xmit(struct sk_bu } neigh = *to_ipoib_neigh(skb->dst->neighbour); - - if (ipoib_cm_get(neigh)) { - if (ipoib_cm_up(neigh)) { - ipoib_cm_send(dev, skb, ipoib_cm_get(neigh)); - goto out; - } - } else if (neigh->ah) { + + if (neigh->ah) if (unlikely((memcmp(&neigh->dgid.raw, skb->dst->neighbour->ha + 4, sizeof(union ib_gid))) || @@ -713,6 +708,12 @@ static int ipoib_start_xmit(struct sk_bu goto out; } + if (ipoib_cm_get(neigh)) { + if (ipoib_cm_up(neigh)) { + ipoib_cm_send(dev, skb, ipoib_cm_get(neigh)); + goto out; + } + } else if (neigh->ah) { ipoib_send(dev, skb, neigh->ah, IPOIB_QPN(skb->dst->neighbour->ha)); goto out; } From
bart.vanassche at gmail.com Thu Jan 17 07:23:31 2008 From: bart.vanassche at gmail.com (Bart Van Assche) Date: Thu, 17 Jan 2008 16:23:31 +0100 Subject: [ofa-general] MT25204 versus MT25208 Message-ID: Hello, Four hosts are connected through an InfiniBand network, two with a MT25204 HCA and two with a MT25208 HCA. All four hosts communicate fine via IPoIB, and iperf shows the expected results. Until now I only succeeded to run ib_rdma_bw between the MT25204 interfaces. When I run ib_rdma_bw on one of the MT25208 interfaces a strange error message is printed. Any idea how I can fix this ? MT25204 to MT25204: root at 192.168.102.5:~# ib_rdma_bw 6467: | port=18515 | ib_port=1 | size=65536 | tx_depth=100 | iters=1000 | duplex=0 | cma=0 | 6467: Local address: LID 0x06, QPN 0x140405, PSN 0x355f4d RKey 0x22002400 VAddr 0x002ad73ba76000 6467: Remote address: LID 0x11, QPN 0x80405, PSN 0xa189f4, RKey 0x40002400 VAddr 0x002aff033e3000 root at 192.168.102.12:~# ib_rdma_bw 192.168.102.5 6857: | port=18515 | ib_port=1 | size=65536 | tx_depth=100 | iters=1000 | duplex=0 | cma=0 | 6857: Local address: LID 0x11, QPN 0x80405, PSN 0xa189f4 RKey 0x40002400 VAddr 0x002aff033e3000 6857: Remote address: LID 0x06, QPN 0x140405, PSN 0x355f4d, RKey 0x22002400 VAddr 0x002ad73ba76000 6857: Bandwidth peak (#0 to #989): 674.73 MB/sec 6857: Bandwidth average: 674.711 MB/sec 6857: Service Demand peak (#0 to #989): 2887 cycles/KB 6857: Service Demand Avg : 2887 cycles/KB MT25208 to MT25204: root at 192.168.102.5:~# ib_rdma_bw 6468: | port=18515 | ib_port=1 | size=65536 | tx_depth=100 | iters=1000 | duplex=0 | cma=0 | server read: Success 6468:pp_server_exch_dest: 0/45 Couldn't read remote address root at 192.168.102.100:~# ib_rdma_bw 192.168.102.5 6466: | port=18515 | ib_port=1 | size=65536 | tx_depth=100 | iters=1000 | duplex=0 | cma=0 | 6466:main: Local lid 0x0 detected. Is an SM running? Software versions: * Linux kernel 2.6.22.9, 64-bit version. * OFED-1.2.5.4 userspace components. Bart. 
From glebn at voltaire.com Thu Jan 17 07:30:44 2008 From: glebn at voltaire.com (Gleb Natapov) Date: Thu, 17 Jan 2008 17:30:44 +0200 Subject: [ewg] RE: [ofa-general] OFED Jan 14 meeting summary on RC2readiness In-Reply-To: References: <478D1A49.1080807@mellanox.co.il> <000201c857b9$e67ee020$a937170a@amr.corp.intel.com> <20080116073459.GA20554@minantech.com> Message-ID: <20080117153043.GA10065@minantech.com> On Wed, Jan 16, 2008 at 09:35:39PM -0800, Roland Dreier wrote: > > Roland, you said that XRC API is ugly, are you going to push it upstream > > in its present form? > > That's a good question. Since there is no 'present form' for XRC as > far as I can tell, it's hard to make a definitive answer. Certainly I There is a proposed API. Jack is working on an implementation. The API is not pretty at all, but it seems that with the way XRC is implemented in HW it is hard to think of a better one. It is very important to decide if the API is good enough for the kernel proper before releasing OFED 1.3. After that the damage will be done. > haven't made up my mind in advance one way or another. In addition to > seeing how the code ends up, I think the other big piece of the puzzle > is to hear from the Open MPI team and other consumers of the API and > find out how big the benefit is. > Well, I can't speak for everyone, but in my opinion if someone wants to run an MPI job so huge that XRC absolutely has to be used to be able to actually finish it then he should seriously rethink his application design. This is only my opinion of course, I am sure if you ask Mellanox they will tell you that XRC is the best thing that happened to networking since the invention of Infiniband :). XRC can be used not just for scalability BTW. It can be used as a way to post differently sized buffers to the same QP and this is very useful, but for this kind of usage the ugliest parts of the API are not needed. I will be glad to hear other people's opinions too (I know Mellanox's one). -- Gleb.
From eli at mellanox.co.il Thu Jan 17 07:30:23 2008 From: eli at mellanox.co.il (Eli Cohen) Date: Thu, 17 Jan 2008 17:30:23 +0200 Subject: [ofa-general] [PATCH 4/16] Add checksum offload support for ipoib In-Reply-To: <478F55A4.7070706@voltaire.com> References: <1200501463.13546.73.camel@mtls03> <478F55A4.7070706@voltaire.com> Message-ID: <1200583823.6925.47.camel@mtls03> On Thu, 2008-01-17 at 15:18 +0200, Or Gerlitz wrote: > Eli Cohen wrote: > > --- a/drivers/infiniband/ulp/ipoib/ipoib_ib.c > > +++ b/drivers/infiniband/ulp/ipoib/ipoib_ib.c > > @@ -231,6 +232,18 @@ static void ipoib_ib_handle_rx_wc(struct net_device *dev, struct ib_wc *wc) > > skb->dev = dev; > > /* XXX get correct PACKET_ type here */ > > skb->pkt_type = PACKET_HOST; > > + > > + /* check rx csum */ > > + if (test_bit(IPOIB_FLAG_CSUM, &priv->flags) && likely(wc->csum_ok)) { > > First, since the device IB_DEVICE_IP_CSUM capability means that "devices > which publish this capability must support insertion of UDP and TCP > checksum on outgoing packets and can verify the validity of checksum for > incoming packets" the IPOIB_FLAG_CSUM bit is redundant, I suggest to > remove it. IB_DEVICE_IP_CSUM is used by IPOIB to mark the devices which are capable of verifying checksum. This flag is reflected in both IPOIB_FLAG_CSUM and NETIF_F_IP_CSUM. Is your suggestion to use NETIF_F_IP_CSUM instead of IPOIB_FLAG_CSUM? Because I don't see how we can omit it unless we require the low level drivers to set cqe.csum_ok only if they are sure that checksum is OK. > Second, the csum_ok bit is not well defined, etc as I > commented on patch #4 I will make it clearer by stating that Mellanox devices require one more condition but the semantics of this field is that it is set only if checksum is known to be good. Another option is to use the condition I put in ipoib in the low level driver and that will remove the confusion. What do you think?
> > > @@ -394,6 +407,12 @@ void ipoib_send(struct net_device *dev, struct sk_buff *skb, > > return; > > } > > > > + if (priv->ca->flags & IB_DEVICE_IP_CSUM && > > + skb->ip_summed == CHECKSUM_PARTIAL) > > + priv->tx_wr.send_flags |= IB_SEND_IP_CSUM; > > + else > > + priv->tx_wr.send_flags &= IB_SEND_IP_CSUM; > > I think that the code would be somehow clearer if you use dev->features > and not priv->ca->flags > OK, I will change this. From dotanb at dev.mellanox.co.il Thu Jan 17 07:38:15 2008 From: dotanb at dev.mellanox.co.il (Dotan Barak) Date: Thu, 17 Jan 2008 17:38:15 +0200 Subject: [ofa-general] MT25204 versus MT25208 In-Reply-To: References: Message-ID: <478F7667.3090602@dev.mellanox.co.il> Hi. Bart Van Assche wrote: > Hello, > > Four hosts are connected through an InfiniBand network, two with a > MT25204 HCA and two with a MT25208 HCA. All four hosts communicate > fine via IPoIB, and iperf shows the expected results. Until now I only > succeeded to run ib_rdma_bw between the MT25204 interfaces. When I run > ib_rdma_bw on one of the MT25208 interfaces a strange error message is > printed. Any idea how I can fix this ? > > > MT25208 to MT25204: > root at 192.168.102.5:~# ib_rdma_bw > 6468: | port=18515 | ib_port=1 | size=65536 | tx_depth=100 | > iters=1000 | duplex=0 | cma=0 | > server read: Success > 6468:pp_server_exch_dest: 0/45 Couldn't read remote address > root at 192.168.102.100:~# ib_rdma_bw 192.168.102.5 > 6466: | port=18515 | ib_port=1 | size=65536 | tx_depth=100 | > iters=1000 | duplex=0 | cma=0 | > 6466:main: Local lid 0x0 detected. Is an SM running? > What is the output of ibv_devinfo in both of those devices? (maybe you need to specify the IB port value or the device name?) 
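A note on the send-path hunk quoted above: the else branch masks send_flags with IB_SEND_IP_CSUM itself, which appears to keep only that bit and drop every other flag. If the intent is to clear the checksum request when it does not apply, the usual idiom is to mask with the complement. The sketch below uses hypothetical flag values (SEND_SIGNALED_FLAG, SEND_IP_CSUM_FLAG) and hypothetical helper names, not the real verbs constants, to show the difference.

```c
#include <stdint.h>

/* Hypothetical flag bits for illustration only; not the real IB_SEND_* values. */
#define SEND_SIGNALED_FLAG (1u << 1)
#define SEND_IP_CSUM_FLAG  (1u << 4)

/* Set the checksum bit when wanted, otherwise clear only that bit. */
uint32_t update_csum_flag(uint32_t send_flags, int want_csum)
{
    if (want_csum)
        send_flags |= SEND_IP_CSUM_FLAG;
    else
        send_flags &= ~SEND_IP_CSUM_FLAG;   /* complement: other bits survive */
    return send_flags;
}

/* Same logic written as in the quoted hunk, for comparison. */
uint32_t update_csum_flag_as_quoted(uint32_t send_flags, int want_csum)
{
    if (want_csum)
        send_flags |= SEND_IP_CSUM_FLAG;
    else
        send_flags &= SEND_IP_CSUM_FLAG;    /* keeps only the checksum bit */
    return send_flags;
}
```

Starting from a word with both bits set and want_csum == 0, the first helper leaves only the signaled bit, while the second leaves only the checksum bit.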
Dotan From bart.vanassche at gmail.com Thu Jan 17 07:50:37 2008 From: bart.vanassche at gmail.com (Bart Van Assche) Date: Thu, 17 Jan 2008 16:50:37 +0100 Subject: [ofa-general] MT25204 versus MT25208 In-Reply-To: <478F7667.3090602@dev.mellanox.co.il> References: <478F7667.3090602@dev.mellanox.co.il> Message-ID: On Jan 17, 2008 4:38 PM, Dotan Barak wrote: > > Bart Van Assche wrote: > > > > Four hosts are connected through an InfiniBand network, two with a > > MT25204 HCA and two with a MT25208 HCA. All four hosts communicate > > fine via IPoIB, and iperf shows the expected results. Until now I only > > succeeded to run ib_rdma_bw between the MT25204 interfaces. When I run > > ib_rdma_bw on one of the MT25208 interfaces a strange error message is > > printed. Any idea how I can fix this ? > > > > MT25208 to MT25204: > > root at 192.168.102.5:~# ib_rdma_bw > > 6468: | port=18515 | ib_port=1 | size=65536 | tx_depth=100 | > > iters=1000 | duplex=0 | cma=0 | > > server read: Success > > 6468:pp_server_exch_dest: 0/45 Couldn't read remote address > > root at 192.168.102.100:~# ib_rdma_bw 192.168.102.5 > > 6466: | port=18515 | ib_port=1 | size=65536 | tx_depth=100 | > > iters=1000 | duplex=0 | cma=0 | > > 6466:main: Local lid 0x0 detected. Is an SM running? > > > > What is the output of ibv_devinfo in both of those devices? > > (maybe you need to specify the IB port value or the device name?) Thanks, after specifying -i 2 as an argument to ib_rdma_bw the test runs fine. Would it be possible to let the ib_rdma_bw program determine this parameter automatically ? In my setup the InfiniBand cable is connected to the second port of the MT25208 interfaces. 
root at 192.168.102.100:~# ibv_devinfo hca_id: mthca0 fw_ver: 4.8.200 node_guid: 0008:f104:0399:3474 sys_image_guid: 0008:f104:0399:3477 vendor_id: 0x08f1 vendor_part_id: 25208 hw_ver: 0xA0 board_id: VLT0060010001 phys_port_cnt: 2 port: 1 state: PORT_DOWN (1) max_mtu: 2048 (4) active_mtu: 512 (2) sm_lid: 0 port_lid: 0 port_lmc: 0x00 port: 2 state: PORT_ACTIVE (4) max_mtu: 2048 (4) active_mtu: 2048 (4) sm_lid: 1 port_lid: 18 port_lmc: 0x00 Bart. From dotanb at dev.mellanox.co.il Thu Jan 17 07:55:48 2008 From: dotanb at dev.mellanox.co.il (Dotan Barak) Date: Thu, 17 Jan 2008 17:55:48 +0200 Subject: [ofa-general] MT25204 versus MT25208 In-Reply-To: References: <478F7667.3090602@dev.mellanox.co.il> Message-ID: <478F7A84.3070401@dev.mellanox.co.il> Bart Van Assche wrote: > On Jan 17, 2008 4:38 PM, Dotan Barak wrote: > >> Bart Van Assche wrote: >> >>> Four hosts are connected through an InfiniBand network, two with a >>> MT25204 HCA and two with a MT25208 HCA. All four hosts communicate >>> fine via IPoIB, and iperf shows the expected results. Until now I only >>> succeeded to run ib_rdma_bw between the MT25204 interfaces. When I run >>> ib_rdma_bw on one of the MT25208 interfaces a strange error message is >>> printed. Any idea how I can fix this ? >>> >>> MT25208 to MT25204: >>> root at 192.168.102.5:~# ib_rdma_bw >>> 6468: | port=18515 | ib_port=1 | size=65536 | tx_depth=100 | >>> iters=1000 | duplex=0 | cma=0 | >>> server read: Success >>> 6468:pp_server_exch_dest: 0/45 Couldn't read remote address >>> root at 192.168.102.100:~# ib_rdma_bw 192.168.102.5 >>> 6466: | port=18515 | ib_port=1 | size=65536 | tx_depth=100 | >>> iters=1000 | duplex=0 | cma=0 | >>> 6466:main: Local lid 0x0 detected. Is an SM running? >>> >>> >> What is the output of ibv_devinfo in both of those devices? >> >> (maybe you need to specify the IB port value or the device name?) >> > > Thanks, after specifying -i 2 as an argument to ib_rdma_bw the test > runs fine. Happy to hear this ... 
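The ibv_devinfo output above shows port 1 as PORT_DOWN with LID 0 and port 2 as PORT_ACTIVE with LID 18, which matches the "Local lid 0x0 detected" message seen with the default ib_port=1. A minimal sketch of a "first ACTIVE port" policy, as asked about in this thread, could look like the following. The port_info struct and first_active_port helper are hypothetical; a real tool would presumably fill this data from libibverbs (for example via ibv_query_port()), and the state values 1 and 4 are taken from the output above.

```c
#include <stddef.h>

/* State codes as printed by ibv_devinfo above: PORT_DOWN (1), PORT_ACTIVE (4). */
enum port_state { PORT_DOWN = 1, PORT_ACTIVE = 4 };

/* Hypothetical, simplified view of one HCA port. */
struct port_info {
    int port_num;            /* 1-based port number */
    enum port_state state;
    unsigned int lid;        /* 0 means no LID has been assigned */
};

/* Return the number of the first ACTIVE port with a non-zero LID, or 0 if none. */
int first_active_port(const struct port_info *ports, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        if (ports[i].state == PORT_ACTIVE && ports[i].lid != 0)
            return ports[i].port_num;
    }
    return 0;
}
```

Fed the two ports from the output above, the helper would pick port 2. Such a default remains ambiguous when more than one port or more than one HCA is active, so passing -i explicitly is still the safer choice.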
> Would it be possible to let the ib_rdma_bw program > determine this parameter automatically ? In my setup the InfiniBand > cable is connected to the second port of the MT25208 interfaces. > I think that for every IB application/test one should specify the exact device name + IB port. (this is what I'm doing in our nightly regression) I'm not the maintainer of the test, but do you suggest that this benchmark will work on the first IB port which is in ACTIVE state? thanks Dotan From eli at mellanox.co.il Thu Jan 17 08:00:53 2008 From: eli at mellanox.co.il (Eli Cohen) Date: Thu, 17 Jan 2008 18:00:53 +0200 Subject: [ofa-general] [PATCH 14/16] ib/ipoib: Support modifying IPOIB CQ moderation params In-Reply-To: <478F56AD.1030006@voltaire.com> References: <1200501508.13546.83.camel@mtls03> <478F56AD.1030006@voltaire.com> Message-ID: <1200585653.6925.57.camel@mtls03> On Thu, 2008-01-17 at 15:22 +0200, Or Gerlitz wrote: > Eli Cohen wrote: > > Support modifying IPOIB CQ moderation params > > > > This can be used to tune at run time the paramters controlling > > the event (interrupt) generation rate and thus reduce the overhead > > incurred by handling interrupts resulting in better throughput. > > IPoIB has one CQ. As I see it, this means that you should either let the > user specify only one of rx or tx coalescing params, or make sure that > the user did not provide something that the driver can not deploy, eg > rx_usecs 12 tx_usecs 1 I think making sure you can't provide two different values for rx and tx will do. > > > +static int ipoib_set_coalesce(struct net_device *dev, > > + struct ethtool_coalesce *coal) > > + coal->tx_coalesce_usecs = coal->rx_coalesce_usecs; > > + priv->etool.coalesce_usecs = coal->rx_coalesce_usecs; > > + coal->rx_max_coalesced_frames = coal->rx_max_coalesced_frames; > > I guess you wanted to say here coal->tx_max_coalesced_frames = > > Or > Yes, thanks. 
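The agreement reached above (IPoIB has a single CQ, so rx and tx coalescing values cannot differ) can be sketched as a small validation step. This is only an illustration under stated assumptions: struct coalesce_req is a hypothetical stand-in for the ethtool coalescing fields named in the patch, and validate_shared_cq_coalesce is not a function from the posted series.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical stand-in for the rx/tx coalescing fields discussed above. */
struct coalesce_req {
    uint32_t rx_coalesce_usecs;
    uint32_t tx_coalesce_usecs;
    uint32_t rx_max_coalesced_frames;
    uint32_t tx_max_coalesced_frames;
};

/*
 * With one CQ serving both directions, a request is only deployable if the
 * rx and tx values agree. Returns 0 when they do, -EINVAL otherwise.
 */
int validate_shared_cq_coalesce(const struct coalesce_req *req)
{
    if (req->rx_coalesce_usecs != req->tx_coalesce_usecs)
        return -EINVAL;
    if (req->rx_max_coalesced_frames != req->tx_max_coalesced_frames)
        return -EINVAL;
    return 0;
}
```

A request such as rx_usecs 12 with tx_usecs 1, the example given in the review, would be rejected by this check instead of being silently overwritten.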
From bart.vanassche at gmail.com Thu Jan 17 08:05:56 2008 From: bart.vanassche at gmail.com (Bart Van Assche) Date: Thu, 17 Jan 2008 17:05:56 +0100 Subject: [ofa-general] Performance of MT25204 versus MT25208 Message-ID: Hello, Via ib_rdma_bw and ib_rdma_lat I obtained the following performance data: * peak bandwith 674 MB/s / latency 3.07 microseconds between two uniprocessor systems with MT25204 HCA's. * peak bandwith 935 MB/s / latency 3.36 microseconds between two dual processor systems with MT25208 HCA's. I'm wondering now whether the performance differences are due to the HCA's or due to the motherboard ? Software: 2.6.22.9 Linux kernel (64-bit x86_64) + OFED 1.2.5.4 userspace components. Bart. From Sagir at mellanox.co.il Thu Jan 17 08:11:45 2008 From: Sagir at mellanox.co.il (Sagi Rotem) Date: Thu, 17 Jan 2008 18:11:45 +0200 Subject: [ofa-general] MT25204 versus MT25208 In-Reply-To: <478F7A84.3070401@dev.mellanox.co.il> References: <478F7667.3090602@dev.mellanox.co.il> <478F7A84.3070401@dev.mellanox.co.il> Message-ID: <6C2C79E72C305246B504CBA17B5500C90323048A@mtlexch01.mtl.com> You can use the ib_write_bw instead ,this newer test has more options. Test generally cant "guess" automatically the right port, what would you like to use as defaults if both ports are up ? If you have more than 1 card ? Etc ... So for the single case of cable connected only to port 2 u can use the flag Dotan suggested. -----Original Message----- From: general-bounces at lists.openfabrics.org [mailto:general-bounces at lists.openfabrics.org] On Behalf Of Dotan Barak Sent: Thursday, January 17, 2008 5:56 PM To: Bart Van Assche Cc: Openib-General Subject: Re: [ofa-general] MT25204 versus MT25208 Bart Van Assche wrote: > On Jan 17, 2008 4:38 PM, Dotan Barak wrote: > >> Bart Van Assche wrote: >> >>> Four hosts are connected through an InfiniBand network, two with a >>> MT25204 HCA and two with a MT25208 HCA. 
All four hosts communicate >>> fine via IPoIB, and iperf shows the expected results. Until now I >>> only succeeded to run ib_rdma_bw between the MT25204 interfaces. >>> When I run ib_rdma_bw on one of the MT25208 interfaces a strange >>> error message is printed. Any idea how I can fix this ? >>> >>> MT25208 to MT25204: >>> root at 192.168.102.5:~# ib_rdma_bw >>> 6468: | port=18515 | ib_port=1 | size=65536 | tx_depth=100 | >>> iters=1000 | duplex=0 | cma=0 | server read: Success >>> 6468:pp_server_exch_dest: 0/45 Couldn't read remote address >>> root at 192.168.102.100:~# ib_rdma_bw 192.168.102.5 >>> 6466: | port=18515 | ib_port=1 | size=65536 | tx_depth=100 | >>> iters=1000 | duplex=0 | cma=0 | >>> 6466:main: Local lid 0x0 detected. Is an SM running? >>> >>> >> What is the output of ibv_devinfo in both of those devices? >> >> (maybe you need to specify the IB port value or the device name?) >> > > Thanks, after specifying -i 2 as an argument to ib_rdma_bw the test > runs fine. Happy to hear this ... > Would it be possible to let the ib_rdma_bw program determine this > parameter automatically ? In my setup the InfiniBand cable is > connected to the second port of the MT25208 interfaces. > I think that for every IB application/test one should specify the exact device name + IB port. (this is what I'm doing in our nightly regression) I'm not the maintainer of the test, but do you suggest that this benchmark will work on the first IB port which is in ACTIVE state? 
thanks Dotan _______________________________________________ general mailing list general at lists.openfabrics.org http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general From Sagir at mellanox.co.il Thu Jan 17 08:13:35 2008 From: Sagir at mellanox.co.il (Sagi Rotem) Date: Thu, 17 Jan 2008 18:13:35 +0200 Subject: [ofa-general] Performance of MT25204 versus MT25208 In-Reply-To: References: Message-ID: <6C2C79E72C305246B504CBA17B5500C90323048D@mtlexch01.mtl.com> Did you check that the PCIe slot you are using is x8 -----Original Message----- From: general-bounces at lists.openfabrics.org [mailto:general-bounces at lists.openfabrics.org] On Behalf Of Bart Van Assche Sent: Thursday, January 17, 2008 6:06 PM To: Openib-General Subject: [ofa-general] Performance of MT25204 versus MT25208 Hello, Via ib_rdma_bw and ib_rdma_lat I obtained the following performance data: * peak bandwith 674 MB/s / latency 3.07 microseconds between two uniprocessor systems with MT25204 HCA's. * peak bandwith 935 MB/s / latency 3.36 microseconds between two dual processor systems with MT25208 HCA's. I'm wondering now whether the performance differences are due to the HCA's or due to the motherboard ? Software: 2.6.22.9 Linux kernel (64-bit x86_64) + OFED 1.2.5.4 userspace components. Bart. 
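For context on the x8 question: assuming the slots are first-generation PCIe (2.5 GT/s per lane with 8b/10b encoding, an assumption that is not stated in the thread), the raw bandwidth per direction works out to 250 MB/s per lane. The helper name below is hypothetical and the figure ignores packet and protocol overhead.

```c
/*
 * Raw per-direction bandwidth of a PCIe 1.x link in MB/s.
 * 2500 Mbit/s line rate per lane, 8 data bits per 10 line bits,
 * 8 bits per byte: 2500 * 8 / 10 / 8 = 250 MB/s per lane.
 */
unsigned int pcie_gen1_raw_mb_per_s(unsigned int lanes)
{
    const unsigned int line_rate_mbit = 2500;
    const unsigned int per_lane_mb = line_rate_mbit * 8 / 10 / 8;

    return lanes * per_lane_mb;
}
```

Under that assumption an x4 link tops out at 1000 MB/s and an x8 link at 2000 MB/s before overhead. Both measured figures (674 MB/s and 935 MB/s) fit under either limit, so link width alone does not settle the question.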
From jon at opengridcomputing.com Thu Jan 17 09:44:39 2008 From: jon at opengridcomputing.com (Jon Mason) Date: Thu, 17 Jan 2008 11:44:39 -0600 Subject: [ofa-general] [PATCH] Non-supported functions should return NULL when returning pointers In-Reply-To: <200801160900.17270.jackm@dev.mellanox.co.il> References: <20080115235027.GB31543@opengridcomputing.com> <200801160900.17270.jackm@dev.mellanox.co.il> Message-ID: <20080117174439.GA26449@opengridcomputing.com> On Wed, Jan 16, 2008 at 09:00:17AM +0200, Jack Morgenstein wrote: > On Wednesday 16 January 2008 01:50, Jon Mason wrote: > > Non-supported functions should return NULL when returning pointers. > > Some/Most user space programs will not check for a (void *) to -ENOSYS, > > which can look like a real address until referenced. > > > This patch breaks consistency with all the other drivers, and with libibverbs. > During the debugging cycle, user programs will get segmentation faults, and fix > their code. I didn't see any other user space libs that return a (void *) -ENOSYS. Since -ENOSYS could be a valid memory location, this is not the proper thing to do. Returning NULL will push the user space apps down the correct error path, and allow them to fail gracefully. Getting a seg fault and crashing the app is not a good way to handle this, as this can lead to many problems after the product has shipped. If these errors are handled correctly, then this greatly reduces this.
Thanks, Jon > > - Jack
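The concern in Jon Mason's message can be illustrated with a small sketch: a caller that only checks for NULL treats an error value encoded in the pointer as a valid object. All function names below are hypothetical stubs, not libibverbs or provider entry points, and setting errno in the NULL variant is an assumption about how the reason would be reported.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stub: encodes the error in the returned pointer value. */
void *create_unsupported_errptr(void)
{
    return (void *)(intptr_t)-ENOSYS;
}

/* Hypothetical stub: returns NULL and reports the reason through errno. */
void *create_unsupported_null(void)
{
    errno = ENOSYS;
    return NULL;
}

/* Typical user-space check: only a NULL result is treated as failure. */
int caller_sees_failure(const void *obj)
{
    return obj == NULL;
}
```

With the first stub the check does not fire, so the caller would go on to dereference a bogus address; with the second stub the caller takes its normal error path.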
From rdreier at cisco.com Thu Jan 17 16:11:11 2008 From: rdreier at cisco.com (Roland Dreier) Date: Thu, 17 Jan 2008 16:11:11 -0800 Subject: [ofa-general] InfiniBand/RDMA merge plans for 2.6.25 Message-ID: The 2.6.25 will open soon, so it's time to review what my plans are for the merge window opens. A note on reviewing: the upstream push for "Reviewed-by:" tags seems to have lost some momentum, but it is still the case that if you can get someone other than me to review your work, then the chances of it being merged increase dramatically.
I'm talking about a real review-- ideally, someone independent (a different domain after the '@' would be good ;) who is willing to provide a "Reviewed-by:" line that means the reviewer has really looked at and thought about the patch. There should be a mailing list thread you can point me at where the reviewer comments on the patch and a new version of that patch addressing all comments is posted (or in exceptional cases, where the patch is perfect to start with, where the reviewer says the patch is great). Anyway, here are all the pending things that I'm aware of. As usual, if something isn't already in my tree and isn't listed below, I probably missed it or dropped it by mistake. Please remind me again in that case. Core: - Sean's inter-subnet CM changes. My first reaction is that I'll probably merge it, but I need to find the time to really read it over first. ULPs: - Rolf's IPoIB MGID scope changes. I have the core changes queued up, so at least recompiling the whole kernel should no longer be necessary. I need to look over the rest of the patches more carefully and make sure we have the correct userspace interface at least. - Eli's IPoIB stateless offload (checksum offload, LSO, interrupt moderation, etc). It's a big series that makes quite a few core changes. I hope to merge at least part of this queue soon. Outside opinions are welcome though. - Remove LLTX from IPoIB. I haven't posted a draft of this patch yet, so I guess it will probably wait for 2.6.26... - I still have a few SRP changes from David Dillow to review and apply. Nothing looked to radical so they should be good to go. HW specific: - Neteffect "nes" driver. It's not terribly clean code but since it's a new driver that is completely self-contained, I plan on merging it and letting cleanups happen upstream. - mlx4 WQE shrinking patch. I would like to see this as two patches: one that vmap()s work queues, and another that does the tricky part of using smaller WQEs. 
If I get a chance I'll try to do the split up myself. - ipath pending patches. Nothing looks like a problem there; it's just a matter of actually pulling the patches into my tree. Here are a few topics that I believe will not be ready in time for the 2.6.25 window and will need to wait for 2.6.26 at least: - Multiple CQ event vector support. I still haven't seen any discussions about how ULPs or userspace apps should decide which vector to use, and hence no progress has been made since we deferred this during the 2.6.23 merge window. - XRC. I don't get the feeling that even the interface has converged sufficiently to merge this. Here all the patches I already have in my for-2.6.25 branch: Adrian Bunk (1): IB/mthca: Remove MSI support as scheduled Anton Blanchard (1): IB/ehca: Use round_jiffies() for EQ polling timer Arthur Jones (2): IB/ipath: Better comment for rmb() in ipath_intr() IB/ipath: Add ipath_read_ireg() abstraction Dave Olson (5): IB/ipath: Improve interrupt handler cache footprint IB/ipath: Generalize some xxx_SHIFT macros IB/ipath: Changes for fields moving from devdata to portdata IB/ipath: Clean up some comments IB/ipath: Drop support for the original QHT7040 board David Dillow (3): IB/srp: Respect target credit limit IB/srp: Enable SG list chaining IB/srp: Add identifying information to log messages Erez Zilber (1): IB/iser: update URLs of iSER docs Hoang-Nam Nguyen (1): IB/ehca: Forward event client-reregister-required to registered clients Jack Morgenstein (1): mlx4_core: Fix max_eqs masking in QUERY_DEV_CAP Joe Perches (2): drivers/infiniband: Add missing "space" IB: Spelling fixes in comments John Gregor (1): IB/ipath: Fix sendctrl locking Matthias Kaehlcke (1): IB/ipath: Convert ipath_eep_sem semaphore to a mutex Nick Piggin (1): IB/ipath: Convert from .nopage to .fault Olaf Kirch (1): IB/fmr_pool: Flush serial numbers can get out of sync Oliver Pinter (1): IB/iser: Typo fix (s/destory/destroy/) Pradeep Satyanarayana (2): IPoIB/cm: Add 
connected mode support for devices without SRQs IPoIB/CM: Enable SRQ support on HCAs that support fewer than 16 SG entries Ralph Campbell (14): IB/mad: Remove redundant NULL pointer check in ib_mad_recv_done_handler() IB/ipath: Enable loopback of DR SMP responses from userspace IB/ipath: Remove dead code for user process waiting for send buffer IB/ipath: Fix error returned from ib_resize_cq if new size smaller than # entries IB/ipath: Fix comments for ipath_create_srq() IB/ipath: Add the work completion error code to the QP error debug output IB/ipath: Fix RNR NAK handling IB/ipath: Cleanup ipath_get_egrbuf() IB/ipath: kreceive uses portdata rather than devdata IB/ipath: MAD performance sampling registers support IB/ipath: Export hardware counters more consistently IB/ipath: Allow more flexible user register alignments IB/ipath: Port config has on-chip effects for 7220 IB/ipath: Add flag and handling for chips with swapped register bug Roland Dreier (8): IPoIB: Trivial formatting cleanups IPoIB/cm: Factor out ipoib_cm_free_rx_ring() IPoIB/cm: Factor out ipoib_cm_create_srq() IPoIB/cm: Factor out ipoib_cm_free_rx_reap_list() IB/mlx4: Micro-optimize mlx4_ib_poll_one() RDMA/cxgb3: Endianness annotation for irs field IB/ipath: Fix some sparse warnings about shadowed symbols IB/umad: Simplify and fix locking Rolf Manderscheid (1): IPoIB: improve IPv4/IPv6 to IB mcast mapping functions Sean Hefty (6): IB/multicast: Report errors on multicast groups if P_key changes IB/mad: Report number of times a mad was retried IB/cm: Add basic performance counters IB/mad: Fix incorrect access to items on local_list RDMA/cma: add support for rdma_migrate_id() RDMA/cma: Override default responder_resources with user value Steve Welch (1): IB/mad: Enable loopback of DR SMP responses from userspace Steve Wise (3): RDMA/iwcm: Set initiator depth and responder resources to device max values RDMA/cxgb3: Hold rtnl_lock() around ethtool get_drvinfo call RDMA/cxgb3: Support version 5.0 firmware 
Vladimir Sokolovsky (1): RDMA/cma: Reenable device removal on passive side From kliteyn at mellanox.co.il Thu Jan 17 17:11:10 2008 From: kliteyn at mellanox.co.il (kliteyn at mellanox.co.il) Date: Thu, 17 Jan 2008 17:11:10 -0800 (PST) Subject: [ofa-general] ***SPAM*** nightly osm_sim report 2008-01-18:normal completion Message-ID: <20080118011110.43D92E60224@openfabrics.org> From: kliteyn at mellanox.co.il Return-Path: kliteyn at mellanox.co.il Message-ID: X-OriginalArrivalTime: 18 Jan 2008 01:11:08.0032 (UTC) FILETIME=[016E8000:01C8596F] Date: 18 Jan 2008 03:11:08 +0200 OSM Simulation Regression Summary [Generated mail - please do NOT reply] OpenSM binary date = 2008-01-17 OpenSM git rev = Tue_Jan_15_15:34:46_2008 [82b13a3b06289e434ce35534cf74f15211b3e4d4] ibutils git rev = Mon_Dec_24_10:42:01_2007 [675bec82306d6920555dd0b5e2f664983e27e60f] Total=400 Pass=400 Fail=0 Pass: 30 Stability IS1-16.topo 30 Pkey IS1-16.topo 30 OsmTest IS1-16.topo 30 OsmStress IS1-16.topo 30 Multicast IS1-16.topo 30 LidMgr IS1-16.topo 10 Stability IS3-loop.topo 10 Stability IS3-128.topo 10 Pkey IS3-128.topo 10 OsmTest IS3-loop.topo 10 OsmTest IS3-128.topo 10 OsmStress IS3-128.topo 10 Multicast IS3-loop.topo 10 Multicast IS3-128.topo 10 LidMgr IS3-128.topo 10 FatTree merge-roots-4-ary-2-tree.topo 10 FatTree merge-root-4-ary-3-tree.topo 10 FatTree gnu-stallion-64.topo 10 FatTree blend-4-ary-2-tree.topo 10 FatTree RhinoDDR.topo 10 FatTree FullGnu.topo 10 FatTree 4-ary-2-tree.topo 10 FatTree 2-ary-4-tree.topo 10 FatTree 12-node-spaced.topo 10 FTreeFail 4-ary-2-tree-missing-sw-link.topo 10 FTreeFail 4-ary-2-tree-links-at-same-rank-2.topo 10 FTreeFail 4-ary-2-tree-links-at-same-rank-1.topo 10 FTreeFail 4-ary-2-tree-diff-num-pgroups.topo Failures: From rdreier at cisco.com Thu Jan 17 17:25:14 2008 From: rdreier at cisco.com (Roland Dreier) Date: Thu, 17 Jan 2008 17:25:14 -0800 Subject: [ewg] RE: [ofa-general] OFED Jan 14 meeting summary on RC2readiness In-Reply-To: 
<20080117153043.GA10065@minantech.com> (Gleb Natapov's message of "Thu, 17 Jan 2008 17:30:44 +0200") References: <478D1A49.1080807@mellanox.co.il> <000201c857b9$e67ee020$a937170a@amr.corp.intel.com> <20080116073459.GA20554@minantech.com> <20080117153043.GA10065@minantech.com> Message-ID: > Well, I can't speak for everyone, but in my opinion if someone wants to run > MPI job so huge that XRC absolutely has to be used to be able to actually > finish it then he should seriously rethink his application design. But where do you think the crossover is where XRC starts to help MPI? In other words do I need a 10000 process job on 32-core systems for it to matter, or is there a significant advantage for running a 2048 process job on 256 8-core systems? > XRC can be used not just for scalability BTW. It can be used as a > way to post differently sized buffers to the same QP and this is > very useful, but for this kind of usage the most ugly parts of the > API are not needed. I will be glad to hear other people's opinions > too (I know Mellanox one). I guess you mean just implement XRC without allowing multiple processes to share an XRC domain? That actually seems like a sensible thing to implement as well... - R. From halmeida at cimcorp.com.br Thu Jan 17 20:57:57 2008 From: halmeida at cimcorp.com.br (Henrique Leandro Ferreira de Almeida) Date: Fri, 18 Jan 2008 01:57:57 -0300 Subject: [ofa-general] OFED 1.2 + Lustre 1.6.4.1 and SLES 10 SP1 Message-ID: <2866E324F324C34293ADBD2753AB0EE501245EFA@SRVIMPMAIL.cimcorp.com.br> Hello, I´m very glad to have so many good people to help us in this work. Our problem is pick up our night sleep but maybe is too simple for many of yours We follow the steps: 1 - Install SLES 10 SP1 Operating System in a x86_64 Machine....works fine 2 - Install Kernel Lustre 1.6.4.1 with old ".config" changing do monolitic the Infiniband modules and support (the basis: make, make modules, make modules_install, make install and reboot to new kernel) .... 
works fine too 3 - uncompress and build OFED-1.2.tgz........build fine, but many problems on load ib_ipoib 4 - Manual compile from ofa_kernel source... modules work fine 5 - uncompress and install the Lustre 1.6.4.1 from rpm source... after build the RPMS... when we load the lustre modules appears many " lustre: unkown symbols". 6 - i´ve tried everything the "google" told us... nothing happens If anybody can help us with a step-by-step, since now we can very thankfull. best regards to all Henrique Almeida CIMCORP Brazil - SP From bart.vanassche at gmail.com Fri Jan 18 00:55:35 2008 From: bart.vanassche at gmail.com (Bart Van Assche) Date: Fri, 18 Jan 2008 09:55:35 +0100 Subject: [ofa-general] Performance of MT25204 versus MT25208 In-Reply-To: <6C2C79E72C305246B504CBA17B5500C90323048D@mtlexch01.mtl.com> References: <6C2C79E72C305246B504CBA17B5500C90323048D@mtlexch01.mtl.com> Message-ID: > On January 17, 2008 Bart Van Assche wrote: > > > > Via ib_rdma_bw and ib_rdma_lat I obtained the following performance > > data: > > * peak bandwith 674 MB/s / latency 3.07 microseconds between two > > uniprocessor systems with MT25204 HCA's. > > * peak bandwith 935 MB/s / latency 3.36 microseconds between two dual > > processor systems with MT25208 HCA's. > > > > I'm wondering now whether the performance differences are due to the > > HCA's or due to the motherboard ? > On Jan 17, 2008 5:13 PM, Sagi Rotem wrote: > Did you check that the PCIe slot you are using is x8 The motherboard is an Intel(R) Server Board S5000PAL (http://www.intel.com/design/servers/boards/s5000pal/), which has both x4 and x8 slots.
I'm not sure the information from lspci can be trusted, but lspci says it's a x8 slot: # lspci -vv ... 08:00.0 InfiniBand: Mellanox Technologies MT25204 [InfiniHost III Lx HCA] (rev a0) Subsystem: Mellanox Technologies MT25204 [InfiniHost III Lx HCA] Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- SERR-
URL: From jphlma at lifesciences.com Fri Jan 18 02:28:49 2008 From: jphlma at lifesciences.com (jphlma at lifesciences.com) Date: Fri, 18 Jan 2008 15:58:49 +0530 Subject: [ofa-general] Kisses Through E-mail Message-ID: <002101c859bc$e9da9840$ed1d322f@akq> I Love You Because http://59.116.128.193/ From vlad at lists.openfabrics.org Fri Jan 18 03:12:35 2008 From: vlad at lists.openfabrics.org (Vladimir Sokolovsky Mellanox) Date: Fri, 18 Jan 2008 03:12:35 -0800 (PST) Subject: [ofa-general] ofa_1_3_kernel 20080118-0200 daily build status Message-ID: <20080118111235.E61DBE60233@openfabrics.org> This email was generated automatically, please do not reply git_url: git://git.openfabrics.org/ofed_1_3/linux-2.6.git git_branch: ofed_kernel Common build parameters: --with-ipoib-mod --with-sdp-mod --with-srp-mod --with-user_mad-mod --with-user_access-mod --with-mthca-mod --with-mlx4-mod --with-core-mod --with-addr_trans-mod --with-rds-mod --with-cxgb3-mod --with-nes-mod Passed: Passed on i686 with 2.6.15-23-server Passed on i686 with linux-2.6.22 Passed on i686 with linux-2.6.18 Passed on i686 with linux-2.6.17 Passed on i686 with linux-2.6.21.1 Passed on i686 with linux-2.6.13 Passed on i686 with linux-2.6.16 Passed on i686 with linux-2.6.14 Passed on i686 with linux-2.6.15 Passed on i686 with linux-2.6.12 Passed on i686 with linux-2.6.19 Passed on ia64 with linux-2.6.23 Passed on ppc64 with linux-2.6.14 Passed on ia64 with linux-2.6.18 Passed on x86_64 with linux-2.6.16 Passed on powerpc with linux-2.6.12 Passed on ia64 with linux-2.6.19 Passed on ppc64 with linux-2.6.13 Passed on ia64 with linux-2.6.12 Passed on ia64 with linux-2.6.14 Passed on ppc64 with linux-2.6.12 Passed on ppc64 with linux-2.6.15 Passed on x86_64 with linux-2.6.16.21-0.8-smp Passed on ppc64 with linux-2.6.18 Passed on ppc64 with linux-2.6.16 Passed on x86_64 with linux-2.6.19 Passed on powerpc with linux-2.6.14 Passed on ia64 with linux-2.6.13 Passed on ia64 with linux-2.6.15 Passed on ia64 with 
linux-2.6.17 Passed on ppc64 with linux-2.6.17 Passed on x86_64 with linux-2.6.9-42.ELsmp Passed on x86_64 with linux-2.6.12 Passed on powerpc with linux-2.6.13 Passed on powerpc with linux-2.6.15 Passed on x86_64 with linux-2.6.15 Passed on ia64 with linux-2.6.16 Passed on x86_64 with linux-2.6.13 Passed on x86_64 with linux-2.6.17 Passed on x86_64 with linux-2.6.20 Passed on x86_64 with linux-2.6.18 Passed on ia64 with linux-2.6.16.21-0.8-default Passed on x86_64 with linux-2.6.18-1.2798.fc6 Passed on ppc64 with linux-2.6.19 Passed on x86_64 with linux-2.6.14 Passed on x86_64 with linux-2.6.9-55.ELsmp Passed on x86_64 with linux-2.6.21.1 Passed on x86_64 with linux-2.6.16.43-0.3-smp Passed on ppc64 with linux-2.6.18-8.el5 Passed on ia64 with linux-2.6.21.1 Passed on ia64 with linux-2.6.22 Passed on x86_64 with linux-2.6.22.5-31-default Passed on x86_64 with linux-2.6.18-53.el5 Passed on x86_64 with linux-2.6.18-8.el5 Passed on x86_64 with linux-2.6.22 Failed: From tempearts-request at listserv.tempe.gov Thu Jan 17 05:22:10 2008 From: tempearts-request at listserv.tempe.gov (Alana Dale) Date: Fri, 17 Jan 2008 21:22:10 +0800 Subject: [ofa-general] Live longer life without cigarettes Message-ID: <399718272.83636774467709@listserv.tempe.gov> An HTML attachment was scrubbed... URL: From Sagir at mellanox.co.il Fri Jan 18 08:09:41 2008 From: Sagir at mellanox.co.il (Sagi Rotem) Date: Fri, 18 Jan 2008 18:09:41 +0200 Subject: [ofa-general] Performance of MT25204 versus MT25208 In-Reply-To: References: <6C2C79E72C305246B504CBA17B5500C90323048D@mtlexch01.mtl.com> Message-ID: <6C2C79E72C305246B504CBA17B5500C903230685@mtlexch01.mtl.com> This may be your problem: MaxReadReq 128 bytes U need a BIOS update , common value with good performance is 512. Alternatively u can force it using setpci but than system may be unstable. 
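
For readers wondering what the MaxReadReq value mentioned above corresponds to at the register level, here is a minimal sketch in Python. It assumes the standard PCI Express layout, in which Max_Read_Request_Size is a 3-bit field in bits 14:12 of the 16-bit Device Control register and a field value n means 128 << n bytes. The function names and the register values used below are made up for illustration; nothing here was read from the hardware discussed in this thread, and the sketch does not touch any device.

```python
# Sketch only: decode/encode the Max_Read_Request_Size field of a PCIe
# Device Control register value. Register values are supplied by the caller.

MRRS_SHIFT = 12                 # field occupies bits 14:12
MRRS_MASK = 0x7 << MRRS_SHIFT   # 3-bit field


def decode_max_read_request(devctl: int) -> int:
    """Return the maximum read request size in bytes for a Device Control value."""
    code = (devctl & MRRS_MASK) >> MRRS_SHIFT
    if code > 5:
        # Encodings 6 and 7 are reserved; 5 (4096 bytes) is the largest defined.
        raise ValueError(f"reserved Max_Read_Request_Size encoding: {code}")
    return 128 << code


def with_max_read_request(devctl: int, size: int) -> int:
    """Return devctl with only the Max_Read_Request_Size field changed."""
    codes = {128 << c: c for c in range(6)}  # 128, 256, ..., 4096 bytes
    if size not in codes:
        raise ValueError(f"unsupported read request size: {size}")
    return (devctl & ~MRRS_MASK) | (codes[size] << MRRS_SHIFT)


if __name__ == "__main__":
    example = 0x000F  # hypothetical register value with the field set to 0
    print(decode_max_read_request(example))
    print(hex(with_max_read_request(example, 512)))
```

Under these assumptions a field value of 0 corresponds to the 128 bytes reported above, and 512 bytes corresponds to a field value of 2. Whether a given BIOS, chipset, or HCA tolerates a forced change is exactly the stability caveat raised above.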
-----Original Message-----
From: Bart Van Assche [mailto:bart.vanassche at gmail.com]
Sent: Friday, January 18, 2008 10:56 AM
To: Sagi Rotem
Cc: Openib-General
Subject: Re: [ofa-general] Performance of MT25204 versus MT25208

> On January 17, 2008 Bart Van Assche wrote:
> >
> > Via ib_rdma_bw and ib_rdma_lat I obtained the following performance
> > data:
> > * peak bandwidth 674 MB/s / latency 3.07 microseconds between two
> > uniprocessor systems with MT25204 HCA's.
> > * peak bandwidth 935 MB/s / latency 3.36 microseconds between two
> > dual processor systems with MT25208 HCA's.
> >
> > I'm wondering now whether the performance differences are due to the
> > HCA's or due to the motherboard ?
> On Jan 17, 2008 5:13 PM, Sagi Rotem wrote:
> Did you check that the PCIe slot you are using is x8

The motherboard is an Intel(R) Server Board S5000PAL
(http://www.intel.com/design/servers/boards/s5000pal/), which has both
x4 and x8 slots. I'm not sure the information from lspci can be trusted,
but lspci says it's an x8 slot:

# lspci -vv
...
08:00.0 InfiniBand: Mellanox Technologies MT25204 [InfiniHost III Lx HCA] (rev a0)
        Subsystem: Mellanox Technologies MT25204 [InfiniHost III Lx HCA]
        Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- SERR-

Hello,

I have a minor problem with the ib_mthca driver in Linux with Xen in DomU.

If I keep the ib_mthca driver in the kernel while shutting down the DomU, the next
start of the DomU resets the machine.

A trivial fix is possible: either rmmod ib_mthca before shutting down the
DomU, or set the .shutdown section to the same value as the .remove section in
the pci_driver structure.

Are you willing to apply a patch that sets .shutdown in the mainline IB driver
in Linux? Or is it something that should be fixed by the Xen guys?
-- Lukáš Hejtmánek

From hrosenstock at xsigo.com Fri Jan 18 09:10:32 2008
From: hrosenstock at xsigo.com (Hal Rosenstock)
Date: Fri, 18 Jan 2008 09:10:32 -0800
Subject: [ofa-general] InfiniBand/RDMA merge plans for 2.6.25
In-Reply-To:
References:
Message-ID: <1200676232.1486.88.camel@hrosenstock-ws.xsigo.com>

On Thu, 2008-01-17 at 16:11 -0800, Roland Dreier wrote:
> Here all the patches I already have in my for-2.6.25 branch:

> Sean Hefty (6):
>       IB/mad: Fix incorrect access to items on local_list

It wasn't clear to me that this issue was ever really nailed. Was the
loop on this closed ?

-- Hal

From sean.hefty at intel.com Fri Jan 18 09:42:23 2008
From: sean.hefty at intel.com (Sean Hefty)
Date: Fri, 18 Jan 2008 09:42:23 -0800
Subject: [ofa-general] InfiniBand/RDMA merge plans for 2.6.25
In-Reply-To: <1200676232.1486.88.camel@hrosenstock-ws.xsigo.com>
References: <1200676232.1486.88.camel@hrosenstock-ws.xsigo.com>
Message-ID: <000001c859f9$7bc7f590$ff0da8c0@amr.corp.intel.com>

>> Sean Hefty (6):
>>       IB/mad: Fix incorrect access to items on local_list
>
>It wasn't clear to me that this issue was ever really nailed. Was the
>loop on this closed ?

The error that this patch addresses is fairly obvious if you inspect the code.
There's a strong chance that the patch fixes the bug that was reported, but the
last I remember, they had trouble reproducing the crash to see if the patch
would indeed make it go away.
- Sean

From hrosenstock at xsigo.com Fri Jan 18 10:25:11 2008
From: hrosenstock at xsigo.com (Hal Rosenstock)
Date: Fri, 18 Jan 2008 10:25:11 -0800
Subject: [ofa-general] InfiniBand/RDMA merge plans for 2.6.25
In-Reply-To: <000001c859f9$7bc7f590$ff0da8c0@amr.corp.intel.com>
References: <1200676232.1486.88.camel@hrosenstock-ws.xsigo.com> <000001c859f9$7bc7f590$ff0da8c0@amr.corp.intel.com>
Message-ID: <1200680711.1486.104.camel@hrosenstock-ws.xsigo.com>

On Fri, 2008-01-18 at 09:42 -0800, Sean Hefty wrote:
> >> Sean Hefty (6):
> >>       IB/mad: Fix incorrect access to items on local_list
> >
> >It wasn't clear to me that this issue was ever really nailed. Was the
> >loop on this closed ?
>
> The error that this patch addresses is fairly obvious if you inspect the code.

The bug seems obvious but I'm not sure about the fix. Just my $0.02
worth.

-- Hal

> There's a strong chance that the patch fixes the bug that was reported, but the
> last I remember, they had trouble reproducing the crash to see if the patch
> would indeed make it go away.
>
> - Sean

From rdreier at cisco.com Fri Jan 18 12:36:00 2008
From: rdreier at cisco.com (Roland Dreier)
Date: Fri, 18 Jan 2008 12:36:00 -0800
Subject: [ofa-general] MTHCA driver for Linux
In-Reply-To: <20080118170117.GH4136@ics.muni.cz> (Lukas Hejtmanek's message of "Fri, 18 Jan 2008 18:01:17 +0100")
References: <20080118170117.GH4136@ics.muni.cz>
Message-ID:

 > If I keep ib_mthca driver in kernel while shutting down the DomU, the next
 > start of the DomU resets the machine.
 >
 > Trivial fix is possible: either to rmmod ib_mthca before shutting down the
 > DomU or set .shutdown section to the same value as the .remove section in
 > pci_driver structure.
 >
 > Are you willing apply a patch that sets .shutdown in the mainline of IB driver
 > in Linux? Or is it something that should be fixed by Xen guys?

I would like to understand the underlying problem before blindly
setting the .shutdown method of the ib_mthca PCI driver section. The
mthca driver should be able to handle the hardware being in an
arbitrary state when it is reloaded -- that is why it resets the
adapter very early during initialization. Do you have any idea what
is going wrong in the case where the machine resets?

Very few other PCI drivers have a .shutdown method, and I don't know
of any that just duplicate the .remove method. So rather than just
having a bandaid for mthca that probably leaves the same problem for
every other driver, I would prefer to understand the problem first,
and if it is indeed something specific to mthca, then fix the
underlying issue in mthca with a simpler shutdown method.

I guess one way to debug this would be to delete operations from
mthca_remove_one() one by one (starting from the end of the function),
and each time try restarting your domU after doing rmmod ib_mthca.
When you reach the really necessary thing, then you'll see the reset.

 - R.

From rdreier at cisco.com Fri Jan 18 13:38:47 2008
From: rdreier at cisco.com (Roland Dreier)
Date: Fri, 18 Jan 2008 13:38:47 -0800
Subject: [ofa-general] MTHCA driver for Linux
In-Reply-To: <20080118204834.GI4136@ics.muni.cz> (Lukas Hejtmanek's message of "Fri, 18 Jan 2008 21:48:34 +0100")
References: <20080118170117.GH4136@ics.muni.cz> <20080118204834.GI4136@ics.muni.cz>
Message-ID:

 > The pcifront-end of xen is wrong. It touches somehow the device when the DomU
 > is starting. At that point, it resets the box hardly, if DomU has been started
 > already with IB driver since the box start up.

I'm not sure I'm understanding what you're saying. Do you mean that
you've found a bug in the Xen pci front-end, or do you still think we
should fix this by changing the mthca driver?

From hartlch14 at gmail.com Fri Jan 18 13:38:57 2008
From: hartlch14 at gmail.com (Chuck Hartley)
Date: Fri, 18 Jan 2008 16:38:57 -0500
Subject: [ofa-general] Performance of MT25204 versus MT25208
In-Reply-To:
References: <6C2C79E72C305246B504CBA17B5500C90323048D@mtlexch01.mtl.com>
Message-ID:

Bart -

I started a thread similar to this a while back about expected RDMA
performance after I measured low bandwidth similar to yours. In our
case, we are using DDR and you apparently are using SDR, but we are
getting bandwidth almost exactly twice what you are getting. That is:
your SDR BW = 674 MB/s and our DDR BW = 1336 MB/s.

Our motherboards are SuperMicro (X7DBU and X7DBT) using the same 5000P
chipset as your board. They are dual Xeon CPU boards. The HCA is the
MT25204 also. Here is our output from lspci for comparison:

0b:00.0 InfiniBand: Mellanox Technologies MT25204 [InfiniHost III Lx HCA] (rev 20)
        Subsystem: Mellanox Technologies MT25204 [InfiniHost III Lx HCA]
        Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr+ Stepping- SERR+ FastB2B-
        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- SERR-

From rdreier at cisco.com Fri Jan 18 14:22:18 2008
From: rdreier at cisco.com (Roland Dreier)
Date: Fri, 18 Jan 2008 14:22:18 -0800
Subject: [ofa-general] Re: [PATCH 2/2] IB/iser: lower queue depth
In-Reply-To: <478F258D.3080500@voltaire.com> (Erez Zilber's message of "Thu, 17 Jan 2008 11:53:17 +0200")
References: <478F247C.9010306@voltaire.com> <478F258D.3080500@voltaire.com>
Message-ID:

thanks, applied both patches

From rdreier at cisco.com Fri Jan 18 14:27:38 2008
From: rdreier at cisco.com (Roland Dreier)
Date: Fri, 18 Jan 2008 14:27:38 -0800
Subject: [ofa-general] Re: [PATCH 4/4] IB/ehca: Prevent RDMA-related connection failures
In-Reply-To: <200801171507.25145.fenkes@de.ibm.com> (Joachim Fenkes's message of "Thu, 17 Jan 2008 15:07:24 +0100")
References: <200801171502.34287.fenkes@de.ibm.com> <200801171507.25145.fenkes@de.ibm.com>
Message-ID:
thanks, applied 1-4. From weiny2 at llnl.gov Fri Jan 18 14:37:56 2008 From: weiny2 at llnl.gov (Ira Weiny) Date: Fri, 18 Jan 2008 14:37:56 -0800 Subject: [ofa-general] [PATCH] infiniband-diags/scripts/IBswcountlimits.pm: Fix comment Message-ID: <20080118143756.3fa2bf20.weiny2@llnl.gov> >From 96eb9de7b1918766020e7d9621d79d86949cfc39 Mon Sep 17 00:00:00 2001 From: Ira K. Weiny Date: Fri, 18 Jan 2008 11:39:01 -0800 Subject: [PATCH] infiniband-diags/scripts/IBswcountlimits.pm: Fix comment Signed-off-by: Ira K. Weiny --- infiniband-diags/scripts/IBswcountlimits.pm | 2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/infiniband-diags/scripts/IBswcountlimits.pm b/infiniband-diags/scripts/IBswcountlimits.pm index ece1284..aed10fc 100755 --- a/infiniband-diags/scripts/IBswcountlimits.pm +++ b/infiniband-diags/scripts/IBswcountlimits.pm @@ -215,7 +215,7 @@ sub ensure_cache_dir } # ========================================================================= -# get_link_ends(ca_name, ca_port) +# get_cache_file(ca_name, ca_port) # sub get_cache_file { -- 1.5.1 -------------- next part -------------- A non-text attachment was scrubbed... 
Name: 0001-infiniband-diags-scripts-IBswcountlimits.pm-Fix-com.patch Type: application/octet-stream Size: 842 bytes Desc: not available URL: From rdreier at cisco.com Fri Jan 18 14:38:09 2008 From: rdreier at cisco.com (Roland Dreier) Date: Fri, 18 Jan 2008 14:38:09 -0800 Subject: [ofa-general] [PATCH] IB/ipath - second prep series for iba7220 In-Reply-To: <20080116001808.12687.13839.stgit@eng-46.internal.keyresearch.com> (Arthur Jones's message of "Tue, 15 Jan 2008 16:18:08 -0800") References: <20080116001808.12687.13839.stgit@eng-46.internal.keyresearch.com> Message-ID: Thanks, applied all 6 From weiny2 at llnl.gov Fri Jan 18 14:38:12 2008 From: weiny2 at llnl.gov (Ira Weiny) Date: Fri, 18 Jan 2008 14:38:12 -0800 Subject: [ofa-general] [PATCH] infiniband-diags/scripts/IBswcountlimits.pm: clean up long lines Message-ID: <20080118143812.2c767c08.weiny2@llnl.gov> >From 76499fd3eef2ec847e9967e10f87eb8ecbbfc97d Mon Sep 17 00:00:00 2001 From: Ira K. Weiny Date: Fri, 18 Jan 2008 11:46:27 -0800 Subject: [PATCH] infiniband-diags/scripts/IBswcountlimits.pm: clean up long lines Signed-off-by: Ira K. 
Weiny --- infiniband-diags/scripts/IBswcountlimits.pm | 30 ++++++++++++++++---------- 1 files changed, 18 insertions(+), 12 deletions(-) diff --git a/infiniband-diags/scripts/IBswcountlimits.pm b/infiniband-diags/scripts/IBswcountlimits.pm index aed10fc..6985750 100755 --- a/infiniband-diags/scripts/IBswcountlimits.pm +++ b/infiniband-diags/scripts/IBswcountlimits.pm @@ -295,9 +295,10 @@ sub get_link_ends my $rem_port = $3; my $rem_desc = $4; my $rem_lid = $5; - $rec = { loc_guid => "0x$guid", loc_port => $loc_port, loc_ext_port => "", loc_desc => $desc, - loc_sw_lid => $loc_sw_lid, - rem_guid => "0x$rem_guid", rem_lid => $rem_lid, rem_port => $rem_port, rem_ext_port => "", rem_desc => $rem_desc }; + $rec = { loc_guid => "0x$guid", loc_port => $loc_port, loc_ext_port => "", + loc_desc => $desc, loc_sw_lid => $loc_sw_lid, + rem_guid => "0x$rem_guid", rem_lid => $rem_lid, + rem_port => $rem_port, rem_ext_port => "", rem_desc => $rem_desc }; } if ($line =~ /^\[(\d+)\]\[ext (\d+)\]\s+\"[HSR]-(.+)\"\[(\d+)\]\(.+\)\s+#.*\"(.*)\"\.* lid (\d+).*/) { @@ -307,9 +308,10 @@ sub get_link_ends my $rem_port = $4; my $rem_desc = $5; my $rem_lid = $6; - $rec = { loc_guid => "0x$guid", loc_port => $loc_port, loc_ext_port => $loc_ext_port, loc_desc => $desc, - loc_sw_lid => $loc_sw_lid, - rem_guid => "0x$rem_guid", rem_lid => $rem_lid, rem_port => $rem_port, rem_ext_port => "", rem_desc => $rem_desc }; + $rec = { loc_guid => "0x$guid", loc_port => $loc_port, loc_ext_port => $loc_ext_port, + loc_desc => $desc, loc_sw_lid => $loc_sw_lid, + rem_guid => "0x$rem_guid", rem_lid => $rem_lid, + rem_port => $rem_port, rem_ext_port => "", rem_desc => $rem_desc }; } if ($line =~ /^\[(\d+)\]\s+\"[HSR]-(.+)\"\[(\d+)\]\[ext (\d+)\]\(.+\)\s+#.*\"(.*)\"\.* lid (\d+).*/) { @@ -319,9 +321,11 @@ sub get_link_ends my $rem_ext_port = $4; my $rem_desc = $5; my $rem_lid = $6; - $rec = { loc_guid => "0x$guid", loc_port => $loc_port, loc_ext_port => "", loc_desc => $desc, - loc_sw_lid => $loc_sw_lid, - 
rem_guid => "0x$rem_guid", rem_lid => $rem_lid, rem_port => $rem_port, rem_ext_port => $rem_ext_port, rem_desc => $rem_desc }; + $rec = { loc_guid => "0x$guid", loc_port => $loc_port, loc_ext_port => "", + loc_desc => $desc, loc_sw_lid => $loc_sw_lid, + rem_guid => "0x$rem_guid", rem_lid => $rem_lid, + rem_port => $rem_port, rem_ext_port => $rem_ext_port, + rem_desc => $rem_desc }; } if ($line =~ /^\[(\d+)\]\[ext (\d+)\]\s+\"[HSR]-(.+)\"\[(\d+)\]\[ext (\d+)\]\(.+\)\s+#.*\"(.*)\"\.* lid (\d+).*/) { @@ -332,9 +336,11 @@ sub get_link_ends my $rem_ext_port = $5; my $rem_desc = $6; my $rem_lid = $7; - $rec = { loc_guid => "0x$guid", loc_port => $loc_port, loc_ext_port => $loc_ext_port, loc_desc => $desc, - loc_sw_lid => $loc_sw_lid, - rem_guid => "0x$rem_guid", rem_lid => $rem_lid, rem_port => $rem_port, rem_ext_port => $rem_ext_port, rem_desc => $rem_desc }; + $rec = { loc_guid => "0x$guid", loc_port => $loc_port, loc_ext_port => $loc_ext_port, + loc_desc => $desc, loc_sw_lid => $loc_sw_lid, + rem_guid => "0x$rem_guid", rem_lid => $rem_lid, + rem_port => $rem_port, rem_ext_port => $rem_ext_port, + rem_desc => $rem_desc }; } if ($rec) { -- 1.5.1 -------------- next part -------------- A non-text attachment was scrubbed... Name: 0001-infiniband-diags-scripts-IBswcountlimits.pm-clean-u.patch Type: application/octet-stream Size: 3968 bytes Desc: not available URL: From xhejtman at ics.muni.cz Fri Jan 18 12:48:34 2008 From: xhejtman at ics.muni.cz (Lukas Hejtmanek) Date: Fri, 18 Jan 2008 21:48:34 +0100 Subject: ***SPAM*** Re: [ofa-general] MTHCA driver for Linux In-Reply-To: References: <20080118170117.GH4136@ics.muni.cz> Message-ID: <20080118204834.GI4136@ics.muni.cz> On Fri, Jan 18, 2008 at 12:36:00PM -0800, Roland Dreier wrote: > I would like to understand the underlying problem before blindly > setting the .shutdown method of the ib_mthca PCI driver section. 
The > mthca driver should be able to handle the hardware being in an > arbitrary state when it is reloaded -- that is why it resets the > adapter very early during initialization. Do you have any idea what > is going wrong in the case where the machine resets? The pcifront-end of xen is wrong. It touches somehow the device when the DomU is starting. At that point, it resets the box hardly, if DomU has been started already with IB driver since the box start up. If the IB device is properly shut down (rmmod ib_mthca), pcifront-end driver does not reset the box at DomU start up. -- Lukáš Hejtmánek From rdreier at cisco.com Fri Jan 18 14:49:57 2008 From: rdreier at cisco.com (Roland Dreier) Date: Fri, 18 Jan 2008 14:49:57 -0800 Subject: [ofa-general] [PATCH] IPoIB: Remove redundant check in xmit handler In-Reply-To: <20071115050537.7100.93755.sendpatchset@K50wks273871wss.in.ibm.com> (Krishna Kumar's message of "Thu, 15 Nov 2007 10:35:37 +0530") References: <20071115050537.7100.93755.sendpatchset@K50wks273871wss.in.ibm.com> Message-ID: I didn't get around to killing LLTX completely, so I applied this patch for 2.6.25. From xhejtman at ics.muni.cz Fri Jan 18 14:50:58 2008 From: xhejtman at ics.muni.cz (Lukas Hejtmanek) Date: Fri, 18 Jan 2008 23:50:58 +0100 Subject: [ofa-general] MTHCA driver for Linux In-Reply-To: References: <20080118170117.GH4136@ics.muni.cz> <20080118204834.GI4136@ics.muni.cz> Message-ID: <20080118225058.GJ4136@ics.muni.cz> On Fri, Jan 18, 2008 at 01:38:47PM -0800, Roland Dreier wrote: > I'm not sure I'm understanding what you're saying. Do you mean that > you've found a bug in the Xen pci front-end, or do you still think we > should fix this by changing the mthca driver? I'm not sure where exactly the bug is. The bug is triggered by Xen PCI front-end driver in DomU. The workaround is to either rmmod mthca driver or merge .shutdown and .remove sections of the mthca driver (in the module that runs in DomU kernel). 
I'm not sure where the bug is as the driver should leave the device in correct state. The current Linux kernel does not do that for most devices. Similar problem was with e1000 driver. If the driver was not removed before reboot, the system froze in BIOS code. This one was fixed in the BIOS of motherboard. But I believe, the drivers should not leave the device as is. Maybe people from Xen could write their opinion what should be done here. -- Lukáš Hejtmánek From weiny2 at llnl.gov Fri Jan 18 16:37:10 2008 From: weiny2 at llnl.gov (Ira Weiny) Date: Fri, 18 Jan 2008 16:37:10 -0800 Subject: [ofa-general] [PATCH] infiniband-diags/scripts/ibprintswitch.pl: fix regex when searching for switch by name Message-ID: <20080118163710.4fe2a64d.weiny2@llnl.gov> >From ac265e5c7130bf9a6b43ae2ed86ae342dbfe6bc0 Mon Sep 17 00:00:00 2001 From: Ira K. Weiny Date: Fri, 18 Jan 2008 16:26:50 -0800 Subject: [PATCH] infiniband-diags/scripts/ibprintswitch.pl: fix regex when searching for switch by name Signed-off-by: Ira K. Weiny --- infiniband-diags/scripts/ibprintswitch.pl | 2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/infiniband-diags/scripts/ibprintswitch.pl b/infiniband-diags/scripts/ibprintswitch.pl index d28a839..23a39b5 100755 --- a/infiniband-diags/scripts/ibprintswitch.pl +++ b/infiniband-diags/scripts/ibprintswitch.pl @@ -104,7 +104,7 @@ sub main print $ports{$port}; } } - if ("0x$guid" eq $target_switch || $desc =~ /.*$target_switch\s+.*/) + if ("0x$guid" eq $target_switch || $desc =~ /.*$target_switch.*/) { print $line; $in_switch = "yes"; -- 1.5.1 -------------- next part -------------- A non-text attachment was scrubbed... 
Name: 0001-infiniband-diags-scripts-ibprintswitch.pl-fix-regex.patch Type: application/octet-stream Size: 960 bytes Desc: not available URL: From weiny2 at llnl.gov Fri Jan 18 16:37:18 2008 From: weiny2 at llnl.gov (Ira Weiny) Date: Fri, 18 Jan 2008 16:37:18 -0800 Subject: [ofa-general] [PATCH] infiniband-diags/scripts/ibprintrt.pl: fix regex when searching for router by name Message-ID: <20080118163718.14017758.weiny2@llnl.gov> >From efcac0f06884c15c9a3abf97dee99953d9c5112f Mon Sep 17 00:00:00 2001 From: Ira K. Weiny Date: Fri, 18 Jan 2008 16:27:40 -0800 Subject: [PATCH] infiniband-diags/scripts/ibprintrt.pl: fix regex when searching for router by name Signed-off-by: Ira K. Weiny --- infiniband-diags/scripts/ibprintrt.pl | 2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/infiniband-diags/scripts/ibprintrt.pl b/infiniband-diags/scripts/ibprintrt.pl index 7f979e9..443cf9f 100755 --- a/infiniband-diags/scripts/ibprintrt.pl +++ b/infiniband-diags/scripts/ibprintrt.pl @@ -103,7 +103,7 @@ sub main $in_rt = "no"; goto DONE; } - if ("0x$guid" eq $target_rt || $desc =~ /.*$target_rt$/) + if ("0x$guid" eq $target_rt || $desc =~ /.*$target_rt.*/) { print $line; $in_rt = "yes"; -- 1.5.1 -------------- next part -------------- A non-text attachment was scrubbed... Name: 0001-infiniband-diags-scripts-ibprintrt.pl-fix-regex-whe.patch Type: application/octet-stream Size: 912 bytes Desc: not available URL: From weiny2 at llnl.gov Fri Jan 18 16:37:22 2008 From: weiny2 at llnl.gov (Ira Weiny) Date: Fri, 18 Jan 2008 16:37:22 -0800 Subject: [ofa-general] [PATCH] infiniband-diags/scripts/ibprintca.pl: fix regex when searching for node by name Message-ID: <20080118163722.0e7ad23a.weiny2@llnl.gov> >From 6a94e0a34428a067c125bcb11ed9f306e60c35c9 Mon Sep 17 00:00:00 2001 From: Ira K. Weiny Date: Fri, 18 Jan 2008 16:24:37 -0800 Subject: [PATCH] infiniband-diags/scripts/ibprintca.pl: fix regex when searching for node by name Signed-off-by: Ira K. 
Weiny --- infiniband-diags/scripts/ibprintca.pl | 2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/infiniband-diags/scripts/ibprintca.pl b/infiniband-diags/scripts/ibprintca.pl index 271b324..5b1e929 100755 --- a/infiniband-diags/scripts/ibprintca.pl +++ b/infiniband-diags/scripts/ibprintca.pl @@ -103,7 +103,7 @@ sub main $in_hca = "no"; goto DONE; } - if ("0x$guid" eq $target_hca || $desc =~ /.*$target_hca$/) + if ("0x$guid" eq $target_hca || $desc =~ /.*$target_hca.*/) { print $line; $in_hca = "yes"; -- 1.5.1 -------------- next part -------------- A non-text attachment was scrubbed... Name: 0001-infiniband-diags-scripts-ibprintca.pl-fix-regex-whe.patch Type: application/octet-stream Size: 916 bytes Desc: not available URL: From weiny2 at llnl.gov Fri Jan 18 16:37:28 2008 From: weiny2 at llnl.gov (Ira Weiny) Date: Fri, 18 Jan 2008 16:37:28 -0800 Subject: [ofa-general] [PATCH] infiniband-diags/scripts/iblinkinfo.pl: clean up output format Message-ID: <20080118163728.19514876.weiny2@llnl.gov> >From b29ba8352381f21c178a57d5e0bac0d82bfb64ab Mon Sep 17 00:00:00 2001 From: Ira K. Weiny Date: Fri, 18 Jan 2008 16:06:06 -0800 Subject: [PATCH] infiniband-diags/scripts/iblinkinfo.pl: clean up output format - make "line mode" explicit in the code - Better align output fields Signed-off-by: Ira K. Weiny --- infiniband-diags/scripts/iblinkinfo.pl | 36 +++++++++++++++++-------------- 1 files changed, 20 insertions(+), 16 deletions(-) diff --git a/infiniband-diags/scripts/iblinkinfo.pl b/infiniband-diags/scripts/iblinkinfo.pl index 1298f57..1103a2b 100755 --- a/infiniband-diags/scripts/iblinkinfo.pl +++ b/infiniband-diags/scripts/iblinkinfo.pl @@ -108,7 +108,7 @@ sub main } my @lines = split("\n", $data); foreach my $line (@lines) { if ($line =~ /^LifeTime:\.+(.*)/) { $pkt_lifetime = $1; } } - $pkt_life_prompt = sprintf(" (LT: %s)", $pkt_lifetime); + $pkt_life_prompt = sprintf(" (LT: %2s)", $pkt_lifetime); } foreach my $port (1 .. 
$num_ports) { my $hr = $IBswcountlimits::link_ends{$switch}{$port}; @@ -166,24 +166,17 @@ sub main if ($line =~ /^LinkWidthSupported:\.+(.*)/) { $rem_width_sup = $1; } } } - my $line_begin = ""; - my $ext_guid = ""; - if ($line_mode) - { - $line_begin = sprintf ("%18s \"%s\"%s", $switch, $hr->{loc_desc}, $pkt_life_prompt); - $ext_guid = sprintf ("%18s", $hr->{rem_guid}); - } my $capabilities = ""; if ($print_extended_cap) { - $capabilities = sprintf("(%3s %s %6s/%s [%s/%s][%s/%s])", + $capabilities = sprintf("(%3s %s %6s / %8s [%s/%s][%s/%s])", $width, $speed, $state, $phy_link_state, $width_enable, $width_sup, $speed_enable, $speed_sup); } else { - $capabilities = sprintf("(%3s %s %6s/%s)", + $capabilities = sprintf("(%3s %s %6s / %8s)", $width, $speed, $state, $phy_link_state); } if ($print_add_switch) @@ -219,12 +212,23 @@ sub main } } - push (@output_lines, sprintf (" %s %6s %4s[%2s] ==%s%s==> %s %6s %4s[%2s] \"%s\" ( %s %s)\n", - $line_begin, - $hr->{loc_sw_lid}, $port, $hr->{loc_ext_port}, - $capabilities, $port_timeouts, - $ext_guid, $hr->{rem_lid}, $hr->{rem_port}, $hr->{rem_ext_port}, - $hr->{rem_desc}, $width_msg, $speed_msg)); + if ($line_mode) + { + my $line_begin = sprintf ("%18s \"%30s\"%s", $switch, $hr->{loc_desc}, $pkt_life_prompt); + my $ext_guid = sprintf ("%18s", $hr->{rem_guid}); + push (@output_lines, sprintf ("%s %6s %4s[%2s] ==%s%s==> %18s %6s %4s[%2s] \"%s\" ( %s %s)\n", + $line_begin, + $hr->{loc_sw_lid}, $port, $hr->{loc_ext_port}, + $capabilities, $port_timeouts, + $ext_guid, $hr->{rem_lid}, $hr->{rem_port}, $hr->{rem_ext_port}, + $hr->{rem_desc}, $width_msg, $speed_msg)); + } else { + push (@output_lines, sprintf (" %6s %4s[%2s] ==%s%s==> %6s %4s[%2s] \"%s\" ( %s %s)\n", + $hr->{loc_sw_lid}, $port, $hr->{loc_ext_port}, + $capabilities, $port_timeouts, + $hr->{rem_lid}, $hr->{rem_port}, $hr->{rem_ext_port}, + $hr->{rem_desc}, $width_msg, $speed_msg)); + } $print_switch = "yes"; } } -- 1.5.1 -------------- next part -------------- A 
non-text attachment was scrubbed... Name: 0001-infiniband-diags-scripts-iblinkinfo.pl-clean-up-out.patch Type: application/octet-stream Size: 3591 bytes Desc: not available URL: From weiny2 at llnl.gov Fri Jan 18 16:37:36 2008 From: weiny2 at llnl.gov (Ira Weiny) Date: Fri, 18 Jan 2008 16:37:36 -0800 Subject: [ofa-general] [PATCH] Add -C and -P to ibprint[ca|rt|switch].pl Message-ID: <20080118163736.2bab0541.weiny2@llnl.gov> >From 5b2317bf383d22e9ccb87d3c1d24735b3e508ab4 Mon Sep 17 00:00:00 2001 From: Ira K. Weiny Date: Fri, 18 Jan 2008 16:21:28 -0800 Subject: [PATCH] Add -C and -P to ibprint[ca|rt|switch].pl Signed-off-by: Ira K. Weiny --- infiniband-diags/scripts/ibprintca.pl | 16 ++++++++++++---- infiniband-diags/scripts/ibprintrt.pl | 16 ++++++++++++---- infiniband-diags/scripts/ibprintswitch.pl | 16 ++++++++++++---- 3 files changed, 36 insertions(+), 12 deletions(-) diff --git a/infiniband-diags/scripts/ibprintca.pl b/infiniband-diags/scripts/ibprintca.pl index d311422..271b324 100755 --- a/infiniband-diags/scripts/ibprintca.pl +++ b/infiniband-diags/scripts/ibprintca.pl @@ -49,25 +49,33 @@ sub usage_and_exit print " print only the ca specified from the ibnetdiscover output\n"; print " -R Recalculate ibnetdiscover information\n"; print " -l list cas\n"; + print " -C use selected channel adaptor name for queries\n"; + print " -P use selected channel adaptor port for queries\n"; exit 0; } my $argv0 = `basename $0`; my $regenerate_map = undef; my $list_hcas = undef; +my $ca_name = ""; +my $ca_port = ""; chomp $argv0; -if (!getopts("hRl")) { usage_and_exit $argv0; } +if (!getopts("hRlC:P:")) { usage_and_exit $argv0; } if (defined $Getopt::Std::opt_h) { usage_and_exit $argv0; } if (defined $Getopt::Std::opt_R) { $regenerate_map = $Getopt::Std::opt_R; } if (defined $Getopt::Std::opt_l) { $list_hcas = $Getopt::Std::opt_l; } +if (defined $Getopt::Std::opt_C) { $ca_name = $Getopt::Std::opt_C; } +if (defined $Getopt::Std::opt_P) { $ca_port = $Getopt::Std::opt_P; } my 
$target_hca = $ARGV[0]; -if ($regenerate_map || !(-f "$IBswcountlimits::cache_dir/ibnetdiscover.topology")) { generate_ibnetdiscover_topology; } +my $cache_file = get_cache_file($ca_name, $ca_port); + +if ($regenerate_map || !(-f "$cache_file")) { generate_ibnetdiscover_topology($ca_name, $ca_port); } if ($list_hcas) { - system ("ibhosts $IBswcountlimits::cache_dir/ibnetdiscover.topology"); + system ("ibhosts $cache_file"); exit 1; } @@ -81,7 +89,7 @@ if ($target_hca eq "") sub main { my $found_hca = undef; - open IBNET_TOPO, "<$IBswcountlimits::cache_dir/ibnetdiscover.topology" or die "Failed to open ibnet topology\n"; + open IBNET_TOPO, "<$cache_file" or die "Failed to open ibnet topology\n"; my $in_hca = "no"; my %ports = undef; while (my $line = <IBNET_TOPO>) diff --git a/infiniband-diags/scripts/ibprintrt.pl b/infiniband-diags/scripts/ibprintrt.pl index d76b767..7f979e9 100755 --- a/infiniband-diags/scripts/ibprintrt.pl +++ b/infiniband-diags/scripts/ibprintrt.pl @@ -49,25 +49,33 @@ sub usage_and_exit print " print only the rt specified from the ibnetdiscover output\n"; print " -R Recalculate ibnetdiscover information\n"; print " -l list rts\n"; + print " -C use selected channel adaptor name for queries\n"; + print " -P use selected channel adaptor port for queries\n"; exit 0; } my $argv0 = `basename $0`; my $regenerate_map = undef; my $list_rts = undef; +my $ca_name = ""; +my $ca_port = ""; chomp $argv0; -if (!getopts("hRl")) { usage_and_exit $argv0; } +if (!getopts("hRlC:P:")) { usage_and_exit $argv0; } if (defined $Getopt::Std::opt_h) { usage_and_exit $argv0; } if (defined $Getopt::Std::opt_R) { $regenerate_map = $Getopt::Std::opt_R; } if (defined $Getopt::Std::opt_l) { $list_rts = $Getopt::Std::opt_l; } +if (defined $Getopt::Std::opt_C) { $ca_name = $Getopt::Std::opt_C; } +if (defined $Getopt::Std::opt_P) { $ca_port = $Getopt::Std::opt_P; } my $target_rt = $ARGV[0]; -if ($regenerate_map || !(-f "$IBswcountlimits::cache_dir/ibnetdiscover.topology")) {
generate_ibnetdiscover_topology; } +my $cache_file = get_cache_file($ca_name, $ca_port); + +if ($regenerate_map || !(-f "$cache_file")) { generate_ibnetdiscover_topology($ca_name, $ca_port); } if ($list_rts) { - system ("ibrouters $IBswcountlimits::cache_dir/ibnetdiscover.topology"); + system ("ibrouters $cache_file"); exit 1; } @@ -81,7 +89,7 @@ if ($target_rt eq "") sub main { my $found_rt = undef; - open IBNET_TOPO, "<$IBswcountlimits::cache_dir/ibnetdiscover.topology" or die "Failed to open ibnet topology\n"; + open IBNET_TOPO, "<$cache_file" or die "Failed to open ibnet topology\n"; my $in_rt = "no"; my %ports = undef; while (my $line = <IBNET_TOPO>) diff --git a/infiniband-diags/scripts/ibprintswitch.pl b/infiniband-diags/scripts/ibprintswitch.pl index 5ab8f65..d28a839 100755 --- a/infiniband-diags/scripts/ibprintswitch.pl +++ b/infiniband-diags/scripts/ibprintswitch.pl @@ -48,25 +48,33 @@ sub usage_and_exit print " print only the switch specified from the ibnetdiscover output\n"; print " -R Recalculate ibnetdiscover information\n"; print " -l list switches\n"; + print " -C use selected channel adaptor name for queries\n"; + print " -P use selected channel adaptor port for queries\n"; exit 0; } my $argv0 = `basename $0`; my $regenerate_map = undef; my $list_switches = undef; +my $ca_name = ""; +my $ca_port = ""; chomp $argv0; -if (!getopts("hRl")) { usage_and_exit $argv0; } +if (!getopts("hRlC:P:")) { usage_and_exit $argv0; } if (defined $Getopt::Std::opt_h) { usage_and_exit $argv0; } if (defined $Getopt::Std::opt_R) { $regenerate_map = $Getopt::Std::opt_R; } if (defined $Getopt::Std::opt_l) { $list_switches = $Getopt::Std::opt_l; } +if (defined $Getopt::Std::opt_C) { $ca_name = $Getopt::Std::opt_C; } +if (defined $Getopt::Std::opt_P) { $ca_port = $Getopt::Std::opt_P; } my $target_switch = $ARGV[0]; -if ($regenerate_map || !(-f "$IBswcountlimits::cache_dir/ibnetdiscover.topology")) { generate_ibnetdiscover_topology; } +my $cache_file = get_cache_file($ca_name, $ca_port);
+ +if ($regenerate_map || !(-f "$cache_file")) { generate_ibnetdiscover_topology($ca_name, $ca_port); } if ($list_switches) { - system ("ibswitches $IBswcountlimits::cache_dir/ibnetdiscover.topology"); + system ("ibswitches $cache_file"); exit 1; } @@ -80,7 +88,7 @@ if ($target_switch eq "") sub main { my $found_switch = undef; - open IBNET_TOPO, "<$IBswcountlimits::cache_dir/ibnetdiscover.topology" or die "Failed to open ibnet topology\n"; + open IBNET_TOPO, "<$cache_file" or die "Failed to open ibnet topology\n"; my $in_switch = "no"; my %ports = undef; while (my $line = <IBNET_TOPO>) -- 1.5.1 -------------- next part -------------- A non-text attachment was scrubbed... Name: 0001-Add-C-ca_name-and-P-ca_port-to-ibprint-ca-rt-s.patch Type: application/octet-stream Size: 6561 bytes Desc: not available URL: From weiny2 at llnl.gov Fri Jan 18 17:01:10 2008 From: weiny2 at llnl.gov (Ira Weiny) Date: Fri, 18 Jan 2008 17:01:10 -0800 Subject: [ofa-general] [PATCH] Update perl script man pages for -C and -P options Message-ID: <20080118170110.3fb04fb2.weiny2@llnl.gov> >From 967a1c0c87d86ce4d86d3c870692ec6ea494a5ea Mon Sep 17 00:00:00 2001 From: Ira K. Weiny Date: Fri, 18 Jan 2008 16:59:38 -0800 Subject: [PATCH] Update perl script man pages for -C and -P options Signed-off-by: Ira K.
Weiny --- infiniband-diags/man/iblinkinfo.8 | 7 ++++++- infiniband-diags/man/ibprintca.8 | 6 +++++- infiniband-diags/man/ibprintrt.8 | 7 ++++++- infiniband-diags/man/ibprintswitch.8 | 7 ++++++- infiniband-diags/man/ibqueryerrors.8 | 6 +++++- 5 files changed, 28 insertions(+), 5 deletions(-) diff --git a/infiniband-diags/man/iblinkinfo.8 b/infiniband-diags/man/iblinkinfo.8 index ffe17b0..943ef8f 100644 --- a/infiniband-diags/man/iblinkinfo.8 +++ b/infiniband-diags/man/iblinkinfo.8 @@ -5,7 +5,7 @@ iblinkinfo.pl \- report link info for all links in the fabric .SH SYNOPSIS .B iblinkinfo.pl - [-Rhcdl -v -S ] + [-Rhcdl -C -P -v -S ] .SH DESCRIPTION .PP @@ -37,6 +37,11 @@ Verify additional switch settings (,,) .TP \fB\-c\fR Print port capabilities (enabled and supported values) +.TP +\fB\-C \fR use the specified ca_name for the search. +.TP +\fB\-P \fR use the specified ca_port for the search. + .SH AUTHOR .TP diff --git a/infiniband-diags/man/ibprintca.8 b/infiniband-diags/man/ibprintca.8 index d6db2f7..ae304f7 100644 --- a/infiniband-diags/man/ibprintca.8 +++ b/infiniband-diags/man/ibprintca.8 @@ -5,7 +5,7 @@ ibprintca.pl \- print either the ca specified or the list of cas from the ibnetd .SH SYNOPSIS .B ibprintca.pl -[-R -l] [] +[-R -l -C -P ] [] .SH DESCRIPTION .PP @@ -30,6 +30,10 @@ Recalculate the ibnetdiscover information, ie do not use the cached information. This option is slower but should be used if the diag tools have not been used for some time or if there are other reasons to believe that the fabric has changed. +.TP +\fB\-C \fR use the specified ca_name for the search. +.TP +\fB\-P \fR use the specified ca_port for the search. 
.SH AUTHORS .TP diff --git a/infiniband-diags/man/ibprintrt.8 b/infiniband-diags/man/ibprintrt.8 index 309d437..4929586 100644 --- a/infiniband-diags/man/ibprintrt.8 +++ b/infiniband-diags/man/ibprintrt.8 @@ -5,7 +5,7 @@ ibprintrt.pl \- print either only the router specified or a list of routers from .SH SYNOPSIS .B ibprintrt.pl -[-R -l] [] +[-R -l -C -P ] [] .SH DESCRIPTION .PP @@ -30,6 +30,11 @@ Recalculate the ibnetdiscover information, ie do not use the cached information. This option is slower but should be used if the diag tools have not been used for some time or if there are other reasons to believe that the fabric has changed. +.TP +\fB\-C \fR use the specified ca_name for the search. +.TP +\fB\-P \fR use the specified ca_port for the search. + .SH AUTHOR .TP diff --git a/infiniband-diags/man/ibprintswitch.8 b/infiniband-diags/man/ibprintswitch.8 index 9a8524f..11e0a87 100644 --- a/infiniband-diags/man/ibprintswitch.8 +++ b/infiniband-diags/man/ibprintswitch.8 @@ -5,7 +5,7 @@ ibprintswitch.pl \- print either the switch specified or a list of switches from .SH SYNOPSIS .B ibprintswitch.pl -[-R -l] [] +[-R -l -C -P ] [] .SH DESCRIPTION .PP @@ -32,6 +32,11 @@ Recalculate the ibnetdiscover information, ie do not use the cached information. This option is slower but should be used if the diag tools have not been used for some time or if there are other reasons to believe that the fabric has changed. +.TP +\fB\-C \fR use the specified ca_name for the search. +.TP +\fB\-P \fR use the specified ca_port for the search. 
+ .SH AUTHORS .TP diff --git a/infiniband-diags/man/ibqueryerrors.8 b/infiniband-diags/man/ibqueryerrors.8 index ad8ebde..5de484d 100644 --- a/infiniband-diags/man/ibqueryerrors.8 +++ b/infiniband-diags/man/ibqueryerrors.8 @@ -5,7 +5,7 @@ ibqueryerrors.pl \- query and report non-zero IB port counters .SH SYNOPSIS .B ibqueryerrors.pl -[-a -c -r -R -s -S -d] +[-a -c -r -R -C -P -s -S -d] .SH DESCRIPTION .PP @@ -47,6 +47,10 @@ Report results only for the switch specified. .TP \fB\-d\fR Include the optional transmit and receive data counters. +.TP +\fB\-C \fR use the specified ca_name for the search. +.TP +\fB\-P \fR use the specified ca_port for the search. .SH AUTHOR -- 1.5.1 -------------- next part -------------- A non-text attachment was scrubbed... Name: 0001-Update-perl-script-man-pages-for-C-and-P-options.patch Type: application/octet-stream Size: 4747 bytes Desc: not available URL:
From jackm at dev.mellanox.co.il Fri Jan 18 23:27:08 2008 From: jackm at dev.mellanox.co.il (Jack Morgenstein) Date: Sat, 19 Jan 2008 09:27:08 +0200 Subject: [ewg] RE: [ofa-general] OFED Jan 14 meeting summary on RC2readiness In-Reply-To: References: <478D1A49.1080807@mellanox.co.il> <20080117153043.GA10065@minantech.com> Message-ID: <200801190927.08992.jackm@dev.mellanox.co.il> On Friday 18 January 2008 03:25, Roland Dreier wrote: > I guess you mean just implement XRC without allowing multiple > processes to share an XRC domain? That actually seems like a sensible > thing to implement as well... This is part of the current XRC implementation -- just give -1 as the fd value in ibv_open_xrc_domain().
- Jack From vlad at lists.openfabrics.org Sat Jan 19 03:07:24 2008 From: vlad at lists.openfabrics.org (Vladimir Sokolovsky Mellanox) Date: Sat, 19 Jan 2008 03:07:24 -0800 (PST) Subject: [ofa-general] ofa_1_3_kernel 20080119-0200 daily build status Message-ID: <20080119110724.BF9B6E601A5@openfabrics.org> This email was generated automatically, please do not reply git_url: git://git.openfabrics.org/ofed_1_3/linux-2.6.git git_branch: ofed_kernel Common build parameters: --with-ipoib-mod --with-sdp-mod --with-srp-mod --with-user_mad-mod --with-user_access-mod --with-mthca-mod --with-mlx4-mod --with-core-mod --with-addr_trans-mod --with-rds-mod --with-cxgb3-mod --with-nes-mod Passed: Passed on i686 with 2.6.15-23-server Passed on i686 with linux-2.6.22 Passed on i686 with linux-2.6.21.1 Passed on i686 with linux-2.6.18 Passed on i686 with linux-2.6.17 Passed on i686 with linux-2.6.19 Passed on i686 with linux-2.6.16 Passed on i686 with linux-2.6.13 Passed on i686 with linux-2.6.12 Passed on i686 with linux-2.6.15 Passed on i686 with linux-2.6.14 Passed on x86_64 with linux-2.6.22.5-31-default Passed on ia64 with linux-2.6.23 Passed on ia64 with linux-2.6.21.1 Passed on ia64 with linux-2.6.22 Passed on x86_64 with linux-2.6.22 Passed on x86_64 with linux-2.6.16 Passed on x86_64 with linux-2.6.21.1 Passed on x86_64 with linux-2.6.20 Passed on ppc64 with linux-2.6.14 Passed on x86_64 with linux-2.6.18-53.el5 Passed on x86_64 with linux-2.6.19 Passed on ppc64 with linux-2.6.18 Passed on powerpc with linux-2.6.13 Passed on powerpc with linux-2.6.14 Passed on x86_64 with linux-2.6.13 Passed on ppc64 with linux-2.6.17 Passed on powerpc with linux-2.6.12 Passed on ppc64 with linux-2.6.15 Passed on x86_64 with linux-2.6.18-8.el5 Passed on x86_64 with linux-2.6.18 Passed on x86_64 with linux-2.6.17 Passed on x86_64 with linux-2.6.9-55.ELsmp Passed on x86_64 with linux-2.6.15 Passed on x86_64 with linux-2.6.16.43-0.3-smp Passed on ia64 with linux-2.6.13 Passed on ia64 with 
linux-2.6.12 Passed on x86_64 with linux-2.6.9-42.ELsmp Passed on ia64 with linux-2.6.17 Passed on ppc64 with linux-2.6.12 Passed on ia64 with linux-2.6.19 Passed on powerpc with linux-2.6.15 Passed on x86_64 with linux-2.6.12 Passed on ia64 with linux-2.6.18 Passed on x86_64 with linux-2.6.14 Passed on ppc64 with linux-2.6.16 Passed on ia64 with linux-2.6.15 Passed on ia64 with linux-2.6.14 Passed on ia64 with linux-2.6.16.21-0.8-default Passed on ia64 with linux-2.6.16 Passed on ppc64 with linux-2.6.13 Passed on ppc64 with linux-2.6.19 Passed on x86_64 with linux-2.6.16.21-0.8-smp Passed on x86_64 with linux-2.6.18-1.2798.fc6 Passed on ppc64 with linux-2.6.18-8.el5 Failed:
From sashak at voltaire.com Sat Jan 19 09:48:05 2008 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Sat, 19 Jan 2008 17:48:05 +0000 Subject: [ofa-general] lock dependency in ib_user_mad In-Reply-To: <1200346175.8962.91.camel@hrosenstock-ws.xsigo.com> References: <200801081733.m08HXX3x013059@cmf.nrl.navy.mil> <000201c856f0$c50ea430$9b37170a@amr.corp.intel.com> <1200346175.8962.91.camel@hrosenstock-ws.xsigo.com> Message-ID: <20080119174805.GB10979@sashak.voltaire.com> On 13:29 Mon 14 Jan , Hal Rosenstock wrote: > > Has there been any OpenSM (and diags) testing with this ? I'd like Sasha > to ack this change (including testing multiple instances of opensm) > prior to submitting this to 2.6.25. I ran a sort of stress test with many instances of OpenSM and various diag tools for many hours (more than 48) and didn't see any problems. Sasha From sashak at voltaire.com Sat Jan 19 09:49:30 2008 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Sat, 19 Jan 2008 17:49:30 +0000 Subject: [ofa-general] [PATCH] opensm/opensm.spec.in: add Obsoletes rule Message-ID: <20080119174930.GC10979@sashak.voltaire.com> This addresses bug#852 and adds an Obsoletes rpm rule for the opensm-libs and opensm-devel packages, which should help with upgrades (from OFED-1.2x).
Signed-off-by: Sasha Khapyorsky --- opensm/opensm.spec.in | 2 ++ 1 files changed, 2 insertions(+), 0 deletions(-) diff --git a/opensm/opensm.spec.in b/opensm/opensm.spec.in index 20bcc7b..cde005f 100644 --- a/opensm/opensm.spec.in +++ b/opensm/opensm.spec.in @@ -46,6 +46,7 @@ Summary: Libraries from the opensm package Group: System Environment/Libraries Requires(post): /sbin/ldconfig Requires(postun): /sbin/ldconfig +Obsoletes: libopensm, libosmcomp, libosmvendor %description libs Shared libraries that are part of the opensm package but are also used by @@ -56,6 +57,7 @@ libraries can be installed to satisfy dependencies of other applications. Summary: Development files for OpenSM Group: System Environment/Libraries Requires: %{name}-libs = %{version}-%{release} libibumad-devel +Obsoletes: libopensm-devel, libosmcomp-devel, libosmvendor-devel %description devel Header files for OpenSM. -- 1.5.4.rc2.60.gb2e62 From sashak at voltaire.com Sat Jan 19 09:50:00 2008 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Sat, 19 Jan 2008 17:50:00 +0000 Subject: [ofa-general] [PATCH] infiniband-diags/infiniband-diags.spec.in: add Obsoletes rule Message-ID: <20080119175000.GD10979@sashak.voltaire.com> This addresses bug#854 and adds Obsoletes rpm rule for infiniband-diags packages, which should help with upgrades (from OFED-1.2x where it was named openib-diags). 
Signed-off-by: Sasha Khapyorsky --- infiniband-diags/infiniband-diags.spec.in | 1 + 1 files changed, 1 insertions(+), 0 deletions(-) diff --git a/infiniband-diags/infiniband-diags.spec.in b/infiniband-diags/infiniband-diags.spec.in index 75880c0..7d4884f 100644 --- a/infiniband-diags/infiniband-diags.spec.in +++ b/infiniband-diags/infiniband-diags.spec.in @@ -13,6 +13,7 @@ Source: http://www.openfabrics.org/downloads/management/@TARBALL@ Url: http://openfabrics.org/ BuildRequires: libibmad-devel, opensm-devel, libibcommon-devel, libibumad-devel Provides: perl(IBswcountlimits) +Obsoletes: openib-diags %description This package provides IB diagnostic programs and scripts needed to -- 1.5.4.rc2.60.gb2e62 From sashak at voltaire.com Sat Jan 19 09:53:07 2008 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Sat, 19 Jan 2008 17:53:07 +0000 Subject: [ofa-general] Re: [PATCH] opensm/scripts: removing trailing blanks In-Reply-To: <478E0D46.7050805@dev.mellanox.co.il> References: <478E0D46.7050805@dev.mellanox.co.il> Message-ID: <20080119175307.GE10979@sashak.voltaire.com> On 15:57 Wed 16 Jan , Yevgeny Kliteynik wrote: > Hi Sasha. > > Cosmetics - removing trailing blanks. > > Signed-off-by: Yevgeny Kliteynik Applied. Thanks. > --- > opensm/scripts/opensm.conf | 6 ++-- > opensm/scripts/opensmd | 54 ++++++++++++++++++------------------ > opensm/scripts/redhat-opensm.init | 54 ++++++++++++++++++------------------ > opensm/scripts/sldd.sh | 30 ++++++++++---------- > 4 files changed, 72 insertions(+), 72 deletions(-) > > diff --git a/opensm/scripts/opensm.conf b/opensm/scripts/opensm.conf > index 3c5dcdf..2ec63d4 100644 > --- a/opensm/scripts/opensm.conf > +++ b/opensm/scripts/opensm.conf > @@ -14,7 +14,7 @@ > # none, no debug options are enabled. > DEBUG=none > > -# LMC > +# LMC BTW the patch seems badly formatted because it doesn't contain the blanks in the original lines (with '-'). So I just removed the trailing whitespace.
Sasha From sashak at voltaire.com Sat Jan 19 09:54:02 2008 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Sat, 19 Jan 2008 17:54:02 +0000 Subject: [ofa-general] Re: [PATCH] opensm/scripts: fixing MAXSMPS values to the right default In-Reply-To: <478E0E4E.3010206@dev.mellanox.co.il> References: <478E0E4E.3010206@dev.mellanox.co.il> Message-ID: <20080119175402.GF10979@sashak.voltaire.com> On 16:01 Wed 16 Jan , Yevgeny Kliteynik wrote: > Hi Sasha, > > OpenSM has a maxsmps default value of 4, but the scripts have > default of 0. Fixing the defaults in startup scripts. > > Please apply to ofed_1_3 and master. > > -- Yevgeny > > Signed-off-by: Yevgeny Kliteynik Applied. Thanks. Sasha From sashak at voltaire.com Sat Jan 19 10:15:55 2008 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Sat, 19 Jan 2008 18:15:55 +0000 Subject: [ofa-general] [PATCH] opensm/lash: fix possible segfault in osm_get_lash_sl() In-Reply-To: <20080114225415.GC16009@sashak.voltaire.com> References: <18311.19963.735177.83038@kuku.melbourne.sgi.com> <20080113084001.GC1903@sashak.voltaire.com> <18313.62780.11344.45768@kuku.melbourne.sgi.com> <20080113201747.GK10650@sashak.voltaire.com> <18314.58208.527011.666002@kuku.melbourne.sgi.com> <20080114225415.GC16009@sashak.voltaire.com> Message-ID: <20080119181555.GG10979@sashak.voltaire.com> It is possible (and reproducible) that osm_get_lash_sl() is called (via an SA PathRecord query) when a switch has already been discovered by OpenSM but has not yet been processed by LASH and so still has ->priv = NULL. This can happen during a subsequent heavy sweep, somewhere between the subnet discovery and routing calculation phases of the sweep.
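
The situation described above (a switch that is already known but whose routing-private data is still NULL) can be illustrated with a minimal standalone sketch. Everything here is hypothetical: the `sketch_` types and function are simplified stand-ins, not the real OpenSM or LASH structures, and the default SL value of 0 is an assumption made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the default service level; the value 0 is an assumption. */
#define SKETCH_DEFAULT_SL 0
#define SKETCH_MAX_SWITCHES 4

/* Hypothetical routing-engine data attached to a switch after routing ran. */
struct sketch_lash_sw {
    unsigned id;
    uint8_t sl_to[SKETCH_MAX_SWITCHES]; /* SL to use towards each switch id */
};

/* Hypothetical switch object: priv stays NULL until routing is calculated. */
struct sketch_switch {
    void *priv;
};

/* Return the SL towards dst_id, falling back to the default when the switch
 * is missing or has been discovered but not yet processed (priv == NULL). */
uint8_t sketch_get_sl(const struct sketch_switch *sw, unsigned dst_id)
{
    const struct sketch_lash_sw *lash;

    if (!sw || !sw->priv)
        return SKETCH_DEFAULT_SL;

    assert(dst_id < SKETCH_MAX_SWITCHES);
    lash = sw->priv;
    return lash->sl_to[dst_id];
}
```

Without the second half of the guard, a query arriving between discovery and routing calculation would dereference the NULL private pointer; with it, the caller simply gets the default SL until routing data exists.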
Pointed out by: Max Matveev Signed-off-by: Sasha Khapyorsky --- opensm/opensm/osm_ucast_lash.c | 2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/opensm/opensm/osm_ucast_lash.c b/opensm/opensm/osm_ucast_lash.c index c17046e..cf9d701 100644 --- a/opensm/opensm/osm_ucast_lash.c +++ b/opensm/opensm/osm_ucast_lash.c @@ -1429,7 +1429,7 @@ uint8_t osm_get_lash_sl(osm_opensm_t * p_osm, osm_port_t * p_src_port, return OSM_DEFAULT_SL; p_sw = get_osm_switch_from_port(p_dst_port); - if (!p_sw) + if (!p_sw || !p_sw->priv) return OSM_DEFAULT_SL; dst_id = get_lash_id(p_sw); -- 1.5.4.rc2.38.gd6da3 From sashak at voltaire.com Sat Jan 19 10:54:40 2008 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Sat, 19 Jan 2008 18:54:40 +0000 Subject: [ofa-general] Re: [PATCH] infiniband-diags/scripts/IBswcountlimits.pm: Fix comment In-Reply-To: <20080118143756.3fa2bf20.weiny2@llnl.gov> References: <20080118143756.3fa2bf20.weiny2@llnl.gov> Message-ID: <20080119185440.GH10979@sashak.voltaire.com> On 14:37 Fri 18 Jan , Ira Weiny wrote: > From 96eb9de7b1918766020e7d9621d79d86949cfc39 Mon Sep 17 00:00:00 2001 > From: Ira K. Weiny > Date: Fri, 18 Jan 2008 11:39:01 -0800 > Subject: [PATCH] infiniband-diags/scripts/IBswcountlimits.pm: Fix comment > > > Signed-off-by: Ira K. Weiny Applied. Thanks. Sasha From sashak at voltaire.com Sat Jan 19 10:54:57 2008 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Sat, 19 Jan 2008 18:54:57 +0000 Subject: [ofa-general] Re: [PATCH] infiniband-diags/scripts/IBswcountlimits.pm: clean up long lines In-Reply-To: <20080118143812.2c767c08.weiny2@llnl.gov> References: <20080118143812.2c767c08.weiny2@llnl.gov> Message-ID: <20080119185457.GI10979@sashak.voltaire.com> On 14:38 Fri 18 Jan , Ira Weiny wrote: > From 76499fd3eef2ec847e9967e10f87eb8ecbbfc97d Mon Sep 17 00:00:00 2001 > From: Ira K. 
Weiny > Date: Fri, 18 Jan 2008 11:46:27 -0800 > Subject: [PATCH] infiniband-diags/scripts/IBswcountlimits.pm: clean up long lines > > > Signed-off-by: Ira K. Weiny Applied. Thanks. Sasha From sashak at voltaire.com Sat Jan 19 10:55:47 2008 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Sat, 19 Jan 2008 18:55:47 +0000 Subject: [ofa-general] Re: [PATCH] infiniband-diags/scripts/ibprintswitch.pl: fix regex when searching for switch by name In-Reply-To: <20080118163710.4fe2a64d.weiny2@llnl.gov> References: <20080118163710.4fe2a64d.weiny2@llnl.gov> Message-ID: <20080119185547.GJ10979@sashak.voltaire.com> Hi Ira, On 16:37 Fri 18 Jan , Ira Weiny wrote: > From ac265e5c7130bf9a6b43ae2ed86ae342dbfe6bc0 Mon Sep 17 00:00:00 2001 > From: Ira K. Weiny > Date: Fri, 18 Jan 2008 16:26:50 -0800 > Subject: [PATCH] infiniband-diags/scripts/ibprintswitch.pl: fix regex when searching for switch > by name > > Signed-off-by: Ira K. Weiny > --- > infiniband-diags/scripts/ibprintswitch.pl | 2 +- > 1 files changed, 1 insertions(+), 1 deletions(-) > > diff --git a/infiniband-diags/scripts/ibprintswitch.pl b/infiniband-diags/scripts/ibprintswitch.pl > index d28a839..23a39b5 100755 > --- a/infiniband-diags/scripts/ibprintswitch.pl > +++ b/infiniband-diags/scripts/ibprintswitch.pl > @@ -104,7 +104,7 @@ sub main > print $ports{$port}; > } > } > - if ("0x$guid" eq $target_switch || $desc =~ /.*$target_switch\s+.*/) > + if ("0x$guid" eq $target_switch || $desc =~ /.*$target_switch.*/) When will the original regex not work? Sasha From rdreier at cisco.com Sat Jan 19 13:54:15 2008 From: rdreier at cisco.com (Roland Dreier) Date: Sat, 19 Jan 2008 13:54:15 -0800 Subject: [ofa-general] [PATCH 1/2] Update License: field in librdmacm spec file Message-ID: Update License: field to match the exact format given in http://fedoraproject.org/wiki/Packaging/LicensingGuidelines for a package available under a choice of GPL or BSD license.
Signed-off-by: Roland Dreier --- librdmacm.spec.in | 2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/librdmacm.spec.in b/librdmacm.spec.in index 2fab9d4..bc55d07 100644 --- a/librdmacm.spec.in +++ b/librdmacm.spec.in @@ -6,7 +6,7 @@ Release: 1%{?dist} Summary: Userspace RDMA Connection Manager. Group: System Environment/Libraries -License: GPL/BSD +License: GPLv2 or BSD Url: http://www.openfabrics.org/ Source: http://www.openfabrics.org/downloads/%{name}-%{version}.tar.gz BuildRoot: %{_tmppath}/%{name}-%{version}-%{release}-root-%(%{__id_u} -n) From rdreier at cisco.com Sat Jan 19 13:55:46 2008 From: rdreier at cisco.com (Roland Dreier) Date: Sat, 19 Jan 2008 13:55:46 -0800 Subject: [ofa-general] [PATCH 2/2] Update %install section of librdmacm spec file Message-ID: Change from using the %makeinstall macro to using "make install" directly. The page has this to say: "Fedora's RPM includes a %makeinstall macro but it must NOT be used when make install DESTDIR=%{buildroot} works. %makeinstall is a kludge.... It is error-prone and can have unexpected effects.... It can trigger unnecessary and wrong rebuilds.... ....it can cause broken *.la files to be installed.... Instead, Fedora packages should use: make DESTDIR=%{buildroot} install or make DESTDIR=$RPM_BUILD_ROOT install" The librdmacm package uses automake, which means that the "make DESTDIR=... install" method works fine, so we should use it. 
Signed-off-by: Roland Dreier --- librdmacm.spec.in | 2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/librdmacm.spec.in b/librdmacm.spec.in index 2fab9d4..415367b 100644 --- a/librdmacm.spec.in +++ b/librdmacm.spec.in @@ -39,7 +39,7 @@ make %{?_smp_mflags} %install rm -rf $RPM_BUILD_ROOT -%makeinstall +make DESTDIR=$RPM_BUILD_ROOT install # remove unpackaged files from the buildroot rm -f $RPM_BUILD_ROOT%{_libdir}/*.la From moshek at voltaire.com Sat Jan 19 23:59:43 2008 From: moshek at voltaire.com (Moshe Kazir) Date: Sun, 20 Jan 2008 09:59:43 +0200 Subject: [ofa-general] OFED 1.2 + Lustre 1.6.4.1 and SLES 10 SP1 In-Reply-To: <2866E324F324C34293ADBD2753AB0EE501245EFA@SRVIMPMAIL.cimcorp.com.br> References: <2866E324F324C34293ADBD2753AB0EE501245EFA@SRVIMPMAIL.cimcorp.com.br> Message-ID: <39C75744D164D948A170E9792AF8E7CAC5AD1D@exil.voltaire.com> The attached file worked for me on kernel 2.6.9-55 and OFED-1.2.5. Vlad, I did not send this patch to the group as I wasn't sure if it is good / needed when the Lustre kernel patches are not installed. Now we see that the same issue was raised on more than 3 different systems. Can you have a look and say what you think? Can we add this fix to OFED-1.3? Moshe ____________________________________________________________ Moshe Katzir | +972-9971-8639 (o) | +972-52-860-6042 (m) Voltaire – The Grid Backbone www.voltaire.com ________________________________ From: general-bounces at lists.openfabrics.org [mailto:general-bounces at lists.openfabrics.org] On Behalf Of Henrique Leandro Ferreira de Almeida Sent: Friday, January 18, 2008 6:58 AM To: general at lists.openfabrics.org Cc: Rodrigo S. Caldeira Subject: [ofa-general] OFED 1.2 + Lustre 1.6.4.1 and SLES 10 SP1 Hello, I'm very glad to have so many good people to help us in this work.
Our problem is costing us our night's sleep, but maybe it is simple for many of you. We followed these steps:
1 - Install the SLES 10 SP1 operating system on an x86_64 machine... works fine
2 - Install the Lustre 1.6.4.1 kernel with the old ".config", changing the InfiniBand modules and support to monolithic (the basics: make, make modules, make modules_install, make install and reboot into the new kernel)... works fine too
3 - Uncompress and build OFED-1.2.tgz... builds fine, but there are many problems when loading ib_ipoib
4 - Manually compile from the ofa_kernel source... modules work fine
5 - Uncompress and install Lustre 1.6.4.1 from the rpm source... after building the RPMs, when we load the lustre modules, many "lustre: unknown symbols" messages appear
6 - I've tried everything "google" told us... nothing helps
If anybody can help us with a step-by-step, we would be very thankful. Best regards to all, Henrique Almeida CIMCORP Brazil - SP -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: zzz_0070_2.6.9-55.0.2.EL_lustre.1.6.2smp_backport.diff Type: application/octet-stream Size: 435 bytes Desc: zzz_0070_2.6.9-55.0.2.EL_lustre.1.6.2smp_backport.diff URL: From pasha at dev.mellanox.co.il Sun Jan 20 01:14:56 2008 From: pasha at dev.mellanox.co.il (Pavel Shamis (Pasha)) Date: Sun, 20 Jan 2008 11:14:56 +0200 Subject: [ewg] RE: [ofa-general] OFED Jan 14 meeting summary on RC2readiness In-Reply-To: <200801190927.08992.jackm@dev.mellanox.co.il> References: <478D1A49.1080807@mellanox.co.il> <20080117153043.GA10065@minantech.com> <200801190927.08992.jackm@dev.mellanox.co.il> Message-ID: <47931110.6040303@dev.mellanox.co.il> > On Friday 18 January 2008 03:25, Roland Dreier wrote: > >> I guess you mean just implement XRC without allowing multiple >> processes to share an XRC domain? That actually seems like a sensible >> thing to implement as well...
>> > > This is part of the current XRC implementation -- just give -1 as the fd value > in ibv_open_xrc_domain(). > I guess Gleb talked about one of the possible XRC usages described in the paper: http://www.cs.sandia.gov/~rbbrigh/papers/ompi-ib-pvmmpi07.pdf -- Pavel Shamis (Pasha) Mellanox Technologies From ogerlitz at voltaire.com Sun Jan 20 01:45:08 2008 From: ogerlitz at voltaire.com (Or Gerlitz) Date: Sun, 20 Jan 2008 11:45:08 +0200 Subject: [ofa-general] [PATCH 4/16] Add checksum offload support for ipoib In-Reply-To: <1200583823.6925.47.camel@mtls03> References: <1200501463.13546.73.camel@mtls03> <478F55A4.7070706@voltaire.com> <1200583823.6925.47.camel@mtls03> Message-ID: <47931824.5040002@voltaire.com> Eli Cohen wrote: > On Thu, 2008-01-17 at 15:18 +0200, Or Gerlitz wrote: > IB_DEVICE_IP_CSUM is used by IPOIB to mark the devices which are capable > of verifying checksum. This flag is reflected in both IPOIB_FLAG_CSUM > and NETIF_F_IP_CSUM. is your suggestion to use NETIF_F_IP_CSUM instead > of IPOIB_FLAG_CSUM? Because I don't see how we can omit it unless we > require the low lwvel drivers to set cqe.csum_ok only if they are sure > that checksum is OK. lets close this after setting the exact semantics of the csum_ok flag of the work completion. >> Second, the csum_ok bit is not well defined, etc as I commented on patch #3 > I will make it clearer by stating that Mellanox devices require one more > condition but the semantics of this field is that it is set only if > checksum is known to be good. Another option is to use the condition I > put in ipoib in the low level driver and that will remove the confusion. > What do you think? I suggest that we first work on patch #3 (add checksum support to the core) to agree on the conventions and then implement what needed below (in the hw drivers) and above (in ipoib). 
Or From glebn at voltaire.com Sun Jan 20 02:13:02 2008 From: glebn at voltaire.com (Gleb Natapov) Date: Sun, 20 Jan 2008 12:13:02 +0200 Subject: [ewg] RE: [ofa-general] OFED Jan 14 meeting summary on RC2readiness In-Reply-To: References: <478D1A49.1080807@mellanox.co.il> <000201c857b9$e67ee020$a937170a@amr.corp.intel.com> <20080116073459.GA20554@minantech.com> <20080117153043.GA10065@minantech.com> Message-ID: <20080120101301.GB10065@minantech.com> On Thu, Jan 17, 2008 at 05:25:14PM -0800, Roland Dreier wrote: > > Well, I can't speak for everyone, but in my opinion if someone wants to run > > MPI job so huge that XRC absolutely has to be used to be able to actually > > finish it then he should seriously rethink his application design. > > But where do you think the crossover is where XRC starts to help MPI? > In other words do I need a 10000 process job on 32-core systems for it > to matter, or is there a significant advantage for running a 2048 > process job on 256 8-core systems?
Let's do the math:
N - number of processes
C - number of cores per node
QPS - QP size (assume 4K)
N/C - number of nodes
For the non-XRC case, each process creates a QP to each other process, so the number of QPs created by each process is N (well, N - 1, but we don't care), and the memory consumed by QPs from one node is: N * C * QPS
For the XRC case, each process creates a send QP for each node and a receive QP for each process, so the memory consumed by QPs from one node is: (N/C * C + N) * QPS => 2 * N * QPS
Looking at your two examples:
1. N=10000 C=32: non-XRC memory consumption 1250M, XRC memory consumption 78.125M
2. N=2048 C=8: non-XRC memory consumption 64M, XRC memory consumption 16M
As you can see, the benefit grows fast with the number of cores. But it seems that applications that run at large scale rarely (if at all) create all-to-all connections during their run.
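The arithmetic above can be checked with a small C sketch. The function names are made up for illustration, and the 4K (4096-byte) QP size is the assumption stated in the message, not a measured value.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Per-node QP memory without XRC, in bytes.
 * Each of the C local processes holds a QP to each of the N processes,
 * so the node holds N * C QPs.
 */
uint64_t qp_mem_no_xrc(uint64_t nprocs, uint64_t cores, uint64_t qp_bytes)
{
    assert(nprocs > 0 && cores > 0);
    return nprocs * cores * qp_bytes;
}

/*
 * Per-node QP memory with XRC, in bytes.
 * (N/C * C + N) * QPS from the message simplifies to 2 * N * QPS,
 * so the result no longer depends on the core count.
 */
uint64_t qp_mem_xrc(uint64_t nprocs, uint64_t qp_bytes)
{
    assert(nprocs > 0);
    return 2 * nprocs * qp_bytes;
}
```

With N=10000, C=32 and 4096-byte QPs, these return 1310720000 bytes (1250M) and 81920000 bytes (78.125M), matching the first example in the message.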
Just one fun observation: let's assume that creating one connection takes 500 ms; then, in your first example, creating all connections from one process to all other processes will take about 1.4 hours (10000 * 0.5 s = 5000 s). Memory consumed by the QPs is not the only thing that limits scalability, BTW. If each process communicates with all other processes, it had better prepost enough receive buffers. With XRC, if a recv QP is shared by local processes and one of them goes RNR, all other processes can't receive on this QP either. And with XRC/SRQ we pretty much rely on HW flow control, so this scenario will happen. Thus, if you want to minimize RNRs, you should prepost more buffers as the job grows. -- Gleb. From ogerlitz at voltaire.com Sun Jan 20 02:13:25 2008 From: ogerlitz at voltaire.com (Or Gerlitz) Date: Sun, 20 Jan 2008 12:13:25 +0200 Subject: [ofa-general] [PATCH 14/16] ib/ipoib: Support modifying IPOIB CQ moderation params In-Reply-To: <1200585653.6925.57.camel@mtls03> References: <1200501508.13546.83.camel@mtls03> <478F56AD.1030006@voltaire.com> <1200585653.6925.57.camel@mtls03> Message-ID: <47931EC5.1080300@voltaire.com> Eli Cohen wrote: > On Thu, 2008-01-17 at 15:22 +0200, Or Gerlitz wrote: >> Eli Cohen wrote: >> IPoIB has one CQ. As I see it, this means that you should either let the >> user specify only one of rx or tx coalescing params, or make sure that >> the user did not provide something that the driver can not deploy, eg >> rx_usecs 12 tx_usecs 1 > I think making sure you can't provide two different values for rx and tx > will do. OK. Anyway, note that with your planned re-separation of the rx and tx CQs, with no interrupts used for the tx CQ, you would need to remove all the control given to the user over tx interrupt moderation... so if you do not let them control tx from day one, your life would be easier here, no? Or.
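The constraint discussed above (a single CQ serves both rx and tx, so the two sets of coalescing values must not differ) can be sketched in C as follows. This is only an illustration under stated assumptions: the structures, field names, and function are hypothetical and are not taken from the IPoIB patch, and returning -EINVAL for a mismatch is an assumed convention.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical user request; field names are illustrative only. */
struct coal_request {
    uint32_t rx_usecs;
    uint32_t tx_usecs;
    uint32_t rx_max_frames;
    uint32_t tx_max_frames;
};

/* Hypothetical moderation settings for the one shared CQ. */
struct cq_moderation {
    uint32_t usecs;
    uint32_t max_frames;
};

/*
 * Accept the request only if rx and tx ask for the same values,
 * since one CQ cannot honor two different settings
 * (for example rx_usecs 12 with tx_usecs 1 is rejected).
 */
int coal_to_cq_moderation(const struct coal_request *req,
                          struct cq_moderation *out)
{
    assert(req != NULL && out != NULL);

    if (req->rx_usecs != req->tx_usecs ||
        req->rx_max_frames != req->tx_max_frames)
        return -EINVAL;

    out->usecs = req->rx_usecs;
    out->max_frames = req->rx_max_frames;
    return 0;
}
```

If the rx and tx CQs are later separated and tx runs without interrupts, a check like this would presumably shrink to reading the rx fields only.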
From vlad at lists.openfabrics.org Sun Jan 20 03:05:48 2008 From: vlad at lists.openfabrics.org (Vladimir Sokolovsky Mellanox) Date: Sun, 20 Jan 2008 03:05:48 -0800 (PST) Subject: [ofa-general] ofa_1_3_kernel 20080120-0200 daily build status Message-ID: <20080120110549.235CDE601CA@openfabrics.org> This email was generated automatically, please do not reply git_url: git://git.openfabrics.org/ofed_1_3/linux-2.6.git git_branch: ofed_kernel Common build parameters: --with-ipoib-mod --with-sdp-mod --with-srp-mod --with-user_mad-mod --with-user_access-mod --with-mthca-mod --with-mlx4-mod --with-core-mod --with-addr_trans-mod --with-rds-mod --with-cxgb3-mod --with-nes-mod Passed: Passed on i686 with 2.6.15-23-server Passed on i686 with linux-2.6.22 Passed on i686 with linux-2.6.21.1 Passed on i686 with linux-2.6.18 Passed on i686 with linux-2.6.17 Passed on i686 with linux-2.6.16 Passed on i686 with linux-2.6.19 Passed on i686 with linux-2.6.12 Passed on i686 with linux-2.6.14 Passed on i686 with linux-2.6.13 Passed on i686 with linux-2.6.15 Passed on x86_64 with linux-2.6.22.5-31-default Passed on x86_64 with linux-2.6.22 Passed on x86_64 with linux-2.6.21.1 Passed on ia64 with linux-2.6.23 Passed on ia64 with linux-2.6.22 Passed on x86_64 with linux-2.6.18-53.el5 Passed on ia64 with linux-2.6.18 Passed on ia64 with linux-2.6.19 Passed on x86_64 with linux-2.6.16 Passed on ia64 with linux-2.6.12 Passed on powerpc with linux-2.6.15 Passed on powerpc with linux-2.6.14 Passed on powerpc with linux-2.6.12 Passed on x86_64 with linux-2.6.12 Passed on x86_64 with linux-2.6.19 Passed on powerpc with linux-2.6.13 Passed on ia64 with linux-2.6.13 Passed on ppc64 with linux-2.6.16 Passed on ppc64 with linux-2.6.15 Passed on x86_64 with linux-2.6.13 Passed on ppc64 with linux-2.6.12 Passed on ppc64 with linux-2.6.18 Passed on ppc64 with linux-2.6.14 Passed on x86_64 with linux-2.6.20 Passed on ia64 with linux-2.6.14 Passed on ia64 with linux-2.6.15 Passed on ia64 with linux-2.6.17 
Passed on x86_64 with linux-2.6.9-42.ELsmp Passed on x86_64 with linux-2.6.18-8.el5 Passed on ppc64 with linux-2.6.13 Passed on ppc64 with linux-2.6.17 Passed on x86_64 with linux-2.6.15 Passed on x86_64 with linux-2.6.14 Passed on ia64 with linux-2.6.16.21-0.8-default Passed on x86_64 with linux-2.6.18 Passed on x86_64 with linux-2.6.17 Passed on ia64 with linux-2.6.21.1 Passed on x86_64 with linux-2.6.16.43-0.3-smp Passed on ia64 with linux-2.6.16 Passed on x86_64 with linux-2.6.16.21-0.8-smp Passed on x86_64 with linux-2.6.9-55.ELsmp Passed on ppc64 with linux-2.6.19 Passed on x86_64 with linux-2.6.18-1.2798.fc6 Passed on ppc64 with linux-2.6.18-8.el5 Failed:
From vlad at dev.mellanox.co.il Sun Jan 20 05:55:32 2008 From: vlad at dev.mellanox.co.il (Vladimir Sokolovsky) Date: Sun, 20 Jan 2008 15:55:32 +0200 Subject: [ofa-general] OFED 1.2 + Lustre 1.6.4.1 and SLES 10 SP1 In-Reply-To: <39C75744D164D948A170E9792AF8E7CAC5AD1D@exil.voltaire.com> References: <2866E324F324C34293ADBD2753AB0EE501245EFA@SRVIMPMAIL.cimcorp.com.br> <39C75744D164D948A170E9792AF8E7CAC5AD1D@exil.voltaire.com> Message-ID: <479352D4.3080603@dev.mellanox.co.il> Moshe Kazir wrote: > The attached file worked for me on kernel 2.6.9-55 and OFED-1.2.5 . > > > Vlad, > > I did not sent this patch to the group as I wasn't sure if it is good / > needed when the luster kernel patches are not installed. > > Now we see that the same issue was raised on more then 3 different systems. > > Can you have a look and say what you think ? > > Can we add this fix to OFED-1.3 ? > > Moshe > > ____________________________________________________________ > > Moshe Katzir | +972-9971-8639 (o) | +972-52-860-6042 (m) > > > > Voltaire – _The Grid Backbone_ > > _ _ > > www.voltaire.com > > > Hi Moshe, This problem should occur on RHEL4.0 Update[3-6] (kernels 2.6.9-[34|42|55|67]. Your patch can be added as a backport patch for these Distributions. Regards, Vladimir
From moshek at voltaire.com Sun Jan 20 06:29:23 2008 From: moshek at voltaire.com (Moshe Kazir) Date: Sun, 20 Jan 2008 16:29:23 +0200 Subject: [ofa-general] OFED 1.2 + Lustre 1.6.4.1 and SLES 10 SP1 In-Reply-To: <479352D4.3080603@dev.mellanox.co.il> References: <2866E324F324C34293ADBD2753AB0EE501245EFA@SRVIMPMAIL.cimcorp.com.br> <39C75744D164D948A170E9792AF8E7CAC5AD1D@exil.voltaire.com> <479352D4.3080603@dev.mellanox.co.il> Message-ID: <39C75744D164D948A170E9792AF8E7CAC5AD29@exil.voltaire.com> O.K. If you think so please add it , But -> - Why we face this problem only when Luster is added ? - And as the see this user reports SLES 10 sp 1 ? Moshe ____________________________________________________________ Moshe Katzir | +972-9971-8639 (o) | +972-52-860-6042 (m) Voltaire - The Grid Backbone www.voltaire.com -----Original Message----- From: Vladimir Sokolovsky [mailto:vlad at dev.mellanox.co.il] Sent: Sunday, January 20, 2008 3:56 PM To: Moshe Kazir Cc: Henrique Leandro Ferreira de Almeida; general at lists.openfabrics.org; Vladimir Sokolovsky (Mellanox); Rodrigo S. Caldeira Subject: Re: [ofa-general] OFED 1.2 + Lustre 1.6.4.1 and SLES 10 SP1 Moshe Kazir wrote: > The attached file worked for me on kernel 2.6.9-55 and OFED-1.2.5 . > > > Vlad, > > I did not sent this patch to the group as I wasn't sure if it is good > / needed when the luster kernel patches are not installed. > > Now we see that the same issue was raised on more then 3 different systems.
> > Can you have a look and say what you think ? > > Can we add this fix to OFED-1.3 ? > > Moshe > > ____________________________________________________________ > > Moshe Katzir | +972-9971-8639 (o) | +972-52-860-6042 (m) > > > > Voltaire - _The Grid Backbone_ > > _ _ > > www.voltaire.com > > > Hi Moshe, This problem should occur on RHEL4.0 Update[3-6] (kernels 2.6.9-[34|42|55|67]. Your patch can be added as a backport patch for these Distributions. Regards, Vladimir From eli at mellanox.co.il Sun Jan 20 06:32:50 2008 From: eli at mellanox.co.il (Eli Cohen) Date: Sun, 20 Jan 2008 16:32:50 +0200 Subject: [ofa-general] [PATCH 14/16] ib/ipoib: Support modifying IPOIB CQ moderation params In-Reply-To: <47931EC5.1080300@voltaire.com> References: <1200501508.13546.83.camel@mtls03> <478F56AD.1030006@voltaire.com> <1200585653.6925.57.camel@mtls03> <47931EC5.1080300@voltaire.com> Message-ID: <1200839570.6925.130.camel@mtls03> On Sun, 2008-01-20 at 12:13 +0200, Or Gerlitz wrote: > Eli Cohen wrote: > > On Thu, 2008-01-17 at 15:22 +0200, Or Gerlitz wrote: > >> Eli Cohen wrote: > > >> IPoIB has one CQ. As I see it, this means that you should either let the > >> user specify only one of rx or tx coalescing params, or make sure that > >> the user did not provide something that the driver can not deploy, eg > >> rx_usecs 12 tx_usecs 1 > > > I think making sure you can't provide two different values for rx and tx > > will do. > > ok. > > Anyway, note that with your planned re-separation of the rx and tx CQs, > with no interrupts used for the tx CQ, you would need to remove all the > control given to the user on tx interrupt moderation... so if you will > not let them control tx from day one your life would be easier here, no? > Thinking it over , if I want to make the code correct as it is now (e.g. a single cq for both tx and rx), I must require that the user supply both tx and rx values in each invocation of ethtool and that these values be identical. 
Alternatively, we can say that rx dictates the configuration for both rx and tx. When we get separate tx and rx CQs, I will modify this patch to exclude tx configuration. From sashak at voltaire.com Sun Jan 20 07:04:21 2008 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Sun, 20 Jan 2008 15:04:21 +0000 Subject: [ofa-general] Re: [PATCH] infiniband-diags/scripts/ibprintswitch.pl: fix regex when searching for switch by name In-Reply-To: <20080119185547.GJ10979@sashak.voltaire.com> References: <20080118163710.4fe2a64d.weiny2@llnl.gov> <20080119185547.GJ10979@sashak.voltaire.com> Message-ID: <20080120150421.GM10650@sashak.voltaire.com> On 18:55 Sat 19 Jan , Sasha Khapyorsky wrote: > > diff --git a/infiniband-diags/scripts/ibprintswitch.pl b/infiniband-diags/scripts/ibprintswitch.pl > > index d28a839..23a39b5 100755 > > --- a/infiniband-diags/scripts/ibprintswitch.pl > > +++ b/infiniband-diags/scripts/ibprintswitch.pl > > @@ -104,7 +104,7 @@ sub main > > print $ports{$port}; > > } > > } > > - if ("0x$guid" eq $target_switch || $desc =~ /.*$target_switch\s+.*/) > > + if ("0x$guid" eq $target_switch || $desc =~ /.*$target_switch.*/) > > When the original regex will not work? Actually, I think I see the reason for this change: you want to match the node description as a regex. Looks fine to me. Sasha From sashak at voltaire.com Sun Jan 20 07:04:41 2008 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Sun, 20 Jan 2008 15:04:41 +0000 Subject: [ofa-general] Re: [PATCH] infiniband-diags/scripts/ibprintswitch.pl: fix regex when searching for switch by name In-Reply-To: <20080118163710.4fe2a64d.weiny2@llnl.gov> References: <20080118163710.4fe2a64d.weiny2@llnl.gov> Message-ID: <20080120150441.GN10650@sashak.voltaire.com> On 16:37 Fri 18 Jan , Ira Weiny wrote: > From ac265e5c7130bf9a6b43ae2ed86ae342dbfe6bc0 Mon Sep 17 00:00:00 2001 > From: Ira K.
Weiny > Date: Fri, 18 Jan 2008 16:26:50 -0800 > Subject: [PATCH] infiniband-diags/scripts/ibprintswitch.pl: fix regex when searching for switch > by name > > Signed-off-by: Ira K. Weiny Applied. Thanks. Sasha From sashak at voltaire.com Sun Jan 20 07:05:22 2008 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Sun, 20 Jan 2008 15:05:22 +0000 Subject: [ofa-general] Re: [PATCH] infiniband-diags/scripts/ibprintrt.pl: fix regex when searching for router by name In-Reply-To: <20080118163718.14017758.weiny2@llnl.gov> References: <20080118163718.14017758.weiny2@llnl.gov> Message-ID: <20080120150522.GO10650@sashak.voltaire.com> On 16:37 Fri 18 Jan , Ira Weiny wrote: > From efcac0f06884c15c9a3abf97dee99953d9c5112f Mon Sep 17 00:00:00 2001 > From: Ira K. Weiny > Date: Fri, 18 Jan 2008 16:27:40 -0800 > Subject: [PATCH] infiniband-diags/scripts/ibprintrt.pl: fix regex when searching for router by > name > > Signed-off-by: Ira K. Weiny Applied. Thanks. Sasha From sashak at voltaire.com Sun Jan 20 07:05:48 2008 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Sun, 20 Jan 2008 15:05:48 +0000 Subject: [ofa-general] Re: [PATCH] infiniband-diags/scripts/ibprintca.pl: fix regex when searching for node by name In-Reply-To: <20080118163722.0e7ad23a.weiny2@llnl.gov> References: <20080118163722.0e7ad23a.weiny2@llnl.gov> Message-ID: <20080120150548.GP10650@sashak.voltaire.com> On 16:37 Fri 18 Jan , Ira Weiny wrote: > From 6a94e0a34428a067c125bcb11ed9f306e60c35c9 Mon Sep 17 00:00:00 2001 > From: Ira K. Weiny > Date: Fri, 18 Jan 2008 16:24:37 -0800 > Subject: [PATCH] infiniband-diags/scripts/ibprintca.pl: fix regex when searching for node by > name > > Signed-off-by: Ira K. Weiny Applied. Thanks. 
Sasha From sashak at voltaire.com Sun Jan 20 07:06:08 2008 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Sun, 20 Jan 2008 15:06:08 +0000 Subject: [ofa-general] Re: [PATCH] infiniband-diags/scripts/iblinkinfo.pl: clean up output format In-Reply-To: <20080118163728.19514876.weiny2@llnl.gov> References: <20080118163728.19514876.weiny2@llnl.gov> Message-ID: <20080120150608.GQ10650@sashak.voltaire.com> On 16:37 Fri 18 Jan , Ira Weiny wrote: > From b29ba8352381f21c178a57d5e0bac0d82bfb64ab Mon Sep 17 00:00:00 2001 > From: Ira K. Weiny > Date: Fri, 18 Jan 2008 16:06:06 -0800 > Subject: [PATCH] infiniband-diags/scripts/iblinkinfo.pl: clean up output format > > - make "line mode" explicit in the code > - Better align output fields > > Signed-off-by: Ira K. Weiny Applied. Thanks. Sasha From sashak at voltaire.com Sun Jan 20 07:06:34 2008 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Sun, 20 Jan 2008 15:06:34 +0000 Subject: [ofa-general] Re: [PATCH] Add -C and -P to ibprint[ca|rt|switch].pl In-Reply-To: <20080118163736.2bab0541.weiny2@llnl.gov> References: <20080118163736.2bab0541.weiny2@llnl.gov> Message-ID: <20080120150634.GR10650@sashak.voltaire.com> On 16:37 Fri 18 Jan , Ira Weiny wrote: > From 5b2317bf383d22e9ccb87d3c1d24735b3e508ab4 Mon Sep 17 00:00:00 2001 > From: Ira K. Weiny > Date: Fri, 18 Jan 2008 16:21:28 -0800 > Subject: [PATCH] Add -C and -P to ibprint[ca|rt|switch].pl > > > Signed-off-by: Ira K. Weiny Applied. Thanks. Sasha From sashak at voltaire.com Sun Jan 20 07:07:06 2008 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Sun, 20 Jan 2008 15:07:06 +0000 Subject: [ofa-general] Re: [PATCH] Update perl script man pages for -C and -P options In-Reply-To: <20080118170110.3fb04fb2.weiny2@llnl.gov> References: <20080118170110.3fb04fb2.weiny2@llnl.gov> Message-ID: <20080120150706.GS10650@sashak.voltaire.com> On 17:01 Fri 18 Jan , Ira Weiny wrote: > From 967a1c0c87d86ce4d86d3c870692ec6ea494a5ea Mon Sep 17 00:00:00 2001 > From: Ira K. 
Weiny > Date: Fri, 18 Jan 2008 16:59:38 -0800 > Subject: [PATCH] Update perl script man pages for -C and -P options > > > Signed-off-by: Ira K. Weiny Applied. Thanks. Sasha From vlad at mellanox.co.il Sun Jan 20 07:23:05 2008 From: vlad at mellanox.co.il (Vladimir Sokolovsky) Date: Sun, 20 Jan 2008 17:23:05 +0200 Subject: [ofa-general] OFED 1.2 + Lustre 1.6.4.1 and SLES 10 SP1 In-Reply-To: <39C75744D164D948A170E9792AF8E7CAC5AD29@exil.voltaire.com> Message-ID: <6C2C79E72C305246B504CBA17B5500C90168D750@mtlexch01.mtl.com> Moshe, Looking again at the git tree and your patch, I see that your patch should not solve any compilation issue. As far as I know, there are no compilation issues on RHEL4 U[4-6], SLES10 SP1 and on Lustre kernels. What issue are you trying to resolve? - Vladimir. > -----Original Message----- > From: Moshe Kazir [mailto:moshek at voltaire.com] > Sent: Sunday, January 20, 2008 4:29 PM > To: Vladimir Sokolovsky > Cc: Henrique Leandro Ferreira de Almeida; > general at lists.openfabrics.org; Vladimir Sokolovsky > (Mellanox); Rodrigo S. Caldeira > Subject: RE: [ofa-general] OFED 1.2 + Lustre 1.6.4.1 and SLES 10 SP1 > > > O.K. > > If you think so please add it , > > But -> > > - Why we face this problem only when Luster is added ? > - And as the see this user reports SLES 10 sp 1 ? > > Moshe > > > ____________________________________________________________ > Moshe Katzir | +972-9971-8639 (o) | +972-52-860-6042 (m) > > Voltaire - The Grid Backbone > > www.voltaire.com > > > > -----Original Message----- > From: Vladimir Sokolovsky [mailto:vlad at dev.mellanox.co.il] > Sent: Sunday, January 20, 2008 3:56 PM > To: Moshe Kazir > Cc: Henrique Leandro Ferreira de Almeida; > general at lists.openfabrics.org; Vladimir Sokolovsky > (Mellanox); Rodrigo S. Caldeira > Subject: Re: [ofa-general] OFED 1.2 + Lustre 1.6.4.1 and SLES 10 SP1 > > Moshe Kazir wrote: > > The attached file worked for me on kernel 2.6.9-55 and OFED-1.2.5 .
> > > > > > Vlad, > > > > I did not sent this patch to the group as I wasn't sure if > it is good > > > / needed when the luster kernel patches are not installed. > > > > Now we see that the same issue was raised on more then 3 different > systems. > > > > Can you have a look and say what you think ? > > > > Can we add this fix to OFED-1.3 ? > > > > Moshe > > > > ____________________________________________________________ > > > > Moshe Katzir | +972-9971-8639 (o) | +972-52-860-6042 (m) > > > > > > > > Voltaire - _The Grid Backbone_ > > > > _ _ > > > > www.voltaire.com > > > > > > > > Hi Moshe, > This problem should occur on RHEL4.0 Update[3-6] (kernels > 2.6.9-[34|42|55|67]. Your patch can be added as a backport > patch for these Distributions. > > > Regards, > Vladimir > From sashak at voltaire.com Sun Jan 20 11:58:28 2008 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Sun, 20 Jan 2008 19:58:28 +0000 Subject: [ofa-general] [PATCH] infiniband-diags/saquery: usage field to struct query_cmd Message-ID: <20080120195828.GT10650@sashak.voltaire.com> Add usage field to struct query_cmd. When initialized this usage text will be shown as output of 'saquery --help' for the specific query type. Signed-off-by: Sasha Khapyorsky --- infiniband-diags/src/saquery.c | 13 +++++++++---- 1 files changed, 9 insertions(+), 4 deletions(-) diff --git a/infiniband-diags/src/saquery.c b/infiniband-diags/src/saquery.c index 8c0aff8..d16e604 100644 --- a/infiniband-diags/src/saquery.c +++ b/infiniband-diags/src/saquery.c @@ -1,6 +1,6 @@ /* * Copyright (c) 2006,2007 The Regents of the University of California. - * Copyright (c) 2004-2007 Voltaire, Inc. All rights reserved. + * Copyright (c) 2004-2008 Voltaire, Inc. All rights reserved. * Copyright (c) 2002-2005 Mellanox Technologies LTD. All rights reserved. * Copyright (c) 1996-2003 Intel Corporation. All rights reserved. 
* @@ -60,6 +60,7 @@ struct query_cmd { const char *name, *alias; ib_net16_t query_type; + const char *usage; int (*handler)(const struct query_cmd *q, osm_bind_handle_t bind_handle, int argc, char *argv[]); }; @@ -1418,13 +1419,17 @@ static const struct query_cmd query_cmds[] = { { "NodeRecord", "NR", IB_MAD_ATTR_NODE_RECORD, }, { "PortInfoRecord", "PIR", IB_MAD_ATTR_PORTINFO_RECORD, }, { "SL2VLTableRecord", "SL2VL", IB_MAD_ATTR_SLVL_RECORD, + "[[lid]/[in_port]/[out_port]]", print_sl2vl_records }, { "PKeyTableRecord", "PKTR", IB_MAD_ATTR_PKEY_TBL_RECORD, + "[[lid]/[port]/[block]]", print_pkey_tbl_records }, { "VLArbitrationTableRecord", "VLAR", IB_MAD_ATTR_VLARB_RECORD, + "[[lid]/[port]/[block]]", print_vlarb_records }, { "InformInfoRecord", "IIR", IB_MAD_ATTR_INFORM_INFO_RECORD, }, - { "LinkRecord", "LR", IB_MAD_ATTR_LINK_RECORD, }, + { "LinkRecord", "LR", IB_MAD_ATTR_LINK_RECORD, + "[[from_lid]/[from_port]] [[to_lid]/[to_port]]", }, { "ServiceRecord", "SR", IB_MAD_ATTR_SERVICE_RECORD, }, { "PathRecord", "PR", IB_MAD_ATTR_PATH_RECORD, }, { "MCMemberRecord", "MCMR", IB_MAD_ATTR_MCMEMBER_RECORD, }, @@ -1488,8 +1493,8 @@ usage(void) fprintf(stderr, " --node-name-map specify a node name map\n"); fprintf(stderr, "\n Supported query names (and aliases):\n"); for (q = query_cmds; q->name; q++) - fprintf(stderr, " %s (%s)\n", q->name, - q->alias ? q->alias : ""); + fprintf(stderr, " %s (%s) %s\n", q->name, + q->alias ? q->alias : "", q->usage ? 
q->usage : ""); fprintf(stderr, "\n"); exit(-1); -- 1.5.4.rc2.60.gb2e62 From sashak at voltaire.com Sun Jan 20 12:06:39 2008 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Sun, 20 Jan 2008 20:06:39 +0000 Subject: [ofa-general] [PATCH] opensm: rename field pkey to pkey_ix in gsi part of mad address In-Reply-To: <20080115195002.GL16009@sashak.voltaire.com> References: <4781FB41.6040204@dev.mellanox.co.il> <4782160B.1080709@dev.mellanox.co.il> <478217A6.80307@dev.mellanox.co.il> <47821FF3.7020705@dev.mellanox.co.il> <1199718187.20870.102.camel@hrosenstock-ws.xsigo.com> <20080113193435.GG10650@sashak.voltaire.com> <20080113193559.GH10650@sashak.voltaire.com> <20080115182142.GC16009@sashak.voltaire.com> <20080115113533.10c3c7c9.weiny2@llnl.gov> <20080115195002.GL16009@sashak.voltaire.com> Message-ID: <20080120200639.GU10650@sashak.voltaire.com> Since this field is used for storing pkey index and not pkey value rename it to pkey_ix in order to avoid confusions. Signed-off-by: Sasha Khapyorsky --- opensm/include/opensm/osm_madw.h | 2 +- opensm/libvendor/osm_vendor_al.c | 2 +- opensm/libvendor/osm_vendor_ibumad.c | 4 ++-- opensm/libvendor/osm_vendor_mlx_ibmgt.c | 2 +- opensm/libvendor/osm_vendor_mlx_sa.c | 2 +- opensm/libvendor/osm_vendor_mlx_sim.c | 8 +------- opensm/libvendor/osm_vendor_mlx_ts.c | 6 +----- opensm/libvendor/osm_vendor_mlx_ts_anafa.c | 8 +------- opensm/libvendor/osm_vendor_mtl.c | 2 +- opensm/libvendor/osm_vendor_ts.c | 8 +------- opensm/libvendor/osm_vendor_umadt.c | 5 ++--- opensm/opensm/osm_perfmgr.c | 2 +- opensm/opensm/osm_sa.c | 9 +++++---- 13 files changed, 19 insertions(+), 41 deletions(-) diff --git a/opensm/include/opensm/osm_madw.h b/opensm/include/opensm/osm_madw.h index 31707ad..f957a99 100644 --- a/opensm/include/opensm/osm_madw.h +++ b/opensm/include/opensm/osm_madw.h @@ -385,7 +385,7 @@ typedef struct _osm_mad_addr { struct _gsi { ib_net32_t remote_qp; ib_net32_t remote_qkey; - ib_net16_t pkey; + uint16_t pkey_ix; uint8_t 
service_level; boolean_t global_route; ib_grh_t grh_info; diff --git a/opensm/libvendor/osm_vendor_al.c b/opensm/libvendor/osm_vendor_al.c index 34ecf30..7d497c5 100644 --- a/opensm/libvendor/osm_vendor_al.c +++ b/opensm/libvendor/osm_vendor_al.c @@ -294,7 +294,7 @@ __osm_al_rcv_callback(IN void *mad_svc_context, IN ib_mad_element_t * p_elem) } else { mad_addr.addr_type.gsi.remote_qp = p_elem->remote_qp; mad_addr.addr_type.gsi.remote_qkey = p_elem->remote_qkey; - mad_addr.addr_type.gsi.pkey = p_elem->pkey_index; + mad_addr.addr_type.gsi.pkey_ix = p_elem->pkey_index; mad_addr.addr_type.gsi.service_level = p_elem->remote_sl; mad_addr.addr_type.gsi.global_route = FALSE; } diff --git a/opensm/libvendor/osm_vendor_ibumad.c b/opensm/libvendor/osm_vendor_ibumad.c index 8d1f070..38c6628 100644 --- a/opensm/libvendor/osm_vendor_ibumad.c +++ b/opensm/libvendor/osm_vendor_ibumad.c @@ -215,7 +215,7 @@ ib_mad_addr_conv(ib_user_mad_t * umad, osm_mad_addr_t * osm_mad_addr, osm_mad_addr->addr_type.gsi.remote_qp = ib_mad_addr->qpn; osm_mad_addr->addr_type.gsi.remote_qkey = ib_mad_addr->qkey; - osm_mad_addr->addr_type.gsi.pkey = umad_get_pkey(umad); + osm_mad_addr->addr_type.gsi.pkey_ix = umad_get_pkey(umad); osm_mad_addr->addr_type.gsi.service_level = ib_mad_addr->sl; osm_mad_addr->addr_type.gsi.global_route = 0; /* FIXME: handle GRH */ memset(&osm_mad_addr->addr_type.gsi.grh_info, 0, @@ -1047,7 +1047,7 @@ osm_vendor_send(IN osm_bind_handle_t h_bind, p_mad_addr->addr_type.gsi.service_level, IB_QP1_WELL_KNOWN_Q_KEY); umad_set_grh(p_vw->umad, 0); /* FIXME: GRH support */ - umad_set_pkey(p_vw->umad, p_mad_addr->addr_type.gsi.pkey); + umad_set_pkey(p_vw->umad, p_mad_addr->addr_type.gsi.pkey_ix); if (ib_class_is_rmpp(p_mad->mgmt_class)) { /* RMPP GSI classes FIXME: no GRH */ if (!ib_rmpp_is_flag_set((ib_rmpp_mad_t *) p_sa, IB_RMPP_FLAG_ACTIVE)) { diff --git a/opensm/libvendor/osm_vendor_mlx_ibmgt.c b/opensm/libvendor/osm_vendor_mlx_ibmgt.c index 3841356..b3d72f7 100644 --- 
a/opensm/libvendor/osm_vendor_mlx_ibmgt.c +++ b/opensm/libvendor/osm_vendor_mlx_ibmgt.c @@ -720,7 +720,7 @@ __osmv_IBMGT_rcv_desc_to_osm_addr(IN IB_MGT_mad_rcv_desc_t * p_rcv_desc, /* since this does not seem reasonable to me I simply use the default */ /* There is a TAVOR limitation that only one P_KEY is supported per */ /* QP - so QP1 must use IB_DEFAULT_PKEY */ - p_mad_addr->addr_type.gsi.pkey = IB_DEFAULT_PKEY; + p_mad_addr->addr_type.gsi.pkey_ix = 0; p_mad_addr->addr_type.gsi.service_level = p_rcv_desc->sl; p_mad_addr->addr_type.gsi.global_route = p_rcv_desc->grh_flag; diff --git a/opensm/libvendor/osm_vendor_mlx_sa.c b/opensm/libvendor/osm_vendor_mlx_sa.c index bf44413..aeb8542 100644 --- a/opensm/libvendor/osm_vendor_mlx_sa.c +++ b/opensm/libvendor/osm_vendor_mlx_sa.c @@ -533,7 +533,7 @@ __osmv_send_sa_req(IN osmv_sa_bind_info_t * p_bind, p_madw->mad_addr.addr_type.smi.source_lid = cl_hton16(p_bind->lid); p_madw->mad_addr.addr_type.gsi.remote_qp = CL_HTON32(1); p_madw->mad_addr.addr_type.gsi.remote_qkey = IB_QP1_WELL_KNOWN_Q_KEY; - p_madw->mad_addr.addr_type.gsi.pkey = IB_DEFAULT_PKEY; + p_madw->mad_addr.addr_type.gsi.pkey_ix = 0; p_madw->resp_expected = TRUE; p_madw->fail_msg = CL_DISP_MSGID_NONE; diff --git a/opensm/libvendor/osm_vendor_mlx_sim.c b/opensm/libvendor/osm_vendor_mlx_sim.c index fb5687b..c700759 100644 --- a/opensm/libvendor/osm_vendor_mlx_sim.c +++ b/opensm/libvendor/osm_vendor_mlx_sim.c @@ -382,13 +382,7 @@ __osmv_ibms_mad_addr_to_osm_addr(IN osm_vendor_t const *p_vend, p_osm_addr->addr_type.gsi.remote_qp = cl_ntoh32(p_ibms_addr->sqpn); p_osm_addr->addr_type.gsi.remote_qkey = IB_QP1_WELL_KNOWN_Q_KEY; - /* we do have the p_osm_addr->pkey_ix but how to get the PKey by index ? */ - /* the only way seems to be to use VAPI_query_hca_pkey_tbl and obtain */ - /* the full PKey table - than go by the index. 
*/ - /* since this does not seem reasonable to me I simply use the default */ - /* There is a TAVOR limitation that only one P_KEY is supported per */ - /* QP - so QP1 must use IB_DEFAULT_PKEY */ - p_osm_addr->addr_type.gsi.pkey = IB_DEFAULT_PKEY; + p_osm_addr->addr_type.gsi.pkey_ix = p_ibms_addr->pkey_index; p_osm_addr->addr_type.gsi.service_level = p_ibms_addr->sl; p_osm_addr->addr_type.gsi.global_route = FALSE; diff --git a/opensm/libvendor/osm_vendor_mlx_ts.c b/opensm/libvendor/osm_vendor_mlx_ts.c index 7bddeed..f5ca136 100644 --- a/opensm/libvendor/osm_vendor_mlx_ts.c +++ b/opensm/libvendor/osm_vendor_mlx_ts.c @@ -443,13 +443,9 @@ __osmv_TOPSPIN_mad_addr_to_osm_addr(IN osm_vendor_t const *p_vend, /* GSI */ p_mad_addr->addr_type.gsi.remote_qp = cl_ntoh32(p_mad->sqpn); p_mad_addr->addr_type.gsi.remote_qkey = IB_QP1_WELL_KNOWN_Q_KEY; - /* we do have the p_mad_addr->pkey_ix but how to get the PKey by index ? */ - /* the only way seems to be to use VAPI_query_hca_pkey_tbl and obtain */ - /* the full PKey table - than go by the index. */ - /* since this does not seem reasonable to me I simply use the default */ /* There is a TAVOR limitation that only one P_KEY is supported per */ /* QP - so QP1 must use IB_DEFAULT_PKEY */ - p_mad_addr->addr_type.gsi.pkey = IB_DEFAULT_PKEY; + p_mad_addr->addr_type.gsi.pkey_ix = p_mad->pkey_index; p_mad_addr->addr_type.gsi.service_level = p_mad->sl; p_mad_addr->addr_type.gsi.global_route = FALSE; diff --git a/opensm/libvendor/osm_vendor_mlx_ts_anafa.c b/opensm/libvendor/osm_vendor_mlx_ts_anafa.c index 8bb3a36..9cbe1b6 100644 --- a/opensm/libvendor/osm_vendor_mlx_ts_anafa.c +++ b/opensm/libvendor/osm_vendor_mlx_ts_anafa.c @@ -393,13 +393,7 @@ __osmv_TOPSPIN_ANAFA_mad_addr_to_osm_addr(IN osm_vendor_t const *p_vend, /* GSI */ p_mad_addr->addr_type.gsi.remote_qp = p_mad->sqpn; p_mad_addr->addr_type.gsi.remote_qkey = IB_QP1_WELL_KNOWN_Q_KEY; - /* we do have the p_mad_addr->pkey_ix but how to get the PKey by index ? 
*/ - /* the only way seems to be to use VAPI_query_hca_pkey_tbl and obtain */ - /* the full PKey table - than go by the index. */ - /* since this does not seem reasonable to me I simply use the default */ - /* There is a TAVOR limitation that only one P_KEY is supported per */ - /* QP - so QP1 must use IB_DEFAULT_PKEY */ - p_mad_addr->addr_type.gsi.pkey = IB_DEFAULT_PKEY; + p_mad_addr->addr_type.gsi.pkey_ix = p_mad->pkey_index; p_mad_addr->addr_type.gsi.service_level = p_mad->sl; p_mad_addr->addr_type.gsi.global_route = FALSE; diff --git a/opensm/libvendor/osm_vendor_mtl.c b/opensm/libvendor/osm_vendor_mtl.c index 8b919e4..d8f6715 100644 --- a/opensm/libvendor/osm_vendor_mtl.c +++ b/opensm/libvendor/osm_vendor_mtl.c @@ -114,7 +114,7 @@ __osm_mtl_conv_ibmgt_rcv_desc_to_osm_addr(IN osm_vendor_t * const p_vend, /* since this does not seem reasonable to me I simply use the default */ /* There is a TAVOR limitation that only one P_KEY is supported per */ /* QP - so QP1 must use IB_DEFAULT_PKEY */ - p_mad_addr->addr_type.gsi.pkey = IB_DEFAULT_PKEY; + p_mad_addr->addr_type.gsi.pkey_ix = 0; p_mad_addr->addr_type.gsi.service_level = p_rcv_desc->sl; p_mad_addr->addr_type.gsi.global_route = p_rcv_desc->grh_flag; diff --git a/opensm/libvendor/osm_vendor_ts.c b/opensm/libvendor/osm_vendor_ts.c index 0dad230..365d609 100644 --- a/opensm/libvendor/osm_vendor_ts.c +++ b/opensm/libvendor/osm_vendor_ts.c @@ -96,13 +96,7 @@ __osm_ts_conv_mad_rcv_desc_to_osm_addr(IN osm_vendor_t * const p_vend, /* GSI */ p_mad_addr->addr_type.gsi.remote_qp = p_mad->sqpn; p_mad_addr->addr_type.gsi.remote_qkey = IB_QP1_WELL_KNOWN_Q_KEY; - /* we do have the p_mad_addr->pkey_ix but how to get the PKey by index ? */ - /* the only way seems to be to use VAPI_query_hca_pkey_tbl and obtain */ - /* the full PKey table - than go by the index. 
*/ - /* since this does not seem reasonable to me I simply use the default */ - /* There is a TAVOR limitation that only one P_KEY is supported per */ - /* QP - so QP1 must use IB_DEFAULT_PKEY */ - p_mad_addr->addr_type.gsi.pkey = IB_DEFAULT_PKEY; + p_mad_addr->addr_type.gsi.pkey_ix = p_mad->pkey_index; p_mad_addr->addr_type.gsi.service_level = 0; /* HACK no way to know */ p_mad_addr->addr_type.gsi.global_route = FALSE; /* HACK no way to know */ diff --git a/opensm/libvendor/osm_vendor_umadt.c b/opensm/libvendor/osm_vendor_umadt.c index e761452..b68d6c1 100644 --- a/opensm/libvendor/osm_vendor_umadt.c +++ b/opensm/libvendor/osm_vendor_umadt.c @@ -549,7 +549,7 @@ osm_vendor_send(IN osm_bind_handle_t h_bind, p_mad_addr->addr_type.gsi.remote_qp; destAddr.AddrType.Gsi.RemoteQkey = p_mad_addr->addr_type.gsi.remote_qkey; - destAddr.AddrType.Gsi.PKey = p_mad_addr->addr_type.gsi.pkey; + destAddr.AddrType.Gsi.PKey = OSM_DEFAULT_PKEY; destAddr.AddrType.Gsi.ServiceLevel = p_mad_addr->addr_type.gsi.service_level; destAddr.AddrType.Gsi.GlobalRoute = @@ -962,8 +962,7 @@ void __mad_recv_processor(IN void *context) pRecvCmp->AddressInfo.AddrType.Gsi.RemoteQpNumber; osm_mad_addr.addr_type.gsi.remote_qkey = pRecvCmp->AddressInfo.AddrType.Gsi.RemoteQkey; - osm_mad_addr.addr_type.gsi.pkey = - pRecvCmp->AddressInfo.AddrType.Gsi.PKey; + osm_mad_addr.addr_type.gsi.pkey_ix = 0; osm_mad_addr.addr_type.gsi.service_level = pRecvCmp->AddressInfo.AddrType.Gsi.ServiceLevel; osm_mad_addr.addr_type.gsi.global_route = diff --git a/opensm/opensm/osm_perfmgr.c b/opensm/opensm/osm_perfmgr.c index 66d919d..860a20d 100644 --- a/opensm/opensm/osm_perfmgr.c +++ b/opensm/opensm/osm_perfmgr.c @@ -399,7 +399,7 @@ osm_perfmgr_send_pc_mad(osm_perfmgr_t * perfmgr, ib_net16_t dest_lid, p_madw->mad_addr.addr_type.gsi.remote_qkey = cl_hton32(IB_QP1_WELL_KNOWN_Q_KEY); /* FIXME what about other partitions */ - p_madw->mad_addr.addr_type.gsi.pkey = 0; + p_madw->mad_addr.addr_type.gsi.pkey_ix = 0; 
p_madw->mad_addr.addr_type.gsi.service_level = 0; p_madw->mad_addr.addr_type.gsi.global_route = FALSE; p_madw->resp_expected = TRUE; diff --git a/opensm/opensm/osm_sa.c b/opensm/opensm/osm_sa.c index 740fef5..c286259 100644 --- a/opensm/opensm/osm_sa.c +++ b/opensm/opensm/osm_sa.c @@ -442,7 +442,7 @@ static void sa_dump_one_inform(cl_list_item_t * p_list_item, void *cxt) " qpn_resp_time_val=0x%x" " node_type=0x%06x" " rep_addr: lid=0x%04x path_bits=0x%02x static_rate=0x%02x" - " remote_qp=0x%08x remote_qkey=0x%08x pkey=0x%04x sl=0x%02x" + " remote_qp=0x%08x remote_qkey=0x%08x pkey_ix=0x%04x sl=0x%02x" "\n\n", cl_ntoh64(p_iir->subscriber_gid.unicast.prefix), cl_ntoh64(p_iir->subscriber_gid.unicast.interface_id), @@ -462,7 +462,7 @@ static void sa_dump_one_inform(cl_list_item_t * p_list_item, void *cxt) p_infr->report_addr.static_rate, cl_ntoh32(p_infr->report_addr.addr_type.gsi.remote_qp), cl_ntoh32(p_infr->report_addr.addr_type.gsi.remote_qkey), - cl_ntoh16(p_infr->report_addr.addr_type.gsi.pkey), + p_infr->report_addr.addr_type.gsi.pkey_ix, p_infr->report_addr.addr_type.gsi.service_level); } @@ -886,6 +886,7 @@ int osm_sa_db_file_load(osm_opensm_t * p_osm) } else if (!strncmp(p, "InformInfo Record:", 18)) { ib_inform_info_record_t i_rec; osm_mad_addr_t rep_addr; + ib_net16_t val16; p_mgrp = NULL; memset(&i_rec, 0, sizeof(i_rec)); @@ -931,8 +932,8 @@ int osm_sa_db_file_load(osm_opensm_t * p_osm) &rep_addr.addr_type.gsi.remote_qp); PARSE_AHEAD(p, net32, " remote_qkey=0x", &rep_addr.addr_type.gsi.remote_qkey); - PARSE_AHEAD(p, net16, " pkey=0x", - &rep_addr.addr_type.gsi.pkey); + PARSE_AHEAD(p, net16, " pkey_ix=0x", &val16); + rep_addr.addr_type.gsi.pkey_ix = cl_ntoh16(val16); PARSE_AHEAD(p, net8, " sl=0x", &rep_addr.addr_type.gsi.service_level); -- 1.5.4.rc2.38.gd6da3
From sashak at voltaire.com Sun Jan 20 12:12:10 2008 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Sun, 20 Jan 2008 20:12:10 +0000 Subject: [ofa-general] [PATCH] ibutils/ibis: gsi.pkey to gsi.pkey_ix rename In-Reply-To: <20080120200639.GU10650@sashak.voltaire.com> References: <4782160B.1080709@dev.mellanox.co.il> <478217A6.80307@dev.mellanox.co.il> <47821FF3.7020705@dev.mellanox.co.il> <1199718187.20870.102.camel@hrosenstock-ws.xsigo.com> <20080113193435.GG10650@sashak.voltaire.com> <20080113193559.GH10650@sashak.voltaire.com> <20080115182142.GC16009@sashak.voltaire.com> <20080115113533.10c3c7c9.weiny2@llnl.gov> <20080115195002.GL16009@sashak.voltaire.com> <20080120200639.GU10650@sashak.voltaire.com> Message-ID: <20080120201210.GV10650@sashak.voltaire.com> This follows gsi.pkey to gsi.pkey_ix renaming in OpenSM mad address structure.
Signed-off-by: Sasha Khapyorsky --- ibis/src/ibbbm.c | 2 +- ibis/src/ibcr.c | 2 +- ibis/src/ibpm.c | 2 +- ibis/src/ibvs.c | 4 ++-- 4 files changed, 5 insertions(+), 5 deletions(-) diff --git a/ibis/src/ibbbm.c b/ibis/src/ibbbm.c index e60c861..5457c72 100644 --- a/ibis/src/ibbbm.c +++ b/ibis/src/ibbbm.c @@ -226,7 +226,7 @@ __ibbbm_vpd( mad_addr.static_rate = 0; mad_addr.addr_type.gsi.remote_qp=cl_hton32(1); mad_addr.addr_type.gsi.remote_qkey = cl_hton32(0x80010000); - mad_addr.addr_type.gsi.pkey = 0; + mad_addr.addr_type.gsi.pkey_ix = 0; mad_addr.addr_type.gsi.service_level = 0; mad_addr.addr_type.gsi.global_route = FALSE; diff --git a/ibis/src/ibcr.c b/ibis/src/ibcr.c index 18405ad..3d8654e 100644 --- a/ibis/src/ibcr.c +++ b/ibis/src/ibcr.c @@ -185,7 +185,7 @@ __ibcr_prep_cr_mad( mad_addr.static_rate = 0; mad_addr.addr_type.gsi.remote_qp=cl_hton32(1); mad_addr.addr_type.gsi.remote_qkey = cl_hton32(0x80010000); - mad_addr.addr_type.gsi.pkey = 0; + mad_addr.addr_type.gsi.pkey_ix = 0; mad_addr.addr_type.gsi.service_level = 0; mad_addr.addr_type.gsi.global_route = FALSE; diff --git a/ibis/src/ibpm.c b/ibis/src/ibpm.c index fce144e..0680deb 100644 --- a/ibis/src/ibpm.c +++ b/ibis/src/ibpm.c @@ -176,7 +176,7 @@ __ibpm_prep_port_counter_mad( mad_addr.static_rate = 0; mad_addr.addr_type.gsi.remote_qp=cl_hton32(1); mad_addr.addr_type.gsi.remote_qkey = cl_hton32(0x80010000); - mad_addr.addr_type.gsi.pkey = 0; + mad_addr.addr_type.gsi.pkey_ix = 0; mad_addr.addr_type.gsi.service_level = 0; mad_addr.addr_type.gsi.global_route = FALSE; diff --git a/ibis/src/ibvs.c b/ibis/src/ibvs.c index e581d0f..2857278 100644 --- a/ibis/src/ibvs.c +++ b/ibis/src/ibvs.c @@ -242,7 +242,7 @@ __ibvs_init_mad_addr( p_mad_addr->static_rate = 0; p_mad_addr->addr_type.gsi.remote_qp=cl_hton32(1); p_mad_addr->addr_type.gsi.remote_qkey = cl_hton32(0x80010000); - p_mad_addr->addr_type.gsi.pkey = 0; + p_mad_addr->addr_type.gsi.pkey_ix = 0; p_mad_addr->addr_type.gsi.service_level = 0; 
p_mad_addr->addr_type.gsi.global_route = FALSE; } @@ -1360,7 +1360,7 @@ ibvs_plft_map_get( mad_addr.static_rate = 0; mad_addr.addr_type.gsi.remote_qp=cl_hton32(0); mad_addr.addr_type.gsi.remote_qkey = 0; - mad_addr.addr_type.gsi.pkey = 0; + mad_addr.addr_type.gsi.pkey_ix = 0; mad_addr.addr_type.gsi.service_level = 0; mad_addr.addr_type.gsi.global_route = FALSE; -- 1.5.4.rc2.38.gd6da3
From tziporet at dev.mellanox.co.il Mon Jan 21 00:17:44 2008 From: tziporet at dev.mellanox.co.il (Tziporet Koren) Date: Mon, 21 Jan 2008 10:17:44 +0200 Subject: [ofa-general] Support for Ammasso in OFED 1.3? In-Reply-To: References: Message-ID: <47945528.3020701@mellanox.co.il> Allen Hubbe wrote: > Hello, > > Are there plans to include support for Ammasso hardware with OFED 1.3? > Is there anything that prevents the iw_c2 module or libamso from being > distributed and installed with the rest of OFED? > > Allen Hubbe There is no maintainer for Ammasso driver and library and this is the reason its not in OFED Tziporet
From tziporet at dev.mellanox.co.il Mon Jan 21 00:59:30 2008 From: tziporet at dev.mellanox.co.il (Tziporet Koren) Date: Mon, 21 Jan 2008 10:59:30 +0200 Subject: [ofa-general] OFED 1.3 RC2 release is available In-Reply-To: <478E3F62.1040307@datadirectnet.com> References: <6C2C79E72C305246B504CBA17B5500C9031E655B@mtlexch01.mtl.com> <478E3F62.1040307@datadirectnet.com> Message-ID: <47945EF2.4030801@mellanox.co.il> Martin W. Schlining III wrote: > What about the recent patches to SRP to use the request_limit_delta > field in the SRP Login response? Are those changes destined for OFED 1.3? > Are these going to 2.6.24? If yes then they will be in OFED 1.3. If these go only to 2.6.25 we can add them to OFED 1.3. For this can you generate these patches against 2.6.24 and we can pull them in Thanks, Tziporet From bart.vanassche at gmail.com Mon Jan 21 01:14:01 2008 From: bart.vanassche at gmail.com (Bart Van Assche) Date: Mon, 21 Jan 2008 10:14:01 +0100 Subject: [ofa-general] Performance of MT25204 versus MT25208 In-Reply-To: <6C2C79E72C305246B504CBA17B5500C903230685@mtlexch01.mtl.com> References: <6C2C79E72C305246B504CBA17B5500C90323048D@mtlexch01.mtl.com> <6C2C79E72C305246B504CBA17B5500C903230685@mtlexch01.mtl.com> Message-ID: On Jan 18, 2008 5:09 PM, Sagi Rotem wrote: > This may be your problem: > MaxReadReq 128 bytes > U need a BIOS update , common value with good performance is 512. > Alternatively u can force it using setpci but than system may be > unstable. After unloading and reloading the ib_mthca module with parameter tune_pci=1 (which sets MaxReadReq to 4096,) ib_rdma_bw now reports a bandwidth of 933 MB/s. Thanks, Bart.
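[Editorial note] The MaxReadReq values discussed in this thread (128, 512, 4096 bytes) correspond to the Max_Read_Request_Size field of the PCI Express Device Control register, which occupies bits 14:12 and encodes the size as 128 << n bytes. The C sketch below only illustrates that encoding. The helper and macro names are made up for this note and are not taken from ib_mthca, setpci, or any other tool mentioned above.

```c
#include <stdint.h>

/* Hypothetical names for illustration; the field layout (bits 14:12,
 * size = 128 << code) follows the PCI Express Device Control register. */
#define DEVCTL_READRQ_SHIFT 12
#define DEVCTL_READRQ_MASK  (0x7u << DEVCTL_READRQ_SHIFT)

/* Decode the maximum read request size, in bytes, from a Device Control value. */
static unsigned max_read_req_bytes(uint16_t devctl)
{
	unsigned code = (devctl & DEVCTL_READRQ_MASK) >> DEVCTL_READRQ_SHIFT;

	return 128u << code;
}

/* Return devctl with the read request field set to the smallest encodable
 * size that is >= bytes, capped at 4096 (code 5). Other bits are preserved. */
static uint16_t set_max_read_req(uint16_t devctl, unsigned bytes)
{
	unsigned code = 0;

	while ((128u << code) < bytes && code < 5)
		code++;
	return (uint16_t)((devctl & ~DEVCTL_READRQ_MASK) |
			  (code << DEVCTL_READRQ_SHIFT));
}
```

Under these assumptions, a register value with the field cleared decodes to the 128-byte setting reported at the start of the thread, and requesting 4096 bytes yields code 5 in bits 14:12.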
From Sagir at mellanox.co.il Mon Jan 21 01:15:48 2008 From: Sagir at mellanox.co.il (Sagi Rotem) Date: Mon, 21 Jan 2008 11:15:48 +0200 Subject: [ofa-general] Performance of MT25204 versus MT25208 In-Reply-To: References: <6C2C79E72C305246B504CBA17B5500C90323048D@mtlexch01.mtl.com> <6C2C79E72C305246B504CBA17B5500C903230685@mtlexch01.mtl.com> Message-ID: <6C2C79E72C305246B504CBA17B5500C903286A72@mtlexch01.mtl.com> Yes, this is the third option ;-) but pay attention that this is a WA, u need to upgrade the BIOS in order to get a stable system -----Original Message----- From: Bart Van Assche [mailto:bart.vanassche at gmail.com] Sent: Monday, January 21, 2008 11:14 AM To: Sagi Rotem Cc: Openib-General; Chuck Hartley Subject: Re: [ofa-general] Performance of MT25204 versus MT25208 On Jan 18, 2008 5:09 PM, Sagi Rotem wrote: > This may be your problem: > MaxReadReq 128 bytes > U need a BIOS update , common value with good performance is 512. > Alternatively u can force it using setpci but than system may be > unstable. After unloading and reloading the ib_mthca module with parameter tune_pci=1 (which sets MaxReadReq to 4096,) ib_rdma_bw now reports a bandwidth of 933 MB/s. Thanks, Bart. From ogerlitz at voltaire.com Mon Jan 21 01:19:26 2008 From: ogerlitz at voltaire.com (Or Gerlitz) Date: Mon, 21 Jan 2008 11:19:26 +0200 Subject: [ofa-general] [PATCH 14/16] ib/ipoib: Support modifying IPOIB CQ moderation params In-Reply-To: <1200839570.6925.130.camel@mtls03> References: <1200501508.13546.83.camel@mtls03> <478F56AD.1030006@voltaire.com> <1200585653.6925.57.camel@mtls03> <47931EC5.1080300@voltaire.com> <1200839570.6925.130.camel@mtls03> Message-ID: <4794639E.7070501@voltaire.com> Eli Cohen wrote: > Thinking it over , if I want to make the code correct as it is now (e.g. > a single cq for both tx and rx), I must require that the user supply > both tx and rx values in each invocation of ethtool and that these > values be identical. 
Alternatively we can say that rx dictates > configuration for both rx and tx. When we get separate tx and rx CQs > I will modify this patch to exclude tx configuration. my vote is for having the rx dictating the configuration and excluding the tx handling, but if you want to have it now and remove it later, that's fine. Or. From eli at mellanox.co.il Mon Jan 21 01:27:33 2008 From: eli at mellanox.co.il (Eli Cohen) Date: Mon, 21 Jan 2008 11:27:33 +0200 Subject: [ofa-general] [PATCH 14/16] ib/ipoib: Support modifying IPOIB CQ moderation params In-Reply-To: <4794639E.7070501@voltaire.com> References: <1200501508.13546.83.camel@mtls03> <478F56AD.1030006@voltaire.com> <1200585653.6925.57.camel@mtls03> <47931EC5.1080300@voltaire.com> <1200839570.6925.130.camel@mtls03> <4794639E.7070501@voltaire.com> Message-ID: <1200907653.6925.139.camel@mtls03> On Mon, 2008-01-21 at 11:19 +0200, Or Gerlitz wrote: > Eli Cohen wrote: > > Thinking it over , if I want to make the code correct as it is now (e.g. > > a single cq for both tx and rx), I must require that the user supply > > both tx and rx values in each invocation of ethtool and that these > > values be identical. Alternatively we can say that rx dictates > > configuration for both rx and tx. When we get separate tx and rx CQs > > I will modify this patch to exclude tx configuration. > > > my vote is for having the rx dictating the configuration and excluding > the tx handling, but if you want to have it now and remove it later, > that's fine. > > Or. > I also favor the idea of rx dictating the configuration - that's how it is implemented currently. From hch at infradead.org Mon Jan 21 02:01:51 2008 From: hch at infradead.org (Christoph Hellwig) Date: Mon, 21 Jan 2008 10:01:51 +0000 Subject: [ofa-general] Re: InfiniBand/RDMA merge plans for 2.6.25 In-Reply-To: References: Message-ID: <20080121100151.GD5333@infradead.org> On Thu, Jan 17, 2008 at 04:11:11PM -0800, Roland Dreier wrote: > - Neteffect "nes" driver. 
It's not terribly clean code but since > it's a new driver that is completely self-contained, I plan on > merging it and letting cleanups happen upstream. New code should be better quality than old code, not worse. I haven't actually seen the driver yet, but by that statement I'd be clearly against a merge.
From vlad at lists.openfabrics.org Mon Jan 21 03:12:00 2008 From: vlad at lists.openfabrics.org (Vladimir Sokolovsky Mellanox) Date: Mon, 21 Jan 2008 03:12:00 -0800 (PST) Subject: [ofa-general] ofa_1_3_kernel 20080121-0200 daily build status Message-ID: <20080121111200.364A3E601A5@openfabrics.org> This email was generated automatically, please do not reply git_url: git://git.openfabrics.org/ofed_1_3/linux-2.6.git git_branch: ofed_kernel Common build parameters: --with-ipoib-mod --with-sdp-mod --with-srp-mod --with-user_mad-mod --with-user_access-mod --with-mthca-mod --with-mlx4-mod --with-core-mod --with-addr_trans-mod --with-rds-mod --with-cxgb3-mod --with-nes-mod Passed: Passed on i686 with 2.6.15-23-server Passed on i686 with linux-2.6.22 Passed on i686 with linux-2.6.21.1 Passed on i686 with linux-2.6.18 Passed on i686 with linux-2.6.17 Passed on i686 with linux-2.6.19 Passed on i686 with linux-2.6.13 Passed on i686 with linux-2.6.16 Passed on i686 with linux-2.6.15 Passed on i686 with linux-2.6.14 Passed on i686 with linux-2.6.12 Passed on x86_64 with linux-2.6.21.1 Passed on x86_64 with linux-2.6.22.5-31-default Passed on ia64 with linux-2.6.22 Passed on powerpc with linux-2.6.14 Passed on x86_64 with linux-2.6.17 Passed on x86_64 with linux-2.6.16.43-0.3-smp Passed on x86_64 with linux-2.6.13 Passed on x86_64 with linux-2.6.22 Passed on x86_64 with linux-2.6.18-53.el5 Passed on ia64 with linux-2.6.23 Passed on powerpc with linux-2.6.15 Passed on x86_64 with linux-2.6.14 Passed on ppc64 with linux-2.6.15 Passed on ppc64 with linux-2.6.18-8.el5 Passed on ia64 with linux-2.6.21.1 Passed on x86_64 with linux-2.6.19 Passed on ppc64 with linux-2.6.13 Passed on x86_64 with linux-2.6.18-8.el5 Passed on ia64 with linux-2.6.16.21-0.8-default Passed on ppc64 with linux-2.6.12 Passed on ia64 with linux-2.6.13 Passed on ia64 with linux-2.6.19 Passed on x86_64 with linux-2.6.12 Passed on x86_64 with linux-2.6.18 Passed on ia64 with linux-2.6.17 Passed on ia64 with
linux-2.6.18 Passed on ia64 with linux-2.6.12 Passed on ppc64 with linux-2.6.14 Passed on x86_64 with linux-2.6.16 Passed on x86_64 with linux-2.6.20 Passed on powerpc with linux-2.6.12 Passed on powerpc with linux-2.6.13 Passed on ppc64 with linux-2.6.16 Passed on x86_64 with linux-2.6.15 Passed on x86_64 with linux-2.6.9-42.ELsmp Passed on ia64 with linux-2.6.15 Passed on ia64 with linux-2.6.14 Passed on ppc64 with linux-2.6.19 Passed on ia64 with linux-2.6.16 Passed on x86_64 with linux-2.6.16.21-0.8-smp Passed on x86_64 with linux-2.6.9-55.ELsmp Passed on ppc64 with linux-2.6.18 Passed on ppc64 with linux-2.6.17 Passed on x86_64 with linux-2.6.18-1.2798.fc6 Failed:
From gshipman at ornl.gov Mon Jan 21 07:28:52 2008 From: gshipman at ornl.gov (Shipman, Galen M.) Date: Mon, 21 Jan 2008 10:28:52 -0500 Subject: [ofa-general] rdma_create_qp fails with -12 Message-ID: We are seeing failures setting up a QP using rdma_create_qp. This only occurs when: init_qp_attr.cap.max_send_wr init_qp_attr.cap.max_recv_wr Totals to more than 16K. I have queried the device attributes and I have found: ib_device_attr.max_qp_wr = 65535 The CQ is setup to have: init_qp_attr.cap.max_send_wr + init_qp_attr.cap.max_recv_wr CQE's and we are not seeing any failures creating the CQ with ib_create_cq. I have tried replicating this problem using user level verbs but I am unable to do so using ibv_create_qp.
I used the test program located here: http://lists.openfabrics.org/pipermail/general/2006-February/017104.html For reference ibv_devinfo reports: hca_id: mthca0 fw_ver: 4.7.600 node_guid: 0008:f104:0397:b6b8 sys_image_guid: 0008:f104:0397:b6bb vendor_id: 0x08f1 vendor_part_id: 25208 hw_ver: 0xA0 board_id: VLT0040010001 phys_port_cnt: 2 port: 1 state: PORT_ACTIVE (4) max_mtu: 2048 (4) active_mtu: 2048 (4) sm_lid: 1 port_lid: 1 port_lmc: 0x00 port: 2 state: PORT_DOWN (1) max_mtu: 2048 (4) active_mtu: 512 (2) sm_lid: 0 port_lid: 0 port_lmc: 0x00 And lspci: 04:00.0 InfiniBand: Mellanox Technologies MT25208 InfiniHost III Ex (Tavor compatibility mode) (rev 20) Any ideas here? Thanks, Galen
From tziporet at mellanox.co.il Mon Jan 21 08:38:48 2008 From: tziporet at mellanox.co.il (Tziporet Koren) Date: Mon, 21 Jan 2008 18:38:48 +0200 Subject: [ofa-general] OFED meeting agenda Message-ID: <6C2C79E72C305246B504CBA17B5500C903286F5C@mtlexch01.mtl.com> Hi, Assuming we will have enough people in today's meeting (due to Martin Luther King Day here in the US) This is the agenda: 1. OFED 1.3-rc2 testing status - all 2. XRC status update - code is ready and working - to be submitted to the list today or tomorrow 3. Open discussion Tziporet -------------- next part -------------- An HTML attachment was scrubbed...
URL: From dillowda at ornl.gov Mon Jan 21 09:23:57 2008 From: dillowda at ornl.gov (David Dillow) Date: Mon, 21 Jan 2008 12:23:57 -0500 Subject: [ofa-general] rdma_create_qp fails with -12 In-Reply-To: References: Message-ID: <1200936237.23538.15.camel@obelisk.thedillows.org> On Mon, 2008-01-21 at 10:28 -0500, Shipman, Galen M. wrote: > We are seeing failures setting up a QP using rdma_create_qp. > > This only occurs when: > init_qp_attr.cap.max_send_wr > init_qp_attr.cap.max_recv_wr > > Totals to more than 16K. > > I have queried the device attributes and I have found: > > ib_device_attr.max_qp_wr = 65535 -12 is -ENOMEM You may want to add some printk's to mthca_alloc_wqe_buf() in mthca_qp.c. I think you're failing in the line qp->wrid = kmalloc((qp->rq.max + qp->sq.max) * sizeof (u64), GFP_KERNEL); rq.max gets set the max_recv_wr, sq.max gets max_send_wr, so you're trying to allocate 128KB when those total 16K -- that's trying to allocate 32 contiguous pages, which I believe is the max RHEL4's kernel will let you do via kmalloc(). New kernels may have alleviated this somewhat -- my Fedora 8 box has a 1MB slab/slub cache, but good luck actually allocating that if the box has been up any length of time. I'm not sure why it would work under userspace, but I've not looked very hard either. Perhaps the IOMMU is coming into play there? -- Dave Dillow National Center for Computational Science Oak Ridge National Laboratory (865) 241-6602 office
From sean.hefty at intel.com Mon Jan 21 10:31:52 2008 From: sean.hefty at intel.com (Sean Hefty) Date: Mon, 21 Jan 2008 10:31:52 -0800 Subject: [ofa-general] RE: [PATCH 2/2] Update %install section of librdmacm spec file In-Reply-To: References: Message-ID: <000101c85c5b$e450f7b0$9b37170a@amr.corp.intel.com> Thanks - both patches applied
From hartlch14 at gmail.com Mon Jan 21 12:06:55 2008 From: hartlch14 at gmail.com (Chuck Hartley) Date: Mon, 21 Jan 2008 15:06:55 -0500 Subject: [ofa-general] Performance of MT25204 versus MT25208 In-Reply-To: References: <6C2C79E72C305246B504CBA17B5500C90323048D@mtlexch01.mtl.com> <6C2C79E72C305246B504CBA17B5500C903230685@mtlexch01.mtl.com> Message-ID: It turns out that SuperMicro recently released a BIOS update for our motherboards. No release notes to say what all was done, but after installing the update we now have MaxReadReq = 4096. I ran ib_rdma_bw and now we are getting 1500MB/s instead of 1336.
On Jan 21, 2008 4:14 AM, Bart Van Assche wrote: > On Jan 18, 2008 5:09 PM, Sagi Rotem wrote: > > This may be your problem: > > MaxReadReq 128 bytes > > U need a BIOS update , common value with good performance is 512. > > Alternatively u can force it using setpci but than system may be > > unstable. > > After unloading and reloading the ib_mthca module with parameter > tune_pci=1 (which sets MaxReadReq to 4096,) ib_rdma_bw now reports a > bandwidth of 933 MB/s. > > Thanks, > > Bart. > -------------- next part -------------- An HTML attachment was scrubbed... URL: From swise at opengridcomputing.com Mon Jan 21 12:39:36 2008 From: swise at opengridcomputing.com (Steve Wise) Date: Mon, 21 Jan 2008 14:39:36 -0600 Subject: [ofa-general] [PATCH 2/3] RDMA/cxgb3: fix page shift calculation in build_phys_page_list() In-Reply-To: <20080121203829.3143.26181.stgit@dell3.ogc.int> References: <20080121203829.3143.26181.stgit@dell3.ogc.int> Message-ID: <20080121203935.3143.12791.stgit@dell3.ogc.int> RDMA/cxgb3: fix page shift calculation in build_phys_page_list() The existing logic incorrectly maps this buffer list: 0: addr 0x10001000, size 0x1000 1: addr 0x10002000, size 0x1000 To this bogus page list: 0: 0x10000000 1: 0x10002000 The shift calculation must also take into account the address of the first entry masked by the page_mask as well as the last address+size rounded up to the next page size. 
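[Editorial note] The effect described in the commit message above can be reproduced outside the kernel. The sketch below is a user-space approximation, not the driver code. It assumes 4 KB base pages and assumes that the driver picks the page shift by scanning the accumulated mask upward from the base page shift for the first set bit, with an upper bound of 27. That search loop does not appear in the patch, so treat it and all SK_-prefixed names as assumptions. The `fixed` flag toggles the extra mask terms that the patch adds.

```c
#include <stdint.h>

/* Assumed 4 KB base pages; SK_ names are local to this sketch. */
#define SK_PAGE_SHIFT 12
#define SK_PAGE_SIZE  (1ULL << SK_PAGE_SHIFT)
#define SK_PAGE_MASK  (~(SK_PAGE_SIZE - 1))

struct phys_buf {
	uint64_t addr;
	uint64_t size;
};

/* Compute the page shift for a buffer list. With fixed == 0 only the
 * addresses of entries after the first feed the mask (the pre-patch
 * behaviour as far as the diff context shows). With fixed != 0 the first
 * address and every end address are included as well, as in the patch. */
static int page_shift(const struct phys_buf *b, int n, int fixed)
{
	uint64_t mask = 0;
	int i, shift;

	for (i = 0; i < n; i++) {
		if (i > 0)
			mask |= b[i].addr;
		else if (fixed)
			mask |= b[i].addr & SK_PAGE_MASK;
		if (fixed) {
			if (i != n - 1)
				mask |= b[i].addr + b[i].size;
			else
				mask |= (b[i].addr + b[i].size +
					 SK_PAGE_SIZE - 1) & SK_PAGE_MASK;
		}
	}

	/* Assumed search: the lowest set bit at or above the base shift
	 * limits how large a page can be used. */
	for (shift = SK_PAGE_SHIFT; shift < 27; shift++)
		if ((1ULL << shift) & mask)
			break;
	return shift;
}
```

For the two-entry list from the commit message (0x10001000 and 0x10002000, each 0x1000 bytes), the unfixed variant arrives at an 8 KB page, which is why the first entry is rounded down to 0x10000000. With the extra mask terms the first address keeps bit 12 set and the result stays at the 4 KB base page.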
Signed-off-by: Steve Wise --- drivers/infiniband/hw/cxgb3/iwch_mem.c | 7 +++++++ 1 files changed, 7 insertions(+), 0 deletions(-) diff --git a/drivers/infiniband/hw/cxgb3/iwch_mem.c b/drivers/infiniband/hw/cxgb3/iwch_mem.c index a6c2c4b..73bfd16 100644 --- a/drivers/infiniband/hw/cxgb3/iwch_mem.c +++ b/drivers/infiniband/hw/cxgb3/iwch_mem.c @@ -122,6 +122,13 @@ int build_phys_page_list(struct ib_phys_buf *buffer_list, *total_size += buffer_list[i].size; if (i > 0) mask |= buffer_list[i].addr; + else + mask |= buffer_list[i].addr & PAGE_MASK; + if (i != num_phys_buf - 1) + mask |= buffer_list[i].addr + buffer_list[i].size; + else + mask |= (buffer_list[i].addr + buffer_list[i].size + + PAGE_SIZE - 1) & PAGE_MASK; } if (*total_size > 0xFFFFFFFFULL) From swise at opengridcomputing.com Mon Jan 21 12:39:33 2008 From: swise at opengridcomputing.com (Steve Wise) Date: Mon, 21 Jan 2008 14:39:33 -0600 Subject: [ofa-general] [PATCH 1/3] RDMA/cxgb3: Flush the RQ when closing. In-Reply-To: <20080121203829.3143.26181.stgit@dell3.ogc.int> References: <20080121203829.3143.26181.stgit@dell3.ogc.int> Message-ID: <20080121203932.3143.87972.stgit@dell3.ogc.int> RDMA/cxgb3: Flush the RQ when closing. 
- for kernel mode cqs, call event notification handler when flushing - flush qp when moving from RTS -> CLOSING - fixed logic to identify a kernel mode qp Signed-off-by: Steve Wise --- drivers/infiniband/hw/cxgb3/iwch_qp.c | 7 +++++-- 1 files changed, 5 insertions(+), 2 deletions(-) diff --git a/drivers/infiniband/hw/cxgb3/iwch_qp.c b/drivers/infiniband/hw/cxgb3/iwch_qp.c index 9bb8112..7681fdc 100644 --- a/drivers/infiniband/hw/cxgb3/iwch_qp.c +++ b/drivers/infiniband/hw/cxgb3/iwch_qp.c @@ -642,6 +642,7 @@ static void __flush_qp(struct iwch_qp *qhp, unsigned long *flag) cxio_flush_rq(&qhp->wq, &rchp->cq, count); spin_unlock(&qhp->lock); spin_unlock_irqrestore(&rchp->lock, *flag); + (*rchp->ibcq.comp_handler)(&rchp->ibcq, rchp->ibcq.cq_context); /* locking heirarchy: cq lock first, then qp lock. */ spin_lock_irqsave(&schp->lock, *flag); @@ -651,6 +652,7 @@ static void __flush_qp(struct iwch_qp *qhp, unsigned long *flag) cxio_flush_sq(&qhp->wq, &schp->cq, count); spin_unlock(&qhp->lock); spin_unlock_irqrestore(&schp->lock, *flag); + (*schp->ibcq.comp_handler)(&schp->ibcq, schp->ibcq.cq_context); /* deref */ if (atomic_dec_and_test(&qhp->refcnt)) @@ -661,7 +663,7 @@ static void __flush_qp(struct iwch_qp *qhp, unsigned long *flag) static void flush_qp(struct iwch_qp *qhp, unsigned long *flag) { - if (t3b_device(qhp->rhp)) + if (qhp->ibqp.uobject) cxio_set_wq_in_error(&qhp->wq); else __flush_qp(qhp, flag); @@ -830,10 +832,11 @@ int iwch_modify_qp(struct iwch_dev *rhp, struct iwch_qp *qhp, disconnect = 1; ep = qhp->ep; } + flush_qp(qhp, &flag); break; case IWCH_QP_STATE_TERMINATE: qhp->attr.state = IWCH_QP_STATE_TERMINATE; - if (t3b_device(qhp->rhp)) + if (qhp->ibqp.uobject) cxio_set_wq_in_error(&qhp->wq); if (!internal) terminate = 1; From swise at opengridcomputing.com Mon Jan 21 12:39:38 2008 From: swise at opengridcomputing.com (Steve Wise) Date: Mon, 21 Jan 2008 14:39:38 -0600 Subject: [ofa-general] [PATCH 3/3] RDMA/cxgb3: Mark qp as privileged based on user 
capabilities. In-Reply-To: <20080121203829.3143.26181.stgit@dell3.ogc.int> References: <20080121203829.3143.26181.stgit@dell3.ogc.int> Message-ID: <20080121203938.3143.44928.stgit@dell3.ogc.int> RDMA/cxgb3: Mark qp as privileged based on user capabilities. This is needed for zero-stag support. Signed-off-by: Steve Wise --- drivers/infiniband/hw/cxgb3/cxio_wr.h | 3 ++- drivers/infiniband/hw/cxgb3/iwch_qp.c | 1 + 2 files changed, 3 insertions(+), 1 deletions(-) diff --git a/drivers/infiniband/hw/cxgb3/cxio_wr.h b/drivers/infiniband/hw/cxgb3/cxio_wr.h index c84d4ac..d72b584 100644 --- a/drivers/infiniband/hw/cxgb3/cxio_wr.h +++ b/drivers/infiniband/hw/cxgb3/cxio_wr.h @@ -324,7 +324,8 @@ struct t3_genbit { }; enum rdma_init_wr_flags { - RECVS_POSTED = 1, + RECVS_POSTED = (1<<0), + PRIV_QP = (1<<1), }; union t3_wr { diff --git a/drivers/infiniband/hw/cxgb3/iwch_qp.c b/drivers/infiniband/hw/cxgb3/iwch_qp.c index 7681fdc..ea2cdd7 100644 --- a/drivers/infiniband/hw/cxgb3/iwch_qp.c +++ b/drivers/infiniband/hw/cxgb3/iwch_qp.c @@ -717,6 +717,7 @@ static int rdma_init(struct iwch_dev *rhp, struct iwch_qp *qhp, init_attr.qp_dma_addr = qhp->wq.dma_addr; init_attr.qp_dma_size = (1UL << qhp->wq.size_log2); init_attr.flags = rqes_posted(qhp) ? RECVS_POSTED : 0; + init_attr.flags |= capable(CAP_NET_BIND_SERVICE) ? PRIV_QP : 0; init_attr.irs = qhp->ep->rcv_seq; PDBG("%s init_attr.rq_addr 0x%x init_attr.rq_size = %d " "flags 0x%x qpcaps 0x%x\n", __FUNCTION__, From swise at opengridcomputing.com Mon Jan 21 12:41:30 2008 From: swise at opengridcomputing.com (Steve Wise) Date: Mon, 21 Jan 2008 14:41:30 -0600 Subject: [ofa-general] [PATCH RESEND 0/3] RDMA/cxgb3 fixes Message-ID: <20080121204130.3820.11053.stgit@dell3.ogc.int> Hey Roland, Please include these three iw_cxgb3 fixes for 2.6.25. The first two fix bugs found doing Lustre testing, and the last patch correctly marks privileged qps. Shortlog: RDMA/cxgb3: Flush the RQ when closing. 
RDMA/cxgb3: fix page shift calculation in build_phys_page_list() RDMA/cxgb3: Mark qp as privileged based on user capabilities. -- Steve. From swise at opengridcomputing.com Mon Jan 21 12:42:09 2008 From: swise at opengridcomputing.com (Steve Wise) Date: Mon, 21 Jan 2008 14:42:09 -0600 Subject: [ofa-general] [PATCH RESEND 1/3] RDMA/cxgb3: Flush the RQ when closing. In-Reply-To: <20080121204130.3820.11053.stgit@dell3.ogc.int> References: <20080121204130.3820.11053.stgit@dell3.ogc.int> Message-ID: <20080121204208.3820.92974.stgit@dell3.ogc.int> RDMA/cxgb3: Flush the RQ when closing. - for kernel mode cqs, call event notification handler when flushing - flush qp when moving from RTS -> CLOSING - fixed logic to identify a kernel mode qp Signed-off-by: Steve Wise --- drivers/infiniband/hw/cxgb3/iwch_qp.c | 7 +++++-- 1 files changed, 5 insertions(+), 2 deletions(-) diff --git a/drivers/infiniband/hw/cxgb3/iwch_qp.c b/drivers/infiniband/hw/cxgb3/iwch_qp.c index 9bb8112..7681fdc 100644 --- a/drivers/infiniband/hw/cxgb3/iwch_qp.c +++ b/drivers/infiniband/hw/cxgb3/iwch_qp.c @@ -642,6 +642,7 @@ static void __flush_qp(struct iwch_qp *qhp, unsigned long *flag) cxio_flush_rq(&qhp->wq, &rchp->cq, count); spin_unlock(&qhp->lock); spin_unlock_irqrestore(&rchp->lock, *flag); + (*rchp->ibcq.comp_handler)(&rchp->ibcq, rchp->ibcq.cq_context); /* locking heirarchy: cq lock first, then qp lock. 
*/ spin_lock_irqsave(&schp->lock, *flag); @@ -651,6 +652,7 @@ static void __flush_qp(struct iwch_qp *qhp, unsigned long *flag) cxio_flush_sq(&qhp->wq, &schp->cq, count); spin_unlock(&qhp->lock); spin_unlock_irqrestore(&schp->lock, *flag); + (*schp->ibcq.comp_handler)(&schp->ibcq, schp->ibcq.cq_context); /* deref */ if (atomic_dec_and_test(&qhp->refcnt)) @@ -661,7 +663,7 @@ static void __flush_qp(struct iwch_qp *qhp, unsigned long *flag) static void flush_qp(struct iwch_qp *qhp, unsigned long *flag) { - if (t3b_device(qhp->rhp)) + if (qhp->ibqp.uobject) cxio_set_wq_in_error(&qhp->wq); else __flush_qp(qhp, flag); @@ -830,10 +832,11 @@ int iwch_modify_qp(struct iwch_dev *rhp, struct iwch_qp *qhp, disconnect = 1; ep = qhp->ep; } + flush_qp(qhp, &flag); break; case IWCH_QP_STATE_TERMINATE: qhp->attr.state = IWCH_QP_STATE_TERMINATE; - if (t3b_device(qhp->rhp)) + if (qhp->ibqp.uobject) cxio_set_wq_in_error(&qhp->wq); if (!internal) terminate = 1; From swise at opengridcomputing.com Mon Jan 21 12:42:11 2008 From: swise at opengridcomputing.com (Steve Wise) Date: Mon, 21 Jan 2008 14:42:11 -0600 Subject: [ofa-general] [PATCH RESEND 2/3] RDMA/cxgb3: fix page shift calculation in build_phys_page_list() In-Reply-To: <20080121204130.3820.11053.stgit@dell3.ogc.int> References: <20080121204130.3820.11053.stgit@dell3.ogc.int> Message-ID: <20080121204211.3820.31337.stgit@dell3.ogc.int> RDMA/cxgb3: fix page shift calculation in build_phys_page_list() The existing logic incorrectly maps this buffer list: 0: addr 0x10001000, size 0x1000 1: addr 0x10002000, size 0x1000 To this bogus page list: 0: 0x10000000 1: 0x10002000 The shift calculation must also take into account the address of the first entry masked by the page_mask as well as the last address+size rounded up to the next page size. 
Signed-off-by: Steve Wise --- drivers/infiniband/hw/cxgb3/iwch_mem.c | 7 +++++++ 1 files changed, 7 insertions(+), 0 deletions(-) diff --git a/drivers/infiniband/hw/cxgb3/iwch_mem.c b/drivers/infiniband/hw/cxgb3/iwch_mem.c index a6c2c4b..73bfd16 100644 --- a/drivers/infiniband/hw/cxgb3/iwch_mem.c +++ b/drivers/infiniband/hw/cxgb3/iwch_mem.c @@ -122,6 +122,13 @@ int build_phys_page_list(struct ib_phys_buf *buffer_list, *total_size += buffer_list[i].size; if (i > 0) mask |= buffer_list[i].addr; + else + mask |= buffer_list[i].addr & PAGE_MASK; + if (i != num_phys_buf - 1) + mask |= buffer_list[i].addr + buffer_list[i].size; + else + mask |= (buffer_list[i].addr + buffer_list[i].size + + PAGE_SIZE - 1) & PAGE_MASK; } if (*total_size > 0xFFFFFFFFULL) From swise at opengridcomputing.com Mon Jan 21 12:42:13 2008 From: swise at opengridcomputing.com (Steve Wise) Date: Mon, 21 Jan 2008 14:42:13 -0600 Subject: [ofa-general] [PATCH RESEND 3/3] RDMA/cxgb3: Mark qp as privileged based on user capabilities. In-Reply-To: <20080121204130.3820.11053.stgit@dell3.ogc.int> References: <20080121204130.3820.11053.stgit@dell3.ogc.int> Message-ID: <20080121204213.3820.12396.stgit@dell3.ogc.int> RDMA/cxgb3: Mark qp as privileged based on user capabilities. This is needed for zero-stag support. 
Signed-off-by: Steve Wise
---
 drivers/infiniband/hw/cxgb3/cxio_wr.h | 3 ++-
 drivers/infiniband/hw/cxgb3/iwch_qp.c | 1 +
 2 files changed, 3 insertions(+), 1 deletions(-)

diff --git a/drivers/infiniband/hw/cxgb3/cxio_wr.h b/drivers/infiniband/hw/cxgb3/cxio_wr.h
index c84d4ac..d72b584 100644
--- a/drivers/infiniband/hw/cxgb3/cxio_wr.h
+++ b/drivers/infiniband/hw/cxgb3/cxio_wr.h
@@ -324,7 +324,8 @@ struct t3_genbit {
 };
 
 enum rdma_init_wr_flags {
-	RECVS_POSTED = 1,
+	RECVS_POSTED = (1<<0),
+	PRIV_QP = (1<<1),
 };
 
 union t3_wr {
diff --git a/drivers/infiniband/hw/cxgb3/iwch_qp.c b/drivers/infiniband/hw/cxgb3/iwch_qp.c
index 7681fdc..ea2cdd7 100644
--- a/drivers/infiniband/hw/cxgb3/iwch_qp.c
+++ b/drivers/infiniband/hw/cxgb3/iwch_qp.c
@@ -717,6 +717,7 @@ static int rdma_init(struct iwch_dev *rhp, struct iwch_qp *qhp,
 	init_attr.qp_dma_addr = qhp->wq.dma_addr;
 	init_attr.qp_dma_size = (1UL << qhp->wq.size_log2);
 	init_attr.flags = rqes_posted(qhp) ? RECVS_POSTED : 0;
+	init_attr.flags |= capable(CAP_NET_BIND_SERVICE) ? PRIV_QP : 0;
 	init_attr.irs = qhp->ep->rcv_seq;
 	PDBG("%s init_attr.rq_addr 0x%x init_attr.rq_size = %d "
 	     "flags 0x%x qpcaps 0x%x\n", __FUNCTION__,

From michaelc at cs.wisc.edu Mon Jan 21 14:13:25 2008
From: michaelc at cs.wisc.edu (Mike Christie)
Date: Mon, 21 Jan 2008 16:13:25 -0600
Subject: [ofa-general] Re: [PATCH 2/2] IB/iser: lower queue depth
In-Reply-To: <478F258D.3080500@voltaire.com>
References: <478F247C.9010306@voltaire.com> <478F258D.3080500@voltaire.com>
Message-ID: <47951905.7060700@cs.wisc.edu>

Erez Zilber wrote:
> Add change_queue_depth handler to scsi_host_template in the
> iSER driver. This handler was added to iscsi_tcp in order to
> solve the problem of queue depth which was too high for some
> targets. It is also applicable for iSER.
>
> Signed-off-by: Erez Zilber
> ---
>  drivers/infiniband/ulp/iser/iscsi_iser.c | 1 +
>  1 files changed, 1 insertions(+), 0 deletions(-)
>
> diff --git a/drivers/infiniband/ulp/iser/iscsi_iser.c b/drivers/infiniband/ulp/iser/iscsi_iser.c
> index bad8dac..dfa5a45 100644
> --- a/drivers/infiniband/ulp/iser/iscsi_iser.c
> +++ b/drivers/infiniband/ulp/iser/iscsi_iser.c
> @@ -551,6 +551,7 @@ static struct scsi_host_template iscsi_iser_sht = {
>  	.module = THIS_MODULE,
>  	.name = "iSCSI Initiator over iSER, v." DRV_VER,
>  	.queuecommand = iscsi_queuecommand,
> +	.change_queue_depth = iscsi_change_queue_depth,
>  	.can_queue = ISCSI_DEF_XMIT_CMDS_MAX - 1,
>  	.sg_tablesize = ISCSI_ISER_SG_TABLESIZE,
>  	.max_sectors = 1024,

Sorry for the late reply. As you know I was busy with work stuff. This
looks ok to me.

Signed-off-by: Mike Christie

One thing you will also want to do is hook iser into the session creation
command that sets this at creation time. In userspace we store the queue
depths we want to use for specific targets, and then have the session's
host cmd_per_lun use that value at startup. If in
iscsi_iser_session_create, you just pass qdepth to iscsi_session_setup
instead of passing it ISCSI_MAX_CMD_PER_LUN you will be set.
I did not do that for you because we had the disagreement about the
optimal setting.

From eli at mellanox.co.il Mon Jan 21 23:49:25 2008
From: eli at mellanox.co.il (Eli Cohen)
Date: Tue, 22 Jan 2008 09:49:25 +0200
Subject: [ofa-general] [PATCH] ib_mthca: Pre link receive WQEs
Message-ID: <1200988165.6925.171.camel@mtls03>

Pre link receive WQEs

Pre linking of receive WQEs is required in Tavor mode. This is required
for both SRQ and regular QPs. Remove an always true condition. For
memfree, linking to the next WQE is moved to mthca_free_srq_wqe(), as in
Tavor mode.

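The pre-linking for regular QPs in Tavor mode can be sketched in user
space as follows. This is only an illustration under assumptions:
struct sketch_next_seg, sketch_get_wqe() and sketch_prelink_ring() are
hypothetical stand-ins for the driver's mthca_next_seg, get_recv_wqe()
and the loop this patch adds to the QP allocation path. The ring size is
assumed to be a power of two, and the low bit of nda_op is set simply
because the patch sets it.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, reduced version of the "next" segment at the start of a WQE. */
struct sketch_next_seg {
	uint32_t nda_op;	/* big-endian offset of the next WQE, low bit set */
	uint32_t ee_nds;
};

/* Return the WQE at index i in a buffer of (1 << wqe_shift)-byte entries. */
static struct sketch_next_seg *sketch_get_wqe(void *buf, unsigned int i,
					      unsigned int wqe_shift)
{
	return (struct sketch_next_seg *)((char *)buf + ((size_t)i << wqe_shift));
}

/*
 * Link every receive WQE to its successor once, at allocation time.
 * The last entry wraps around to offset 0 because max is a power of two.
 */
static void sketch_prelink_ring(void *buf, unsigned int max,
				unsigned int wqe_shift)
{
	unsigned int i;

	assert(max != 0 && (max & (max - 1)) == 0);
	for (i = 0; i < max; ++i) {
		struct sketch_next_seg *next = sketch_get_wqe(buf, i, wqe_shift);

		next->nda_op = htonl((((i + 1) & (max - 1)) << wqe_shift) | 1);
	}
}
```

With max = 4 and wqe_shift = 6, entry 0 points at offset 64 and entry 3
wraps back to offset 0, so the post-receive path no longer has to write
the previous WQE's nda_op.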
Signed-off-by: Eli Cohen Reviewed-by: Jack Morgenstein --- drivers/infiniband/hw/mthca/mthca_qp.c | 13 ++++++++----- drivers/infiniband/hw/mthca/mthca_srq.c | 27 ++++++++++++++------------- 2 files changed, 22 insertions(+), 18 deletions(-) diff --git a/drivers/infiniband/hw/mthca/mthca_qp.c b/drivers/infiniband/hw/mthca/mthca_qp.c index 86aa732..7072a29 100644 --- a/drivers/infiniband/hw/mthca/mthca_qp.c +++ b/drivers/infiniband/hw/mthca/mthca_qp.c @@ -1175,6 +1175,7 @@ static int mthca_alloc_qp_common(struct mthca_dev *dev, { int ret; int i; + struct mthca_next_seg *next; qp->refcount = 1; init_waitqueue_head(&qp->wait); @@ -1217,7 +1218,6 @@ static int mthca_alloc_qp_common(struct mthca_dev *dev, } if (mthca_is_memfree(dev)) { - struct mthca_next_seg *next; struct mthca_data_seg *scatter; int size = (sizeof (struct mthca_next_seg) + qp->rq.max_gs * sizeof (struct mthca_data_seg)) / 16; @@ -1240,6 +1240,13 @@ static int mthca_alloc_qp_common(struct mthca_dev *dev, qp->sq.wqe_shift) + qp->send_wqe_offset); } + } else { + for (i = 0; i < qp->rq.max; ++i) { + next = get_recv_wqe(qp, i); + next->nda_op = htonl((((i + 1) & (qp->rq.max - 1)) << + qp->rq.wqe_shift) | 1); + } + } qp->sq.last = get_send_wqe(qp, qp->sq.max - 1); @@ -1863,7 +1870,6 @@ int mthca_tavor_post_receive(struct ib_qp *ibqp, struct ib_recv_wr *wr, prev_wqe = qp->rq.last; qp->rq.last = wqe; - ((struct mthca_next_seg *) wqe)->nda_op = 0; ((struct mthca_next_seg *) wqe)->ee_nds = cpu_to_be32(MTHCA_NEXT_DBD); ((struct mthca_next_seg *) wqe)->flags = 0; @@ -1885,9 +1891,6 @@ int mthca_tavor_post_receive(struct ib_qp *ibqp, struct ib_recv_wr *wr, qp->wrid[ind] = wr->wr_id; - ((struct mthca_next_seg *) prev_wqe)->nda_op = - cpu_to_be32((ind << qp->rq.wqe_shift) | 1); - wmb(); ((struct mthca_next_seg *) prev_wqe)->ee_nds = cpu_to_be32(MTHCA_NEXT_DBD | size); diff --git a/drivers/infiniband/hw/mthca/mthca_srq.c b/drivers/infiniband/hw/mthca/mthca_srq.c index 553d681..af8483c 100644 --- 
a/drivers/infiniband/hw/mthca/mthca_srq.c +++ b/drivers/infiniband/hw/mthca/mthca_srq.c @@ -175,9 +175,17 @@ static int mthca_alloc_srq_buf(struct mthca_dev *dev, struct mthca_pd *pd, * scatter list L_Keys to the sentry value of 0x100. */ for (i = 0; i < srq->max; ++i) { - wqe = get_wqe(srq, i); + struct mthca_next_seg *next; - *wqe_to_link(wqe) = i < srq->max - 1 ? i + 1 : -1; + next = wqe = get_wqe(srq, i); + + if (i < srq->max - 1) { + *wqe_to_link(wqe) = i + 1; + next->nda_op = htonl(((i + 1) << srq->wqe_shift) | 1); + } else { + *wqe_to_link(wqe) = -1; + next->nda_op = 0; + } for (scatter = wqe + sizeof (struct mthca_next_seg); (void *) scatter < wqe + (1 << srq->wqe_shift); @@ -470,16 +478,15 @@ out: void mthca_free_srq_wqe(struct mthca_srq *srq, u32 wqe_addr) { int ind; + struct mthca_next_seg *last_free; ind = wqe_addr >> srq->wqe_shift; spin_lock(&srq->lock); - if (likely(srq->first_free >= 0)) - *wqe_to_link(get_wqe(srq, srq->last_free)) = ind; - else - srq->first_free = ind; - + last_free = get_wqe(srq, srq->last_free); + *wqe_to_link(last_free) = ind; + last_free->nda_op = htonl((ind << srq->wqe_shift) | 1); *wqe_to_link(get_wqe(srq, ind)) = -1; srq->last_free = ind; @@ -528,7 +535,6 @@ int mthca_tavor_post_srq_recv(struct ib_srq *ibsrq, struct ib_recv_wr *wr, prev_wqe = srq->last; srq->last = wqe; - ((struct mthca_next_seg *) wqe)->nda_op = 0; ((struct mthca_next_seg *) wqe)->ee_nds = 0; /* flags field will always remain 0 */ @@ -549,9 +555,6 @@ int mthca_tavor_post_srq_recv(struct ib_srq *ibsrq, struct ib_recv_wr *wr, if (i < srq->max_gs) mthca_set_data_seg_inval(wqe); - ((struct mthca_next_seg *) prev_wqe)->nda_op = - cpu_to_be32((ind << srq->wqe_shift) | 1); - wmb(); ((struct mthca_next_seg *) prev_wqe)->ee_nds = cpu_to_be32(MTHCA_NEXT_DBD); @@ -633,8 +636,6 @@ int mthca_arbel_post_srq_recv(struct ib_srq *ibsrq, struct ib_recv_wr *wr, break; } - ((struct mthca_next_seg *) wqe)->nda_op = - cpu_to_be32((next_ind << srq->wqe_shift) | 1); ((struct 
mthca_next_seg *) wqe)->ee_nds = 0; /* flags field will always remain 0 */ -- 1.5.3.8 From eli at mellanox.co.il Mon Jan 21 23:49:26 2008 From: eli at mellanox.co.il (Eli Cohen) Date: Tue, 22 Jan 2008 09:49:26 +0200 Subject: [ofa-general] [PATCH] ib/libmthca: Pre link receive WQEs Message-ID: <1200988166.6925.172.camel@mtls03> [PATCH] Pre link receive WQEs Pre linking of receive WQEs is required in Tavor mode. This is required for both SRQ and regular QPs. Remove an always true condition. For memfree linking to the nxt wqe is moved to mthca_free_srq_wqe() as in Tavor mode. Signed-off-by: Eli Cohen Reviewed-by: Jack Morgenstein --- src/qp.c | 14 ++++++++------ src/srq.c | 28 +++++++++++++++------------- 2 files changed, 23 insertions(+), 19 deletions(-) diff --git a/src/qp.c b/src/qp.c index 841e316..f3aa6c7 100644 --- a/src/qp.c +++ b/src/qp.c @@ -360,7 +360,6 @@ int mthca_tavor_post_recv(struct ibv_qp *ibqp, struct ibv_recv_wr *wr, prev_wqe = qp->rq.last; qp->rq.last = wqe; - ((struct mthca_next_seg *) wqe)->nda_op = 0; ((struct mthca_next_seg *) wqe)->ee_nds = htonl(MTHCA_NEXT_DBD); ((struct mthca_next_seg *) wqe)->flags = @@ -388,9 +387,6 @@ int mthca_tavor_post_recv(struct ibv_qp *ibqp, struct ibv_recv_wr *wr, qp->wrid[ind + qp->sq.max] = wr->wr_id; - ((struct mthca_next_seg *) prev_wqe)->nda_op = - htonl((ind << qp->rq.wqe_shift) | 1); - wmb(); ((struct mthca_next_seg *) prev_wqe)->ee_nds = htonl(MTHCA_NEXT_DBD | size); @@ -786,6 +782,8 @@ int mthca_alloc_qp_buf(struct ibv_pd *pd, struct ibv_qp_cap *cap, { int size; int max_sq_sge; + struct mthca_next_seg *next; + int i; qp->rq.max_gs = cap->max_recv_sge; qp->sq.max_gs = cap->max_send_sge; @@ -860,9 +858,7 @@ int mthca_alloc_qp_buf(struct ibv_pd *pd, struct ibv_qp_cap *cap, memset(qp->buf.buf, 0, qp->buf_size); if (mthca_is_memfree(pd->context)) { - struct mthca_next_seg *next; struct mthca_data_seg *scatter; - int i; uint32_t sz; sz = htonl((sizeof (struct mthca_next_seg) + @@ -886,6 +882,12 @@ int 
mthca_alloc_qp_buf(struct ibv_pd *pd, struct ibv_qp_cap *cap, qp->sq.wqe_shift) + qp->send_wqe_offset); } + } else { + for (i = 0; i < qp->rq.max; ++i) { + next = get_recv_wqe(qp, i); + next->nda_op = htonl((((i + 1) & (qp->rq.max - 1)) << + qp->rq.wqe_shift) | 1); + } } qp->sq.last = get_send_wqe(qp, qp->sq.max - 1); diff --git a/src/srq.c b/src/srq.c index f9fc006..1d326b8 100644 --- a/src/srq.c +++ b/src/srq.c @@ -64,13 +64,13 @@ static inline int *wqe_to_link(void *wqe) void mthca_free_srq_wqe(struct mthca_srq *srq, int ind) { - pthread_spin_lock(&srq->lock); + struct mthca_next_seg *last_free; - if (srq->first_free >= 0) - *wqe_to_link(get_wqe(srq, srq->last_free)) = ind; - else - srq->first_free = ind; + pthread_spin_lock(&srq->lock); + last_free = get_wqe(srq, srq->last_free); + *wqe_to_link(last_free) = ind; + last_free->nda_op = htonl((ind << srq->wqe_shift) | 1); *wqe_to_link(get_wqe(srq, ind)) = -1; srq->last_free = ind; @@ -117,7 +117,6 @@ int mthca_tavor_post_srq_recv(struct ibv_srq *ibsrq, prev_wqe = srq->last; srq->last = wqe; - ((struct mthca_next_seg *) wqe)->nda_op = 0; ((struct mthca_next_seg *) wqe)->ee_nds = 0; /* flags field will always remain 0 */ @@ -146,9 +145,6 @@ int mthca_tavor_post_srq_recv(struct ibv_srq *ibsrq, ((struct mthca_data_seg *) wqe)->addr = 0; } - ((struct mthca_next_seg *) prev_wqe)->nda_op = - htonl((ind << srq->wqe_shift) | 1); - wmb(); ((struct mthca_next_seg *) prev_wqe)->ee_nds = htonl(MTHCA_NEXT_DBD); @@ -222,8 +218,6 @@ int mthca_arbel_post_srq_recv(struct ibv_srq *ibsrq, break; } - ((struct mthca_next_seg *) wqe)->nda_op = - htonl((next_ind << srq->wqe_shift) | 1); ((struct mthca_next_seg *) wqe)->ee_nds = 0; /* flags field will always remain 0 */ @@ -306,9 +300,17 @@ int mthca_alloc_srq_buf(struct ibv_pd *pd, struct ibv_srq_attr *attr, */ for (i = 0; i < srq->max; ++i) { - wqe = get_wqe(srq, i); + struct mthca_next_seg *next; - *wqe_to_link(wqe) = i < srq->max - 1 ? 
i + 1 : -1; + next = wqe = get_wqe(srq, i); + + if (i < srq->max - 1) { + *wqe_to_link(wqe) = i + 1; + next->nda_op = htonl(((i + 1) << srq->wqe_shift) | 1); + } else { + *wqe_to_link(wqe) = -1; + next->nda_op = 0; + } for (scatter = wqe + sizeof (struct mthca_next_seg); (void *) scatter < wqe + (1 << srq->wqe_shift); -- 1.5.3.8 From eli at dev.mellanox.co.il Mon Jan 21 23:49:22 2008 From: eli at dev.mellanox.co.il (Eli Cohen) Date: Tue, 22 Jan 2008 09:49:22 +0200 Subject: [ofa-general] Bogus Receive Completions In-Reply-To: <478B8172.2010104@mellanox.co.il> References: <475607AA.301@dls.net> <478B8172.2010104@mellanox.co.il> Message-ID: <1200988162.6925.170.camel@mtls03> On Mon, 2008-01-14 at 17:36 +0200, Tziporet Koren wrote: > Roland Dreier wrote: > > > Mellanox: can you take this test case and see if it is indeed a > > > firmware issue? I could believe that there is a bug in libmthca's > > > mthca_tavor_post_recv() function too... > > > > Hi Tziporet -- any update about this issue (bad WQE address in CQE on > > non-mem-free HCAs)? > > > > > > We succeeded to reproduce this problem here and its under debug > > Tziporet I am sending two patches, one for userspace and one for kernel space which solves this issue. From erezz at voltaire.com Tue Jan 22 02:06:25 2008 From: erezz at voltaire.com (Erez Zilber) Date: Tue, 22 Jan 2008 12:06:25 +0200 Subject: [ofa-general] [PATCH] IB/iSER: add logical unit reset support Message-ID: <4795C021.6030707@voltaire.com> eh_device_reset_handler was already added to scsi_host_template in iscsi_tcp, and is now added also for iscsi_iser. 
Signed-off-by: Erez Zilber Signed-off-by: Mike Christie --- drivers/infiniband/ulp/iser/iscsi_iser.c | 1 + 1 files changed, 1 insertions(+), 0 deletions(-) diff --git a/drivers/infiniband/ulp/iser/iscsi_iser.c b/drivers/infiniband/ulp/iser/iscsi_iser.c index fd69fb3..4cd0705 100644 --- a/drivers/infiniband/ulp/iser/iscsi_iser.c +++ b/drivers/infiniband/ulp/iser/iscsi_iser.c @@ -552,6 +552,7 @@ static struct scsi_host_template iscsi_iser_sht = { .max_sectors = 1024, .cmd_per_lun = ISCSI_MAX_CMD_PER_LUN, .eh_abort_handler = iscsi_eh_abort, + .eh_device_reset_handler= iscsi_eh_device_reset, .eh_host_reset_handler = iscsi_eh_host_reset, .use_clustering = DISABLE_CLUSTERING, .proc_name = "iscsi_iser", -- 1.5.3.7 From erezz at voltaire.com Tue Jan 22 02:13:40 2008 From: erezz at voltaire.com (Erez Zilber) Date: Tue, 22 Jan 2008 12:13:40 +0200 Subject: [ofa-general] [PATCH] IB/iSER: add logical unit reset support In-Reply-To: <4795C021.6030707@voltaire.com> References: <4795C021.6030707@voltaire.com> Message-ID: <4795C1D4.9050301@voltaire.com> Erez Zilber wrote: > eh_device_reset_handler was already added to scsi_host_template > in iscsi_tcp, and is now added also for iscsi_iser. 
>
> Signed-off-by: Erez Zilber
> Signed-off-by: Mike Christie
> ---
>  drivers/infiniband/ulp/iser/iscsi_iser.c | 1 +
>  1 files changed, 1 insertions(+), 0 deletions(-)
>
> diff --git a/drivers/infiniband/ulp/iser/iscsi_iser.c b/drivers/infiniband/ulp/iser/iscsi_iser.c
> index fd69fb3..4cd0705 100644
> --- a/drivers/infiniband/ulp/iser/iscsi_iser.c
> +++ b/drivers/infiniband/ulp/iser/iscsi_iser.c
> @@ -552,6 +552,7 @@ static struct scsi_host_template iscsi_iser_sht = {
>  	.max_sectors = 1024,
>  	.cmd_per_lun = ISCSI_MAX_CMD_PER_LUN,
>  	.eh_abort_handler = iscsi_eh_abort,
> +	.eh_device_reset_handler= iscsi_eh_device_reset,
>  	.eh_host_reset_handler = iscsi_eh_host_reset,
>  	.use_clustering = DISABLE_CLUSTERING,
>  	.proc_name = "iscsi_iser",
--

Roland,

This patch was sent directly to linux-scsi because it depends on patches
that exist only in scsi-misc tree. You won't be able to build your tree
if you try to apply this patch.

Erez

URL: From vlad at lists.openfabrics.org Tue Jan 22 03:11:25 2008 From: vlad at lists.openfabrics.org (Vladimir Sokolovsky Mellanox) Date: Tue, 22 Jan 2008 03:11:25 -0800 (PST) Subject: [ofa-general] ofa_1_3_kernel 20080122-0200 daily build status Message-ID: <20080122111125.4A13CE60086@openfabrics.org> This email was generated automatically, please do not reply git_url: git://git.openfabrics.org/ofed_1_3/linux-2.6.git git_branch: ofed_kernel Common build parameters: --with-ipoib-mod --with-sdp-mod --with-srp-mod --with-user_mad-mod --with-user_access-mod --with-mthca-mod --with-mlx4-mod --with-core-mod --with-addr_trans-mod --with-rds-mod --with-cxgb3-mod --with-nes-mod Passed: Passed on i686 with 2.6.15-23-server Passed on i686 with linux-2.6.21.1 Passed on i686 with linux-2.6.22 Passed on i686 with linux-2.6.18 Passed on i686 with linux-2.6.19 Passed on i686 with linux-2.6.17 Passed on i686 with linux-2.6.14 Passed on i686 with linux-2.6.16 Passed on i686 with linux-2.6.13 Passed on i686 with linux-2.6.15 Passed on i686 with linux-2.6.12 Passed on ia64 with linux-2.6.14 Passed on x86_64 with linux-2.6.16 Passed on ppc64 with linux-2.6.14 Passed on ia64 with linux-2.6.18 Passed on ia64 with linux-2.6.13 Passed on ppc64 with linux-2.6.16 Passed on ppc64 with linux-2.6.15 Passed on ia64 with linux-2.6.12 Passed on ia64 with linux-2.6.19 Passed on ia64 with linux-2.6.17 Passed on powerpc with linux-2.6.14 Passed on powerpc with linux-2.6.12 Passed on ppc64 with linux-2.6.12 Passed on ppc64 with linux-2.6.18 Passed on powerpc with linux-2.6.13 Passed on ia64 with linux-2.6.15 Passed on x86_64 with linux-2.6.9-42.ELsmp Passed on x86_64 with linux-2.6.18 Passed on powerpc with linux-2.6.15 Passed on x86_64 with linux-2.6.19 Passed on x86_64 with linux-2.6.12 Passed on x86_64 with linux-2.6.15 Passed on ppc64 with linux-2.6.13 Passed on ia64 with linux-2.6.16 Passed on x86_64 with linux-2.6.14 Passed on ppc64 with linux-2.6.17 Passed on x86_64 with 
linux-2.6.16.43-0.3-smp Passed on x86_64 with linux-2.6.20 Passed on x86_64 with linux-2.6.13 Passed on x86_64 with linux-2.6.17 Passed on x86_64 with linux-2.6.9-55.ELsmp Passed on ppc64 with linux-2.6.19 Passed on x86_64 with linux-2.6.16.21-0.8-smp Passed on x86_64 with linux-2.6.21.1 Passed on ia64 with linux-2.6.22 Passed on x86_64 with linux-2.6.22.5-31-default Passed on ppc64 with linux-2.6.18-8.el5 Passed on ia64 with linux-2.6.16.21-0.8-default Passed on x86_64 with linux-2.6.18-8.el5 Passed on x86_64 with linux-2.6.18-1.2798.fc6 Passed on ia64 with linux-2.6.21.1 Passed on ia64 with linux-2.6.23 Passed on x86_64 with linux-2.6.22 Passed on x86_64 with linux-2.6.18-53.el5 Failed: From ogerlitz at voltaire.com Tue Jan 22 03:19:37 2008 From: ogerlitz at voltaire.com (Or Gerlitz) Date: Tue, 22 Jan 2008 13:19:37 +0200 Subject: [ofa-general] rdma_create_qp fails with -12 In-Reply-To: <1200936237.23538.15.camel@obelisk.thedillows.org> References: <1200936237.23538.15.camel@obelisk.thedillows.org> Message-ID: <4795D149.3010405@voltaire.com> David Dillow wrote: > On Mon, 2008-01-21 at 10:28 -0500, Shipman, Galen M. wrote: >> We are seeing failures setting up a QP using rdma_create_qp. >> This only occurs when: >> init_qp_attr.cap.max_send_wr >> init_qp_attr.cap.max_recv_wr >> Totals to more than 16K. > -12 is -ENOMEM Hi Galen, indeed, this was discussed on the list before with the reason for the -ENOMEM failure being the kmalloc (and its such, as mentioed in https://bugs.openfabrics.org/show_bug.cgi?id=331) mentioned below. Basically, I see it more of a missing questionable feature then a bug, but you may want to get a response from Roland on that. What would be an applicative design/need to allow sender or bunch of senders (in case of SRQ) to have few K of credits (=inflight messages), can you give an example of concrete middleware/app that would use that? Or. > You may want to add some printk's to mthca_alloc_wqe_buf() in > mthca_qp.c. 
I think you're failing in the line > qp->wrid = kmalloc((qp->rq.max + qp->sq.max) * sizeof (u64), > GFP_KERNEL); > > rq.max gets set the max_recv_wr, sq.max gets max_send_wr, so you're > trying to allocate 128KB when those total 16K -- that's trying to > allocate 32 contiguous pages, which I believe is the max RHEL4's kernel > will let you do via kmalloc(). New kernels may have alleviated this > somewhat -- my Fedora 8 box has a 1MB slab/slub cache, but good luck > actually allocating that if the box has been up any length of time. > > I'm not sure why it would work under userspace, but I've not looked very > hard either. Perhaps the IOMMU is coming into play there?
From ogerlitz at voltaire.com Tue Jan 22 03:21:40 2008 From: ogerlitz at voltaire.com (Or Gerlitz) Date: Tue, 22 Jan 2008 13:21:40 +0200 Subject: [ofa-general] rdma_create_qp fails with -12 In-Reply-To: <1200936237.23538.15.camel@obelisk.thedillows.org> References: <1200936237.23538.15.camel@obelisk.thedillows.org> Message-ID: <4795D1C4.5060704@voltaire.com> David Dillow wrote: > I'm not sure why it would work under userspace, but I've not looked very > hard either. b/c in user space the buffers for the WQEs are allocated by the verbs library > Perhaps the IOMMU is coming into play there? no, the IOMMU is not related to this limitation Or.
From kliteyn at mellanox.co.il Mon Jan 21 17:09:54 2008 From: kliteyn at mellanox.co.il (kliteyn at mellanox.co.il) Date: Mon, 21 Jan 2008 17:09:54 -0800 (PST) Subject: [ofa-general] ***SPAM*** nightly osm_sim report 2008-01-22:normal completion Message-ID: <20080122010956.C50A9E60194@openfabrics.org> From: kliteyn at mellanox.co.il Return-Path: kliteyn at mellanox.co.il Message-ID: X-OriginalArrivalTime: 22 Jan 2008 01:09:51.0117 (UTC) FILETIME=[7D3D33D0:01C85C93] Date: 22 Jan 2008 03:09:51 +0200 OSM Simulation Regression Summary [Generated mail - please do NOT reply] OpenSM binary date = 2008-01-21 OpenSM git rev = Fri_Jan_18_16:37:22_2008 [c9ccc473cb622f9b0e0a2d1a5492749d3c185484] ibutils git rev = Mon_Dec_24_10:42:01_2007 [675bec82306d6920555dd0b5e2f664983e27e60f]
Total=400 Pass=400 Fail=0 Pass: 30 Stability IS1-16.topo 30 Pkey IS1-16.topo 30 OsmTest IS1-16.topo 30 OsmStress IS1-16.topo 30 Multicast IS1-16.topo 30 LidMgr IS1-16.topo 10 Stability IS3-loop.topo 10 Stability IS3-128.topo 10 Pkey IS3-128.topo 10 OsmTest IS3-loop.topo 10 OsmTest IS3-128.topo 10 OsmStress IS3-128.topo 10 Multicast IS3-loop.topo 10 Multicast IS3-128.topo 10 LidMgr IS3-128.topo 10 FatTree merge-roots-4-ary-2-tree.topo 10 FatTree merge-root-4-ary-3-tree.topo 10 FatTree gnu-stallion-64.topo 10 FatTree blend-4-ary-2-tree.topo 10 FatTree RhinoDDR.topo 10 FatTree FullGnu.topo 10 FatTree 4-ary-2-tree.topo 10 FatTree 2-ary-4-tree.topo 10 FatTree 12-node-spaced.topo 10 FTreeFail 4-ary-2-tree-missing-sw-link.topo 10 FTreeFail 4-ary-2-tree-links-at-same-rank-2.topo 10 FTreeFail 4-ary-2-tree-links-at-same-rank-1.topo 10 FTreeFail 4-ary-2-tree-diff-num-pgroups.topo Failures:
From ogerlitz at voltaire.com Tue Jan 22 07:08:44 2008 From: ogerlitz at voltaire.com (Or Gerlitz) Date: Tue, 22 Jan 2008 17:08:44 +0200 (IST) Subject: [ofa-general] Re: [PATCH] ib/ipoib: handle Gratuitous ARP & bonding failover race also for connected mode neighbours In-Reply-To: References: Message-ID: On Thu, 17 Jan 2008, Or Gerlitz wrote: > I have tested this patch on 2.6.24-rc1 (and it's now in progress for 2.6.24-rc8) > things are basically working fine, but I do want to play more with bonding fail-overs > to make sure nothing was broken wrt Gratuitous ARP etc, will let you know. I have done some more testing, but not enough to say whether fail-over under connected mode is always slow without this patch. I am away for the rest of this week and will continue working on it next week. Or.
From bart.vanassche at gmail.com Tue Jan 22 07:59:57 2008 From: bart.vanassche at gmail.com (Bart Van Assche) Date: Tue, 22 Jan 2008 16:59:57 +0100 Subject: [ofa-general] Building libsdp.so Message-ID: Hello, I'm probably overlooking something trivial, but can anyone tell me how to build libsdp.so (Sockets Direct Protocol) from the OFED source distribution (http://www.openfabrics.org/builds/ofed-1.2.5/release/OFED-1.2.5.4.tgz) ? Thanks, Bart.
From tziporet at mellanox.co.il Tue Jan 22 08:08:06 2008 From: tziporet at mellanox.co.il (Tziporet Koren) Date: Tue, 22 Jan 2008 18:08:06 +0200 Subject: [ofa-general] OFED Jan 21 meeting summary on RC2 status In-Reply-To: <6C2C79E72C305246B504CBA17B5500C9031E5E56@mtlexch01.mtl.com> Message-ID: <6C2C79E72C305246B504CBA17B5500C9032876AD@mtlexch01.mtl.com>
OFED Jan-21 meeting summary on OFED 1.3-rc2 status

Meeting summary:
------------------
* RC2 is in good status, beside some bugs that should be fixed for RC3
* RC3 is planned for next week

Meeting details:
----------------
1. Review RC2 status
   * Qlogic - status is good with their vnic and general tests; see issues with qperf on RDS
   * Intel - RC2 is OK, vmapich is fixed for ia64
   * Mellanox - regression is good and stable; Cleanup SDP bugs.
   * IBM - Status is good; PPC issues resolved
   * Neteffect - Status is OK
   * Chelsio - Testing progress well
   * Cisco - no update
   * Voltaire - have issues with bonding and IPoIB performance
   * MPI - all MPI packages are in good shape
2. Update on tasks that should be completed for RC2:
   * XRC - enhanced API - should be submitted today
   * IPoIB - need to resolve the new issue reported by Voltaire

Tziporet
From jimmott at austin.rr.com Tue Jan 22 08:14:31 2008 From: jimmott at austin.rr.com (Jim Mott) Date: Tue, 22 Jan 2008 10:14:31 -0600 Subject: [ofa-general] Building libsdp.so In-Reply-To: References: Message-ID: <000001c85d11$df3d41a0$9db7c4e0$@rr.com> It is built and installed as part of the full OFED build process.
Just build OFED (./install.sh --with-all-libs) ought to do it. -----Original Message----- From: general-bounces at lists.openfabrics.org [mailto:general-bounces at lists.openfabrics.org] On Behalf Of Bart Van Assche Sent: Tuesday, January 22, 2008 10:00 AM To: Openib-General Subject: [ofa-general] Building libsdp.so Hello, I'm probably overlooking something trivial, but can anyone tell me how to build libsdp.so (Sockets Direct Protocol) from the OFED source distribution (http://www.openfabrics.org/builds/ofed-1.2.5/release/OFED-1.2.5.4.tgz) ? Thanks, Bart. _______________________________________________ general mailing list general at lists.openfabrics.org http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
From bart.vanassche at gmail.com Tue Jan 22 08:18:24 2008 From: bart.vanassche at gmail.com (Bart Van Assche) Date: Tue, 22 Jan 2008 17:18:24 +0100 Subject: [ofa-general] Building libsdp.so In-Reply-To: <000001c85d11$df3d41a0$9db7c4e0$@rr.com> References: <000001c85d11$df3d41a0$9db7c4e0$@rr.com> Message-ID: On Jan 22, 2008 5:14 PM, Jim Mott wrote: > It is built and installed as part of the full OFED build process. Just build OFED > (./install.sh --with-all-libs) ought to do it. Sorry, but the install.sh script doesn't seem to accept this syntax: $ ./install.sh --with-all-libs install.sh [ -c ] [ -net ] Bart.
From weiny2 at llnl.gov Tue Jan 22 09:03:32 2008 From: weiny2 at llnl.gov (Ira Weiny) Date: Tue, 22 Jan 2008 09:03:32 -0800 Subject: [ofa-general] Re: [PATCH] infiniband-diags/scripts/ibprintswitch.pl: fix regex when searching for switch by name In-Reply-To: <20080120150421.GM10650@sashak.voltaire.com> References: <20080118163710.4fe2a64d.weiny2@llnl.gov> <20080119185547.GJ10979@sashak.voltaire.com> <20080120150421.GM10650@sashak.voltaire.com> Message-ID: <20080122090332.7761291b.weiny2@llnl.gov> On Sun, 20 Jan 2008 15:04:21 +0000 Sasha Khapyorsky wrote: > On 18:55 Sat 19 Jan , Sasha Khapyorsky wrote: > > > diff --git a/infiniband-diags/scripts/ibprintswitch.pl b/infiniband-diags/scripts/ibprintswitch.pl > > > index d28a839..23a39b5 100755 > > > --- a/infiniband-diags/scripts/ibprintswitch.pl > > > +++ b/infiniband-diags/scripts/ibprintswitch.pl > > > @@ -104,7 +104,7 @@ sub main > > > print $ports{$port}; > > > } > > > } > > > - if ("0x$guid" eq $target_switch || $desc =~ /.*$target_switch\s+.*/) > > > + if ("0x$guid" eq $target_switch || $desc =~ /.*$target_switch.*/) > > > > When the original
regex will not work? > > Actually I think I see why this change is. You want to match node > description as regex. Looks fine for me. Yes exactly. Ira
From vlad at dev.mellanox.co.il Tue Jan 22 09:38:05 2008 From: vlad at dev.mellanox.co.il (Vladimir Sokolovsky) Date: Tue, 22 Jan 2008 19:38:05 +0200 Subject: [ofa-general] Building libsdp.so In-Reply-To: References: Message-ID: <479629FD.2060608@dev.mellanox.co.il> Bart Van Assche wrote: > Hello, > > I'm probably overlooking something trivial, but can anyone tell me how > to build libsdp.so (Sockets Direct Protocol) from the OFED source > distribution (http://www.openfabrics.org/builds/ofed-1.2.5/release/OFED-1.2.5.4.tgz) > ? > > Thanks, > > Bart. Hi Bart, To install OFED-1.2.5.4 with libsdp, do the following steps: Open OFED-1.2.5.4.tgz cd OFED-1.2.5.4 cp docs/ofed.conf-example my_ofed.conf Check that all required packages are set to "y" (and libsdp=y) in the my_ofed.conf Then run ./install.sh -c my_ofed.conf Regards, Vladimir
From rdreier at cisco.com Tue Jan 22 13:56:00 2008 From: rdreier at cisco.com (Roland Dreier) Date: Tue, 22 Jan 2008 13:56:00 -0800 Subject: [ofa-general] Re: InfiniBand/RDMA merge plans for 2.6.25 In-Reply-To: <20080121100151.GD5333@infradead.org> (Christoph Hellwig's message of "Mon, 21 Jan 2008 10:01:51 +0000") References: <20080121100151.GD5333@infradead.org> Message-ID: > > - Neteffect "nes" driver. It's not terribly clean code but since > > it's a new driver that is completely self-contained, I plan on > > merging it and letting cleanups happen upstream. > > New code should be better quality than old code, not worse. I haven't > actually seen the driver yet, but by that statement I'd be clearly > against a merge. The driver has been posted a few times; the latest code is in the "neteffect" branch of my tree: git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband.git neteffect It's not *that* bad -- certainly there are lots of things that could be improved (sparse endianness annotation, too many lines that are way to long, strange indentation of case labeles, etc, etc) but it is a self-contained hardware driver. I agree with Linus's position (stated at the last kernel summit) that we ought to merge hardware drivers early, so that users get the drivers with as little hassle as possible.
We lose a little leverage in getting cleanups done, but the number of people who see the code and are able to clean it up increases, so I think it's a good trade-off. - R.
From pradeeps at linux.vnet.ibm.com Tue Jan 22 14:41:03 2008 From: pradeeps at linux.vnet.ibm.com (Pradeep Satyanarayana) Date: Tue, 22 Jan 2008 14:41:03 -0800 Subject: [ofa-general] ib_mthca "Missing DCS" Message-ID: <479670FF.9070805@linux.vnet.ibm.com> When we moved a Meallanox HCA from one bus to another on a P5 system we get the "Missing DCS" error and ib_mthca fails during init. Why do we get this error and what does it mean? Pradeep
From rdreier at cisco.com Tue Jan 22 14:45:20 2008 From: rdreier at cisco.com (Roland Dreier) Date: Tue, 22 Jan 2008 14:45:20 -0800 Subject: [ofa-general] ib_mthca "Missing DCS" In-Reply-To: <479670FF.9070805@linux.vnet.ibm.com> (Pradeep Satyanarayana's message of "Tue, 22 Jan 2008 14:41:03 -0800") References: <479670FF.9070805@linux.vnet.ibm.com> Message-ID: > When we moved a Meallanox HCA from one bus to another on a P5 system > we get the "Missing DCS" error and ib_mthca fails during init. Why > do we get this error and what does it mean? It comes from this code: /* * Check for BARs. We expect 0: 1MB, 2: 8MB, 4: DDR (may not * be present) */ if (!(pci_resource_flags(pdev, 0) & IORESOURCE_MEM) || pci_resource_len(pdev, 0) != 1 << 20) { dev_err(&pdev->dev, "Missing DCS, aborting.\n"); err = -ENODEV; goto err_disable_pdev; } In other words the driver doesn't see the first mmio BAR of the PCI device. What does lspci show for the HCA in this case?
(Also are you moving the device while the system is running, or are you moving the HCA when the system is down? Hotplug "should work" but I don't know if anyone else has tried it) - R.
From rdreier at cisco.com Tue Jan 22 15:00:13 2008 From: rdreier at cisco.com (Roland Dreier) Date: Tue, 22 Jan 2008 15:00:13 -0800 Subject: [ofa-general] Re: [PATCH RESEND 3/3] RDMA/cxgb3: Mark qp as privileged based on user capabilities. In-Reply-To: <20080121204213.3820.12396.stgit@dell3.ogc.int> (Steve Wise's message of "Mon, 21 Jan 2008 14:42:13 -0600") References: <20080121204130.3820.11053.stgit@dell3.ogc.int> <20080121204213.3820.12396.stgit@dell3.ogc.int> Message-ID: thanks, applied 1-3
From rdreier at cisco.com Tue Jan 22 15:03:31 2008 From: rdreier at cisco.com (Roland Dreier) Date: Tue, 22 Jan 2008 15:03:31 -0800 Subject: [ofa-general] rdma_create_qp fails with -12 In-Reply-To: <4795D149.3010405@voltaire.com> (Or Gerlitz's message of "Tue, 22 Jan 2008 13:19:37 +0200") References: <1200936237.23538.15.camel@obelisk.thedillows.org> <4795D149.3010405@voltaire.com> Message-ID: > Hi Galen, indeed, this was discussed on the list before with the > reason for the -ENOMEM failure being the kmalloc (and such, as > mentioned in https://bugs.openfabrics.org/show_bug.cgi?id=331) > mentioned below. > > Basically, I see it more of a missing questionable feature than a bug, > but you may want to get a response from Roland on that. Yes, I agree with Or's assessment. The work request ID auxiliary array is allocated using contiguous memory in the kernel (it's not a problem in userspace because userspace virtual memory need not be physically contiguous). Removing this limitation would make the code more complex, and in general supporting huge queue depths hasn't seemed that important. - R.
From rdreier at cisco.com Tue Jan 22 15:04:39 2008 From: rdreier at cisco.com (Roland Dreier) Date: Tue, 22 Jan 2008 15:04:39 -0800 Subject: [ewg] RE: [ofa-general] OFED Jan 14 meeting summary on RC2readiness In-Reply-To: <200801190927.08992.jackm@dev.mellanox.co.il> (Jack Morgenstein's message of "Sat, 19 Jan 2008 09:27:08 +0200") References: <478D1A49.1080807@mellanox.co.il> <20080117153043.GA10065@minantech.com> <200801190927.08992.jackm@dev.mellanox.co.il> Message-ID: > > I guess you mean just implement XRC without allowing multiple > > processes to share an XRC domain?  That actually seems like a sensible > > thing to implement as well... > > This is part of the current XRC implementation -- just give -1 as the fd value > in ibv_open_xrc_domain(). I *think* Gleb's point was that the XRC implementation could be much simpler if this were the *only* case supported -- you wouldn't need all the complexity of kernel receive QPs etc I guess. Gleb, is that what you meant? From YJia at tmriusa.com Tue Jan 22 15:15:19 2008 From: YJia at tmriusa.com (Yicheng Jia) Date: Tue, 22 Jan 2008 17:15:19 -0600 Subject: [ofa-general] fast path operation? In-Reply-To: Message-ID: Hi Roland, I got another question about your code for mthca HCA. In the "mthca_provider.c", the functions "mthca_create_cq(struct ib_device *ibdev, int entries, struct ib_ucontext *context, struct ib_udata *udata)" and "mthca_alloc_pd(struct ib_device *ibdev, struct ib_ucontext *context, struct ib_udata *udata)" all have a parameter "ib_ucontext*", while in the "Verbs.c", the calling functions "ib_create_cq" and "ib_alloc_pd" all pass "NULL" to this parameter, is the " ib_ucontext *" only needed for fast path operation, in which situation the user application can write directly to user space? If it's true, how to set up the fast path operation in the driver environment? Thanks! 
Yicheng Roland Dreier 01/04/2008 02:45 PM To Yicheng Jia cc general at lists.openfabrics.org, Jack Morgenstein Subject Re: [ofa-general] synchronize commands issued to MTHCA > I'm using Duo-core Xeon and I just grep the source of "mmiowb()" in kernel > 2.6.23 include/asm-x86_64 /io.h and found that this function does nothing > on x86_64 platform, is it true? Yes -- this is why I kept referring to large SGI Altix systems. - R.
From rdreier at cisco.com Tue Jan 22 15:23:23 2008 From: rdreier at cisco.com (Roland Dreier) Date: Tue, 22 Jan 2008 15:23:23 -0800 Subject: [ofa-general] fast path operation? In-Reply-To: (Yicheng Jia's message of "Tue, 22 Jan 2008 17:15:19 -0600") References: Message-ID: > I got another question about your code for mthca HCA. In the > "mthca_provider.c", the functions "mthca_create_cq(struct ib_device > *ibdev, int entries, struct ib_ucontext *context, struct ib_udata *udata)" > and "mthca_alloc_pd(struct ib_device *ibdev, struct ib_ucontext *context, > struct ib_udata *udata)" all have a parameter "ib_ucontext*", while in the > "Verbs.c", the calling functions "ib_create_cq" and "ib_alloc_pd" all pass > "NULL" to this parameter, is the " ib_ucontext *" only needed for fast > path operation, in which situation the user application can write directly > to user space? If it's true, how to set up the fast path operation in the > driver environment? Yes, the struct ib_ucontext is used for objects that are created and accessed in userspace. The parameter will be non-NULL for commands from drivers/infiniband/core/uverbs_cmd.c. - R.
From gstreiff at NetEffect.com Tue Jan 22 15:49:12 2008 From: gstreiff at NetEffect.com (Glenn Streiff) Date: Tue, 22 Jan 2008 17:49:12 -0600 Subject: [ofa-general] Re: InfiniBand/RDMA merge plans for 2.6.25 In-Reply-To: Message-ID: <5E701717F2B2ED4EA60F87C8AA57B7CC0794FEA2@venom2> > From: general-bounces at lists.openfabrics.org > [mailto:general-bounces at lists.openfabrics.org]On Behalf Of > Roland Dreier > Sent: Tuesday, January 22, 2008 3:56 PM > To: Christoph Hellwig > Cc: linux-kernel at vger.kernel.org; general at lists.openfabrics.org > Subject: [ofa-general] Re: InfiniBand/RDMA merge plans for 2.6.25 > > > > > - Neteffect "nes" driver. It's not terribly clean code > but since > > > it's a new driver that is completely self-contained, I plan on > > > merging it and letting cleanups happen upstream. > > > > New code should be better quality than old code, not > worse. I haven't > > actually seen the driver yet, but by that statement I'd be clearly > > against a merge. > > The driver has been posted a few times; the latest code is in the > "neteffect" branch of my tree: > > > git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniban > d.git neteffect > > It's not *that* bad -- certainly there are lots of things that could > be improved (sparse endianness annotation, too many lines that are way > to long, strange indentation of case labeles, etc, etc) but it is a > self-contained hardware driver. I agree with Linus's position (stated > at the last kernel summit) that we ought to merge hardware drivers > early, so that users get the drivers with as little hassle as > possible. We lose a little leverage in getting cleanups done, but the > number of people who see the code and are able to clean it up > increases, so I think it's a good trade-off. > > - R. > My view is the code should and will be cleaned up based upon the feedback we've gotten from the community. It is a priority for me. Several cleanup fixes are in the queue and are being worked. 
Haven't slipped into complacency at the prospect of the merge. Glenn gstreiff at neteffect.com From arlin.r.davis at intel.com Tue Jan 22 16:01:04 2008 From: arlin.r.davis at intel.com (Davis, Arlin R) Date: Tue, 22 Jan 2008 16:01:04 -0800 Subject: [ofa-general] [PATCH] uDAPL v2 openib_cma: DTO extended cookie cleanup issue with inbound rdma write with immediate completions Message-ID: Fix for uDAPL v2 using extended operation. After extension completion, the DTO cookie must be checked for type before deallocating to handle inbound immediate data in receive. The sample dtestx client will fail when running loopback if the rdma immediate is received from remote before the rdma immediate request completion fires. dtestx client error: dat_ib_post_rdma_write_immed returned DAT_INSUFFICIENT_RESOURCES : DAT_RESOURCE_MEMORY Signed-off by: Arlin Davis diff --git a/dapl/common/dapl_evd_util.c b/dapl/common/dapl_evd_util.c index a3bcbd5..a993b02 100755 --- a/dapl/common/dapl_evd_util.c +++ b/dapl/common/dapl_evd_util.c @@ -1041,7 +1041,10 @@ dapli_evd_cqe_to_event ( DAPL_GET_CQE_OPTYPE(cqe_ptr) != OP_RECEIVE)) { dapls_cqe_to_event_extension (ep_ptr, cookie, cqe_ptr, event_ptr); - dapls_cookie_dealloc (&ep_ptr->req_buffer, cookie); + if (cookie->val.dto.type == DAPL_DTO_TYPE_RECV) + dapls_cookie_dealloc (&ep_ptr->recv_buffer, cookie); + else + dapls_cookie_dealloc (&ep_ptr->req_buffer, cookie); break; } #endif From kliteyn at mellanox.co.il Tue Jan 22 17:48:08 2008 From: kliteyn at mellanox.co.il (kliteyn at mellanox.co.il) Date: 23 Jan 2008 03:48:08 +0200 Subject: [ofa-general] nightly osm_sim report 2008-01-23:normal completion Message-ID: OSM Simulation Regression Summary [Generated mail - please do NOT reply] OpenSM binary date = 2008-01-22 OpenSM git rev = Fri_Jan_18_16:37:22_2008 [c9ccc473cb622f9b0e0a2d1a5492749d3c185484] ibutils git rev = Mon_Dec_24_10:42:01_2007 [675bec82306d6920555dd0b5e2f664983e27e60f] Total=400 Pass=397 Fail=3 Pass: 30 Stability IS1-16.topo 30 Pkey 
IS1-16.topo 30 OsmTest IS1-16.topo 30 OsmStress IS1-16.topo 30 Multicast IS1-16.topo 30 LidMgr IS1-16.topo 10 Stability IS3-loop.topo 10 Stability IS3-128.topo 10 Pkey IS3-128.topo 10 OsmTest IS3-loop.topo 10 OsmTest IS3-128.topo 10 OsmStress IS3-128.topo 10 Multicast IS3-loop.topo 10 Multicast IS3-128.topo 10 FatTree merge-roots-4-ary-2-tree.topo 10 FatTree merge-root-4-ary-3-tree.topo 10 FatTree gnu-stallion-64.topo 10 FatTree blend-4-ary-2-tree.topo 10 FatTree RhinoDDR.topo 10 FatTree FullGnu.topo 10 FatTree 4-ary-2-tree.topo 10 FatTree 2-ary-4-tree.topo 10 FatTree 12-node-spaced.topo 10 FTreeFail 4-ary-2-tree-missing-sw-link.topo 10 FTreeFail 4-ary-2-tree-links-at-same-rank-2.topo 10 FTreeFail 4-ary-2-tree-links-at-same-rank-1.topo 10 FTreeFail 4-ary-2-tree-diff-num-pgroups.topo 7 LidMgr IS3-128.topo Failures: 3 LidMgr IS3-128.topo
From mashirle at us.ibm.com Tue Jan 22 10:08:41 2008 From: mashirle at us.ibm.com (Shirley Ma) Date: Tue, 22 Jan 2008 10:08:41 -0800 Subject: [ofa-general] [RFC] IPoIB UD 4K MTU support Message-ID: <1201025321.756.33.camel@localhost.localdomain> Hello Roland, IPoIB UD currently supports up to 2K MTU.
Below is a draft patch to enable IPoIB UD 4K MTU support for any IB device that supports a 4K MTU, such as the IBM eHCA. The patch keeps each packet within one page by setting the IPoIB UD MTU to 4K-48 (40 bytes GRH, 4 bytes IPoIB header, 4 bytes padding to align the IP header), which avoids allocating two contiguous pages when the kernel page size is 4K. Enabling the IPoIB UD 4K MTU relies on the SM setting the default broadcast group MTU to 4K, and of course the switch must support a 4K MTU. When the SM sets the default broadcast group MTU to 2K, the IPoIB UD MTU falls back to 2K. I have tested 2K MTU; 4K MTU is still under testing. I am sending this patch out for review before my testing is finished because I want comments as early as possible, so that I can integrate them into the patch and hopefully we can make it into OFED-1.3-rc3, which is planned for around Jan. 30. Thanks Shirley
diff -urpN ipoib/ipoib.h /home/shirley/ipoib-test/drivers/infiniband/ulp/ipoib/ipoib.h --- ipoib/ipoib.h 2008-01-21 14:16:19.000000000 -0500 +++ /home/shirley/ipoib-test/drivers/infiniband/ulp/ipoib/ipoib.h 2008-01-22 15:50:13.000000000 -0500 @@ -56,9 +56,6 @@ /* constants */ enum { - IPOIB_PACKET_SIZE = 2048, - IPOIB_BUF_SIZE = IPOIB_PACKET_SIZE + IB_GRH_BYTES, - IPOIB_ENCAP_LEN = 4, IPOIB_CM_MTU = 0x10000 - 0x10, /* padding to align header to 16 */ @@ -320,6 +317,7 @@ struct ipoib_dev_priv { struct dentry *mcg_dentry; struct dentry *path_dentry; #endif + unsigned int max_ib_mtu; }; struct ipoib_ah { @@ -698,4 +696,11 @@ extern int ipoib_debug_level; #define IPOIB_QPN(ha) (be32_to_cpup((__be32 *) ha) & 0xffffff) +/* padding packet to fit one page size for 4K IB mtu */ +static inline int ipoib_ud_mtu(unsigned int ib_mtu) +{ + return (ib_mtu < 4096) ?
(ib_mtu - IPOIB_ENCAP_LEN) : + (ib_mtu - IB_GRH_BYTES - IPOIB_ENCAP_LEN - 4); +} + #endif /* _IPOIB_H */ diff -urpN ipoib/ipoib_ib.c /home/shirley/ipoib-test/drivers/infiniband/ulp/ipoib/ipoib_ib.c --- ipoib/ipoib_ib.c 2008-01-10 13:13:12.000000000 -0500 +++ /home/shirley/ipoib-test/drivers/infiniband/ulp/ipoib/ipoib_ib.c 2008-01-22 15:58:16.000000000 -0500 @@ -87,6 +87,15 @@ void ipoib_free_ah(struct kref *kref) spin_unlock_irqrestore(&priv->lock, flags); } +static int ipoib_ud_buf_size(unsigned int max_ib_mtu) +{ + if (max_ib_mtu < 4096) + return (max_ib_mtu + IB_GRH_BYTES); + else + /* padding packet to one page for 4K mtu */ + return (max_ib_mtu - 4); +} + static int ipoib_ib_post_receive(struct net_device *dev, int id) { struct ipoib_dev_priv *priv = netdev_priv(dev); @@ -96,7 +105,7 @@ static int ipoib_ib_post_receive(struct int ret; list.addr = priv->rx_ring[id].mapping; - list.length = IPOIB_BUF_SIZE; + list.length = ipoib_ud_buf_size(priv->max_ib_mtu); list.lkey = priv->mr->lkey; param.next = NULL; @@ -108,7 +117,7 @@ static int ipoib_ib_post_receive(struct if (unlikely(ret)) { ipoib_warn(priv, "receive failed for buf %d (%d)\n", id, ret); ib_dma_unmap_single(priv->ca, priv->rx_ring[id].mapping, - IPOIB_BUF_SIZE, DMA_FROM_DEVICE); + ipoib_ud_buf_size(priv->max_ib_mtu), DMA_FROM_DEVICE); dev_kfree_skb_any(priv->rx_ring[id].skb); priv->rx_ring[id].skb = NULL; } @@ -122,7 +131,7 @@ static int ipoib_alloc_rx_skb(struct net struct sk_buff *skb; u64 addr; - skb = dev_alloc_skb(IPOIB_BUF_SIZE + 4); + skb = dev_alloc_skb(ipoib_ud_buf_size(priv->max_ib_mtu) + 4); if (!skb) return -ENOMEM; @@ -133,7 +142,7 @@ static int ipoib_alloc_rx_skb(struct net */ skb_reserve(skb, 4); - addr = ib_dma_map_single(priv->ca, skb->data, IPOIB_BUF_SIZE, + addr = ib_dma_map_single(priv->ca, skb->data, ipoib_ud_buf_size(priv->max_ib_mtu), DMA_FROM_DEVICE); if (unlikely(ib_dma_mapping_error(priv->ca, addr))) { dev_kfree_skb_any(skb); @@ -190,7 +199,7 @@ static void 
ipoib_ib_handle_rx_wc(struct "(status=%d, wrid=%d vend_err %x)\n", wc->status, wr_id, wc->vendor_err); ib_dma_unmap_single(priv->ca, addr, - IPOIB_BUF_SIZE, DMA_FROM_DEVICE); + ipoib_ud_buf_size(priv->max_ib_mtu), DMA_FROM_DEVICE); dev_kfree_skb_any(skb); priv->rx_ring[wr_id].skb = NULL; return; @@ -215,7 +224,7 @@ static void ipoib_ib_handle_rx_wc(struct ipoib_dbg_data(priv, "received %d bytes, SLID 0x%04x\n", wc->byte_len, wc->slid); - ib_dma_unmap_single(priv->ca, addr, IPOIB_BUF_SIZE, DMA_FROM_DEVICE); + ib_dma_unmap_single(priv->ca, addr, ipoib_ud_buf_size(priv->max_ib_mtu), DMA_FROM_DEVICE); skb_put(skb, wc->byte_len); skb_pull(skb, IB_GRH_BYTES); @@ -632,7 +641,7 @@ int ipoib_ib_dev_stop(struct net_device continue; ib_dma_unmap_single(priv->ca, rx_req->mapping, - IPOIB_BUF_SIZE, + ipoib_ud_buf_size(priv->max_ib_mtu), DMA_FROM_DEVICE); dev_kfree_skb_any(rx_req->skb); rx_req->skb = NULL; diff -urpN ipoib/ipoib_main.c /home/shirley/ipoib-test/drivers/infiniband/ulp/ipoib/ipoib_main.c --- ipoib/ipoib_main.c 2008-01-21 14:43:39.000000000 -0500 +++ /home/shirley/ipoib-test/drivers/infiniband/ulp/ipoib/ipoib_main.c 2008-01-22 15:39:44.000000000 -0500 @@ -193,7 +193,7 @@ static int ipoib_change_mtu(struct net_d return 0; } - if (new_mtu > IPOIB_PACKET_SIZE - IPOIB_ENCAP_LEN) + if (new_mtu > ipoib_ud_mtu(priv->max_ib_mtu)) return -EINVAL; priv->admin_mtu = new_mtu; @@ -978,7 +978,7 @@ static void ipoib_setup(struct net_devic dev->features = NETIF_F_VLAN_CHALLENGED | NETIF_F_LLTX; /* MTU will be reset when mcast join happens */ - dev->mtu = IPOIB_PACKET_SIZE - IPOIB_ENCAP_LEN; + dev->mtu = ipoib_ud_mtu(priv->max_ib_mtu); priv->mcast_mtu = priv->admin_mtu = dev->mtu; memcpy(dev->broadcast, ipv4_bcast_addr, INFINIBAND_ALEN); @@ -1112,6 +1112,7 @@ static struct net_device *ipoib_add_port struct ib_device *hca, u8 port) { struct ipoib_dev_priv *priv; + struct ib_port_attr attr; int result = -ENOMEM; priv = ipoib_intf_alloc(format); @@ -1120,6 +1121,13 @@ static struct 
net_device *ipoib_add_port SET_NETDEV_DEV(priv->dev, hca->dma_device); + if (!ib_query_port(hca, port, &attr)) + priv->max_ib_mtu = ib_mtu_enum_to_int(attr.max_mtu); + else { + printk(KERN_WARNING "%s: ib_query_port %d failed\n", + hca->name, port); + goto device_init_failed; + } result = ib_query_pkey(hca, port, 0, &priv->pkey); if (result) { printk(KERN_WARNING "%s: ib_query_pkey port %d failed (ret = %d)\n", diff -urpN ipoib/ipoib_multicast.c /home/shirley/ipoib-test/drivers/infiniband/ulp/ipoib/ipoib_multicast.c --- ipoib/ipoib_multicast.c 2008-01-10 13:13:12.000000000 -0500 +++ /home/shirley/ipoib-test/drivers/infiniband/ulp/ipoib/ipoib_multicast.c 2008-01-22 15:42:10.000000000 -0500 @@ -567,9 +567,7 @@ void ipoib_mcast_join_task(struct work_s return; } - priv->mcast_mtu = ib_mtu_enum_to_int(priv->broadcast->mcmember.mtu) - - IPOIB_ENCAP_LEN; - + priv->mcast_mtu = ipoib_ud_mtu(ib_mtu_enum_to_int(priv->broadcast->mcmember.mtu)); if (!ipoib_cm_admin_enabled(dev)) dev->mtu = min(priv->mcast_mtu, priv->admin_mtu); From hch at infradead.org Tue Jan 22 21:50:14 2008 From: hch at infradead.org (Christoph Hellwig) Date: Wed, 23 Jan 2008 05:50:14 +0000 Subject: [ofa-general] Re: InfiniBand/RDMA merge plans for 2.6.25 In-Reply-To: References: <20080121100151.GD5333@infradead.org> Message-ID: <20080123055014.GA9256@infradead.org> On Tue, Jan 22, 2008 at 01:56:00PM -0800, Roland Dreier wrote: > be improved (sparse endianness annotation, that's a blocker for sure. No new code that's not sparse clean, please. From yangdong at ncic.ac.cn Wed Jan 23 00:27:34 2008 From: yangdong at ncic.ac.cn (yangdong) Date: Wed, 23 Jan 2008 16:27:34 +0800 Subject: [ofa-general] ib_poll_cq err: IB_WC_LOC_QP_OP_ERR Message-ID: <4796FA76.7020401@ncic.ac.cn> i do rdma-read in kernel, but when i use ib_poll_cq, there is a err: scq completion failed status IB_WC_LOC_QP_OP_ERR, anyone please tell me where is my omission? Thx. 
-- Dong Yang Institute of Computing Technology, Chinese Academy of Sciences Address: National Research Center for Intelligent Computing Systems (NCIC), P.O. Box 2704, Beijing 100080, P.R. China Phone: +86-10-62601005 From dotanb at dev.mellanox.co.il Wed Jan 23 01:03:07 2008 From: dotanb at dev.mellanox.co.il (Dotan Barak) Date: Wed, 23 Jan 2008 11:03:07 +0200 Subject: [ofa-general] ib_poll_cq err: IB_WC_LOC_QP_OP_ERR In-Reply-To: <4796FA76.7020401@ncic.ac.cn> References: <4796FA76.7020401@ncic.ac.cn> Message-ID: <479702CB.80409@dev.mellanox.co.il> yangdong wrote: > i do rdma-read in kernel, but when i use ib_poll_cq, there is a err: > scq completion failed status IB_WC_LOC_QP_OP_ERR, anyone please tell > me where is my omission? Thx. > Are you sure that this is an RC QP? Dotan From dotanb at dev.mellanox.co.il Wed Jan 23 01:13:17 2008 From: dotanb at dev.mellanox.co.il (Dotan Barak) Date: Wed, 23 Jan 2008 11:13:17 +0200 Subject: [ofa-general] why does ibv_device_attr.*_guid are in network order? Message-ID: <4797052D.4070502@dev.mellanox.co.il> Hi.
I noticed that the following attributes in ibv_device_attr are in network order: node_guid sys_image_guid What is the reason for this? thanks Dotan From bart.vanassche at gmail.com Wed Jan 23 01:52:19 2008 From: bart.vanassche at gmail.com (Bart Van Assche) Date: Wed, 23 Jan 2008 10:52:19 +0100 Subject: [ofa-general] Building libsdp.so In-Reply-To: <479629FD.2060608@dev.mellanox.co.il> References: <479629FD.2060608@dev.mellanox.co.il> Message-ID: On Jan 22, 2008 6:38 PM, Vladimir Sokolovsky wrote: > > Bart Van Assche wrote: > > I'm probably overlooking something trivial, but can anyone tell me how > > to build libsdp.so (Sockets Direct Protocol) from the OFED source > > distribution (http://www.openfabrics.org/builds/ofed-1.2.5/release/OFED-1.2.5.4.tgz) > > ? > > Check that all required packages are set to "y" (and libsdp=y) in the my_ofed.conf > Then run > ./install.sh -c my_ofed.conf All InfiniBand tests I have performed until now have been performed with the InfiniBand kernel modules from the mainstream Linux kernel. Since the ib_sdp kernel module is not included in this tree, I have copied the sdp source code to the 2.6.22.9 kernel source tree and have it built from there. This resulted in a build error: ib_create_cq() is called from sdp_cma.c with five arguments while include/rdma/ib_verbs.h specifies that it has six arguments. After I added a sixth argument (zero), the module compiled and works fine. Can the ib_sdp kernel module please be sent upstream ? Bart. From jackm at dev.mellanox.co.il Wed Jan 23 01:59:30 2008 From: jackm at dev.mellanox.co.il (Jack Morgenstein) Date: Wed, 23 Jan 2008 11:59:30 +0200 Subject: [ofa-general] [PATCH 0/ 8] XRC patch series (including xrc receive-only QPs) Message-ID: <200801231159.30989.jackm@dev.mellanox.co.il> This patch series is the updated XRC implementation (kernel and user (libibverbs and libmlx4)). Please give feedback -- I'm still reviewing the locking in this implementation. 
The kernel patches are all based on git: //git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband.git branch: for-2.6.25 commit: 5e8a3c6041ded7e306607bb6c96a0e68ca4dd2b4 ***** In addition, the kernel patch series requires that Eli Cohen's patch 7/16, posted January 16, be applied first ([ofa-general] [PATCH 7/16] ib/core: Add creation flags to QPs ) ***** The patches should be applied in the order posted. Changes: - Added creation of XRC receive-only QPs for userspace, which reside in kernel space (the user cannot post to or poll these QPs). Motivation: MPI community required XRC receive QPs which would not be destroyed when the creating process terminated. Solution: Userspace requests that a QP be created in kernel space. Each userspace process using that QP (i.e. receiving packets on an XRC SRQ via the qp), registers with that QP (-- the creator is also registered, whether or not it is a user of the QP). When the last userspace user unregisters with the QP, it is destroyed. Unregistration is also part of userspace cleanup, so there is no leakage. API for this: ibv_create_xrc_rcv_qp ibv_modify_xrc_rcv_qp ibv_query_xrc_rcv_qp ibv_reg_xrc_rcv_qp ibv_unreg_xrc_rcv_qp Creating process workflow: ibv_create_xrc_rcv_qp -- to create ibv_modify_xrc_rcv_qp -- to move QP to INIT ibv_modify_xrc_rcv_qp -- to move QP to RTR (to RTS is not needed for receive-only QPs) ibv_unreg_xrc_rcv_qp -- instead of destroy. Using process workflow: ibv_create_xrc_srq -- to create an SRQ ibv_reg_xrc_rcv_qp -- to register with the QP as a user ibv_destroy_srq ibv_unreg_xrc_rcv_qp -- to "unregister" with the QP. If no user processes remain registered, the QP is destroyed. NOTES: 1. Since there is no userspace object for the QP, the API uses the XRC domain object and qp number instead. 2. Registration needs to be performed only once per process (multiple registrations count as a single registration). 3. Async events for the receive QP are delivered to all registered processes.
The event ID is "OR'ed" with 0x80000000, to indicate that this is an XRC receive-only QP event. The element field union value "xrc_qp_num" is set to the QP number which generated the event. - Jack From jackm at dev.mellanox.co.il Wed Jan 23 01:59:45 2008 From: jackm at dev.mellanox.co.il (Jack Morgenstein) Date: Wed, 23 Jan 2008 11:59:45 +0200 Subject: [ofa-general] [PATCH 2/8] libmlx4: implement XRC qps (including xrc receive-only qps) Message-ID: <200801231159.46116.jackm@dev.mellanox.co.il> Implements the XRC-receive-only verbs for the userspace driver. Signed-off-by: Jack Morgenstein diff --git a/configure.in b/configure.in index 25f27f7..9304539 100644 --- a/configure.in +++ b/configure.in @@ -42,6 +42,9 @@ AC_CHECK_HEADER(valgrind/memcheck.h, dnl Checks for typedefs, structures, and compiler characteristics. AC_C_CONST AC_CHECK_SIZEOF(long) +AC_CHECK_MEMBER(struct ibv_context.xrc_ops, + [AC_DEFINE([HAVE_IBV_XRC_OPS], 1, [Define to 1 if xrc_ops is a member of ibv_context])],, + [#include ]) dnl Checks for library functions AC_CHECK_FUNC(ibv_read_sysfs_file, [], diff --git a/src/cq.c b/src/cq.c index cae8406..5d67b5f 100644 --- a/src/cq.c +++ b/src/cq.c @@ -196,10 +196,12 @@ static int mlx4_poll_one(struct mlx4_cq *cq, struct mlx4_cqe *cqe; struct mlx4_srq *srq; uint32_t qpn; + uint32_t srqn; uint32_t g_mlpath_rqpn; uint16_t wqe_index; int is_error; int is_send; + int is_src_recv = 0; cqe = next_cqe_sw(cq); if (!cqe) @@ -221,20 +223,30 @@ static int mlx4_poll_one(struct mlx4_cq *cq, is_error = (cqe->owner_sr_opcode & MLX4_CQE_OPCODE_MASK) == MLX4_CQE_OPCODE_ERROR; - if (!*cur_qp || - (ntohl(cqe->my_qpn) & 0xffffff) != (*cur_qp)->ibv_qp.qp_num) { - /* - * We do not have to take the QP table lock here, - * because CQs will be locked while QPs are removed - * from the table. 
- */ - *cur_qp = mlx4_find_qp(to_mctx(cq->ibv_cq.context), - ntohl(cqe->my_qpn) & 0xffffff); - if (!*cur_qp) + if (qpn & MLX4_XRC_QPN_BIT && !is_send) { + srqn = ntohl(cqe->g_mlpath_rqpn) & 0xffffff; + /* + * We do not have to take the XRC SRQ table lock here, + * because CQs will be locked while XRC SRQs are removed + * from the table. + */ + srq = mlx4_find_xrc_srq(to_mctx(cq->ibv_cq.context), srqn); + if (!srq) return CQ_POLL_ERR; - } - - wc->qp_num = (*cur_qp)->ibv_qp.qp_num; + is_src_recv = 1; + } else if (!*cur_qp || (qpn & 0xffffff) != (*cur_qp)->ibv_qp.qp_num) { + /* + * We do not have to take the QP table lock here, + * because CQs will be locked while QPs are removed + * from the table. + */ + *cur_qp = mlx4_find_qp(to_mctx(cq->ibv_cq.context), + qpn & 0xffffff); + if (!*cur_qp) + return CQ_POLL_ERR; + } + + wc->qp_num = qpn & 0xffffff; if (is_send) { wq = &(*cur_qp)->sq; @@ -242,6 +254,10 @@ static int mlx4_poll_one(struct mlx4_cq *cq, wq->tail += (uint16_t) (wqe_index - (uint16_t) wq->tail); wc->wr_id = wq->wrid[wq->tail & (wq->wqe_cnt - 1)]; ++wq->tail; + } else if (is_src_recv) { + wqe_index = htons(cqe->wqe_index); + wc->wr_id = srq->wrid[wqe_index]; + mlx4_free_srq_wqe(srq, wqe_index); } else if ((*cur_qp)->ibv_qp.srq) { srq = to_msrq((*cur_qp)->ibv_qp.srq); wqe_index = htons(cqe->wqe_index); @@ -387,6 +403,10 @@ void mlx4_cq_clean(struct mlx4_cq *cq, uint32_t qpn, struct mlx4_srq *srq) uint32_t prod_index; uint8_t owner_bit; int nfreed = 0; + int is_xrc_srq = 0; + + if (srq && srq->ibv_srq.xrc_cq) + is_xrc_srq = 1; pthread_spin_lock(&cq->lock); @@ -407,7 +427,12 @@ void mlx4_cq_clean(struct mlx4_cq *cq, uint32_t qpn, struct mlx4_srq *srq) */ while ((int) --prod_index - (int) cq->cons_index >= 0) { cqe = get_cqe(cq, prod_index & cq->ibv_cq.cqe); - if ((ntohl(cqe->my_qpn) & 0xffffff) == qpn) { + if (is_xrc_srq && + (ntohl(cqe->g_mlpath_rqpn & 0xffffff) == srq->srqn) && + !(cqe->owner_sr_opcode & MLX4_CQE_IS_SEND_MASK)) { + mlx4_free_srq_wqe(srq, 
ntohs(cqe->wqe_index)); + ++nfreed; + } else if ((ntohl(cqe->my_qpn) & 0xffffff) == qpn) { if (srq && !(cqe->owner_sr_opcode & MLX4_CQE_IS_SEND_MASK)) mlx4_free_srq_wqe(srq, ntohs(cqe->wqe_index)); ++nfreed; diff --git a/src/mlx4-abi.h b/src/mlx4-abi.h index 20a40c9..1b1253c 100644 --- a/src/mlx4-abi.h +++ b/src/mlx4-abi.h @@ -68,6 +68,14 @@ struct mlx4_resize_cq { __u64 buf_addr; }; +#ifdef HAVE_IBV_XRC_OPS +struct mlx4_create_xrc_srq { + struct ibv_create_xrc_srq ibv_cmd; + __u64 buf_addr; + __u64 db_addr; +}; +#endif + struct mlx4_create_srq { struct ibv_create_srq ibv_cmd; __u64 buf_addr; @@ -90,4 +98,12 @@ struct mlx4_create_qp { __u8 reserved[5]; }; +#ifdef HAVE_IBV_XRC_OPS +struct mlx4_open_xrc_domain_resp { + struct ibv_open_xrc_domain_resp ibv_resp; + __u32 xrcdn; + __u32 reserved; +}; +#endif + #endif /* MLX4_ABI_H */ diff --git a/src/mlx4.c b/src/mlx4.c index 671e849..12d2255 100644 --- a/src/mlx4.c +++ b/src/mlx4.c @@ -68,6 +68,19 @@ struct { HCA(MELLANOX, 0x673c), /* MT25408 "Hermon" QDR PCIe gen2 */ }; +#ifdef HAVE_IBV_XRC_OPS +static struct ibv_xrc_ops mlx4_xrc_ops = { + .create_xrc_srq = mlx4_create_xrc_srq, + .open_xrc_domain = mlx4_open_xrc_domain, + .close_xrc_domain = mlx4_close_xrc_domain, + .create_xrc_rcv_qp = mlx4_create_xrc_rcv_qp, + .modify_xrc_rcv_qp = mlx4_modify_xrc_rcv_qp, + .query_xrc_rcv_qp = mlx4_query_xrc_rcv_qp, + .reg_xrc_rcv_qp = mlx4_reg_xrc_rcv_qp, + .unreg_xrc_rcv_qp = mlx4_unreg_xrc_rcv_qp, +}; +#endif + static struct ibv_context_ops mlx4_ctx_ops = { .query_device = mlx4_query_device, .query_port = mlx4_query_port, @@ -124,6 +137,15 @@ static struct ibv_context *mlx4_alloc_context(struct ibv_device *ibdev, int cmd_ for (i = 0; i < MLX4_QP_TABLE_SIZE; ++i) context->qp_table[i].refcnt = 0; + context->num_xrc_srqs = resp.qp_tab_size; + context->xrc_srq_table_shift = ffs(context->num_xrc_srqs) - 1 + - MLX4_XRC_SRQ_TABLE_BITS; + context->xrc_srq_table_mask = (1 << context->xrc_srq_table_shift) - 1; + + 
pthread_mutex_init(&context->xrc_srq_table_mutex, NULL); + for (i = 0; i < MLX4_XRC_SRQ_TABLE_SIZE; ++i) + context->xrc_srq_table[i].refcnt = 0; + for (i = 0; i < MLX4_NUM_DB_TYPE; ++i) context->db_list[i] = NULL; @@ -156,6 +178,9 @@ static struct ibv_context *mlx4_alloc_context(struct ibv_device *ibdev, int cmd_ pthread_spin_init(&context->uar_lock, PTHREAD_PROCESS_PRIVATE); context->ibv_ctx.ops = mlx4_ctx_ops; +#ifdef HAVE_IBV_XRC_OPS + context->ibv_ctx.xrc_ops = &mlx4_xrc_ops; +#endif if (mlx4_query_device(&context->ibv_ctx, &dev_attrs)) goto query_free; diff --git a/src/mlx4.h b/src/mlx4.h index 0b47adc..0dd2d9a 100644 --- a/src/mlx4.h +++ b/src/mlx4.h @@ -111,6 +111,16 @@ enum { MLX4_QP_TABLE_MASK = MLX4_QP_TABLE_SIZE - 1 }; +enum { + MLX4_XRC_SRQ_TABLE_BITS = 8, + MLX4_XRC_SRQ_TABLE_SIZE = 1 << MLX4_XRC_SRQ_TABLE_BITS, + MLX4_XRC_SRQ_TABLE_MASK = MLX4_XRC_SRQ_TABLE_SIZE - 1 +}; + +enum { + MLX4_XRC_QPN_BIT = (1 << 23) +}; + enum mlx4_db_type { MLX4_DB_TYPE_CQ, MLX4_DB_TYPE_RQ, @@ -174,6 +184,15 @@ struct mlx4_context { int max_sge; int max_cqe; + struct { + struct mlx4_srq **table; + int refcnt; + } xrc_srq_table[MLX4_XRC_SRQ_TABLE_SIZE]; + pthread_mutex_t xrc_srq_table_mutex; + int num_xrc_srqs; + int xrc_srq_table_shift; + int xrc_srq_table_mask; + struct mlx4_db_page *db_list[MLX4_NUM_DB_TYPE]; pthread_mutex_t db_list_mutex; }; @@ -259,6 +278,11 @@ struct mlx4_ah { struct mlx4_av av; }; +struct mlx4_xrc_domain { + struct ibv_xrc_domain ibv_xrcd; + uint32_t xrcdn; +}; + static inline unsigned long align(unsigned long val, unsigned long align) { return (val + align - 1) & ~(align - 1); @@ -303,6 +327,13 @@ static inline struct mlx4_ah *to_mah(struct ibv_ah *ibah) return to_mxxx(ah, ah); } +#ifdef HAVE_IBV_XRC_OPS +static inline struct mlx4_xrc_domain *to_mxrcd(struct ibv_xrc_domain *ibxrcd) +{ + return to_mxxx(xrcd, xrc_domain); +} +#endif + int mlx4_alloc_buf(struct mlx4_buf *buf, size_t size, int page_size); void mlx4_free_buf(struct mlx4_buf *buf); @@ 
-347,6 +378,10 @@ void mlx4_free_srq_wqe(struct mlx4_srq *srq, int ind); int mlx4_post_srq_recv(struct ibv_srq *ibsrq, struct ibv_recv_wr *wr, struct ibv_recv_wr **bad_wr); +struct mlx4_srq *mlx4_find_xrc_srq(struct mlx4_context *ctx, uint32_t xrc_srqn); +int mlx4_store_xrc_srq(struct mlx4_context *ctx, uint32_t xrc_srqn, + struct mlx4_srq *srq); +void mlx4_clear_xrc_srq(struct mlx4_context *ctx, uint32_t xrc_srqn); struct ibv_qp *mlx4_create_qp(struct ibv_pd *pd, struct ibv_qp_init_attr *attr); int mlx4_query_qp(struct ibv_qp *qp, struct ibv_qp_attr *attr, @@ -377,5 +412,31 @@ int mlx4_alloc_av(struct mlx4_pd *pd, struct ibv_ah_attr *attr, void mlx4_free_av(struct mlx4_ah *ah); int mlx4_attach_mcast(struct ibv_qp *qp, union ibv_gid *gid, uint16_t lid); int mlx4_detach_mcast(struct ibv_qp *qp, union ibv_gid *gid, uint16_t lid); +#ifdef HAVE_IBV_XRC_OPS +struct ibv_srq *mlx4_create_xrc_srq(struct ibv_pd *pd, + struct ibv_xrc_domain *xrc_domain, + struct ibv_cq *xrc_cq, + struct ibv_srq_init_attr *attr); +struct ibv_xrc_domain *mlx4_open_xrc_domain(struct ibv_context *context, + int fd, int oflag); + +int mlx4_close_xrc_domain(struct ibv_xrc_domain *d); +int mlx4_create_xrc_rcv_qp(struct ibv_qp_init_attr *init_attr, + uint32_t *xrc_qp_num); +int mlx4_modify_xrc_rcv_qp(struct ibv_xrc_domain *xrc_domain, + uint32_t xrc_qp_num, + struct ibv_qp_attr *attr, + int attr_mask); +int mlx4_query_xrc_rcv_qp(struct ibv_xrc_domain *xrc_domain, + uint32_t xrc_qp_num, + struct ibv_qp_attr *attr, + int attr_mask, + struct ibv_qp_init_attr *init_attr); +int mlx4_reg_xrc_rcv_qp(struct ibv_xrc_domain *xrc_domain, + uint32_t xrc_qp_num); +int mlx4_unreg_xrc_rcv_qp(struct ibv_xrc_domain *xrc_domain, + uint32_t xrc_qp_num); +#endif + #endif /* MLX4_H */ diff --git a/src/qp.c b/src/qp.c index 9f219b9..4322513 100644 --- a/src/qp.c +++ b/src/qp.c @@ -210,7 +224,7 @@ int mlx4_post_send(struct ibv_qp *ibqp, struct ibv_send_wr *wr, ctrl = wqe = get_send_wqe(qp, ind & (qp->sq.wqe_cnt - 1)); 
qp->sq.wrid[ind & (qp->sq.wqe_cnt - 1)] = wr->wr_id; - ctrl->srcrb_flags = + ctrl->xrcrb_flags = (wr->send_flags & IBV_SEND_SIGNALED ? htonl(MLX4_WQE_CTRL_CQ_UPDATE) : 0) | (wr->send_flags & IBV_SEND_SOLICITED ? @@ -227,6 +241,9 @@ int mlx4_post_send(struct ibv_qp *ibqp, struct ibv_send_wr *wr, size = sizeof *ctrl / 16; switch (ibqp->qp_type) { + case IBV_QPT_XRC: + ctrl->xrcrb_flags |= htonl(wr->xrc_remote_srq_num << 8); + /* fall thru */ case IBV_QPT_RC: case IBV_QPT_UC: switch (wr->opcode) { @@ -526,6 +544,7 @@ void mlx4_calc_sq_wqe_size(struct ibv_qp_cap *cap, enum ibv_qp_type type, size += sizeof (struct mlx4_wqe_raddr_seg); break; + case IBV_QPT_XRC: case IBV_QPT_RC: size += sizeof (struct mlx4_wqe_raddr_seg); /* @@ -614,6 +633,7 @@ void mlx4_set_sq_sizes(struct mlx4_qp *qp, struct ibv_qp_cap *cap, case IBV_QPT_UC: case IBV_QPT_RC: + case IBV_QPT_XRC: wqe_size -= sizeof (struct mlx4_wqe_raddr_seg); break; diff --git a/src/srq.c b/src/srq.c index ba2ceb9..b70c18b 100644 --- a/src/srq.c +++ b/src/srq.c @@ -167,3 +167,53 @@ int mlx4_alloc_srq_buf(struct ibv_pd *pd, struct ibv_srq_attr *attr, return 0; } + +struct mlx4_srq *mlx4_find_xrc_srq(struct mlx4_context *ctx, uint32_t xrc_srqn) +{ + int tind = (xrc_srqn & (ctx->num_xrc_srqs - 1)) >> ctx->xrc_srq_table_shift; + + if (ctx->xrc_srq_table[tind].refcnt) + return ctx->xrc_srq_table[tind].table[xrc_srqn & ctx->xrc_srq_table_mask]; + else + return NULL; +} + +int mlx4_store_xrc_srq(struct mlx4_context *ctx, uint32_t xrc_srqn, + struct mlx4_srq *srq) +{ + int tind = (xrc_srqn & (ctx->num_xrc_srqs - 1)) >> ctx->xrc_srq_table_shift; + int ret = 0; + + pthread_mutex_lock(&ctx->xrc_srq_table_mutex); + + if (!ctx->xrc_srq_table[tind].refcnt) { + ctx->xrc_srq_table[tind].table = calloc(ctx->xrc_srq_table_mask + 1, + sizeof (struct mlx4_srq *)); + if (!ctx->xrc_srq_table[tind].table) { + ret = -1; + goto out; + } + } + + ++ctx->xrc_srq_table[tind].refcnt; + ctx->xrc_srq_table[tind].table[xrc_srqn & 
ctx->xrc_srq_table_mask] = srq; + +out: + pthread_mutex_unlock(&ctx->xrc_srq_table_mutex); + return ret; +} + +void mlx4_clear_xrc_srq(struct mlx4_context *ctx, uint32_t xrc_srqn) +{ + int tind = (xrc_srqn & (ctx->num_xrc_srqs - 1)) >> ctx->xrc_srq_table_shift; + + pthread_mutex_lock(&ctx->xrc_srq_table_mutex); + + if (!--ctx->xrc_srq_table[tind].refcnt) + free(ctx->xrc_srq_table[tind].table); + else + ctx->xrc_srq_table[tind].table[xrc_srqn & ctx->xrc_srq_table_mask] = NULL; + + pthread_mutex_unlock(&ctx->xrc_srq_table_mutex); +} + diff --git a/src/verbs.c b/src/verbs.c index 38b40d2..bb904e2 100644 --- a/src/verbs.c +++ b/src/verbs.c @@ -320,18 +320,36 @@ int mlx4_query_srq(struct ibv_srq *srq, return ibv_cmd_query_srq(srq, attr, &cmd, sizeof cmd); } -int mlx4_destroy_srq(struct ibv_srq *srq) +int mlx4_destroy_srq(struct ibv_srq *ibsrq) { + struct mlx4_srq *srq = to_msrq(ibsrq); + struct mlx4_cq *mcq = NULL; int ret; - ret = ibv_cmd_destroy_srq(srq); - if (ret) + if (ibsrq->xrc_cq) { + /* is an xrc_srq */ + mcq = to_mcq(ibsrq->xrc_cq); + mlx4_cq_clean(mcq, 0, srq); + pthread_spin_lock(&mcq->lock); + mlx4_clear_xrc_srq(to_mctx(ibsrq->context), srq->srqn); + pthread_spin_unlock(&mcq->lock); + } + + ret = ibv_cmd_destroy_srq(ibsrq); + if (ret) { + if (ibsrq->xrc_cq) { + pthread_spin_lock(&mcq->lock); + mlx4_store_xrc_srq(to_mctx(ibsrq->context), + srq->srqn, srq); + pthread_spin_unlock(&mcq->lock); + } return ret; + } - mlx4_free_db(to_mctx(srq->context), MLX4_DB_TYPE_RQ, to_msrq(srq)->db); - mlx4_free_buf(&to_msrq(srq)->buf); - free(to_msrq(srq)->wrid); - free(to_msrq(srq)); + mlx4_free_db(to_mctx(ibsrq->context), MLX4_DB_TYPE_RQ, srq->db); + mlx4_free_buf(&srq->buf); + free(srq->wrid); + free(srq); return 0; } @@ -367,7 +385,7 @@ struct ibv_qp *mlx4_create_qp(struct ibv_pd *pd, struct ibv_qp_init_attr *attr) qp->sq.wqe_cnt = align_queue_size(attr->cap.max_send_wr + qp->sq_spare_wqes); qp->rq.wqe_cnt = align_queue_size(attr->cap.max_recv_wr); - if (attr->srq) + if 
(attr->srq || attr->qp_type == IBV_QPT_XRC) attr->cap.max_recv_wr = qp->rq.wqe_cnt = 0; else { if (attr->cap.max_recv_sge < 1) @@ -385,7 +403,7 @@ struct ibv_qp *mlx4_create_qp(struct ibv_pd *pd, struct ibv_qp_init_attr *attr) pthread_spin_init(&qp->rq.lock, PTHREAD_PROCESS_PRIVATE)) goto err_free; - if (!attr->srq) { + if (!attr->srq && attr->qp_type != IBV_QPT_XRC) { qp->db = mlx4_alloc_db(to_mctx(pd->context), MLX4_DB_TYPE_RQ); if (!qp->db) goto err_free; @@ -394,7 +412,7 @@ struct ibv_qp *mlx4_create_qp(struct ibv_pd *pd, struct ibv_qp_init_attr *attr) } cmd.buf_addr = (uintptr_t) qp->buf.buf; - if (attr->srq) + if (attr->srq || attr->qp_type == IBV_QPT_XRC) cmd.db_addr = 0; else cmd.db_addr = (uintptr_t) qp->db; @@ -437,7 +455,7 @@ err_destroy: ibv_cmd_destroy_qp(&qp->ibv_qp); err_rq_db: - if (!attr->srq) + if (!attr->srq && attr->qp_type != IBV_QPT_XRC) mlx4_free_db(to_mctx(pd->context), MLX4_DB_TYPE_RQ, qp->db); err_free: @@ -496,7 +514,7 @@ int mlx4_modify_qp(struct ibv_qp *qp, struct ibv_qp_attr *attr, mlx4_cq_clean(to_mcq(qp->send_cq), qp->qp_num, NULL); mlx4_init_qp_indices(to_mqp(qp)); - if (!qp->srq) + if (!qp->srq && qp->qp_type != IBV_QPT_XRC) *to_mqp(qp)->db = 0; } @@ -558,7 +576,7 @@ int mlx4_destroy_qp(struct ibv_qp *ibqp) return ret; } - if (!ibqp->srq) + if (!ibqp->srq && ibqp->qp_type != IBV_QPT_XRC) mlx4_free_db(to_mctx(ibqp->context), MLX4_DB_TYPE_RQ, qp->db); free(qp->sq.wrid); if (qp->rq.wqe_cnt) @@ -616,3 +634,158 @@ int mlx4_detach_mcast(struct ibv_qp *qp, union ibv_gid *gid, uint16_t lid) { return ibv_cmd_detach_mcast(qp, gid, lid); } + +#ifdef HAVE_IBV_XRC_OPS +struct ibv_srq *mlx4_create_xrc_srq(struct ibv_pd *pd, + struct ibv_xrc_domain *xrc_domain, + struct ibv_cq *xrc_cq, + struct ibv_srq_init_attr *attr) +{ + struct mlx4_create_xrc_srq cmd; + struct mlx4_create_srq_resp resp; + struct mlx4_srq *srq; + int ret; + + /* Sanity check SRQ size before proceeding */ + if (attr->attr.max_wr > 1 << 16 || attr->attr.max_sge > 64) + return 
NULL; + + srq = malloc(sizeof *srq); + if (!srq) + return NULL; + + if (pthread_spin_init(&srq->lock, PTHREAD_PROCESS_PRIVATE)) + goto err; + + srq->max = align_queue_size(attr->attr.max_wr + 1); + srq->max_gs = attr->attr.max_sge; + srq->counter = 0; + + if (mlx4_alloc_srq_buf(pd, &attr->attr, srq)) + goto err; + + srq->db = mlx4_alloc_db(to_mctx(pd->context), MLX4_DB_TYPE_RQ); + if (!srq->db) + goto err_free; + + *srq->db = 0; + + cmd.buf_addr = (uintptr_t) srq->buf.buf; + cmd.db_addr = (uintptr_t) srq->db; + + ret = ibv_cmd_create_xrc_srq(pd, &srq->ibv_srq, attr, + xrc_domain->handle, + xrc_cq->handle, + &cmd.ibv_cmd, sizeof cmd, + &resp.ibv_resp, sizeof resp); + if (ret) + goto err_db; + + srq->ibv_srq.xrc_srq_num = srq->srqn = resp.srqn; + + ret = mlx4_store_xrc_srq(to_mctx(pd->context), srq->ibv_srq.xrc_srq_num, srq); + if (ret) + goto err_destroy; + + return &srq->ibv_srq; + +err_destroy: + ibv_cmd_destroy_srq(&srq->ibv_srq); + +err_db: + mlx4_free_db(to_mctx(pd->context), MLX4_DB_TYPE_RQ, srq->db); + +err_free: + free(srq->wrid); + mlx4_free_buf(&srq->buf); + +err: + free(srq); + + return NULL; +} + +struct ibv_xrc_domain *mlx4_open_xrc_domain(struct ibv_context *context, + int fd, int oflag) +{ + int ret; + struct mlx4_open_xrc_domain_resp resp; + struct mlx4_xrc_domain *xrcd; + + xrcd = malloc(sizeof *xrcd); + if (!xrcd) + return NULL; + + ret = ibv_cmd_open_xrc_domain(context, fd, oflag, &xrcd->ibv_xrcd, + &resp.ibv_resp, sizeof resp); + if (ret) { + free(xrcd); + return NULL; + } + + xrcd->xrcdn = resp.xrcdn; + return &xrcd->ibv_xrcd; +} + +int mlx4_close_xrc_domain(struct ibv_xrc_domain *d) +{ + ibv_cmd_close_xrc_domain(d); + free(d); + return 0; +} + +int mlx4_create_xrc_rcv_qp(struct ibv_qp_init_attr *init_attr, + uint32_t *xrc_qp_num) +{ + + return ibv_cmd_create_xrc_rcv_qp(init_attr, xrc_qp_num); +} + +int mlx4_modify_xrc_rcv_qp(struct ibv_xrc_domain *xrc_domain, + uint32_t xrc_qp_num, + struct ibv_qp_attr *attr, + int attr_mask) +{ + return 
ibv_cmd_modify_xrc_rcv_qp(xrc_domain, xrc_qp_num, + attr, attr_mask); +} + +int mlx4_query_xrc_rcv_qp(struct ibv_xrc_domain *xrc_domain, + uint32_t xrc_qp_num, + struct ibv_qp_attr *attr, + int attr_mask, + struct ibv_qp_init_attr *init_attr) +{ + int ret; + + ret = ibv_cmd_query_xrc_rcv_qp(xrc_domain, xrc_qp_num, + attr, attr_mask, init_attr); + if (ret) + return ret; + + init_attr->cap.max_send_wr = init_attr->cap.max_send_sge = 1; + init_attr->cap.max_recv_sge = init_attr->cap.max_recv_wr = 0; + init_attr->cap.max_inline_data = 0; + init_attr->recv_cq = init_attr->send_cq = NULL; + init_attr->srq = NULL; + init_attr->xrc_domain = xrc_domain; + init_attr->qp_type = IBV_QPT_XRC; + init_attr->qp_context = NULL; + attr->cap = init_attr->cap; + + return 0; +} + +int mlx4_reg_xrc_rcv_qp(struct ibv_xrc_domain *xrc_domain, + uint32_t xrc_qp_num) +{ + return ibv_cmd_reg_xrc_rcv_qp(xrc_domain, xrc_qp_num); +} + +int mlx4_unreg_xrc_rcv_qp(struct ibv_xrc_domain *xrc_domain, + uint32_t xrc_qp_num) +{ + return ibv_cmd_unreg_xrc_rcv_qp(xrc_domain, xrc_qp_num); +} + +#endif diff --git a/src/wqe.h b/src/wqe.h index 6f7f309..fa2f8ac 100644 --- a/src/wqe.h +++ b/src/wqe.h @@ -65,7 +65,7 @@ struct mlx4_wqe_ctrl_seg { * [1] SE (solicited event) * [0] FL (force loopback) */ - uint32_t srcrb_flags; + uint32_t xrcrb_flags; /* * imm is immediate data for send/RDMA write w/ immediate; * also invalidation key for send with invalidate; input From jackm at dev.mellanox.co.il Wed Jan 23 01:59:41 2008 From: jackm at dev.mellanox.co.il (Jack Morgenstein) Date: Wed, 23 Jan 2008 11:59:41 +0200 Subject: [ofa-general] [PATCH 1/8] libibverbs: implement XRC qps Message-ID: <200801231159.41526.jackm@dev.mellanox.co.il> Implements the full XRC QP interface. Changes: Added creation of XRC receive-only QPs for userspace, which reside in kernel space (user cannot post-to or poll these QPs). 
Motivation: MPI community requires XRC receive QPs which will not be destroyed when the creating process terminates. Solution: Userspace requests that a QP be created in kernel space. Each userspace process using that QP (i.e. receiving packets on an XRC SRQ via the qp), registers with that QP (-- the creator is also registered, whether or not it is a user of the QP). When the last userspace user unregisters with the QP, it is destroyed. Unregistration is also part of userspace process cleanup, so there is no leakage. This patch implements the following: ibv_create_xrc_rcv_qp ibv_modify_xrc_rcv_qp ibv_query_xrc_rcv_qp ibv_reg_xrc_rcv_qp ibv_unreg_xrc_rcv_qp Creating process (userspace) workflow: ibv_create_xrc_rcv_qp -- to create (also registers the QP). ibv_modify_xrc_rcv_qp -- to move QP to INIT ibv_modify_xrc_rcv_qp -- to move QP to RTR (to RTS is not needed for receive-only QPs) ibv_unreg_xrc_rcv_qp -- instead of destroy. Using process workflow ibv_create_xrc_srq -- to create an SRQ ibv_reg_xrc_rcv_qp -- to register with the QP as a user ibv_destroy_srq ibv_unreg_xrc_rcv_qp -- to "unregister" with the QP. If no user processes remain registered, the QP is destroyed. NOTES: 1. Since there is no userspace object for the QP, the API uses the XRC domain object and qp number instead. 2. Registration needs to be performed only once per process (multiple registrations count as a single registration). 3. Async events for the receive QP are delivered to all registered processes. The event ID is "OR'ed" with 0x80000000, to indicate that this is an XRC receive-only QP event. The (new) element-field union value "xrc_qp_num" is set to the QP number which generated the event. If the QP goes into the error state for any reason, each registered userspace process will receive the LAST_WQE_REACHED event for the QP; each process should then call ibv_unreg_xrc_rcv_qp() so that the QP will be destroyed. 
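The registration rules and the event flag described in the notes above can be sketched as a small standalone C model. This is only an illustration of the semantics as stated in this mail, not the kernel or libibverbs implementation: the struct, the function names (model_create, model_reg, model_unreg, model_users, is_xrc_rcv_qp_event, base_event), the process-slot indexing, and the MODEL_MAX_PROCS cap are all hypothetical. Only the 0x80000000 flag value and the "one registration per process, destroy on last unregister" behaviour come from the description.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* From note 3: this bit is OR'ed into the event ID for XRC receive-only QP events. */
#define XRC_RCV_QP_EVENT_FLAG 0x80000000u

/* Hypothetical cap on the number of processes tracked by this model. */
#define MODEL_MAX_PROCS 8

/* Hypothetical model of a kernel-resident XRC receive-only QP. */
struct xrc_rcv_qp_model {
    uint32_t qp_num;
    bool     registered[MODEL_MAX_PROCS]; /* one flag per process, not a counter */
    bool     destroyed;
};

/* Number of processes currently registered with the QP. */
static int model_users(const struct xrc_rcv_qp_model *qp)
{
    int n = 0;
    for (int i = 0; i < MODEL_MAX_PROCS; i++)
        if (qp->registered[i])
            n++;
    return n;
}

/* Creating the QP also registers the creator. */
static void model_create(struct xrc_rcv_qp_model *qp, uint32_t qp_num, int creator)
{
    assert(creator >= 0 && creator < MODEL_MAX_PROCS);
    qp->qp_num = qp_num;
    qp->destroyed = false;
    for (int i = 0; i < MODEL_MAX_PROCS; i++)
        qp->registered[i] = false;
    qp->registered[creator] = true;
}

/* Note 2: repeated registrations by one process count as a single registration. */
static int model_reg(struct xrc_rcv_qp_model *qp, int proc)
{
    if (qp->destroyed || proc < 0 || proc >= MODEL_MAX_PROCS)
        return -1;
    qp->registered[proc] = true;
    return 0;
}

/* The QP is destroyed when the last registered process unregisters. */
static int model_unreg(struct xrc_rcv_qp_model *qp, int proc)
{
    if (qp->destroyed || proc < 0 || proc >= MODEL_MAX_PROCS ||
        !qp->registered[proc])
        return -1;
    qp->registered[proc] = false;
    if (model_users(qp) == 0)
        qp->destroyed = true;
    return 0;
}

/* True if the raw event ID carries the XRC receive-only QP flag. */
static bool is_xrc_rcv_qp_event(uint32_t raw_event)
{
    return (raw_event & XRC_RCV_QP_EVENT_FLAG) != 0;
}

/* Strip the flag to recover the underlying event ID. */
static uint32_t base_event(uint32_t raw_event)
{
    return raw_event & ~XRC_RCV_QP_EVENT_FLAG;
}
```

In this model a process that calls model_reg twice still needs only one model_unreg, and the QP outlives its creator as long as any other process stays registered, which is the property the MPI use case asks for.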
Signed-off-by: Jack Morgenstein diff --git a/include/infiniband/driver.h b/include/infiniband/driver.h index 67a3bf8..e871f7d 100644 --- a/include/infiniband/driver.h +++ b/include/infiniband/driver.h @@ -99,6 +99,11 @@ int ibv_cmd_create_srq(struct ibv_pd *pd, struct ibv_srq *srq, struct ibv_srq_init_attr *attr, struct ibv_create_srq *cmd, size_t cmd_size, struct ibv_create_srq_resp *resp, size_t resp_size); +int ibv_cmd_create_xrc_srq(struct ibv_pd *pd, + struct ibv_srq *srq, struct ibv_srq_init_attr *attr, + uint32_t xrc_domain, uint32_t xrc_cq, + struct ibv_create_xrc_srq *cmd, size_t cmd_size, + struct ibv_create_srq_resp *resp, size_t resp_size); int ibv_cmd_modify_srq(struct ibv_srq *srq, struct ibv_srq_attr *srq_attr, enum ibv_srq_attr_mask srq_attr_mask, @@ -134,6 +139,20 @@ int ibv_cmd_detach_mcast(struct ibv_qp *qp, union ibv_gid *gid, uint16_t lid); int ibv_dontfork_range(void *base, size_t size); int ibv_dofork_range(void *base, size_t size); +int ibv_cmd_open_xrc_domain(struct ibv_context *context, int fd, int oflag, + struct ibv_xrc_domain *d, + struct ibv_open_xrc_domain_resp *resp, + size_t resp_size); +int ibv_cmd_close_xrc_domain(struct ibv_xrc_domain *d); +int ibv_cmd_create_xrc_rcv_qp(struct ibv_qp_init_attr *init_attr, + uint32_t *xrc_rcv_qpn); +int ibv_cmd_modify_xrc_rcv_qp(struct ibv_xrc_domain *d, uint32_t xrc_rcv_qpn, + struct ibv_qp_attr *attr, int attr_mask); +int ibv_cmd_query_xrc_rcv_qp(struct ibv_xrc_domain *d, uint32_t xrc_rcv_qpn, + struct ibv_qp_attr *attr, int attr_mask, + struct ibv_qp_init_attr *init_attr); +int ibv_cmd_reg_xrc_rcv_qp(struct ibv_xrc_domain *xrc_domain, uint32_t xrc_qp_num); +int ibv_cmd_unreg_xrc_rcv_qp(struct ibv_xrc_domain *xrc_domain, uint32_t xrc_qp_num); /* * sysfs helper functions diff --git a/include/infiniband/kern-abi.h b/include/infiniband/kern-abi.h index 0db083a..97949b9 100644 --- a/include/infiniband/kern-abi.h +++ b/include/infiniband/kern-abi.h @@ -85,7 +85,15 @@ enum { 
IB_USER_VERBS_CMD_MODIFY_SRQ, IB_USER_VERBS_CMD_QUERY_SRQ, IB_USER_VERBS_CMD_DESTROY_SRQ, - IB_USER_VERBS_CMD_POST_SRQ_RECV + IB_USER_VERBS_CMD_POST_SRQ_RECV, + IB_USER_VERBS_CMD_CREATE_XRC_SRQ, + IB_USER_VERBS_CMD_OPEN_XRC_DOMAIN, + IB_USER_VERBS_CMD_CLOSE_XRC_DOMAIN, + IB_USER_VERBS_CMD_CREATE_XRC_RCV_QP, + IB_USER_VERBS_CMD_MODIFY_XRC_RCV_QP, + IB_USER_VERBS_CMD_QUERY_XRC_RCV_QP, + IB_USER_VERBS_CMD_REG_XRC_RCV_QP, + IB_USER_VERBS_CMD_UNREG_XRC_RCV_QP, }; /* @@ -567,6 +575,92 @@ struct ibv_destroy_qp_resp { __u32 events_reported; }; +struct ibv_create_xrc_rcv_qp { + __u32 command; + __u16 in_words; + __u16 out_words; + __u64 response; + __u64 user_handle; + __u32 xrc_domain_handle; + __u32 max_send_wr; + __u32 max_recv_wr; + __u32 max_send_sge; + __u32 max_recv_sge; + __u32 max_inline_data; + __u8 sq_sig_all; + __u8 qp_type; + __u8 reserved[2]; + __u64 driver_data[0]; +}; + +struct ibv_create_xrc_rcv_qp_resp { + __u32 qpn; + __u32 reserved; +}; + +struct ibv_modify_xrc_rcv_qp { + __u32 command; + __u16 in_words; + __u16 out_words; + __u32 xrc_domain_handle; + __u32 qp_num; + struct ibv_qp_dest dest; + struct ibv_qp_dest alt_dest; + __u32 attr_mask; + __u32 qkey; + __u32 rq_psn; + __u32 sq_psn; + __u32 dest_qp_num; + __u32 qp_access_flags; + __u16 pkey_index; + __u16 alt_pkey_index; + __u8 qp_state; + __u8 cur_qp_state; + __u8 path_mtu; + __u8 path_mig_state; + __u8 en_sqd_async_notify; + __u8 max_rd_atomic; + __u8 max_dest_rd_atomic; + __u8 min_rnr_timer; + __u8 port_num; + __u8 timeout; + __u8 retry_cnt; + __u8 rnr_retry; + __u8 alt_port_num; + __u8 alt_timeout; + __u8 reserved[2]; + __u64 driver_data[0]; +}; + +struct ibv_query_xrc_rcv_qp { + __u32 command; + __u16 in_words; + __u16 out_words; + __u64 response; + __u32 xrc_domain_handle; + __u32 qp_num; + __u32 attr_mask; + __u64 driver_data[0]; +}; + +struct ibv_reg_xrc_rcv_qp { + __u32 command; + __u16 in_words; + __u16 out_words; + __u32 xrc_domain_handle; + __u32 qp_num; + __u64 driver_data[0]; +}; + 
+struct ibv_unreg_xrc_rcv_qp { + __u32 command; + __u16 in_words; + __u16 out_words; + __u32 xrc_domain_handle; + __u32 qp_num; + __u64 driver_data[0]; +}; + struct ibv_kern_send_wr { __u64 wr_id; __u32 num_sge; @@ -706,6 +800,21 @@ struct ibv_create_srq { __u64 driver_data[0]; }; +struct ibv_create_xrc_srq { + __u32 command; + __u16 in_words; + __u16 out_words; + __u64 response; + __u64 user_handle; + __u32 pd_handle; + __u32 max_wr; + __u32 max_sge; + __u32 srq_limit; + __u32 xrcd_handle; + __u32 xrc_cq; + __u64 driver_data[0]; +}; + struct ibv_create_srq_resp { __u32 srq_handle; __u32 max_wr; @@ -754,6 +863,29 @@ struct ibv_destroy_srq_resp { __u32 events_reported; }; +struct ibv_open_xrc_domain { + __u32 command; + __u16 in_words; + __u16 out_words; + __u64 response; + __u32 fd; + __u32 oflags; + __u64 driver_data[0]; +}; + +struct ibv_open_xrc_domain_resp { + __u32 xrcd_handle; +}; + +struct ibv_close_xrc_domain { + __u32 command; + __u16 in_words; + __u16 out_words; + __u64 response; + __u32 xrcd_handle; + __u64 driver_data[0]; +}; + /* * Compatibility with older ABI versions */ @@ -803,6 +935,14 @@ enum { * trick opcodes in IBV_INIT_CMD() doesn't break. 
*/ IB_USER_VERBS_CMD_CREATE_COMP_CHANNEL_V2 = -1, + IB_USER_VERBS_CMD_CREATE_XRC_SRQ_V2 = -1, + IB_USER_VERBS_CMD_OPEN_XRC_DOMAIN_V2 = -1, + IB_USER_VERBS_CMD_CLOSE_XRC_DOMAIN_V2 = -1, + IB_USER_VERBS_CMD_CREATE_XRC_RCV_QP_V2 = -1, + IB_USER_VERBS_CMD_MODIFY_XRC_RCV_QP_V2 = -1, + IB_USER_VERBS_CMD_QUERY_XRC_RCV_QP_V2 = -1, + IB_USER_VERBS_CMD_REG_XRC_RCV_QP_V2 = -1, + IB_USER_VERBS_CMD_UNREG_XRC_RCV_QP_V2 = -1, }; struct ibv_destroy_cq_v1 { diff --git a/include/infiniband/verbs.h b/include/infiniband/verbs.h index acc1b82..a032a67 100644 --- a/include/infiniband/verbs.h +++ b/include/infiniband/verbs.h @@ -92,7 +92,8 @@ enum ibv_device_cap_flags { IBV_DEVICE_SYS_IMAGE_GUID = 1 << 11, IBV_DEVICE_RC_RNR_NAK_GEN = 1 << 12, IBV_DEVICE_SRQ_RESIZE = 1 << 13, - IBV_DEVICE_N_NOTIFY_CQ = 1 << 14 + IBV_DEVICE_N_NOTIFY_CQ = 1 << 14, + IBV_DEVICE_XRC = 1 << 20 }; enum ibv_atomic_cap { @@ -204,12 +205,17 @@ enum ibv_event_type { IBV_EVENT_CLIENT_REREGISTER }; +enum ibv_event_flags { + IBV_XRC_QP_EVENT_FLAG = 0x80000000, +}; + struct ibv_async_event { union { struct ibv_cq *cq; struct ibv_qp *qp; struct ibv_srq *srq; int port_num; + uint32_t xrc_qp_num; } element; enum ibv_event_type event_type; }; @@ -370,6 +376,11 @@ struct ibv_ah_attr { uint8_t port_num; }; +struct ibv_xrc_domain { + struct ibv_context *context; + uint32_t handle; +}; + enum ibv_srq_attr_mask { IBV_SRQ_MAX_WR = 1 << 0, IBV_SRQ_LIMIT = 1 << 1 @@ -389,7 +400,8 @@ struct ibv_srq_init_attr { enum ibv_qp_type { IBV_QPT_RC = 2, IBV_QPT_UC, - IBV_QPT_UD + IBV_QPT_UD, + IBV_QPT_XRC }; struct ibv_qp_cap { @@ -408,6 +420,7 @@ struct ibv_qp_init_attr { struct ibv_qp_cap cap; enum ibv_qp_type qp_type; int sq_sig_all; + struct ibv_xrc_domain *xrc_domain; }; enum ibv_qp_attr_mask { @@ -526,6 +539,7 @@ struct ibv_send_wr { uint32_t remote_qkey; } ud; } wr; + uint32_t xrc_remote_srq_num; }; struct ibv_recv_wr { @@ -553,6 +567,10 @@ struct ibv_srq { pthread_mutex_t mutex; pthread_cond_t cond; uint32_t events_completed; + + 
uint32_t xrc_srq_num; + struct ibv_xrc_domain *xrc_domain; + struct ibv_cq *xrc_cq; }; struct ibv_qp { @@ -570,6 +588,8 @@ struct ibv_qp { pthread_mutex_t mutex; pthread_cond_t cond; uint32_t events_completed; + + struct ibv_xrc_domain *xrc_domain; }; struct ibv_comp_channel { @@ -624,6 +644,32 @@ struct ibv_device { char ibdev_path[IBV_SYSFS_PATH_MAX]; }; +struct ibv_xrc_ops { + struct ibv_srq * (*create_xrc_srq)(struct ibv_pd *pd, + struct ibv_xrc_domain *xrc_domain, + struct ibv_cq *xrc_cq, + struct ibv_srq_init_attr *srq_init_attr); + struct ibv_xrc_domain * (*open_xrc_domain)(struct ibv_context *context, + int fd, int oflag); + int (*close_xrc_domain)(struct ibv_xrc_domain *d); + int (*create_xrc_rcv_qp)(struct ibv_qp_init_attr *init_attr, + uint32_t *xrc_qp_num); + int (*modify_xrc_rcv_qp)(struct ibv_xrc_domain *xrc_domain, + uint32_t xrc_qp_num, + struct ibv_qp_attr *attr, + int attr_mask); + int (*query_xrc_rcv_qp)(struct ibv_xrc_domain *xrc_domain, + uint32_t xrc_qp_num, + struct ibv_qp_attr *attr, + int attr_mask, + struct ibv_qp_init_attr *init_attr); + int (*reg_xrc_rcv_qp)(struct ibv_xrc_domain *xrc_domain, + uint32_t xrc_qp_num); + int (*unreg_xrc_rcv_qp)(struct ibv_xrc_domain *xrc_domain, + uint32_t xrc_qp_num); + +}; + struct ibv_context_ops { int (*query_device)(struct ibv_context *context, struct ibv_device_attr *device_attr); @@ -690,6 +736,7 @@ struct ibv_context { int num_comp_vectors; pthread_mutex_t mutex; void *abi_compat; + struct ibv_xrc_ops *xrc_ops; }; /** @@ -912,6 +959,25 @@ struct ibv_srq *ibv_create_srq(struct ibv_pd *pd, struct ibv_srq_init_attr *srq_init_attr); /** + * ibv_create_xrc_srq - Creates a SRQ associated with the specified protection + * domain and xrc domain. + * @pd: The protection domain associated with the SRQ. + * @xrc_domain: The XRC domain associated with the SRQ. + * @xrc_cq: CQ to report completions for XRC packets on. + * + * @srq_init_attr: A list of initial attributes required to create the SRQ. 
+ * + * srq_attr->max_wr and srq_attr->max_sge are read the determine the + * requested size of the SRQ, and set to the actual values allocated + * on return. If ibv_create_srq() succeeds, then max_wr and max_sge + * will always be at least as large as the requested values. + */ +struct ibv_srq *ibv_create_xrc_srq(struct ibv_pd *pd, + struct ibv_xrc_domain *xrc_domain, + struct ibv_cq *xrc_cq, + struct ibv_srq_init_attr *srq_init_attr); + +/** * ibv_modify_srq - Modifies the attributes for the specified SRQ. * @srq: The SRQ to modify. * @srq_attr: On input, specifies the SRQ attributes to modify. On output, @@ -1074,6 +1140,136 @@ int ibv_detach_mcast(struct ibv_qp *qp, union ibv_gid *gid, uint16_t lid); */ int ibv_fork_init(void); +/** + * ibv_open_xrc_domain - open an XRC domain + * Returns a reference to an XRC domain. + * + * @context: Device context + * @fd: descriptor for inode associated with the domain + * If fd == -1, no inode is associated with the domain; in this case, + * the only legal value for oflag is O_CREAT + * + * @oflag: oflag values are constructed by OR-ing flags from the following list + * + * O_CREAT + * If a domain belonging to device named by context is already associated + * with the inode, this flag has no effect, except as noted under O_EXCL + * below. Otherwise, a new XRC domain is created and is associated with + * inode specified by fd. + * + * O_EXCL + * If O_EXCL and O_CREAT are set, open will fail if a domain associated with + * the inode exists. The check for the existence of the domain and creation + * of the domain if it does not exist is atomic with respect to other + * processes executing open with fd naming the same inode. + */ +struct ibv_xrc_domain *ibv_open_xrc_domain(struct ibv_context *context, + int fd, int oflag); + +/** + * ibv_close_xrc_domain - close an XRC domain + * If this is the last reference, destroys the domain. 
+ * + * @d: reference to XRC domain to close + * + * close is implicitly performed at process exit. + */ +int ibv_close_xrc_domain(struct ibv_xrc_domain *d); + +/** + * ibv_create_xrc_rcv_qp - creates an XRC QP for serving as a receive-side only QP, + * + * This QP is created in kernel space, and persists until the last process registered + * for the QP calls ibv_unreg_xrc_rcv_qp() (at which time the QP is destroyed). + * + * @init_attr: init attributes to use for QP. xrc domain MUST be included here. All other fields + * are ignored. + * + * @xrc_rcv_qpn: qp_num of created QP (if success). To be passed to the remote node (sender). + * The remote node will use xrc_rcv_qpn in ibv_post_send when sending to + * XRC SRQ's on this host in the same xrc domain. + * + * RETURNS: success (0), or a (negative) error value. + * + * NOTE: this verb also registers the calling user-process with the QP at its creation time + * (implicit call to ibv_reg_xrc_rcv_qp), to avoid race conditions. + * The creating process will need to call ibv_unreg_xrc_qp() for the QP to release it from + * this process. + */ +int ibv_create_xrc_rcv_qp(struct ibv_qp_init_attr *init_attr, + uint32_t *xrc_rcv_qpn); + +/** + * ibv_modify_xrc_rcv_qp - modifies an xrc_rcv qp. + * + * @xrc_domain: xrc domain the QP belongs to (for verification). + * @xrc_qp_num: The (24 bit) number of the XRC QP. + * @attr: modify-qp attributes. The following fields must be specified: + * for RESET_2_INIT: qp_state, pkey_index , port, qp_access_flags + * for INIT_2_RTR: qp_state, path_mtu, dest_qp_num, rq_psn, max_dest_rd_atomic, + * min_rnr_timer, ah_attr + * The QP need not be brought to RTS for the QP to operate as a receive-only QP. + * @attr_mask: bitmap indicating which attributes are provided in the attr struct. + * used for validity checking. 
The following bits must be set: + * for RESET_2_INIT: IBV_QP_PKEY_INDEX, IBV_QP_PORT, IBV_QP_ACCESS_FLAGS, IBV_QP_STATE + * for INIT_2_RTR: IBV_QP_AV, IBV_QP_PATH_MTU, IBV_QP_DEST_QPN, IBV_QP_RQ_PSN, + * IBV_QP_MAX_DEST_RD_ATOMIC, IBV_QP_MIN_RNR_TIMER, IBV_QP_STATE + * + * RETURNS: success (0), or a (negative) error value. + * + */ +int ibv_modify_xrc_rcv_qp(struct ibv_xrc_domain *xrc_domain, uint32_t xrc_qp_num, + struct ibv_qp_attr *attr, int attr_mask); + +/** + * ibv_query_xrc_rcv_qp - queries an xrc_rcv qp. + * + * @xrc_domain: xrc domain the QP belongs to (for verification). + * @xrc_qp_num: The (24 bit) number of the XRC QP. + * @attr: for returning qp attributes. + * @attr_mask: bitmap indicating which attributes to return. + * @init_attr: for returning the init attributes + * + * RETURNS: success (0), or a (negative) error value. + * + */ +int ibv_query_xrc_rcv_qp(struct ibv_xrc_domain *xrc_domain, uint32_t xrc_qp_num, + struct ibv_qp_attr *attr, int attr_mask, + struct ibv_qp_init_attr *init_attr); + +/** + * ibv_reg_xrc_rcv_qp: registers a user process with an XRC QP which serves as + * a receive-side only QP. + * + * @xrc_domain: xrc domain the QP belongs to (for verification). + * @xrc_qp_num: The (24 bit) number of the XRC QP. + * + * RETURNS: success (0), + * or error (-EINVAL), if: + * 1. There is no such QP_num allocated. + * 2. The QP is allocated, but is not an receive XRC QP + * 3. The XRC QP does not belong to the given domain. + */ +int ibv_reg_xrc_rcv_qp(struct ibv_xrc_domain *xrc_domain, uint32_t xrc_qp_num); + +/** + * ibv_unreg_xrc_rcv_qp: detaches a user process from an XRC QP serving as + * a receive-side only QP. If as a result, there are no remaining + * userspace processes registered for this XRC QP, it is destroyed. + * + * @xrc_domain: xrc domain the QP belongs to (for verification). + * @xrc_qp_num: The (24 bit) number of the XRC QP. + * + * RETURNS: success (0), + * or error (-EINVAL), if: + * 1. There is no such QP_num allocated. 
+ * 2. The QP is allocated, but is not an XRC QP + * 3. The XRC QP does not belong to the given domain. + * NOTE: I don't see any reason to return a special code if the QP is destroyed -- the unregister simply + * succeeds. + */ +int ibv_unreg_xrc_rcv_qp(struct ibv_xrc_domain *xrc_domain, uint32_t xrc_qp_num); + END_C_DECLS # undef __attribute_const diff --git a/src/cmd.c b/src/cmd.c index 31b592e..2857a6c 100644 --- a/src/cmd.c +++ b/src/cmd.c @@ -248,7 +248,7 @@ int ibv_cmd_reg_mr(struct ibv_pd *pd, void *addr, size_t length, if (write(pd->context->cmd_fd, cmd, cmd_size) != cmd_size) return errno; - VALGRIND_MAKE_MEM_DEFINED(&resp, sizeof resp); + VALGRIND_MAKE_MEM_DEFINED(resp, resp_size); mr->handle = resp->mr_handle; mr->lkey = resp->lkey; @@ -291,7 +291,7 @@ static int ibv_cmd_create_cq_v2(struct ibv_context *context, int cqe, if (write(context->cmd_fd, cmd, cmd_size) != cmd_size) return errno; - VALGRIND_MAKE_MEM_DEFINED(resp, sizeof resp_size); + VALGRIND_MAKE_MEM_DEFINED(resp, resp_size); cq->handle = resp->cq_handle; cq->cqe = resp->cqe; @@ -432,6 +432,7 @@ int ibv_cmd_destroy_cq(struct ibv_cq *cq) IBV_INIT_CMD_RESP(&cmd, sizeof cmd, DESTROY_CQ, &resp, sizeof resp); cmd.cq_handle = cq->handle; + cmd.reserved = 0; if (write(cq->context->cmd_fd, &cmd, sizeof cmd) != sizeof cmd) return errno; @@ -482,6 +483,34 @@ int ibv_cmd_create_srq(struct ibv_pd *pd, return 0; } +int ibv_cmd_create_xrc_srq(struct ibv_pd *pd, + struct ibv_srq *srq, struct ibv_srq_init_attr *attr, + uint32_t xrcd_handle, uint32_t xrc_cq, + struct ibv_create_xrc_srq *cmd, size_t cmd_size, + struct ibv_create_srq_resp *resp, size_t resp_size) +{ + IBV_INIT_CMD_RESP(cmd, cmd_size, CREATE_XRC_SRQ, resp, resp_size); + cmd->user_handle = (uintptr_t) srq; + cmd->pd_handle = pd->handle; + cmd->max_wr = attr->attr.max_wr; + cmd->max_sge = attr->attr.max_sge; + cmd->srq_limit = attr->attr.srq_limit; + cmd->xrcd_handle = xrcd_handle; + cmd->xrc_cq = xrc_cq; + + if (write(pd->context->cmd_fd, cmd, 
cmd_size) != cmd_size) + return errno; + + VALGRIND_MAKE_MEM_DEFINED(resp, resp_size); + + srq->handle = resp->srq_handle; + srq->context = pd->context; + attr->attr.max_wr = resp->max_wr; + attr->attr.max_sge = resp->max_sge; + + return 0; +} + static int ibv_cmd_modify_srq_v3(struct ibv_srq *srq, struct ibv_srq_attr *srq_attr, enum ibv_srq_attr_mask srq_attr_mask, @@ -539,10 +568,13 @@ int ibv_cmd_query_srq(struct ibv_srq *srq, struct ibv_srq_attr *srq_attr, IBV_INIT_CMD_RESP(cmd, cmd_size, QUERY_SRQ, &resp, sizeof resp); cmd->srq_handle = srq->handle; + cmd->reserved = 0; if (write(srq->context->cmd_fd, cmd, cmd_size) != cmd_size) return errno; + VALGRIND_MAKE_MEM_DEFINED(&resp, sizeof resp); + srq_attr->max_wr = resp.max_wr; srq_attr->max_sge = resp.max_sge; srq_attr->srq_limit = resp.srq_limit; @@ -573,10 +605,13 @@ int ibv_cmd_destroy_srq(struct ibv_srq *srq) IBV_INIT_CMD_RESP(&cmd, sizeof cmd, DESTROY_SRQ, &resp, sizeof resp); cmd.srq_handle = srq->handle; + cmd.reserved = 0; if (write(srq->context->cmd_fd, &cmd, sizeof cmd) != sizeof cmd) return errno; + VALGRIND_MAKE_MEM_DEFINED(&resp, sizeof resp); + pthread_mutex_lock(&srq->mutex); while (srq->events_completed != resp.events_reported) pthread_cond_wait(&srq->cond, &srq->mutex); @@ -596,7 +631,6 @@ int ibv_cmd_create_qp(struct ibv_pd *pd, cmd->pd_handle = pd->handle; cmd->send_cq_handle = attr->send_cq->handle; cmd->recv_cq_handle = attr->recv_cq->handle; - cmd->srq_handle = attr->srq ? attr->srq->handle : 0; cmd->max_send_wr = attr->cap.max_send_wr; cmd->max_recv_wr = attr->cap.max_recv_wr; cmd->max_send_sge = attr->cap.max_send_sge; @@ -605,6 +639,9 @@ int ibv_cmd_create_qp(struct ibv_pd *pd, cmd->sq_sig_all = attr->sq_sig_all; cmd->qp_type = attr->qp_type; cmd->is_srq = !!attr->srq; + cmd->srq_handle = attr->qp_type == IBV_QPT_XRC ? + (attr->xrc_domain ? attr->xrc_domain->handle : 0) : + (attr->srq ? 
attr->srq->handle : 0); cmd->reserved = 0; if (write(pd->context->cmd_fd, cmd, cmd_size) != cmd_size) @@ -657,6 +694,8 @@ int ibv_cmd_query_qp(struct ibv_qp *qp, struct ibv_qp_attr *attr, if (write(qp->context->cmd_fd, cmd, cmd_size) != cmd_size) return errno; + VALGRIND_MAKE_MEM_DEFINED(&resp, sizeof resp); + attr->qkey = resp.qkey; attr->rq_psn = resp.rq_psn; attr->sq_psn = resp.sq_psn; @@ -713,6 +752,8 @@ int ibv_cmd_query_qp(struct ibv_qp *qp, struct ibv_qp_attr *attr, init_attr->recv_cq = qp->recv_cq; init_attr->srq = qp->srq; init_attr->qp_type = qp->qp_type; + if (qp->qp_type == IBV_QPT_XRC) + init_attr->xrc_domain = qp->xrc_domain; init_attr->cap.max_send_wr = resp.max_send_wr; init_attr->cap.max_recv_wr = resp.max_recv_wr; init_attr->cap.max_send_sge = resp.max_send_sge; @@ -787,6 +828,186 @@ int ibv_cmd_modify_qp(struct ibv_qp *qp, struct ibv_qp_attr *attr, return 0; } +int ibv_cmd_create_xrc_rcv_qp(struct ibv_qp_init_attr *init_attr, + uint32_t *xrc_rcv_qpn) +{ + struct ibv_create_xrc_rcv_qp cmd; + struct ibv_create_xrc_rcv_qp_resp resp; + + if (abi_ver < 6) + return ENOSYS; + + IBV_INIT_CMD_RESP(&cmd, sizeof cmd, CREATE_XRC_RCV_QP, &resp, sizeof resp); + + cmd.xrc_domain_handle = init_attr->xrc_domain->handle; + cmd.max_send_wr = init_attr->cap.max_send_wr; + cmd.max_recv_wr = init_attr->cap.max_recv_wr; + cmd.max_send_sge = init_attr->cap.max_send_sge; + cmd.max_recv_sge = init_attr->cap.max_recv_sge; + cmd.max_inline_data = init_attr->cap.max_inline_data; + cmd.sq_sig_all = init_attr->sq_sig_all; + cmd.qp_type = init_attr->qp_type; + cmd.reserved[0] = cmd.reserved[1] = 0; + + if (write(init_attr->xrc_domain->context->cmd_fd, &cmd, sizeof cmd) != + sizeof cmd) + return errno; + + *xrc_rcv_qpn = resp.qpn; + + return 0; +} + +int ibv_cmd_modify_xrc_rcv_qp(struct ibv_xrc_domain *d, uint32_t xrc_qp_num, + struct ibv_qp_attr *attr, int attr_mask) +{ + struct ibv_modify_xrc_rcv_qp cmd; + + if (abi_ver < 6) + return ENOSYS; + + IBV_INIT_CMD(&cmd, sizeof cmd, 
MODIFY_XRC_RCV_QP); + + cmd.xrc_domain_handle = d->handle; + cmd.qp_num = xrc_qp_num; + cmd.attr_mask = attr_mask; + cmd.qkey = attr->qkey; + cmd.rq_psn = attr->rq_psn; + cmd.sq_psn = attr->sq_psn; + cmd.dest_qp_num = attr->dest_qp_num; + cmd.qp_access_flags = attr->qp_access_flags; + cmd.pkey_index = attr->pkey_index; + cmd.alt_pkey_index = attr->alt_pkey_index; + cmd.qp_state = attr->qp_state; + cmd.cur_qp_state = attr->cur_qp_state; + cmd.path_mtu = attr->path_mtu; + cmd.path_mig_state = attr->path_mig_state; + cmd.en_sqd_async_notify = attr->en_sqd_async_notify; + cmd.max_rd_atomic = attr->max_rd_atomic; + cmd.max_dest_rd_atomic = attr->max_dest_rd_atomic; + cmd.min_rnr_timer = attr->min_rnr_timer; + cmd.port_num = attr->port_num; + cmd.timeout = attr->timeout; + cmd.retry_cnt = attr->retry_cnt; + cmd.rnr_retry = attr->rnr_retry; + cmd.alt_port_num = attr->alt_port_num; + cmd.alt_timeout = attr->alt_timeout; + + memcpy(cmd.dest.dgid, attr->ah_attr.grh.dgid.raw, 16); + cmd.dest.flow_label = attr->ah_attr.grh.flow_label; + cmd.dest.dlid = attr->ah_attr.dlid; + cmd.dest.reserved = 0; + cmd.dest.sgid_index = attr->ah_attr.grh.sgid_index; + cmd.dest.hop_limit = attr->ah_attr.grh.hop_limit; + cmd.dest.traffic_class = attr->ah_attr.grh.traffic_class; + cmd.dest.sl = attr->ah_attr.sl; + cmd.dest.src_path_bits = attr->ah_attr.src_path_bits; + cmd.dest.static_rate = attr->ah_attr.static_rate; + cmd.dest.is_global = attr->ah_attr.is_global; + cmd.dest.port_num = attr->ah_attr.port_num; + + memcpy(cmd.alt_dest.dgid, attr->alt_ah_attr.grh.dgid.raw, 16); + cmd.alt_dest.flow_label = attr->alt_ah_attr.grh.flow_label; + cmd.alt_dest.dlid = attr->alt_ah_attr.dlid; + cmd.alt_dest.reserved = 0; + cmd.alt_dest.sgid_index = attr->alt_ah_attr.grh.sgid_index; + cmd.alt_dest.hop_limit = attr->alt_ah_attr.grh.hop_limit; + cmd.alt_dest.traffic_class = attr->alt_ah_attr.grh.traffic_class; + cmd.alt_dest.sl = attr->alt_ah_attr.sl; + cmd.alt_dest.src_path_bits = 
attr->alt_ah_attr.src_path_bits; + cmd.alt_dest.static_rate = attr->alt_ah_attr.static_rate; + cmd.alt_dest.is_global = attr->alt_ah_attr.is_global; + cmd.alt_dest.port_num = attr->alt_ah_attr.port_num; + + cmd.reserved[0] = cmd.reserved[1] = 0; + + if (write(d->context->cmd_fd, &cmd, sizeof cmd) != sizeof cmd) + return errno; + + return 0; +} + +int ibv_cmd_query_xrc_rcv_qp(struct ibv_xrc_domain *d, uint32_t xrc_qp_num, + struct ibv_qp_attr *attr, int attr_mask, + struct ibv_qp_init_attr *init_attr) +{ + struct ibv_query_xrc_rcv_qp cmd; + struct ibv_query_qp_resp resp; + + if (abi_ver < 6) + return ENOSYS; + + IBV_INIT_CMD_RESP(&cmd, sizeof cmd, QUERY_XRC_RCV_QP, &resp, sizeof resp); + cmd.xrc_domain_handle = d->handle; + cmd.qp_num = xrc_qp_num; + cmd.attr_mask = attr_mask; + + if (write(d->context->cmd_fd, &cmd, sizeof cmd) != sizeof cmd) + return errno; + + VALGRIND_MAKE_MEM_DEFINED(&resp, sizeof resp); + + attr->qkey = resp.qkey; + attr->rq_psn = resp.rq_psn; + attr->sq_psn = resp.sq_psn; + attr->dest_qp_num = resp.dest_qp_num; + attr->qp_access_flags = resp.qp_access_flags; + attr->pkey_index = resp.pkey_index; + attr->alt_pkey_index = resp.alt_pkey_index; + attr->qp_state = resp.qp_state; + attr->cur_qp_state = resp.cur_qp_state; + attr->path_mtu = resp.path_mtu; + attr->path_mig_state = resp.path_mig_state; + attr->sq_draining = resp.sq_draining; + attr->max_rd_atomic = resp.max_rd_atomic; + attr->max_dest_rd_atomic = resp.max_dest_rd_atomic; + attr->min_rnr_timer = resp.min_rnr_timer; + attr->port_num = resp.port_num; + attr->timeout = resp.timeout; + attr->retry_cnt = resp.retry_cnt; + attr->rnr_retry = resp.rnr_retry; + attr->alt_port_num = resp.alt_port_num; + attr->alt_timeout = resp.alt_timeout; + attr->cap.max_send_wr = resp.max_send_wr; + attr->cap.max_recv_wr = resp.max_recv_wr; + attr->cap.max_send_sge = resp.max_send_sge; + attr->cap.max_recv_sge = resp.max_recv_sge; + attr->cap.max_inline_data = resp.max_inline_data; + + 
memcpy(attr->ah_attr.grh.dgid.raw, resp.dest.dgid, 16); + attr->ah_attr.grh.flow_label = resp.dest.flow_label; + attr->ah_attr.dlid = resp.dest.dlid; + attr->ah_attr.grh.sgid_index = resp.dest.sgid_index; + attr->ah_attr.grh.hop_limit = resp.dest.hop_limit; + attr->ah_attr.grh.traffic_class = resp.dest.traffic_class; + attr->ah_attr.sl = resp.dest.sl; + attr->ah_attr.src_path_bits = resp.dest.src_path_bits; + attr->ah_attr.static_rate = resp.dest.static_rate; + attr->ah_attr.is_global = resp.dest.is_global; + attr->ah_attr.port_num = resp.dest.port_num; + + memcpy(attr->alt_ah_attr.grh.dgid.raw, resp.alt_dest.dgid, 16); + attr->alt_ah_attr.grh.flow_label = resp.alt_dest.flow_label; + attr->alt_ah_attr.dlid = resp.alt_dest.dlid; + attr->alt_ah_attr.grh.sgid_index = resp.alt_dest.sgid_index; + attr->alt_ah_attr.grh.hop_limit = resp.alt_dest.hop_limit; + attr->alt_ah_attr.grh.traffic_class = resp.alt_dest.traffic_class; + attr->alt_ah_attr.sl = resp.alt_dest.sl; + attr->alt_ah_attr.src_path_bits = resp.alt_dest.src_path_bits; + attr->alt_ah_attr.static_rate = resp.alt_dest.static_rate; + attr->alt_ah_attr.is_global = resp.alt_dest.is_global; + attr->alt_ah_attr.port_num = resp.alt_dest.port_num; + + init_attr->cap.max_send_wr = resp.max_send_wr; + init_attr->cap.max_recv_wr = resp.max_recv_wr; + init_attr->cap.max_send_sge = resp.max_send_sge; + init_attr->cap.max_recv_sge = resp.max_recv_sge; + init_attr->cap.max_inline_data = resp.max_inline_data; + init_attr->sq_sig_all = resp.sq_sig_all; + + return 0; +} + static int ibv_cmd_destroy_qp_v1(struct ibv_qp *qp) { struct ibv_destroy_qp_v1 cmd; @@ -1067,6 +1288,7 @@ int ibv_cmd_destroy_qp(struct ibv_qp *qp) IBV_INIT_CMD_RESP(&cmd, sizeof cmd, DESTROY_QP, &resp, sizeof resp); cmd.qp_handle = qp->handle; + cmd.reserved = 0; if (write(qp->context->cmd_fd, &cmd, sizeof cmd) != sizeof cmd) return errno; @@ -1089,6 +1311,7 @@ int ibv_cmd_attach_mcast(struct ibv_qp *qp, union ibv_gid *gid, uint16_t lid) memcpy(cmd.gid, 
gid->raw, sizeof cmd.gid); cmd.qp_handle = qp->handle; cmd.mlid = lid; + cmd.reserved = 0; if (write(qp->context->cmd_fd, &cmd, sizeof cmd) != sizeof cmd) return errno; @@ -1104,9 +1327,81 @@ int ibv_cmd_detach_mcast(struct ibv_qp *qp, union ibv_gid *gid, uint16_t lid) memcpy(cmd.gid, gid->raw, sizeof cmd.gid); cmd.qp_handle = qp->handle; cmd.mlid = lid; + cmd.reserved = 0; if (write(qp->context->cmd_fd, &cmd, sizeof cmd) != sizeof cmd) return errno; return 0; } + +int ibv_cmd_open_xrc_domain(struct ibv_context *context, int fd, int oflag, + struct ibv_xrc_domain *d, + struct ibv_open_xrc_domain_resp *resp, + size_t resp_size) +{ + struct ibv_open_xrc_domain cmd; + + if (abi_ver < 6) + return ENOSYS; + + IBV_INIT_CMD_RESP(&cmd, sizeof cmd, OPEN_XRC_DOMAIN, resp, resp_size); + cmd.fd = fd; + cmd.oflags = oflag; + + if (write(context->cmd_fd, &cmd, sizeof cmd) != sizeof cmd) + return errno; + + d->handle = resp->xrcd_handle; + + return 0; +} + +int ibv_cmd_close_xrc_domain(struct ibv_xrc_domain *d) +{ + struct ibv_close_xrc_domain cmd; + + if (abi_ver < 6) + return ENOSYS; + + IBV_INIT_CMD(&cmd, sizeof cmd, CLOSE_XRC_DOMAIN); + cmd.xrcd_handle = d->handle; + + if (write(d->context->cmd_fd, &cmd, sizeof cmd) != sizeof cmd) + return errno; + return 0; +} + +int ibv_cmd_reg_xrc_rcv_qp(struct ibv_xrc_domain *d, uint32_t xrc_qp_num) +{ + struct ibv_reg_xrc_rcv_qp cmd; + + if (abi_ver < 6) + return ENOSYS; + + IBV_INIT_CMD(&cmd, sizeof cmd, REG_XRC_RCV_QP); + cmd.xrc_domain_handle = d->handle; + cmd.qp_num = xrc_qp_num; + + if (write(d->context->cmd_fd, &cmd, sizeof cmd) != sizeof cmd) + return errno; + return 0; +} + +int ibv_cmd_unreg_xrc_rcv_qp(struct ibv_xrc_domain *d, uint32_t xrc_qp_num) +{ + struct ibv_unreg_xrc_rcv_qp cmd; + + if (abi_ver < 6) + return ENOSYS; + + IBV_INIT_CMD(&cmd, sizeof cmd, UNREG_XRC_RCV_QP); + cmd.xrc_domain_handle = d->handle; + cmd.qp_num = xrc_qp_num; + + if (write(d->context->cmd_fd, &cmd, sizeof cmd) != sizeof cmd) + return errno; + 
return 0; +} + + diff --git a/src/device.c b/src/device.c index 3abc1eb..8af0eaa 100644 --- a/src/device.c +++ b/src/device.c @@ -182,31 +182,33 @@ int __ibv_get_async_event(struct ibv_context *context, event->event_type = ev.event_type; - switch (event->event_type) { - case IBV_EVENT_CQ_ERR: - event->element.cq = (void *) (uintptr_t) ev.element; - break; - - case IBV_EVENT_QP_FATAL: - case IBV_EVENT_QP_REQ_ERR: - case IBV_EVENT_QP_ACCESS_ERR: - case IBV_EVENT_COMM_EST: - case IBV_EVENT_SQ_DRAINED: - case IBV_EVENT_PATH_MIG: - case IBV_EVENT_PATH_MIG_ERR: - case IBV_EVENT_QP_LAST_WQE_REACHED: - event->element.qp = (void *) (uintptr_t) ev.element; - break; - - case IBV_EVENT_SRQ_ERR: - case IBV_EVENT_SRQ_LIMIT_REACHED: - event->element.srq = (void *) (uintptr_t) ev.element; - break; - - default: - event->element.port_num = ev.element; - break; - } + if (event->event_type & IBV_XRC_QP_EVENT_FLAG) { + event->element.xrc_qp_num = ev.element; + } else + switch (event->event_type) { + case IBV_EVENT_CQ_ERR: + event->element.cq = (void *) (uintptr_t) ev.element; + break; + + case IBV_EVENT_QP_FATAL: + case IBV_EVENT_QP_REQ_ERR: + case IBV_EVENT_QP_ACCESS_ERR: + case IBV_EVENT_COMM_EST: + case IBV_EVENT_SQ_DRAINED: + case IBV_EVENT_PATH_MIG: + case IBV_EVENT_PATH_MIG_ERR: + case IBV_EVENT_QP_LAST_WQE_REACHED: + event->element.qp = (void *) (uintptr_t) ev.element; + break; + + case IBV_EVENT_SRQ_ERR: + case IBV_EVENT_SRQ_LIMIT_REACHED: + event->element.srq = (void *) (uintptr_t) ev.element; + break; + default: + event->element.port_num = ev.element; + break; + } if (context->ops.async_event) context->ops.async_event(event); diff --git a/src/libibverbs.map b/src/libibverbs.map index 3a346ed..dfa53a4 100644 --- a/src/libibverbs.map +++ b/src/libibverbs.map @@ -91,4 +91,15 @@ IBVERBS_1.1 { ibv_dontfork_range; ibv_dofork_range; ibv_register_driver; + ibv_create_xrc_srq; + ibv_cmd_create_xrc_srq; + ibv_open_xrc_domain; + ibv_cmd_open_xrc_domain; + ibv_close_xrc_domain; + 
ibv_cmd_close_xrc_domain; + ibv_cmd_create_xrc_rcv_qp; + ibv_cmd_modify_xrc_rcv_qp; + ibv_cmd_query_xrc_rcv_qp; + ibv_cmd_reg_xrc_rcv_qp; + ibv_cmd_unreg_xrc_rcv_qp; } IBVERBS_1.0; diff --git a/src/verbs.c b/src/verbs.c index f5cf4d3..11d3c4c 100644 --- a/src/verbs.c +++ b/src/verbs.c @@ -364,6 +364,9 @@ struct ibv_srq *__ibv_create_srq(struct ibv_pd *pd, srq->context = pd->context; srq->srq_context = srq_init_attr->srq_context; srq->pd = pd; + srq->xrc_domain = NULL; + srq->xrc_cq = NULL; + srq->xrc_srq_num = 0; srq->events_completed = 0; pthread_mutex_init(&srq->mutex, NULL); pthread_cond_init(&srq->cond, NULL); @@ -373,6 +376,33 @@ struct ibv_srq *__ibv_create_srq(struct ibv_pd *pd, } default_symver(__ibv_create_srq, ibv_create_srq); +struct ibv_srq *__ibv_create_xrc_srq(struct ibv_pd *pd, + struct ibv_xrc_domain *xrc_domain, + struct ibv_cq *xrc_cq, + struct ibv_srq_init_attr *srq_init_attr) +{ + struct ibv_srq *srq; + + if (!pd->context->xrc_ops) + return NULL; + + srq = pd->context->xrc_ops->create_xrc_srq(pd, xrc_domain, + xrc_cq, srq_init_attr); + if (srq) { + srq->context = pd->context; + srq->srq_context = srq_init_attr->srq_context; + srq->pd = pd; + srq->xrc_domain = xrc_domain; + srq->xrc_cq = xrc_cq; + srq->events_completed = 0; + pthread_mutex_init(&srq->mutex, NULL); + pthread_cond_init(&srq->cond, NULL); + } + + return srq; +} +default_symver(__ibv_create_xrc_srq, ibv_create_xrc_srq); + int __ibv_modify_srq(struct ibv_srq *srq, struct ibv_srq_attr *srq_attr, enum ibv_srq_attr_mask srq_attr_mask) @@ -396,8 +426,9 @@ default_symver(__ibv_destroy_srq, ibv_destroy_srq); struct ibv_qp *__ibv_create_qp(struct ibv_pd *pd, struct ibv_qp_init_attr *qp_init_attr) { - struct ibv_qp *qp = pd->context->ops.create_qp(pd, qp_init_attr); + struct ibv_qp *qp; + qp = pd->context->ops.create_qp(pd, qp_init_attr); if (qp) { qp->context = pd->context; qp->qp_context = qp_init_attr->qp_context; @@ -408,6 +439,8 @@ struct ibv_qp *__ibv_create_qp(struct ibv_pd *pd, 
qp->qp_type = qp_init_attr->qp_type; qp->state = IBV_QPS_RESET; qp->events_completed = 0; + qp->xrc_domain = qp_init_attr->qp_type == IBV_QPT_XRC ? + qp_init_attr->xrc_domain : NULL; pthread_mutex_init(&qp->mutex, NULL); pthread_cond_init(&qp->cond, NULL); } @@ -541,3 +574,92 @@ int __ibv_detach_mcast(struct ibv_qp *qp, union ibv_gid *gid, uint16_t lid) return qp->context->ops.detach_mcast(qp, gid, lid); } default_symver(__ibv_detach_mcast, ibv_detach_mcast); + +struct ibv_xrc_domain *__ibv_open_xrc_domain(struct ibv_context *context, + int fd, int oflag) +{ + struct ibv_xrc_domain *d; + + if (!context->xrc_ops) + return NULL; + + d = context->xrc_ops->open_xrc_domain(context, fd, oflag); + if (d) + d->context = context; + + return d; +} +default_symver(__ibv_open_xrc_domain, ibv_open_xrc_domain); + +int __ibv_close_xrc_domain(struct ibv_xrc_domain *d) +{ + if (!d->context->xrc_ops) + return 0; + + return d->context->xrc_ops->close_xrc_domain(d); +} +default_symver(__ibv_close_xrc_domain, ibv_close_xrc_domain); + +int __ibv_create_xrc_rcv_qp(struct ibv_qp_init_attr *init_attr, + uint32_t *xrc_rcv_qpn) +{ + struct ibv_context *c; + if (!init_attr || !(init_attr->xrc_domain)) + return EINVAL; + + c = init_attr->xrc_domain->context; + if (!c->xrc_ops) + return ENOSYS; + + return c->xrc_ops->create_xrc_rcv_qp(init_attr, + xrc_rcv_qpn); +} +default_symver(__ibv_create_xrc_rcv_qp, ibv_create_xrc_rcv_qp); + +int __ibv_modify_xrc_rcv_qp(struct ibv_xrc_domain *d, + uint32_t xrc_rcv_qpn, + struct ibv_qp_attr *attr, + int attr_mask) +{ + if (!d || !attr) + return EINVAL; + + if (!d->context->xrc_ops) + return ENOSYS; + + return d->context->xrc_ops->modify_xrc_rcv_qp(d, xrc_rcv_qpn, attr, + attr_mask); +} +default_symver(__ibv_modify_xrc_rcv_qp, ibv_modify_xrc_rcv_qp); + +int __ibv_query_xrc_rcv_qp(struct ibv_xrc_domain *d, + uint32_t xrc_rcv_qpn, + struct ibv_qp_attr *attr, + int attr_mask, + struct ibv_qp_init_attr *init_attr) +{ + if (!d) + return EINVAL; + + if 
(!d->context->xrc_ops) + return ENOSYS; + + return d->context->xrc_ops->query_xrc_rcv_qp(d, xrc_rcv_qpn, attr, + attr_mask, init_attr); +} +default_symver(__ibv_query_xrc_rcv_qp, ibv_query_xrc_rcv_qp); + +int __ibv_reg_xrc_rcv_qp(struct ibv_xrc_domain *d, + uint32_t xrc_rcv_qpn) +{ + return d->context->xrc_ops->reg_xrc_rcv_qp(d, xrc_rcv_qpn); +} +default_symver(__ibv_reg_xrc_rcv_qp, ibv_reg_xrc_rcv_qp); + +int __ibv_unreg_xrc_rcv_qp(struct ibv_xrc_domain *d, + uint32_t xrc_rcv_qpn) +{ + return d->context->xrc_ops->unreg_xrc_rcv_qp(d, xrc_rcv_qpn); +} +default_symver(__ibv_unreg_xrc_rcv_qp, ibv_unreg_xrc_rcv_qp); + From jackm at dev.mellanox.co.il Wed Jan 23 02:00:03 2008 From: jackm at dev.mellanox.co.il (Jack Morgenstein) Date: Wed, 23 Jan 2008 12:00:03 +0200 Subject: [ofa-general] [PATCH 3/8] mlx4: implement XRC qps (including XRC receive-only) Message-ID: <200801231200.03980.jackm@dev.mellanox.co.il> mlx4: Implements XRC support for userspace XRC QPs. Changes: Added support for XRC RCV-only QP (requested by userspace, but resides in kernel space). 
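For readers following the reg/unreg part of this patch: the XRC RCV-only QP lives in the kernel and is kept alive by a list of registered user contexts. Below is a minimal userspace sketch of that bookkeeping, assuming a simplified model. The names rcv_qp, reg_entry, rcv_qp_reg and rcv_qp_unreg are hypothetical and are not the mlx4 API; locking is omitted, and the -1 return stands in for the kernel's error codes.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical userspace model; NOT the mlx4 kernel code. */
struct reg_entry {
	struct reg_entry *next;
	void *context;		/* one registered user context */
};

struct rcv_qp {
	struct reg_entry *reg_list;	/* contexts registered with this QP */
	int destroyed;			/* set once the last context leaves */
};

/* Register a context; registering the same context twice is a no-op. */
static int rcv_qp_reg(struct rcv_qp *qp, void *context)
{
	struct reg_entry *e;

	for (e = qp->reg_list; e; e = e->next)
		if (e->context == context)
			return 0;

	e = malloc(sizeof *e);
	if (!e)
		return -1;

	e->context = context;
	e->next = qp->reg_list;
	qp->reg_list = e;
	return 0;
}

/*
 * Unregister a context. When the list becomes empty the QP is marked
 * destroyed, standing in for the destroy call made in the patch.
 * Returns -1 if the context was never registered.
 */
static int rcv_qp_unreg(struct rcv_qp *qp, void *context)
{
	struct reg_entry **pp, *e;

	for (pp = &qp->reg_list; (e = *pp) != NULL; pp = &e->next) {
		if (e->context == context) {
			*pp = e->next;
			free(e);
			if (!qp->reg_list)
				qp->destroyed = 1;
			return 0;
		}
	}
	return -1;
}
```

In this sketch the emptiness check and the destroy step happen inside one call with no locking at all. The patch itself drops the mutex before checking the list, which is the hole its own comment points out.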
Signed-off-by: Jack Morgenstein Index: infiniband/include/linux/mlx4/device.h =================================================================== --- infiniband.orig/include/linux/mlx4/device.h 2008-01-22 18:41:24.000000000 +0200 +++ infiniband/include/linux/mlx4/device.h 2008-01-22 19:11:45.000000000 +0200 @@ -56,6 +56,7 @@ enum { MLX4_DEV_CAP_FLAG_RC = 1 << 0, MLX4_DEV_CAP_FLAG_UC = 1 << 1, MLX4_DEV_CAP_FLAG_UD = 1 << 2, + MLX4_DEV_CAP_FLAG_XRC = 1 << 3, MLX4_DEV_CAP_FLAG_SRQ = 1 << 6, MLX4_DEV_CAP_FLAG_IPOIB_CSUM = 1 << 7, MLX4_DEV_CAP_FLAG_BAD_PKEY_CNTR = 1 << 8, @@ -176,6 +177,8 @@ struct mlx4_caps { int num_pds; int reserved_pds; int mtt_entry_sz; + int reserved_xrcds; + int max_xrcds; u32 max_msg_sz; u32 page_size_cap; u32 flags; @@ -312,6 +315,9 @@ void mlx4_buf_free(struct mlx4_dev *dev, int mlx4_pd_alloc(struct mlx4_dev *dev, u32 *pdn); void mlx4_pd_free(struct mlx4_dev *dev, u32 pdn); +int mlx4_xrcd_alloc(struct mlx4_dev *dev, u32 *xrcdn); +void mlx4_xrcd_free(struct mlx4_dev *dev, u32 xrcdn); + int mlx4_uar_alloc(struct mlx4_dev *dev, struct mlx4_uar *uar); void mlx4_uar_free(struct mlx4_dev *dev, struct mlx4_uar *uar); @@ -336,8 +342,8 @@ void mlx4_cq_free(struct mlx4_dev *dev, int mlx4_qp_alloc(struct mlx4_dev *dev, int sqpn, struct mlx4_qp *qp); void mlx4_qp_free(struct mlx4_dev *dev, struct mlx4_qp *qp); -int mlx4_srq_alloc(struct mlx4_dev *dev, u32 pdn, struct mlx4_mtt *mtt, - u64 db_rec, struct mlx4_srq *srq); +int mlx4_srq_alloc(struct mlx4_dev *dev, u32 pdn, u32 cqn, u16 xrcd, + struct mlx4_mtt *mtt, u64 db_rec, struct mlx4_srq *srq); void mlx4_srq_free(struct mlx4_dev *dev, struct mlx4_srq *srq); int mlx4_srq_arm(struct mlx4_dev *dev, struct mlx4_srq *srq, int limit_watermark); int mlx4_srq_query(struct mlx4_dev *dev, struct mlx4_srq *srq, int *limit_watermark); Index: infiniband/drivers/infiniband/hw/mlx4/main.c =================================================================== --- infiniband.orig/drivers/infiniband/hw/mlx4/main.c 2008-01-22 
18:41:12.000000000 +0200 +++ infiniband/drivers/infiniband/hw/mlx4/main.c 2008-01-22 19:45:11.000000000 +0200 @@ -99,6 +99,8 @@ static int mlx4_ib_query_device(struct i props->device_cap_flags |= IB_DEVICE_AUTO_PATH_MIG; if (dev->dev->caps.flags & MLX4_DEV_CAP_FLAG_UD_AV_PORT) props->device_cap_flags |= IB_DEVICE_UD_AV_PORT_ENFORCE; + if (dev->dev->caps.flags & MLX4_DEV_CAP_FLAG_XRC) + props->device_cap_flags |= IB_DEVICE_XRC; props->vendor_id = be32_to_cpup((__be32 *) (out_mad->data + 36)) & 0xffffff; @@ -406,6 +408,7 @@ static struct ib_pd *mlx4_ib_alloc_pd(st if (!pd) return ERR_PTR(-ENOMEM); + memset(pd, 0, sizeof *pd); err = mlx4_pd_alloc(to_mdev(ibdev)->dev, &pd->pdn); if (err) { kfree(pd); @@ -442,6 +445,80 @@ static int mlx4_ib_mcg_detach(struct ib_ &to_mqp(ibqp)->mqp, gid->raw); } +static void mlx4_dummy_comp_handler(struct ib_cq *cq, void *cq_context) +{ +} + +static struct ib_xrcd *mlx4_ib_alloc_xrcd(struct ib_device *ibdev, + struct ib_ucontext *context, + struct ib_udata *udata) +{ + struct mlx4_ib_xrcd *xrcd; + struct mlx4_ib_dev *mdev = to_mdev(ibdev); + struct ib_pd *pd; + struct ib_cq *cq; + int err; + + if (!(mdev->dev->caps.flags & MLX4_DEV_CAP_FLAG_XRC)) + return ERR_PTR(-ENOSYS); + + xrcd = kmalloc(sizeof *xrcd, GFP_KERNEL); + if (!xrcd) + return ERR_PTR(-ENOMEM); + + err = mlx4_xrcd_alloc(mdev->dev, &xrcd->xrcdn); + if (err) + goto err_xrcd; + + pd = mlx4_ib_alloc_pd(ibdev,NULL,NULL); + if (IS_ERR(pd)) { + err = PTR_ERR(pd); + goto err_pd; + } + pd->device = ibdev; + + cq = mlx4_ib_create_cq(ibdev, 1, 0, NULL, NULL); + if (IS_ERR(cq)) { + err = PTR_ERR(cq); + goto err_cq; + } + cq->device = ibdev; + cq->comp_handler = mlx4_dummy_comp_handler; + + if (context) + if (ib_copy_to_udata(udata, &xrcd->xrcdn, sizeof (__u32))) { + err = -EFAULT; + goto err_copy; + } + + xrcd->cq = cq; + xrcd->pd = pd; + return &xrcd->ibxrcd; + +err_copy: + mlx4_ib_destroy_cq(cq); +err_cq: + mlx4_ib_dealloc_pd(pd); +err_pd: + mlx4_xrcd_free(mdev->dev, xrcd->xrcdn); 
+err_xrcd: + kfree(xrcd); + return ERR_PTR(err); +} + +static int mlx4_ib_dealloc_xrcd(struct ib_xrcd *xrcd) +{ + struct mlx4_ib_xrcd *mxrcd = to_mxrcd(xrcd); + + mlx4_ib_destroy_cq(mxrcd->cq); + mlx4_ib_dealloc_pd(mxrcd->pd); + mlx4_xrcd_free(to_mdev(xrcd->device)->dev, to_mxrcd(xrcd)->xrcdn); + kfree(xrcd); + + return 0; +} + + static int init_node_data(struct mlx4_ib_dev *dev) { struct ib_smp *in_mad = NULL; @@ -611,6 +688,25 @@ static void *mlx4_ib_add(struct mlx4_dev ibdev->ib_dev.map_phys_fmr = mlx4_ib_map_phys_fmr; ibdev->ib_dev.unmap_fmr = mlx4_ib_unmap_fmr; ibdev->ib_dev.dealloc_fmr = mlx4_ib_fmr_dealloc; + if (dev->caps.flags & MLX4_DEV_CAP_FLAG_XRC) { + ibdev->ib_dev.create_xrc_srq = mlx4_ib_create_xrc_srq; + ibdev->ib_dev.alloc_xrcd = mlx4_ib_alloc_xrcd; + ibdev->ib_dev.dealloc_xrcd = mlx4_ib_dealloc_xrcd; + ibdev->ib_dev.create_xrc_rcv_qp = mlx4_ib_create_xrc_rcv_qp; + ibdev->ib_dev.modify_xrc_rcv_qp = mlx4_ib_modify_xrc_rcv_qp; + ibdev->ib_dev.query_xrc_rcv_qp = mlx4_ib_query_xrc_rcv_qp; + ibdev->ib_dev.reg_xrc_rcv_qp = mlx4_ib_reg_xrc_rcv_qp; + ibdev->ib_dev.unreg_xrc_rcv_qp = mlx4_ib_unreg_xrc_rcv_qp; + ibdev->ib_dev.uverbs_cmd_mask |= + (1ull << IB_USER_VERBS_CMD_CREATE_XRC_SRQ) | + (1ull << IB_USER_VERBS_CMD_OPEN_XRC_DOMAIN) | + (1ull << IB_USER_VERBS_CMD_CLOSE_XRC_DOMAIN) | + (1ull << IB_USER_VERBS_CMD_CREATE_XRC_RCV_QP) | + (1ull << IB_USER_VERBS_CMD_MODIFY_XRC_RCV_QP) | + (1ull << IB_USER_VERBS_CMD_QUERY_XRC_RCV_QP) | + (1ull << IB_USER_VERBS_CMD_REG_XRC_RCV_QP) | + (1ull << IB_USER_VERBS_CMD_UNREG_XRC_RCV_QP); + } if (init_node_data(ibdev)) goto err_map; Index: infiniband/drivers/infiniband/hw/mlx4/mlx4_ib.h =================================================================== --- infiniband.orig/drivers/infiniband/hw/mlx4/mlx4_ib.h 2008-01-22 19:05:29.000000000 +0200 +++ infiniband/drivers/infiniband/hw/mlx4/mlx4_ib.h 2008-01-22 19:45:11.000000000 +0200 @@ -73,6 +73,13 @@ struct mlx4_ib_pd { u32 pdn; }; +struct mlx4_ib_xrcd { + struct ib_xrcd 
ibxrcd; + u32 xrcdn; + struct ib_pd *pd; + struct ib_cq *cq; +}; + struct mlx4_ib_cq_buf { struct mlx4_buf buf; struct mlx4_mtt mtt; @@ -127,6 +134,9 @@ struct mlx4_ib_qp { struct mlx4_mtt mtt; int buf_size; struct mutex mutex; + enum qp_create_flags create_flags; + struct list_head xrc_reg_list; + u16 xrcdn; u8 port; u8 alt_port; u8 atomic_rd_en; @@ -189,6 +199,11 @@ static inline struct mlx4_ib_pd *to_mpd( return container_of(ibpd, struct mlx4_ib_pd, ibpd); } +static inline struct mlx4_ib_xrcd *to_mxrcd(struct ib_xrcd *ibxrcd) +{ + return container_of(ibxrcd, struct mlx4_ib_xrcd, ibxrcd); +} + static inline struct mlx4_ib_cq *to_mcq(struct ib_cq *ibcq) { return container_of(ibcq, struct mlx4_ib_cq, ibcq); @@ -263,6 +278,11 @@ int mlx4_ib_destroy_ah(struct ib_ah *ah) struct ib_srq *mlx4_ib_create_srq(struct ib_pd *pd, struct ib_srq_init_attr *init_attr, struct ib_udata *udata); +struct ib_srq *mlx4_ib_create_xrc_srq(struct ib_pd *pd, + struct ib_cq *xrc_cq, + struct ib_xrcd *xrcd, + struct ib_srq_init_attr *init_attr, + struct ib_udata *udata); int mlx4_ib_modify_srq(struct ib_srq *ibsrq, struct ib_srq_attr *attr, enum ib_srq_attr_mask attr_mask, struct ib_udata *udata); int mlx4_ib_query_srq(struct ib_srq *srq, struct ib_srq_attr *srq_attr); @@ -299,6 +319,16 @@ int mlx4_ib_map_phys_fmr(struct ib_fmr * u64 iova); int mlx4_ib_unmap_fmr(struct list_head *fmr_list); int mlx4_ib_fmr_dealloc(struct ib_fmr *fmr); +int mlx4_ib_create_xrc_rcv_qp(struct ib_qp_init_attr *init_attr, + u32 *qp_num); +int mlx4_ib_modify_xrc_rcv_qp(struct ib_xrcd *xrcd, u32 qp_num, + struct ib_qp_attr *attr, int attr_mask); +int mlx4_ib_query_xrc_rcv_qp(struct ib_xrcd *xrcd, u32 qp_num, + struct ib_qp_attr *attr, int attr_mask, + struct ib_qp_init_attr *init_attr); +int mlx4_ib_reg_xrc_rcv_qp(struct ib_xrcd *xrcd, void * context, u32 qp_num); +int mlx4_ib_unreg_xrc_rcv_qp(struct ib_xrcd *xrcd, void * context, u32 qp_num); + static inline int mlx4_ib_ah_grh_present(struct mlx4_ib_ah *ah) { 
Index: infiniband/drivers/net/mlx4/xrcd.c =================================================================== --- /dev/null 1970-01-01 00:00:00.000000000 +0000 +++ infiniband/drivers/net/mlx4/xrcd.c 2008-01-22 19:11:45.000000000 +0200 @@ -0,0 +1,70 @@ +/* + * Copyright (c) 2006, 2007 Cisco Systems, Inc. All rights reserved. + * Copyright (c) 2007 Mellanox Technologies. All rights reserved. + * + * This software is available to you under a choice of one of two + * licenses. You may choose to be licensed under the terms of the GNU + * General Public License (GPL) Version 2, available from the file + * COPYING in the main directory of this source tree, or the + * OpenIB.org BSD license below: + * + * Redistribution and use in source and binary forms, with or + * without modification, are permitted provided that the following + * conditions are met: + * + * - Redistributions of source code must retain the above + * copyright notice, this list of conditions and the following + * disclaimer. + * + * - Redistributions in binary form must reproduce the above + * copyright notice, this list of conditions and the following + * disclaimer in the documentation and/or other materials + * provided with the distribution. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, + * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF + * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND + * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS + * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN + * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN + * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE + * SOFTWARE. 
+ */ + +#include +#include + +#include "mlx4.h" + +int mlx4_xrcd_alloc(struct mlx4_dev *dev, u32 *xrcdn) +{ + struct mlx4_priv *priv = mlx4_priv(dev); + + *xrcdn = mlx4_bitmap_alloc(&priv->xrcd_bitmap); + if (*xrcdn == -1) + return -ENOMEM; + + return 0; +} +EXPORT_SYMBOL_GPL(mlx4_xrcd_alloc); + +void mlx4_xrcd_free(struct mlx4_dev *dev, u32 xrcdn) +{ + mlx4_bitmap_free(&mlx4_priv(dev)->xrcd_bitmap, xrcdn); +} +EXPORT_SYMBOL_GPL(mlx4_xrcd_free); + +int __devinit mlx4_init_xrcd_table(struct mlx4_dev *dev) +{ + struct mlx4_priv *priv = mlx4_priv(dev); + + return mlx4_bitmap_init(&priv->xrcd_bitmap, (1 << 16), + (1 << 16) - 1, dev->caps.reserved_xrcds + 1); +} + +void mlx4_cleanup_xrcd_table(struct mlx4_dev *dev) +{ + mlx4_bitmap_cleanup(&mlx4_priv(dev)->xrcd_bitmap); +} + + Index: infiniband/drivers/net/mlx4/mlx4.h =================================================================== --- infiniband.orig/drivers/net/mlx4/mlx4.h 2008-01-22 18:41:14.000000000 +0200 +++ infiniband/drivers/net/mlx4/mlx4.h 2008-01-22 19:11:45.000000000 +0200 @@ -260,6 +260,7 @@ struct mlx4_priv { struct mlx4_cmd cmd; struct mlx4_bitmap pd_bitmap; + struct mlx4_bitmap xrcd_bitmap; struct mlx4_uar_table uar_table; struct mlx4_mr_table mr_table; struct mlx4_cq_table cq_table; @@ -289,6 +290,7 @@ void mlx4_bitmap_cleanup(struct mlx4_bit int mlx4_reset(struct mlx4_dev *dev); int mlx4_init_pd_table(struct mlx4_dev *dev); +int mlx4_init_xrcd_table(struct mlx4_dev *dev); int mlx4_init_uar_table(struct mlx4_dev *dev); int mlx4_init_mr_table(struct mlx4_dev *dev); int mlx4_init_eq_table(struct mlx4_dev *dev); @@ -305,6 +307,7 @@ void mlx4_cleanup_cq_table(struct mlx4_d void mlx4_cleanup_qp_table(struct mlx4_dev *dev); void mlx4_cleanup_srq_table(struct mlx4_dev *dev); void mlx4_cleanup_mcg_table(struct mlx4_dev *dev); +void mlx4_cleanup_xrcd_table(struct mlx4_dev *dev); void mlx4_start_catas_poll(struct mlx4_dev *dev); void mlx4_stop_catas_poll(struct mlx4_dev *dev); Index: 
infiniband/drivers/net/mlx4/main.c =================================================================== --- infiniband.orig/drivers/net/mlx4/main.c 2008-01-22 18:41:14.000000000 +0200 +++ infiniband/drivers/net/mlx4/main.c 2008-01-22 19:11:45.000000000 +0200 @@ -159,6 +159,10 @@ static int mlx4_dev_cap(struct mlx4_dev dev->caps.page_size_cap = ~(u32) (dev_cap->min_page_sz - 1); dev->caps.flags = dev_cap->flags; dev->caps.stat_rate_support = dev_cap->stat_rate_support; + dev->caps.reserved_xrcds = (dev->caps.flags & MLX4_DEV_CAP_FLAG_XRC) ? + dev_cap->reserved_xrcds : 0; + dev->caps.max_xrcds = (dev->caps.flags & MLX4_DEV_CAP_FLAG_XRC) ? + dev_cap->max_xrcds : 0; return 0; } @@ -586,11 +590,18 @@ static int mlx4_setup_hca(struct mlx4_de goto err_kar_unmap; } + err = mlx4_init_xrcd_table(dev); + if (err) { + mlx4_err(dev, "Failed to initialize " + "extended reliably connected domain table, aborting.\n"); + goto err_pd_table_free; + } + err = mlx4_init_mr_table(dev); if (err) { mlx4_err(dev, "Failed to initialize " "memory region table, aborting.\n"); - goto err_pd_table_free; + goto err_xrcd_table_free; } err = mlx4_init_eq_table(dev); @@ -674,6 +685,9 @@ err_eq_table_free: err_mr_table_free: mlx4_cleanup_mr_table(dev); +err_xrcd_table_free: + mlx4_cleanup_xrcd_table(dev); + err_pd_table_free: mlx4_cleanup_pd_table(dev); @@ -847,6 +861,7 @@ err_cleanup: mlx4_cmd_use_polling(dev); mlx4_cleanup_eq_table(dev); mlx4_cleanup_mr_table(dev); + mlx4_cleanup_xrcd_table(dev); mlx4_cleanup_pd_table(dev); mlx4_cleanup_uar_table(dev); @@ -906,6 +921,7 @@ static void mlx4_remove_one(struct pci_d mlx4_cmd_use_polling(dev); mlx4_cleanup_eq_table(dev); mlx4_cleanup_mr_table(dev); + mlx4_cleanup_xrcd_table(dev); mlx4_cleanup_pd_table(dev); iounmap(priv->kar); Index: infiniband/drivers/net/mlx4/srq.c =================================================================== --- infiniband.orig/drivers/net/mlx4/srq.c 2008-01-22 18:41:14.000000000 +0200 +++ infiniband/drivers/net/mlx4/srq.c 
2008-01-22 19:11:45.000000000 +0200 @@ -40,20 +40,20 @@ struct mlx4_srq_context { __be32 state_logsize_srqn; u8 logstride; - u8 reserved1[3]; - u8 pg_offset; - u8 reserved2[3]; - u32 reserved3; + u8 reserved1; + __be16 xrc_domain; + __be32 pg_offset_cqn; + u32 reserved2; u8 log_page_size; - u8 reserved4[2]; + u8 reserved3[2]; u8 mtt_base_addr_h; __be32 mtt_base_addr_l; __be32 pd; __be16 limit_watermark; __be16 wqe_cnt; - u16 reserved5; + u16 reserved4; __be16 wqe_counter; - u32 reserved6; + u32 reserved5; __be64 db_rec_addr; }; @@ -109,8 +109,8 @@ static int mlx4_QUERY_SRQ(struct mlx4_de MLX4_CMD_TIME_CLASS_A); } -int mlx4_srq_alloc(struct mlx4_dev *dev, u32 pdn, struct mlx4_mtt *mtt, - u64 db_rec, struct mlx4_srq *srq) +int mlx4_srq_alloc(struct mlx4_dev *dev, u32 pdn, u32 cqn, u16 xrcd, + struct mlx4_mtt *mtt, u64 db_rec, struct mlx4_srq *srq) { struct mlx4_srq_table *srq_table = &mlx4_priv(dev)->srq_table; struct mlx4_cmd_mailbox *mailbox; @@ -148,6 +148,8 @@ int mlx4_srq_alloc(struct mlx4_dev *dev, srq_context->state_logsize_srqn = cpu_to_be32((ilog2(srq->max) << 24) | srq->srqn); srq_context->logstride = srq->wqe_shift - 4; + srq_context->xrc_domain = cpu_to_be16(xrcd); + srq_context->pg_offset_cqn = cpu_to_be32(cqn & 0xffffff); srq_context->log_page_size = mtt->page_shift - MLX4_ICM_PAGE_SHIFT; mtt_addr = mlx4_mtt_addr(dev, mtt); Index: infiniband/drivers/net/mlx4/fw.c =================================================================== --- infiniband.orig/drivers/net/mlx4/fw.c 2008-01-22 19:01:33.000000000 +0200 +++ infiniband/drivers/net/mlx4/fw.c 2008-01-22 19:11:45.000000000 +0200 @@ -159,6 +159,8 @@ int mlx4_QUERY_DEV_CAP(struct mlx4_dev * #define QUERY_DEV_CAP_MAX_MCG_OFFSET 0x63 #define QUERY_DEV_CAP_RSVD_PD_OFFSET 0x64 #define QUERY_DEV_CAP_MAX_PD_OFFSET 0x65 +#define QUERY_DEV_CAP_RSVD_XRC_OFFSET 0x66 +#define QUERY_DEV_CAP_MAX_XRC_OFFSET 0x67 #define QUERY_DEV_CAP_RDMARC_ENTRY_SZ_OFFSET 0x80 #define QUERY_DEV_CAP_QPC_ENTRY_SZ_OFFSET 0x82 #define 
QUERY_DEV_CAP_AUX_ENTRY_SZ_OFFSET 0x84 @@ -262,6 +264,11 @@ int mlx4_QUERY_DEV_CAP(struct mlx4_dev * MLX4_GET(field, outbox, QUERY_DEV_CAP_MAX_PD_OFFSET); dev_cap->max_pds = 1 << (field & 0x3f); + MLX4_GET(field, outbox, QUERY_DEV_CAP_RSVD_XRC_OFFSET); + dev_cap->reserved_xrcds = field >> 4; + MLX4_GET(field, outbox, QUERY_DEV_CAP_MAX_XRC_OFFSET); + dev_cap->max_xrcds = 1 << (field & 0x1f); + MLX4_GET(size, outbox, QUERY_DEV_CAP_RDMARC_ENTRY_SZ_OFFSET); dev_cap->rdmarc_entry_sz = size; MLX4_GET(size, outbox, QUERY_DEV_CAP_QPC_ENTRY_SZ_OFFSET); Index: infiniband/drivers/net/mlx4/fw.h =================================================================== --- infiniband.orig/drivers/net/mlx4/fw.h 2008-01-22 18:41:14.000000000 +0200 +++ infiniband/drivers/net/mlx4/fw.h 2008-01-22 19:11:45.000000000 +0200 @@ -82,6 +82,8 @@ struct mlx4_dev_cap { int max_mcgs; int reserved_pds; int max_pds; + int reserved_xrcds; + int max_xrcds; int qpc_entry_sz; int rdmarc_entry_sz; int altc_entry_sz; Index: infiniband/drivers/infiniband/hw/mlx4/qp.c =================================================================== --- infiniband.orig/drivers/infiniband/hw/mlx4/qp.c 2008-01-22 19:05:29.000000000 +0200 +++ infiniband/drivers/infiniband/hw/mlx4/qp.c 2008-01-22 19:45:11.000000000 +0200 @@ -54,6 +54,12 @@ enum { MLX4_IB_UD_HEADER_SIZE = 72 }; + +struct mlx4_ib_xrc_reg_entry { + struct list_head list; + void *context; +}; + struct mlx4_ib_sqp { struct mlx4_ib_qp qp; int pkey_index; @@ -130,14 +136,15 @@ static void stamp_send_wqe(struct mlx4_i static void mlx4_ib_qp_event(struct mlx4_qp *qp, enum mlx4_event type) { struct ib_event event; - struct ib_qp *ibqp = &to_mibqp(qp)->ibqp; + struct mlx4_ib_qp *mqp = to_mibqp(qp); + struct ib_qp *ibqp = &mqp->ibqp; + struct mlx4_ib_xrc_reg_entry *ctx_entry; if (type == MLX4_EVENT_TYPE_PATH_MIG) to_mibqp(qp)->port = to_mibqp(qp)->alt_port; if (ibqp->event_handler) { event.device = ibqp->device; - event.element.qp = ibqp; switch (type) { case 
MLX4_EVENT_TYPE_PATH_MIG: event.event = IB_EVENT_PATH_MIG; @@ -169,7 +176,16 @@ static void mlx4_ib_qp_event(struct mlx4 return; } - ibqp->event_handler(&event, ibqp->qp_context); + if (!(ibqp->qp_type == IB_QPT_XRC && + mqp->create_flags & XRC_RCV_QP)) { + event.element.qp = ibqp; + ibqp->event_handler(&event, ibqp->qp_context); + } else { + event.event |= IB_XRC_QP_EVENT_FLAG; + event.element.xrc_qp_num = ibqp->qp_num; + list_for_each_entry(ctx_entry, &mqp->xrc_reg_list, list) + ibqp->event_handler(&event, ctx_entry->context); + } } } @@ -209,14 +225,14 @@ static int send_wqe_overhead(enum ib_qp_ } static int set_rq_size(struct mlx4_ib_dev *dev, struct ib_qp_cap *cap, - int is_user, int has_srq, struct mlx4_ib_qp *qp) + int is_user, int has_srq_or_is_xrc, struct mlx4_ib_qp *qp) { /* Sanity check RQ size before proceeding */ if (cap->max_recv_wr > dev->dev->caps.max_wqes || cap->max_recv_sge > dev->dev->caps.max_rq_sg) return -EINVAL; - if (has_srq) { + if (has_srq_or_is_xrc) { /* QPs attached to an SRQ should have no RQ */ if (cap->max_recv_wr) return -EINVAL; @@ -328,7 +344,8 @@ static int create_qp_common(struct mlx4_ qp->sq.head = 0; qp->sq.tail = 0; - err = set_rq_size(dev, &init_attr->cap, !!pd->uobject, !!init_attr->srq, qp); + err = set_rq_size(dev, &init_attr->cap, !!pd->uobject, + !!init_attr->srq || !!init_attr->xrc_domain , qp); if (err) goto err; @@ -362,7 +379,7 @@ static int create_qp_common(struct mlx4_ if (err) goto err_mtt; - if (!init_attr->srq) { + if (!init_attr->srq && init_attr->qp_type != IB_QPT_XRC) { err = mlx4_ib_db_map_user(to_mucontext(pd->uobject->context), ucmd.db_addr, &qp->db); if (err) @@ -375,7 +392,7 @@ static int create_qp_common(struct mlx4_ if (err) goto err; - if (!init_attr->srq) { + if (!init_attr->srq && init_attr->qp_type != IB_QPT_XRC) { err = mlx4_ib_db_alloc(dev, &qp->db, 0); if (err) goto err; @@ -410,6 +427,9 @@ static int create_qp_common(struct mlx4_ if (err) goto err_wrid; + if (init_attr->qp_type == IB_QPT_XRC) 
+ qp->mqp.qpn |= (1 << 23); + /* * Hardware wants QPN written in big-endian order (after * shifting) for send doorbell. Precompute this value to save @@ -428,7 +448,7 @@ static int create_qp_common(struct mlx4_ err_wrid: if (pd->uobject) { - if (!init_attr->srq) + if (!init_attr->srq && init_attr->qp_type != IB_QPT_XRC) mlx4_ib_db_unmap_user(to_mucontext(pd->uobject->context), &qp->db); } else { @@ -446,7 +466,7 @@ err_buf: mlx4_buf_free(dev->dev, qp->buf_size, &qp->buf); err_db: - if (!pd->uobject && !init_attr->srq) + if (!pd->uobject && !init_attr->srq && init_attr->qp_type != IB_QPT_XRC) mlx4_ib_db_free(dev, &qp->db); err: @@ -524,7 +544,7 @@ static void destroy_qp_common(struct mlx mlx4_mtt_cleanup(dev->dev, &qp->mtt); if (is_user) { - if (!qp->ibqp.srq) + if (!qp->ibqp.srq && qp->ibqp.qp_type != IB_QPT_XRC) mlx4_ib_db_unmap_user(to_mucontext(qp->ibqp.uobject->context), &qp->db); ib_umem_release(qp->umem); @@ -532,7 +552,7 @@ static void destroy_qp_common(struct mlx kfree(qp->sq.wrid); kfree(qp->rq.wrid); mlx4_buf_free(dev->dev, qp->buf_size, &qp->buf); - if (!qp->ibqp.srq) + if (!qp->ibqp.srq && qp->ibqp.qp_type != IB_QPT_XRC) mlx4_ib_db_free(dev, &qp->db); } } @@ -547,6 +567,9 @@ struct ib_qp *mlx4_ib_create_qp(struct i int err; switch (init_attr->qp_type) { + case IB_QPT_XRC: + if (!(dev->dev->caps.flags & MLX4_DEV_CAP_FLAG_XRC)) + return ERR_PTR(-ENOSYS); case IB_QPT_RC: case IB_QPT_UC: case IB_QPT_UD: @@ -555,12 +578,20 @@ struct ib_qp *mlx4_ib_create_qp(struct i if (!qp) return ERR_PTR(-ENOMEM); + memset(qp, 0, sizeof *qp); + INIT_LIST_HEAD(&qp->xrc_reg_list); + qp->create_flags = init_attr->create_flags; err = create_qp_common(dev, pd, init_attr, udata, 0, qp); if (err) { kfree(qp); return ERR_PTR(err); } + if (init_attr->qp_type == IB_QPT_XRC) + qp->xrcdn = to_mxrcd(init_attr->xrc_domain)->xrcdn; + else + qp->xrcdn = 0; + qp->ibqp.qp_num = qp->mqp.qpn; break; @@ -625,6 +656,7 @@ static int to_mlx4_st(enum ib_qp_type ty case IB_QPT_RC: return 
MLX4_QP_ST_RC; case IB_QPT_UC: return MLX4_QP_ST_UC; case IB_QPT_UD: return MLX4_QP_ST_UD; + case IB_QPT_XRC: return MLX4_QP_ST_XRC; case IB_QPT_SMI: case IB_QPT_GSI: return MLX4_QP_ST_MLX; default: return -1; @@ -769,8 +801,11 @@ static int __mlx4_ib_modify_qp(struct ib context->sq_size_stride = ilog2(qp->sq.wqe_cnt) << 3; context->sq_size_stride |= qp->sq.wqe_shift - 4; - if (cur_state == IB_QPS_RESET && new_state == IB_QPS_INIT) + if (cur_state == IB_QPS_RESET && new_state == IB_QPS_INIT) { context->sq_size_stride |= !!qp->sq_no_prefetch << 7; + if (ibqp->qp_type == IB_QPT_XRC) + context->xrcd = cpu_to_be32((u32) qp->xrcdn); + } if (qp->ibqp.uobject) context->usr_page = cpu_to_be32(to_mucontext(ibqp->uobject->context)->uar.index); @@ -882,7 +917,8 @@ static int __mlx4_ib_modify_qp(struct ib if (ibqp->srq) context->srqn = cpu_to_be32(1 << 24 | to_msrq(ibqp->srq)->msrq.srqn); - if (!ibqp->srq && cur_state == IB_QPS_RESET && new_state == IB_QPS_INIT) + if (!ibqp->srq && ibqp->qp_type != IB_QPT_XRC && + cur_state == IB_QPS_RESET && new_state == IB_QPS_INIT) context->db_rec_addr = cpu_to_be64(qp->db.dma); if (cur_state == IB_QPS_INIT && @@ -969,7 +1005,7 @@ static int __mlx4_ib_modify_qp(struct ib qp->rq.tail = 0; qp->sq.head = 0; qp->sq.tail = 0; - if (!ibqp->srq) + if (!ibqp->srq && ibqp->qp_type != IB_QPT_XRC) *qp->db.db = 0; } @@ -1662,3 +1698,244 @@ done: return 0; } +int mlx4_ib_create_xrc_rcv_qp(struct ib_qp_init_attr *init_attr, + u32 *qp_num) +{ + struct mlx4_ib_dev *dev = to_mdev(init_attr->xrc_domain->device); + struct mlx4_ib_xrcd *xrcd = to_mxrcd(init_attr->xrc_domain); + struct ib_qp_init_attr ia = *init_attr; + struct mlx4_ib_qp *qp; + struct ib_qp *ibqp; + struct mlx4_ib_xrc_reg_entry *ctx_entry; + + if (!(dev->dev->caps.flags & MLX4_DEV_CAP_FLAG_XRC)) + return -ENOSYS; + + ctx_entry = kmalloc(sizeof *ctx_entry, GFP_KERNEL); + if (!ctx_entry) + return -ENOMEM; + + ia.qp_type = IB_QPT_XRC; + ia.create_flags = XRC_RCV_QP; + ia.recv_cq = ia.send_cq = 
xrcd->cq; + + ibqp = mlx4_ib_create_qp(xrcd->pd, &ia, NULL); + if (IS_ERR(ibqp)) { + kfree(ctx_entry); + return PTR_ERR(ibqp); + } + + /* set the ibqp attributes which will be used by the mlx4 module */ + ibqp->device = init_attr->xrc_domain->device; + ibqp->pd = xrcd->pd; + ibqp->send_cq = ibqp->recv_cq = xrcd->cq; + ibqp->event_handler = init_attr->event_handler; + ibqp->qp_context = init_attr->qp_context; + ibqp->qp_type = init_attr->qp_type; + ibqp->xrcd = init_attr->xrc_domain; + + qp = to_mqp(ibqp); + + mutex_lock(&qp->mutex); + ctx_entry->context = init_attr->qp_context; + list_add_tail(&ctx_entry->list, &qp->xrc_reg_list); + mutex_unlock(&qp->mutex); + *qp_num = qp->mqp.qpn; + return 0; +} + +int mlx4_ib_modify_xrc_rcv_qp(struct ib_xrcd *ibxrcd, u32 qp_num, + struct ib_qp_attr *attr, int attr_mask) +{ + struct mlx4_ib_dev *dev = to_mdev(ibxrcd->device); + struct mlx4_ib_xrcd *xrcd = to_mxrcd(ibxrcd); + struct mlx4_qp *mqp; + int err; + + if (!(dev->dev->caps.flags & MLX4_DEV_CAP_FLAG_XRC)) + return -ENOSYS; + + mqp = __mlx4_qp_lookup(dev->dev, qp_num); + if (unlikely(!mqp)) { + printk(KERN_WARNING "mlx4_ib_modify_xrc_rcv_qp: unknown QPN %06x\n", + qp_num); + return -EINVAL; + } + + if (xrcd->xrcdn != to_mxrcd(to_mibqp(mqp)->ibqp.xrcd)->xrcdn) + return -EINVAL; + + err = mlx4_ib_modify_qp(&(to_mibqp(mqp)->ibqp), attr, attr_mask, NULL); + return err; +} + +int mlx4_ib_query_xrc_rcv_qp(struct ib_xrcd *ibxrcd, u32 qp_num, + struct ib_qp_attr *qp_attr, int qp_attr_mask, + struct ib_qp_init_attr *qp_init_attr) +{ + struct mlx4_ib_dev *dev = to_mdev(ibxrcd->device); + struct mlx4_ib_xrcd *xrcd = to_mxrcd(ibxrcd); + struct mlx4_ib_qp *qp; + struct mlx4_qp *mqp; + struct mlx4_qp_context context; + int mlx4_state; + int err; + + if (!(dev->dev->caps.flags & MLX4_DEV_CAP_FLAG_XRC)) + return -ENOSYS; + + mqp = __mlx4_qp_lookup(dev->dev, qp_num); + if (unlikely(!mqp)) { + printk(KERN_WARNING "mlx4_ib_query_xrc_rcv_qp: unknown QPN %06x\n", + qp_num); + return -EINVAL; + } + +
qp = to_mibqp(mqp); + if (xrcd->xrcdn != to_mxrcd(qp->ibqp.xrcd)->xrcdn) + return -EINVAL; + + if (qp->state == IB_QPS_RESET) { + qp_attr->qp_state = IB_QPS_RESET; + goto done; + } + + err = mlx4_qp_query(dev->dev, mqp, &context); + if (err) + return -EINVAL; + + mlx4_state = be32_to_cpu(context.flags) >> 28; + + qp_attr->qp_state = to_ib_qp_state(mlx4_state); + qp_attr->path_mtu = context.mtu_msgmax >> 5; + qp_attr->path_mig_state = + to_ib_mig_state((be32_to_cpu(context.flags) >> 11) & 0x3); + qp_attr->qkey = be32_to_cpu(context.qkey); + qp_attr->rq_psn = be32_to_cpu(context.rnr_nextrecvpsn) & 0xffffff; + qp_attr->sq_psn = be32_to_cpu(context.next_send_psn) & 0xffffff; + qp_attr->dest_qp_num = be32_to_cpu(context.remote_qpn) & 0xffffff; + qp_attr->qp_access_flags = + to_ib_qp_access_flags(be32_to_cpu(context.params2)); + + if (qp->ibqp.qp_type == IB_QPT_RC || qp->ibqp.qp_type == IB_QPT_UC || + qp->ibqp.qp_type == IB_QPT_XRC) { + to_ib_ah_attr(dev->dev, &qp_attr->ah_attr, &context.pri_path); + to_ib_ah_attr(dev->dev, &qp_attr->alt_ah_attr, &context.alt_path); + qp_attr->alt_pkey_index = context.alt_path.pkey_index & 0x7f; + qp_attr->alt_port_num = qp_attr->alt_ah_attr.port_num; + } + + qp_attr->pkey_index = context.pri_path.pkey_index & 0x7f; + if (qp_attr->qp_state == IB_QPS_INIT) + qp_attr->port_num = qp->port; + else + qp_attr->port_num = context.pri_path.sched_queue & 0x40 ? 
2 : 1; + + /* qp_attr->en_sqd_async_notify is only applicable in modify qp */ + qp_attr->sq_draining = mlx4_state == MLX4_QP_STATE_SQ_DRAINING; + + qp_attr->max_rd_atomic = 1 << ((be32_to_cpu(context.params1) >> 21) & 0x7); + + qp_attr->max_dest_rd_atomic = + 1 << ((be32_to_cpu(context.params2) >> 21) & 0x7); + qp_attr->min_rnr_timer = + (be32_to_cpu(context.rnr_nextrecvpsn) >> 24) & 0x1f; + qp_attr->timeout = context.pri_path.ackto >> 3; + qp_attr->retry_cnt = (be32_to_cpu(context.params1) >> 16) & 0x7; + qp_attr->rnr_retry = (be32_to_cpu(context.params1) >> 13) & 0x7; + qp_attr->alt_timeout = context.alt_path.ackto >> 3; + +done: + qp_attr->cur_qp_state = qp_attr->qp_state; + qp_attr->cap.max_recv_wr = 0; + qp_attr->cap.max_recv_sge = 0; + qp_attr->cap.max_send_wr = 0; + qp_attr->cap.max_send_sge = 0; + qp_attr->cap.max_inline_data = 0; + qp_init_attr->cap = qp_attr->cap; + + return 0; +} + +int mlx4_ib_reg_xrc_rcv_qp(struct ib_xrcd *xrcd, void *context, u32 qp_num) +{ + + struct mlx4_ib_xrcd *mxrcd = to_mxrcd(xrcd); + + struct mlx4_qp *mqp; + struct mlx4_ib_qp *mibqp; + struct mlx4_ib_xrc_reg_entry *ctx_entry, *tmp; + + mqp = __mlx4_qp_lookup(to_mdev(xrcd->device)->dev, qp_num); + if (unlikely(!mqp)) { + printk(KERN_WARNING "mlx4_ib_reg_xrc_rcv_qp: unknown QPN %06x\n", + qp_num); + return -EINVAL; + } + + mibqp = to_mibqp(mqp); + + if (mxrcd->xrcdn != to_mxrcd(mibqp->ibqp.xrcd)->xrcdn) + return -EINVAL; + + ctx_entry = kmalloc(sizeof *ctx_entry, GFP_KERNEL); + if (!ctx_entry) + return -ENOMEM; + + mutex_lock(&mibqp->mutex); + list_for_each_entry(tmp, &mibqp->xrc_reg_list, list) + if (tmp->context == context) { + mutex_unlock(&mibqp->mutex); + kfree(ctx_entry); + return 0; + } + + ctx_entry->context = context; + list_add_tail(&ctx_entry->list, &mibqp->xrc_reg_list); + mutex_unlock(&mibqp->mutex); + return 0; +} + +int mlx4_ib_unreg_xrc_rcv_qp(struct ib_xrcd *xrcd, void *context, u32 qp_num) +{ + + struct mlx4_ib_xrcd *mxrcd = to_mxrcd(xrcd); + + struct mlx4_qp 
*mqp; + struct mlx4_ib_qp *mibqp; + struct mlx4_ib_xrc_reg_entry *ctx_entry, *tmp; + int found = 0; + + mqp = __mlx4_qp_lookup(to_mdev(xrcd->device)->dev, qp_num); + if (unlikely(!mqp)) { + printk(KERN_WARNING "mlx4_ib_unreg_xrc_rcv_qp: unknown QPN %06x\n", + qp_num); + return -EINVAL; + } + + mibqp = to_mibqp(mqp); + + if (mxrcd->xrcdn != (mibqp->xrcdn & 0xffff)) + return -EINVAL; + + mutex_lock(&mibqp->mutex); + list_for_each_entry_safe(ctx_entry, tmp, &mibqp->xrc_reg_list, list) + if (ctx_entry->context == context) { + found = 1; + list_del(&ctx_entry->list); + kfree(ctx_entry); + break; + } + + mutex_unlock(&mibqp->mutex); + if (!found) + return -EINVAL; + + /* destroy the QP if the registration list is empty */ + /* NOTE: MUTEX IS NOT GOOD ENOUGH -- HAVE HOLE HERE */ + if (list_empty(&mibqp->xrc_reg_list)) + mlx4_ib_destroy_qp(&mibqp->ibqp); + + return 0; +} + Index: infiniband/drivers/infiniband/hw/mlx4/srq.c =================================================================== --- infiniband.orig/drivers/infiniband/hw/mlx4/srq.c 2008-01-22 18:41:12.000000000 +0200 +++ infiniband/drivers/infiniband/hw/mlx4/srq.c 2008-01-22 19:11:45.000000000 +0200 @@ -72,13 +72,17 @@ static void mlx4_ib_srq_event(struct mlx } } -struct ib_srq *mlx4_ib_create_srq(struct ib_pd *pd, - struct ib_srq_init_attr *init_attr, - struct ib_udata *udata) +struct ib_srq *mlx4_ib_create_xrc_srq(struct ib_pd *pd, + struct ib_cq *xrc_cq, + struct ib_xrcd *xrcd, + struct ib_srq_init_attr *init_attr, + struct ib_udata *udata) { struct mlx4_ib_dev *dev = to_mdev(pd->device); struct mlx4_ib_srq *srq; struct mlx4_wqe_srq_next_seg *next; + u32 cqn; + u16 xrcdn; int desc_size; int buf_size; int err; @@ -172,7 +176,11 @@ struct ib_srq *mlx4_ib_create_srq(struct } } - err = mlx4_srq_alloc(dev->dev, to_mpd(pd)->pdn, &srq->mtt, + cqn = xrc_cq ? (u32) (to_mcq(xrc_cq)->mcq.cqn) : 0; + xrcdn = xrcd ? 
(u16) (to_mxrcd(xrcd)->xrcdn) : + (u16) dev->dev->caps.reserved_xrcds; + + err = mlx4_srq_alloc(dev->dev, to_mpd(pd)->pdn, cqn, xrcdn, &srq->mtt, srq->db.dma, &srq->msrq); if (err) goto err_wrid; @@ -240,6 +248,13 @@ int mlx4_ib_modify_srq(struct ib_srq *ib return 0; } +struct ib_srq *mlx4_ib_create_srq(struct ib_pd *pd, + struct ib_srq_init_attr *init_attr, + struct ib_udata *udata) +{ + return mlx4_ib_create_xrc_srq(pd, NULL, NULL, init_attr, udata); +} + int mlx4_ib_query_srq(struct ib_srq *ibsrq, struct ib_srq_attr *srq_attr) { struct mlx4_ib_dev *dev = to_mdev(ibsrq->device); Index: infiniband/include/linux/mlx4/qp.h =================================================================== --- infiniband.orig/include/linux/mlx4/qp.h 2008-01-22 18:41:24.000000000 +0200 +++ infiniband/include/linux/mlx4/qp.h 2008-01-22 19:11:45.000000000 +0200 @@ -74,6 +74,7 @@ enum { MLX4_QP_ST_UC = 0x1, MLX4_QP_ST_RD = 0x2, MLX4_QP_ST_UD = 0x3, + MLX4_QP_ST_XRC = 0x6, MLX4_QP_ST_MLX = 0x7 }; @@ -136,7 +137,7 @@ struct mlx4_qp_context { __be32 ssn; __be32 params2; __be32 rnr_nextrecvpsn; - __be32 srcd; + __be32 xrcd; __be32 cqn_recv; __be64 db_rec_addr; __be32 qkey; Index: infiniband/drivers/net/mlx4/Makefile =================================================================== --- infiniband.orig/drivers/net/mlx4/Makefile 2008-01-22 18:41:14.000000000 +0200 +++ infiniband/drivers/net/mlx4/Makefile 2008-01-22 19:11:45.000000000 +0200 @@ -1,4 +1,4 @@ obj-$(CONFIG_MLX4_CORE) += mlx4_core.o mlx4_core-y := alloc.o catas.o cmd.o cq.o eq.o fw.o icm.o intf.o main.o mcg.o \ - mr.o pd.o profile.o qp.o reset.o srq.o + mr.o pd.o profile.o qp.o reset.o srq.o xrcd.o Index: infiniband/drivers/net/mlx4/qp.c =================================================================== --- infiniband.orig/drivers/net/mlx4/qp.c 2008-01-22 18:41:14.000000000 +0200 +++ infiniband/drivers/net/mlx4/qp.c 2008-01-22 19:11:45.000000000 +0200 @@ -263,10 +263,12 @@ int mlx4_init_qp_table(struct mlx4_dev * * We reserve 2 
extra QPs per port for the special QPs. The * block of special QPs must be aligned to a multiple of 8, so * round up. + * We also reserve the MSB of the 24-bit QP number to indicate + * an XRC qp. */ dev->caps.sqp_start = ALIGN(dev->caps.reserved_qps, 8); err = mlx4_bitmap_init(&qp_table->bitmap, dev->caps.num_qps, - (1 << 24) - 1, dev->caps.sqp_start + 8); + (1 << 23) - 1, dev->caps.sqp_start + 8); if (err) return err; Index: infiniband/drivers/infiniband/hw/mlx4/cq.c =================================================================== --- infiniband.orig/drivers/infiniband/hw/mlx4/cq.c 2008-01-22 19:01:33.000000000 +0200 +++ infiniband/drivers/infiniband/hw/mlx4/cq.c 2008-01-22 19:45:11.000000000 +0200 @@ -108,6 +108,7 @@ struct ib_cq *mlx4_ib_create_cq(struct i if (!cq) return ERR_PTR(-ENOMEM); + memset(cq, 0, sizeof *cq); entries = roundup_pow_of_two(entries + 1); cq->ibcq.cqe = entries - 1; buf_size = entries * sizeof (struct mlx4_cqe); From jackm at dev.mellanox.co.il Wed Jan 23 02:00:14 2008 From: jackm at dev.mellanox.co.il (Jack Morgenstein) Date: Wed, 23 Jan 2008 12:00:14 +0200 Subject: [ofa-general] [PATCH 4/8] mlx4: implement XRC qps for kernel-space apps Message-ID: <200801231200.14374.jackm@dev.mellanox.co.il> mlx4: Implement XRC for kernel-space applications. 
Changes: none Signed-off-by: Jack Morgenstein Index: infiniband/drivers/infiniband/hw/mlx4/cq.c =================================================================== --- infiniband.orig/drivers/infiniband/hw/mlx4/cq.c 2008-01-22 19:45:11.000000000 +0200 +++ infiniband/drivers/infiniband/hw/mlx4/cq.c 2008-01-22 19:54:40.000000000 +0200 @@ -32,6 +32,7 @@ #include #include +#include #include "mlx4_ib.h" #include "user.h" @@ -312,8 +313,10 @@ static int mlx4_ib_poll_one(struct mlx4_ struct mlx4_qp *mqp; struct mlx4_ib_wq *wq; struct mlx4_ib_srq *srq; + struct mlx4_srq *msrq; int is_send; int is_error; + int is_xrc_recv = 0; u32 g_mlpath_rqpn; u16 wqe_ctr; @@ -333,7 +336,23 @@ static int mlx4_ib_poll_one(struct mlx4_ is_error = (cqe->owner_sr_opcode & MLX4_CQE_OPCODE_MASK) == MLX4_CQE_OPCODE_ERROR; - if (!*cur_qp || + if ((be32_to_cpu(cqe->my_qpn) & (1 << 23)) && !is_send) { + /* + * We do not have to take the XRC SRQ table lock here, + * because CQs will be locked while XRC SRQs are removed + * from the table. + */ + msrq = __mlx4_srq_lookup(to_mdev(cq->ibcq.device)->dev, + be32_to_cpu(cqe->g_mlpath_rqpn) & + 0xffffff); + if (unlikely(!msrq)) { + printk(KERN_WARNING "CQ %06x with entry for unknown XRC SRQ %06x\n", + cq->mcq.cqn, be32_to_cpu(cqe->g_mlpath_rqpn) & 0xffffff); + return -EINVAL; + } + is_xrc_recv = 1; + srq = to_mibsrq(msrq); + } else if (!*cur_qp || (be32_to_cpu(cqe->my_qpn) & 0xffffff) != (*cur_qp)->mqp.qpn) { /* * We do not have to take the QP table lock here, @@ -351,7 +370,7 @@ static int mlx4_ib_poll_one(struct mlx4_ *cur_qp = to_mibqp(mqp); } - wc->qp = &(*cur_qp)->ibqp; + wc->qp = is_xrc_recv ? 
NULL: &(*cur_qp)->ibqp; if (is_send) { wq = &(*cur_qp)->sq; @@ -359,6 +378,10 @@ static int mlx4_ib_poll_one(struct mlx4_ wq->tail += (u16) (wqe_ctr - (u16) wq->tail); wc->wr_id = wq->wrid[wq->tail & (wq->wqe_cnt - 1)]; ++wq->tail; + } else if (is_xrc_recv) { + wqe_ctr = be16_to_cpu(cqe->wqe_index); + wc->wr_id = srq->wrid[wqe_ctr]; + mlx4_ib_free_srq_wqe(srq, wqe_ctr); } else if ((*cur_qp)->ibqp.srq) { srq = to_msrq((*cur_qp)->ibqp.srq); wqe_ctr = be16_to_cpu(cqe->wqe_index); @@ -482,6 +505,10 @@ void __mlx4_ib_cq_clean(struct mlx4_ib_c int nfreed = 0; struct mlx4_cqe *cqe, *dest; u8 owner_bit; + int is_xrc_srq = 0; + + if (srq && srq->ibsrq.xrc_cq) + is_xrc_srq = 1; /* * First we need to find the current producer index, so we @@ -500,7 +527,9 @@ void __mlx4_ib_cq_clean(struct mlx4_ib_c */ while ((int) --prod_index - (int) cq->mcq.cons_index >= 0) { cqe = get_cqe(cq, prod_index & cq->ibcq.cqe); - if ((be32_to_cpu(cqe->my_qpn) & 0xffffff) == qpn) { + if (((be32_to_cpu(cqe->my_qpn) & 0xffffff) == qpn) || + (is_xrc_srq && + (be32_to_cpu(cqe->g_mlpath_rqpn) & 0xffffff) == srq->msrq.srqn)) { if (srq && !(cqe->owner_sr_opcode & MLX4_CQE_IS_SEND_MASK)) mlx4_ib_free_srq_wqe(srq, be16_to_cpu(cqe->wqe_index)); ++nfreed; Index: infiniband/drivers/net/mlx4/mlx4.h =================================================================== --- infiniband.orig/drivers/net/mlx4/mlx4.h 2008-01-22 19:11:45.000000000 +0200 +++ infiniband/drivers/net/mlx4/mlx4.h 2008-01-22 19:54:40.000000000 +0200 @@ -220,7 +220,6 @@ struct mlx4_eq_table { struct mlx4_srq_table { struct mlx4_bitmap bitmap; spinlock_t lock; - struct radix_tree_root tree; struct mlx4_icm_table table; struct mlx4_icm_table cmpt_table; }; Index: infiniband/drivers/net/mlx4/srq.c =================================================================== --- infiniband.orig/drivers/net/mlx4/srq.c 2008-01-22 19:11:45.000000000 +0200 +++ infiniband/drivers/net/mlx4/srq.c 2008-01-22 19:54:40.000000000 +0200 @@ -64,7 +64,7 @@ void 
mlx4_srq_event(struct mlx4_dev *dev spin_lock(&srq_table->lock); - srq = radix_tree_lookup(&srq_table->tree, srqn & (dev->caps.num_srqs - 1)); + srq = radix_tree_lookup(&dev->srq_table_tree, srqn & (dev->caps.num_srqs - 1)); if (srq) atomic_inc(&srq->refcount); @@ -131,7 +131,7 @@ int mlx4_srq_alloc(struct mlx4_dev *dev, goto err_put; spin_lock_irq(&srq_table->lock); - err = radix_tree_insert(&srq_table->tree, srq->srqn, srq); + err = radix_tree_insert(&dev->srq_table_tree, srq->srqn, srq); spin_unlock_irq(&srq_table->lock); if (err) goto err_cmpt_put; @@ -170,7 +170,7 @@ int mlx4_srq_alloc(struct mlx4_dev *dev, err_radix: spin_lock_irq(&srq_table->lock); - radix_tree_delete(&srq_table->tree, srq->srqn); + radix_tree_delete(&dev->srq_table_tree, srq->srqn); spin_unlock_irq(&srq_table->lock); err_cmpt_put: @@ -186,18 +186,29 @@ err_out: } EXPORT_SYMBOL_GPL(mlx4_srq_alloc); -void mlx4_srq_free(struct mlx4_dev *dev, struct mlx4_srq *srq) +void mlx4_srq_invalidate(struct mlx4_dev *dev, struct mlx4_srq *srq) { - struct mlx4_srq_table *srq_table = &mlx4_priv(dev)->srq_table; int err; err = mlx4_HW2SW_SRQ(dev, NULL, srq->srqn); if (err) mlx4_warn(dev, "HW2SW_SRQ failed (%d) for SRQN %06x\n", err, srq->srqn); +} +EXPORT_SYMBOL_GPL(mlx4_srq_invalidate); + +void mlx4_srq_remove(struct mlx4_dev *dev, struct mlx4_srq *srq) +{ + struct mlx4_srq_table *srq_table = &mlx4_priv(dev)->srq_table; spin_lock_irq(&srq_table->lock); - radix_tree_delete(&srq_table->tree, srq->srqn); + radix_tree_delete(&dev->srq_table_tree, srq->srqn); spin_unlock_irq(&srq_table->lock); +} +EXPORT_SYMBOL_GPL(mlx4_srq_remove); + +void mlx4_srq_free(struct mlx4_dev *dev, struct mlx4_srq *srq) +{ + struct mlx4_srq_table *srq_table = &mlx4_priv(dev)->srq_table; if (atomic_dec_and_test(&srq->refcount)) complete(&srq->free); @@ -243,7 +254,7 @@ int mlx4_init_srq_table(struct mlx4_dev int err; spin_lock_init(&srq_table->lock); - INIT_RADIX_TREE(&srq_table->tree, GFP_ATOMIC); + 
INIT_RADIX_TREE(&dev->srq_table_tree, GFP_ATOMIC); err = mlx4_bitmap_init(&srq_table->bitmap, dev->caps.num_srqs, dev->caps.num_srqs - 1, dev->caps.reserved_srqs); Index: infiniband/include/linux/mlx4/device.h =================================================================== --- infiniband.orig/include/linux/mlx4/device.h 2008-01-22 19:11:45.000000000 +0200 +++ infiniband/include/linux/mlx4/device.h 2008-01-22 19:54:40.000000000 +0200 @@ -290,6 +290,7 @@ struct mlx4_dev { unsigned long flags; struct mlx4_caps caps; struct radix_tree_root qp_table_tree; + struct radix_tree_root srq_table_tree; u32 rev_id; char board_id[MLX4_BOARD_ID_LEN]; }; Index: infiniband/include/linux/mlx4/srq.h =================================================================== --- infiniband.orig/include/linux/mlx4/srq.h 2008-01-22 18:41:24.000000000 +0200 +++ infiniband/include/linux/mlx4/srq.h 2008-01-22 19:54:40.000000000 +0200 @@ -33,10 +33,21 @@ #ifndef MLX4_SRQ_H #define MLX4_SRQ_H +#include +#include + struct mlx4_wqe_srq_next_seg { u16 reserved1; __be16 next_wqe_index; u32 reserved2[3]; }; +void mlx4_srq_invalidate(struct mlx4_dev *dev, struct mlx4_srq *srq); +void mlx4_srq_remove(struct mlx4_dev *dev, struct mlx4_srq *srq); + +static inline struct mlx4_srq *__mlx4_srq_lookup(struct mlx4_dev *dev, u32 srqn) +{ + return radix_tree_lookup(&dev->srq_table_tree, srqn & (dev->caps.num_srqs - 1)); +} + #endif /* MLX4_SRQ_H */ Index: infiniband/drivers/infiniband/hw/mlx4/srq.c =================================================================== --- infiniband.orig/drivers/infiniband/hw/mlx4/srq.c 2008-01-22 19:11:45.000000000 +0200 +++ infiniband/drivers/infiniband/hw/mlx4/srq.c 2008-01-22 19:54:40.000000000 +0200 @@ -187,11 +187,13 @@ struct ib_srq *mlx4_ib_create_xrc_srq(st srq->msrq.event = mlx4_ib_srq_event; - if (pd->uobject) + if (pd->uobject) { if (ib_copy_to_udata(udata, &srq->msrq.srqn, sizeof (__u32))) { err = -EFAULT; goto err_wrid; } + } else + srq->ibsrq.xrc_srq_num = 
srq->msrq.srqn; init_attr->attr.max_wr = srq->msrq.max - 1; @@ -277,6 +279,18 @@ int mlx4_ib_destroy_srq(struct ib_srq *s { struct mlx4_ib_dev *dev = to_mdev(srq->device); struct mlx4_ib_srq *msrq = to_msrq(srq); + struct mlx4_ib_cq *cq; + + mlx4_srq_invalidate(dev->dev, &msrq->msrq); + + if (srq->xrc_cq && !srq->uobject) { + cq = to_mcq(srq->xrc_cq); + spin_lock_irq(&cq->lock); + __mlx4_ib_cq_clean(cq, -1, msrq); + mlx4_srq_remove(dev->dev, &msrq->msrq); + spin_unlock_irq(&cq->lock); + } else + mlx4_srq_remove(dev->dev, &msrq->msrq); mlx4_srq_free(dev->dev, &msrq->msrq); mlx4_mtt_cleanup(dev->dev, &msrq->mtt); Index: infiniband/drivers/infiniband/hw/mlx4/qp.c =================================================================== --- infiniband.orig/drivers/infiniband/hw/mlx4/qp.c 2008-01-22 19:45:11.000000000 +0200 +++ infiniband/drivers/infiniband/hw/mlx4/qp.c 2008-01-22 19:56:43.000000000 +0200 @@ -204,6 +204,7 @@ static int send_wqe_overhead(enum ib_qp_ case IB_QPT_UC: return sizeof (struct mlx4_wqe_ctrl_seg) + sizeof (struct mlx4_wqe_raddr_seg); + case IB_QPT_XRC: case IB_QPT_RC: return sizeof (struct mlx4_wqe_ctrl_seg) + sizeof (struct mlx4_wqe_atomic_seg) + @@ -1015,7 +1016,7 @@ out: } static const struct ib_qp_attr mlx4_ib_qp_attr = { .port_num = 1 }; -static const int mlx4_ib_qp_attr_mask_table[IB_QPT_UD + 1] = { +static const int mlx4_ib_qp_attr_mask_table[IB_QPT_XRC + 1] = { [IB_QPT_UD] = (IB_QP_PKEY_INDEX | IB_QP_PORT | IB_QP_QKEY), @@ -1025,6 +1026,9 @@ static const int mlx4_ib_qp_attr_mask_ta [IB_QPT_RC] = (IB_QP_PKEY_INDEX | IB_QP_PORT | IB_QP_ACCESS_FLAGS), + [IB_QPT_XRC] = (IB_QP_PKEY_INDEX | + IB_QP_PORT | + IB_QP_ACCESS_FLAGS), [IB_QPT_SMI] = (IB_QP_PKEY_INDEX | IB_QP_QKEY), [IB_QPT_GSI] = (IB_QP_PKEY_INDEX | @@ -1355,6 +1359,10 @@ int mlx4_ib_post_send(struct ib_qp *ibqp size = sizeof *ctrl / 16; switch (ibqp->qp_type) { + case IB_QPT_XRC: + ctrl->srcrb_flags |= + cpu_to_be32(wr->xrc_remote_srq_num << 8); + /* fall thru */ case IB_QPT_RC: case 
IB_QPT_UC: switch (wr->opcode) { @@ -1647,7 +1655,8 @@ int mlx4_ib_query_qp(struct ib_qp *ibqp, qp_attr->qp_access_flags = to_ib_qp_access_flags(be32_to_cpu(context.params2)); - if (qp->ibqp.qp_type == IB_QPT_RC || qp->ibqp.qp_type == IB_QPT_UC) { + if (qp->ibqp.qp_type == IB_QPT_RC || qp->ibqp.qp_type == IB_QPT_UC || + qp->ibqp.qp_type == IB_QPT_XRC) { to_ib_ah_attr(dev->dev, &qp_attr->ah_attr, &context.pri_path); to_ib_ah_attr(dev->dev, &qp_attr->alt_ah_attr, &context.alt_path); qp_attr->alt_pkey_index = context.alt_path.pkey_index & 0x7f; From jackm at dev.mellanox.co.il Wed Jan 23 02:00:30 2008 From: jackm at dev.mellanox.co.il (Jack Morgenstein) Date: Wed, 23 Jan 2008 12:00:30 +0200 Subject: [ofa-general] [PATCH 7/8] core: Add XRC receive-only qp support Message-ID: <200801231200.30590.jackm@dev.mellanox.co.il> ib/core: Implement XRC receive-only QPs for userspace apps. Added creation of XRC receive-only QPs for userspace, which reside in kernel space (user cannot post-to or poll these QPs). Motivation: MPI community requires XRC receive QPs which will not be destroyed when the creating process terminates. Solution: Userspace requests that a QP be created in kernel space. Each userspace process using that QP (i.e. receiving packets on an XRC SRQ via the qp), registers with that QP (-- the creator is also registered, whether or not it is a user of the QP). When the last userspace user unregisters with the QP, it is destroyed. Unregistration is also part of userspace process cleanup, so there is no leakage. This patch implements the kernel procedures to implement the following (new) libibverbs API: ibv_create_xrc_rcv_qp ibv_modify_xrc_rcv_qp ibv_query_xrc_rcv_qp ibv_reg_xrc_rcv_qp ibv_unreg_xrc_rcv_qp In addition, the patch implements the foundation for distributing XRC-receive-only QP events to userspace processes registered with that QP. 
Finally, the patch modifies ib_uverbs_close_xrc_domain() to return BUSY if any resources are still in use by the process, so that the XRC rcv-only QP cleanup can operate properly. Signed-off-by: Jack Morgenstein Index: infiniband/include/rdma/ib_verbs.h =================================================================== --- infiniband.orig/include/rdma/ib_verbs.h 2008-01-22 20:18:28.000000000 +0200 +++ infiniband/include/rdma/ib_verbs.h 2008-01-22 20:18:46.000000000 +0200 @@ -285,6 +285,10 @@ enum ib_event_type { IB_EVENT_CLIENT_REREGISTER }; +enum ib_event_flags { + IB_XRC_QP_EVENT_FLAG = 0x80000000, +}; + struct ib_event { struct ib_device *device; union { @@ -292,6 +296,7 @@ struct ib_event { struct ib_qp *qp; struct ib_srq *srq; u8 port_num; + u32 xrc_qp_num; } element; enum ib_event_type event; }; @@ -492,6 +497,7 @@ enum ib_qp_type { enum qp_create_flags { QP_CREATE_LSO = 1 << 0, + XRC_RCV_QP = 1 << 1, }; struct ib_qp_init_attr { @@ -723,6 +729,7 @@ struct ib_ucontext { struct list_head srq_list; struct list_head ah_list; struct list_head xrc_domain_list; + struct list_head xrc_reg_qp_list; int closing; }; @@ -744,6 +751,12 @@ struct ib_udata { size_t outlen; }; +struct ib_uxrc_rcv_object { + struct list_head list; /* link to context's list */ + u32 qp_num; + u32 domain_handle; +}; + struct ib_pd { struct ib_device *device; struct ib_uobject *uobject; @@ -1053,6 +1066,23 @@ struct ib_device { struct ib_ucontext *context, struct ib_udata *udata); int (*dealloc_xrcd)(struct ib_xrcd *xrcd); + int (*create_xrc_rcv_qp)(struct ib_qp_init_attr *init_attr, + u32* qp_num); + int (*modify_xrc_rcv_qp)(struct ib_xrcd *xrcd, + u32 qp_num, + struct ib_qp_attr *attr, + int attr_mask); + int (*query_xrc_rcv_qp)(struct ib_xrcd *xrcd, + u32 qp_num, + struct ib_qp_attr *attr, + int attr_mask, + struct ib_qp_init_attr *init_attr); + int (*reg_xrc_rcv_qp)(struct ib_xrcd *xrcd, + void *context, + u32 qp_num); + int (*unreg_xrc_rcv_qp)(struct ib_xrcd *xrcd, + void *context, + u32 
qp_num); struct ib_dma_mapping_ops *dma_ops; Index: infiniband/drivers/infiniband/core/uverbs_main.c =================================================================== --- infiniband.orig/drivers/infiniband/core/uverbs_main.c 2008-01-22 20:18:28.000000000 +0200 +++ infiniband/drivers/infiniband/core/uverbs_main.c 2008-01-22 20:18:46.000000000 +0200 @@ -114,6 +114,11 @@ static ssize_t (*uverbs_cmd_table[])(str [IB_USER_VERBS_CMD_CREATE_XRC_SRQ] = ib_uverbs_create_xrc_srq, [IB_USER_VERBS_CMD_OPEN_XRC_DOMAIN] = ib_uverbs_open_xrc_domain, [IB_USER_VERBS_CMD_CLOSE_XRC_DOMAIN] = ib_uverbs_close_xrc_domain, + [IB_USER_VERBS_CMD_CREATE_XRC_RCV_QP] = ib_uverbs_create_xrc_rcv_qp, + [IB_USER_VERBS_CMD_MODIFY_XRC_RCV_QP] = ib_uverbs_modify_xrc_rcv_qp, + [IB_USER_VERBS_CMD_QUERY_XRC_RCV_QP] = ib_uverbs_query_xrc_rcv_qp, + [IB_USER_VERBS_CMD_REG_XRC_RCV_QP] = ib_uverbs_reg_xrc_rcv_qp, + [IB_USER_VERBS_CMD_UNREG_XRC_RCV_QP] = ib_uverbs_unreg_xrc_rcv_qp, }; static struct vfsmount *uverbs_event_mnt; @@ -191,6 +196,7 @@ static int ib_uverbs_cleanup_ucontext(st struct ib_ucontext *context) { struct ib_uobject *uobj, *tmp; + struct ib_uxrc_rcv_object *xrc_qp_obj, *tmp1; if (!context) return 0; @@ -251,6 +257,13 @@ static int ib_uverbs_cleanup_ucontext(st kfree(uobj); } + list_for_each_entry_safe(xrc_qp_obj, tmp1, &context->xrc_reg_qp_list, list) { + list_del(&xrc_qp_obj->list); + ib_uverbs_cleanup_xrc_rcv_qp(file, xrc_qp_obj->domain_handle, + xrc_qp_obj->qp_num); + kfree(xrc_qp_obj); + } + mutex_lock(&file->device->ib_dev->xrcd_table_mutex); list_for_each_entry_safe(uobj, tmp, &context->xrc_domain_list, list) { struct ib_xrcd *xrcd = uobj->object; @@ -506,6 +519,12 @@ void ib_uverbs_event_handler(struct ib_e NULL, NULL); } +void ib_uverbs_xrc_rcv_qp_event_handler(struct ib_event *event, void *context_ptr) +{ + ib_uverbs_async_handler(context_ptr, event->element.xrc_qp_num, + event->event, NULL, NULL); +} + struct file *ib_uverbs_alloc_event_file(struct ib_uverbs_file *uverbs_file, 
int is_async, int *fd) { Index: infiniband/drivers/infiniband/core/uverbs_cmd.c =================================================================== --- infiniband.orig/drivers/infiniband/core/uverbs_cmd.c 2008-01-22 20:18:28.000000000 +0200 +++ infiniband/drivers/infiniband/core/uverbs_cmd.c 2008-01-22 20:18:46.000000000 +0200 @@ -315,6 +315,7 @@ ssize_t ib_uverbs_get_context(struct ib_ INIT_LIST_HEAD(&ucontext->srq_list); INIT_LIST_HEAD(&ucontext->ah_list); INIT_LIST_HEAD(&ucontext->xrc_domain_list); + INIT_LIST_HEAD(&ucontext->xrc_reg_qp_list); ucontext->closing = 0; resp.num_comp_vectors = file->device->num_comp_vectors; @@ -1080,6 +1081,7 @@ ssize_t ib_uverbs_create_qp(struct ib_uv goto err_put; } + attr.create_flags = 0; attr.event_handler = ib_uverbs_qp_event_handler; attr.qp_context = file; attr.send_cq = scq; @@ -2588,10 +2590,13 @@ ssize_t ib_uverbs_close_xrc_domain(struc put_uobj_write(uobj); - if (ret && !inode) + if (ret) { + if (inode) + atomic_inc(&xrcd->usecnt); goto err_unlock_mutex; + } - if (!ret && inode) + if (inode) xrcd_table_delete(file->device->ib_dev, inode); idr_remove_uobj(&ib_uverbs_xrc_domain_idr, uobj); @@ -2611,7 +2616,7 @@ err_unlock_mutex: } void ib_uverbs_dealloc_xrcd(struct ib_device *ib_dev, - struct ib_xrcd *xrcd) + struct ib_xrcd *xrcd) { struct inode *inode = NULL; int ret = 0; @@ -2625,4 +2630,353 @@ void ib_uverbs_dealloc_xrcd(struct ib_de xrcd_table_delete(ib_dev, inode); } +ssize_t ib_uverbs_create_xrc_rcv_qp(struct ib_uverbs_file *file, + const char __user *buf, int in_len, + int out_len) +{ + struct ib_uverbs_create_xrc_rcv_qp cmd; + struct ib_uverbs_create_xrc_rcv_qp_resp resp; + struct ib_uxrc_rcv_object *obj; + struct ib_qp_init_attr init_attr; + struct ib_xrcd *xrcd; + struct ib_uobject *xrcd_uobj; + u32 qp_num; + int err; + + if (out_len < sizeof resp) + return -ENOSPC; + + if (copy_from_user(&cmd, buf, sizeof cmd)) + return -EFAULT; + + obj = kmalloc(sizeof *obj, GFP_KERNEL); + if (!obj) + return -ENOMEM; + + xrcd 
= idr_read_xrcd(cmd.xrc_domain_handle, file->ucontext, &xrcd_uobj); + if (!xrcd) { + err = -EINVAL; + goto err_out; + } + + memset(&init_attr, 0, sizeof init_attr); + init_attr.event_handler = ib_uverbs_xrc_rcv_qp_event_handler; + init_attr.qp_context = file; + init_attr.srq = NULL; + init_attr.sq_sig_type = cmd.sq_sig_all ? IB_SIGNAL_ALL_WR : IB_SIGNAL_REQ_WR; + init_attr.qp_type = IB_QPT_XRC; + init_attr.xrc_domain = xrcd; + init_attr.create_flags = XRC_RCV_QP; + + init_attr.cap.max_send_wr = 1; + init_attr.cap.max_recv_wr = 0; + init_attr.cap.max_send_sge = 1; + init_attr.cap.max_recv_sge = 0; + init_attr.cap.max_inline_data = 0; + + err = xrcd->device->create_xrc_rcv_qp(&init_attr, &qp_num); + if (err) + goto err_put; + + memset(&resp, 0, sizeof resp); + resp.qpn = qp_num; + + if (copy_to_user((void __user *) (unsigned long) cmd.response, + &resp, sizeof resp)) { + err = -EFAULT; + goto err_destroy; + } + + atomic_inc(&xrcd->usecnt); + put_xrcd_read(xrcd_uobj); + obj->qp_num = qp_num; + obj->domain_handle = cmd.xrc_domain_handle; + mutex_lock(&file->mutex); + list_add_tail(&obj->list, &file->ucontext->xrc_reg_qp_list); + mutex_unlock(&file->mutex); + + return in_len; + +err_destroy: + xrcd->device->unreg_xrc_rcv_qp(xrcd, file, qp_num); +err_put: + put_xrcd_read(xrcd_uobj); +err_out: + kfree(obj); + return err; +} + +ssize_t ib_uverbs_modify_xrc_rcv_qp(struct ib_uverbs_file *file, + const char __user *buf, int in_len, + int out_len) +{ + struct ib_uverbs_modify_xrc_rcv_qp cmd; + struct ib_qp_attr *attr; + struct ib_xrcd *xrcd; + struct ib_uobject *xrcd_uobj; + int err; + + if (copy_from_user(&cmd, buf, sizeof cmd)) + return -EFAULT; + + attr = kmalloc(sizeof *attr, GFP_KERNEL); + if (!attr) + return -ENOMEM; + + xrcd = idr_read_xrcd(cmd.xrc_domain_handle, file->ucontext, &xrcd_uobj); + if (!xrcd) { + kfree(attr); + return -EINVAL; + } + + memset(attr, 0, sizeof *attr); + attr->qp_state = cmd.qp_state; + attr->cur_qp_state = cmd.cur_qp_state; + 
attr->qp_access_flags = cmd.qp_access_flags; + attr->pkey_index = cmd.pkey_index; + attr->port_num = cmd.port_num; + attr->path_mtu = cmd.path_mtu; + attr->path_mig_state = cmd.path_mig_state; + attr->qkey = cmd.qkey; + attr->rq_psn = cmd.rq_psn; + attr->sq_psn = cmd.sq_psn; + attr->dest_qp_num = cmd.dest_qp_num; + attr->alt_pkey_index = cmd.alt_pkey_index; + attr->en_sqd_async_notify = cmd.en_sqd_async_notify; + attr->max_rd_atomic = cmd.max_rd_atomic; + attr->max_dest_rd_atomic = cmd.max_dest_rd_atomic; + attr->min_rnr_timer = cmd.min_rnr_timer; + attr->port_num = cmd.port_num; + attr->timeout = cmd.timeout; + attr->retry_cnt = cmd.retry_cnt; + attr->rnr_retry = cmd.rnr_retry; + attr->alt_port_num = cmd.alt_port_num; + attr->alt_timeout = cmd.alt_timeout; + + memcpy(attr->ah_attr.grh.dgid.raw, cmd.dest.dgid, 16); + attr->ah_attr.grh.flow_label = cmd.dest.flow_label; + attr->ah_attr.grh.sgid_index = cmd.dest.sgid_index; + attr->ah_attr.grh.hop_limit = cmd.dest.hop_limit; + attr->ah_attr.grh.traffic_class = cmd.dest.traffic_class; + attr->ah_attr.dlid = cmd.dest.dlid; + attr->ah_attr.sl = cmd.dest.sl; + attr->ah_attr.src_path_bits = cmd.dest.src_path_bits; + attr->ah_attr.static_rate = cmd.dest.static_rate; + attr->ah_attr.ah_flags = cmd.dest.is_global ? IB_AH_GRH : 0; + attr->ah_attr.port_num = cmd.dest.port_num; + + memcpy(attr->alt_ah_attr.grh.dgid.raw, cmd.alt_dest.dgid, 16); + attr->alt_ah_attr.grh.flow_label = cmd.alt_dest.flow_label; + attr->alt_ah_attr.grh.sgid_index = cmd.alt_dest.sgid_index; + attr->alt_ah_attr.grh.hop_limit = cmd.alt_dest.hop_limit; + attr->alt_ah_attr.grh.traffic_class = cmd.alt_dest.traffic_class; + attr->alt_ah_attr.dlid = cmd.alt_dest.dlid; + attr->alt_ah_attr.sl = cmd.alt_dest.sl; + attr->alt_ah_attr.src_path_bits = cmd.alt_dest.src_path_bits; + attr->alt_ah_attr.static_rate = cmd.alt_dest.static_rate; + attr->alt_ah_attr.ah_flags = cmd.alt_dest.is_global ? 
IB_AH_GRH : 0; + attr->alt_ah_attr.port_num = cmd.alt_dest.port_num; + + err = xrcd->device->modify_xrc_rcv_qp(xrcd, cmd.qp_num, attr, cmd.attr_mask); + put_xrcd_read(xrcd_uobj); + kfree(attr); + return err; +} + +ssize_t ib_uverbs_query_xrc_rcv_qp(struct ib_uverbs_file *file, + const char __user *buf, int in_len, + int out_len) +{ + struct ib_uverbs_query_xrc_rcv_qp cmd; + struct ib_uverbs_query_qp_resp resp; + struct ib_qp_attr *attr; + struct ib_qp_init_attr *init_attr; + struct ib_xrcd *xrcd; + struct ib_uobject *xrcd_uobj; + int ret; + if (copy_from_user(&cmd, buf, sizeof cmd)) + return -EFAULT; + + attr = kmalloc(sizeof *attr, GFP_KERNEL); + init_attr = kmalloc(sizeof *init_attr, GFP_KERNEL); + if (!attr || !init_attr) { + ret = -ENOMEM; + goto out; + } + + xrcd = idr_read_xrcd(cmd.xrc_domain_handle, file->ucontext, &xrcd_uobj); + if (!xrcd) { + ret = -EINVAL; + goto out; + } + + ret = xrcd->device->query_xrc_rcv_qp(xrcd, cmd.qp_num, attr, + cmd.attr_mask, init_attr); + + put_xrcd_read(xrcd_uobj); + + if (ret) + goto out; + + memset(&resp, 0, sizeof resp); + resp.qp_state = attr->qp_state; + resp.cur_qp_state = attr->cur_qp_state; + resp.path_mtu = attr->path_mtu; + resp.path_mig_state = attr->path_mig_state; + resp.qkey = attr->qkey; + resp.rq_psn = attr->rq_psn; + resp.sq_psn = attr->sq_psn; + resp.dest_qp_num = attr->dest_qp_num; + resp.qp_access_flags = attr->qp_access_flags; + resp.pkey_index = attr->pkey_index; + resp.alt_pkey_index = attr->alt_pkey_index; + resp.sq_draining = attr->sq_draining; + resp.max_rd_atomic = attr->max_rd_atomic; + resp.max_dest_rd_atomic = attr->max_dest_rd_atomic; + resp.min_rnr_timer = attr->min_rnr_timer; + resp.port_num = attr->port_num; + resp.timeout = attr->timeout; + resp.retry_cnt = attr->retry_cnt; + resp.rnr_retry = attr->rnr_retry; + resp.alt_port_num = attr->alt_port_num; + resp.alt_timeout = attr->alt_timeout; + + memcpy(resp.dest.dgid, attr->ah_attr.grh.dgid.raw, 16); + resp.dest.flow_label = 
attr->ah_attr.grh.flow_label; + resp.dest.sgid_index = attr->ah_attr.grh.sgid_index; + resp.dest.hop_limit = attr->ah_attr.grh.hop_limit; + resp.dest.traffic_class = attr->ah_attr.grh.traffic_class; + resp.dest.dlid = attr->ah_attr.dlid; + resp.dest.sl = attr->ah_attr.sl; + resp.dest.src_path_bits = attr->ah_attr.src_path_bits; + resp.dest.static_rate = attr->ah_attr.static_rate; + resp.dest.is_global = !!(attr->ah_attr.ah_flags & IB_AH_GRH); + resp.dest.port_num = attr->ah_attr.port_num; + + memcpy(resp.alt_dest.dgid, attr->alt_ah_attr.grh.dgid.raw, 16); + resp.alt_dest.flow_label = attr->alt_ah_attr.grh.flow_label; + resp.alt_dest.sgid_index = attr->alt_ah_attr.grh.sgid_index; + resp.alt_dest.hop_limit = attr->alt_ah_attr.grh.hop_limit; + resp.alt_dest.traffic_class = attr->alt_ah_attr.grh.traffic_class; + resp.alt_dest.dlid = attr->alt_ah_attr.dlid; + resp.alt_dest.sl = attr->alt_ah_attr.sl; + resp.alt_dest.src_path_bits = attr->alt_ah_attr.src_path_bits; + resp.alt_dest.static_rate = attr->alt_ah_attr.static_rate; + resp.alt_dest.is_global = !!(attr->alt_ah_attr.ah_flags & IB_AH_GRH); + resp.alt_dest.port_num = attr->alt_ah_attr.port_num; + + resp.max_send_wr = init_attr->cap.max_send_wr; + resp.max_recv_wr = init_attr->cap.max_recv_wr; + resp.max_send_sge = init_attr->cap.max_send_sge; + resp.max_recv_sge = init_attr->cap.max_recv_sge; + resp.max_inline_data = init_attr->cap.max_inline_data; + resp.sq_sig_all = init_attr->sq_sig_type == IB_SIGNAL_ALL_WR; + + if (copy_to_user((void __user *) (unsigned long) cmd.response, + &resp, sizeof resp)) + ret = -EFAULT; + +out: + kfree(attr); + kfree(init_attr); + + return ret ? 
ret : in_len; +} + +ssize_t ib_uverbs_reg_xrc_rcv_qp(struct ib_uverbs_file *file, + const char __user *buf, int in_len, + int out_len) +{ + struct ib_uverbs_reg_xrc_rcv_qp cmd; + struct ib_uxrc_rcv_object *qp_obj, *tmp; + struct ib_xrcd *xrcd; + struct ib_uobject *xrcd_uobj; + int ret; + + if (copy_from_user(&cmd, buf, sizeof cmd)) + return -EFAULT; + + qp_obj = kmalloc(sizeof *qp_obj, GFP_KERNEL); + if (!qp_obj) + return -ENOMEM; + + xrcd = idr_read_xrcd(cmd.xrc_domain_handle, file->ucontext, &xrcd_uobj); + if (!xrcd) { + ret = -EINVAL; + goto err_out; + } + + ret = xrcd->device->reg_xrc_rcv_qp(xrcd, file, cmd.qp_num); + if (ret) + goto err_put; + + atomic_inc(&xrcd->usecnt); + put_xrcd_read(xrcd_uobj); + mutex_lock(&file->mutex); + list_for_each_entry(tmp, &file->ucontext->xrc_reg_qp_list, list) + if (cmd.qp_num == tmp->qp_num) { + kfree(qp_obj); + mutex_unlock(&file->mutex); + put_xrcd_read(xrcd_uobj); + return 0; + } + qp_obj->qp_num = cmd.qp_num; + qp_obj->domain_handle = cmd.xrc_domain_handle; + list_add_tail(&qp_obj->list, &file->ucontext->xrc_reg_qp_list); + mutex_unlock(&file->mutex); + return 0; + +err_put: + put_xrcd_read(xrcd_uobj); +err_out: + + kfree(qp_obj); + return ret; +} + +int ib_uverbs_cleanup_xrc_rcv_qp(struct ib_uverbs_file *file, + u32 domain_handle, u32 qp_num) +{ + struct ib_xrcd *xrcd; + struct ib_uobject *xrcd_uobj; + int err; + + xrcd = idr_read_xrcd(domain_handle, file->ucontext, &xrcd_uobj); + if (!xrcd) + return -EINVAL; + + err = xrcd->device->unreg_xrc_rcv_qp(xrcd, file, qp_num); + + if (!err) + atomic_dec(&xrcd->usecnt); + put_xrcd_read(xrcd_uobj); + return err; +} + +ssize_t ib_uverbs_unreg_xrc_rcv_qp(struct ib_uverbs_file *file, + const char __user *buf, int in_len, + int out_len) +{ + struct ib_uverbs_unreg_xrc_rcv_qp cmd; + struct ib_uxrc_rcv_object *qp_obj, *tmp; + int ret; + + if (copy_from_user(&cmd, buf, sizeof cmd)) + return -EFAULT; + + ret = ib_uverbs_cleanup_xrc_rcv_qp(file, cmd.xrc_domain_handle, cmd.qp_num); + if 
(ret) + return ret; + + mutex_lock(&file->mutex); + list_for_each_entry_safe(qp_obj, tmp, &file->ucontext->xrc_reg_qp_list, list) + if (cmd.qp_num == qp_obj->qp_num) { + list_del(&qp_obj->list); + kfree(qp_obj); + break; + } + mutex_unlock(&file->mutex); + return 0; + +} Index: infiniband/include/rdma/ib_user_verbs.h =================================================================== --- infiniband.orig/include/rdma/ib_user_verbs.h 2008-01-22 20:16:45.000000000 +0200 +++ infiniband/include/rdma/ib_user_verbs.h 2008-01-22 20:18:46.000000000 +0200 @@ -86,7 +86,12 @@ enum { IB_USER_VERBS_CMD_POST_SRQ_RECV, IB_USER_VERBS_CMD_CREATE_XRC_SRQ, IB_USER_VERBS_CMD_OPEN_XRC_DOMAIN, - IB_USER_VERBS_CMD_CLOSE_XRC_DOMAIN + IB_USER_VERBS_CMD_CLOSE_XRC_DOMAIN, + IB_USER_VERBS_CMD_CREATE_XRC_RCV_QP, + IB_USER_VERBS_CMD_MODIFY_XRC_RCV_QP, + IB_USER_VERBS_CMD_QUERY_XRC_RCV_QP, + IB_USER_VERBS_CMD_REG_XRC_RCV_QP, + IB_USER_VERBS_CMD_UNREG_XRC_RCV_QP, }; /* @@ -714,6 +719,76 @@ struct ib_uverbs_close_xrc_domain { __u64 driver_data[0]; }; +struct ib_uverbs_create_xrc_rcv_qp { + __u64 response; + __u64 user_handle; + __u32 xrc_domain_handle; + __u32 max_send_wr; + __u32 max_recv_wr; + __u32 max_send_sge; + __u32 max_recv_sge; + __u32 max_inline_data; + __u8 sq_sig_all; + __u8 qp_type; + __u8 reserved[2]; + __u64 driver_data[0]; +}; + +struct ib_uverbs_create_xrc_rcv_qp_resp { + __u32 qpn; + __u32 reserved; +}; + +struct ib_uverbs_modify_xrc_rcv_qp { + __u32 xrc_domain_handle; + __u32 qp_num; + struct ib_uverbs_qp_dest dest; + struct ib_uverbs_qp_dest alt_dest; + __u32 attr_mask; + __u32 qkey; + __u32 rq_psn; + __u32 sq_psn; + __u32 dest_qp_num; + __u32 qp_access_flags; + __u16 pkey_index; + __u16 alt_pkey_index; + __u8 qp_state; + __u8 cur_qp_state; + __u8 path_mtu; + __u8 path_mig_state; + __u8 en_sqd_async_notify; + __u8 max_rd_atomic; + __u8 max_dest_rd_atomic; + __u8 min_rnr_timer; + __u8 port_num; + __u8 timeout; + __u8 retry_cnt; + __u8 rnr_retry; + __u8 alt_port_num; + __u8 
alt_timeout; + __u8 reserved[2]; + __u64 driver_data[0]; +}; + +struct ib_uverbs_query_xrc_rcv_qp { + __u64 response; + __u32 xrc_domain_handle; + __u32 qp_num; + __u32 attr_mask; + __u64 driver_data[0]; +}; + +struct ib_uverbs_reg_xrc_rcv_qp { + __u32 xrc_domain_handle; + __u32 qp_num; + __u64 driver_data[0]; +}; + +struct ib_uverbs_unreg_xrc_rcv_qp { + __u32 xrc_domain_handle; + __u32 qp_num; + __u64 driver_data[0]; +}; #endif /* IB_USER_VERBS_H */ Index: infiniband/drivers/infiniband/core/uverbs.h =================================================================== --- infiniband.orig/drivers/infiniband/core/uverbs.h 2008-01-22 20:18:28.000000000 +0200 +++ infiniband/drivers/infiniband/core/uverbs.h 2008-01-22 20:18:46.000000000 +0200 @@ -163,8 +163,12 @@ void ib_uverbs_qp_event_handler(struct i void ib_uverbs_srq_event_handler(struct ib_event *event, void *context_ptr); void ib_uverbs_event_handler(struct ib_event_handler *handler, struct ib_event *event); +void ib_uverbs_xrc_rcv_qp_event_handler(struct ib_event *event, + void *context_ptr); void ib_uverbs_dealloc_xrcd(struct ib_device *ib_dev, struct ib_xrcd *xrcd); +int ib_uverbs_cleanup_xrc_rcv_qp(struct ib_uverbs_file *file, + u32 domain_handle, u32 qp_num); #define IB_UVERBS_DECLARE_CMD(name) \ ssize_t ib_uverbs_##name(struct ib_uverbs_file *file, \ @@ -202,6 +206,11 @@ IB_UVERBS_DECLARE_CMD(destroy_srq); IB_UVERBS_DECLARE_CMD(create_xrc_srq); IB_UVERBS_DECLARE_CMD(open_xrc_domain); IB_UVERBS_DECLARE_CMD(close_xrc_domain); +IB_UVERBS_DECLARE_CMD(create_xrc_rcv_qp); +IB_UVERBS_DECLARE_CMD(modify_xrc_rcv_qp); +IB_UVERBS_DECLARE_CMD(query_xrc_rcv_qp); +IB_UVERBS_DECLARE_CMD(reg_xrc_rcv_qp); +IB_UVERBS_DECLARE_CMD(unreg_xrc_rcv_qp); #endif /* UVERBS_H */ From jackm at dev.mellanox.co.il Wed Jan 23 02:00:20 2008 From: jackm at dev.mellanox.co.il (Jack Morgenstein) Date: Wed, 23 Jan 2008 12:00:20 +0200 Subject: [ofa-general] [PATCH 5/8] core: Implement XRC support (not including xrc receive-only qps) Message-ID: 
<200801231200.20637.jackm@dev.mellanox.co.il> IB/core: Implement XRC support at verbs layer (for case in which fd is not used when opening an xrc_domain). Changes: none Signed-off-by: Jack Morgenstein =================================================================== --- infiniband.orig/drivers/infiniband/core/uverbs_main.c 2008-01-22 18:41:12.000000000 +0200 +++ infiniband/drivers/infiniband/core/uverbs_main.c 2008-01-22 20:16:45.000000000 +0200 @@ -74,6 +74,7 @@ DEFINE_IDR(ib_uverbs_ah_idr); DEFINE_IDR(ib_uverbs_cq_idr); DEFINE_IDR(ib_uverbs_qp_idr); DEFINE_IDR(ib_uverbs_srq_idr); +DEFINE_IDR(ib_uverbs_xrc_domain_idr); static spinlock_t map_lock; static struct ib_uverbs_device *dev_table[IB_UVERBS_MAX_DEVICES]; @@ -110,6 +111,9 @@ static ssize_t (*uverbs_cmd_table[])(str [IB_USER_VERBS_CMD_MODIFY_SRQ] = ib_uverbs_modify_srq, [IB_USER_VERBS_CMD_QUERY_SRQ] = ib_uverbs_query_srq, [IB_USER_VERBS_CMD_DESTROY_SRQ] = ib_uverbs_destroy_srq, + [IB_USER_VERBS_CMD_CREATE_XRC_SRQ] = ib_uverbs_create_xrc_srq, + [IB_USER_VERBS_CMD_OPEN_XRC_DOMAIN] = ib_uverbs_open_xrc_domain, + [IB_USER_VERBS_CMD_CLOSE_XRC_DOMAIN] = ib_uverbs_close_xrc_domain, }; static struct vfsmount *uverbs_event_mnt; @@ -213,17 +217,6 @@ static int ib_uverbs_cleanup_ucontext(st kfree(uqp); } - list_for_each_entry_safe(uobj, tmp, &context->cq_list, list) { - struct ib_cq *cq = uobj->object; - struct ib_uverbs_event_file *ev_file = cq->cq_context; - struct ib_ucq_object *ucq = - container_of(uobj, struct ib_ucq_object, uobject); - - idr_remove_uobj(&ib_uverbs_cq_idr, uobj); - ib_destroy_cq(cq); - ib_uverbs_release_ucq(file, ev_file, ucq); - kfree(ucq); - } list_for_each_entry_safe(uobj, tmp, &context->srq_list, list) { struct ib_srq *srq = uobj->object; @@ -236,6 +229,18 @@ static int ib_uverbs_cleanup_ucontext(st kfree(uevent); } + list_for_each_entry_safe(uobj, tmp, &context->cq_list, list) { + struct ib_cq *cq = uobj->object; + struct ib_uverbs_event_file *ev_file = cq->cq_context; + struct ib_ucq_object 
*ucq = + container_of(uobj, struct ib_ucq_object, uobject); + + idr_remove_uobj(&ib_uverbs_cq_idr, uobj); + ib_destroy_cq(cq); + ib_uverbs_release_ucq(file, ev_file, ucq); + kfree(ucq); + } + /* XXX Free MWs */ list_for_each_entry_safe(uobj, tmp, &context->mr_list, list) { @@ -246,6 +251,14 @@ static int ib_uverbs_cleanup_ucontext(st kfree(uobj); } + list_for_each_entry_safe(uobj, tmp, &context->xrc_domain_list, list) { + struct ib_xrcd *xrcd = uobj->object; + + idr_remove_uobj(&ib_uverbs_xrc_domain_idr, uobj); + ib_dealloc_xrcd(xrcd); + kfree(uobj); + } + list_for_each_entry_safe(uobj, tmp, &context->pd_list, list) { struct ib_pd *pd = uobj->object; Index: infiniband/include/rdma/ib_user_verbs.h =================================================================== --- infiniband.orig/include/rdma/ib_user_verbs.h 2008-01-22 18:41:24.000000000 +0200 +++ infiniband/include/rdma/ib_user_verbs.h 2008-01-22 20:16:45.000000000 +0200 @@ -83,7 +83,10 @@ enum { IB_USER_VERBS_CMD_MODIFY_SRQ, IB_USER_VERBS_CMD_QUERY_SRQ, IB_USER_VERBS_CMD_DESTROY_SRQ, - IB_USER_VERBS_CMD_POST_SRQ_RECV + IB_USER_VERBS_CMD_POST_SRQ_RECV, + IB_USER_VERBS_CMD_CREATE_XRC_SRQ, + IB_USER_VERBS_CMD_OPEN_XRC_DOMAIN, + IB_USER_VERBS_CMD_CLOSE_XRC_DOMAIN }; /* @@ -643,6 +646,18 @@ struct ib_uverbs_create_srq { __u64 driver_data[0]; }; +struct ib_uverbs_create_xrc_srq { + __u64 response; + __u64 user_handle; + __u32 pd_handle; + __u32 max_wr; + __u32 max_sge; + __u32 srq_limit; + __u32 xrcd_handle; + __u32 xrc_cq; + __u64 driver_data[0]; +}; + struct ib_uverbs_create_srq_resp { __u32 srq_handle; __u32 max_wr; @@ -682,4 +697,23 @@ struct ib_uverbs_destroy_srq_resp { __u32 events_reported; }; +struct ib_uverbs_open_xrc_domain { + __u64 response; + __u32 fd; + __u32 oflags; + __u64 driver_data[0]; +}; + +struct ib_uverbs_open_xrc_domain_resp { + __u32 xrcd_handle; +}; + +struct ib_uverbs_close_xrc_domain { + __u64 response; + __u32 xrcd_handle; + __u64 driver_data[0]; +}; + + + #endif /* IB_USER_VERBS_H */ 
Index: infiniband/drivers/infiniband/core/uverbs_cmd.c =================================================================== --- infiniband.orig/drivers/infiniband/core/uverbs_cmd.c 2008-01-22 20:05:16.000000000 +0200 +++ infiniband/drivers/infiniband/core/uverbs_cmd.c 2008-01-22 20:16:45.000000000 +0200 @@ -256,6 +256,16 @@ static void put_srq_read(struct ib_srq * put_uobj_read(srq->uobject); } +static struct ib_xrcd *idr_read_xrcd(int xrcd_handle, struct ib_ucontext *context) +{ + return idr_read_obj(&ib_uverbs_xrc_domain_idr, xrcd_handle, context, 0); +} + +static void put_xrcd_read(struct ib_xrcd *xrcd) +{ + put_uobj_read(xrcd->uobject); +} + ssize_t ib_uverbs_get_context(struct ib_uverbs_file *file, const char __user *buf, int in_len, int out_len) @@ -299,6 +309,7 @@ ssize_t ib_uverbs_get_context(struct ib_ INIT_LIST_HEAD(&ucontext->qp_list); INIT_LIST_HEAD(&ucontext->srq_list); INIT_LIST_HEAD(&ucontext->ah_list); + INIT_LIST_HEAD(&ucontext->xrc_domain_list); ucontext->closing = 0; resp.num_comp_vectors = file->device->num_comp_vectors; @@ -1028,6 +1039,7 @@ ssize_t ib_uverbs_create_qp(struct ib_uv struct ib_srq *srq; struct ib_qp *qp; struct ib_qp_init_attr attr; + struct ib_xrcd *xrcd; int ret; if (out_len < sizeof resp) @@ -1047,13 +1059,17 @@ ssize_t ib_uverbs_create_qp(struct ib_uv init_uobj(&obj->uevent.uobject, cmd.user_handle, file->ucontext, &qp_lock_key); down_write(&obj->uevent.uobject.mutex); - srq = cmd.is_srq ? idr_read_srq(cmd.srq_handle, file->ucontext) : NULL; + srq = (cmd.is_srq && cmd.qp_type != IB_QPT_XRC) ? + idr_read_srq(cmd.srq_handle, file->ucontext) : NULL; + xrcd = cmd.qp_type == IB_QPT_XRC ? + idr_read_xrcd(cmd.srq_handle, file->ucontext) : NULL; pd = idr_read_pd(cmd.pd_handle, file->ucontext); scq = idr_read_cq(cmd.send_cq_handle, file->ucontext, 0); rcq = cmd.recv_cq_handle == cmd.send_cq_handle ? 
scq : idr_read_cq(cmd.recv_cq_handle, file->ucontext, 1); - if (!pd || !scq || !rcq || (cmd.is_srq && !srq)) { + if (!pd || !scq || !rcq || cmd.is_srq && !srq || + cmd.qp_type == IB_QPT_XRC && !xrcd) { ret = -EINVAL; goto err_put; } @@ -1065,6 +1081,7 @@ ssize_t ib_uverbs_create_qp(struct ib_uv attr.srq = srq; attr.sq_sig_type = cmd.sq_sig_all ? IB_SIGNAL_ALL_WR : IB_SIGNAL_REQ_WR; attr.qp_type = cmd.qp_type; + attr.xrc_domain = xrcd; attr.create_flags = 0; attr.cap.max_send_wr = cmd.max_send_wr; @@ -1092,11 +1109,14 @@ ssize_t ib_uverbs_create_qp(struct ib_uv qp->event_handler = attr.event_handler; qp->qp_context = attr.qp_context; qp->qp_type = attr.qp_type; + qp->xrcd = attr.xrc_domain; atomic_inc(&pd->usecnt); atomic_inc(&attr.send_cq->usecnt); atomic_inc(&attr.recv_cq->usecnt); if (attr.srq) atomic_inc(&attr.srq->usecnt); + else if (attr.xrc_domain) + atomic_inc(&attr.xrc_domain->usecnt); obj->uevent.uobject.object = qp; ret = idr_add_uobj(&ib_uverbs_qp_idr, &obj->uevent.uobject); @@ -1124,6 +1144,8 @@ ssize_t ib_uverbs_create_qp(struct ib_uv put_cq_read(rcq); if (srq) put_srq_read(srq); + if (xrcd) + put_xrcd_read(xrcd); mutex_lock(&file->mutex); list_add_tail(&obj->uevent.uobject.list, &file->ucontext->qp_list); @@ -1150,6 +1172,8 @@ err_put: put_cq_read(rcq); if (srq) put_srq_read(srq); + if (xrcd) + put_xrcd_read(xrcd); put_uobj_write(&obj->uevent.uobject); return ret; @@ -1993,6 +2017,8 @@ ssize_t ib_uverbs_create_srq(struct ib_u srq->uobject = &obj->uobject; srq->event_handler = attr.event_handler; srq->srq_context = attr.srq_context; + srq->xrc_cq = NULL; + srq->xrcd = NULL; atomic_inc(&pd->usecnt); atomic_set(&srq->usecnt, 0); @@ -2038,6 +2064,135 @@ err: return ret; } +ssize_t ib_uverbs_create_xrc_srq(struct ib_uverbs_file *file, + const char __user *buf, int in_len, + int out_len) +{ + struct ib_uverbs_create_xrc_srq cmd; + struct ib_uverbs_create_srq_resp resp; + struct ib_udata udata; + struct ib_uevent_object *obj; + struct ib_pd *pd; + struct 
ib_srq *srq; + struct ib_cq *xrc_cq; + struct ib_xrcd *xrcd; + struct ib_srq_init_attr attr; + int ret; + + if (out_len < sizeof resp) + return -ENOSPC; + + if (copy_from_user(&cmd, buf, sizeof cmd)) + return -EFAULT; + + INIT_UDATA(&udata, buf + sizeof cmd, + (unsigned long) cmd.response + sizeof resp, + in_len - sizeof cmd, out_len - sizeof resp); + + obj = kmalloc(sizeof *obj, GFP_KERNEL); + if (!obj) + return -ENOMEM; + + init_uobj(&obj->uobject, cmd.user_handle, file->ucontext, &srq_lock_key); + down_write(&obj->uobject.mutex); + + pd = idr_read_pd(cmd.pd_handle, file->ucontext); + if (!pd) { + ret = -EINVAL; + goto err; + } + + xrc_cq = idr_read_cq(cmd.xrc_cq, file->ucontext, 0); + if (!xrc_cq) { + ret = -EINVAL; + goto err_put_pd; + } + + xrcd = idr_read_xrcd(cmd.xrcd_handle, file->ucontext); + if (!xrcd) { + ret = -EINVAL; + goto err_put_cq; + } + + + attr.event_handler = ib_uverbs_srq_event_handler; + attr.srq_context = file; + attr.attr.max_wr = cmd.max_wr; + attr.attr.max_sge = cmd.max_sge; + attr.attr.srq_limit = cmd.srq_limit; + + obj->events_reported = 0; + INIT_LIST_HEAD(&obj->event_list); + + srq = pd->device->create_xrc_srq(pd, xrc_cq, xrcd, &attr, &udata); + if (IS_ERR(srq)) { + ret = PTR_ERR(srq); + goto err_put; + } + + srq->device = pd->device; + srq->pd = pd; + srq->uobject = &obj->uobject; + srq->event_handler = attr.event_handler; + srq->srq_context = attr.srq_context; + srq->xrc_cq = xrc_cq; + srq->xrcd = xrcd; + atomic_inc(&pd->usecnt); + atomic_inc(&xrc_cq->usecnt); + atomic_inc(&xrcd->usecnt); + + atomic_set(&srq->usecnt, 0); + + obj->uobject.object = srq; + ret = idr_add_uobj(&ib_uverbs_srq_idr, &obj->uobject); + if (ret) + goto err_destroy; + + memset(&resp, 0, sizeof resp); + resp.srq_handle = obj->uobject.id; + resp.max_wr = attr.attr.max_wr; + resp.max_sge = attr.attr.max_sge; + + if (copy_to_user((void __user *) (unsigned long) cmd.response, + &resp, sizeof resp)) { + ret = -EFAULT; + goto err_copy; + } + + put_xrcd_read(xrcd); + 
put_cq_read(xrc_cq); + put_pd_read(pd); + + mutex_lock(&file->mutex); + list_add_tail(&obj->uobject.list, &file->ucontext->srq_list); + mutex_unlock(&file->mutex); + + obj->uobject.live = 1; + + up_write(&obj->uobject.mutex); + + return in_len; + +err_copy: + idr_remove_uobj(&ib_uverbs_srq_idr, &obj->uobject); + +err_destroy: + ib_destroy_srq(srq); + +err_put: + put_xrcd_read(xrcd); + +err_put_cq: + put_cq_read(xrc_cq); + +err_put_pd: + put_pd_read(pd); + +err: + put_uobj_write(&obj->uobject); + return ret; +} + ssize_t ib_uverbs_modify_srq(struct ib_uverbs_file *file, const char __user *buf, int in_len, int out_len) @@ -2156,3 +2311,120 @@ ssize_t ib_uverbs_destroy_srq(struct ib_ return ret ? ret : in_len; } + +ssize_t ib_uverbs_open_xrc_domain(struct ib_uverbs_file *file, + const char __user *buf, int in_len, + int out_len) +{ + struct ib_uverbs_open_xrc_domain cmd; + struct ib_uverbs_open_xrc_domain_resp resp; + struct ib_udata udata; + struct ib_uobject *uobj; + struct ib_xrcd *xrcd; + int ret; + + if (out_len < sizeof resp) + return -ENOSPC; + + if (copy_from_user(&cmd, buf, sizeof cmd)) + return -EFAULT; + + /* file descriptors/inodes not yet implemented */ + if (cmd.fd != (u32) (-1)) + return -ENOSYS; + + INIT_UDATA(&udata, buf + sizeof cmd, + (unsigned long) cmd.response + sizeof resp, + in_len - sizeof cmd, out_len - sizeof resp); + + uobj = kmalloc(sizeof *uobj, GFP_KERNEL); + if (!uobj) + return -ENOMEM; + + init_uobj(uobj, 0, file->ucontext, &pd_lock_key); + down_write(&uobj->mutex); + + + xrcd = file->device->ib_dev->alloc_xrcd(file->device->ib_dev, + file->ucontext, &udata); + if (IS_ERR(xrcd)) { + ret = PTR_ERR(xrcd); + goto err; + } + + xrcd->fd = cmd.fd; + xrcd->flags = cmd.oflags; + xrcd->uobject = uobj; + xrcd->device = file->device->ib_dev; + atomic_set(&xrcd->usecnt, 0); + + uobj->object = xrcd; + ret = idr_add_uobj(&ib_uverbs_xrc_domain_idr, uobj); + if (ret) + goto err_idr; + + memset(&resp, 0, sizeof resp); + resp.xrcd_handle = uobj->id; + + 
if (copy_to_user((void __user *) (unsigned long) cmd.response, + &resp, sizeof resp)) { + ret = -EFAULT; + goto err_copy; + } + + mutex_lock(&file->mutex); + list_add_tail(&uobj->list, &file->ucontext->xrc_domain_list); + mutex_unlock(&file->mutex); + + uobj->live = 1; + + up_write(&uobj->mutex); + + return in_len; + +err_copy: + idr_remove_uobj(&ib_uverbs_pd_idr, uobj); + +err_idr: + ib_dealloc_xrcd(xrcd); + +err: + put_uobj_write(uobj); + return ret; +} + +ssize_t ib_uverbs_close_xrc_domain(struct ib_uverbs_file *file, + const char __user *buf, int in_len, + int out_len) +{ + struct ib_uverbs_close_xrc_domain cmd; + struct ib_uobject *uobj; + int ret; + + if (copy_from_user(&cmd, buf, sizeof cmd)) + return -EFAULT; + + uobj = idr_write_uobj(&ib_uverbs_xrc_domain_idr, cmd.xrcd_handle, file->ucontext); + if (!uobj) + return -EINVAL; + + ret = ib_dealloc_xrcd(uobj->object); + if (!ret) + uobj->live = 0; + + put_uobj_write(uobj); + + if (ret) + return ret; + + idr_remove_uobj(&ib_uverbs_xrc_domain_idr, uobj); + + mutex_lock(&file->mutex); + list_del(&uobj->list); + mutex_unlock(&file->mutex); + + put_uobj(uobj); + + return in_len; +} + Index: infiniband/drivers/infiniband/core/verbs.c =================================================================== --- infiniband.orig/drivers/infiniband/core/verbs.c 2008-01-22 18:41:12.000000000 +0200 +++ infiniband/drivers/infiniband/core/verbs.c 2008-01-22 20:16:45.000000000 +0200 @@ -236,6 +236,8 @@ struct ib_srq *ib_create_srq(struct ib_p srq->uobject = NULL; srq->event_handler = srq_init_attr->event_handler; srq->srq_context = srq_init_attr->srq_context; + srq->xrc_cq = NULL; + srq->xrcd = NULL; atomic_inc(&pd->usecnt); atomic_set(&srq->usecnt, 0); } @@ -263,16 +265,25 @@ EXPORT_SYMBOL(ib_query_srq); int ib_destroy_srq(struct ib_srq *srq) { struct ib_pd *pd; + struct ib_cq *xrc_cq; + struct ib_xrcd *xrcd; int ret; if (atomic_read(&srq->usecnt)) return -EBUSY; pd = srq->pd; + xrc_cq = srq->xrc_cq; + xrcd = srq->xrcd; ret = 
srq->device->destroy_srq(srq); - if (!ret) + if (!ret) { atomic_dec(&pd->usecnt); + if (xrc_cq) + atomic_dec(&xrc_cq->usecnt); + if (xrcd) + atomic_dec(&xrcd->usecnt); + } return ret; } @@ -297,6 +308,7 @@ struct ib_qp *ib_create_qp(struct ib_pd qp->event_handler = qp_init_attr->event_handler; qp->qp_context = qp_init_attr->qp_context; qp->qp_type = qp_init_attr->qp_type; + qp->xrcd = NULL; atomic_inc(&pd->usecnt); atomic_inc(&qp_init_attr->send_cq->usecnt); atomic_inc(&qp_init_attr->recv_cq->usecnt); @@ -328,6 +340,9 @@ static const struct { [IB_QPT_RC] = (IB_QP_PKEY_INDEX | IB_QP_PORT | IB_QP_ACCESS_FLAGS), + [IB_QPT_XRC] = (IB_QP_PKEY_INDEX | + IB_QP_PORT | + IB_QP_ACCESS_FLAGS), [IB_QPT_SMI] = (IB_QP_PKEY_INDEX | IB_QP_QKEY), [IB_QPT_GSI] = (IB_QP_PKEY_INDEX | @@ -350,6 +365,9 @@ static const struct { [IB_QPT_RC] = (IB_QP_PKEY_INDEX | IB_QP_PORT | IB_QP_ACCESS_FLAGS), + [IB_QPT_XRC] = (IB_QP_PKEY_INDEX | + IB_QP_PORT | + IB_QP_ACCESS_FLAGS), [IB_QPT_SMI] = (IB_QP_PKEY_INDEX | IB_QP_QKEY), [IB_QPT_GSI] = (IB_QP_PKEY_INDEX | @@ -369,6 +387,12 @@ static const struct { IB_QP_RQ_PSN | IB_QP_MAX_DEST_RD_ATOMIC | IB_QP_MIN_RNR_TIMER), + [IB_QPT_XRC] = (IB_QP_AV | + IB_QP_PATH_MTU | + IB_QP_DEST_QPN | + IB_QP_RQ_PSN | + IB_QP_MAX_DEST_RD_ATOMIC | + IB_QP_MIN_RNR_TIMER), }, .opt_param = { [IB_QPT_UD] = (IB_QP_PKEY_INDEX | @@ -379,6 +403,9 @@ static const struct { [IB_QPT_RC] = (IB_QP_ALT_PATH | IB_QP_ACCESS_FLAGS | IB_QP_PKEY_INDEX), + [IB_QPT_XRC] = (IB_QP_ALT_PATH | + IB_QP_ACCESS_FLAGS | + IB_QP_PKEY_INDEX), [IB_QPT_SMI] = (IB_QP_PKEY_INDEX | IB_QP_QKEY), [IB_QPT_GSI] = (IB_QP_PKEY_INDEX | @@ -399,6 +426,11 @@ static const struct { IB_QP_RNR_RETRY | IB_QP_SQ_PSN | IB_QP_MAX_QP_RD_ATOMIC), + [IB_QPT_XRC] = (IB_QP_TIMEOUT | + IB_QP_RETRY_CNT | + IB_QP_RNR_RETRY | + IB_QP_SQ_PSN | + IB_QP_MAX_QP_RD_ATOMIC), [IB_QPT_SMI] = IB_QP_SQ_PSN, [IB_QPT_GSI] = IB_QP_SQ_PSN, }, @@ -414,6 +446,11 @@ static const struct { IB_QP_ACCESS_FLAGS | IB_QP_MIN_RNR_TIMER | 
IB_QP_PATH_MIG_STATE), + [IB_QPT_XRC] = (IB_QP_CUR_STATE | + IB_QP_ALT_PATH | + IB_QP_ACCESS_FLAGS | + IB_QP_MIN_RNR_TIMER | + IB_QP_PATH_MIG_STATE), [IB_QPT_SMI] = (IB_QP_CUR_STATE | IB_QP_QKEY), [IB_QPT_GSI] = (IB_QP_CUR_STATE | @@ -438,6 +475,11 @@ static const struct { IB_QP_ALT_PATH | IB_QP_PATH_MIG_STATE | IB_QP_MIN_RNR_TIMER), + [IB_QPT_XRC] = (IB_QP_CUR_STATE | + IB_QP_ACCESS_FLAGS | + IB_QP_ALT_PATH | + IB_QP_PATH_MIG_STATE | + IB_QP_MIN_RNR_TIMER), [IB_QPT_SMI] = (IB_QP_CUR_STATE | IB_QP_QKEY), [IB_QPT_GSI] = (IB_QP_CUR_STATE | @@ -450,6 +492,7 @@ static const struct { [IB_QPT_UD] = IB_QP_EN_SQD_ASYNC_NOTIFY, [IB_QPT_UC] = IB_QP_EN_SQD_ASYNC_NOTIFY, [IB_QPT_RC] = IB_QP_EN_SQD_ASYNC_NOTIFY, + [IB_QPT_XRC] = IB_QP_EN_SQD_ASYNC_NOTIFY, [IB_QPT_SMI] = IB_QP_EN_SQD_ASYNC_NOTIFY, [IB_QPT_GSI] = IB_QP_EN_SQD_ASYNC_NOTIFY } @@ -472,6 +515,11 @@ static const struct { IB_QP_ACCESS_FLAGS | IB_QP_MIN_RNR_TIMER | IB_QP_PATH_MIG_STATE), + [IB_QPT_XRC] = (IB_QP_CUR_STATE | + IB_QP_ALT_PATH | + IB_QP_ACCESS_FLAGS | + IB_QP_MIN_RNR_TIMER | + IB_QP_PATH_MIG_STATE), [IB_QPT_SMI] = (IB_QP_CUR_STATE | IB_QP_QKEY), [IB_QPT_GSI] = (IB_QP_CUR_STATE | @@ -500,6 +548,18 @@ static const struct { IB_QP_PKEY_INDEX | IB_QP_MIN_RNR_TIMER | IB_QP_PATH_MIG_STATE), + [IB_QPT_XRC] = (IB_QP_PORT | + IB_QP_AV | + IB_QP_TIMEOUT | + IB_QP_RETRY_CNT | + IB_QP_RNR_RETRY | + IB_QP_MAX_QP_RD_ATOMIC | + IB_QP_MAX_DEST_RD_ATOMIC | + IB_QP_ALT_PATH | + IB_QP_ACCESS_FLAGS | + IB_QP_PKEY_INDEX | + IB_QP_MIN_RNR_TIMER | + IB_QP_PATH_MIG_STATE), [IB_QPT_SMI] = (IB_QP_PKEY_INDEX | IB_QP_QKEY), [IB_QPT_GSI] = (IB_QP_PKEY_INDEX | @@ -584,12 +644,14 @@ int ib_destroy_qp(struct ib_qp *qp) struct ib_pd *pd; struct ib_cq *scq, *rcq; struct ib_srq *srq; + struct ib_xrcd *xrcd; int ret; pd = qp->pd; scq = qp->send_cq; rcq = qp->recv_cq; srq = qp->srq; + xrcd = qp->xrcd; ret = qp->device->destroy_qp(qp); if (!ret) { @@ -598,6 +660,8 @@ int ib_destroy_qp(struct ib_qp *qp) atomic_dec(&rcq->usecnt); if (srq) 
atomic_dec(&srq->usecnt); + if (xrcd) + atomic_dec(&xrcd->usecnt); } return ret; @@ -849,3 +913,14 @@ int ib_detach_mcast(struct ib_qp *qp, un return qp->device->detach_mcast(qp, gid, lid); } EXPORT_SYMBOL(ib_detach_mcast); + +int ib_dealloc_xrcd(struct ib_xrcd *xrcd) +{ + if (atomic_read(&xrcd->usecnt)) + return -EBUSY; + + return xrcd->device->dealloc_xrcd(xrcd); +} +EXPORT_SYMBOL(ib_dealloc_xrcd); + + Index: infiniband/include/rdma/ib_verbs.h =================================================================== --- infiniband.orig/include/rdma/ib_verbs.h 2008-01-22 20:05:16.000000000 +0200 +++ infiniband/include/rdma/ib_verbs.h 2008-01-22 20:16:45.000000000 +0200 @@ -95,7 +95,8 @@ enum ib_device_cap_flags { IB_DEVICE_N_NOTIFY_CQ = (1<<14), IB_DEVICE_ZERO_STAG = (1<<15), IB_DEVICE_SEND_W_INV = (1<<16), - IB_DEVICE_MEM_WINDOW = (1<<17) + IB_DEVICE_MEM_WINDOW = (1<<17), + IB_DEVICE_XRC = (1<<20), }; enum ib_atomic_cap { @@ -482,6 +483,7 @@ enum ib_qp_type { IB_QPT_RC, IB_QPT_UC, IB_QPT_UD, + IB_QPT_XRC, IB_QPT_RAW_IPV6, IB_QPT_RAW_ETY }; @@ -499,6 +501,7 @@ struct ib_qp_init_attr { struct ib_qp_cap cap; enum ib_sig_type sq_sig_type; enum ib_qp_type qp_type; + struct ib_xrcd *xrc_domain; /* XRC qp's only */ u8 port_num; /* special QP types only */ enum qp_create_flags create_flags; }; @@ -717,6 +720,7 @@ struct ib_ucontext { struct list_head qp_list; struct list_head srq_list; struct list_head ah_list; + struct list_head xrc_domain_list; int closing; }; @@ -744,6 +748,18 @@ struct ib_pd { atomic_t usecnt; /* count all resources */ }; +struct ib_xrcd { + struct ib_device *device; + struct ib_uobject *uobject; + struct rb_node node; + u32 xrc_domain_num; + struct inode *inode; + int fd; + u32 flags; + atomic_t usecnt; /* count all resources */ +}; + + struct ib_ah { struct ib_device *device; struct ib_pd *pd; @@ -765,6 +781,8 @@ struct ib_cq { struct ib_srq { struct ib_device *device; struct ib_pd *pd; + struct ib_cq *xrc_cq; + struct ib_xrcd *xrcd; struct ib_uobject 
*uobject; void (*event_handler)(struct ib_event *, void *); void *srq_context; @@ -782,6 +800,7 @@ struct ib_qp { void *qp_context; u32 qp_num; enum ib_qp_type qp_type; + struct ib_xrcd *xrcd; /* XRC QPs only */ }; struct ib_mr { @@ -1026,6 +1045,15 @@ struct ib_device { struct ib_grh *in_grh, struct ib_mad *in_mad, struct ib_mad *out_mad); + struct ib_srq * (*create_xrc_srq)(struct ib_pd *pd, + struct ib_cq *xrc_cq, + struct ib_xrcd *xrcd, + struct ib_srq_init_attr *srq_init_attr, + struct ib_udata *udata); + struct ib_xrcd * (*alloc_xrcd)(struct ib_device *device, + struct ib_ucontext *context, + struct ib_udata *udata); + int (*dealloc_xrcd)(struct ib_xrcd *xrcd); struct ib_dma_mapping_ops *dma_ops; @@ -1836,4 +1864,11 @@ int ib_attach_mcast(struct ib_qp *qp, un */ int ib_detach_mcast(struct ib_qp *qp, union ib_gid *gid, u16 lid); + +/** + * ib_dealloc_xrcd - Deallocates an extended reliably connected domain. + * @xrcd: The xrc domain to deallocate. + */ +int ib_dealloc_xrcd(struct ib_xrcd *xrcd); + #endif /* IB_VERBS_H */ Index: infiniband/drivers/infiniband/core/uverbs.h =================================================================== --- infiniband.orig/drivers/infiniband/core/uverbs.h 2008-01-22 18:41:12.000000000 +0200 +++ infiniband/drivers/infiniband/core/uverbs.h 2008-01-22 20:16:45.000000000 +0200 @@ -143,6 +143,7 @@ extern struct idr ib_uverbs_ah_idr; extern struct idr ib_uverbs_cq_idr; extern struct idr ib_uverbs_qp_idr; extern struct idr ib_uverbs_srq_idr; +extern struct idr ib_uverbs_xrc_domain_idr; void idr_remove_uobj(struct idr *idp, struct ib_uobject *uobj); @@ -196,5 +197,9 @@ IB_UVERBS_DECLARE_CMD(create_srq); IB_UVERBS_DECLARE_CMD(modify_srq); IB_UVERBS_DECLARE_CMD(query_srq); IB_UVERBS_DECLARE_CMD(destroy_srq); +IB_UVERBS_DECLARE_CMD(create_xrc_srq); +IB_UVERBS_DECLARE_CMD(open_xrc_domain); +IB_UVERBS_DECLARE_CMD(close_xrc_domain); + #endif /* UVERBS_H */ From jackm at
dev.mellanox.co.il (Jack Morgenstein) Date: Wed, 23 Jan 2008 12:00:26 +0200 Subject: [ofa-general] [PATCH 6/8] core: Add XRC support for working with file descriptors Message-ID: <200801231200.26600.jackm@dev.mellanox.co.il> Add XRC support for working with file descriptors, to allow sharing XRC domains between processes. Changes: none Signed-off-by: Jack Morgenstein Index: infiniband/drivers/infiniband/core/uverbs_cmd.c =================================================================== --- infiniband.orig/drivers/infiniband/core/uverbs_cmd.c 2008-01-22 20:16:45.000000000 +0200 +++ infiniband/drivers/infiniband/core/uverbs_cmd.c 2008-01-22 20:18:28.000000000 +0200 @@ -39,6 +39,7 @@ #include #include +#include #include "uverbs.h" @@ -256,14 +257,18 @@ static void put_srq_read(struct ib_srq * put_uobj_read(srq->uobject); } -static struct ib_xrcd *idr_read_xrcd(int xrcd_handle, struct ib_ucontext *context) +static struct ib_xrcd *idr_read_xrcd(int xrcd_handle, + struct ib_ucontext *context, + struct ib_uobject **uobj) { - return idr_read_obj(&ib_uverbs_xrc_domain_idr, xrcd_handle, context, 0); + *uobj = idr_read_uobj(&ib_uverbs_xrc_domain_idr, xrcd_handle, + context, 0); + return *uobj ? (*uobj)->object : NULL; } -static void put_xrcd_read(struct ib_xrcd *xrcd) +static void put_xrcd_read(struct ib_uobject *uobj) { - put_uobj_read(xrcd->uobject); + put_uobj_read(uobj); } ssize_t ib_uverbs_get_context(struct ib_uverbs_file *file, @@ -1040,6 +1045,7 @@ ssize_t ib_uverbs_create_qp(struct ib_uv struct ib_qp *qp; struct ib_qp_init_attr attr; struct ib_xrcd *xrcd; + struct ib_uobject *xrcd_uobj; int ret; if (out_len < sizeof resp) @@ -1062,7 +1068,7 @@ ssize_t ib_uverbs_create_qp(struct ib_uv srq = (cmd.is_srq && cmd.qp_type != IB_QPT_XRC) ? idr_read_srq(cmd.srq_handle, file->ucontext) : NULL; xrcd = cmd.qp_type == IB_QPT_XRC ? 
- idr_read_xrcd(cmd.srq_handle, file->ucontext) : NULL; + idr_read_xrcd(cmd.srq_handle, file->ucontext, &xrcd_uobj) : NULL; pd = idr_read_pd(cmd.pd_handle, file->ucontext); scq = idr_read_cq(cmd.send_cq_handle, file->ucontext, 0); rcq = cmd.recv_cq_handle == cmd.send_cq_handle ? @@ -1145,7 +1151,7 @@ ssize_t ib_uverbs_create_qp(struct ib_uv if (srq) put_srq_read(srq); if (xrcd) - put_xrcd_read(xrcd); + put_xrcd_read(xrcd_uobj); mutex_lock(&file->mutex); list_add_tail(&obj->uevent.uobject.list, &file->ucontext->qp_list); @@ -1173,7 +1179,7 @@ err_put: if (srq) put_srq_read(srq); if (xrcd) - put_xrcd_read(xrcd); + put_xrcd_read(xrcd_uobj); put_uobj_write(&obj->uevent.uobject); return ret; @@ -2077,6 +2083,7 @@ ssize_t ib_uverbs_create_xrc_srq(struct struct ib_cq *xrc_cq; struct ib_xrcd *xrcd; struct ib_srq_init_attr attr; + struct ib_uobject *xrcd_uobj; int ret; if (out_len < sizeof resp) @@ -2108,7 +2115,7 @@ ssize_t ib_uverbs_create_xrc_srq(struct goto err_put_pd; } - xrcd = idr_read_xrcd(cmd.xrcd_handle, file->ucontext); + xrcd = idr_read_xrcd(cmd.xrcd_handle, file->ucontext, &xrcd_uobj); if (!xrcd) { ret = -EINVAL; goto err_put_cq; @@ -2159,7 +2166,7 @@ ssize_t ib_uverbs_create_xrc_srq(struct goto err_copy; } - put_xrcd_read(xrcd); + put_xrcd_read(xrcd_uobj); put_cq_read(xrc_cq); put_pd_read(pd); @@ -2180,7 +2187,7 @@ err_destroy: ib_destroy_srq(srq); err_put: - put_xrcd_read(xrcd); + put_xrcd_read(xrcd_uobj); err_put_cq: put_cq_read(xrc_cq); @@ -2312,6 +2319,117 @@ ssize_t ib_uverbs_destroy_srq(struct ib_ return ret ? 
ret : in_len; } +static struct inode * xrc_fd2inode(unsigned int fd) +{ + struct file * f = fget(fd); + + if (!f) + return NULL; + + return f->f_dentry->d_inode; +} + +struct xrcd_table_entry { + struct rb_node node; + struct inode * inode; + struct ib_xrcd *xrcd; +}; + +static int xrcd_table_insert(struct ib_device *dev, + struct inode *i_n, + struct ib_xrcd *xrcd) +{ + struct xrcd_table_entry *entry, *scan; + struct rb_node **p = &dev->ib_uverbs_xrcd_table.rb_node; + struct rb_node *parent = NULL; + + entry = kmalloc(sizeof(struct xrcd_table_entry), GFP_KERNEL); + if (!entry) + return -ENOMEM; + + entry->inode = i_n; + entry->xrcd = xrcd; + + while (*p) + { + parent = *p; + scan = rb_entry(parent, struct xrcd_table_entry, node); + + if (i_n < scan->inode) + p = &(*p)->rb_left; + else if (i_n > scan->inode) + p = &(*p)->rb_right; + else { + kfree(entry); + return -EEXIST; + } + } + + rb_link_node(&entry->node, parent, p); + rb_insert_color(&entry->node, &dev->ib_uverbs_xrcd_table); + return 0; +} + +static int insert_xrcd(struct ib_device *dev, struct inode *i_n, + struct ib_xrcd *xrcd) +{ + int ret; + + ret = xrcd_table_insert(dev, i_n, xrcd); + if (!ret) + igrab(i_n); + + return ret; +} + +static struct xrcd_table_entry * xrcd_table_search(struct ib_device *dev, + struct inode *i_n) +{ + struct xrcd_table_entry *scan; + struct rb_node **p = &dev->ib_uverbs_xrcd_table.rb_node; + struct rb_node *parent = NULL; + + while (*p) + { + parent = *p; + scan = rb_entry(parent, struct xrcd_table_entry, node); + + if (i_n < scan->inode) + p = &(*p)->rb_left; + else if (i_n > scan->inode) + p = &(*p)->rb_right; + else + return scan; + } + return NULL; +} + +static int find_xrcd(struct ib_device *dev, struct inode *i_n, + struct ib_xrcd **xrcd) +{ + struct xrcd_table_entry *entry; + + entry = xrcd_table_search(dev, i_n); + if (!entry) + return -EINVAL; + + *xrcd = entry->xrcd; + return 0; +} + + +static void xrcd_table_delete(struct ib_device *dev, + struct inode *i_n) +{ + 
struct xrcd_table_entry *entry = xrcd_table_search(dev, i_n); + + if (entry) { + iput(i_n); + rb_erase(&entry->node, &dev->ib_uverbs_xrcd_table); + kfree(entry); + } +} + ssize_t ib_uverbs_open_xrc_domain(struct ib_uverbs_file *file, const char __user *buf, int in_len, int out_len) @@ -2320,8 +2438,10 @@ ssize_t ib_uverbs_open_xrc_domain(struct struct ib_uverbs_open_xrc_domain_resp resp; struct ib_udata udata; struct ib_uobject *uobj; - struct ib_xrcd *xrcd; - int ret; + struct ib_xrcd *xrcd = NULL; + struct inode *inode = NULL; + int ret = 0; + int new_xrcd = 0; if (out_len < sizeof resp) return -ENOSPC; @@ -2329,35 +2449,55 @@ ssize_t ib_uverbs_open_xrc_domain(struct if (copy_from_user(&cmd, buf, sizeof cmd)) return -EFAULT; - /* file descriptors/inodes not yet implemented */ - if (cmd.fd != (u32) (-1)) - return -ENOSYS; - INIT_UDATA(&udata, buf + sizeof cmd, (unsigned long) cmd.response + sizeof resp, in_len - sizeof cmd, out_len - sizeof resp); + mutex_lock(&file->device->ib_dev->xrcd_table_mutex); + if (cmd.fd != (u32) (-1)) { + /* search for file descriptor */ + inode = xrc_fd2inode(cmd.fd); + if (!inode) { + ret = -EBADF; + goto err_table_mutex_unlock; + } + + ret = find_xrcd(file->device->ib_dev, inode, &xrcd); + if (ret && !(cmd.oflags & O_CREAT)) { + /* no file descriptor. 
Need CREATE flag */ + ret = -EAGAIN; + goto err_table_mutex_unlock; + } + + if (xrcd && cmd.oflags & O_EXCL){ + ret = -EINVAL; + goto err_table_mutex_unlock; + } + } + uobj = kmalloc(sizeof *uobj, GFP_KERNEL); - if (!uobj) - return -ENOMEM; + if (!uobj) { + ret = -ENOMEM; + goto err_table_mutex_unlock; + } init_uobj(uobj, 0, file->ucontext, &pd_lock_key); down_write(&uobj->mutex); - - xrcd = file->device->ib_dev->alloc_xrcd(file->device->ib_dev, - file->ucontext, &udata); - if (IS_ERR(xrcd)) { - ret = PTR_ERR(xrcd); - goto err; + if (!xrcd) { + xrcd = file->device->ib_dev->alloc_xrcd(file->device->ib_dev, + file->ucontext, &udata); + if (IS_ERR(xrcd)) { + ret = PTR_ERR(xrcd); + goto err; + } + xrcd->uobject = (cmd.fd == -1) ? uobj : NULL; + xrcd->inode = inode; + xrcd->device = file->device->ib_dev; + atomic_set(&xrcd->usecnt, 0); + new_xrcd = 1; } - xrcd->fd = cmd.fd; - xrcd->flags = cmd.oflags; - xrcd->uobject = uobj; - xrcd->device = file->device->ib_dev; - atomic_set(&xrcd->usecnt, 0); - uobj->object = xrcd; ret = idr_add_uobj(&ib_uverbs_xrc_domain_idr, uobj); if (ret) @@ -2366,6 +2506,16 @@ ssize_t ib_uverbs_open_xrc_domain(struct memset(&resp, 0, sizeof resp); resp.xrcd_handle = uobj->id; + if (inode) { + if (new_xrcd) { + /* create new inode/xrcd table entry */ + ret = insert_xrcd(file->device->ib_dev, inode, xrcd); + if (ret) + goto err_insert_xrcd; + } + atomic_inc(&xrcd->usecnt); + } + if (copy_to_user((void __user *) (unsigned long) cmd.response, &resp, sizeof resp)) { ret = -EFAULT; @@ -2380,16 +2530,29 @@ ssize_t ib_uverbs_open_xrc_domain(struct up_write(&uobj->mutex); + mutex_unlock(&file->device->ib_dev->xrcd_table_mutex); return in_len; err_copy: - idr_remove_uobj(&ib_uverbs_pd_idr, uobj); + + if (inode) { + if (new_xrcd) + xrcd_table_delete(file->device->ib_dev, inode); + atomic_dec(&xrcd->usecnt); + } + +err_insert_xrcd: + idr_remove_uobj(&ib_uverbs_xrc_domain_idr, uobj); err_idr: ib_dealloc_xrcd(xrcd); err: put_uobj_write(uobj); + 
+err_table_mutex_unlock: + + mutex_unlock(&file->device->ib_dev->xrcd_table_mutex); return ret; } @@ -2399,14 +2562,25 @@ ssize_t ib_uverbs_close_xrc_domain(struc { struct ib_uverbs_close_xrc_domain cmd; struct ib_uobject *uobj; - int ret; + struct ib_xrcd *xrcd = NULL; + struct inode *inode = NULL; + int ret = 0; if (copy_from_user(&cmd, buf, sizeof cmd)) return -EFAULT; + mutex_lock(&file->device->ib_dev->xrcd_table_mutex); uobj = idr_write_uobj(&ib_uverbs_xrc_domain_idr, cmd.xrcd_handle, file->ucontext); - if (!uobj) - return -EINVAL; + if (!uobj) { + ret = -EINVAL; + goto err_unlock_mutex; + } + + xrcd = (struct ib_xrcd *) (uobj->object); + inode = xrcd->inode; + + if (inode) + atomic_dec(&xrcd->usecnt); ret = ib_dealloc_xrcd(uobj->object); if (!ret) @@ -2414,8 +2588,11 @@ ssize_t ib_uverbs_close_xrc_domain(struc put_uobj_write(uobj); - if (ret) - return ret; + if (ret && !inode) + goto err_unlock_mutex; + + if (!ret && inode) + xrcd_table_delete(file->device->ib_dev, inode); idr_remove_uobj(&ib_uverbs_xrc_domain_idr, uobj); @@ -2425,6 +2602,27 @@ ssize_t ib_uverbs_close_xrc_domain(struc put_uobj(uobj); + mutex_unlock(&file->device->ib_dev->xrcd_table_mutex); return in_len; + +err_unlock_mutex: + mutex_unlock(&file->device->ib_dev->xrcd_table_mutex); + return ret; } +void ib_uverbs_dealloc_xrcd(struct ib_device *ib_dev, + struct ib_xrcd *xrcd) +{ + struct inode *inode = NULL; + int ret = 0; + + inode = xrcd->inode; + if (inode) + atomic_dec(&xrcd->usecnt); + + ret = ib_dealloc_xrcd(xrcd); + if (!ret && inode) + xrcd_table_delete(ib_dev, inode); +} + + Index: infiniband/include/rdma/ib_verbs.h =================================================================== --- infiniband.orig/include/rdma/ib_verbs.h 2008-01-22 20:16:45.000000000 +0200 +++ infiniband/include/rdma/ib_verbs.h 2008-01-22 20:18:28.000000000 +0200 @@ -52,6 +52,8 @@ #include #include +#include +#include union ib_gid { u8 raw[16]; @@ -751,11 +753,8 @@ struct ib_pd { struct ib_xrcd { struct ib_device 
*device; struct ib_uobject *uobject; - struct rb_node node; - u32 xrc_domain_num; struct inode *inode; - int fd; - u32 flags; + struct rb_node node; atomic_t usecnt; /* count all resources */ }; @@ -1075,6 +1074,8 @@ struct ib_device { __be64 node_guid; u8 node_type; u8 phys_port_cnt; + struct rb_root ib_uverbs_xrcd_table; + struct mutex xrcd_table_mutex; }; struct ib_client { Index: infiniband/drivers/infiniband/core/device.c =================================================================== --- infiniband.orig/drivers/infiniband/core/device.c 2008-01-22 18:41:12.000000000 +0200 +++ infiniband/drivers/infiniband/core/device.c 2008-01-22 20:18:28.000000000 +0200 @@ -290,6 +290,8 @@ int ib_register_device(struct ib_device INIT_LIST_HEAD(&device->client_data_list); spin_lock_init(&device->event_handler_lock); spin_lock_init(&device->client_data_lock); + device->ib_uverbs_xrcd_table = RB_ROOT; + mutex_init(&device->xrcd_table_mutex); ret = read_port_table_lengths(device); if (ret) { Index: infiniband/drivers/infiniband/core/uverbs_main.c =================================================================== --- infiniband.orig/drivers/infiniband/core/uverbs_main.c 2008-01-22 20:16:45.000000000 +0200 +++ infiniband/drivers/infiniband/core/uverbs_main.c 2008-01-22 20:18:28.000000000 +0200 @@ -251,13 +251,15 @@ static int ib_uverbs_cleanup_ucontext(st kfree(uobj); } + mutex_lock(&file->device->ib_dev->xrcd_table_mutex); list_for_each_entry_safe(uobj, tmp, &context->xrc_domain_list, list) { struct ib_xrcd *xrcd = uobj->object; idr_remove_uobj(&ib_uverbs_xrc_domain_idr, uobj); - ib_dealloc_xrcd(xrcd); + ib_uverbs_dealloc_xrcd(file->device->ib_dev, xrcd); kfree(uobj); } + mutex_unlock(&file->device->ib_dev->xrcd_table_mutex); list_for_each_entry_safe(uobj, tmp, &context->pd_list, list) { struct ib_pd *pd = uobj->object; Index: infiniband/drivers/infiniband/core/uverbs.h =================================================================== --- 
infiniband.orig/drivers/infiniband/core/uverbs.h 2008-01-22 20:16:45.000000000 +0200 +++ infiniband/drivers/infiniband/core/uverbs.h 2008-01-22 20:18:28.000000000 +0200 @@ -163,6 +163,8 @@ void ib_uverbs_qp_event_handler(struct i void ib_uverbs_srq_event_handler(struct ib_event *event, void *context_ptr); void ib_uverbs_event_handler(struct ib_event_handler *handler, struct ib_event *event); +void ib_uverbs_dealloc_xrcd(struct ib_device *ib_dev, + struct ib_xrcd *xrcd); #define IB_UVERBS_DECLARE_CMD(name) \ ssize_t ib_uverbs_##name(struct ib_uverbs_file *file, \ From jackm at dev.mellanox.co.il Wed Jan 23 02:00:36 2008 From: jackm at dev.mellanox.co.il (Jack Morgenstein) Date: Wed, 23 Jan 2008 12:00:36 +0200 Subject: [ofa-general] [PATCH 8/8] core: Add xrc support for kernel-space apps Message-ID: <200801231200.36852.jackm@dev.mellanox.co.il> IB/core: implement XRC for kernel-space applications. Changes: none Signed-off-by: Jack Morgenstein Index: infiniband/include/rdma/ib_verbs.h =================================================================== --- infiniband.orig/include/rdma/ib_verbs.h 2008-01-22 20:18:46.000000000 +0200 +++ infiniband/include/rdma/ib_verbs.h 2008-01-22 20:19:38.000000000 +0200 @@ -667,6 +667,7 @@ struct ib_send_wr { u8 port_num; /* valid for DR SMPs on switch only */ } ud; } wr; + u32 xrc_remote_srq_num; /* valid for XRC sends only */ }; struct ib_recv_wr { @@ -799,6 +800,7 @@ struct ib_srq { void (*event_handler)(struct ib_event *, void *); void *srq_context; atomic_t usecnt; + u32 xrc_srq_num; }; struct ib_qp { @@ -1266,8 +1268,28 @@ int ib_query_ah(struct ib_ah *ah, struct int ib_destroy_ah(struct ib_ah *ah); /** - * ib_create_srq - Creates a SRQ associated with the specified protection - * domain. + * ib_create_xrc_srq - Creates an XRC SRQ associated with the specified + * protection domain, cq, and xrc domain. + * @pd: The protection domain associated with the SRQ. + * @xrc_cq: The cq to be associated with the XRC SRQ. 
+ * @xrcd: The XRC domain to be associated with the XRC SRQ. + * @srq_init_attr: A list of initial attributes required to create the + * XRC SRQ. If XRC SRQ creation succeeds, then the attributes are updated + * to the actual capabilities of the created XRC SRQ. + * + * srq_attr->max_wr and srq_attr->max_sge are read the determine the + * requested size of the XRC SRQ, and set to the actual values allocated + * on return. If ib_create_xrc_srq() succeeds, then max_wr and max_sge + * will always be at least as large as the requested values. + */ +struct ib_srq *ib_create_xrc_srq(struct ib_pd *pd, + struct ib_cq *xrc_cq, + struct ib_xrcd *xrcd, + struct ib_srq_init_attr *srq_init_attr); + +/** + * ib_create_srq - Creates an SRQ associated with the specified + * protection domain. * @pd: The protection domain associated with the SRQ. * @srq_init_attr: A list of initial attributes required to create the * SRQ. If SRQ creation succeeds, then the attributes are updated to @@ -1898,8 +1920,14 @@ int ib_detach_mcast(struct ib_qp *qp, un /** * ib_dealloc_xrcd - Deallocates an extended reliably connected domain. - * @pd: The xrc domain to deallocate. + * @xrcd: The xrc domain to deallocate. */ int ib_dealloc_xrcd(struct ib_xrcd *xrcd); +/** + * ib_alloc_xrcd - Allocates an extended reliably connected domain. + * @device: The device on which to allocate the xrcd. 
+ */ +struct ib_xrcd * ib_alloc_xrcd(struct ib_device *device); + #endif /* IB_VERBS_H */ Index: infiniband/drivers/infiniband/core/verbs.c =================================================================== --- infiniband.orig/drivers/infiniband/core/verbs.c 2008-01-22 20:16:45.000000000 +0200 +++ infiniband/drivers/infiniband/core/verbs.c 2008-01-22 20:19:38.000000000 +0200 @@ -246,6 +246,36 @@ struct ib_srq *ib_create_srq(struct ib_p } EXPORT_SYMBOL(ib_create_srq); +struct ib_srq *ib_create_xrc_srq(struct ib_pd *pd, + struct ib_cq *xrc_cq, + struct ib_xrcd *xrcd, + struct ib_srq_init_attr *srq_init_attr) +{ + struct ib_srq *srq; + + if (!pd->device->create_xrc_srq) + return ERR_PTR(-ENOSYS); + + srq = pd->device->create_xrc_srq(pd, xrc_cq, xrcd, srq_init_attr, NULL); + + if (!IS_ERR(srq)) { + srq->device = pd->device; + srq->pd = pd; + srq->uobject = NULL; + srq->event_handler = srq_init_attr->event_handler; + srq->srq_context = srq_init_attr->srq_context; + srq->xrc_cq = xrc_cq; + srq->xrcd = xrcd; + atomic_inc(&pd->usecnt); + atomic_inc(&xrcd->usecnt); + atomic_inc(&xrc_cq->usecnt); + atomic_set(&srq->usecnt, 0); + } + + return srq; +} +EXPORT_SYMBOL(ib_create_xrc_srq); + int ib_modify_srq(struct ib_srq *srq, struct ib_srq_attr *srq_attr, enum ib_srq_attr_mask srq_attr_mask) @@ -308,12 +338,15 @@ struct ib_qp *ib_create_qp(struct ib_pd qp->event_handler = qp_init_attr->event_handler; qp->qp_context = qp_init_attr->qp_context; qp->qp_type = qp_init_attr->qp_type; - qp->xrcd = NULL; + qp->xrcd = qp->qp_type == IB_QPT_XRC ? 
+ qp_init_attr->xrc_domain : NULL; atomic_inc(&pd->usecnt); atomic_inc(&qp_init_attr->send_cq->usecnt); atomic_inc(&qp_init_attr->recv_cq->usecnt); if (qp_init_attr->srq) atomic_inc(&qp_init_attr->srq->usecnt); + if (qp->qp_type == IB_QPT_XRC) + atomic_inc(&qp->xrcd->usecnt); } return qp; @@ -645,6 +678,7 @@ int ib_destroy_qp(struct ib_qp *qp) struct ib_cq *scq, *rcq; struct ib_srq *srq; struct ib_xrcd *xrcd; + enum ib_qp_type qp_type = qp->qp_type; int ret; pd = qp->pd; @@ -660,7 +694,7 @@ int ib_destroy_qp(struct ib_qp *qp) atomic_dec(&rcq->usecnt); if (srq) atomic_dec(&srq->usecnt); - if (xrcd) + if (qp_type == IB_QPT_XRC) atomic_dec(&xrcd->usecnt); } @@ -923,4 +957,22 @@ int ib_dealloc_xrcd(struct ib_xrcd *xrcd } EXPORT_SYMBOL(ib_dealloc_xrcd); +struct ib_xrcd * ib_alloc_xrcd(struct ib_device *device) +{ + struct ib_xrcd *xrcd; + + if (!device->alloc_xrcd) + return ERR_PTR(-ENOSYS); + + xrcd = device->alloc_xrcd(device, NULL, NULL); + if (!IS_ERR(xrcd)) { + xrcd->device = device; + xrcd->inode = NULL; + xrcd->uobject = NULL; + atomic_set(&xrcd->usecnt, 0); + } + return xrcd; +} +EXPORT_SYMBOL(ib_alloc_xrcd); + From bart.vanassche at gmail.com Wed Jan 23 02:59:54 2008 From: bart.vanassche at gmail.com (Bart Van Assche) Date: Wed, 23 Jan 2008 11:59:54 +0100 Subject: [ofa-general] Feature request: ability to enable SDP system-wide Message-ID: As known SDP (Sockets Direct Protocol) can be enabled in Linux processes by setting the environment variable of LD_PRELOAD to /usr/lib/libsdp.so before starting a process. This works fine, but this approach cannot be applied to sockets created in kernel space. E.g. the OCFS2 filesystem included in the Linux creates sockets from within kernel space. Are there any plans to implement a facility like the sdpadm command in Solaris which allows to enable SDP system-wide ? See also http://docs.sun.com/app/docs/doc/819-2240/sdpadm-1m?a=view Bart Van Assche. 
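The reason the LD_PRELOAD approach described above cannot reach kernel sockets can be sketched in a few lines of user-space C. A preload library sits between the application and the C library's socket() call and decides which address family to pass on; kernel code such as OCFS2 never goes through that call. The block below is a simplified illustration, not libsdp's actual logic: sdp_rewrite_family() is a hypothetical helper, and the value 27 for AF_INET_SDP is an assumption based on the OFED SDP headers of that period.

```c
#include <assert.h>
#include <sys/socket.h>

/* Assumption: the out-of-tree OFED SDP stack used address family 27. */
#define AF_INET_SDP 27

/*
 * Hypothetical helper: returns the address family a preload shim would
 * hand to the real socket() call. Only IPv4 stream sockets are redirected;
 * everything else is passed through unchanged.
 */
static int sdp_rewrite_family(int domain, int type)
{
	assert(domain >= 0);	/* address families are non-negative */

	if (domain == AF_INET && type == SOCK_STREAM)
		return AF_INET_SDP;
	return domain;
}
```

Because this decision is made inside a user-space wrapper, a socket created inside the kernel bypasses it entirely, which is why a system-wide switch would have to live in the kernel or in a separate administrative tool.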
From vlad at lists.openfabrics.org Wed Jan 23 03:08:47 2008 From: vlad at lists.openfabrics.org (Vladimir Sokolovsky Mellanox) Date: Wed, 23 Jan 2008 03:08:47 -0800 (PST) Subject: [ofa-general] ofa_1_3_kernel 20080123-0200 daily build status Message-ID: <20080123110848.1F962E600A0@openfabrics.org> This email was generated automatically, please do not reply git_url: git://git.openfabrics.org/ofed_1_3/linux-2.6.git git_branch: ofed_kernel Common build parameters: --with-ipoib-mod --with-sdp-mod --with-srp-mod --with-user_mad-mod --with-user_access-mod --with-mthca-mod --with-mlx4-mod --with-core-mod --with-addr_trans-mod --with-rds-mod --with-cxgb3-mod --with-nes-mod Passed: Passed on i686 with 2.6.15-23-server Passed on i686 with linux-2.6.22 Passed on i686 with linux-2.6.18 Passed on i686 with linux-2.6.17 Passed on i686 with linux-2.6.19 Passed on i686 with linux-2.6.16 Passed on i686 with linux-2.6.13 Passed on i686 with linux-2.6.14 Passed on i686 with linux-2.6.15 Passed on i686 with linux-2.6.12 Passed on x86_64 with linux-2.6.22 Passed on x86_64 with linux-2.6.22.5-31-default Passed on ia64 with linux-2.6.23 Passed on ia64 with linux-2.6.22 Passed on x86_64 with linux-2.6.16 Passed on x86_64 with linux-2.6.9-55.ELsmp Passed on ia64 with linux-2.6.13 Passed on x86_64 with linux-2.6.18-53.el5 Passed on x86_64 with linux-2.6.16.43-0.3-smp Passed on ppc64 with linux-2.6.12 Passed on x86_64 with linux-2.6.20 Passed on ppc64 with linux-2.6.14 Passed on ia64 with linux-2.6.17 Passed on powerpc with linux-2.6.12 Passed on powerpc with linux-2.6.13 Passed on ppc64 with linux-2.6.15 Passed on x86_64 with linux-2.6.13 Passed on powerpc with linux-2.6.15 Passed on ppc64 with linux-2.6.16 Passed on powerpc with linux-2.6.14 Passed on x86_64 with linux-2.6.17 Passed on ia64 with linux-2.6.18 Passed on ppc64 with linux-2.6.19 Passed on ia64 with linux-2.6.12 Passed on x86_64 with linux-2.6.18 Passed on x86_64 with linux-2.6.19 Passed on ppc64 with linux-2.6.18 Passed on 
ia64 with linux-2.6.19 Passed on x86_64 with linux-2.6.9-42.ELsmp Passed on ia64 with linux-2.6.15 Passed on x86_64 with linux-2.6.12 Passed on ppc64 with linux-2.6.17 Passed on x86_64 with linux-2.6.15 Passed on x86_64 with linux-2.6.16.21-0.8-smp Passed on ia64 with linux-2.6.14 Passed on ppc64 with linux-2.6.13 Passed on x86_64 with linux-2.6.14 Passed on ia64 with linux-2.6.16 Passed on ppc64 with linux-2.6.18-8.el5 Passed on x86_64 with linux-2.6.18-1.2798.fc6 Passed on x86_64 with linux-2.6.18-8.el5 Passed on ia64 with linux-2.6.16.21-0.8-default Failed: Build failed on i686 with linux-2.6.21.1 Build failed on ia64 with linux-2.6.21.1 Log: /home/vlad/tmp/ofa_1_3_kernel-20080123-0200_linux-2.6.21.1_ia64_check/drivers/infiniband/ulp/ipoib/ipoib_cm.c: In function 'ipoib_cm_send': /home/vlad/tmp/ofa_1_3_kernel-20080123-0200_linux-2.6.21.1_ia64_check/drivers/infiniband/ulp/ipoib/ipoib_cm.c:535: error: 'struct net_device' has no member named 'stats' /home/vlad/tmp/ofa_1_3_kernel-20080123-0200_linux-2.6.21.1_ia64_check/drivers/infiniband/ulp/ipoib/ipoib_cm.c:545: error: 'struct net_device' has no member named 'stats' make[4]: *** [/home/vlad/tmp/ofa_1_3_kernel-20080123-0200_linux-2.6.21.1_ia64_check/drivers/infiniband/ulp/ipoib/ipoib_cm.o] Error 1 make[3]: *** [/home/vlad/tmp/ofa_1_3_kernel-20080123-0200_linux-2.6.21.1_ia64_check/drivers/infiniband/ulp/ipoib] Error 2 make[2]: *** [/home/vlad/tmp/ofa_1_3_kernel-20080123-0200_linux-2.6.21.1_ia64_check/drivers/infiniband] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_3_kernel-20080123-0200_linux-2.6.21.1_ia64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/ia64/linux-2.6.21.1' make: *** [kernel] Error 2 ---------------------------------------------------------------------------------- Build failed on x86_64 with linux-2.6.21.1 Log: /home/vlad/tmp/ofa_1_3_kernel-20080123-0200_linux-2.6.21.1_x86_64_check/drivers/infiniband/ulp/ipoib/ipoib_cm.c: In function 'ipoib_cm_send': 
/home/vlad/tmp/ofa_1_3_kernel-20080123-0200_linux-2.6.21.1_x86_64_check/drivers/infiniband/ulp/ipoib/ipoib_cm.c:535: error: 'struct net_device' has no member named 'stats' /home/vlad/tmp/ofa_1_3_kernel-20080123-0200_linux-2.6.21.1_x86_64_check/drivers/infiniband/ulp/ipoib/ipoib_cm.c:545: error: 'struct net_device' has no member named 'stats' make[4]: *** [/home/vlad/tmp/ofa_1_3_kernel-20080123-0200_linux-2.6.21.1_x86_64_check/drivers/infiniband/ulp/ipoib/ipoib_cm.o] Error 1 make[3]: *** [/home/vlad/tmp/ofa_1_3_kernel-20080123-0200_linux-2.6.21.1_x86_64_check/drivers/infiniband/ulp/ipoib] Error 2 make[2]: *** [/home/vlad/tmp/ofa_1_3_kernel-20080123-0200_linux-2.6.21.1_x86_64_check/drivers/infiniband] Error 2 make[1]: *** [_module_/home/vlad/tmp/ofa_1_3_kernel-20080123-0200_linux-2.6.21.1_x86_64_check] Error 2 make[1]: Leaving directory `/home/vlad/kernel.org/x86_64/linux-2.6.21.1' make: *** [kernel] Error 2 ----------------------------------------------------------------------------------
From dotanb at dev.mellanox.co.il Wed Jan 23 05:19:24 2008
From: dotanb at dev.mellanox.co.il (Dotan Barak)
Date: Wed, 23 Jan 2008 15:19:24 +0200
Subject: [ofa-general] modify QP from SQE -> RTS (after query QP) fails without IBV_QP_CUR_STATE
Message-ID: <47973EDC.5030504@dev.mellanox.co.il>

Hi Roland.

Recently I started to test the SQE state. While writing the test I found that if a non-RC QP has an error in the SQ, the HCA moves the QP to the SQE state (as it should). I knew that the QP was in SQE because query QP returned this state in the QP attributes.

When I tried to recover the QP and modify its state to RTS, I had to use the flag IBV_QP_CUR_STATE because the internal QP structure assumed that the QP state was RTS (although query QP reported otherwise).

I created and tested 2 patches, for the mthca and mlx4 drivers, that update the internal QP state when a query QP completes successfully.

Will you accept those patches?

Thanks,
Dotan

From jimmott at austin.rr.com Wed Jan 23 05:55:16 2008
From: jimmott at austin.rr.com (Jim Mott)
Date: Wed, 23 Jan 2008 07:55:16 -0600
Subject: [ofa-general] Feature request: ability to enable SDP system-wide
In-Reply-To:
References:
Message-ID: <000601c85dc7$95b48550$c11d8ff0$@rr.com>

There are no current plans to do this.

-----Original Message-----
From: general-bounces at lists.openfabrics.org [mailto:general-bounces at lists.openfabrics.org] On Behalf Of Bart Van Assche
Sent: Wednesday, January 23, 2008 5:00 AM
To: Openib-General
Subject: [ofa-general] Feature request: ability to enable SDP system-wide

As known SDP (Sockets Direct Protocol) can be enabled in Linux processes by setting the environment variable of LD_PRELOAD to /usr/lib/libsdp.so before starting a process. This works fine, but this approach cannot be applied to sockets created in kernel space. E.g. the OCFS2 filesystem included in the Linux creates sockets from within kernel space.
Are there any plans to implement a facility like the sdpadm command in Solaris which allows to enable SDP system-wide ? See also http://docs.sun.com/app/docs/doc/819-2240/sdpadm-1m?a=view

Bart Van Assche.
_______________________________________________
general mailing list
general at lists.openfabrics.org
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general

From glebn at voltaire.com Wed Jan 23 06:11:38 2008
From: glebn at voltaire.com (Gleb Natapov)
Date: Wed, 23 Jan 2008 16:11:38 +0200
Subject: [ewg] RE: [ofa-general] OFED Jan 14 meeting summary on RC2readiness
In-Reply-To:
References: <478D1A49.1080807@mellanox.co.il> <20080117153043.GA10065@minantech.com> <200801190927.08992.jackm@dev.mellanox.co.il>
Message-ID: <20080123141137.GC7336@minantech.com>

On Tue, Jan 22, 2008 at 03:04:39PM -0800, Roland Dreier wrote:
> > > I guess you mean just implement XRC without allowing multiple
> > > processes to share an XRC domain? That actually seems like a sensible
> > > thing to implement as well...
> >
> > This is part of the current XRC implementation -- just give -1 as the fd value
> > in ibv_open_xrc_domain().
>
> I *think* Gleb's point was that the XRC implementation could be much
> simpler if this were the *only* case supported -- you wouldn't need
> all the complexity of kernel receive QPs etc I guess. Gleb, is that
> what you meant?
Yes, that is exactly what I meant.

Just to clarify my position on the XRC API: I am not against it (even if I think that the usefulness of sharing QPs between processes is overestimated, why should I be right?). I just want to be sure that it will not be changed a couple of months after the release of OFED 1.3 because the kernel people will not accept it into the kernel as is.

--
			Gleb.
From dotanb at dev.mellanox.co.il Wed Jan 23 06:25:59 2008
From: dotanb at dev.mellanox.co.il (Dotan Barak)
Date: Wed, 23 Jan 2008 16:25:59 +0200
Subject: [ofa-general] [RFC] IPoIB UD 4K MTU support
In-Reply-To: <1201025321.756.33.camel@localhost.localdomain>
References: <1201025321.756.33.camel@localhost.localdomain>
Message-ID: <47974E77.1020606@dev.mellanox.co.il>

Shirley Ma wrote:
> Hello Roland,
>
> IPoIB UD currently supports up to 2K MTU. Below is the draft patch to
> enable IPoIB UD 4K MTU support for any IB device who has 4K MTU like IBM
> eHCA. This patch limits packet in one page range by setting IPoIB UD MTU
> size as 4K-48 (40 GRH, 4 IPoIB header, 4 padding to IP header align) to
> avoid two contiguous pages allocation when kernel page size is 4K.
> Enabling IPoIB UD 4K MTU relies on both SM to set default broadcast
> group 4K MTU and of course switch should support 4K MTU. When SM default
> broadcast group MTU sets 2K, IPoIB UD MTU will fall back to 2K.
>
> I have tested 2K MTU. 4K MTU is still under testing. The reason I send
> this patch out before my test for review is I want comments as early as
> possible. So I can integrate the comments into this patch and hopefully
> we can make it into OFED-1.3-rc3 which is around Jan.30.
>
>
Hi.

> +	if (!ib_query_port(hca, port, &attr))
> +		priv->max_ib_mtu = ib_mtu_enum_to_int(attr.max_mtu);
> +	else {
> +		printk(KERN_WARNING "%s: ib_query_port %d failed\n",
> +		       hca->name, port);
> +		goto device_init_failed;
> +	}
>
What will happen if the IB port supports 4KB MTU but the operational MTU
of this port is below this value?

Thanks,
Dotan

From glebn at voltaire.com Wed Jan 23 06:27:06 2008
From: glebn at voltaire.com (Gleb Natapov)
Date: Wed, 23 Jan 2008 16:27:06 +0200
Subject: [ofa-general] Resizing SRQ
Message-ID: <20080123142706.GD7336@minantech.com>

Hi,

I want to enlarge SRQ by using modify_srq() verb.
IB Spec says that it is
possible, but looking at the libmlx4 code I see that modify_srq() is
implemented like this:
int mlx4_modify_srq(struct ibv_srq *srq,
		    struct ibv_srq_attr *attr,
		    enum ibv_srq_attr_mask attr_mask)
{
	struct ibv_modify_srq cmd;

	return ibv_cmd_modify_srq(srq, attr, attr_mask, &cmd, sizeof cmd);
}
I don't see buffer reallocations here, so do I miss something or HW/SW
don't support this yet?

--
			Gleb.

From dotanb at dev.mellanox.co.il Wed Jan 23 06:57:07 2008
From: dotanb at dev.mellanox.co.il (Dotan Barak)
Date: Wed, 23 Jan 2008 16:57:07 +0200
Subject: [ofa-general] Resizing SRQ
In-Reply-To: <20080123142706.GD7336@minantech.com>
References: <20080123142706.GD7336@minantech.com>
Message-ID: <479755C3.3040205@dev.mellanox.co.il>

Gleb Natapov wrote:
> Hi,
>
> I want to enlarge SRQ by using modify_srq() verb. IB Spec says that it is
> possible, but looking at the libmlx4 code I see that modify_srq() is
> implemented like this:
> int mlx4_modify_srq(struct ibv_srq *srq,
>                     struct ibv_srq_attr *attr,
>                     enum ibv_srq_attr_mask attr_mask)
> {
>         struct ibv_modify_srq cmd;
>
>         return ibv_cmd_modify_srq(srq, attr, attr_mask, &cmd, sizeof cmd);
> }
> I don't see buffer reallocations here, so do I miss something or HW/SW
> don't support this yet?
>
Resize SRQ is not supported by the mlx4 low level driver.

Dotan

From swise at opengridcomputing.com Wed Jan 23 07:08:46 2008
From: swise at opengridcomputing.com (Steve Wise)
Date: Wed, 23 Jan 2008 09:08:46 -0600
Subject: [ofa-general] [GIT PULL ofed-1.3] iw_cxgb3 bug fixes
Message-ID: <4797587E.1040207@opengridcomputing.com>

Vlad,

Please pull three bug fixes from:

git://git.openfabrics.org/~swise/ofed-1.3 ofed_kernel

The patches have been accepted upstream by Roland. See:

http://lkml.org/lkml/2008/1/21/252

Shortlog:
RDMA/cxgb3: Flush the RQ when closing.
RDMA/cxgb3: fix page shift calculation in build_phys_page_list()
RDMA/cxgb3: Mark qp as privileged based on user capabilities.


Thanks,

Steve.
From glebn at voltaire.com Wed Jan 23 07:11:14 2008
From: glebn at voltaire.com (Gleb Natapov)
Date: Wed, 23 Jan 2008 17:11:14 +0200
Subject: [ofa-general] Resizing SRQ
In-Reply-To: <479755C3.3040205@dev.mellanox.co.il>
References: <20080123142706.GD7336@minantech.com> <479755C3.3040205@dev.mellanox.co.il>
Message-ID: <20080123151114.GF7336@minantech.com>

On Wed, Jan 23, 2008 at 04:57:07PM +0200, Dotan Barak wrote:
> Gleb Natapov wrote:
>> Hi,
>>
>> I want to enlarge SRQ by using modify_srq() verb. IB Spec says that it is
>> possible, but looking at the libmlx4 code I see that modify_srq() is
>> implemented like this:
>> int mlx4_modify_srq(struct ibv_srq *srq,
>>                     struct ibv_srq_attr *attr,
>>                     enum ibv_srq_attr_mask attr_mask)
>> {
>>         struct ibv_modify_srq cmd;
>>
>>         return ibv_cmd_modify_srq(srq, attr, attr_mask, &cmd, sizeof cmd);
>> }
>> I don't see buffer reallocations here, so do I miss something or HW/SW
>> don't support this yet?
>>
> Resize SRQ is not supported by the mlx4 low level driver.
>
Well, I guess it is not implemented for mthca either, then, since the libmthca function looks the same. Are there any plans to implement it?

--
			Gleb.

From vlad at dev.mellanox.co.il Wed Jan 23 07:15:40 2008
From: vlad at dev.mellanox.co.il (Vladimir Sokolovsky)
Date: Wed, 23 Jan 2008 17:15:40 +0200
Subject: [ofa-general] Re: [GIT PULL ofed-1.3] iw_cxgb3 bug fixes
In-Reply-To: <4797587E.1040207@opengridcomputing.com>
References: <4797587E.1040207@opengridcomputing.com>
Message-ID: <47975A1C.3060100@dev.mellanox.co.il>

Steve Wise wrote:
> Vlad,
>
> Please pull three bug fixes from:
>
> git://git.openfabrics.org/~swise/ofed-1.3 ofed_kernel
>
> The patches have been accepted upstream by Roland. See:
>
> http://lkml.org/lkml/2008/1/21/252
>
> Shortlog:
> RDMA/cxgb3: Flush the RQ when closing.
> RDMA/cxgb3: fix page shift calculation in build_phys_page_list()
> RDMA/cxgb3: Mark qp as privileged based on user capabilities.
>
>
> Thanks,
>
> Steve.
> Done, Regards, Vladimir From eli at dev.mellanox.co.il Wed Jan 23 07:32:26 2008 From: eli at dev.mellanox.co.il (Eli Cohen) Date: Wed, 23 Jan 2008 17:32:26 +0200 Subject: [ofa-general] [RFC] IPoIB UD 4K MTU support In-Reply-To: <1201025321.756.33.camel@localhost.localdomain> References: <1201025321.756.33.camel@localhost.localdomain> Message-ID: <1201102346.6925.215.camel@mtls03> Hi Shirley, can you send a path to the git tree this patch is based on? On Tue, 2008-01-22 at 10:08 -0800, Shirley Ma wrote: > Hello Roland, > > IPoIB UD currently supports up to 2K MTU. Below is the draft patch to > enable IPoIB UD 4K MTU support for any IB device who has 4K MTU like IBM > eHCA. This patch limits packet in one page range by setting IPoIB UD MTU > size as 4K-48 (40 GRH, 4 IPoIB header, 4 padding to IP header align) to > avoid two contiguous pages allocation when kernel page size is 4K. > Enabling IPoIB UD 4K MTU relies on both SM to set default broadcast > group 4K MTU and of course switch should support 4K MTU. When SM default > broadcast group MTU sets 2K, IPoIB UD MTU will fall back to 2K. > > I have tested 2K MTU. 4K MTU is still under testing. The reason I send > this patch out before my test for review is I want comments as early as > possible. So I can integrate the comments into this patch and hopefully > we can make it into OFED-1.3-rc3 which is around Jan.30. 
> > Thanks > Shirley > > > diff -urpN > ipoib/ipoib.h /home/shirley/ipoib-test/drivers/infiniband/ulp/ipoib/ipoib.h > --- ipoib/ipoib.h 2008-01-21 14:16:19.000000000 -0500 > +++ /home/shirley/ipoib-test/drivers/infiniband/ulp/ipoib/ipoib.h > 2008-01-22 15:50:13.000000000 -0500 > @@ -56,9 +56,6 @@ > /* constants */ > > enum { > - IPOIB_PACKET_SIZE = 2048, > - IPOIB_BUF_SIZE = IPOIB_PACKET_SIZE + IB_GRH_BYTES, > - > IPOIB_ENCAP_LEN = 4, > > IPOIB_CM_MTU = 0x10000 - 0x10, /* padding to align header > to 16 */ > @@ -320,6 +317,7 @@ struct ipoib_dev_priv { > struct dentry *mcg_dentry; > struct dentry *path_dentry; > #endif > + unsigned int max_ib_mtu; > }; > > struct ipoib_ah { > @@ -698,4 +696,11 @@ extern int ipoib_debug_level; > > #define IPOIB_QPN(ha) (be32_to_cpup((__be32 *) ha) & 0xffffff) > > +/* padding packet to fit one page size for 4K IB mtu */ > +static inline int ipoib_ud_mtu(unsigned int ib_mtu) > +{ > + return (ib_mtu < 4096) ? (ib_mtu - IPOIB_ENCAP_LEN) : > + (ib_mtu - IB_GRH_BYTES - IPOIB_ENCAP_LEN - 4); > +} > + > #endif /* _IPOIB_H */ > diff -urpN > ipoib/ipoib_ib.c /home/shirley/ipoib-test/drivers/infiniband/ulp/ipoib/ipoib_ib.c > --- ipoib/ipoib_ib.c 2008-01-10 13:13:12.000000000 -0500 > +++ /home/shirley/ipoib-test/drivers/infiniband/ulp/ipoib/ipoib_ib.c > 2008-01-22 15:58:16.000000000 -0500 > @@ -87,6 +87,15 @@ void ipoib_free_ah(struct kref *kref) > spin_unlock_irqrestore(&priv->lock, flags); > } > > +static int ipoib_ud_buf_size(unsigned int max_ib_mtu) > +{ > + if (max_ib_mtu < 4096) > + return (max_ib_mtu + IB_GRH_BYTES); > + else > + /* padding packet to one page for 4K mtu */ > + return (max_ib_mtu - 4); > +} > + > static int ipoib_ib_post_receive(struct net_device *dev, int id) > { > struct ipoib_dev_priv *priv = netdev_priv(dev); > @@ -96,7 +105,7 @@ static int ipoib_ib_post_receive(struct > int ret; > > list.addr = priv->rx_ring[id].mapping; > - list.length = IPOIB_BUF_SIZE; > + list.length = ipoib_ud_buf_size(priv->max_ib_mtu); > 
list.lkey = priv->mr->lkey; > > param.next = NULL; > @@ -108,7 +117,7 @@ static int ipoib_ib_post_receive(struct > if (unlikely(ret)) { > ipoib_warn(priv, "receive failed for buf %d (%d)\n", id, ret); > ib_dma_unmap_single(priv->ca, priv->rx_ring[id].mapping, > - IPOIB_BUF_SIZE, DMA_FROM_DEVICE); > + ipoib_ud_buf_size(priv->max_ib_mtu), DMA_FROM_DEVICE); > dev_kfree_skb_any(priv->rx_ring[id].skb); > priv->rx_ring[id].skb = NULL; > } > @@ -122,7 +131,7 @@ static int ipoib_alloc_rx_skb(struct net > struct sk_buff *skb; > u64 addr; > > - skb = dev_alloc_skb(IPOIB_BUF_SIZE + 4); > + skb = dev_alloc_skb(ipoib_ud_buf_size(priv->max_ib_mtu) + 4); > if (!skb) > return -ENOMEM; > > @@ -133,7 +142,7 @@ static int ipoib_alloc_rx_skb(struct net > */ > skb_reserve(skb, 4); > > - addr = ib_dma_map_single(priv->ca, skb->data, IPOIB_BUF_SIZE, > + addr = ib_dma_map_single(priv->ca, skb->data, > ipoib_ud_buf_size(priv->max_ib_mtu), > DMA_FROM_DEVICE); > if (unlikely(ib_dma_mapping_error(priv->ca, addr))) { > dev_kfree_skb_any(skb); > @@ -190,7 +199,7 @@ static void ipoib_ib_handle_rx_wc(struct > "(status=%d, wrid=%d vend_err %x)\n", > wc->status, wr_id, wc->vendor_err); > ib_dma_unmap_single(priv->ca, addr, > - IPOIB_BUF_SIZE, DMA_FROM_DEVICE); > + ipoib_ud_buf_size(priv->max_ib_mtu), DMA_FROM_DEVICE); > dev_kfree_skb_any(skb); > priv->rx_ring[wr_id].skb = NULL; > return; > @@ -215,7 +224,7 @@ static void ipoib_ib_handle_rx_wc(struct > ipoib_dbg_data(priv, "received %d bytes, SLID 0x%04x\n", > wc->byte_len, wc->slid); > > - ib_dma_unmap_single(priv->ca, addr, IPOIB_BUF_SIZE, DMA_FROM_DEVICE); > + ib_dma_unmap_single(priv->ca, addr, > ipoib_ud_buf_size(priv->max_ib_mtu), DMA_FROM_DEVICE); > > skb_put(skb, wc->byte_len); > skb_pull(skb, IB_GRH_BYTES); > @@ -632,7 +641,7 @@ int ipoib_ib_dev_stop(struct net_device > continue; > ib_dma_unmap_single(priv->ca, > rx_req->mapping, > - IPOIB_BUF_SIZE, > + ipoib_ud_buf_size(priv->max_ib_mtu), > DMA_FROM_DEVICE); > 
dev_kfree_skb_any(rx_req->skb); > rx_req->skb = NULL; > diff -urpN > ipoib/ipoib_main.c /home/shirley/ipoib-test/drivers/infiniband/ulp/ipoib/ipoib_main.c > --- ipoib/ipoib_main.c 2008-01-21 14:43:39.000000000 -0500 > +++ /home/shirley/ipoib-test/drivers/infiniband/ulp/ipoib/ipoib_main.c > 2008-01-22 15:39:44.000000000 -0500 > @@ -193,7 +193,7 @@ static int ipoib_change_mtu(struct net_d > return 0; > } > > - if (new_mtu > IPOIB_PACKET_SIZE - IPOIB_ENCAP_LEN) > + if (new_mtu > ipoib_ud_mtu(priv->max_ib_mtu)) > return -EINVAL; > > priv->admin_mtu = new_mtu; > @@ -978,7 +978,7 @@ static void ipoib_setup(struct net_devic > dev->features = NETIF_F_VLAN_CHALLENGED | NETIF_F_LLTX; > > /* MTU will be reset when mcast join happens */ > - dev->mtu = IPOIB_PACKET_SIZE - IPOIB_ENCAP_LEN; > + dev->mtu = ipoib_ud_mtu(priv->max_ib_mtu); > priv->mcast_mtu = priv->admin_mtu = dev->mtu; > > memcpy(dev->broadcast, ipv4_bcast_addr, INFINIBAND_ALEN); > @@ -1112,6 +1112,7 @@ static struct net_device *ipoib_add_port > struct ib_device *hca, u8 port) > { > struct ipoib_dev_priv *priv; > + struct ib_port_attr attr; > int result = -ENOMEM; > > priv = ipoib_intf_alloc(format); > @@ -1120,6 +1121,13 @@ static struct net_device *ipoib_add_port > > SET_NETDEV_DEV(priv->dev, hca->dma_device); > > + if (!ib_query_port(hca, port, &attr)) > + priv->max_ib_mtu = ib_mtu_enum_to_int(attr.max_mtu); > + else { > + printk(KERN_WARNING "%s: ib_query_port %d failed\n", > + hca->name, port); > + goto device_init_failed; > + } > result = ib_query_pkey(hca, port, 0, &priv->pkey); > if (result) { > printk(KERN_WARNING "%s: ib_query_pkey port %d failed (ret = %d)\n", > diff -urpN > ipoib/ipoib_multicast.c /home/shirley/ipoib-test/drivers/infiniband/ulp/ipoib/ipoib_multicast.c > --- ipoib/ipoib_multicast.c 2008-01-10 13:13:12.000000000 -0500 > +++ /home/shirley/ipoib-test/drivers/infiniband/ulp/ipoib/ipoib_multicast.c 2008-01-22 15:42:10.000000000 -0500 > @@ -567,9 +567,7 @@ void ipoib_mcast_join_task(struct 
work_s
> return;
> }
>
> - priv->mcast_mtu = ib_mtu_enum_to_int(priv->broadcast->mcmember.mtu) -
> - IPOIB_ENCAP_LEN;
> -
> + priv->mcast_mtu =
> ipoib_ud_mtu(ib_mtu_enum_to_int(priv->broadcast->mcmember.mtu));
> if (!ipoib_cm_admin_enabled(dev))
> dev->mtu = min(priv->mcast_mtu, priv->admin_mtu);
>
>
> _______________________________________________
> general mailing list
> general at lists.openfabrics.org
> http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general
>
> To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general

From swise at opengridcomputing.com Wed Jan 23 07:39:19 2008
From: swise at opengridcomputing.com (Steve Wise)
Date: Wed, 23 Jan 2008 09:39:19 -0600
Subject: [ofa-general] Re: [PATCH] Non-supported functions should return NULL when returning pointers
In-Reply-To: <20080115235027.GB31543@opengridcomputing.com>
References: <20080115235027.GB31543@opengridcomputing.com>
Message-ID: <47975FA7.7060508@opengridcomputing.com>

Applied.

Thanks,

Steve.

From mashirle at us.ibm.com Tue Jan 22 22:03:36 2008
From: mashirle at us.ibm.com (Shirley Ma)
Date: Tue, 22 Jan 2008 22:03:36 -0800
Subject: [ofa-general] [RFC] IPoIB UD 4K MTU support
In-Reply-To: <47974E77.1020606@dev.mellanox.co.il>
References: <1201025321.756.33.camel@localhost.localdomain> <47974E77.1020606@dev.mellanox.co.il>
Message-ID: <1201068216.756.53.camel@localhost.localdomain>

On Wed, 2008-01-23 at 16:25 +0200, Dotan Barak wrote:
> What will happen if the IB port supports 4KB MTU but the operational MTU
> of this port is below this value?

Thanks for your review, Dotan.

The port is queried so that the receive buffers can be sized. First, we don't need to reallocate the buffers when the IPoIB MTU changes from 2K to 4K. Second, we don't want to waste memory when the port only supports an MTU of 2K or less.
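A minimal user-space sketch of the sizing arithmetic in the patch above, for checking the numbers by hand. The two sizing functions are copied from the patch; IPOIB_ENCAP_LEN = 4 is taken from ipoib.h as quoted, IB_GRH_BYTES = 40 is the kernel's value for the 40-byte GRH, and print_sizes() is a hypothetical helper that is not part of the patch.

```c
#include <stdio.h>

/* Constants the patch relies on. IPOIB_ENCAP_LEN comes from ipoib.h as
 * quoted in the patch; IB_GRH_BYTES is the 40-byte GRH size. */
enum {
	IB_GRH_BYTES    = 40,
	IPOIB_ENCAP_LEN = 4,
};

/* Copied from the patch: usable IPoIB UD MTU for a given IB MTU. */
static int ipoib_ud_mtu(unsigned int ib_mtu)
{
	return (ib_mtu < 4096) ? (ib_mtu - IPOIB_ENCAP_LEN) :
		(ib_mtu - IB_GRH_BYTES - IPOIB_ENCAP_LEN - 4);
}

/* Copied from the patch: receive buffer size for a given port max MTU. */
static int ipoib_ud_buf_size(unsigned int max_ib_mtu)
{
	if (max_ib_mtu < 4096)
		return (max_ib_mtu + IB_GRH_BYTES);
	else
		/* padding packet to one page for 4K mtu */
		return (max_ib_mtu - 4);
}

/* Hypothetical helper (not in the patch): print the derived sizes. */
static void print_sizes(unsigned int ib_mtu)
{
	printf("ib_mtu=%u ud_mtu=%d buf_size=%d skb_alloc=%d\n",
	       ib_mtu, ipoib_ud_mtu(ib_mtu), ipoib_ud_buf_size(ib_mtu),
	       ipoib_ud_buf_size(ib_mtu) + 4);
}
```

Calling print_sizes(2048) and print_sizes(4096) shows that a 2K port gives an IPoIB MTU of 2044 with a 2088-byte receive buffer, while a 4K port gives an MTU of 4048 with a 4092-byte buffer. Adding the 4 bytes that ipoib_alloc_rx_skb() allocates for alignment brings the 4K case to exactly 4096 bytes, which is one page on systems with 4K pages.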
As I said in my previous email, it is the IPoIB broadcast group MTU that determines the subnet link MTU: when the operational MTU of this port is below this value, the IPoIB broadcast group MTU will be below it as well. Have you tested the existing IPoIB with an IPoIB broadcast MTU of less than 2K, for example with the IPoIB broadcast MTU set to 1K and the chassis port set to 2K? If so, did you see any issues?

I will test the scenario where the port MTU is 4K and the IPoIB broadcast MTU is 2K. If there is any issue, it can be addressed by replacing priv->max_ib_mtu with priv->mcmember.mtu in ipoib_ud_buf_size() when the broadcast group is present.

Thanks
Shirley

From prescott at hpc.ufl.edu Wed Jan 23 08:05:00 2008
From: prescott at hpc.ufl.edu (Craig Prescott)
Date: Wed, 23 Jan 2008 11:05:00 -0500
Subject: [ofa-general] SDP and iWARP
In-Reply-To: <4787977E.509@opengridcomputing.com>
References: <4783A5B0.6040603@hpc.ufl.edu> <4783B3F5.20600@opengridcomputing.com> <4783BDD5.7000702@hpc.ufl.edu> <4783C326.3070306@opengridcomputing.com> <478634A5.3080204@hpc.ufl.edu> <47863794.9080709@opengridcomputing.com> <47865A4A.4070603@hpc.ufl.edu> <47865E5B.4030607@opengridcomputing.com> <4787936E.5010603@hpc.ufl.edu> <4787977E.509@opengridcomputing.com>
Message-ID: <479765AC.1040600@hpc.ufl.edu>

Steve Wise wrote:
> Craig Prescott wrote:
>> Steve Wise wrote:
>>> Craig Prescott wrote:
>>>> Steve Wise wrote:
>>>>>
>>>>> Craig Prescott wrote:
>>>>>>
>>>>>> The above call also emits a couple of messages
>>>>>> into the listener's syslog now :
>>>>>>
>>>>>> Jan 9 21:53:54 tebow2 kernel: iwch_ev_dispatch - CQE Err qpid
>>>>>> 0x20 opcode 14 status 0x6 type 1 wrid.hi 0x0 wrid.lo 0x80000000
>>>>>> Jan 9 21:53:54 tebow2 kernel: post_qp_event - AE qpid 0x20 opcode
>>>>>> 14 status 0x6 type 1 wrid.hi 0x0 wrid.lo 0x80000000
>>>>>>
>>>>> This is an async event generated due to a failure processing a SQ
>>>>> WR, I think. opcodes and status codes for iw_cxgb3 are in cxio_wr.h.
>>>>> type 1 means it was an egress (SQ) failure >>>>> status 0x6 is a base/bounds violation, >>>>> but 14 seems incorrect. That's not a valid T3 opcode. ???? >>>>> >>>> >>>> Ok, thanks! I guess I'm not sure what to make of that yet, though. >>>> >>> >>> See where in iwch_accept_cr() the failure is happening. It doesn't >>> look like send_mpa_reply() is being called. >>> >> >> The ECONNRESET is coming from here in iwch_accept_cr(): >> >> ... >> /* wait for wr_ack */ >> wait_event(ep->com.waitq, ep->com.rpl_done); >> err = ep->com.rpl_err; >> ... >> >> Is that what you thought was happening? > > I don't know exactly what is going on! But the code above means that > the firmware never successfully sent the last streaming message (the > mpa-start reply) and never transitioned the connection into rdma mode. > And the async error might indicate that some WR was posted prior to > doing the rdma_accept() and that WR had problems. Ok. I'm sorry for such a slow response. > a few questions: > > What firmware are you running? ethtool -i will tell you. [root at tebow1 ~]# ethtool -i eth4 driver: cxgb3 version: 1.0-ko firmware-version: T 5.0.0 TP 1.1.0 bus-info: 0000:86:00.0 > What ofed version exactly? OFED 1.3 daily from a few weeks back now: OFED-1.3-20080107-0942 > Does sdp post a SQ or RQ WR prior to doing the rdma_accept()? Can you > dump that work request? Maybe in iwch_post_send and iwch_post_recv, > dump the work request after it is built and before the code rings the > doorbell. You can dump it as 8B flits, and be sure an put the flits in > host byte order. See cxio_dump_wqe() in cxio_dbg.c... 
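A minimal sketch of dumping a work request as 8-byte flits in host byte order, as suggested above. It assumes the WQE is stored big-endian in memory; flit_to_host() and dump_flits() are hypothetical helpers written for illustration and are not the cxio_dump_wqe() code from cxio_dbg.c.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Assemble one 8-byte flit, assumed big-endian in memory, into a
 * host-order 64-bit value. Building it byte by byte gives the same
 * result on little-endian and big-endian hosts. */
static uint64_t flit_to_host(const unsigned char *p)
{
	uint64_t v = 0;
	size_t i;

	for (i = 0; i < 8; i++)
		v = (v << 8) | p[i];
	return v;
}

/* Hypothetical helper: print every complete 8-byte flit of a WQE,
 * one per line, prefixed with a caller-supplied tag and the flit's
 * address. Trailing bytes that do not fill a flit are ignored. */
static void dump_flits(const char *tag, const unsigned char *wqe, size_t len)
{
	size_t off;

	for (off = 0; off + 8 <= len; off += 8)
		printf("%s: WQE %p: %016llx\n", tag,
		       (const void *)(wqe + off),
		       (unsigned long long)flit_to_host(wqe + off));
}
```

Inside the driver the same conversion would presumably be done with be64_to_cpu() and the output with printk(); the byte-wise version here only keeps the sketch independent of kernel headers.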
The following is the last work request seen before rdma_accept(): iwch_post_receive: Dumping built work request before ring_doorbell: iwch_post_receive: WQE ffff810241d59f80: 17c001008000000d iwch_post_receive: WQE ffff810241d59f88: 0000000000000000 iwch_post_receive: WQE ffff810241d59f90: 0000000000000001 iwch_post_receive: WQE ffff810241d59f98: 000002ff00000810 iwch_post_receive: WQE ffff810241d59fa0: 000000044eac6000 iwch_post_receive: WQE ffff810241d59fa8: 0000000000000000 iwch_post_receive: WQE ffff810241d59fb0: 0000000000000000 iwch_post_receive: WQE ffff810241d59fb8: 0000000000000000 iwch_post_receive: WQE ffff810241d59fc0: 0000000000000000 iwch_post_receive: WQE ffff810241d59fc8: 0000000000000000 iwch_post_receive: WQE ffff810241d59fd0: 0000000000000000 iwch_post_receive: WQE ffff810241d59fd8: 0000000000000000 iwch_post_receive: WQE ffff810241d59fe0: 0000000000000000 iwch_post_receive: returning 0 This comes from sdp_init_qp(), via sdp_connect_handler(). There are a total of 64 work requests (all from iwch_post_receive()) generated while the netserver is trying to handle the RDMA_CM_EVENT_CONNECT_REQUEST. Can you help me decode the above work request? Thanks, Craig From mashirle at us.ibm.com Tue Jan 22 22:08:18 2008 From: mashirle at us.ibm.com (Shirley Ma) Date: Tue, 22 Jan 2008 22:08:18 -0800 Subject: [ofa-general] [RFC] IPoIB UD 4K MTU support In-Reply-To: <1201102346.6925.215.camel@mtls03> References: <1201025321.756.33.camel@localhost.localdomain> <1201102346.6925.215.camel@mtls03> Message-ID: <1201068498.756.59.camel@localhost.localdomain> Hello Eli, On Wed, 2008-01-23 at 17:32 +0200, Eli Cohen wrote: > can you send a path to the git tree this patch is based on? I used OFED-1.3 RC1 tree + SG patch from Pradeep. I can recreate the patch per your request with some changes to address Dotan's concern. Which git tree would you like me to build on? 
thanks Shirley From felix at chelsio.com Wed Jan 23 08:57:23 2008 From: felix at chelsio.com (Felix Marti) Date: Wed, 23 Jan 2008 08:57:23 -0800 Subject: [ofa-general] SDP and iWARP References: <4783A5B0.6040603@hpc.ufl.edu><4783B3F5.20600@opengridcomputing.com><4783BDD5.7000702@hpc.ufl.edu><4783C326.3070306@opengridcomputing.com><478634A5.3080204@hpc.ufl.edu><47863794.9080709@opengridcomputing.com><47865A4A.4070603@hpc.ufl.edu><47865E5B.4030607@opengridcomputing.com><4787936E.5010603@hpc.ufl.edu> <4787977E.509@opengridcomputing.com> <479765AC.1040600@hpc.ufl.edu> Message-ID: <8A71B368A89016469F72CD08050AD33401FCDA2F@maui.asicdesigners.com> Hi Craig, Can you please dump not only the last, but the last 4 WRs? Thanks, felix > -----Original Message----- > From: general-bounces at lists.openfabrics.org [mailto:general- > bounces at lists.openfabrics.org] On Behalf Of Craig Prescott > Sent: Wednesday, January 23, 2008 8:05 AM > To: Steve Wise > Cc: general at lists.openfabrics.org > Subject: Re: [ofa-general] SDP and iWARP > > Steve Wise wrote: > > Craig Prescott wrote: > >> Steve Wise wrote: > >>> Craig Prescott wrote: > >>>> Steve Wise wrote: > >>>>> > >>>>> Craig Prescott wrote: > >>>>>> > >>>>>> The above call also emits a couple of messages > >>>>>> into the listener's syslog now : > >>>>>> > >>>>>> Jan 9 21:53:54 tebow2 kernel: iwch_ev_dispatch - CQE Err qpid > >>>>>> 0x20 opcode 14 status 0x6 type 1 wrid.hi 0x0 wrid.lo 0x80000000 > >>>>>> Jan 9 21:53:54 tebow2 kernel: post_qp_event - AE qpid 0x20 > opcode > >>>>>> 14 status 0x6 type 1 wrid.hi 0x0 wrid.lo 0x80000000 > >>>>>> > >>>>> This is an async event generated due to a failure processing a SQ > >>>>> WR, I think. opcodes and status codes for iw_cxgb3 are in > cxio_wr.h. > >>>>> type 1 means it was an egress (SQ) failure > >>>>> status 0x6 is a base/bounds violation, > >>>>> but 14 seems incorrect. That's not a valid T3 opcode. ???? > >>>>> > >>>> > >>>> Ok, thanks! 
I guess I'm not sure what to make of that yet, > though. > >>>> > >>> > >>> See where in iwch_accept_cr() the failure is happening. It doesn't > >>> look like send_mpa_reply() is being called. > >>> > >> > >> The ECONNRESET is coming from here in iwch_accept_cr(): > >> > >> ... > >> /* wait for wr_ack */ > >> wait_event(ep->com.waitq, ep->com.rpl_done); > >> err = ep->com.rpl_err; > >> ... > >> > >> Is that what you thought was happening? > > > > I don't know exactly what is going on! But the code above means that > > the firmware never successfully sent the last streaming message (the > > mpa-start reply) and never transitioned the connection into rdma > mode. > > And the async error might indicate that some WR was posted prior to > > doing the rdma_accept() and that WR had problems. > > Ok. I'm sorry for such a slow response. > > > a few questions: > > > > What firmware are you running? ethtool -i will tell you. > > [root at tebow1 ~]# ethtool -i eth4 > driver: cxgb3 > version: 1.0-ko > firmware-version: T 5.0.0 TP 1.1.0 > bus-info: 0000:86:00.0 > > > What ofed version exactly? > > OFED 1.3 daily from a few weeks back now: OFED-1.3-20080107-0942 > > > Does sdp post a SQ or RQ WR prior to doing the rdma_accept()? Can > you > > dump that work request? Maybe in iwch_post_send and iwch_post_recv, > > dump the work request after it is built and before the code rings the > > doorbell. You can dump it as 8B flits, and be sure an put the flits > in > > host byte order. See cxio_dump_wqe() in cxio_dbg.c... 
> > The following is the last work request seen before rdma_accept(): > > iwch_post_receive: Dumping built work request before ring_doorbell: > iwch_post_receive: WQE ffff810241d59f80: 17c001008000000d > iwch_post_receive: WQE ffff810241d59f88: 0000000000000000 > iwch_post_receive: WQE ffff810241d59f90: 0000000000000001 > iwch_post_receive: WQE ffff810241d59f98: 000002ff00000810 > iwch_post_receive: WQE ffff810241d59fa0: 000000044eac6000 > iwch_post_receive: WQE ffff810241d59fa8: 0000000000000000 > iwch_post_receive: WQE ffff810241d59fb0: 0000000000000000 > iwch_post_receive: WQE ffff810241d59fb8: 0000000000000000 > iwch_post_receive: WQE ffff810241d59fc0: 0000000000000000 > iwch_post_receive: WQE ffff810241d59fc8: 0000000000000000 > iwch_post_receive: WQE ffff810241d59fd0: 0000000000000000 > iwch_post_receive: WQE ffff810241d59fd8: 0000000000000000 > iwch_post_receive: WQE ffff810241d59fe0: 0000000000000000 > iwch_post_receive: returning 0 > > This comes from sdp_init_qp(), via sdp_connect_handler(). > There are a total of 64 work requests (all from > iwch_post_receive()) generated while the netserver is > trying to handle the RDMA_CM_EVENT_CONNECT_REQUEST. > > Can you help me decode the above work request? 
>
> Thanks,
> Craig

From prescott at hpc.ufl.edu Wed Jan 23 08:59:14 2008
From: prescott at hpc.ufl.edu (Craig Prescott)
Date: Wed, 23 Jan 2008 11:59:14 -0500
Subject: [ofa-general] SDP and iWARP
In-Reply-To: <8A71B368A89016469F72CD08050AD33401FCDA2F@maui.asicdesigners.com>
References: <4783A5B0.6040603@hpc.ufl.edu><4783B3F5.20600@opengridcomputing.com><4783BDD5.7000702@hpc.ufl.edu><4783C326.3070306@opengridcomputing.com><478634A5.3080204@hpc.ufl.edu><47863794.9080709@opengridcomputing.com><47865A4A.4070603@hpc.ufl.edu><47865E5B.4030607@opengridcomputing.com><4787936E.5010603@hpc.ufl.edu> <4787977E.509@opengridcomputing.com> <479765AC.1040600@hpc.ufl.edu> <8A71B368A89016469F72CD08050AD33401FCDA2F@maui.asicdesigners.com>
Message-ID: <47977262.1060906@hpc.ufl.edu>

Hi Felix;

Here are the last 4 WRs:

...
Entering iwch_post_receive iwch_post_receive: Dumping built work request before ring_doorbell: iwch_post_receive: WQE ffff810241d59e00: 17c001008000000d iwch_post_receive: WQE ffff810241d59e08: 0000000000000000 iwch_post_receive: WQE ffff810241d59e10: 0000000000000001 iwch_post_receive: WQE ffff810241d59e18: 000002ff00000810 iwch_post_receive: WQE ffff810241d59e20: 000000044eac3000 iwch_post_receive: WQE ffff810241d59e28: 0000000000000000 iwch_post_receive: WQE ffff810241d59e30: 0000000000000000 iwch_post_receive: WQE ffff810241d59e38: 0000000000000000 iwch_post_receive: WQE ffff810241d59e40: 0000000000000000 iwch_post_receive: WQE ffff810241d59e48: 0000000000000000 iwch_post_receive: WQE ffff810241d59e50: 0000000000000000 iwch_post_receive: WQE ffff810241d59e58: 0000000000000000 iwch_post_receive: WQE ffff810241d59e60: 0000000000000000 iwch_post_receive: returning 0 Entering iwch_post_receive iwch_post_receive: Dumping built work request before ring_doorbell: iwch_post_receive: WQE ffff810241d59e80: 17c001008000000d iwch_post_receive: WQE ffff810241d59e88: 0000000000000000 iwch_post_receive: WQE ffff810241d59e90: 0000000000000001 iwch_post_receive: WQE ffff810241d59e98: 000002ff00000810 iwch_post_receive: WQE ffff810241d59ea0: 000000044eac4000 iwch_post_receive: WQE ffff810241d59ea8: 0000000000000000 iwch_post_receive: WQE ffff810241d59eb0: 0000000000000000 iwch_post_receive: WQE ffff810241d59eb8: 0000000000000000 iwch_post_receive: WQE ffff810241d59ec0: 0000000000000000 iwch_post_receive: WQE ffff810241d59ec8: 0000000000000000 iwch_post_receive: WQE ffff810241d59ed0: 0000000000000000 iwch_post_receive: WQE ffff810241d59ed8: 0000000000000000 iwch_post_receive: WQE ffff810241d59ee0: 0000000000000000 iwch_post_receive: returning 0 Entering iwch_post_receive iwch_post_receive: Dumping built work request before ring_doorbell: iwch_post_receive: WQE ffff810241d59f00: 17c001008000000d iwch_post_receive: WQE ffff810241d59f08: 0000000000000000 iwch_post_receive: WQE 
ffff810241d59f10: 0000000000000001 iwch_post_receive: WQE ffff810241d59f18: 000002ff00000810 iwch_post_receive: WQE ffff810241d59f20: 000000044eac5000 iwch_post_receive: WQE ffff810241d59f28: 0000000000000000 iwch_post_receive: WQE ffff810241d59f30: 0000000000000000 iwch_post_receive: WQE ffff810241d59f38: 0000000000000000 iwch_post_receive: WQE ffff810241d59f40: 0000000000000000 iwch_post_receive: WQE ffff810241d59f48: 0000000000000000 iwch_post_receive: WQE ffff810241d59f50: 0000000000000000 iwch_post_receive: WQE ffff810241d59f58: 0000000000000000 iwch_post_receive: WQE ffff810241d59f60: 0000000000000000 iwch_post_receive: returning 0 Entering iwch_post_receive iwch_post_receive: Dumping built work request before ring_doorbell: iwch_post_receive: WQE ffff810241d59f80: 17c001008000000d iwch_post_receive: WQE ffff810241d59f88: 0000000000000000 iwch_post_receive: WQE ffff810241d59f90: 0000000000000001 iwch_post_receive: WQE ffff810241d59f98: 000002ff00000810 iwch_post_receive: WQE ffff810241d59fa0: 000000044eac6000 iwch_post_receive: WQE ffff810241d59fa8: 0000000000000000 iwch_post_receive: WQE ffff810241d59fb0: 0000000000000000 iwch_post_receive: WQE ffff810241d59fb8: 0000000000000000 iwch_post_receive: WQE ffff810241d59fc0: 0000000000000000 iwch_post_receive: WQE ffff810241d59fc8: 0000000000000000 iwch_post_receive: WQE ffff810241d59fd0: 0000000000000000 iwch_post_receive: WQE ffff810241d59fd8: 0000000000000000 iwch_post_receive: WQE ffff810241d59fe0: 0000000000000000 iwch_post_receive: returning 0 Thanks, Craig Felix Marti wrote: > Hi Craig, > > Can you please dump not only the last, but the last 4 WRs? 
> > Thanks, > felix > >> -----Original Message----- >> From: general-bounces at lists.openfabrics.org [mailto:general- >> bounces at lists.openfabrics.org] On Behalf Of Craig Prescott >> Sent: Wednesday, January 23, 2008 8:05 AM >> To: Steve Wise >> Cc: general at lists.openfabrics.org >> Subject: Re: [ofa-general] SDP and iWARP >> >> Steve Wise wrote: >>> Craig Prescott wrote: >>>> Steve Wise wrote: >>>>> Craig Prescott wrote: >>>>>> Steve Wise wrote: >>>>>>> Craig Prescott wrote: >>>>>>>> The above call also emits a couple of messages >>>>>>>> into the listener's syslog now : >>>>>>>> >>>>>>>> Jan 9 21:53:54 tebow2 kernel: iwch_ev_dispatch - CQE Err qpid >>>>>>>> 0x20 opcode 14 status 0x6 type 1 wrid.hi 0x0 wrid.lo 0x80000000 >>>>>>>> Jan 9 21:53:54 tebow2 kernel: post_qp_event - AE qpid 0x20 >> opcode >>>>>>>> 14 status 0x6 type 1 wrid.hi 0x0 wrid.lo 0x80000000 >>>>>>>> >>>>>>> This is an async event generated due to a failure processing a > SQ >>>>>>> WR, I think. opcodes and status codes for iw_cxgb3 are in >> cxio_wr.h. >>>>>>> type 1 means it was an egress (SQ) failure >>>>>>> status 0x6 is a base/bounds violation, >>>>>>> but 14 seems incorrect. That's not a valid T3 opcode. ???? >>>>>>> >>>>>> Ok, thanks! I guess I'm not sure what to make of that yet, >> though. >>>>> See where in iwch_accept_cr() the failure is happening. It > doesn't >>>>> look like send_mpa_reply() is being called. >>>>> >>>> The ECONNRESET is coming from here in iwch_accept_cr(): >>>> >>>> ... >>>> /* wait for wr_ack */ >>>> wait_event(ep->com.waitq, ep->com.rpl_done); >>>> err = ep->com.rpl_err; >>>> ... >>>> >>>> Is that what you thought was happening? >>> I don't know exactly what is going on! But the code above means > that >>> the firmware never successfully sent the last streaming message (the >>> mpa-start reply) and never transitioned the connection into rdma >> mode. 
>>> And the async error might indicate that some WR was posted prior to >>> doing the rdma_accept() and that WR had problems. >> Ok. I'm sorry for such a slow response. >> >>> a few questions: >>> >>> What firmware are you running? ethtool -i will tell you. >> [root at tebow1 ~]# ethtool -i eth4 >> driver: cxgb3 >> version: 1.0-ko >> firmware-version: T 5.0.0 TP 1.1.0 >> bus-info: 0000:86:00.0 >> >>> What ofed version exactly? >> OFED 1.3 daily from a few weeks back now: OFED-1.3-20080107-0942 >> >>> Does sdp post a SQ or RQ WR prior to doing the rdma_accept()? Can >> you >>> dump that work request? Maybe in iwch_post_send and iwch_post_recv, >>> dump the work request after it is built and before the code rings > the >>> doorbell. You can dump it as 8B flits, and be sure an put the flits >> in >>> host byte order. See cxio_dump_wqe() in cxio_dbg.c... >> The following is the last work request seen before rdma_accept(): >> >> iwch_post_receive: Dumping built work request before ring_doorbell: >> iwch_post_receive: WQE ffff810241d59f80: 17c001008000000d >> iwch_post_receive: WQE ffff810241d59f88: 0000000000000000 >> iwch_post_receive: WQE ffff810241d59f90: 0000000000000001 >> iwch_post_receive: WQE ffff810241d59f98: 000002ff00000810 >> iwch_post_receive: WQE ffff810241d59fa0: 000000044eac6000 >> iwch_post_receive: WQE ffff810241d59fa8: 0000000000000000 >> iwch_post_receive: WQE ffff810241d59fb0: 0000000000000000 >> iwch_post_receive: WQE ffff810241d59fb8: 0000000000000000 >> iwch_post_receive: WQE ffff810241d59fc0: 0000000000000000 >> iwch_post_receive: WQE ffff810241d59fc8: 0000000000000000 >> iwch_post_receive: WQE ffff810241d59fd0: 0000000000000000 >> iwch_post_receive: WQE ffff810241d59fd8: 0000000000000000 >> iwch_post_receive: WQE ffff810241d59fe0: 0000000000000000 >> iwch_post_receive: returning 0 >> >> This comes from sdp_init_qp(), via sdp_connect_handler(). 
>> There are a total of 64 work requests (all from
>> iwch_post_receive()) generated while the netserver is
>> trying to handle the RDMA_CM_EVENT_CONNECT_REQUEST.
>>
>> Can you help me decode the above work request?
>>
>> Thanks,
>> Craig

From hnguyen at linux.vnet.ibm.com Wed Jan 23 09:45:27 2008
From: hnguyen at linux.vnet.ibm.com (Hoang-Nam Nguyen)
Date: Wed, 23 Jan 2008 18:45:27 +0100
Subject: [ofa-general] [PATCH 0/3] ofed-1.3-rc3 IB/ehca: upstream patches required for ofed-1.3 release
Message-ID: <200801231845.27946.hnguyen@linux.vnet.ibm.com>

Hello Tziporet and Vlad!
This is a set of upstream patches that we previously sent to Roland and that he has queued for 2.6.25, together with the backport patches for rhel4.5 and rhel4.6 that the introduced changes make necessary.

1/3: upstream patches, see its heading for more detail
2/3: backport patch for rhel4.5
3/3: backport patch for rhel4.6

These patches should build cleanly against Vlad's git tree. Please apply them if they are ok.

For completeness, we actually need two patches that are in 2.6.24-rc6-8, but not in the current ofed build yet. I assume that we/ofed will switch to the 2.6.24 release next, so that there is no need for us to do extra work here. Could you please confirm that?

Thanks
Nam

From hnguyen at linux.vnet.ibm.com Wed Jan 23 09:46:35 2008
From: hnguyen at linux.vnet.ibm.com (Hoang-Nam Nguyen)
Date: Wed, 23 Jan 2008 18:46:35 +0100
Subject: [ofa-general] [PATCH 1/3] ofed-1.3-rc3 IB/ehca: upstream patches (2.6.25)
Message-ID: <200801231846.36045.hnguyen@linux.vnet.ibm.com>

IB/ehca: set of patches queued for 2.6.25 and needed for ofed-1.3 release

0001: Add missing spaces in the middle of format strings
0002: Forward event client-reregister-required to registered clients
0003: Use round_jiffies() for EQ polling timer
0004: Remove CQ-QP-link before destroying QP in error path of create_qp()
0005: Define array to store SMI/GSI QPs
0006: Add "port connection autodetect mode"
0007: Prevent RDMA-related connection failures on some eHCA2 hardware

Signed-off-by: Hoang-Nam Nguyen
---
 ...dd_missing_spaces_in_the_middle_of_format.patch | 59 +++
 ..._Forward_event_client_reregister_required.patch | 55 +++
 ...03_Use_round_jiffies_for_EQ_polling_timer.patch | 31 ++
 ...04_Remove_CQ_QP_link_before_destroying_QP.patch | 35 ++
 ...ca_0005_Define_array_to_store_SMI_GSI_QPs.patch | 59 +++
 ..._0006_Add_port_connection_autodetect_mode.patch | 472 ++++++++++++++++++++
 ..._Prevent_RDMA_related_connection_failures.patch | 276 ++++++++++++
 7 files changed, 987 insertions(+), 0 deletions(-)
 create mode 100644
kernel_patches/fixes/ehca_0001_Add_missing_spaces_in_the_middle_of_format.patch create mode 100644 kernel_patches/fixes/ehca_0002_Forward_event_client_reregister_required.patch create mode 100644 kernel_patches/fixes/ehca_0003_Use_round_jiffies_for_EQ_polling_timer.patch create mode 100644 kernel_patches/fixes/ehca_0004_Remove_CQ_QP_link_before_destroying_QP.patch create mode 100644 kernel_patches/fixes/ehca_0005_Define_array_to_store_SMI_GSI_QPs.patch create mode 100644 kernel_patches/fixes/ehca_0006_Add_port_connection_autodetect_mode.patch create mode 100644 kernel_patches/fixes/ehca_0007_Prevent_RDMA_related_connection_failures.patch diff --git a/kernel_patches/fixes/ehca_0001_Add_missing_spaces_in_the_middle_of_format.patch b/kernel_patches/fixes/ehca_0001_Add_missing_spaces_in_the_middle_of_format.patch new file mode 100644 index 0000000..ca13dd6 --- /dev/null +++ b/kernel_patches/fixes/ehca_0001_Add_missing_spaces_in_the_middle_of_format.patch @@ -0,0 +1,59 @@ +From 41c38ba27fb89140311cfa0b1258b1ccc88eea7b Mon Sep 17 00:00:00 2001 +From: root +Date: Tue, 22 Jan 2008 16:15:17 +0100 +Subject: [PATCH] IB/ehca: Add missing spaces in the middle of format strings. 
+ +Signed-off-by: Joe Perches +Signed-off-by: Roland Dreier +--- + drivers/infiniband/hw/ehca/ehca_cq.c | 2 +- + drivers/infiniband/hw/ehca/ehca_qp.c | 6 +++--- + 2 files changed, 4 insertions(+), 4 deletions(-) + +diff --git a/drivers/infiniband/hw/ehca/ehca_cq.c b/drivers/infiniband/hw/ehca/ehca_cq.c +index 79c25f5..0467c15 100644 +--- a/drivers/infiniband/hw/ehca/ehca_cq.c ++++ b/drivers/infiniband/hw/ehca/ehca_cq.c +@@ -246,7 +246,7 @@ struct ib_cq *ehca_create_cq(struct ib_device *device, int cqe, int comp_vector, + } else { + if (h_ret != H_PAGE_REGISTERED) { + ehca_err(device, "Registration of page failed " +- "ehca_cq=%p cq_num=%x h_ret=%li" ++ "ehca_cq=%p cq_num=%x h_ret=%li " + "counter=%i act_pages=%i", + my_cq, my_cq->cq_number, + h_ret, counter, param.act_pages); +diff --git a/drivers/infiniband/hw/ehca/ehca_qp.c b/drivers/infiniband/hw/ehca/ehca_qp.c +index dd12668..04e711f 100644 +--- a/drivers/infiniband/hw/ehca/ehca_qp.c ++++ b/drivers/infiniband/hw/ehca/ehca_qp.c +@@ -858,7 +858,7 @@ struct ib_srq *ehca_create_srq(struct ib_pd *pd, + update_mask, + mqpcb, my_qp->galpas.kernel); + if (hret != H_SUCCESS) { +- ehca_err(pd->device, "Could not modify SRQ to INIT" ++ ehca_err(pd->device, "Could not modify SRQ to INIT " + "ehca_qp=%p qp_num=%x h_ret=%li", + my_qp, my_qp->real_qp_num, hret); + goto create_srq2; +@@ -872,7 +872,7 @@ struct ib_srq *ehca_create_srq(struct ib_pd *pd, + update_mask, + mqpcb, my_qp->galpas.kernel); + if (hret != H_SUCCESS) { +- ehca_err(pd->device, "Could not enable SRQ" ++ ehca_err(pd->device, "Could not enable SRQ " + "ehca_qp=%p qp_num=%x h_ret=%li", + my_qp, my_qp->real_qp_num, hret); + goto create_srq2; +@@ -886,7 +886,7 @@ struct ib_srq *ehca_create_srq(struct ib_pd *pd, + update_mask, + mqpcb, my_qp->galpas.kernel); + if (hret != H_SUCCESS) { +- ehca_err(pd->device, "Could not modify SRQ to RTR" ++ ehca_err(pd->device, "Could not modify SRQ to RTR " + "ehca_qp=%p qp_num=%x h_ret=%li", + my_qp, my_qp->real_qp_num, hret); 
+ goto create_srq2; +-- +1.5.2 + diff --git a/kernel_patches/fixes/ehca_0002_Forward_event_client_reregister_required.patch b/kernel_patches/fixes/ehca_0002_Forward_event_client_reregister_required.patch new file mode 100644 index 0000000..04159e2 --- /dev/null +++ b/kernel_patches/fixes/ehca_0002_Forward_event_client_reregister_required.patch @@ -0,0 +1,55 @@ +From afe2f1d8e50933645608932bcbba7dd81144a96c Mon Sep 17 00:00:00 2001 +From: root +Date: Tue, 22 Jan 2008 16:19:03 +0100 +Subject: [PATCH] IB/ehca: Forward event client-reregister-required to registered clients + +This patch allows ehca to forward event client-reregister-required to +registered clients. One such event is generated by a switch eg. after +its reboot. + +Signed-off-by: Hoang-Nam Nguyen +Signed-off-by: Roland Dreier +--- + drivers/infiniband/hw/ehca/ehca_irq.c | 12 ++++++++++++ + 1 files changed, 12 insertions(+), 0 deletions(-) + +diff --git a/drivers/infiniband/hw/ehca/ehca_irq.c b/drivers/infiniband/hw/ehca/ehca_irq.c +index 3f617b2..4c734ec 100644 +--- a/drivers/infiniband/hw/ehca/ehca_irq.c ++++ b/drivers/infiniband/hw/ehca/ehca_irq.c +@@ -62,6 +62,7 @@ + #define NEQE_PORT_NUMBER EHCA_BMASK_IBM( 8, 15) + #define NEQE_PORT_AVAILABILITY EHCA_BMASK_IBM(16, 16) + #define NEQE_DISRUPTIVE EHCA_BMASK_IBM(16, 16) ++#define NEQE_SPECIFIC_EVENT EHCA_BMASK_IBM(16, 23) + + #define ERROR_DATA_LENGTH EHCA_BMASK_IBM(52, 63) + #define ERROR_DATA_TYPE EHCA_BMASK_IBM( 0, 7) +@@ -354,6 +355,7 @@ static void parse_ec(struct ehca_shca *shca, u64 eqe) + { + u8 ec = EHCA_BMASK_GET(NEQE_EVENT_CODE, eqe); + u8 port = EHCA_BMASK_GET(NEQE_PORT_NUMBER, eqe); ++ u8 spec_event; + + switch (ec) { + case 0x30: /* port availability change */ +@@ -394,6 +396,16 @@ static void parse_ec(struct ehca_shca *shca, u64 eqe) + case 0x33: /* trace stopped */ + ehca_err(&shca->ib_device, "Traced stopped."); + break; ++ case 0x34: /* util async event */ ++ spec_event = EHCA_BMASK_GET(NEQE_SPECIFIC_EVENT, eqe); ++ if (spec_event == 
0x80) /* client reregister required */ ++ dispatch_port_event(shca, port, ++ IB_EVENT_CLIENT_REREGISTER, ++ "client reregister req."); ++ else ++ ehca_warn(&shca->ib_device, "Unknown util async " ++ "event %x on port %x", spec_event, port); ++ break; + default: + ehca_err(&shca->ib_device, "Unknown event code: %x on %s.", + ec, shca->ib_device.name); +-- +1.5.2 + diff --git a/kernel_patches/fixes/ehca_0003_Use_round_jiffies_for_EQ_polling_timer.patch b/kernel_patches/fixes/ehca_0003_Use_round_jiffies_for_EQ_polling_timer.patch new file mode 100644 index 0000000..5fb3691 --- /dev/null +++ b/kernel_patches/fixes/ehca_0003_Use_round_jiffies_for_EQ_polling_timer.patch @@ -0,0 +1,31 @@ +From a8ed1e3c557c23e60a5bf4b2fe027f8453a255f4 Mon Sep 17 00:00:00 2001 +From: root +Date: Tue, 22 Jan 2008 16:20:37 +0100 +Subject: [PATCH] IB/ehca: Use round_jiffies() for EQ polling timer + +Use round_jiffies() to align ehca's 1-second timer with other timers +and potentially save power by sleeping cores for longer. 
+ +Signed-off-by: Anton Blanchard +Acked-by: Hoang-Nam Nguyen +Signed-off-by: Roland Dreier +--- + drivers/infiniband/hw/ehca/ehca_main.c | 2 +- + 1 files changed, 1 insertions(+), 1 deletions(-) + +diff --git a/drivers/infiniband/hw/ehca/ehca_main.c b/drivers/infiniband/hw/ehca/ehca_main.c +index 90d4334..63d2de3 100644 +--- a/drivers/infiniband/hw/ehca/ehca_main.c ++++ b/drivers/infiniband/hw/ehca/ehca_main.c +@@ -913,7 +913,7 @@ void ehca_poll_eqs(unsigned long data) + ehca_process_eq(shca, 0); + } + } +- mod_timer(&poll_eqs_timer, jiffies + HZ); ++ mod_timer(&poll_eqs_timer, round_jiffies(jiffies + HZ)); + spin_unlock(&shca_list_lock); + } + +-- +1.5.2 + diff --git a/kernel_patches/fixes/ehca_0004_Remove_CQ_QP_link_before_destroying_QP.patch b/kernel_patches/fixes/ehca_0004_Remove_CQ_QP_link_before_destroying_QP.patch new file mode 100644 index 0000000..31c9aa8 --- /dev/null +++ b/kernel_patches/fixes/ehca_0004_Remove_CQ_QP_link_before_destroying_QP.patch @@ -0,0 +1,35 @@ +From ebc0988b682e1fcc8d456b7e16ca94a02ced7e6a Mon Sep 17 00:00:00 2001 +From: root +Date: Tue, 22 Jan 2008 16:21:24 +0100 +Subject: [PATCH] IB/ehca: Remove CQ-QP-link before destroying QP in error path of create_qp() + +Signed-off-by: Hoang-Nam Nguyen +Signed-off-by: Roland Dreier +--- + drivers/infiniband/hw/ehca/ehca_qp.c | 5 ++++- + 1 files changed, 4 insertions(+), 1 deletions(-) + +diff --git a/drivers/infiniband/hw/ehca/ehca_qp.c b/drivers/infiniband/hw/ehca/ehca_qp.c +index 04e711f..db910bc 100644 +--- a/drivers/infiniband/hw/ehca/ehca_qp.c ++++ b/drivers/infiniband/hw/ehca/ehca_qp.c +@@ -769,12 +769,15 @@ static struct ehca_qp *internal_create_qp( + if (ib_copy_to_udata(udata, &resp, sizeof resp)) { + ehca_err(pd->device, "Copy to udata failed"); + ret = -EINVAL; +- goto create_qp_exit4; ++ goto create_qp_exit5; + } + } + + return my_qp; + ++create_qp_exit5: ++ ehca_cq_unassign_qp(my_qp->send_cq, my_qp->real_qp_num); ++ + create_qp_exit4: + if (HAS_RQ(my_qp)) + ipz_queue_dtor(my_pd, 
&my_qp->ipz_rqueue); +-- +1.5.2 + diff --git a/kernel_patches/fixes/ehca_0005_Define_array_to_store_SMI_GSI_QPs.patch b/kernel_patches/fixes/ehca_0005_Define_array_to_store_SMI_GSI_QPs.patch new file mode 100644 index 0000000..23777de --- /dev/null +++ b/kernel_patches/fixes/ehca_0005_Define_array_to_store_SMI_GSI_QPs.patch @@ -0,0 +1,59 @@ +From 52bdcb2961257e1b1d4564a33c28ed4876453fb3 Mon Sep 17 00:00:00 2001 +From: root +Date: Tue, 22 Jan 2008 16:25:59 +0100 +Subject: [PATCH] IB/ehca: Define array to store SMI/GSI QPs + +Signed-off-by: Hoang-Nam Nguyen +Signed-off-by: Roland Dreier +--- + drivers/infiniband/hw/ehca/ehca_classes.h | 2 +- + drivers/infiniband/hw/ehca/ehca_main.c | 6 +++--- + 2 files changed, 4 insertions(+), 4 deletions(-) + +diff --git a/drivers/infiniband/hw/ehca/ehca_classes.h b/drivers/infiniband/hw/ehca/ehca_classes.h +index 87f12d4..5c6d3fa 100644 +--- a/drivers/infiniband/hw/ehca/ehca_classes.h ++++ b/drivers/infiniband/hw/ehca/ehca_classes.h +@@ -94,7 +94,7 @@ struct ehca_sma_attr { + + struct ehca_sport { + struct ib_cq *ibcq_aqp1; +- struct ib_qp *ibqp_aqp1; ++ struct ib_qp *ibqp_sqp[2]; + enum ib_port_state port_state; + struct ehca_sma_attr saved_attr; + }; +diff --git a/drivers/infiniband/hw/ehca/ehca_main.c b/drivers/infiniband/hw/ehca/ehca_main.c +index 63d2de3..18122c1 100644 +--- a/drivers/infiniband/hw/ehca/ehca_main.c ++++ b/drivers/infiniband/hw/ehca/ehca_main.c +@@ -498,7 +498,7 @@ static int ehca_create_aqp1(struct ehca_shca *shca, u32 port) + } + sport->ibcq_aqp1 = ibcq; + +- if (sport->ibqp_aqp1) { ++ if (sport->ibqp_sqp[IB_QPT_GSI]) { + ehca_err(&shca->ib_device, "AQP1 QP is already created."); + ret = -EPERM; + goto create_aqp1; +@@ -524,7 +524,7 @@ static int ehca_create_aqp1(struct ehca_shca *shca, u32 port) + ret = PTR_ERR(ibqp); + goto create_aqp1; + } +- sport->ibqp_aqp1 = ibqp; ++ sport->ibqp_sqp[IB_QPT_GSI] = ibqp; + + return 0; + +@@ -537,7 +537,7 @@ static int ehca_destroy_aqp1(struct ehca_sport *sport) + { + int 
ret; + +- ret = ib_destroy_qp(sport->ibqp_aqp1); ++ ret = ib_destroy_qp(sport->ibqp_sqp[IB_QPT_GSI]); + if (ret) { + ehca_gen_err("Cannot destroy AQP1 QP. ret=%i", ret); + return ret; +-- +1.5.2 + diff --git a/kernel_patches/fixes/ehca_0006_Add_port_connection_autodetect_mode.patch b/kernel_patches/fixes/ehca_0006_Add_port_connection_autodetect_mode.patch new file mode 100644 index 0000000..69bf0ab --- /dev/null +++ b/kernel_patches/fixes/ehca_0006_Add_port_connection_autodetect_mode.patch @@ -0,0 +1,472 @@ +From f75184dd4ac5ac23e05e888bccafbf3d633455fa Mon Sep 17 00:00:00 2001 +From: root +Date: Tue, 22 Jan 2008 16:27:10 +0100 +Subject: [PATCH] IB/ehca: Add "port connection autodetect mode" + +This patch enhances ehca with a capability to "autodetect" the ports +being connected physically. In order to utilize that function the +module option nr_ports must be set to -1 (default is 2 - two +ports). This feature is experimental and will be made the default later. + +More detail: + +If the user connects only one port to the switch, current code requires + 1) port one to be connected and + 2) module option nr_ports=1 to be given. + +If autodetect is enabled, ehca will not wait at creation of the GSI QP +for the respective port to become active. Since firmware does not +accept modify_qp() while the port is down at initialization, we need +to cache all calls to modify_qp() for the SMI/GSI QP and just return a +good return code. + +When a port is activated and we get a PORT_ACTIVE event, we replay the +cached modify-qp() parms and re-trigger any posted recv WRs. Only then +do we forward the PORT_ACTIVE event to registered clients. + +The result of this autodetect patch is that all ports will be +accessible by the users. Depending on their respective cabling only +those ports that are connected properly will become operable. If a +user tries to modify a regular QP of a non-connected port, modify_qp() +will fail. 
Furthermore, ibv_devinfo should show the port state +accordingly. + +Note that this patch primarily improves the loading behaviour of +ehca. If the cable is removed while the driver is operating and +plugged in again, firmware will handle that properly by sending an +appropriate async event. + +Signed-off-by: Hoang-Nam Nguyen +Signed-off-by: Roland Dreier +--- + drivers/infiniband/hw/ehca/ehca_classes.h | 16 +++ + drivers/infiniband/hw/ehca/ehca_irq.c | 26 ++++- + drivers/infiniband/hw/ehca/ehca_iverbs.h | 2 + + drivers/infiniband/hw/ehca/ehca_main.c | 7 +- + drivers/infiniband/hw/ehca/ehca_qp.c | 159 ++++++++++++++++++++++++++++- + drivers/infiniband/hw/ehca/ehca_sqp.c | 6 +- + 6 files changed, 201 insertions(+), 15 deletions(-) + +diff --git a/drivers/infiniband/hw/ehca/ehca_classes.h b/drivers/infiniband/hw/ehca/ehca_classes.h +index 5c6d3fa..997c3d1 100644 +--- a/drivers/infiniband/hw/ehca/ehca_classes.h ++++ b/drivers/infiniband/hw/ehca/ehca_classes.h +@@ -95,6 +95,10 @@ struct ehca_sma_attr { + struct ehca_sport { + struct ib_cq *ibcq_aqp1; + struct ib_qp *ibqp_sqp[2]; ++ /* lock to serialze modify_qp() calls for sqp in normal ++ * and irq path (when event PORT_ACTIVE is received first time) ++ */ ++ spinlock_t mod_sqp_lock; + enum ib_port_state port_state; + struct ehca_sma_attr saved_attr; + }; +@@ -141,6 +145,14 @@ enum ehca_ext_qp_type { + EQPT_SRQ = 3, + }; + ++/* struct to cache modify_qp()'s parms for GSI/SMI qp */ ++struct ehca_mod_qp_parm { ++ int mask; ++ struct ib_qp_attr attr; ++}; ++ ++#define EHCA_MOD_QP_PARM_MAX 4 ++ + struct ehca_qp { + union { + struct ib_qp ib_qp; +@@ -164,6 +176,9 @@ struct ehca_qp { + struct ehca_cq *recv_cq; + unsigned int sqerr_purgeflag; + struct hlist_node list_entries; ++ /* array to cache modify_qp()'s parms for GSI/SMI qp */ ++ struct ehca_mod_qp_parm *mod_qp_parm; ++ int mod_qp_parm_idx; + /* mmap counter for resources mapped into user space */ + u32 mm_count_squeue; + u32 mm_count_rqueue; +@@ -322,6 +337,7 @@ 
extern int ehca_static_rate; + extern int ehca_port_act_time; + extern int ehca_use_hp_mr; + extern int ehca_scaling_code; ++extern int ehca_nr_ports; + + struct ipzu_queue_resp { + u32 qe_size; /* queue entry size */ +diff --git a/drivers/infiniband/hw/ehca/ehca_irq.c b/drivers/infiniband/hw/ehca/ehca_irq.c +index 4c734ec..863b34f 100644 +--- a/drivers/infiniband/hw/ehca/ehca_irq.c ++++ b/drivers/infiniband/hw/ehca/ehca_irq.c +@@ -356,17 +356,33 @@ static void parse_ec(struct ehca_shca *shca, u64 eqe) + u8 ec = EHCA_BMASK_GET(NEQE_EVENT_CODE, eqe); + u8 port = EHCA_BMASK_GET(NEQE_PORT_NUMBER, eqe); + u8 spec_event; ++ struct ehca_sport *sport = &shca->sport[port - 1]; ++ unsigned long flags; + + switch (ec) { + case 0x30: /* port availability change */ + if (EHCA_BMASK_GET(NEQE_PORT_AVAILABILITY, eqe)) { +- shca->sport[port - 1].port_state = IB_PORT_ACTIVE; ++ int suppress_event; ++ /* replay modify_qp for sqps */ ++ spin_lock_irqsave(&sport->mod_sqp_lock, flags); ++ suppress_event = !sport->ibqp_sqp[IB_QPT_GSI]; ++ if (sport->ibqp_sqp[IB_QPT_SMI]) ++ ehca_recover_sqp(sport->ibqp_sqp[IB_QPT_SMI]); ++ if (!suppress_event) ++ ehca_recover_sqp(sport->ibqp_sqp[IB_QPT_GSI]); ++ spin_unlock_irqrestore(&sport->mod_sqp_lock, flags); ++ ++ /* AQP1 was destroyed, ignore this event */ ++ if (suppress_event) ++ break; ++ ++ sport->port_state = IB_PORT_ACTIVE; + dispatch_port_event(shca, port, IB_EVENT_PORT_ACTIVE, + "is active"); + ehca_query_sma_attr(shca, port, +- &shca->sport[port - 1].saved_attr); ++ &sport->saved_attr); + } else { +- shca->sport[port - 1].port_state = IB_PORT_DOWN; ++ sport->port_state = IB_PORT_DOWN; + dispatch_port_event(shca, port, IB_EVENT_PORT_ERR, + "is inactive"); + } +@@ -380,11 +396,11 @@ static void parse_ec(struct ehca_shca *shca, u64 eqe) + ehca_warn(&shca->ib_device, "disruptive port " + "%d configuration change", port); + +- shca->sport[port - 1].port_state = IB_PORT_DOWN; ++ sport->port_state = IB_PORT_DOWN; + dispatch_port_event(shca, 
port, IB_EVENT_PORT_ERR, + "is inactive"); + +- shca->sport[port - 1].port_state = IB_PORT_ACTIVE; ++ sport->port_state = IB_PORT_ACTIVE; + dispatch_port_event(shca, port, IB_EVENT_PORT_ACTIVE, + "is active"); + } else +diff --git a/drivers/infiniband/hw/ehca/ehca_iverbs.h b/drivers/infiniband/hw/ehca/ehca_iverbs.h +index 5485799..c469bfd 100644 +--- a/drivers/infiniband/hw/ehca/ehca_iverbs.h ++++ b/drivers/infiniband/hw/ehca/ehca_iverbs.h +@@ -200,4 +200,6 @@ void ehca_free_fw_ctrlblock(void *ptr); + #define ehca_free_fw_ctrlblock(ptr) free_page((unsigned long)(ptr)) + #endif + ++void ehca_recover_sqp(struct ib_qp *sqp); ++ + #endif +diff --git a/drivers/infiniband/hw/ehca/ehca_main.c b/drivers/infiniband/hw/ehca/ehca_main.c +index 18122c1..6d247ad 100644 +--- a/drivers/infiniband/hw/ehca/ehca_main.c ++++ b/drivers/infiniband/hw/ehca/ehca_main.c +@@ -87,7 +87,8 @@ MODULE_PARM_DESC(hw_level, + "hardware level" + " (0: autosensing (default), 1: v. 0.20, 2: v. 0.21)"); + MODULE_PARM_DESC(nr_ports, +- "number of connected ports (default: 2)"); ++ "number of connected ports (-1: autodetect, 1: port one only, " ++ "2: two ports (default)"); + MODULE_PARM_DESC(use_hp_mr, + "high performance MRs (0: no (default), 1: yes)"); + MODULE_PARM_DESC(port_act_time, +@@ -675,7 +676,7 @@ static int __devinit ehca_probe(struct of_device *dev, + struct ehca_shca *shca; + const u64 *handle; + struct ib_pd *ibpd; +- int ret; ++ int ret, i; + + handle = of_get_property(dev->node, "ibm,hca-handle", NULL); + if (!handle) { +@@ -696,6 +697,8 @@ static int __devinit ehca_probe(struct of_device *dev, + return -ENOMEM; + } + mutex_init(&shca->modify_mutex); ++ for (i = 0; i < ARRAY_SIZE(shca->sport); i++) ++ spin_lock_init(&shca->sport[i].mod_sqp_lock); + + shca->ofdev = dev; + shca->ipz_hca_handle.handle = *handle; +diff --git a/drivers/infiniband/hw/ehca/ehca_qp.c b/drivers/infiniband/hw/ehca/ehca_qp.c +index db910bc..53310f0 100644 +--- a/drivers/infiniband/hw/ehca/ehca_qp.c ++++ 
b/drivers/infiniband/hw/ehca/ehca_qp.c +@@ -729,12 +729,31 @@ static struct ehca_qp *internal_create_qp( + init_attr->cap.max_send_wr = parms.squeue.act_nr_wqes; + my_qp->init_attr = *init_attr; + ++ if (qp_type == IB_QPT_SMI || qp_type == IB_QPT_GSI) { ++ shca->sport[init_attr->port_num - 1].ibqp_sqp[qp_type] = ++ &my_qp->ib_qp; ++ if (ehca_nr_ports < 0) { ++ /* alloc array to cache subsequent modify qp parms ++ * for autodetect mode ++ */ ++ my_qp->mod_qp_parm = ++ kzalloc(EHCA_MOD_QP_PARM_MAX * ++ sizeof(*my_qp->mod_qp_parm), ++ GFP_KERNEL); ++ if (!my_qp->mod_qp_parm) { ++ ehca_err(pd->device, ++ "Could not alloc mod_qp_parm"); ++ goto create_qp_exit4; ++ } ++ } ++ } ++ + /* NOTE: define_apq0() not supported yet */ + if (qp_type == IB_QPT_GSI) { + h_ret = ehca_define_sqp(shca, my_qp, init_attr); + if (h_ret != H_SUCCESS) { + ret = ehca2ib_return_code(h_ret); +- goto create_qp_exit4; ++ goto create_qp_exit5; + } + } + +@@ -743,7 +762,7 @@ static struct ehca_qp *internal_create_qp( + if (ret) { + ehca_err(pd->device, + "Couldn't assign qp to send_cq ret=%i", ret); +- goto create_qp_exit4; ++ goto create_qp_exit5; + } + } + +@@ -769,15 +788,18 @@ static struct ehca_qp *internal_create_qp( + if (ib_copy_to_udata(udata, &resp, sizeof resp)) { + ehca_err(pd->device, "Copy to udata failed"); + ret = -EINVAL; +- goto create_qp_exit5; ++ goto create_qp_exit6; + } + } + + return my_qp; + +-create_qp_exit5: ++create_qp_exit6: + ehca_cq_unassign_qp(my_qp->send_cq, my_qp->real_qp_num); + ++create_qp_exit5: ++ kfree(my_qp->mod_qp_parm); ++ + create_qp_exit4: + if (HAS_RQ(my_qp)) + ipz_queue_dtor(my_pd, &my_qp->ipz_rqueue); +@@ -995,7 +1017,7 @@ static int internal_modify_qp(struct ib_qp *ibqp, + unsigned long flags = 0; + + /* do query_qp to obtain current attr values */ +- mqpcb = ehca_alloc_fw_ctrlblock(GFP_KERNEL); ++ mqpcb = ehca_alloc_fw_ctrlblock(GFP_ATOMIC); + if (!mqpcb) { + ehca_err(ibqp->device, "Could not get zeroed page for mqpcb " + "ehca_qp=%p qp_num=%x ", 
my_qp, ibqp->qp_num); +@@ -1183,6 +1205,8 @@ static int internal_modify_qp(struct ib_qp *ibqp, + update_mask |= EHCA_BMASK_SET(MQPCB_MASK_PRIM_P_KEY_IDX, 1); + } + if (attr_mask & IB_QP_PORT) { ++ struct ehca_sport *sport; ++ struct ehca_qp *aqp1; + if (attr->port_num < 1 || attr->port_num > shca->num_ports) { + ret = -EINVAL; + ehca_err(ibqp->device, "Invalid port=%x. " +@@ -1191,6 +1215,29 @@ static int internal_modify_qp(struct ib_qp *ibqp, + shca->num_ports); + goto modify_qp_exit2; + } ++ sport = &shca->sport[attr->port_num - 1]; ++ if (!sport->ibqp_sqp[IB_QPT_GSI]) { ++ /* should not occur */ ++ ret = -EFAULT; ++ ehca_err(ibqp->device, "AQP1 was not created for " ++ "port=%x", attr->port_num); ++ goto modify_qp_exit2; ++ } ++ aqp1 = container_of(sport->ibqp_sqp[IB_QPT_GSI], ++ struct ehca_qp, ib_qp); ++ if (ibqp->qp_type != IB_QPT_GSI && ++ ibqp->qp_type != IB_QPT_SMI && ++ aqp1->mod_qp_parm) { ++ /* ++ * firmware will reject this modify_qp() because ++ * port is not activated/initialized fully ++ */ ++ ret = -EFAULT; ++ ehca_warn(ibqp->device, "Couldn't modify qp port=%x: " ++ "either port is being activated (try again) " ++ "or cabling issue", attr->port_num); ++ goto modify_qp_exit2; ++ } + mqpcb->prim_phys_port = attr->port_num; + update_mask |= EHCA_BMASK_SET(MQPCB_MASK_PRIM_PHYS_PORT, 1); + } +@@ -1470,6 +1517,8 @@ modify_qp_exit1: + int ehca_modify_qp(struct ib_qp *ibqp, struct ib_qp_attr *attr, int attr_mask, + struct ib_udata *udata) + { ++ struct ehca_shca *shca = container_of(ibqp->device, struct ehca_shca, ++ ib_device); + struct ehca_qp *my_qp = container_of(ibqp, struct ehca_qp, ib_qp); + struct ehca_pd *my_pd = container_of(my_qp->ib_qp.pd, struct ehca_pd, + ib_pd); +@@ -1482,9 +1531,100 @@ int ehca_modify_qp(struct ib_qp *ibqp, struct ib_qp_attr *attr, int attr_mask, + return -EINVAL; + } + ++ /* The if-block below caches qp_attr to be modified for GSI and SMI ++ * qps during the initialization by ib_mad. 
When the respective port ++ * is activated, ie we got an event PORT_ACTIVE, we'll replay the ++ * cached modify calls sequence, see ehca_recover_sqs() below. ++ * Why that is required: ++ * 1) If one port is connected, older code requires that port one ++ * to be connected and module option nr_ports=1 to be given by ++ * user, which is very inconvenient for end user. ++ * 2) Firmware accepts modify_qp() only if respective port has become ++ * active. Older code had a wait loop of 30sec create_qp()/ ++ * define_aqp1(), which is not appropriate in practice. This ++ * code now removes that wait loop, see define_aqp1(), and always ++ * reports all ports to ib_mad resp. users. Only activated ports ++ * will then usable for the users. ++ */ ++ if (ibqp->qp_type == IB_QPT_GSI || ibqp->qp_type == IB_QPT_SMI) { ++ int port = my_qp->init_attr.port_num; ++ struct ehca_sport *sport = &shca->sport[port - 1]; ++ unsigned long flags; ++ spin_lock_irqsave(&sport->mod_sqp_lock, flags); ++ /* cache qp_attr only during init */ ++ if (my_qp->mod_qp_parm) { ++ struct ehca_mod_qp_parm *p; ++ if (my_qp->mod_qp_parm_idx >= EHCA_MOD_QP_PARM_MAX) { ++ ehca_err(&shca->ib_device, ++ "mod_qp_parm overflow state=%x port=%x" ++ " type=%x", attr->qp_state, ++ my_qp->init_attr.port_num, ++ ibqp->qp_type); ++ spin_unlock_irqrestore(&sport->mod_sqp_lock, ++ flags); ++ return -EINVAL; ++ } ++ p = &my_qp->mod_qp_parm[my_qp->mod_qp_parm_idx]; ++ p->mask = attr_mask; ++ p->attr = *attr; ++ my_qp->mod_qp_parm_idx++; ++ ehca_dbg(&shca->ib_device, ++ "Saved qp_attr for state=%x port=%x type=%x", ++ attr->qp_state, my_qp->init_attr.port_num, ++ ibqp->qp_type); ++ spin_unlock_irqrestore(&sport->mod_sqp_lock, flags); ++ return 0; ++ } ++ spin_unlock_irqrestore(&sport->mod_sqp_lock, flags); ++ } ++ + return internal_modify_qp(ibqp, attr, attr_mask, 0); + } + ++void ehca_recover_sqp(struct ib_qp *sqp) ++{ ++ struct ehca_qp *my_sqp = container_of(sqp, struct ehca_qp, ib_qp); ++ int port = 
my_sqp->init_attr.port_num; ++ struct ib_qp_attr attr; ++ struct ehca_mod_qp_parm *qp_parm; ++ int i, qp_parm_idx, ret; ++ unsigned long flags, wr_cnt; ++ ++ if (!my_sqp->mod_qp_parm) ++ return; ++ ehca_dbg(sqp->device, "SQP port=%x qp_num=%x", port, sqp->qp_num); ++ ++ qp_parm = my_sqp->mod_qp_parm; ++ qp_parm_idx = my_sqp->mod_qp_parm_idx; ++ for (i = 0; i < qp_parm_idx; i++) { ++ attr = qp_parm[i].attr; ++ ret = internal_modify_qp(sqp, &attr, qp_parm[i].mask, 0); ++ if (ret) { ++ ehca_err(sqp->device, "Could not modify SQP port=%x " ++ "qp_num=%x ret=%x", port, sqp->qp_num, ret); ++ goto free_qp_parm; ++ } ++ ehca_dbg(sqp->device, "SQP port=%x qp_num=%x in state=%x", ++ port, sqp->qp_num, attr.qp_state); ++ } ++ ++ /* re-trigger posted recv wrs */ ++ wr_cnt = my_sqp->ipz_rqueue.current_q_offset / ++ my_sqp->ipz_rqueue.qe_size; ++ if (wr_cnt) { ++ spin_lock_irqsave(&my_sqp->spinlock_r, flags); ++ hipz_update_rqa(my_sqp, wr_cnt); ++ spin_unlock_irqrestore(&my_sqp->spinlock_r, flags); ++ ehca_dbg(sqp->device, "doorbell port=%x qp_num=%x wr_cnt=%lx", ++ port, sqp->qp_num, wr_cnt); ++ } ++ ++free_qp_parm: ++ kfree(qp_parm); ++ /* this prevents subsequent calls to modify_qp() to cache qp_attr */ ++ my_sqp->mod_qp_parm = NULL; ++} ++ + int ehca_query_qp(struct ib_qp *qp, + struct ib_qp_attr *qp_attr, + int qp_attr_mask, struct ib_qp_init_attr *qp_init_attr) +@@ -1772,6 +1912,7 @@ static int internal_destroy_qp(struct ib_device *dev, struct ehca_qp *my_qp, + struct ehca_shca *shca = container_of(dev, struct ehca_shca, ib_device); + struct ehca_pd *my_pd = container_of(my_qp->ib_qp.pd, struct ehca_pd, + ib_pd); ++ struct ehca_sport *sport = &shca->sport[my_qp->init_attr.port_num - 1]; + u32 cur_pid = current->tgid; + u32 qp_num = my_qp->real_qp_num; + int ret; +@@ -1818,6 +1959,14 @@ static int internal_destroy_qp(struct ib_device *dev, struct ehca_qp *my_qp, + port_num = my_qp->init_attr.port_num; + qp_type = my_qp->init_attr.qp_type; + ++ if (qp_type == IB_QPT_SMI || 
qp_type == IB_QPT_GSI) { ++ spin_lock_irqsave(&sport->mod_sqp_lock, flags); ++ kfree(my_qp->mod_qp_parm); ++ my_qp->mod_qp_parm = NULL; ++ shca->sport[port_num - 1].ibqp_sqp[qp_type] = NULL; ++ spin_unlock_irqrestore(&sport->mod_sqp_lock, flags); ++ } ++ + /* no support for IB_QPT_SMI yet */ + if (qp_type == IB_QPT_GSI) { + struct ib_event event; +diff --git a/drivers/infiniband/hw/ehca/ehca_sqp.c b/drivers/infiniband/hw/ehca/ehca_sqp.c +index f0792e5..79e72b2 100644 +--- a/drivers/infiniband/hw/ehca/ehca_sqp.c ++++ b/drivers/infiniband/hw/ehca/ehca_sqp.c +@@ -40,11 +40,8 @@ + */ + + +-#include +-#include + #include "ehca_classes.h" + #include "ehca_tools.h" +-#include "ehca_qes.h" + #include "ehca_iverbs.h" + #include "hcp_if.h" + +@@ -93,6 +90,9 @@ u64 ehca_define_sqp(struct ehca_shca *shca, + return H_PARAMETER; + } + ++ if (ehca_nr_ports < 0) /* autodetect mode */ ++ return H_SUCCESS; ++ + for (counter = 0; + shca->sport[port - 1].port_state != IB_PORT_ACTIVE && + counter < ehca_port_act_time; +-- +1.5.2 + diff --git a/kernel_patches/fixes/ehca_0007_Prevent_RDMA_related_connection_failures.patch b/kernel_patches/fixes/ehca_0007_Prevent_RDMA_related_connection_failures.patch new file mode 100644 index 0000000..a7744d2 --- /dev/null +++ b/kernel_patches/fixes/ehca_0007_Prevent_RDMA_related_connection_failures.patch @@ -0,0 +1,276 @@ +From a1f46cca6affc61b78050b85eec957b14fa6ea58 Mon Sep 17 00:00:00 2001 +From: root +Date: Tue, 22 Jan 2008 16:27:52 +0100 +Subject: [PATCH] IB/ehca: Prevent RDMA-related connection failures on some eHCA2 hardware + +Some HW revisions of eHCA2 may cause an RC connection to break if they +received RDMA Reads over that connection before. This can be +prevented by assuring that, after the first RDMA Read, the QP receives +a new RDMA Read every few million link packets. + +Include code into the driver that inserts an empty (size 0) RDMA Read +into the message stream every now and then if the consumer doesn't +post them frequently enough. 
+ +Signed-off-by: Joachim Fenkes +Signed-off-by: Roland Dreier +--- + drivers/infiniband/hw/ehca/ehca_classes.h | 5 ++ + drivers/infiniband/hw/ehca/ehca_qp.c | 14 +++- + drivers/infiniband/hw/ehca/ehca_reqs.c | 112 ++++++++++++++++++++-------- + 3 files changed, 95 insertions(+), 36 deletions(-) + +diff --git a/drivers/infiniband/hw/ehca/ehca_classes.h b/drivers/infiniband/hw/ehca/ehca_classes.h +index 997c3d1..8b76ac3 100644 +--- a/drivers/infiniband/hw/ehca/ehca_classes.h ++++ b/drivers/infiniband/hw/ehca/ehca_classes.h +@@ -183,6 +183,11 @@ struct ehca_qp { + u32 mm_count_squeue; + u32 mm_count_rqueue; + u32 mm_count_galpa; ++ /* unsolicited ack circumvention */ ++ int unsol_ack_circ; ++ int mtu_shift; ++ u32 message_count; ++ u32 packet_count; + }; + + #define IS_SRQ(qp) (qp->ext_type == EQPT_SRQ) +diff --git a/drivers/infiniband/hw/ehca/ehca_qp.c b/drivers/infiniband/hw/ehca/ehca_qp.c +index 53310f0..e32f964 100644 +--- a/drivers/infiniband/hw/ehca/ehca_qp.c ++++ b/drivers/infiniband/hw/ehca/ehca_qp.c +@@ -592,10 +592,8 @@ static struct ehca_qp *internal_create_qp( + goto create_qp_exit1; + } + +- if (init_attr->sq_sig_type == IB_SIGNAL_ALL_WR) +- parms.sigtype = HCALL_SIGT_EVERY; +- else +- parms.sigtype = HCALL_SIGT_BY_WQE; ++ /* Always signal by WQE so we can hide circ. 
WQEs */ ++ parms.sigtype = HCALL_SIGT_BY_WQE; + + /* UD_AV CIRCUMVENTION */ + max_send_sge = init_attr->cap.max_send_sge; +@@ -618,6 +616,10 @@ static struct ehca_qp *internal_create_qp( + parms.squeue.max_sge = max_send_sge; + parms.rqueue.max_sge = max_recv_sge; + ++ /* RC QPs need one more SWQE for unsolicited ack circumvention */ ++ if (qp_type == IB_QPT_RC) ++ parms.squeue.max_wr++; ++ + if (EHCA_BMASK_GET(HCA_CAP_MINI_QP, shca->hca_cap)) { + if (HAS_SQ(my_qp)) + ehca_determine_small_queue( +@@ -650,6 +652,8 @@ static struct ehca_qp *internal_create_qp( + parms.squeue.act_nr_sges = 1; + parms.rqueue.act_nr_sges = 1; + } ++ /* hide the extra WQE */ ++ parms.squeue.act_nr_wqes--; + break; + case IB_QPT_UD: + case IB_QPT_GSI: +@@ -1294,6 +1298,8 @@ static int internal_modify_qp(struct ib_qp *ibqp, + } + + if (attr_mask & IB_QP_PATH_MTU) { ++ /* store ld(MTU) */ ++ my_qp->mtu_shift = attr->path_mtu + 7; + mqpcb->path_mtu = attr->path_mtu; + update_mask |= EHCA_BMASK_SET(MQPCB_MASK_PATH_MTU, 1); + } +diff --git a/drivers/infiniband/hw/ehca/ehca_reqs.c b/drivers/infiniband/hw/ehca/ehca_reqs.c +index ea91360..3aacc8c 100644 +--- a/drivers/infiniband/hw/ehca/ehca_reqs.c ++++ b/drivers/infiniband/hw/ehca/ehca_reqs.c +@@ -50,6 +50,9 @@ + #include "hcp_if.h" + #include "hipz_fns.h" + ++/* in RC traffic, insert an empty RDMA READ every this many packets */ ++#define ACK_CIRC_THRESHOLD 2000000 ++ + static inline int ehca_write_rwqe(struct ipz_queue *ipz_rqueue, + struct ehca_wqe *wqe_p, + struct ib_recv_wr *recv_wr) +@@ -81,7 +84,7 @@ static inline int ehca_write_rwqe(struct ipz_queue *ipz_rqueue, + if (ehca_debug_level) { + ehca_gen_dbg("RECEIVE WQE written into ipz_rqueue=%p", + ipz_rqueue); +- ehca_dmp( wqe_p, 16*(6 + wqe_p->nr_of_data_seg), "recv wqe"); ++ ehca_dmp(wqe_p, 16*(6 + wqe_p->nr_of_data_seg), "recv wqe"); + } + + return 0; +@@ -135,7 +138,8 @@ static void trace_send_wr_ud(const struct ib_send_wr *send_wr) + + static inline int ehca_write_swqe(struct ehca_qp 
*qp, + struct ehca_wqe *wqe_p, +- const struct ib_send_wr *send_wr) ++ const struct ib_send_wr *send_wr, ++ int hidden) + { + u32 idx; + u64 dma_length; +@@ -176,7 +180,9 @@ static inline int ehca_write_swqe(struct ehca_qp *qp, + + wqe_p->wr_flag = 0; + +- if (send_wr->send_flags & IB_SEND_SIGNALED) ++ if ((send_wr->send_flags & IB_SEND_SIGNALED || ++ qp->init_attr.sq_sig_type == IB_SIGNAL_ALL_WR) ++ && !hidden) + wqe_p->wr_flag |= WQE_WRFLAG_REQ_SIGNAL_COM; + + if (send_wr->opcode == IB_WR_SEND_WITH_IMM || +@@ -199,7 +205,7 @@ static inline int ehca_write_swqe(struct ehca_qp *qp, + + wqe_p->destination_qp_number = send_wr->wr.ud.remote_qpn << 8; + wqe_p->local_ee_context_qkey = remote_qkey; +- if (!send_wr->wr.ud.ah) { ++ if (unlikely(!send_wr->wr.ud.ah)) { + ehca_gen_err("wr.ud.ah is NULL. qp=%p", qp); + return -EINVAL; + } +@@ -255,6 +261,15 @@ static inline int ehca_write_swqe(struct ehca_qp *qp, + } /* eof idx */ + wqe_p->u.nud.atomic_1st_op_dma_len = dma_length; + ++ /* unsolicited ack circumvention */ ++ if (send_wr->opcode == IB_WR_RDMA_READ) { ++ /* on RDMA read, switch on and reset counters */ ++ qp->message_count = qp->packet_count = 0; ++ qp->unsol_ack_circ = 1; ++ } else ++ /* else estimate #packets */ ++ qp->packet_count += (dma_length >> qp->mtu_shift) + 1; ++ + break; + + default: +@@ -355,13 +370,49 @@ static inline void map_ib_wc_status(u32 cqe_status, + *wc_status = IB_WC_SUCCESS; + } + ++static inline int post_one_send(struct ehca_qp *my_qp, ++ struct ib_send_wr *cur_send_wr, ++ struct ib_send_wr **bad_send_wr, ++ int hidden) ++{ ++ struct ehca_wqe *wqe_p; ++ int ret; ++ u64 start_offset = my_qp->ipz_squeue.current_q_offset; ++ ++ /* get pointer next to free WQE */ ++ wqe_p = ipz_qeit_get_inc(&my_qp->ipz_squeue); ++ if (unlikely(!wqe_p)) { ++ /* too many posted work requests: queue overflow */ ++ if (bad_send_wr) ++ *bad_send_wr = cur_send_wr; ++ ehca_err(my_qp->ib_qp.device, "Too many posted WQEs " ++ "qp_num=%x", my_qp->ib_qp.qp_num); ++ 
return -ENOMEM; ++ } ++ /* write a SEND WQE into the QUEUE */ ++ ret = ehca_write_swqe(my_qp, wqe_p, cur_send_wr, hidden); ++ /* ++ * if something failed, ++ * reset the free entry pointer to the start value ++ */ ++ if (unlikely(ret)) { ++ my_qp->ipz_squeue.current_q_offset = start_offset; ++ if (bad_send_wr) ++ *bad_send_wr = cur_send_wr; ++ ehca_err(my_qp->ib_qp.device, "Could not write WQE " ++ "qp_num=%x", my_qp->ib_qp.qp_num); ++ return -EINVAL; ++ } ++ ++ return 0; ++} ++ + int ehca_post_send(struct ib_qp *qp, + struct ib_send_wr *send_wr, + struct ib_send_wr **bad_send_wr) + { + struct ehca_qp *my_qp = container_of(qp, struct ehca_qp, ib_qp); + struct ib_send_wr *cur_send_wr; +- struct ehca_wqe *wqe_p; + int wqe_cnt = 0; + int ret = 0; + unsigned long flags; +@@ -369,37 +420,33 @@ int ehca_post_send(struct ib_qp *qp, + /* LOCK the QUEUE */ + spin_lock_irqsave(&my_qp->spinlock_s, flags); + ++ /* Send an empty extra RDMA read if: ++ * 1) there has been an RDMA read on this connection before ++ * 2) no RDMA read occurred for ACK_CIRC_THRESHOLD link packets ++ * 3) we can be sure that any previous extra RDMA read has been ++ * processed so we don't overflow the SQ ++ */ ++ if (unlikely(my_qp->unsol_ack_circ && ++ my_qp->packet_count > ACK_CIRC_THRESHOLD && ++ my_qp->message_count > my_qp->init_attr.cap.max_send_wr)) { ++ /* insert an empty RDMA READ to fix up the remote QP state */ ++ struct ib_send_wr circ_wr; ++ memset(&circ_wr, 0, sizeof(circ_wr)); ++ circ_wr.opcode = IB_WR_RDMA_READ; ++ post_one_send(my_qp, &circ_wr, NULL, 1); /* ignore retcode */ ++ wqe_cnt++; ++ ehca_dbg(qp->device, "posted circ wr qp_num=%x", qp->qp_num); ++ my_qp->message_count = my_qp->packet_count = 0; ++ } ++ + /* loop processes list of send reqs */ + for (cur_send_wr = send_wr; cur_send_wr != NULL; + cur_send_wr = cur_send_wr->next) { +- u64 start_offset = my_qp->ipz_squeue.current_q_offset; +- /* get pointer next to free WQE */ +- wqe_p = ipz_qeit_get_inc(&my_qp->ipz_squeue); +- if 
(unlikely(!wqe_p)) { +- /* too many posted work requests: queue overflow */ +- if (bad_send_wr) +- *bad_send_wr = cur_send_wr; +- if (wqe_cnt == 0) { +- ret = -ENOMEM; +- ehca_err(qp->device, "Too many posted WQEs " +- "qp_num=%x", qp->qp_num); +- } +- goto post_send_exit0; +- } +- /* write a SEND WQE into the QUEUE */ +- ret = ehca_write_swqe(my_qp, wqe_p, cur_send_wr); +- /* +- * if something failed, +- * reset the free entry pointer to the start value +- */ ++ ret = post_one_send(my_qp, cur_send_wr, bad_send_wr, 0); + if (unlikely(ret)) { +- my_qp->ipz_squeue.current_q_offset = start_offset; +- *bad_send_wr = cur_send_wr; +- if (wqe_cnt == 0) { +- ret = -EINVAL; +- ehca_err(qp->device, "Could not write WQE " +- "qp_num=%x", qp->qp_num); +- } ++ /* if one or more WQEs were successful, don't fail */ ++ if (wqe_cnt) ++ ret = 0; + goto post_send_exit0; + } + wqe_cnt++; +@@ -410,6 +457,7 @@ int ehca_post_send(struct ib_qp *qp, + post_send_exit0: + iosync(); /* serialize GAL register access */ + hipz_update_sqa(my_qp, wqe_cnt); ++ my_qp->message_count += wqe_cnt; + spin_unlock_irqrestore(&my_qp->spinlock_s, flags); + return ret; + } +-- +1.5.2 + -- 1.5.2 From hnguyen at linux.vnet.ibm.com Wed Jan 23 09:47:20 2008 From: hnguyen at linux.vnet.ibm.com (Hoang-Nam Nguyen) Date: Wed, 23 Jan 2008 18:47:20 +0100 Subject: [ofa-general] [PATCH 2/3] ofed-1.3-rc3 IB/ehca: backport patch for rhel4.5 Message-ID: <200801231847.20534.hnguyen@linux.vnet.ibm.com> IB/ehca: backport ehca's mmap-to-user-space for 2.6.9_U5 This patch is required because the previous version of the backport no longer applies after many changes in the source code. No functional changes were made in this new version. 
Signed-off-by: Hoang-Nam Nguyen --- .../2.6.9_U5/backport_ehca_2_rhel45_umap.patch | 120 ++++++++++---------- 1 files changed, 58 insertions(+), 62 deletions(-) diff --git a/kernel_patches/backport/2.6.9_U5/backport_ehca_2_rhel45_umap.patch b/kernel_patches/backport/2.6.9_U5/backport_ehca_2_rhel45_umap.patch index fccef72..ff9b9e3 100644 --- a/kernel_patches/backport/2.6.9_U5/backport_ehca_2_rhel45_umap.patch +++ b/kernel_patches/backport/2.6.9_U5/backport_ehca_2_rhel45_umap.patch @@ -1,25 +1,22 @@ -diff -Nurp ofa_kernel-1.3/drivers/infiniband/hw/ehca/ehca_classes.h rhel4u5_ofa_kernel-1.3/drivers/infiniband/hw/ehca/ehca_classes.h ---- ofa_kernel-1.3/drivers/infiniband/hw/ehca/ehca_classes.h 2007-09-26 09:45:25.000000000 -0700 -+++ rhel4u5_ofa_kernel-1.3/drivers/infiniband/hw/ehca/ehca_classes.h 2007-09-26 03:41:13.000000000 -0700 -@@ -161,14 +161,13 @@ struct ehca_qp { - struct ipz_qp_handle ipz_qp_handle; - struct ehca_pfqp pf; - struct ib_qp_init_attr init_attr; -+ u64 uspace_squeue; -+ u64 uspace_rqueue; -+ u64 uspace_fwh; - struct ehca_cq *send_cq; - struct ehca_cq *recv_cq; - unsigned int sqerr_purgeflag; - struct hlist_node list_entries; +diff -Nurp a/drivers/infiniband/hw/ehca/ehca_classes.h b/drivers/infiniband/hw/ehca/ehca_classes.h +--- a/drivers/infiniband/hw/ehca/ehca_classes.h 2008-01-23 15:30:42.000000000 +0100 ++++ b/drivers/infiniband/hw/ehca/ehca_classes.h 2008-01-23 15:38:37.000000000 +0100 +@@ -179,10 +179,10 @@ struct ehca_qp { + /* array to cache modify_qp()'s parms for GSI/SMI qp */ + struct ehca_mod_qp_parm *mod_qp_parm; + int mod_qp_parm_idx; - /* mmap counter for resources mapped into user space */ - u32 mm_count_squeue; - u32 mm_count_rqueue; - u32 mm_count_galpa; - }; - - #define IS_SRQ(qp) (qp->ext_type == EQPT_SRQ) -@@ -189,6 +188,8 @@ struct ehca_cq { ++ /* mmap addr */ ++ u64 uspace_squeue; ++ u64 uspace_rqueue; ++ u64 uspace_fwh; + /* unsolicited ack circumvention */ + int unsol_ack_circ; + int mtu_shift; +@@ -208,6 +208,8 @@ struct 
ehca_cq { struct ipz_cq_handle ipz_cq_handle; struct ehca_pfcq pf; spinlock_t cb_lock; @@ -28,7 +25,7 @@ diff -Nurp ofa_kernel-1.3/drivers/infiniband/hw/ehca/ehca_classes.h rhel4u5_ofa_ struct hlist_head qp_hashtab[QP_HASHTAB_LEN]; struct list_head entry; u32 nr_callbacks; /* #events assigned to cpu by scaling code */ -@@ -196,9 +197,6 @@ struct ehca_cq { +@@ -215,9 +217,6 @@ struct ehca_cq { wait_queue_head_t wait_completion; spinlock_t task_lock; u32 ownpid; @@ -38,7 +35,7 @@ diff -Nurp ofa_kernel-1.3/drivers/infiniband/hw/ehca/ehca_classes.h rhel4u5_ofa_ }; enum ehca_mr_flag { -@@ -301,6 +299,20 @@ struct ehca_ucontext { +@@ -320,6 +319,20 @@ struct ehca_ucontext { struct ib_ucontext ib_ucontext; }; @@ -59,15 +56,15 @@ diff -Nurp ofa_kernel-1.3/drivers/infiniband/hw/ehca/ehca_classes.h rhel4u5_ofa_ int ehca_init_pd_cache(void); void ehca_cleanup_pd_cache(void); int ehca_init_cq_cache(void); -@@ -326,6 +338,7 @@ extern int ehca_scaling_code; - extern int ehca_mr_largepage; +@@ -345,6 +358,7 @@ extern int ehca_scaling_code; + extern int ehca_nr_ports; struct ipzu_queue_resp { + u64 queue; /* points to first queue entry */ u32 qe_size; /* queue entry size */ u32 act_nr_of_sg; u32 queue_length; /* queue length allocated in bytes */ -@@ -338,6 +351,7 @@ struct ehca_create_cq_resp { +@@ -357,6 +371,7 @@ struct ehca_create_cq_resp { u32 cq_number; u32 token; struct ipzu_queue_resp ipz_queue; @@ -75,17 +72,17 @@ diff -Nurp ofa_kernel-1.3/drivers/infiniband/hw/ehca/ehca_classes.h rhel4u5_ofa_ u32 fw_handle_ofs; u32 dummy; }; -@@ -353,6 +367,7 @@ struct ehca_create_qp_resp { - u32 fw_handle_ofs; +@@ -373,6 +388,7 @@ struct ehca_create_qp_resp { + u32 dummy; struct ipzu_queue_resp ipz_squeue; struct ipzu_queue_resp ipz_rqueue; + struct h_galpas galpas; }; struct ehca_alloc_cq_parms { -diff -Nurp ofa_kernel-1.3/drivers/infiniband/hw/ehca/ehca_cq.c rhel4u5_ofa_kernel-1.3/drivers/infiniband/hw/ehca/ehca_cq.c ---- ofa_kernel-1.3/drivers/infiniband/hw/ehca/ehca_cq.c 2007-09-26 
09:45:25.000000000 -0700 -+++ rhel4u5_ofa_kernel-1.3/drivers/infiniband/hw/ehca/ehca_cq.c 2007-09-26 03:46:28.000000000 -0700 +diff -Nurp a/drivers/infiniband/hw/ehca/ehca_cq.c b/drivers/infiniband/hw/ehca/ehca_cq.c +--- a/drivers/infiniband/hw/ehca/ehca_cq.c 2008-01-23 15:30:42.000000000 +0100 ++++ b/drivers/infiniband/hw/ehca/ehca_cq.c 2008-01-23 15:38:37.000000000 +0100 @@ -273,6 +273,7 @@ struct ib_cq *ehca_create_cq(struct ib_d if (context) { struct ipz_queue *ipz_queue = &my_cq->ipz_queue; @@ -203,12 +200,12 @@ diff -Nurp ofa_kernel-1.3/drivers/infiniband/hw/ehca/ehca_cq.c rhel4u5_ofa_kerne ehca_err(cq->device, "Invalid caller pid=%x ownpid=%x", cur_pid, my_cq->ownpid); return -EINVAL; -diff -Nurp ofa_kernel-1.3/drivers/infiniband/hw/ehca/ehca_iverbs.h rhel4u5_ofa_kernel-1.3/drivers/infiniband/hw/ehca/ehca_iverbs.h ---- ofa_kernel-1.3/drivers/infiniband/hw/ehca/ehca_iverbs.h 2007-09-24 06:02:36.000000000 -0700 -+++ rhel4u5_ofa_kernel-1.3/drivers/infiniband/hw/ehca/ehca_iverbs.h 2007-09-26 03:25:27.000000000 -0700 -@@ -189,6 +189,14 @@ int ehca_mmap(struct ib_ucontext *contex - - void ehca_poll_eqs(unsigned long data); +diff -Nurp a/drivers/infiniband/hw/ehca/ehca_iverbs.h b/drivers/infiniband/hw/ehca/ehca_iverbs.h +--- a/drivers/infiniband/hw/ehca/ehca_iverbs.h 2008-01-23 15:30:42.000000000 +0100 ++++ b/drivers/infiniband/hw/ehca/ehca_iverbs.h 2008-01-23 15:38:37.000000000 +0100 +@@ -192,6 +192,14 @@ void ehca_poll_eqs(unsigned long data); + int ehca_calc_ipd(struct ehca_shca *shca, int port, + enum ib_rate path_rate, u32 *ipd); +int ehca_mmap_nopage(u64 foffset,u64 length,void **mapped, + struct vm_area_struct **vma); @@ -221,10 +218,10 @@ diff -Nurp ofa_kernel-1.3/drivers/infiniband/hw/ehca/ehca_iverbs.h rhel4u5_ofa_k #ifdef CONFIG_PPC_64K_PAGES void *ehca_alloc_fw_ctrlblock(gfp_t flags); void ehca_free_fw_ctrlblock(void *ptr); -diff -Nurp ofa_kernel-1.3/drivers/infiniband/hw/ehca/ehca_main.c rhel4u5_ofa_kernel-1.3/drivers/infiniband/hw/ehca/ehca_main.c 
---- ofa_kernel-1.3/drivers/infiniband/hw/ehca/ehca_main.c 2007-09-26 09:45:25.000000000 -0700 -+++ rhel4u5_ofa_kernel-1.3/drivers/infiniband/hw/ehca/ehca_main.c 2007-09-27 03:14:39.000000000 -0700 -@@ -382,7 +382,7 @@ int ehca_init_device(struct ehca_shca *s +diff -Nurp a/drivers/infiniband/hw/ehca/ehca_main.c b/drivers/infiniband/hw/ehca/ehca_main.c +--- a/drivers/infiniband/hw/ehca/ehca_main.c 2008-01-23 15:30:42.000000000 +0100 ++++ b/drivers/infiniband/hw/ehca/ehca_main.c 2008-01-23 15:38:37.000000000 +0100 +@@ -394,7 +394,7 @@ int ehca_init_device(struct ehca_shca *s strlcpy(shca->ib_device.name, "ehca%d", IB_DEVICE_NAME_MAX); shca->ib_device.owner = THIS_MODULE; @@ -233,9 +230,9 @@ diff -Nurp ofa_kernel-1.3/drivers/infiniband/hw/ehca/ehca_main.c rhel4u5_ofa_ker shca->ib_device.uverbs_cmd_mask = (1ull << IB_USER_VERBS_CMD_GET_CONTEXT) | (1ull << IB_USER_VERBS_CMD_QUERY_DEVICE) | -diff -Nurp ofa_kernel-1.3/drivers/infiniband/hw/ehca/ehca_qp.c rhel4u5_ofa_kernel-1.3/drivers/infiniband/hw/ehca/ehca_qp.c ---- ofa_kernel-1.3/drivers/infiniband/hw/ehca/ehca_qp.c 2007-09-26 09:45:25.000000000 -0700 -+++ rhel4u5_ofa_kernel-1.3/drivers/infiniband/hw/ehca/ehca_qp.c 2007-09-26 10:29:16.000000000 -0700 +diff -Nurp a/drivers/infiniband/hw/ehca/ehca_qp.c b/drivers/infiniband/hw/ehca/ehca_qp.c +--- a/drivers/infiniband/hw/ehca/ehca_qp.c 2008-01-23 15:30:42.000000000 +0100 ++++ b/drivers/infiniband/hw/ehca/ehca_qp.c 2008-01-23 15:38:37.000000000 +0100 @@ -265,15 +265,19 @@ static inline int ibqptype2servicetype(e /* * init userspace queue info from ipz_queue data @@ -258,7 +255,7 @@ diff -Nurp ofa_kernel-1.3/drivers/infiniband/hw/ehca/ehca_qp.c rhel4u5_ofa_kerne } /* -@@ -748,6 +752,7 @@ static struct ehca_qp *internal_create_q +@@ -773,6 +777,7 @@ static struct ehca_qp *internal_create_q /* copy queues, galpa data to user space */ if (context && udata) { struct ehca_create_qp_resp resp; @@ -266,11 +263,10 @@ diff -Nurp ofa_kernel-1.3/drivers/infiniband/hw/ehca/ehca_qp.c 
rhel4u5_ofa_kerne memset(&resp, 0, sizeof(resp)); resp.qp_num = my_qp->real_qp_num; -@@ -756,23 +761,55 @@ static struct ehca_qp *internal_create_q - resp.ext_type = my_qp->ext_type; +@@ -782,22 +787,55 @@ static struct ehca_qp *internal_create_q resp.qkey = my_qp->qkey; resp.real_qp_num = my_qp->real_qp_num; -- + - if (HAS_SQ(my_qp)) - queue2resp(&resp.ipz_squeue, &my_qp->ipz_squeue); - if (HAS_RQ(my_qp)) @@ -284,7 +280,7 @@ diff -Nurp ofa_kernel-1.3/drivers/infiniband/hw/ehca/ehca_qp.c rhel4u5_ofa_kerne + if (ret) { + ehca_err(pd->device, + "Could not mmap squeue pages"); -+ goto create_qp_exit4; ++ goto create_qp_exit6; + } + } + if (HAS_RQ(my_qp)) { @@ -294,7 +290,7 @@ diff -Nurp ofa_kernel-1.3/drivers/infiniband/hw/ehca/ehca_qp.c rhel4u5_ofa_kerne + if (ret) { + ehca_err(pd->device, + "Could not mmap rqueue pages"); -+ goto create_qp_exit5; ++ goto create_qp_exit7; + } + } + /* fw_handle */ @@ -304,33 +300,33 @@ diff -Nurp ofa_kernel-1.3/drivers/infiniband/hw/ehca/ehca_qp.c rhel4u5_ofa_kerne + &vma); + if (ret) { + ehca_err(pd->device, "Could not mmap fw_handle"); -+ goto create_qp_exit6; ++ goto create_qp_exit8; + } + my_qp->uspace_fwh = (u64)resp.galpas.kernel.fw_handle; if (ib_copy_to_udata(udata, &resp, sizeof resp)) { ehca_err(pd->device, "Copy to udata failed"); ret = -EINVAL; -- goto create_qp_exit4; -+ goto create_qp_exit7; +- goto create_qp_exit6; ++ goto create_qp_exit9; } } return my_qp; -+create_qp_exit7: ++create_qp_exit9: + ehca_munmap(my_qp->uspace_fwh, EHCA_PAGESIZE); + -+create_qp_exit6: ++create_qp_exit8: + ehca_munmap(my_qp->uspace_rqueue, my_qp->ipz_rqueue.queue_length); + -+create_qp_exit5: ++create_qp_exit7: + ehca_munmap(my_qp->uspace_squeue, my_qp->ipz_squeue.queue_length); + - create_qp_exit4: - if (HAS_RQ(my_qp)) - ipz_queue_dtor(my_pd, &my_qp->ipz_rqueue); -@@ -1124,7 +1161,7 @@ static int internal_modify_qp(struct ib_ + create_qp_exit6: + ehca_cq_unassign_qp(my_qp->send_cq, my_qp->real_qp_num); + +@@ -1157,7 +1195,7 @@ static int 
internal_modify_qp(struct ib_ my_qp->qp_type == IB_QPT_SMI) && statetrans == IB_QPST_SQE2RTS) { /* mark next free wqe if kernel */ @@ -339,7 +335,7 @@ diff -Nurp ofa_kernel-1.3/drivers/infiniband/hw/ehca/ehca_qp.c rhel4u5_ofa_kerne struct ehca_wqe *wqe; /* lock send queue */ spin_lock_irqsave(&my_qp->spinlock_s, flags); -@@ -1777,18 +1814,11 @@ static int internal_destroy_qp(struct ib +@@ -1927,18 +1965,11 @@ static int internal_destroy_qp(struct ib enum ib_qp_type qp_type; unsigned long flags; @@ -363,7 +359,7 @@ diff -Nurp ofa_kernel-1.3/drivers/infiniband/hw/ehca/ehca_qp.c rhel4u5_ofa_kerne } if (my_qp->send_cq) { -@@ -1805,6 +1835,24 @@ static int internal_destroy_qp(struct ib +@@ -1955,6 +1986,24 @@ static int internal_destroy_qp(struct ib idr_remove(&ehca_qp_idr, my_qp->token); write_unlock_irqrestore(&ehca_qp_idr_lock, flags); @@ -388,9 +384,9 @@ diff -Nurp ofa_kernel-1.3/drivers/infiniband/hw/ehca/ehca_qp.c rhel4u5_ofa_kerne h_ret = hipz_h_destroy_qp(shca->ipz_hca_handle, my_qp); if (h_ret != H_SUCCESS) { ehca_err(dev, "hipz_h_destroy_qp() failed h_ret=%li " -diff -Nurp ofa_kernel-1.3/drivers/infiniband/hw/ehca/ehca_uverbs.c rhel4u5_ofa_kernel-1.3/drivers/infiniband/hw/ehca/ehca_uverbs.c ---- ofa_kernel-1.3/drivers/infiniband/hw/ehca/ehca_uverbs.c 2007-09-26 09:45:25.000000000 -0700 -+++ rhel4u5_ofa_kernel-1.3/drivers/infiniband/hw/ehca/ehca_uverbs.c 2007-09-26 06:42:19.000000000 -0700 +diff -Nurp a/drivers/infiniband/hw/ehca/ehca_uverbs.c b/drivers/infiniband/hw/ehca/ehca_uverbs.c +--- a/drivers/infiniband/hw/ehca/ehca_uverbs.c 2008-01-23 15:30:42.000000000 +0100 ++++ b/drivers/infiniband/hw/ehca/ehca_uverbs.c 2008-01-23 15:38:37.000000000 +0100 @@ -68,184 +68,104 @@ int ehca_dealloc_ucontext(struct ib_ucon return 0; } -- 1.5.2 From hnguyen at linux.vnet.ibm.com Wed Jan 23 09:47:45 2008 From: hnguyen at linux.vnet.ibm.com (Hoang-Nam Nguyen) Date: Wed, 23 Jan 2008 18:47:45 +0100 Subject: [ofa-general] [PATCH 3/3] ofed-1.3-rc3 IB/ehca: backport patch for 
rhel4.6 Message-ID: <200801231847.45491.hnguyen@linux.vnet.ibm.com> IB/ehca: backport ehca's mmap-to-user-space for 2.6.9_U6 This patch is required because the previous version no longer applies after many changes in the source code. This new version makes no functional changes. Signed-off-by: Hoang-Nam Nguyen --- .../2.6.9_U6/backport_ehca_2_rhel45_umap.patch | 120 ++++++++++---------- 1 files changed, 58 insertions(+), 62 deletions(-) diff --git a/kernel_patches/backport/2.6.9_U6/backport_ehca_2_rhel45_umap.patch b/kernel_patches/backport/2.6.9_U6/backport_ehca_2_rhel45_umap.patch index fccef72..ff9b9e3 100644 --- a/kernel_patches/backport/2.6.9_U6/backport_ehca_2_rhel45_umap.patch +++ b/kernel_patches/backport/2.6.9_U6/backport_ehca_2_rhel45_umap.patch @@ -1,25 +1,22 @@ -diff -Nurp ofa_kernel-1.3/drivers/infiniband/hw/ehca/ehca_classes.h rhel4u5_ofa_kernel-1.3/drivers/infiniband/hw/ehca/ehca_classes.h ---- ofa_kernel-1.3/drivers/infiniband/hw/ehca/ehca_classes.h 2007-09-26 09:45:25.000000000 -0700 -+++ rhel4u5_ofa_kernel-1.3/drivers/infiniband/hw/ehca/ehca_classes.h 2007-09-26 03:41:13.000000000 -0700 -@@ -161,14 +161,13 @@ struct ehca_qp { - struct ipz_qp_handle ipz_qp_handle; - struct ehca_pfqp pf; - struct ib_qp_init_attr init_attr; -+ u64 uspace_squeue; -+ u64 uspace_rqueue; -+ u64 uspace_fwh; - struct ehca_cq *send_cq; - struct ehca_cq *recv_cq; - unsigned int sqerr_purgeflag; - struct hlist_node list_entries; +diff -Nurp a/drivers/infiniband/hw/ehca/ehca_classes.h b/drivers/infiniband/hw/ehca/ehca_classes.h +--- a/drivers/infiniband/hw/ehca/ehca_classes.h 2008-01-23 15:30:42.000000000 +0100 ++++ b/drivers/infiniband/hw/ehca/ehca_classes.h 2008-01-23 15:38:37.000000000 +0100 +@@ -179,10 +179,10 @@ struct ehca_qp { + /* array to cache modify_qp()'s parms for GSI/SMI qp */ + struct ehca_mod_qp_parm *mod_qp_parm; + int mod_qp_parm_idx; - /* mmap counter for resources mapped into user space */ - u32 mm_count_squeue; - u32 mm_count_rqueue; - u32 
mm_count_galpa; - }; - - #define IS_SRQ(qp) (qp->ext_type == EQPT_SRQ) -@@ -189,6 +188,8 @@ struct ehca_cq { ++ /* mmap addr */ ++ u64 uspace_squeue; ++ u64 uspace_rqueue; ++ u64 uspace_fwh; + /* unsolicited ack circumvention */ + int unsol_ack_circ; + int mtu_shift; +@@ -208,6 +208,8 @@ struct ehca_cq { struct ipz_cq_handle ipz_cq_handle; struct ehca_pfcq pf; spinlock_t cb_lock; @@ -28,7 +25,7 @@ diff -Nurp ofa_kernel-1.3/drivers/infiniband/hw/ehca/ehca_classes.h rhel4u5_ofa_ struct hlist_head qp_hashtab[QP_HASHTAB_LEN]; struct list_head entry; u32 nr_callbacks; /* #events assigned to cpu by scaling code */ -@@ -196,9 +197,6 @@ struct ehca_cq { +@@ -215,9 +217,6 @@ struct ehca_cq { wait_queue_head_t wait_completion; spinlock_t task_lock; u32 ownpid; @@ -38,7 +35,7 @@ diff -Nurp ofa_kernel-1.3/drivers/infiniband/hw/ehca/ehca_classes.h rhel4u5_ofa_ }; enum ehca_mr_flag { -@@ -301,6 +299,20 @@ struct ehca_ucontext { +@@ -320,6 +319,20 @@ struct ehca_ucontext { struct ib_ucontext ib_ucontext; }; @@ -59,15 +56,15 @@ diff -Nurp ofa_kernel-1.3/drivers/infiniband/hw/ehca/ehca_classes.h rhel4u5_ofa_ int ehca_init_pd_cache(void); void ehca_cleanup_pd_cache(void); int ehca_init_cq_cache(void); -@@ -326,6 +338,7 @@ extern int ehca_scaling_code; - extern int ehca_mr_largepage; +@@ -345,6 +358,7 @@ extern int ehca_scaling_code; + extern int ehca_nr_ports; struct ipzu_queue_resp { + u64 queue; /* points to first queue entry */ u32 qe_size; /* queue entry size */ u32 act_nr_of_sg; u32 queue_length; /* queue length allocated in bytes */ -@@ -338,6 +351,7 @@ struct ehca_create_cq_resp { +@@ -357,6 +371,7 @@ struct ehca_create_cq_resp { u32 cq_number; u32 token; struct ipzu_queue_resp ipz_queue; @@ -75,17 +72,17 @@ diff -Nurp ofa_kernel-1.3/drivers/infiniband/hw/ehca/ehca_classes.h rhel4u5_ofa_ u32 fw_handle_ofs; u32 dummy; }; -@@ -353,6 +367,7 @@ struct ehca_create_qp_resp { - u32 fw_handle_ofs; +@@ -373,6 +388,7 @@ struct ehca_create_qp_resp { + u32 dummy; struct ipzu_queue_resp 
ipz_squeue; struct ipzu_queue_resp ipz_rqueue; + struct h_galpas galpas; }; struct ehca_alloc_cq_parms { -diff -Nurp ofa_kernel-1.3/drivers/infiniband/hw/ehca/ehca_cq.c rhel4u5_ofa_kernel-1.3/drivers/infiniband/hw/ehca/ehca_cq.c ---- ofa_kernel-1.3/drivers/infiniband/hw/ehca/ehca_cq.c 2007-09-26 09:45:25.000000000 -0700 -+++ rhel4u5_ofa_kernel-1.3/drivers/infiniband/hw/ehca/ehca_cq.c 2007-09-26 03:46:28.000000000 -0700 +diff -Nurp a/drivers/infiniband/hw/ehca/ehca_cq.c b/drivers/infiniband/hw/ehca/ehca_cq.c +--- a/drivers/infiniband/hw/ehca/ehca_cq.c 2008-01-23 15:30:42.000000000 +0100 ++++ b/drivers/infiniband/hw/ehca/ehca_cq.c 2008-01-23 15:38:37.000000000 +0100 @@ -273,6 +273,7 @@ struct ib_cq *ehca_create_cq(struct ib_d if (context) { struct ipz_queue *ipz_queue = &my_cq->ipz_queue; @@ -203,12 +200,12 @@ diff -Nurp ofa_kernel-1.3/drivers/infiniband/hw/ehca/ehca_cq.c rhel4u5_ofa_kerne ehca_err(cq->device, "Invalid caller pid=%x ownpid=%x", cur_pid, my_cq->ownpid); return -EINVAL; -diff -Nurp ofa_kernel-1.3/drivers/infiniband/hw/ehca/ehca_iverbs.h rhel4u5_ofa_kernel-1.3/drivers/infiniband/hw/ehca/ehca_iverbs.h ---- ofa_kernel-1.3/drivers/infiniband/hw/ehca/ehca_iverbs.h 2007-09-24 06:02:36.000000000 -0700 -+++ rhel4u5_ofa_kernel-1.3/drivers/infiniband/hw/ehca/ehca_iverbs.h 2007-09-26 03:25:27.000000000 -0700 -@@ -189,6 +189,14 @@ int ehca_mmap(struct ib_ucontext *contex - - void ehca_poll_eqs(unsigned long data); +diff -Nurp a/drivers/infiniband/hw/ehca/ehca_iverbs.h b/drivers/infiniband/hw/ehca/ehca_iverbs.h +--- a/drivers/infiniband/hw/ehca/ehca_iverbs.h 2008-01-23 15:30:42.000000000 +0100 ++++ b/drivers/infiniband/hw/ehca/ehca_iverbs.h 2008-01-23 15:38:37.000000000 +0100 +@@ -192,6 +192,14 @@ void ehca_poll_eqs(unsigned long data); + int ehca_calc_ipd(struct ehca_shca *shca, int port, + enum ib_rate path_rate, u32 *ipd); +int ehca_mmap_nopage(u64 foffset,u64 length,void **mapped, + struct vm_area_struct **vma); @@ -221,10 +218,10 @@ diff -Nurp 
ofa_kernel-1.3/drivers/infiniband/hw/ehca/ehca_iverbs.h rhel4u5_ofa_k #ifdef CONFIG_PPC_64K_PAGES void *ehca_alloc_fw_ctrlblock(gfp_t flags); void ehca_free_fw_ctrlblock(void *ptr); -diff -Nurp ofa_kernel-1.3/drivers/infiniband/hw/ehca/ehca_main.c rhel4u5_ofa_kernel-1.3/drivers/infiniband/hw/ehca/ehca_main.c ---- ofa_kernel-1.3/drivers/infiniband/hw/ehca/ehca_main.c 2007-09-26 09:45:25.000000000 -0700 -+++ rhel4u5_ofa_kernel-1.3/drivers/infiniband/hw/ehca/ehca_main.c 2007-09-27 03:14:39.000000000 -0700 -@@ -382,7 +382,7 @@ int ehca_init_device(struct ehca_shca *s +diff -Nurp a/drivers/infiniband/hw/ehca/ehca_main.c b/drivers/infiniband/hw/ehca/ehca_main.c +--- a/drivers/infiniband/hw/ehca/ehca_main.c 2008-01-23 15:30:42.000000000 +0100 ++++ b/drivers/infiniband/hw/ehca/ehca_main.c 2008-01-23 15:38:37.000000000 +0100 +@@ -394,7 +394,7 @@ int ehca_init_device(struct ehca_shca *s strlcpy(shca->ib_device.name, "ehca%d", IB_DEVICE_NAME_MAX); shca->ib_device.owner = THIS_MODULE; @@ -233,9 +230,9 @@ diff -Nurp ofa_kernel-1.3/drivers/infiniband/hw/ehca/ehca_main.c rhel4u5_ofa_ker shca->ib_device.uverbs_cmd_mask = (1ull << IB_USER_VERBS_CMD_GET_CONTEXT) | (1ull << IB_USER_VERBS_CMD_QUERY_DEVICE) | -diff -Nurp ofa_kernel-1.3/drivers/infiniband/hw/ehca/ehca_qp.c rhel4u5_ofa_kernel-1.3/drivers/infiniband/hw/ehca/ehca_qp.c ---- ofa_kernel-1.3/drivers/infiniband/hw/ehca/ehca_qp.c 2007-09-26 09:45:25.000000000 -0700 -+++ rhel4u5_ofa_kernel-1.3/drivers/infiniband/hw/ehca/ehca_qp.c 2007-09-26 10:29:16.000000000 -0700 +diff -Nurp a/drivers/infiniband/hw/ehca/ehca_qp.c b/drivers/infiniband/hw/ehca/ehca_qp.c +--- a/drivers/infiniband/hw/ehca/ehca_qp.c 2008-01-23 15:30:42.000000000 +0100 ++++ b/drivers/infiniband/hw/ehca/ehca_qp.c 2008-01-23 15:38:37.000000000 +0100 @@ -265,15 +265,19 @@ static inline int ibqptype2servicetype(e /* * init userspace queue info from ipz_queue data @@ -258,7 +255,7 @@ diff -Nurp ofa_kernel-1.3/drivers/infiniband/hw/ehca/ehca_qp.c rhel4u5_ofa_kerne } /* -@@ 
-748,6 +752,7 @@ static struct ehca_qp *internal_create_q +@@ -773,6 +777,7 @@ static struct ehca_qp *internal_create_q /* copy queues, galpa data to user space */ if (context && udata) { struct ehca_create_qp_resp resp; @@ -266,11 +263,10 @@ diff -Nurp ofa_kernel-1.3/drivers/infiniband/hw/ehca/ehca_qp.c rhel4u5_ofa_kerne memset(&resp, 0, sizeof(resp)); resp.qp_num = my_qp->real_qp_num; -@@ -756,23 +761,55 @@ static struct ehca_qp *internal_create_q - resp.ext_type = my_qp->ext_type; +@@ -782,22 +787,55 @@ static struct ehca_qp *internal_create_q resp.qkey = my_qp->qkey; resp.real_qp_num = my_qp->real_qp_num; -- + - if (HAS_SQ(my_qp)) - queue2resp(&resp.ipz_squeue, &my_qp->ipz_squeue); - if (HAS_RQ(my_qp)) @@ -284,7 +280,7 @@ diff -Nurp ofa_kernel-1.3/drivers/infiniband/hw/ehca/ehca_qp.c rhel4u5_ofa_kerne + if (ret) { + ehca_err(pd->device, + "Could not mmap squeue pages"); -+ goto create_qp_exit4; ++ goto create_qp_exit6; + } + } + if (HAS_RQ(my_qp)) { @@ -294,7 +290,7 @@ diff -Nurp ofa_kernel-1.3/drivers/infiniband/hw/ehca/ehca_qp.c rhel4u5_ofa_kerne + if (ret) { + ehca_err(pd->device, + "Could not mmap rqueue pages"); -+ goto create_qp_exit5; ++ goto create_qp_exit7; + } + } + /* fw_handle */ @@ -304,33 +300,33 @@ diff -Nurp ofa_kernel-1.3/drivers/infiniband/hw/ehca/ehca_qp.c rhel4u5_ofa_kerne + &vma); + if (ret) { + ehca_err(pd->device, "Could not mmap fw_handle"); -+ goto create_qp_exit6; ++ goto create_qp_exit8; + } + my_qp->uspace_fwh = (u64)resp.galpas.kernel.fw_handle; if (ib_copy_to_udata(udata, &resp, sizeof resp)) { ehca_err(pd->device, "Copy to udata failed"); ret = -EINVAL; -- goto create_qp_exit4; -+ goto create_qp_exit7; +- goto create_qp_exit6; ++ goto create_qp_exit9; } } return my_qp; -+create_qp_exit7: ++create_qp_exit9: + ehca_munmap(my_qp->uspace_fwh, EHCA_PAGESIZE); + -+create_qp_exit6: ++create_qp_exit8: + ehca_munmap(my_qp->uspace_rqueue, my_qp->ipz_rqueue.queue_length); + -+create_qp_exit5: ++create_qp_exit7: + 
ehca_munmap(my_qp->uspace_squeue, my_qp->ipz_squeue.queue_length); + - create_qp_exit4: - if (HAS_RQ(my_qp)) - ipz_queue_dtor(my_pd, &my_qp->ipz_rqueue); -@@ -1124,7 +1161,7 @@ static int internal_modify_qp(struct ib_ + create_qp_exit6: + ehca_cq_unassign_qp(my_qp->send_cq, my_qp->real_qp_num); + +@@ -1157,7 +1195,7 @@ static int internal_modify_qp(struct ib_ my_qp->qp_type == IB_QPT_SMI) && statetrans == IB_QPST_SQE2RTS) { /* mark next free wqe if kernel */ @@ -339,7 +335,7 @@ diff -Nurp ofa_kernel-1.3/drivers/infiniband/hw/ehca/ehca_qp.c rhel4u5_ofa_kerne struct ehca_wqe *wqe; /* lock send queue */ spin_lock_irqsave(&my_qp->spinlock_s, flags); -@@ -1777,18 +1814,11 @@ static int internal_destroy_qp(struct ib +@@ -1927,18 +1965,11 @@ static int internal_destroy_qp(struct ib enum ib_qp_type qp_type; unsigned long flags; @@ -363,7 +359,7 @@ diff -Nurp ofa_kernel-1.3/drivers/infiniband/hw/ehca/ehca_qp.c rhel4u5_ofa_kerne } if (my_qp->send_cq) { -@@ -1805,6 +1835,24 @@ static int internal_destroy_qp(struct ib +@@ -1955,6 +1986,24 @@ static int internal_destroy_qp(struct ib idr_remove(&ehca_qp_idr, my_qp->token); write_unlock_irqrestore(&ehca_qp_idr_lock, flags); @@ -388,9 +384,9 @@ diff -Nurp ofa_kernel-1.3/drivers/infiniband/hw/ehca/ehca_qp.c rhel4u5_ofa_kerne h_ret = hipz_h_destroy_qp(shca->ipz_hca_handle, my_qp); if (h_ret != H_SUCCESS) { ehca_err(dev, "hipz_h_destroy_qp() failed h_ret=%li " -diff -Nurp ofa_kernel-1.3/drivers/infiniband/hw/ehca/ehca_uverbs.c rhel4u5_ofa_kernel-1.3/drivers/infiniband/hw/ehca/ehca_uverbs.c ---- ofa_kernel-1.3/drivers/infiniband/hw/ehca/ehca_uverbs.c 2007-09-26 09:45:25.000000000 -0700 -+++ rhel4u5_ofa_kernel-1.3/drivers/infiniband/hw/ehca/ehca_uverbs.c 2007-09-26 06:42:19.000000000 -0700 +diff -Nurp a/drivers/infiniband/hw/ehca/ehca_uverbs.c b/drivers/infiniband/hw/ehca/ehca_uverbs.c +--- a/drivers/infiniband/hw/ehca/ehca_uverbs.c 2008-01-23 15:30:42.000000000 +0100 ++++ b/drivers/infiniband/hw/ehca/ehca_uverbs.c 2008-01-23 
15:38:37.000000000 +0100 @@ -68,184 +68,104 @@ int ehca_dealloc_ucontext(struct ib_ucon return 0; } -- 1.5.2 From sashak at voltaire.com Wed Jan 23 10:37:12 2008 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Wed, 23 Jan 2008 18:37:12 +0000 Subject: [ofa-general] [PATCH] infiniband-diags/saquery: code consolidation Message-ID: <20080123183712.GC11277@sashak.voltaire.com> Some code consolidation and cleanup. Signed-off-by: Sasha Khapyorsky --- infiniband-diags/src/saquery.c | 340 +++++++++++++++------------------------- 1 files changed, 130 insertions(+), 210 deletions(-) diff --git a/infiniband-diags/src/saquery.c b/infiniband-diags/src/saquery.c index d16e604..1b9e2d3 100644 --- a/infiniband-diags/src/saquery.c +++ b/infiniband-diags/src/saquery.c @@ -189,9 +189,9 @@ print_node_record(ib_node_record_t *node_record) ); } -static void -print_path_record(ib_path_rec_t *p_pr) +static void dump_path_record(void *data) { + ib_path_rec_t *p_pr = data; printf("PathRecord dump:\n" "\t\tservice_id..............0x%016" PRIx64 "\n" "\t\tdgid....................0x%016" PRIx64 " : " @@ -257,11 +257,11 @@ sprint_gid(ib_gid_t *gid, char *str, size_t len) return (str); } -static void -print_class_port_info(ib_class_port_info_t *class_port_info) +static void dump_class_port_info(void *data) { size_t GID_STR_LEN = 256; char gid_str[GID_STR_LEN]; + ib_class_port_info_t *class_port_info = data; printf("SA ClassPortInfo:\n" "\t\tBase version.............%d\n" @@ -302,9 +302,9 @@ print_class_port_info(ib_class_port_info_t *class_port_info) ); } -static void -print_portinfo_record(ib_portinfo_record_t *p_pir) +static void dump_portinfo_record(void *data) { + ib_portinfo_record_t *p_pir = data; const ib_port_info_t * const p_pi = &p_pir->port_info; printf("PortInfoRecord dump:\n" @@ -390,11 +390,11 @@ print_multicast_member_record(ib_member_rec_t *p_mcmr) } } -static void -print_service_record(ib_service_record_t *p_sr) +static void dump_service_record(void *data) { char 
buf_service_key[35]; char buf_service_name[65]; + ib_service_record_t *p_sr = data; sprintf(buf_service_key, "0x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x", @@ -488,9 +488,9 @@ print_service_record(ib_service_record_t *p_sr) ); } -static void -print_inform_info_record(ib_inform_info_record_t *p_iir) +static void dump_inform_info_record(void *data) { + ib_inform_info_record_t *p_iir = data; uint32_t qpn; uint8_t resp_time_val; @@ -565,8 +565,9 @@ print_inform_info_record(ib_inform_info_record_t *p_iir) } } -static void dump_one_link_record(ib_link_record_t *lr) +static void dump_one_link_record(void *data) { + ib_link_record_t *lr = data; printf("LinkRecord dump:\n" "\t\tFromLID....................%u\n" "\t\tFromPort...................%u\n" @@ -576,8 +577,9 @@ static void dump_one_link_record(ib_link_record_t *lr) lr->to_port_num, cl_ntoh16(lr->to_lid)); } -static void dump_one_slvl_record(ib_slvl_table_record_t *slvl) +static void dump_one_slvl_record(void *data) { + ib_slvl_table_record_t *slvl = data; ib_slvl_table_t *t = &slvl->slvl_tbl; printf("SL2VLTableRecord dump:\n" "\t\tLID........................%u\n" @@ -597,8 +599,9 @@ static void dump_one_slvl_record(ib_slvl_table_record_t *slvl) ib_slvl_table_get(t, 14), ib_slvl_table_get(t, 15)); } -static void dump_one_vlarb_record(ib_vl_arb_table_record_t *vlarb) +static void dump_one_vlarb_record(void *data) { + ib_vl_arb_table_record_t *vlarb = data; ib_vl_arb_element_t *e = vlarb->vl_arb_tbl.vl_entry; int i; printf("VLArbTableRecord dump:\n" @@ -625,8 +628,9 @@ static void dump_one_vlarb_record(ib_vl_arb_table_record_t *vlarb) } } -static void dump_one_pkey_tbl_record(ib_pkey_table_record_t *pktr) +static void dump_one_pkey_tbl_record(void *data) { + ib_pkey_table_record_t *pktr = data; ib_net16_t *p = pktr->pkey_tbl.pkey_entry; int i; printf("PKeyTableRecord dump:\n" @@ -645,6 +649,15 @@ static void dump_one_pkey_tbl_record(ib_pkey_table_record_t *pktr) printf("\n"); } +static void 
dump_results(osmv_query_res_t *r, void (*dump_func)(void *)) +{ + int i; + for (i = 0; i < r->result_cnt; i++) { + void *data = osmv_get_query_svc_rec(r->p_result_madw, i); + dump_func(data); + } +} + static void return_mad(void) { @@ -815,121 +828,6 @@ get_issm_records(osm_bind_handle_t bind_handle, ib_net32_t capability_mask) 0); } -/* - * Get the LinkRecord(s) - */ -static ib_api_status_t get_link_records(osm_bind_handle_t bind_handle, - int from_lid, int from_port, - int to_lid, int to_port) -{ - ib_link_record_t lr; - ib_net64_t comp_mask; - - memset(&lr, 0, sizeof(lr)); - comp_mask = 0; - - if (from_lid > 0) { - lr.from_lid = cl_hton16(from_lid); - comp_mask |= IB_LR_COMPMASK_FROM_LID; - } - if (from_port >= 0) { - lr.from_port_num = from_port; - comp_mask |= IB_LR_COMPMASK_FROM_PORT; - } - if (to_lid > 0) { - lr.to_lid = cl_hton16(to_lid); - comp_mask |= IB_LR_COMPMASK_TO_LID; - } - if (to_port >= 0) { - lr.to_port_num = to_port; - comp_mask |= IB_LR_COMPMASK_TO_PORT; - } - - return get_any_records(bind_handle, IB_MAD_ATTR_LINK_RECORD, 0, - comp_mask, &lr, - ib_get_attr_offset(sizeof(ib_link_record_t)), 0); -} - -static ib_api_status_t get_slvl_records(osm_bind_handle_t bind_handle, - int lid, int in_port, int out_port) -{ - ib_slvl_table_record_t slvl; - ib_net64_t comp_mask; - - memset(&slvl, 0, sizeof(slvl)); - comp_mask = 0; - - if (lid > 0) { - slvl.lid = cl_hton16(lid); - comp_mask |= IB_SLVL_COMPMASK_LID; - } - if (in_port >= 0) { - slvl.in_port_num = in_port; - comp_mask |= IB_SLVL_COMPMASK_IN_PORT; - } - if (out_port >= 0) { - slvl.out_port_num = out_port; - comp_mask |= IB_SLVL_COMPMASK_OUT_PORT; - } - - return get_any_records(bind_handle, IB_MAD_ATTR_SLVL_RECORD, 0, - comp_mask, &slvl, - ib_get_attr_offset(sizeof(ib_slvl_table_record_t)), 0); -} - -static ib_api_status_t get_vlarb_records(osm_bind_handle_t bind_handle, - int lid, int port, int block) -{ - ib_vl_arb_table_record_t vlarb; - ib_net64_t comp_mask = 0; - - memset(&vlarb, 0, 
sizeof(vlarb)); - - if (lid > 0) { - vlarb.lid = cl_hton16(lid); - comp_mask |= IB_VLA_COMPMASK_LID; - } - if (port >= 0) { - vlarb.port_num = port; - comp_mask |= IB_VLA_COMPMASK_OUT_PORT; - } - if (block >= 0) { - vlarb.block_num = block; - comp_mask |= IB_VLA_COMPMASK_BLOCK; - } - - return get_any_records(bind_handle, IB_MAD_ATTR_VLARB_RECORD, 0, - comp_mask, &vlarb, - ib_get_attr_offset(sizeof(ib_vl_arb_table_record_t)), 0); -} - -static ib_api_status_t get_pkey_tbl_records(osm_bind_handle_t bind_handle, - int lid, int port, int block) -{ - ib_pkey_table_record_t pktr; - ib_net64_t comp_mask = 0; - - memset(&pktr, 0, sizeof(pktr)); - - if (lid > 0) { - pktr.lid = cl_hton16(lid); - comp_mask |= IB_PKEY_COMPMASK_LID; - } - if (port >= 0) { - pktr.port_num = port; - comp_mask |= IB_PKEY_COMPMASK_PORT; - } - if (block >= 0) { - pktr.block_num = block; - comp_mask |= IB_PKEY_COMPMASK_BLOCK; - } - - return get_any_records(bind_handle, IB_MAD_ATTR_PKEY_TBL_RECORD, 0, - comp_mask, &pktr, - ib_get_attr_offset(sizeof(pktr)), - OSM_DEFAULT_SM_KEY); -} - static ib_api_status_t print_node_records(osm_bind_handle_t bind_handle) { @@ -982,8 +880,6 @@ get_print_path_rec_lid(osm_bind_handle_t bind_handle, ib_net16_t src_lid, ib_net16_t dst_lid) { - int i = 0; - ib_path_rec_t *path_record = NULL; osmv_query_req_t req; osmv_lid_pair_t lid_pair; ib_api_status_t status; @@ -1013,10 +909,7 @@ get_print_path_rec_lid(osm_bind_handle_t bind_handle, return (result.status); } status = result.status; - for (i = 0; i < result.result_cnt; i++) { - path_record = osmv_get_query_path_rec(result.p_result_madw, i); - print_path_record(path_record); - } + dump_results(&result, dump_path_record); return_mad(); return (status); } @@ -1026,8 +919,6 @@ get_print_path_rec_gid(osm_bind_handle_t bind_handle, const ib_gid_t *src_gid, const ib_gid_t *dst_gid) { - int i = 0; - ib_path_rec_t *path_record = NULL; osmv_query_req_t req; osmv_gid_pair_t gid_pair; ib_api_status_t status; @@ -1057,10 +948,7 @@ 
get_print_path_rec_gid(osm_bind_handle_t bind_handle, return (result.status); } status = result.status; - for (i = 0; i < result.result_cnt; i++) { - path_record = osmv_get_query_path_rec(result.p_result_madw, i); - print_path_record(path_record); - } + dump_results(&result, dump_path_record); return_mad(); return (status); } @@ -1068,8 +956,6 @@ get_print_path_rec_gid(osm_bind_handle_t bind_handle, static ib_api_status_t get_print_class_port_info(osm_bind_handle_t bind_handle) { - int i = 0; - ib_class_port_info_t *class_port_info = NULL; osmv_query_req_t req; ib_api_status_t status; @@ -1095,10 +981,7 @@ get_print_class_port_info(osm_bind_handle_t bind_handle) return (result.status); } status = result.status; - for (i = 0; i < result.result_cnt; i++) { - class_port_info = (ib_class_port_info_t*)osmv_get_query_result(result.p_result_madw, i); - print_class_port_info(class_port_info); - } + dump_results(&result, dump_class_port_info); return_mad(); return (status); } @@ -1106,19 +989,14 @@ get_print_class_port_info(osm_bind_handle_t bind_handle) static ib_api_status_t print_path_records(osm_bind_handle_t bind_handle) { - int i = 0; - ib_path_rec_t *path_record = NULL; - ib_net16_t attr_offset = ib_get_attr_offset(sizeof(*path_record)); - ib_api_status_t status; + ib_net16_t attr_offset = ib_get_attr_offset(sizeof(ib_path_rec_t)); + ib_api_status_t status; status = get_all_records(bind_handle, IB_MAD_ATTR_PATH_RECORD, attr_offset, 0); if (status != IB_SUCCESS) return (status); - for (i = 0; i < result.result_cnt; i++) { - path_record = osmv_get_query_path_rec(result.p_result_madw, i); - print_path_record(path_record); - } + dump_results(&result, dump_path_record); return_mad(); return (status); } @@ -1126,8 +1004,6 @@ print_path_records(osm_bind_handle_t bind_handle) static ib_api_status_t print_portinfo_records(osm_bind_handle_t bind_handle) { - int i = 0; - ib_portinfo_record_t *portinfo_record = NULL; ib_api_status_t status; /* First, get IsSM records */ @@ 
-1136,10 +1012,7 @@ print_portinfo_records(osm_bind_handle_t bind_handle) return (status); printf("IsSM ports\n"); - for (i = 0; i < result.result_cnt; i++) { - portinfo_record = osmv_get_query_portinfo_rec(result.p_result_madw, i); - print_portinfo_record(portinfo_record); - } + dump_results(&result, dump_portinfo_record); return_mad(); /* Now, get IsSMdisabled records */ @@ -1148,10 +1021,7 @@ print_portinfo_records(osm_bind_handle_t bind_handle) return (status); printf("\nIsSMdisabled ports\n"); - for (i = 0; i < result.result_cnt; i++) { - portinfo_record = osmv_get_query_portinfo_rec(result.p_result_madw, i); - print_portinfo_record(portinfo_record); - } + dump_results(&result, dump_portinfo_record); return_mad(); return (status); @@ -1199,19 +1069,14 @@ return_mc: static ib_api_status_t print_service_records(osm_bind_handle_t bind_handle) { - int i = 0; - ib_service_record_t *service_record = NULL; - ib_net16_t attr_offset = ib_get_attr_offset(sizeof(*service_record)); - ib_api_status_t status; + ib_net16_t attr_offset = ib_get_attr_offset(sizeof(ib_service_record_t)); + ib_api_status_t status; status = get_all_records(bind_handle, IB_MAD_ATTR_SERVICE_RECORD, attr_offset, 0); if (status != IB_SUCCESS) return (status); - for (i = 0; i < result.result_cnt; i++) { - service_record = osmv_get_query_svc_rec(result.p_result_madw, i); - print_service_record(service_record); - } + dump_results(&result, dump_service_record); return_mad(); return (status); } @@ -1219,19 +1084,14 @@ print_service_records(osm_bind_handle_t bind_handle) static ib_api_status_t print_inform_info_records(osm_bind_handle_t bind_handle) { - int i = 0; - ib_inform_info_record_t *inform_info_record = NULL; - ib_net16_t attr_offset = ib_get_attr_offset(sizeof(*inform_info_record)); - ib_api_status_t status; + ib_net16_t attr_offset = ib_get_attr_offset(sizeof(ib_inform_info_record_t)); + ib_api_status_t status; status = get_all_records(bind_handle, IB_MAD_ATTR_INFORM_INFO_RECORD, attr_offset, 0); 
if (status != IB_SUCCESS) return (status); - for (i = 0; i < result.result_cnt; i++) { - inform_info_record = osmv_get_query_inform_info_rec(result.p_result_madw, i); - print_inform_info_record(inform_info_record); - } + dump_results(&result, dump_inform_info_record); return_mad(); return (status); } @@ -1239,8 +1099,8 @@ print_inform_info_records(osm_bind_handle_t bind_handle) static ib_api_status_t print_link_records(osm_bind_handle_t bind_handle, int argc, char *argv[]) { - int i; - ib_link_record_t *lr; + ib_link_record_t lr; + ib_net64_t comp_mask = 0; int from_lid = 0, to_lid = 0, from_port = -1, to_port = -1; ib_api_status_t status; @@ -1252,15 +1112,32 @@ print_link_records(osm_bind_handle_t bind_handle, int argc, char *argv[]) parse_lid_and_ports(bind_handle, argv[1], &to_lid, &to_port, NULL); - status = get_link_records(bind_handle, from_lid, from_port, - to_lid, to_port); + memset(&lr, 0, sizeof(lr)); + + if (from_lid > 0) { + lr.from_lid = cl_hton16(from_lid); + comp_mask |= IB_LR_COMPMASK_FROM_LID; + } + if (from_port >= 0) { + lr.from_port_num = from_port; + comp_mask |= IB_LR_COMPMASK_FROM_PORT; + } + if (to_lid > 0) { + lr.to_lid = cl_hton16(to_lid); + comp_mask |= IB_LR_COMPMASK_TO_LID; + } + if (to_port >= 0) { + lr.to_port_num = to_port; + comp_mask |= IB_LR_COMPMASK_TO_PORT; + } + + status = get_any_records(bind_handle, IB_MAD_ATTR_LINK_RECORD, 0, + comp_mask, &lr, + ib_get_attr_offset(sizeof(lr)), 0); if (status != IB_SUCCESS) return status; - for (i = 0; i < result.result_cnt; i++) { - lr = osmv_get_query_result(result.p_result_madw, i); - dump_one_link_record(lr); - } + dump_results(&result, dump_one_link_record); return_mad(); return status; } @@ -1269,8 +1146,8 @@ static int print_sl2vl_records(const struct query_cmd *q, osm_bind_handle_t bind_handle, int argc, char *argv[]) { - int i; - ib_slvl_table_record_t *slvl; + ib_slvl_table_record_t slvl; + ib_net64_t comp_mask = 0; int lid = 0, in_port = -1, out_port = -1; ib_api_status_t status; 
@@ -1278,14 +1155,28 @@ print_sl2vl_records(const struct query_cmd *q, osm_bind_handle_t bind_handle, parse_lid_and_ports(bind_handle, argv[0], &lid, &in_port, &out_port); - status = get_slvl_records(bind_handle, lid, in_port, out_port); + memset(&slvl, 0, sizeof(slvl)); + + if (lid > 0) { + slvl.lid = cl_hton16(lid); + comp_mask |= IB_SLVL_COMPMASK_LID; + } + if (in_port >= 0) { + slvl.in_port_num = in_port; + comp_mask |= IB_SLVL_COMPMASK_IN_PORT; + } + if (out_port >= 0) { + slvl.out_port_num = out_port; + comp_mask |= IB_SLVL_COMPMASK_OUT_PORT; + } + + status = get_any_records(bind_handle, IB_MAD_ATTR_SLVL_RECORD, 0, + comp_mask, &slvl, + ib_get_attr_offset(sizeof(slvl)), 0); if (status != IB_SUCCESS) return status; - for (i = 0; i < result.result_cnt; i++) { - slvl = osmv_get_query_result(result.p_result_madw, i); - dump_one_slvl_record(slvl); - } + dump_results(&result, dump_one_slvl_record); return_mad(); return status; } @@ -1294,8 +1185,8 @@ static int print_vlarb_records(const struct query_cmd *q, osm_bind_handle_t bind_handle, int argc, char *argv[]) { - int i; - ib_vl_arb_table_record_t *vlarb; + ib_vl_arb_table_record_t vlarb; + ib_net64_t comp_mask = 0; int lid = 0, port = -1, block = -1; ib_api_status_t status; @@ -1303,14 +1194,28 @@ print_vlarb_records(const struct query_cmd *q, osm_bind_handle_t bind_handle, parse_lid_and_ports(bind_handle, argv[0], &lid, &port, &block); - status = get_vlarb_records(bind_handle, lid, port, block); + memset(&vlarb, 0, sizeof(vlarb)); + + if (lid > 0) { + vlarb.lid = cl_hton16(lid); + comp_mask |= IB_VLA_COMPMASK_LID; + } + if (port >= 0) { + vlarb.port_num = port; + comp_mask |= IB_VLA_COMPMASK_OUT_PORT; + } + if (block >= 0) { + vlarb.block_num = block; + comp_mask |= IB_VLA_COMPMASK_BLOCK; + } + + status = get_any_records(bind_handle, IB_MAD_ATTR_VLARB_RECORD, 0, + comp_mask, &vlarb, + ib_get_attr_offset(sizeof(vlarb)), 0); if (status != IB_SUCCESS) return status; - for (i = 0; i < result.result_cnt; i++) { - 
vlarb = osmv_get_query_result(result.p_result_madw, i); - dump_one_vlarb_record(vlarb); - } + dump_results(&result, dump_one_vlarb_record); return_mad(); return status; } @@ -1319,8 +1224,8 @@ static int print_pkey_tbl_records(const struct query_cmd *q, osm_bind_handle_t bind_handle, int argc, char *argv[]) { - int i; - ib_pkey_table_record_t *pktr; + ib_pkey_table_record_t pktr; + ib_net64_t comp_mask = 0; int lid = 0, port = -1, block = -1; ib_api_status_t status; @@ -1328,14 +1233,29 @@ print_pkey_tbl_records(const struct query_cmd *q, osm_bind_handle_t bind_handle, parse_lid_and_ports(bind_handle, argv[0], &lid, &port, &block); - status = get_pkey_tbl_records(bind_handle, lid, port, block); + memset(&pktr, 0, sizeof(pktr)); + + if (lid > 0) { + pktr.lid = cl_hton16(lid); + comp_mask |= IB_PKEY_COMPMASK_LID; + } + if (port >= 0) { + pktr.port_num = port; + comp_mask |= IB_PKEY_COMPMASK_PORT; + } + if (block >= 0) { + pktr.block_num = block; + comp_mask |= IB_PKEY_COMPMASK_BLOCK; + } + + status = get_any_records(bind_handle, IB_MAD_ATTR_PKEY_TBL_RECORD, 0, + comp_mask, &pktr, + ib_get_attr_offset(sizeof(pktr)), + OSM_DEFAULT_SM_KEY); if (status != IB_SUCCESS) return status; - for (i = 0; i < result.result_cnt; i++) { - pktr = osmv_get_query_result(result.p_result_madw, i); - dump_one_pkey_tbl_record(pktr); - } + dump_results(&result, dump_one_pkey_tbl_record); return_mad(); return status; } -- 1.5.4.rc2.60.gb2e62 From dzieko at wcss.pl Wed Jan 23 11:47:28 2008 From: dzieko at wcss.pl (Pawel Dziekonski) Date: Wed, 23 Jan 2008 20:47:28 +0100 Subject: [ofa-general] Status of NFS-RDMA ? Message-ID: <20080123194728.GA10437@cefeid.wcss.wroc.pl> hi, I'm deploying a new cluster with infiniband and I would like to use NFS-RDMA over IB. I'm asking about it here because it is hardly possible to find up to date info about NFS-RDMA. Page http://nfs-rdma.sourceforge.net/Documents/README points to Tom Tucker's linux kernel git tree but provided git link is dead. 
My hardware will be Mellanox HBAa and Flextronics switch. I already know that NFS-RDMA client is in official kernel. What about server? Should I use OFED 1.2 or try 1.3? Should I use infiniband drivers from kernel or OFED? I'm ready for a big challenge and lots of testing. This would result in feedback to OFED/NFS-RDMA development. thanks in advance, Pawel -- Pawel Dziekonski Wroclaw Centre for Networking & Supercomputing, HPC Department Politechnika Wr., pl. Grunwaldzki 9, bud. D2/101, 50-377 Wroclaw, POLAND phone: +48 71 3202043, fax: +48 71 3225797, http://www.wcss.wroc.pl From jlentini at netapp.com Wed Jan 23 12:19:21 2008 From: jlentini at netapp.com (James Lentini) Date: Wed, 23 Jan 2008 15:19:21 -0500 (EST) Subject: [ofa-general] Status of NFS-RDMA ? In-Reply-To: <20080123194728.GA10437@cefeid.wcss.wroc.pl> References: <20080123194728.GA10437@cefeid.wcss.wroc.pl> Message-ID: On Wed, 23 Jan 2008, Pawel Dziekonski wrote: > hi, > > I'm deploying a new cluster with infiniband and I would like to use > NFS-RDMA over IB. I'm asking about it here because it is hardly > possible to find up to date info about NFS-RDMA. Page > http://nfs-rdma.sourceforge.net/Documents/README points to Tom > Tucker's linux kernel git tree but provided git link is dead. The name of Tom's tree changed last Wednesday. I'll update the docs with the new address. The new URL is: git://git.linux-nfs.org/projects/tomtucker/xprt-switch-2.6.git > My hardware will be Mellanox HBAa and Flextronics switch. I already > know that NFS-RDMA client is in official kernel. What about server? > Should I use OFED 1.2 or try 1.3? Should I use infiniband drivers from > kernel or OFED? For now, you should get the NFS/RDMA server from Tom Tucker's git tree. We expect it to be merged with mainline linux in 2.6.25. > I'm ready for a big challenge and lots of testing. This would result > in feedback to OFED/NFS-RDMA development. Great! 
> thanks in advance, Pawel > -- > Pawel Dziekonski > Wroclaw Centre for Networking & Supercomputing, HPC Department > Politechnika Wr., pl. Grunwaldzki 9, bud. D2/101, 50-377 Wroclaw, POLAND > phone: +48 71 3202043, fax: +48 71 3225797, http://www.wcss.wroc.pl From weiny2 at llnl.gov Wed Jan 23 12:40:28 2008 From: weiny2 at llnl.gov (Ira Weiny) Date: Wed, 23 Jan 2008 12:40:28 -0800 Subject: [ofa-general] [PATCH] opensm/opensm/osm_subnet.c: add a comment of valid "force_link_speed" values Message-ID: <20080123124028.75708ab0.weiny2@llnl.gov> From 020618d66bdcecba6f49bc7f48ae40485d657437 Mon Sep 17 00:00:00 2001 From: Ira K. Weiny Date: Wed, 23 Jan 2008 12:39:05 -0800 Subject: [PATCH] opensm/opensm/osm_subnet.c: add a comment of valid "force_link_speed" values Signed-off-by: Ira K. 
Weiny --- opensm/opensm/osm_subnet.c | 13 +++++++++++-- 1 files changed, 11 insertions(+), 2 deletions(-) diff --git a/opensm/opensm/osm_subnet.c b/opensm/opensm/osm_subnet.c index c9b4d57..7b14e0c 100644 --- a/opensm/opensm/osm_subnet.c +++ b/opensm/opensm/osm_subnet.c @@ -1469,10 +1469,19 @@ ib_api_status_t osm_subn_write_conf_file(IN osm_subn_opt_t * const p_opts) "leaf_head_of_queue_lifetime 0x%02x\n\n" "# Limit the maximal operational VLs\n" "max_op_vls %u\n\n" - "# Force link speed enable on switch links\n" + "# Force PortInfo:LinkSpeedEnabled on switch ports\n" "# If 0, don't modify PortInfo:LinkSpeedEnabled on switch port\n" "# Otherwise, use value for PortInfo:LinkSpeedEnabled on switch port\n" - "# Default is 15 (to set to PortInfo:LinkSpeedSupported)\n\n" + "# Values are (IB Spec 1.2, 14.2.5.6 Table 145 \"PortInfo\")\n" + "# 1: 2.5 Gbps\n" + "# 2: 5.0 Gbps\n" + "# 3: 2.5 or 5.0 Gbps\n" + "# 4: 10.0 Gbps\n" + "# 5: 2.5 or 10.0 Gbps\n" + "# 6: 5.0 or 10.0 Gbps\n" + "# 7: 2.5 or 5.0 or 10.0 Gbps\n" + "# 8-14: Reserved\n" + "# Default 15: set to PortInfo:LinkSpeedSupported\n\n" "force_link_speed %u\n\n" "# The subnet_timeout code that will be set for all the ports\n" "# The actual timeout is 4.096usec * 2^\n" -- 1.5.1 From kononov at dls.net Wed Jan 23 12:43:41 2008 From: kononov at dls.net (Roman Kononov) Date: Wed, 23 Jan 2008 14:43:41 -0600 Subject: [ofa-general] Bogus Receive Completions In-Reply-To: <1200988162.6925.170.camel@mtls03> References: <475607AA.301@dls.net> <478B8172.2010104@mellanox.co.il> <1200988162.6925.170.camel@mtls03> Message-ID: <4797A6FD.2080708@dls.net> On 2008-01-22 01:49, Eli Cohen wrote: > > I am sending two patches, one for userspace and one for kernel space > which solves this issue. 
> Thanks for the patches: http://lists.openfabrics.org/pipermail/general/2008-January/045259.html http://lists.openfabrics.org/pipermail/general/2008-January/045260.html They "fix" the test program I sent to the list earlier. It ran for many hours. Unfortunately, they did not fix my convoluted software. I applied the user space patch to libmthca-1.0.4 from OFED-1.2.5.4, and the kernel space patch to the 2.6.23.14 kernel. The user space patch did not want to apply one of the hunks (the one containing '- wbm();') to srq.c because the code being patched did not have the 'wbm();' line. This forced me to remove the '- wbm()' line from the patch file. Then I observed these errors, each occurred twice so far: - A send completion is out of order. It has a "future" wr_id value. - A receive completion has a "future" imm_data value. It looks exactly like if the sending side dropped a few IBV_WR_RDMA_WRITE_WITH_IMM requests. Or the sender sent them later (but my software does not known about them because it stops on the first error). Is it possible that with IBV_QPT_RC queues, the IBV_WR_RDMA_WRITE_WITH_IMM requests are completed out of order on either sending or receiving side? Roman From tziporet at dev.mellanox.co.il Wed Jan 23 13:03:33 2008 From: tziporet at dev.mellanox.co.il (Tziporet Koren) Date: Wed, 23 Jan 2008 23:03:33 +0200 Subject: [ofa-general] Resizing SRQ In-Reply-To: <20080123151114.GF7336@minantech.com> References: <20080123142706.GD7336@minantech.com> <479755C3.3040205@dev.mellanox.co.il> <20080123151114.GF7336@minantech.com> Message-ID: <4797ABA5.4080903@mellanox.co.il> Gleb Natapov wrote: >> Resize SRQ is not supported by the mlx4 low level driver. >> >> > Well I guess it is not implemented for mthca too then since libmthca > function looks the same. Is there any plans to implement it? 
> > > We can do it for OFED 1.4 Tziporet From tziporet at dev.mellanox.co.il Wed Jan 23 13:10:49 2008 From: tziporet at dev.mellanox.co.il (Tziporet Koren) Date: Wed, 23 Jan 2008 23:10:49 +0200 Subject: [ofa-general] Status of NFS-RDMA ? In-Reply-To: References: <20080123194728.GA10437@cefeid.wcss.wroc.pl> Message-ID: <4797AD59.2000206@mellanox.co.il> James Lentini wrote: > On Wed, 23 Jan 2008, Pawel Dziekonski wrote: > > >> hi, >> >> I'm deploying a new cluster with infiniband and I would like to use >> NFS-RDMA over IB. I'm asking about it here because it is hardly >> possible to find up to date info about NFS-RDMA. Page >> http://nfs-rdma.sourceforge.net/Documents/README points to Tom >> Tucker's linux kernel git tree but provided git link is dead. >> > > The name of Tom's tree changed last Wednesday. I'll update the docs > with the new address. > > The new URL is: > > git://git.linux-nfs.org/projects/tomtucker/xprt-switch-2.6.git > > >> My hardware will be Mellanox HBAa and Flextronics switch. I already >> know that NFS-RDMA client is in official kernel. What about server? >> Should I use OFED 1.2 or try 1.3? Should I use infiniband drivers from >> kernel or OFED? 
>> > > Note that NFS-RDMA is not part of OFED now (it will be only in 1.4); however, Tom prepared backport patches that enable it to work on distros with OFED 1.2. You can look at http://www.mellanox.com/products/nfs_rdma_sdk.php for more info on this. Tziporet From rdreier at cisco.com Wed Jan 23 13:14:53 2008 From: rdreier at cisco.com (Roland Dreier) Date: Wed, 23 Jan 2008 13:14:53 -0800 Subject: [ofa-general] Re: modify QP from SQE -> RTS (after query QP) fails without IBV_QP_CUR_STATE In-Reply-To: <47973EDC.5030504@dev.mellanox.co.il> (Dotan Barak's message of "Wed, 23 Jan 2008 15:19:24 +0200") References: <47973EDC.5030504@dev.mellanox.co.il> Message-ID: > When I tried to recover the QP and modify its state to RTS I had to > use the flag IBV_QP_CUR_STATE > because the internal QP structure assumed that the QP state is RTS > (although query QP noticed otherwise). > > I created and tested 2 patches for both mthca and mlx4 drivers to > update the internal QP state when a > successful query QP was executed. > > Will you accept those patches? Sounds fine, although I think we need to take a little care about how we update the QP state (locking) now that there are two ways to do it. And obviously it's hard to make a definitive statement about patches I've never seen... From rdreier at cisco.com Wed Jan 23 13:23:34 2008 From: rdreier at cisco.com (Roland Dreier) Date: Wed, 23 Jan 2008 13:23:34 -0800 Subject: [ofa-general] Re: InfiniBand/RDMA merge plans for 2.6.25 In-Reply-To: <20080123055014.GA9256@infradead.org> (Christoph Hellwig's message of "Wed, 23 Jan 2008 05:50:14 +0000") References: <20080121100151.GD5333@infradead.org> <20080123055014.GA9256@infradead.org> Message-ID: > > be improved (sparse endianness annotation, > that's a blocker for sure. No new code that's not sparse clean, please. I have to disagree -- remember how strongly Linus pushed to merge hardware drivers early? 
The code in question will not run unless you have the hardware it drives, and that hardware is useless without the code. And *something* is better for users than nothing. Anyway I don't think the endianness annotations are that hard so it will probably show up soon. - R. From eli at dev.mellanox.co.il Wed Jan 23 13:54:52 2008 From: eli at dev.mellanox.co.il (Eli Cohen) Date: Wed, 23 Jan 2008 23:54:52 +0200 Subject: [ofa-general] [RFC] IPoIB UD 4K MTU support In-Reply-To: <1201068498.756.59.camel@localhost.localdomain> References: <1201025321.756.33.camel@localhost.localdomain> <1201102346.6925.215.camel@mtls03> <1201068498.756.59.camel@localhost.localdomain> Message-ID: <4e6a6b3c0801231354v24baa6d8q5034a1cdaf3c704f@mail.gmail.com> On 1/23/08, Shirley Ma wrote: > > Hello Eli, > > On Wed, 2008-01-23 at 17:32 +0200, Eli Cohen wrote: > > can you send a path to the git tree this patch is based on? > > I used OFED-1.3 RC1 tree + SG patch from Pradeep. I can recreate the > patch per your request with some changes to address Dotan's concern. > Which git tree would you like me to build on? > > Roland's for-2.6.25 branch is fine. From eli at dev.mellanox.co.il Wed Jan 23 14:12:46 2008 From: eli at dev.mellanox.co.il (Eli Cohen) Date: Thu, 24 Jan 2008 00:12:46 +0200 Subject: [ofa-general] Bogus Receive Completions In-Reply-To: <4797A6FD.2080708@dls.net> References: <475607AA.301@dls.net> <478B8172.2010104@mellanox.co.il> <1200988162.6925.170.camel@mtls03> <4797A6FD.2080708@dls.net> Message-ID: <4e6a6b3c0801231412t7970ce83u9a2bb9aa11a1b2cd@mail.gmail.com> Roman, if your software creates QPs whose receive queue size is not a power of two then you might experience weird problems as the patches I sent have a bug. I am sending a patch to be applied on top of the previous libmthca patch so you can try it (the same fix goes for the kernel code too). Tomorrow I will send the fixed patches again. 
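[Illustration, not part of the original message or patch.] The power-of-two caveat above comes down to how a ring index wraps: masking with (max - 1) equals a modulo only when max is a power of two. A minimal C sketch of that difference follows; the helper names wrap_mask, wrap_mod and count_wrap_mismatches are hypothetical, and max merely stands in for a receive queue size such as qp->rq.max.

```c
#include <assert.h>

/* Next-slot index via bit mask: a correct wrap only if max is a power of two. */
static unsigned wrap_mask(unsigned i, unsigned max)
{
	return (i + 1) & (max - 1);
}

/* Next-slot index via modulo: wraps correctly for any max > 0. */
static unsigned wrap_mod(unsigned i, unsigned max)
{
	return (i + 1) % max;
}

/* Count the slots of a ring of size max where the two expressions disagree. */
static unsigned count_wrap_mismatches(unsigned max)
{
	unsigned i, bad = 0;

	assert(max > 0);
	for (i = 0; i < max; ++i)
		if (wrap_mask(i, max) != wrap_mod(i, max))
			++bad;
	return bad;
}
```

With a ring of 8 entries the two expressions agree on every slot, so count_wrap_mismatches(8) is 0. With 6 entries they do not: for example, slot 1 would link to slot 0 under the mask instead of slot 2, which is the kind of mis-chained receive queue that could plausibly produce completions that look out of order.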
I apologize if the patch is badly formed. src/qp.c | 2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/src/qp.c b/src/qp.c index f3aa6c7..3c5f049 100644 --- a/src/qp.c +++ b/src/qp.c @@ -885,7 +885,7 @@ int mthca_alloc_qp_buf(struct ibv_pd *pd, struct ibv_qp_cap *cap, } else { for (i = 0; i < qp->rq.max; ++i) { next = get_recv_wqe(qp, i); - next->nda_op = htonl((((i + 1) & (qp->rq.max - 1)) << + next->nda_op = htonl((((i + 1) % qp->rq.max) << qp->rq.wqe_shift) | 1); } } On 1/23/08, Roman Kononov wrote: > > On 2008-01-22 01:49, Eli Cohen wrote: > > > > I am sending two patches, one for userspace and one for kernel space > > which solves this issue. > > > > Thanks for the patches: > http://lists.openfabrics.org/pipermail/general/2008-January/045259.html > http://lists.openfabrics.org/pipermail/general/2008-January/045260.html > > They "fix" the test program I sent to the list earlier. It ran for many > hours. Unfortunately, they did not fix my convoluted software. > > I applied the user space patch to libmthca-1.0.4 from OFED-1.2.5.4, and > the > kernel space patch to the 2.6.23.14 kernel. The user space patch did not > want to apply one of the hunks (the one containing '- wbm();') to srq.c > because the code being patched did not have the 'wbm();' line. This forced > me to remove the '- wbm()' line from the patch file. > > Then I observed these errors, each occurred twice so far: > - A send completion is out of order. It has a "future" wr_id value. > - A receive completion has a "future" imm_data value. > > It looks exactly like if the sending side dropped a few > IBV_WR_RDMA_WRITE_WITH_IMM requests. Or the sender sent them later (but my > software does not know about them because it stops on the first error). > > Is it possible that with IBV_QPT_RC queues, the IBV_WR_RDMA_WRITE_WITH_IMM > requests are completed out of order on either sending or receiving side? > > Roman > 
From mashirle at us.ibm.com Wed Jan 23 05:11:10 2008 From: mashirle at us.ibm.com (Shirley Ma) Date: Wed, 23 Jan 2008 05:11:10 -0800 Subject: [ofa-general] [PATCH] fix for an IPoIB compile error In-Reply-To: <4e6a6b3c0801231354v24baa6d8q5034a1cdaf3c704f@mail.gmail.com> References: <1201025321.756.33.camel@localhost.localdomain> <1201102346.6925.215.camel@mtls03> <1201068498.756.59.camel@localhost.localdomain> <4e6a6b3c0801231354v24baa6d8q5034a1cdaf3c704f@mail.gmail.com> Message-ID: <1201093871.9739.5.camel@localhost.localdomain> Hello, Here is a trivial patch to fix the IPoIB compile error below, which occurs when CM is not configured. The patch is against the OFED-1.3 kernel git tree. drivers/infiniband/ulp/ipoib/ipoib_main.c: In function ‘ipoib_change_mtu’: drivers/infiniband/ulp/ipoib/ipoib_main.c:186: error: ‘struct ipoib_dev_priv’ has no member named ‘cm’ thanks Shirley Signed-off-by: Shirley Ma diff -urpN ipoib-patch1/ipoib_main.c ipoib/ipoib_main.c --- ipoib-patch1/ipoib_main.c 2008-01-23 17:51:24.000000000 -0500 +++ ipoib/ipoib_main.c 2008-01-23 17:53:41.000000000 -0500 @@ -181,6 +181,7 @@ static int ipoib_change_mtu(struct net_d { struct ipoib_dev_priv *priv = netdev_priv(dev); +#ifdef CONFIG_INFINIBAND_IPOIB_CM /* dev->mtu > 2K ==> connected mode */ if (ipoib_cm_admin_enabled(dev)) { if (new_mtu > priv->cm.max_cm_mtu) @@ -192,6 +193,7 @@ static int ipoib_change_mtu(struct net_d dev->mtu = new_mtu; return 0; } +#endif if (new_mtu > ipoib_ud_mtu(priv->max_ib_mtu)) return -EINVAL; From gmkurtzer at gmail.com Wed Jan 23 15:18:55 2008 From: gmkurtzer at gmail.com (Greg Kurtzer) Date: Wed, 23 Jan 2008 15:18:55 -0800 Subject: [ofa-general] Opensm compatibility with rate=1 Message-ID: <571f1a060801231518i76c3383l38e3fe9cc32e3cd8@mail.gmail.com> Hello, We recently updated OFED (among other things) on one of our IB test beds that use older cards. 
Something broke recently with an error in dmesg like: kernel: ib0: multicast join failed for ff12:401b:ffff:0000:0000:0000:ffff:ffff, status -22 We used to fix this by defining our partition to be: Default=0x7fff,ipoib,rate=1:ALL=full; But this no longer seems to work. In the opensm source code I see the following: /* following v1 ver1.2 p901 */ #define IB_PATH_RECORD_RATE_2_5_GBS 2 #define IB_PATH_RECORD_RATE_10_GBS 3 #define IB_PATH_RECORD_RATE_30_GBS 4 #define IB_PATH_RECORD_RATE_5_GBS 5 #define IB_PATH_RECORD_RATE_20_GBS 6 #define IB_PATH_RECORD_RATE_40_GBS 7 #define IB_PATH_RECORD_RATE_60_GBS 8 #define IB_PATH_RECORD_RATE_80_GBS 9 #define IB_PATH_RECORD_RATE_120_GBS 10 #define IB_MIN_RATE IB_PATH_RECORD_RATE_2_5_GBS #define IB_MAX_RATE IB_PATH_RECORD_RATE_120_GBS Which forces the lowest possible rate to be 2 which doesn't work with our test bed. By kludging IB_MIN_RATE to be set to 1, things seem to be working but chances are supporting only rates >= 2 was done on purpose. Is there a better workaround or solution to this, or a way of continuing support for rate=1? Thank you. Greg note: I am not signed up to the list directly, so please CC me on any replies. From rdreier at cisco.com Wed Jan 23 15:27:43 2008 From: rdreier at cisco.com (Roland Dreier) Date: Wed, 23 Jan 2008 15:27:43 -0800 Subject: [ofa-general] Re: InfiniBand/RDMA merge plans for 2.6.25 In-Reply-To: (Roland Dreier's message of "Wed, 23 Jan 2008 13:23:34 -0800") References: <20080121100151.GD5333@infradead.org> <20080123055014.GA9256@infradead.org> Message-ID: Rather than arguing over whether we have to have sparse clean code, I decided to annotate the code myself. Here's a patch that fixes most of the sparse warnings in the nes driver. There's still some stuff that actually looks buggy, like the way hte_index stuff is handled. 
You initialize hte_index_mask as: hte_index_mask = ((u32)1 << ((u32temp & 0x001f)+1))-1; nesadapter->hte_index_mask = hte_index_mask; but then compute hte_index stuff with: nesqp->hte_index = cpu_to_be32( crc32c(~0, (void *)&nes_quad, sizeof(nes_quad)) ^ 0xffffffff); and then do: nesqp->hte_index &= nesadapter->hte_index_mask; which seems odd to say the least (hte_index is big-endian, hte_index_mask is cpu-endian). And also, there's code with the loc_addr/rem_addr etc that seem very confused. For example cm_info->loc_addr = htonl(cm_info->loc_addr); cm_info->rem_addr = htonl(cm_info->rem_addr); cm_info->loc_port = htons(cm_info->loc_port); cm_info->rem_port = htons(cm_info->rem_port); which is obviously impossible to annotate correctly, and I couldn't keep track of the endianness stuff elsewhere. Anyway this is what I have in case the promised cleanups don't turn up in time... Signed-off-by: Roland Dreier diff --git a/drivers/infiniband/hw/nes/nes.c b/drivers/infiniband/hw/nes/nes.c index 7a2f596..365ebaa 100644 --- a/drivers/infiniband/hw/nes/nes.c +++ b/drivers/infiniband/hw/nes/nes.c @@ -231,10 +231,10 @@ static int nes_net_event(struct notifier_block *notifier, } else { if (neigh->nud_state & NUD_VALID) { nes_manage_arp_cache(neigh->dev, neigh->ha, - ntohl(*(u32 *)neigh->primary_key), NES_ARP_ADD); + ntohl(*(__be32 *)neigh->primary_key), NES_ARP_ADD); } else { nes_manage_arp_cache(neigh->dev, neigh->ha, - ntohl(*(u32 *)neigh->primary_key), NES_ARP_DELETE); + ntohl(*(__be32 *)neigh->primary_key), NES_ARP_DELETE); } } return NOTIFY_OK; diff --git a/drivers/infiniband/hw/nes/nes.h b/drivers/infiniband/hw/nes/nes.h index 31d3cf5..d50eb49 100644 --- a/drivers/infiniband/hw/nes/nes.h +++ b/drivers/infiniband/hw/nes/nes.h @@ -203,6 +203,7 @@ extern u32 cm_packets_retrans; extern u32 cm_listens_created; extern u32 cm_listens_destroyed; extern u32 cm_backlog_drops; +extern atomic_t cm_loopbacks; extern atomic_t cm_nodes_created; extern atomic_t cm_nodes_destroyed; 
extern atomic_t cm_accel_dropped_pkts; diff --git a/drivers/infiniband/hw/nes/nes_cm.c b/drivers/infiniband/hw/nes/nes_cm.c index d0153e2..4bb5833 100644 --- a/drivers/infiniband/hw/nes/nes_cm.c +++ b/drivers/infiniband/hw/nes/nes_cm.c @@ -187,7 +187,7 @@ static int parse_mpa(struct nes_cm_node *cm_node, u8 *buffer, u32 len) } mpa_frame = (struct ietf_mpa_frame *)buffer; - cm_node->mpa_frame_size = (u32)ntohs(mpa_frame->priv_data_len); + cm_node->mpa_frame_size = ntohs(mpa_frame->priv_data_len); if (cm_node->mpa_frame_size + sizeof(struct ietf_mpa_frame) != len) { nes_debug(NES_DBG_CM, "The received ietf buffer was not right" @@ -271,7 +271,7 @@ struct sk_buff *form_cm_frame(struct sk_buff *skb, struct nes_cm_node *cm_node, buf += sizeof(*tcph); skb->ip_summed = CHECKSUM_PARTIAL; - skb->protocol = ntohs(0x800); + skb->protocol = htons(0x800); skb->data_len = 0; skb->mac_len = ETH_HLEN; @@ -285,7 +285,7 @@ struct sk_buff *form_cm_frame(struct sk_buff *skb, struct nes_cm_node *cm_node, iph->tot_len = htons(packetsize); iph->id = htons(++cm_node->tcp_cntxt.loc_id); - iph->frag_off = ntohs(0x4000); + iph->frag_off = htons(0x4000); iph->ttl = 0x40; iph->protocol= 0x06; /* IPPROTO_TCP */ @@ -394,7 +394,7 @@ int schedule_nes_timer(struct nes_cm_node *cm_node, struct sk_buff *skb, } if (type == NES_TIMER_TYPE_SEND) { - new_send->seq_num = htonl(tcp_hdr(skb)->seq); + new_send->seq_num = ntohl(tcp_hdr(skb)->seq); atomic_inc(&new_send->skb->users); ret = nes_nic_cm_xmit(new_send->skb, cm_node->netdev); @@ -419,7 +419,7 @@ int schedule_nes_timer(struct nes_cm_node *cm_node, struct sk_buff *skb, spin_unlock_irqrestore(&cm_node->retrans_list_lock, flags); } if (type == NES_TIMER_TYPE_RECV) { - new_send->seq_num = htonl(tcp_hdr(skb)->seq); + new_send->seq_num = ntohl(tcp_hdr(skb)->seq); new_send->timetosend = jiffies; spin_lock_irqsave(&cm_node->recv_list_lock, flags); list_add_tail(&new_send->list, &cm_node->recv_list); @@ -1245,7 +1245,7 @@ static int process_options(struct 
nes_cm_node *cm_node, u8 *optionsloc, u32 opti if (all_options->as_mss.length != 4) { return 1; } else { - tmp = htons(all_options->as_mss.mss); + tmp = ntohs(all_options->as_mss.mss); if (tmp > 0 && tmp < cm_node->tcp_cntxt.mss) cm_node->tcp_cntxt.mss = tmp; } @@ -1369,7 +1369,7 @@ int process_packet(struct nes_cm_node *cm_node, struct sk_buff *skb, else if (tcph->syn) cm_node->tcp_cntxt.mss = NES_CM_DEFAULT_MSS; - cm_node->tcp_cntxt.snd_wnd = htons(tcph->window) << + cm_node->tcp_cntxt.snd_wnd = ntohs(tcph->window) << cm_node->tcp_cntxt.snd_wscale; if (cm_node->tcp_cntxt.snd_wnd > cm_node->tcp_cntxt.max_snd_wnd) { @@ -1621,7 +1621,7 @@ static struct nes_cm_listener *mini_cm_listen(struct nes_cm_core *cm_core, nes_debug(NES_DBG_CM, "Api - listen(): addr=0x%08X, port=0x%04x," " listener = %p, backlog = %d, cm_id = %p.\n", - ntohl(cm_info->loc_addr), ntohs(cm_info->loc_port), + htonl(cm_info->loc_addr), htons(cm_info->loc_port), listener, listener->backlog, listener->cm_id); return listener; @@ -1827,7 +1827,7 @@ int mini_cm_recv_pkt(struct nes_cm_core *cm_core, struct nes_vnic *nesvnic, tcph = (struct tcphdr *)(skb->data + sizeof(struct iphdr)); skb_reset_network_header(skb); skb_set_transport_header(skb, sizeof(*tcph)); - skb->len = htons(iph->tot_len); + skb->len = ntohs(iph->tot_len); nfo.loc_addr = ntohl(iph->daddr); nfo.loc_port = ntohs(tcph->dest); @@ -2832,7 +2832,7 @@ void cm_event_connected(struct nes_cm_event *event) nesqp->hte_index = cpu_to_be32( crc32c(~0, (void *)&nes_quad, sizeof(nes_quad)) ^ 0xffffffff); nes_debug(NES_DBG_CM, "HTE Index = 0x%08X, After CRC = 0x%08X\n", - nesqp->hte_index, nesqp->hte_index & nesadapter->hte_index_mask); + be32_to_cpu(nesqp->hte_index), be32_to_cpu(nesqp->hte_index) & nesadapter->hte_index_mask); nesqp->hte_index &= nesadapter->hte_index_mask; nesqp->nesqp_context->hte_index = cpu_to_le32(nesqp->hte_index); diff --git a/drivers/infiniband/hw/nes/nes_cm.h b/drivers/infiniband/hw/nes/nes_cm.h index 46f1dea..6109fdf 
100644 --- a/drivers/infiniband/hw/nes/nes_cm.h +++ b/drivers/infiniband/hw/nes/nes_cm.h @@ -55,7 +55,7 @@ struct ietf_mpa_frame { u8 key[IETF_MPA_KEY_SIZE]; u8 flags; u8 rev; - u16 priv_data_len; + __be16 priv_data_len; u8 priv_data[0]; }; @@ -63,9 +63,9 @@ struct ietf_mpa_frame { struct nes_v4_quad { u32 rsvd0; - u32 DstIpAdrIndex; /* Only most significant 5 bits are valid */ - u32 SrcIpadr; - u16 TcpPorts[2]; /* src is low, dest is high */ + __le32 DstIpAdrIndex; /* Only most significant 5 bits are valid */ + __be32 SrcIpadr; + __be16 TcpPorts[2]; /* src is low, dest is high */ }; struct nes_cm_node; @@ -101,7 +101,7 @@ enum option_numbers { struct option_mss { u8 optionnum; u8 length; - u16 mss; + __be16 mss; }; struct option_windowscale { diff --git a/drivers/infiniband/hw/nes/nes_context.h b/drivers/infiniband/hw/nes/nes_context.h index 114553f..3c4b06f 100644 --- a/drivers/infiniband/hw/nes/nes_context.h +++ b/drivers/infiniband/hw/nes/nes_context.h @@ -34,50 +34,50 @@ #define NES_CONTEXT_H struct nes_qp_context { - u32 misc; - u32 cqs; - u32 sq_addr_low; - u32 sq_addr_high; - u32 rq_addr_low; - u32 rq_addr_high; - u32 misc2; - u16 tcpPorts[2]; - u32 ip0; - u32 ip1; - u32 ip2; - u32 ip3; - u32 mss; - u32 arp_index_vlan; - u32 tcp_state_flow_label; - u32 pd_index_wscale; - u32 keepalive; + __le32 misc; + __le32 cqs; + __le32 sq_addr_low; + __le32 sq_addr_high; + __le32 rq_addr_low; + __le32 rq_addr_high; + __le32 misc2; + __le16 tcpPorts[2]; + __le32 ip0; + __le32 ip1; + __le32 ip2; + __le32 ip3; + __le32 mss; + __le32 arp_index_vlan; + __le32 tcp_state_flow_label; + __le32 pd_index_wscale; + __le32 keepalive; u32 ts_recent; u32 ts_age; - u32 snd_nxt; - u32 snd_wnd; - u32 rcv_nxt; - u32 rcv_wnd; - u32 snd_max; - u32 snd_una; + __le32 snd_nxt; + __le32 snd_wnd; + __le32 rcv_nxt; + __le32 rcv_wnd; + __le32 snd_max; + __le32 snd_una; u32 srtt; - u32 rttvar; - u32 ssthresh; - u32 cwnd; - u32 snd_wl1; - u32 snd_wl2; - u32 max_snd_wnd; - u32 ts_val_delta; + __le32 
rttvar; + __le32 ssthresh; + __le32 cwnd; + __le32 snd_wl1; + __le32 snd_wl2; + __le32 max_snd_wnd; + __le32 ts_val_delta; u32 retransmit; u32 probe_cnt; - u32 hte_index; - u32 q2_addr_low; - u32 q2_addr_high; - u32 ird_index; + __le32 hte_index; + __le32 q2_addr_low; + __le32 q2_addr_high; + __le32 ird_index; u32 Rsvd3; - u32 ird_ord_sizes; + __le32 ird_ord_sizes; u32 mrkr_offset; - u32 aeq_token_low; - u32 aeq_token_high; + __le32 aeq_token_low; + __le32 aeq_token_high; }; /* QP Context Misc Field */ diff --git a/drivers/infiniband/hw/nes/nes_hw.c b/drivers/infiniband/hw/nes/nes_hw.c index b9cebfa..73c3a6c 100644 --- a/drivers/infiniband/hw/nes/nes_hw.c +++ b/drivers/infiniband/hw/nes/nes_hw.c @@ -2199,7 +2199,7 @@ void nes_nic_ce_handler(struct nes_device *nesdev, struct nes_hw_nic_cq *cq) struct nes_hw_nic_sq_wqe *nic_sqe; struct sk_buff *skb; struct sk_buff *rx_skb; - u16 *wqe_fragment_length; + __le16 *wqe_fragment_length; unsigned long flags; u32 head; u32 cq_size; @@ -2227,7 +2227,7 @@ void nes_nic_ce_handler(struct nes_device *nesdev, struct nes_hw_nic_cq *cq) wqe_fragment_index = 1; nic_sqe = &nesnic->sq_vbase[nesnic->sq_tail]; skb = nesnic->tx_skb[nesnic->sq_tail]; - wqe_fragment_length = (u16 *)&nic_sqe->wqe_words[NES_NIC_SQ_WQE_LENGTH_0_TAG_IDX]; + wqe_fragment_length = (__le16 *)&nic_sqe->wqe_words[NES_NIC_SQ_WQE_LENGTH_0_TAG_IDX]; /* bump past the vlan tag */ wqe_fragment_length++; if (le16_to_cpu(wqe_fragment_length[wqe_fragment_index]) != 0) { diff --git a/drivers/infiniband/hw/nes/nes_hw.h b/drivers/infiniband/hw/nes/nes_hw.h index 2efb55e..b1fea65 100644 --- a/drivers/infiniband/hw/nes/nes_hw.h +++ b/drivers/infiniband/hw/nes/nes_hw.h @@ -1106,7 +1106,7 @@ struct nes_adapter { }; struct nes_pbl { - u64 *pbl_vbase; + __le64 *pbl_vbase; dma_addr_t pbl_pbase; struct page *page; unsigned long user_base; diff --git a/drivers/infiniband/hw/nes/nes_nic.c b/drivers/infiniband/hw/nes/nes_nic.c index bd3f9e8..30a9696 100644 --- 
a/drivers/infiniband/hw/nes/nes_nic.c +++ b/drivers/infiniband/hw/nes/nes_nic.c @@ -92,42 +92,6 @@ static const u32 default_msg = NETIF_MSG_DRV | NETIF_MSG_PROBE | NETIF_MSG_LINK | NETIF_MSG_IFUP | NETIF_MSG_IFDOWN; static int debug = -1; -extern atomic_t cm_connects; -extern atomic_t cm_accepts; -extern atomic_t cm_disconnects; -extern atomic_t cm_closes; -extern atomic_t cm_connecteds; -extern atomic_t cm_connect_reqs; -extern atomic_t cm_rejects; -extern atomic_t mod_qp_timouts; -extern atomic_t qps_created; -extern atomic_t qps_destroyed; -extern atomic_t sw_qps_destroyed; -extern u32 mh_detected; -extern u32 mh_pauses_sent; -extern u32 cm_packets_sent; -extern u32 cm_packets_bounced; -extern u32 cm_packets_created; -extern u32 cm_packets_received; -extern u32 cm_packets_dropped; -extern u32 cm_packets_retrans; -extern u32 cm_listens_created; -extern u32 cm_listens_destroyed; -extern u32 cm_backlog_drops; -extern atomic_t cm_loopbacks; -extern atomic_t cm_nodes_created; -extern atomic_t cm_nodes_destroyed; -extern atomic_t cm_accel_dropped_pkts; -extern atomic_t cm_resets_recvd; -extern u32 int_mod_timer_init; -extern u32 int_mod_cq_depth_256; -extern u32 int_mod_cq_depth_128; -extern u32 int_mod_cq_depth_32; -extern u32 int_mod_cq_depth_24; -extern u32 int_mod_cq_depth_16; -extern u32 int_mod_cq_depth_4; -extern u32 int_mod_cq_depth_1; - static int nes_netdev_open(struct net_device *); static int nes_netdev_stop(struct net_device *); static int nes_netdev_start_xmit(struct sk_buff *, struct net_device *); @@ -350,21 +314,21 @@ static int nes_nic_send(struct sk_buff *skb, struct net_device *netdev) struct nes_hw_nic *nesnic = &nesvnic->nic; struct nes_hw_nic_sq_wqe *nic_sqe; struct tcphdr *tcph; - u16 *wqe_fragment_length; + __le16 *wqe_fragment_length; u32 wqe_misc; u16 wqe_fragment_index = 1; /* first fragment (0) is used by copy buffer */ u16 skb_fragment_index; dma_addr_t bus_address; nic_sqe = &nesnic->sq_vbase[nesnic->sq_head]; - wqe_fragment_length = 
(u16 *)&nic_sqe->wqe_words[NES_NIC_SQ_WQE_LENGTH_0_TAG_IDX]; + wqe_fragment_length = (__le16 *)&nic_sqe->wqe_words[NES_NIC_SQ_WQE_LENGTH_0_TAG_IDX]; /* setup the VLAN tag if present */ if (vlan_tx_tag_present(skb)) { nes_debug(NES_DBG_NIC_TX, "%s: VLAN packet to send... VLAN = %08X\n", netdev->name, vlan_tx_tag_get(skb)); wqe_misc = NES_NIC_SQ_WQE_TAGVALUE_ENABLE; - wqe_fragment_length[0] = vlan_tx_tag_get(skb); + wqe_fragment_length[0] = (__force __le16) vlan_tx_tag_get(skb); } else wqe_misc = 0; @@ -475,7 +439,7 @@ static int nes_netdev_start_xmit(struct sk_buff *skb, struct net_device *netdev) u32 send_rc; struct iphdr *iph; unsigned long flags; - u16 *wqe_fragment_length; + __le16 *wqe_fragment_length; u32 nr_frags; u32 original_first_length; // u64 *wqe_fragment_address; @@ -577,13 +541,13 @@ tso_sq_no_longer_full: tso_wqe_length = 0; nic_sqe = &nesnic->sq_vbase[nesnic->sq_head]; wqe_fragment_length = - (u16 *)&nic_sqe->wqe_words[NES_NIC_SQ_WQE_LENGTH_0_TAG_IDX]; + (__le16 *)&nic_sqe->wqe_words[NES_NIC_SQ_WQE_LENGTH_0_TAG_IDX]; /* setup the VLAN tag if present */ if (vlan_tx_tag_present(skb)) { nes_debug(NES_DBG_NIC_TX, "%s: VLAN packet to send... 
VLAN = %08X\n", netdev->name, vlan_tx_tag_get(skb) ); wqe_misc = NES_NIC_SQ_WQE_TAGVALUE_ENABLE; - wqe_fragment_length[0] = vlan_tx_tag_get(skb); + wqe_fragment_length[0] = (__force __le16) vlan_tx_tag_get(skb); } else wqe_misc = 0; diff --git a/drivers/infiniband/hw/nes/nes_utils.c b/drivers/infiniband/hw/nes/nes_utils.c index efdd629..24e2326 100644 --- a/drivers/infiniband/hw/nes/nes_utils.c +++ b/drivers/infiniband/hw/nes/nes_utils.c @@ -610,8 +610,8 @@ void nes_post_cqp_request(struct nes_device *nesdev, nes_debug(NES_DBG_CQP, "CQP request %p (opcode 0x%02X), line 1 = 0x%08X" " put on the pending queue.\n", cqp_request, - cqp_request->cqp_wqe.wqe_words[NES_CQP_WQE_OPCODE_IDX]&0x3f, - cqp_request->cqp_wqe.wqe_words[NES_CQP_WQE_ID_IDX]); + le32_to_cpu(cqp_request->cqp_wqe.wqe_words[NES_CQP_WQE_OPCODE_IDX]) & 0x3f, + le32_to_cpu(cqp_request->cqp_wqe.wqe_words[NES_CQP_WQE_ID_IDX])); list_add_tail(&cqp_request->list, &nesdev->cqp_pending_reqs); } diff --git a/drivers/infiniband/hw/nes/nes_verbs.c b/drivers/infiniband/hw/nes/nes_verbs.c index ebe1a17..e218cac 100644 --- a/drivers/infiniband/hw/nes/nes_verbs.c +++ b/drivers/infiniband/hw/nes/nes_verbs.c @@ -298,10 +298,8 @@ static int nes_bind_mw(struct ib_qp *ibqp, struct ib_mw *ibmw, cpu_to_le32((u32)((u64temp)>>32)); wqe->wqe_words[NES_IWARP_SQ_WQE_COMP_CTX_HIGH_IDX] = cpu_to_le32((u32)(upper_32_bits((unsigned long)nesqp))); - wqe->wqe_words[NES_IWARP_SQ_WQE_COMP_CTX_LOW_IDX] = (u32)((unsigned long)nesqp); - wqe->wqe_words[NES_IWARP_SQ_WQE_COMP_CTX_LOW_IDX] |= head; - wqe->wqe_words[NES_IWARP_SQ_WQE_COMP_CTX_LOW_IDX] = - cpu_to_le32(wqe->wqe_words[NES_IWARP_SQ_WQE_COMP_CTX_LOW_IDX]); + wqe->wqe_words[NES_IWARP_SQ_WQE_COMP_CTX_LOW_IDX] = + cpu_to_le32(head | (u32)(unsigned long) nesqp); wqe_misc = NES_IWARP_SQ_OP_BIND; wqe_misc |= NES_IWARP_SQ_WQE_LOCAL_FENCE; @@ -1072,9 +1070,9 @@ static int nes_setup_virt_qp(struct nes_qp *nesqp, struct nes_pbl *nespbl, { unsigned long flags; void *mem; - u64 *pbl = NULL; - u64 
*tpbl; - u64 *pblbuffer; + __le64 *pbl = NULL; + __le64 *tpbl; + __le64 *pblbuffer; struct nes_device *nesdev = nesvnic->nesdev; struct nes_adapter *nesadapter = nesdev->nesadapter; u32 pbl_entries; @@ -1091,7 +1089,7 @@ static int nes_setup_virt_qp(struct nes_qp *nesqp, struct nes_pbl *nespbl, /* the first pbl to be fro the rq_vbase... */ rq_pbl_entries = (rq_size * sizeof(struct nes_hw_qp_wqe)) >> 12; sq_pbl_entries = (sq_size * sizeof(struct nes_hw_qp_wqe)) >> 12; - nesqp->hwqp.sq_pbase = (le32_to_cpu (((u32 *)pbl)[0]) ) | ((u64)((le32_to_cpu (((u32 *)pbl)[1]))) << 32); + nesqp->hwqp.sq_pbase = (le32_to_cpu (((__le32 *)pbl)[0]) ) | ((u64)((le32_to_cpu (((__le32 *)pbl)[1]))) << 32); if (!nespbl->page) { nes_debug(NES_DBG_QP, "QP nespbl->page is NULL \n"); kfree(nespbl); @@ -1109,7 +1107,7 @@ static int nes_setup_virt_qp(struct nes_qp *nesqp, struct nes_pbl *nespbl, /* Now to get to sq.. we need to calculate how many */ /* PBL entries were used by the rq.. */ pbl += sq_pbl_entries; - nesqp->hwqp.rq_pbase = (le32_to_cpu (((u32 *)pbl)[0]) ) | ((u64)((le32_to_cpu (((u32 *)pbl)[1]))) << 32); + nesqp->hwqp.rq_pbase = (le32_to_cpu (((__le32 *)pbl)[0]) ) | ((u64)((le32_to_cpu (((__le32 *)pbl)[1]))) << 32); /* nesqp->hwqp.rq_vbase = bus_to_virt(*pbl); */ /*nesqp->hwqp.rq_vbase = phys_to_virt(*pbl); */ @@ -2405,7 +2403,7 @@ static struct ib_mr *nes_reg_user_mr(struct ib_pd *pd, u64 start, u64 length, u64 virt, int acc, struct ib_udata *udata) { u64 iova_start; - u64 *pbl; + __le64 *pbl; u64 region_length; dma_addr_t last_dma_addr = 0; dma_addr_t first_dma_addr = 0; @@ -2729,15 +2727,15 @@ static struct ib_mr *nes_reg_user_mr(struct ib_pd *pd, u64 start, u64 length, chunk_pages += (sg_dma_len(&chunk->page_list[nmap_index]) & (4096-1)) ? 
1 : 0; nespbl->page = sg_page(&chunk->page_list[0]); for (page_index=0; page_indexpage_list[nmap_index])+ (page_index*4096))); - ((u32 *)pbl)[1] = cpu_to_le32(((u64) + ((__le32 *)pbl)[1] = cpu_to_le32(((u64) (sg_dma_address(&chunk->page_list[nmap_index])+ (page_index*4096)))>>32); nes_debug(NES_DBG_MR, "pbl=%p, *pbl=0x%016llx, 0x%08x%08x\n", pbl, (unsigned long long)*pbl, - le32_to_cpu(((u32 *)pbl)[1]), le32_to_cpu(((u32 *)pbl)[0])); + le32_to_cpu(((__le32 *)pbl)[1]), le32_to_cpu(((__le32 *)pbl)[0])); pbl++; } } @@ -3730,10 +3728,10 @@ static int nes_poll_cq(struct ib_cq *ibcq, int num_entries, struct ib_wc *entry) /* Working on a SQ Completion*/ wq_tail = wqe_index; nesqp->hwqp.sq_tail = (wqe_index+1)&(nesqp->hwqp.sq_size - 1); - wrid = (((u64)(cpu_to_le32((u32)nesqp->hwqp.sq_vbase[wq_tail]. + wrid = (((u64)(le32_to_cpu(nesqp->hwqp.sq_vbase[wq_tail]. wqe_words[NES_IWARP_SQ_WQE_COMP_SCRATCH_HIGH_IDX]))) << 32) | - ((u64)(cpu_to_le32((u32)nesqp->hwqp.sq_vbase[wq_tail]. - wqe_words[NES_IWARP_SQ_WQE_COMP_SCRATCH_LOW_IDX]))); + le32_to_cpu(nesqp->hwqp.sq_vbase[wq_tail]. + wqe_words[NES_IWARP_SQ_WQE_COMP_SCRATCH_LOW_IDX]); entry->byte_len = le32_to_cpu(nesqp->hwqp.sq_vbase[wq_tail]. 
wqe_words[NES_IWARP_SQ_WQE_TOTAL_PAYLOAD_IDX]); diff --git a/drivers/infiniband/hw/nes/nes_verbs.h b/drivers/infiniband/hw/nes/nes_verbs.h index c5ee39d..d4cdc6e 100644 --- a/drivers/infiniband/hw/nes/nes_verbs.h +++ b/drivers/infiniband/hw/nes/nes_verbs.h @@ -77,8 +77,8 @@ struct nes_mr { }; struct nes_hw_pb { - u32 pa_low; - u32 pa_high; + __le32 pa_low; + __le32 pa_high; }; struct nes_vpbl { @@ -139,7 +139,7 @@ struct nes_qp { struct work_struct ae_work; enum ib_qp_state ibqp_state; u32 iwarp_state; - u32 hte_index; + __be32 hte_index; u32 last_aeq; u32 qp_mem_size; atomic_t refcount; From kononov at dls.net Wed Jan 23 15:29:07 2008 From: kononov at dls.net (Roman Kononov) Date: Wed, 23 Jan 2008 17:29:07 -0600 Subject: [ofa-general] Bogus Receive Completions In-Reply-To: <4e6a6b3c0801231412t7970ce83u9a2bb9aa11a1b2cd@mail.gmail.com> References: <475607AA.301@dls.net> <478B8172.2010104@mellanox.co.il> <1200988162.6925.170.camel@mtls03> <4797A6FD.2080708@dls.net> <4e6a6b3c0801231412t7970ce83u9a2bb9aa11a1b2cd@mail.gmail.com> Message-ID: <4797CDC3.6090707@dls.net> On 2008-01-23 16:12, Eli Cohen wrote: > if your software creates QPs whose receive queue size is not a power of two > then you might experience weird problems as the patches I sent have a bug The RQ size is 64. The SQ size is 64. The CQ size is 128. > I am sending a patch to be applied on top of the previous libmthca patch so > you can try it (the same fix goes for the kernel code too). Tomorrow I will > send the fixed patches again. I apologize if the patch is badly formed. Thanks for the patches. I have another "simple" program (700 lines) which fails with both tavor (4.8.200) and memfree (5.3.000) FW. Unfortunately, it takes an hour or more to fail; the failure is not obvious, because when it happens, very little is printed; and when I try to modify the code to make it more obvious the problem goes away. The failures are related to ordering. Either completions or data are out of order.
If anybody is interested let me know, I will post it. Roman From rdreier at cisco.com Wed Jan 23 15:32:37 2008 From: rdreier at cisco.com (Roland Dreier) Date: Wed, 23 Jan 2008 15:32:37 -0800 Subject: [ofa-general] Bogus Receive Completions In-Reply-To: <4797CDC3.6090707@dls.net> (Roman Kononov's message of "Wed, 23 Jan 2008 17:29:07 -0600") References: <475607AA.301@dls.net> <478B8172.2010104@mellanox.co.il> <1200988162.6925.170.camel@mtls03> <4797A6FD.2080708@dls.net> <4e6a6b3c0801231412t7970ce83u9a2bb9aa11a1b2cd@mail.gmail.com> <4797CDC3.6090707@dls.net> Message-ID: > I have another "simple" program (700 lines) which fails with both > tavor (4.8.200) and memfree (5.3.000) FW. Unfortunately, it takes an > hour or more to fail; the failure is not obvious, because when it > happens, very little is printed; and when I try to modify the code to > make it more obvious the problem goes away. The failures are related > to ordering. Either completions or data are out of order. If anybody > is interested let me know, I will post it. I'd be curious to run it. It can't hurt to have the test... - R. From weiny2 at llnl.gov Wed Jan 23 16:12:13 2008 From: weiny2 at llnl.gov (Ira Weiny) Date: Wed, 23 Jan 2008 16:12:13 -0800 Subject: [ofa-general] Re: [PATCH] infiniband-diags/saquery: code consolidation In-Reply-To: <20080123183712.GC11277@sashak.voltaire.com> References: <20080123183712.GC11277@sashak.voltaire.com> Message-ID: <20080123161213.5e06ad4b.weiny2@llnl.gov> Quick tests indicate this works just fine. Thanks for cleaning this up. It was getting pretty big, Ira On Wed, 23 Jan 2008 18:37:12 +0000 Sasha Khapyorsky wrote: > > Some code consolidation and cleanup. 
> > Signed-off-by: Sasha Khapyorsky > --- > infiniband-diags/src/saquery.c | 340 +++++++++++++++------------------------- > 1 files changed, 130 insertions(+), 210 deletions(-) > > diff --git a/infiniband-diags/src/saquery.c b/infiniband-diags/src/saquery.c > index d16e604..1b9e2d3 100644 > --- a/infiniband-diags/src/saquery.c > +++ b/infiniband-diags/src/saquery.c > @@ -189,9 +189,9 @@ print_node_record(ib_node_record_t *node_record) > ); > } > > -static void > -print_path_record(ib_path_rec_t *p_pr) > +static void dump_path_record(void *data) > { > + ib_path_rec_t *p_pr = data; > printf("PathRecord dump:\n" > "\t\tservice_id..............0x%016" PRIx64 "\n" > "\t\tdgid....................0x%016" PRIx64 " : " > @@ -257,11 +257,11 @@ sprint_gid(ib_gid_t *gid, char *str, size_t len) > return (str); > } > > -static void > -print_class_port_info(ib_class_port_info_t *class_port_info) > +static void dump_class_port_info(void *data) > { > size_t GID_STR_LEN = 256; > char gid_str[GID_STR_LEN]; > + ib_class_port_info_t *class_port_info = data; > > printf("SA ClassPortInfo:\n" > "\t\tBase version.............%d\n" > @@ -302,9 +302,9 @@ print_class_port_info(ib_class_port_info_t *class_port_info) > ); > } > > -static void > -print_portinfo_record(ib_portinfo_record_t *p_pir) > +static void dump_portinfo_record(void *data) > { > + ib_portinfo_record_t *p_pir = data; > const ib_port_info_t * const p_pi = &p_pir->port_info; > > printf("PortInfoRecord dump:\n" > @@ -390,11 +390,11 @@ print_multicast_member_record(ib_member_rec_t *p_mcmr) > } > } > > -static void > -print_service_record(ib_service_record_t *p_sr) > +static void dump_service_record(void *data) > { > char buf_service_key[35]; > char buf_service_name[65]; > + ib_service_record_t *p_sr = data; > > sprintf(buf_service_key, > "0x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x", > @@ -488,9 +488,9 @@ print_service_record(ib_service_record_t *p_sr) > ); > } > > -static void > 
-print_inform_info_record(ib_inform_info_record_t *p_iir) > +static void dump_inform_info_record(void *data) > { > + ib_inform_info_record_t *p_iir = data; > uint32_t qpn; > uint8_t resp_time_val; > > @@ -565,8 +565,9 @@ print_inform_info_record(ib_inform_info_record_t *p_iir) > } > } > > -static void dump_one_link_record(ib_link_record_t *lr) > +static void dump_one_link_record(void *data) > { > + ib_link_record_t *lr = data; > printf("LinkRecord dump:\n" > "\t\tFromLID....................%u\n" > "\t\tFromPort...................%u\n" > @@ -576,8 +577,9 @@ static void dump_one_link_record(ib_link_record_t *lr) > lr->to_port_num, cl_ntoh16(lr->to_lid)); > } > > -static void dump_one_slvl_record(ib_slvl_table_record_t *slvl) > +static void dump_one_slvl_record(void *data) > { > + ib_slvl_table_record_t *slvl = data; > ib_slvl_table_t *t = &slvl->slvl_tbl; > printf("SL2VLTableRecord dump:\n" > "\t\tLID........................%u\n" > @@ -597,8 +599,9 @@ static void dump_one_slvl_record(ib_slvl_table_record_t *slvl) > ib_slvl_table_get(t, 14), ib_slvl_table_get(t, 15)); > } > > -static void dump_one_vlarb_record(ib_vl_arb_table_record_t *vlarb) > +static void dump_one_vlarb_record(void *data) > { > + ib_vl_arb_table_record_t *vlarb = data; > ib_vl_arb_element_t *e = vlarb->vl_arb_tbl.vl_entry; > int i; > printf("VLArbTableRecord dump:\n" > @@ -625,8 +628,9 @@ static void dump_one_vlarb_record(ib_vl_arb_table_record_t *vlarb) > } > } > > -static void dump_one_pkey_tbl_record(ib_pkey_table_record_t *pktr) > +static void dump_one_pkey_tbl_record(void *data) > { > + ib_pkey_table_record_t *pktr = data; > ib_net16_t *p = pktr->pkey_tbl.pkey_entry; > int i; > printf("PKeyTableRecord dump:\n" > @@ -645,6 +649,15 @@ static void dump_one_pkey_tbl_record(ib_pkey_table_record_t *pktr) > printf("\n"); > } > > +static void dump_results(osmv_query_res_t *r, void (*dump_func)(void *)) > +{ > + int i; > + for (i = 0; i < r->result_cnt; i++) { > + void *data = 
osmv_get_query_svc_rec(r->p_result_madw, i); > + dump_func(data); > + } > +} > + > static void > return_mad(void) > { > @@ -815,121 +828,6 @@ get_issm_records(osm_bind_handle_t bind_handle, ib_net32_t capability_mask) > 0); > } > > -/* > - * Get the LinkRecord(s) > - */ > -static ib_api_status_t get_link_records(osm_bind_handle_t bind_handle, > - int from_lid, int from_port, > - int to_lid, int to_port) > -{ > - ib_link_record_t lr; > - ib_net64_t comp_mask; > - > - memset(&lr, 0, sizeof(lr)); > - comp_mask = 0; > - > - if (from_lid > 0) { > - lr.from_lid = cl_hton16(from_lid); > - comp_mask |= IB_LR_COMPMASK_FROM_LID; > - } > - if (from_port >= 0) { > - lr.from_port_num = from_port; > - comp_mask |= IB_LR_COMPMASK_FROM_PORT; > - } > - if (to_lid > 0) { > - lr.to_lid = cl_hton16(to_lid); > - comp_mask |= IB_LR_COMPMASK_TO_LID; > - } > - if (to_port >= 0) { > - lr.to_port_num = to_port; > - comp_mask |= IB_LR_COMPMASK_TO_PORT; > - } > - > - return get_any_records(bind_handle, IB_MAD_ATTR_LINK_RECORD, 0, > - comp_mask, &lr, > - ib_get_attr_offset(sizeof(ib_link_record_t)), 0); > -} > - > -static ib_api_status_t get_slvl_records(osm_bind_handle_t bind_handle, > - int lid, int in_port, int out_port) > -{ > - ib_slvl_table_record_t slvl; > - ib_net64_t comp_mask; > - > - memset(&slvl, 0, sizeof(slvl)); > - comp_mask = 0; > - > - if (lid > 0) { > - slvl.lid = cl_hton16(lid); > - comp_mask |= IB_SLVL_COMPMASK_LID; > - } > - if (in_port >= 0) { > - slvl.in_port_num = in_port; > - comp_mask |= IB_SLVL_COMPMASK_IN_PORT; > - } > - if (out_port >= 0) { > - slvl.out_port_num = out_port; > - comp_mask |= IB_SLVL_COMPMASK_OUT_PORT; > - } > - > - return get_any_records(bind_handle, IB_MAD_ATTR_SLVL_RECORD, 0, > - comp_mask, &slvl, > - ib_get_attr_offset(sizeof(ib_slvl_table_record_t)), 0); > -} > - > -static ib_api_status_t get_vlarb_records(osm_bind_handle_t bind_handle, > - int lid, int port, int block) > -{ > - ib_vl_arb_table_record_t vlarb; > - ib_net64_t comp_mask = 0; > - > 
- memset(&vlarb, 0, sizeof(vlarb)); > - > - if (lid > 0) { > - vlarb.lid = cl_hton16(lid); > - comp_mask |= IB_VLA_COMPMASK_LID; > - } > - if (port >= 0) { > - vlarb.port_num = port; > - comp_mask |= IB_VLA_COMPMASK_OUT_PORT; > - } > - if (block >= 0) { > - vlarb.block_num = block; > - comp_mask |= IB_VLA_COMPMASK_BLOCK; > - } > - > - return get_any_records(bind_handle, IB_MAD_ATTR_VLARB_RECORD, 0, > - comp_mask, &vlarb, > - ib_get_attr_offset(sizeof(ib_vl_arb_table_record_t)), 0); > -} > - > -static ib_api_status_t get_pkey_tbl_records(osm_bind_handle_t bind_handle, > - int lid, int port, int block) > -{ > - ib_pkey_table_record_t pktr; > - ib_net64_t comp_mask = 0; > - > - memset(&pktr, 0, sizeof(pktr)); > - > - if (lid > 0) { > - pktr.lid = cl_hton16(lid); > - comp_mask |= IB_PKEY_COMPMASK_LID; > - } > - if (port >= 0) { > - pktr.port_num = port; > - comp_mask |= IB_PKEY_COMPMASK_PORT; > - } > - if (block >= 0) { > - pktr.block_num = block; > - comp_mask |= IB_PKEY_COMPMASK_BLOCK; > - } > - > - return get_any_records(bind_handle, IB_MAD_ATTR_PKEY_TBL_RECORD, 0, > - comp_mask, &pktr, > - ib_get_attr_offset(sizeof(pktr)), > - OSM_DEFAULT_SM_KEY); > -} > - > static ib_api_status_t > print_node_records(osm_bind_handle_t bind_handle) > { > @@ -982,8 +880,6 @@ get_print_path_rec_lid(osm_bind_handle_t bind_handle, > ib_net16_t src_lid, > ib_net16_t dst_lid) > { > - int i = 0; > - ib_path_rec_t *path_record = NULL; > osmv_query_req_t req; > osmv_lid_pair_t lid_pair; > ib_api_status_t status; > @@ -1013,10 +909,7 @@ get_print_path_rec_lid(osm_bind_handle_t bind_handle, > return (result.status); > } > status = result.status; > - for (i = 0; i < result.result_cnt; i++) { > - path_record = osmv_get_query_path_rec(result.p_result_madw, i); > - print_path_record(path_record); > - } > + dump_results(&result, dump_path_record); > return_mad(); > return (status); > } > @@ -1026,8 +919,6 @@ get_print_path_rec_gid(osm_bind_handle_t bind_handle, > const ib_gid_t *src_gid, > const 
ib_gid_t *dst_gid) > { > - int i = 0; > - ib_path_rec_t *path_record = NULL; > osmv_query_req_t req; > osmv_gid_pair_t gid_pair; > ib_api_status_t status; > @@ -1057,10 +948,7 @@ get_print_path_rec_gid(osm_bind_handle_t bind_handle, > return (result.status); > } > status = result.status; > - for (i = 0; i < result.result_cnt; i++) { > - path_record = osmv_get_query_path_rec(result.p_result_madw, i); > - print_path_record(path_record); > - } > + dump_results(&result, dump_path_record); > return_mad(); > return (status); > } > @@ -1068,8 +956,6 @@ get_print_path_rec_gid(osm_bind_handle_t bind_handle, > static ib_api_status_t > get_print_class_port_info(osm_bind_handle_t bind_handle) > { > - int i = 0; > - ib_class_port_info_t *class_port_info = NULL; > osmv_query_req_t req; > ib_api_status_t status; > > @@ -1095,10 +981,7 @@ get_print_class_port_info(osm_bind_handle_t bind_handle) > return (result.status); > } > status = result.status; > - for (i = 0; i < result.result_cnt; i++) { > - class_port_info = (ib_class_port_info_t*)osmv_get_query_result(result.p_result_madw, i); > - print_class_port_info(class_port_info); > - } > + dump_results(&result, dump_class_port_info); > return_mad(); > return (status); > } > @@ -1106,19 +989,14 @@ get_print_class_port_info(osm_bind_handle_t bind_handle) > static ib_api_status_t > print_path_records(osm_bind_handle_t bind_handle) > { > - int i = 0; > - ib_path_rec_t *path_record = NULL; > - ib_net16_t attr_offset = ib_get_attr_offset(sizeof(*path_record)); > - ib_api_status_t status; > + ib_net16_t attr_offset = ib_get_attr_offset(sizeof(ib_path_rec_t)); > + ib_api_status_t status; > > status = get_all_records(bind_handle, IB_MAD_ATTR_PATH_RECORD, attr_offset, 0); > if (status != IB_SUCCESS) > return (status); > > - for (i = 0; i < result.result_cnt; i++) { > - path_record = osmv_get_query_path_rec(result.p_result_madw, i); > - print_path_record(path_record); > - } > + dump_results(&result, dump_path_record); > return_mad(); > return 
(status); > } > @@ -1126,8 +1004,6 @@ print_path_records(osm_bind_handle_t bind_handle) > static ib_api_status_t > print_portinfo_records(osm_bind_handle_t bind_handle) > { > - int i = 0; > - ib_portinfo_record_t *portinfo_record = NULL; > ib_api_status_t status; > > /* First, get IsSM records */ > @@ -1136,10 +1012,7 @@ print_portinfo_records(osm_bind_handle_t bind_handle) > return (status); > > printf("IsSM ports\n"); > - for (i = 0; i < result.result_cnt; i++) { > - portinfo_record = osmv_get_query_portinfo_rec(result.p_result_madw, i); > - print_portinfo_record(portinfo_record); > - } > + dump_results(&result, dump_portinfo_record); > return_mad(); > > /* Now, get IsSMdisabled records */ > @@ -1148,10 +1021,7 @@ print_portinfo_records(osm_bind_handle_t bind_handle) > return (status); > > printf("\nIsSMdisabled ports\n"); > - for (i = 0; i < result.result_cnt; i++) { > - portinfo_record = osmv_get_query_portinfo_rec(result.p_result_madw, i); > - print_portinfo_record(portinfo_record); > - } > + dump_results(&result, dump_portinfo_record); > return_mad(); > > return (status); > @@ -1199,19 +1069,14 @@ return_mc: > static ib_api_status_t > print_service_records(osm_bind_handle_t bind_handle) > { > - int i = 0; > - ib_service_record_t *service_record = NULL; > - ib_net16_t attr_offset = ib_get_attr_offset(sizeof(*service_record)); > - ib_api_status_t status; > + ib_net16_t attr_offset = ib_get_attr_offset(sizeof(ib_service_record_t)); > + ib_api_status_t status; > > status = get_all_records(bind_handle, IB_MAD_ATTR_SERVICE_RECORD, attr_offset, 0); > if (status != IB_SUCCESS) > return (status); > > - for (i = 0; i < result.result_cnt; i++) { > - service_record = osmv_get_query_svc_rec(result.p_result_madw, i); > - print_service_record(service_record); > - } > + dump_results(&result, dump_service_record); > return_mad(); > return (status); > } > @@ -1219,19 +1084,14 @@ print_service_records(osm_bind_handle_t bind_handle) > static ib_api_status_t > 
print_inform_info_records(osm_bind_handle_t bind_handle) > { > - int i = 0; > - ib_inform_info_record_t *inform_info_record = NULL; > - ib_net16_t attr_offset = ib_get_attr_offset(sizeof(*inform_info_record)); > - ib_api_status_t status; > + ib_net16_t attr_offset = ib_get_attr_offset(sizeof(ib_inform_info_record_t)); > + ib_api_status_t status; > > status = get_all_records(bind_handle, IB_MAD_ATTR_INFORM_INFO_RECORD, attr_offset, 0); > if (status != IB_SUCCESS) > return (status); > > - for (i = 0; i < result.result_cnt; i++) { > - inform_info_record = osmv_get_query_inform_info_rec(result.p_result_madw, i); > - print_inform_info_record(inform_info_record); > - } > + dump_results(&result, dump_inform_info_record); > return_mad(); > return (status); > } > @@ -1239,8 +1099,8 @@ print_inform_info_records(osm_bind_handle_t bind_handle) > static ib_api_status_t > print_link_records(osm_bind_handle_t bind_handle, int argc, char *argv[]) > { > - int i; > - ib_link_record_t *lr; > + ib_link_record_t lr; > + ib_net64_t comp_mask = 0; > int from_lid = 0, to_lid = 0, from_port = -1, to_port = -1; > ib_api_status_t status; > > @@ -1252,15 +1112,32 @@ print_link_records(osm_bind_handle_t bind_handle, int argc, char *argv[]) > parse_lid_and_ports(bind_handle, argv[1], > &to_lid, &to_port, NULL); > > - status = get_link_records(bind_handle, from_lid, from_port, > - to_lid, to_port); > + memset(&lr, 0, sizeof(lr)); > + > + if (from_lid > 0) { > + lr.from_lid = cl_hton16(from_lid); > + comp_mask |= IB_LR_COMPMASK_FROM_LID; > + } > + if (from_port >= 0) { > + lr.from_port_num = from_port; > + comp_mask |= IB_LR_COMPMASK_FROM_PORT; > + } > + if (to_lid > 0) { > + lr.to_lid = cl_hton16(to_lid); > + comp_mask |= IB_LR_COMPMASK_TO_LID; > + } > + if (to_port >= 0) { > + lr.to_port_num = to_port; > + comp_mask |= IB_LR_COMPMASK_TO_PORT; > + } > + > + status = get_any_records(bind_handle, IB_MAD_ATTR_LINK_RECORD, 0, > + comp_mask, &lr, > + ib_get_attr_offset(sizeof(lr)), 0); > if (status 
!= IB_SUCCESS) > return status; > > - for (i = 0; i < result.result_cnt; i++) { > - lr = osmv_get_query_result(result.p_result_madw, i); > - dump_one_link_record(lr); > - } > + dump_results(&result, dump_one_link_record); > return_mad(); > return status; > } > @@ -1269,8 +1146,8 @@ static int > print_sl2vl_records(const struct query_cmd *q, osm_bind_handle_t bind_handle, > int argc, char *argv[]) > { > - int i; > - ib_slvl_table_record_t *slvl; > + ib_slvl_table_record_t slvl; > + ib_net64_t comp_mask = 0; > int lid = 0, in_port = -1, out_port = -1; > ib_api_status_t status; > > @@ -1278,14 +1155,28 @@ print_sl2vl_records(const struct query_cmd *q, osm_bind_handle_t bind_handle, > parse_lid_and_ports(bind_handle, argv[0], > &lid, &in_port, &out_port); > > - status = get_slvl_records(bind_handle, lid, in_port, out_port); > + memset(&slvl, 0, sizeof(slvl)); > + > + if (lid > 0) { > + slvl.lid = cl_hton16(lid); > + comp_mask |= IB_SLVL_COMPMASK_LID; > + } > + if (in_port >= 0) { > + slvl.in_port_num = in_port; > + comp_mask |= IB_SLVL_COMPMASK_IN_PORT; > + } > + if (out_port >= 0) { > + slvl.out_port_num = out_port; > + comp_mask |= IB_SLVL_COMPMASK_OUT_PORT; > + } > + > + status = get_any_records(bind_handle, IB_MAD_ATTR_SLVL_RECORD, 0, > + comp_mask, &slvl, > + ib_get_attr_offset(sizeof(slvl)), 0); > if (status != IB_SUCCESS) > return status; > > - for (i = 0; i < result.result_cnt; i++) { > - slvl = osmv_get_query_result(result.p_result_madw, i); > - dump_one_slvl_record(slvl); > - } > + dump_results(&result, dump_one_slvl_record); > return_mad(); > return status; > } > @@ -1294,8 +1185,8 @@ static int > print_vlarb_records(const struct query_cmd *q, osm_bind_handle_t bind_handle, > int argc, char *argv[]) > { > - int i; > - ib_vl_arb_table_record_t *vlarb; > + ib_vl_arb_table_record_t vlarb; > + ib_net64_t comp_mask = 0; > int lid = 0, port = -1, block = -1; > ib_api_status_t status; > > @@ -1303,14 +1194,28 @@ print_vlarb_records(const struct query_cmd *q, 
osm_bind_handle_t bind_handle, > parse_lid_and_ports(bind_handle, argv[0], > &lid, &port, &block); > > - status = get_vlarb_records(bind_handle, lid, port, block); > + memset(&vlarb, 0, sizeof(vlarb)); > + > + if (lid > 0) { > + vlarb.lid = cl_hton16(lid); > + comp_mask |= IB_VLA_COMPMASK_LID; > + } > + if (port >= 0) { > + vlarb.port_num = port; > + comp_mask |= IB_VLA_COMPMASK_OUT_PORT; > + } > + if (block >= 0) { > + vlarb.block_num = block; > + comp_mask |= IB_VLA_COMPMASK_BLOCK; > + } > + > + status = get_any_records(bind_handle, IB_MAD_ATTR_VLARB_RECORD, 0, > + comp_mask, &vlarb, > + ib_get_attr_offset(sizeof(vlarb)), 0); > if (status != IB_SUCCESS) > return status; > > - for (i = 0; i < result.result_cnt; i++) { > - vlarb = osmv_get_query_result(result.p_result_madw, i); > - dump_one_vlarb_record(vlarb); > - } > + dump_results(&result, dump_one_vlarb_record); > return_mad(); > return status; > } > @@ -1319,8 +1224,8 @@ static int > print_pkey_tbl_records(const struct query_cmd *q, osm_bind_handle_t bind_handle, > int argc, char *argv[]) > { > - int i; > - ib_pkey_table_record_t *pktr; > + ib_pkey_table_record_t pktr; > + ib_net64_t comp_mask = 0; > int lid = 0, port = -1, block = -1; > ib_api_status_t status; > > @@ -1328,14 +1233,29 @@ print_pkey_tbl_records(const struct query_cmd *q, osm_bind_handle_t bind_handle, > parse_lid_and_ports(bind_handle, argv[0], > &lid, &port, &block); > > - status = get_pkey_tbl_records(bind_handle, lid, port, block); > + memset(&pktr, 0, sizeof(pktr)); > + > + if (lid > 0) { > + pktr.lid = cl_hton16(lid); > + comp_mask |= IB_PKEY_COMPMASK_LID; > + } > + if (port >= 0) { > + pktr.port_num = port; > + comp_mask |= IB_PKEY_COMPMASK_PORT; > + } > + if (block >= 0) { > + pktr.block_num = block; > + comp_mask |= IB_PKEY_COMPMASK_BLOCK; > + } > + > + status = get_any_records(bind_handle, IB_MAD_ATTR_PKEY_TBL_RECORD, 0, > + comp_mask, &pktr, > + ib_get_attr_offset(sizeof(pktr)), > + OSM_DEFAULT_SM_KEY); > if (status != IB_SUCCESS) > 
return status; > > - for (i = 0; i < result.result_cnt; i++) { > - pktr = osmv_get_query_result(result.p_result_madw, i); > - dump_one_pkey_tbl_record(pktr); > - } > + dump_results(&result, dump_one_pkey_tbl_record); > return_mad(); > return status; > } > -- > 1.5.4.rc2.60.gb2e62 From kononov at dls.net Wed Jan 23 16:31:03 2008 From: kononov at dls.net (Roman Kononov) Date: Wed, 23 Jan 2008 18:31:03 -0600 Subject: [ofa-general] Bogus Receive Completions In-Reply-To: References: <475607AA.301@dls.net> <478B8172.2010104@mellanox.co.il> <1200988162.6925.170.camel@mtls03> <4797A6FD.2080708@dls.net> <4e6a6b3c0801231412t7970ce83u9a2bb9aa11a1b2cd@mail.gmail.com> <4797CDC3.6090707@dls.net> Message-ID: <4797DC47.6060104@dls.net> On 2008-01-23 17:32, Roland Dreier wrote: > I'd be curious to run it. It can't hurt to have the test... This is similar to my previous program. The difference is that this one makes many (up to 10, in test_create()) sets of SQ+RQ+CQ in struct conn_t, which share a single Completion Channel in struct ctx_t. Every conn_t has a ring of receive buffers, a ring of send buffers, send sequence number, receive sequence number. Every time a buffer is sent, just before ibv_post_send() call, the send sequence number is placed into the buffer, imm_data and wr_id. Upon Send Completion, wr_id and the sent buffer must contain the expected send sequence number. Every time a buffer is received, just before ibv_post_recv() call, the receive sequence number is placed into wr_id. Upon Receive Completion, wr_id, the received buffer and imm_data must contain the expected receive sequence number. These 2 "musts" are sometimes violated. In my setup assertion fails in lines 303 (receiver) and 287 (sender). The program has 2 threads. The first one reads the completion channel, validates the Send and Receive Completions, issues ibv_post_recv() and ibv_post_send(). The second one can only issue ibv_post_send(). The program makes 2 QP. 
Increasing the number of QP seemly does not increase the probability of failure. The program prints . In my setup, it seems that if I run two pairs of the program, the failure occurs sooner. Roman From kononov at dls.net Wed Jan 23 16:36:23 2008 From: kononov at dls.net (Roman Kononov) Date: Wed, 23 Jan 2008 18:36:23 -0600 Subject: [ofa-general] Bogus Receive Completions In-Reply-To: <4797DC47.6060104@dls.net> References: <475607AA.301@dls.net> <478B8172.2010104@mellanox.co.il> <1200988162.6925.170.camel@mtls03> <4797A6FD.2080708@dls.net> <4e6a6b3c0801231412t7970ce83u9a2bb9aa11a1b2cd@mail.gmail.com> <4797CDC3.6090707@dls.net> <4797DC47.6060104@dls.net> Message-ID: <4797DD87.8020808@dls.net> On 2008-01-23 18:31, Roman Kononov wrote: > On 2008-01-23 17:32, Roland Dreier wrote: >> I'd be curious to run it. It can't hurt to have the test... > > This is similar to my previous program. The difference is that this one > makes many (up to 10, in test_create()) sets of SQ+RQ+CQ in struct > conn_t, which share a single Completion Channel in struct ctx_t. Every > conn_t has a ring of receive buffers, a ring of send buffers, send > sequence number, receive sequence number. Every time a buffer is sent, > just before ibv_post_send() call, the send sequence number is placed > into the buffer, imm_data and wr_id. Upon Send Completion, wr_id and the > sent buffer must contain the expected send sequence number. Every time a > buffer is received, just before ibv_post_recv() call, the receive > sequence number is placed into wr_id. Upon Receive Completion, wr_id, > the received buffer and imm_data must contain the expected receive > sequence number. These 2 "musts" are sometimes violated. In my setup > assertion fails in lines 303 (receiver) and 287 (sender). > > The program has 2 threads. The first one reads the completion channel, > validates the Send and Receive Completions, issues ibv_post_recv() and > ibv_post_send(). The second one can only issue ibv_post_send(). 
> > The program makes 2 QP. Increasing the number of QP seemingly does not > increase the probability of failure. > > The program prints . > > In my setup, it seems that if I run two pairs of the program, the > failure occurs sooner. > > Roman > Sorry, I forgot the program... -------------- next part -------------- A non-text attachment was scrubbed... Name: kink.c Type: text/x-csrc Size: 20747 bytes Desc: not available URL: From arlin.r.davis at intel.com Wed Jan 23 16:36:54 2008 From: arlin.r.davis at intel.com (Arlin Davis) Date: Wed, 23 Jan 2008 16:36:54 -0800 Subject: [ofa-general] [PATCH] uDAPL v2: dapltest segfault on RHEL5 using inet_ntoa Message-ID: <000001c85e21$37fe8d20$ff0da8c0@amr.corp.intel.com> dapltest does not include definitions for inet_ntoa. At load time the symbol was resolved, but because the default (implicit) declaration returns int instead of char*, it caused a segfault. Add the correct include files in dapl_mdep_user.h. Signed-off by: Arlin Davis diff --git a/test/dapltest/mdep/linux/dapl_mdep_user.h b/test/dapltest/mdep/linux/dapl_mdep_user.h index 153c8c1..52199d1 100755 --- a/test/dapltest/mdep/linux/dapl_mdep_user.h +++ b/test/dapltest/mdep/linux/dapl_mdep_user.h @@ -43,6 +43,11 @@ #include #include +/* inet_ntoa */ +#include +#include +#include + /* Default Device Name */ #define DT_MdepDeviceName "ofa-v2-ib0" 
From weiny2 at llnl.gov Wed Jan 23 19:12:35 2008 From: weiny2 at llnl.gov (Ira Weiny) Date: Wed, 23 Jan 2008 19:12:35 -0800 Subject: [ofa-general] Is the opensm/scripts/sldd.sh script still needed? Message-ID: <20080123191235.179886dd.weiny2@llnl.gov> I was going through the start up script provided in the opensm package and found a reference to the sldd.sh script. Is this still needed? It still exists in opensm/scripts/sldd.sh but is _not_ in the spec.in file? The opensm/scripts/redhat-opensm.init will try and run this if the "HONORE_GUID2LID_FLAG" is set. So should we add it to the spec file or remove the usage in the start up script? 
Thanks, Ira From kliteyn at mellanox.co.il Wed Jan 23 19:22:30 2008 From: kliteyn at mellanox.co.il (kliteyn at mellanox.co.il) Date: 24 Jan 2008 05:22:30 +0200 Subject: [ofa-general] nightly osm_sim report 2008-01-24:normal completion Message-ID: OSM Simulation Regression Summary [Generated mail - please do NOT reply] OpenSM binary date = 2008-01-23 OpenSM git rev = Sun_Jan_20_19:54:33_2008 [f559875dfdbacb17221042c77aa4c8b9554ff75b] ibutils git rev = Mon_Dec_24_10:42:01_2007 [675bec82306d6920555dd0b5e2f664983e27e60f] Total=400 Pass=397 Fail=3 Pass: 30 Stability IS1-16.topo 30 Pkey IS1-16.topo 30 OsmTest IS1-16.topo 30 OsmStress IS1-16.topo 30 Multicast IS1-16.topo 30 LidMgr IS1-16.topo 10 Stability IS3-loop.topo 10 Stability IS3-128.topo 10 Pkey IS3-128.topo 10 OsmTest IS3-loop.topo 10 OsmTest IS3-128.topo 10 OsmStress IS3-128.topo 10 Multicast IS3-loop.topo 10 Multicast IS3-128.topo 10 LidMgr IS3-128.topo 10 FatTree merge-roots-4-ary-2-tree.topo 10 FatTree gnu-stallion-64.topo 10 FatTree blend-4-ary-2-tree.topo 10 FatTree FullGnu.topo 10 FatTree 4-ary-2-tree.topo 10 FatTree 2-ary-4-tree.topo 10 FatTree 12-node-spaced.topo 10 FTreeFail 4-ary-2-tree-missing-sw-link.topo 10 FTreeFail 4-ary-2-tree-links-at-same-rank-2.topo 10 FTreeFail 4-ary-2-tree-links-at-same-rank-1.topo 10 FTreeFail 4-ary-2-tree-diff-num-pgroups.topo 9 FatTree RhinoDDR.topo 8 FatTree merge-root-4-ary-3-tree.topo Failures: 2 FatTree merge-root-4-ary-3-tree.topo 1 FatTree RhinoDDR.topo From rdreier at cisco.com Wed Jan 23 19:22:44 2008 From: rdreier at cisco.com (Roland Dreier) Date: Wed, 23 Jan 2008 19:22:44 -0800 Subject: [ofa-general] Re: [PATCH 2/3] IB/srp: allow user to control host queue length In-Reply-To: <1198102136.5649.34.camel@lap75545.ornl.gov> (David Dillow's message of "Wed, 19 Dec 2007 17:08:56 -0500") References: <1198102136.5649.34.camel@lap75545.ornl.gov> Message-ID: So thinking about this some more... 
does it make more sense to leave our can_queue alone, use scsi_adjust_queue_depth() to limit the queue to the initial request limit, and then add a .change_queue_depth method so that the user can adjust the queue depth via the standard scsi sysfs entry? - R. From rdreier at cisco.com Wed Jan 23 20:17:50 2008 From: rdreier at cisco.com (Roland Dreier) Date: Wed, 23 Jan 2008 20:17:50 -0800 Subject: [ofa-general] updated Ubuntu package archive, notes on Debian and Fedora Message-ID: I've recently updated my Ubuntu PPA ("personal package archive") to include both libmlx4 and librdmacm and have builds for both gutsy and hardy. To use the packages, just add the following to your /etc/apt/sources.list for hardy: deb http://ppa.launchpad.net/roland.dreier/ubuntu hardy main deb-src http://ppa.launchpad.net/roland.dreier/ubuntu hardy main or the following for gutsy: deb http://ppa.launchpad.net/roland.dreier/ubuntu gutsy main deb-src http://ppa.launchpad.net/roland.dreier/ubuntu gutsy main as far as other distributions go... libmlx4 packages have already made it into the Debian archive, including the testing/lenny distribution. I have started the process to get librdmacm packages into Debian (the ITP bug is http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=462348), so packages should appear in the not-too-distant future. I will create Fedora packages for librdmacm soon; my libmlx4 packages for Fedora are still waiting for review (https://bugzilla.redhat.com/show_bug.cgi?id=409511). - R. From rdreier at cisco.com Wed Jan 23 20:20:43 2008 From: rdreier at cisco.com (Roland Dreier) Date: Wed, 23 Jan 2008 20:20:43 -0800 Subject: [ofa-general] Re: why does ibv_device_attr.*_guid are in network order? 
In-Reply-To: <4797052D.4070502@dev.mellanox.co.il> (Dotan Barak's message of "Wed, 23 Jan 2008 11:13:17 +0200") References: <4797052D.4070502@dev.mellanox.co.il> Message-ID: > I noticed that the following attributes in ibv_device_attr are in > network order: > node_guid > sys_image_guid > > What is the reason for this? It's hard for me to see any reason to byte-swap them. GUIDs are really byte-strings and it seems sensible to keep them into network order, since you're not likely to do any arithmetic with them. Another point in favor of network byte order is that GIDs are IPv6 addresses, and so they should be stored in network byte order by the verbs just as IP addresses are kept in network byte order everywhere else. Given that GIDs are in network byte order, I think it would be very confusing and error-prone if GUIDs were in host byte order. From dillowda at ornl.gov Wed Jan 23 20:24:07 2008 From: dillowda at ornl.gov (David Dillow) Date: Wed, 23 Jan 2008 23:24:07 -0500 Subject: [ofa-general] Re: [PATCH 2/3] IB/srp: allow user to control host queue length In-Reply-To: References: <1198102136.5649.34.camel@lap75545.ornl.gov> Message-ID: <1201148647.13982.27.camel@obelisk.thedillows.org> On Wed, 2008-01-23 at 19:22 -0800, Roland Dreier wrote: > So thinking about this some more... does it make more sense to leave > our can_queue alone, use scsi_adjust_queue_depth() to limit the queue > to the initial request limit, I don't think we can use scsi_adjust_queue_depth() -- that's used per scsi_device (ie, LUN), and is not the host's overall queue depth. When not using tagged queuing, it looks to be equivalent to cmds_per_lun. > and then add a .change_queue_depth > method so that the user can adjust the queue depth via the standard > scsi sysfs entry? I like the idea of being able to tune this on the fly via sysfs, but I'm not sure what to call it, since the mid-layer already registers a "can_queue" file which is read-only. 
Other than that, it looks possible to just update can_queue at will, as long as we make sure it is >= 1 and we can accept a few extra commands in flight until the queue drains below the new setting. Ideas? There's another reason to have some ability to change can_queue, either dynamically or at add-target time -- there are some devices out there that report a slightly inflated initial credit limit, which lets us drive into unspecified territory. I expect there to be a fix released soon, but not everyone will be able to upgrade, and those who cannot will need to limit the queue length manually to stay out of trouble. -- Dave Dillow National Center for Computational Science Oak Ridge National Laboratory (865) 241-6602 office From sean.hefty at intel.com Wed Jan 23 21:32:28 2008 From: sean.hefty at intel.com (Sean Hefty) Date: Wed, 23 Jan 2008 21:32:28 -0800 Subject: [ofa-general] updated Ubuntu package archive, notes on Debian and Fedora In-Reply-To: References: Message-ID: <000001c85e4a$82d7a2e0$63e0180a@amr.corp.intel.com> thanks for adding the librdmacm packages From dotanb at dev.mellanox.co.il Wed Jan 23 22:10:20 2008 From: dotanb at dev.mellanox.co.il (Dotan Barak) Date: Thu, 24 Jan 2008 08:10:20 +0200 Subject: [ofa-general] Re: why does ibv_device_attr.*_guid are in network order? In-Reply-To: References: <4797052D.4070502@dev.mellanox.co.il> Message-ID: <47982BCC.8010404@dev.mellanox.co.il> Roland Dreier wrote: > > I noticed that the following attributes in ibv_device_attr are in > > network order: > > node_guid > > sys_image_guid > > > > What is the reason for this? > > It's hard for me to see any reason to byte-swap them. GUIDs are > really byte-strings and it seems sensible to keep them into network > order, since you're not likely to do any arithmetic with them. 
> Another point in favor of network byte order is that GIDs are IPv6 > addresses, and so they should be stored in network byte order by the > verbs just as IP addresses are kept in network byte order everywhere > else. Given that GIDs are in network byte order, I think it would be > very confusing and error-prone if GUIDs were in host byte order. > > I wanted to clarify this issue, thanks. I will add to the man page that those attributes are in network order to prevent any confusion in the future ... (i tried to use them and noticed that they are not in host order in the first place ...) thanks again Dotan From mashirle at us.ibm.com Wed Jan 23 13:00:11 2008 From: mashirle at us.ibm.com (Shirley Ma) Date: Wed, 23 Jan 2008 13:00:11 -0800 Subject: [ofa-general] Re: [PATCH] fix for an IPoIB compile error In-Reply-To: <1201093871.9739.5.camel@localhost.localdomain> References: <1201025321.756.33.camel@localhost.localdomain> <1201102346.6925.215.camel@mtls03> <1201068498.756.59.camel@localhost.localdomain> <4e6a6b3c0801231354v24baa6d8q5034a1cdaf3c704f@mail.gmail.com> <1201093871.9739.5.camel@localhost.localdomain> Message-ID: <1201122011.9739.8.camel@localhost.localdomain> Recreated the patch against 2.6.25 branch git tree. 
Thanks Shirley diff --git a/drivers/infiniband/ulp/ipoib/ipoib_main.c b/drivers/infiniband/ulp/ipoib/ipoib_main.c index a082466..d116854 100644 --- a/drivers/infiniband/ulp/ipoib/ipoib_main.c +++ b/drivers/infiniband/ulp/ipoib/ipoib_main.c @@ -181,6 +181,7 @@ static int ipoib_change_mtu(struct net_device *dev, int new_mtu) { struct ipoib_dev_priv *priv = netdev_priv(dev); +#ifdef CONFIG_INFINIBAND_IPOIB_CM /* dev->mtu > 2K ==> connected mode */ if (ipoib_cm_admin_enabled(dev)) { if (new_mtu > ipoib_cm_max_mtu(dev)) @@ -193,6 +194,7 @@ static int ipoib_change_mtu(struct net_device *dev, int new_mtu) dev->mtu = new_mtu; return 0; } +#endif if (new_mtu > IPOIB_PACKET_SIZE - IPOIB_ENCAP_LEN) return -EINVAL; From mashirle at us.ibm.com Wed Jan 23 13:09:56 2008 From: mashirle at us.ibm.com (Shirley Ma) Date: Wed, 23 Jan 2008 13:09:56 -0800 Subject: [ofa-general] [RFC] IPoIB UD 4K MTU support In-Reply-To: <4e6a6b3c0801231354v24baa6d8q5034a1cdaf3c704f@mail.gmail.com> References: <1201025321.756.33.camel@localhost.localdomain> <1201102346.6925.215.camel@mtls03> <1201068498.756.59.camel@localhost.localdomain> <4e6a6b3c0801231354v24baa6d8q5034a1cdaf3c704f@mail.gmail.com> Message-ID: <1201122596.9739.10.camel@localhost.localdomain> Hello Eli, Here is the patch against Roland's for-2.6.25 git tree. Please let me know if there are any problems. Thanks for reviewing this patch. 
Shirley Signed-off-by Shirley Ma diff --git a/drivers/infiniband/ulp/ipoib/ipoib.h b/drivers/infiniband/ulp/ipoib/ipoib.h index fe250c6..af11e2c 100644 --- a/drivers/infiniband/ulp/ipoib/ipoib.h +++ b/drivers/infiniband/ulp/ipoib/ipoib.h @@ -56,9 +56,6 @@ /* constants */ enum { - IPOIB_PACKET_SIZE = 2048, - IPOIB_BUF_SIZE = IPOIB_PACKET_SIZE + IB_GRH_BYTES, - IPOIB_ENCAP_LEN = 4, IPOIB_CM_MTU = 0x10000 - 0x10, /* padding to align header to 16 */ @@ -319,6 +316,7 @@ struct ipoib_dev_priv { struct dentry *mcg_dentry; struct dentry *path_dentry; #endif + unsigned int max_ib_mtu; }; struct ipoib_ah { @@ -424,6 +422,13 @@ int ipoib_mcast_stop_thread(struct net_device *dev, int flush); void ipoib_mcast_dev_down(struct net_device *dev); void ipoib_mcast_dev_flush(struct net_device *dev); +/* padding packet to fit one page size for 4K IB mtu */ +static inline int ipoib_ud_mtu(unsigned int ib_mtu) +{ + return (ib_mtu < 4096) ? (ib_mtu - IPOIB_ENCAP_LEN) : + (ib_mtu - IB_GRH_BYTES - IPOIB_ENCAP_LEN - 4); +} + #ifdef CONFIG_INFINIBAND_IPOIB_DEBUG struct ipoib_mcast_iter *ipoib_mcast_iter_init(struct net_device *dev); int ipoib_mcast_iter_next(struct ipoib_mcast_iter *iter); diff --git a/drivers/infiniband/ulp/ipoib/ipoib_ib.c b/drivers/infiniband/ulp/ipoib/ipoib_ib.c index 52bc2bd..d888a47 100644 --- a/drivers/infiniband/ulp/ipoib/ipoib_ib.c +++ b/drivers/infiniband/ulp/ipoib/ipoib_ib.c @@ -87,6 +87,15 @@ void ipoib_free_ah(struct kref *kref) spin_unlock_irqrestore(&priv->lock, flags); } +static int ipoib_ud_buf_size(unsigned int max_ib_mtu) +{ + if (max_ib_mtu < 4096) + return (max_ib_mtu + IB_GRH_BYTES); + else + /* padding packet to one page for 4K mtu */ + return (max_ib_mtu - 4); +} + static int ipoib_ib_post_receive(struct net_device *dev, int id) { struct ipoib_dev_priv *priv = netdev_priv(dev); @@ -96,7 +105,7 @@ static int ipoib_ib_post_receive(struct net_device *dev, int id) int ret; list.addr = priv->rx_ring[id].mapping; - list.length = IPOIB_BUF_SIZE; + 
list.length = ipoib_ud_buf_size(priv->max_ib_mtu); list.lkey = priv->mr->lkey; param.next = NULL; @@ -108,7 +117,7 @@ static int ipoib_ib_post_receive(struct net_device *dev, int id) if (unlikely(ret)) { ipoib_warn(priv, "receive failed for buf %d (%d)\n", id, ret); ib_dma_unmap_single(priv->ca, priv->rx_ring[id].mapping, - IPOIB_BUF_SIZE, DMA_FROM_DEVICE); + ipoib_ud_buf_size(priv->max_ib_mtu), DMA_FROM_DEVICE); dev_kfree_skb_any(priv->rx_ring[id].skb); priv->rx_ring[id].skb = NULL; } @@ -122,7 +131,7 @@ static int ipoib_alloc_rx_skb(struct net_device *dev, int id) struct sk_buff *skb; u64 addr; - skb = dev_alloc_skb(IPOIB_BUF_SIZE + 4); + skb = dev_alloc_skb(ipoib_ud_buf_size(priv->max_ib_mtu) + 4); if (!skb) return -ENOMEM; @@ -133,7 +142,7 @@ static int ipoib_alloc_rx_skb(struct net_device *dev, int id) */ skb_reserve(skb, 4); - addr = ib_dma_map_single(priv->ca, skb->data, IPOIB_BUF_SIZE, + addr = ib_dma_map_single(priv->ca, skb->data, ipoib_ud_buf_size(priv->max_ib_mtu), DMA_FROM_DEVICE); if (unlikely(ib_dma_mapping_error(priv->ca, addr))) { dev_kfree_skb_any(skb); @@ -190,7 +199,7 @@ static void ipoib_ib_handle_rx_wc(struct net_device *dev, struct ib_wc *wc) "(status=%d, wrid=%d vend_err %x)\n", wc->status, wr_id, wc->vendor_err); ib_dma_unmap_single(priv->ca, addr, - IPOIB_BUF_SIZE, DMA_FROM_DEVICE); + ipoib_ud_buf_size(priv->max_ib_mtu), DMA_FROM_DEVICE); dev_kfree_skb_any(skb); priv->rx_ring[wr_id].skb = NULL; return; @@ -215,7 +224,7 @@ static void ipoib_ib_handle_rx_wc(struct net_device *dev, struct ib_wc *wc) ipoib_dbg_data(priv, "received %d bytes, SLID 0x%04x\n", wc->byte_len, wc->slid); - ib_dma_unmap_single(priv->ca, addr, IPOIB_BUF_SIZE, DMA_FROM_DEVICE); + ib_dma_unmap_single(priv->ca, addr, ipoib_ud_buf_size(priv->max_ib_mtu), DMA_FROM_DEVICE); skb_put(skb, wc->byte_len); skb_pull(skb, IB_GRH_BYTES); @@ -632,7 +641,7 @@ int ipoib_ib_dev_stop(struct net_device *dev, int flush) continue; ib_dma_unmap_single(priv->ca, rx_req->mapping, - 
IPOIB_BUF_SIZE, + ipoib_ud_buf_size(priv->max_ib_mtu), DMA_FROM_DEVICE); dev_kfree_skb_any(rx_req->skb); rx_req->skb = NULL; diff --git a/drivers/infiniband/ulp/ipoib/ipoib_main.c b/drivers/infiniband/ulp/ipoib/ipoib_main.c index a082466..8a994f3 100644 --- a/drivers/infiniband/ulp/ipoib/ipoib_main.c +++ b/drivers/infiniband/ulp/ipoib/ipoib_main.c @@ -194,7 +194,7 @@ static int ipoib_change_mtu(struct net_device *dev, int new_mtu) return 0; } - if (new_mtu > IPOIB_PACKET_SIZE - IPOIB_ENCAP_LEN) + if (new_mtu > ipoib_ud_mtu(priv->max_ib_mtu)) return -EINVAL; priv->admin_mtu = new_mtu; @@ -969,7 +969,7 @@ static void ipoib_setup(struct net_device *dev) dev->features = NETIF_F_VLAN_CHALLENGED | NETIF_F_LLTX; /* MTU will be reset when mcast join happens */ - dev->mtu = IPOIB_PACKET_SIZE - IPOIB_ENCAP_LEN; + dev->mtu = ipoib_ud_mtu(priv->max_ib_mtu); priv->mcast_mtu = priv->admin_mtu = dev->mtu; memcpy(dev->broadcast, ipv4_bcast_addr, INFINIBAND_ALEN); @@ -1103,6 +1103,7 @@ static struct net_device *ipoib_add_port(const char *format, struct ib_device *hca, u8 port) { struct ipoib_dev_priv *priv; + struct ib_port_attr attr; int result = -ENOMEM; priv = ipoib_intf_alloc(format); @@ -1111,6 +1112,13 @@ static struct net_device *ipoib_add_port(const char *format, SET_NETDEV_DEV(priv->dev, hca->dma_device); + if (!ib_query_port(hca, port, &attr)) + priv->max_ib_mtu = ib_mtu_enum_to_int(attr.max_mtu); + else { + printk(KERN_WARNING "%s: ib_query_port %d failed\n", + hca->name, port); + goto device_init_failed; + } result = ib_query_pkey(hca, port, 0, &priv->pkey); if (result) { printk(KERN_WARNING "%s: ib_query_pkey port %d failed (ret = %d)\n", diff --git a/drivers/infiniband/ulp/ipoib/ipoib_multicast.c b/drivers/infiniband/ulp/ipoib/ipoib_multicast.c index 2628339..0661e87 100644 --- a/drivers/infiniband/ulp/ipoib/ipoib_multicast.c +++ b/drivers/infiniband/ulp/ipoib/ipoib_multicast.c @@ -567,9 +567,7 @@ void ipoib_mcast_join_task(struct work_struct *work) return; } - 
priv->mcast_mtu = ib_mtu_enum_to_int(priv->broadcast->mcmember.mtu) - - IPOIB_ENCAP_LEN; - + priv->mcast_mtu = ipoib_ud_mtu(ib_mtu_enum_to_int(priv->broadcast->mcmember.mtu)); if (!ipoib_cm_admin_enabled(dev)) dev->mtu = min(priv->mcast_mtu, priv->admin_mtu); From bart.vanassche at gmail.com Wed Jan 23 23:13:20 2008 From: bart.vanassche at gmail.com (Bart Van Assche) Date: Thu, 24 Jan 2008 08:13:20 +0100 Subject: [ofa-general] InfiniBand/RDMA merge plans for 2.6.25 In-Reply-To: References: Message-ID: On Jan 18, 2008 1:11 AM, Roland Dreier wrote: > Anyway, here are all the pending things that I'm aware of. As usual, > if something isn't already in my tree and isn't listed below, I > probably missed it or dropped it by mistake. Please remind me again > in that case. Are there any plans to merge the SDP (Sockets Direct Protocol) implementation ? Bart. From mashirle at us.ibm.com Wed Jan 23 13:41:56 2008 From: mashirle at us.ibm.com (Shirley Ma) Date: Wed, 23 Jan 2008 13:41:56 -0800 Subject: [ofa-general] Re: [PATCH] fix for an IPoIB compile error In-Reply-To: <1201122011.9739.8.camel@localhost.localdomain> References: <1201025321.756.33.camel@localhost.localdomain> <1201102346.6925.215.camel@mtls03> <1201068498.756.59.camel@localhost.localdomain> <4e6a6b3c0801231354v24baa6d8q5034a1cdaf3c704f@mail.gmail.com> <1201093871.9739.5.camel@localhost.localdomain> <1201122011.9739.8.camel@localhost.localdomain> Message-ID: <1201124516.9739.13.camel@localhost.localdomain> Hello Roland, I just noticed it's being fixed in for-2.6.25 but not OFED-1.3 git tree. Can someone integrate the fix from for-2.6.25 to OFED-1.3? 
Thanks Shirley From HNGUYEN at de.ibm.com Thu Jan 24 00:26:10 2008 From: HNGUYEN at de.ibm.com (Hoang-Nam Nguyen) Date: Thu, 24 Jan 2008 09:26:10 +0100 Subject: [ofa-general] [PATCH] fix for an IPoIB compile error In-Reply-To: <1201093871.9739.5.camel@localhost.localdomain> Message-ID: > Here is a trivial patch to fix below IPoIB compile error when CM is not > configured. The patch is against OFED-1.3 kernel git tree. > > drivers/infiniband/ulp/ipoib/ipoib_main.c: In function > ‘ipoib_change_mtu’: > drivers/infiniband/ulp/ipoib/ipoib_main.c:186: error: ‘struct > ipoib_dev_priv’ has no member named ‘cm’ Look at this "[PATCH] IB/ipoib: Fix undefined symbol (priv->cm) if ipoib_cm disabled" http://lkml.org/lkml/2008/1/16/287 Nam From vlad at lists.openfabrics.org Thu Jan 24 03:09:57 2008 From: vlad at lists.openfabrics.org (Vladimir Sokolovsky Mellanox) Date: Thu, 24 Jan 2008 03:09:57 -0800 (PST) Subject: [ofa-general] ofa_1_3_kernel 20080124-0200 daily build status Message-ID: <20080124110957.0666AE6004E@openfabrics.org> This email was generated automatically, please do not reply git_url: git://git.openfabrics.org/ofed_1_3/linux-2.6.git git_branch: ofed_kernel Common build parameters: --with-ipoib-mod --with-sdp-mod --with-srp-mod --with-user_mad-mod --with-user_access-mod --with-mthca-mod --with-mlx4-mod --with-core-mod --with-addr_trans-mod --with-rds-mod --with-cxgb3-mod --with-nes-mod Passed: Passed on i686 with 2.6.15-23-server Passed on i686 with linux-2.6.22 Passed on i686 with linux-2.6.21.1 Passed on i686 with linux-2.6.18 Passed on i686 with linux-2.6.17 Passed on i686 with linux-2.6.15 Passed on i686 with linux-2.6.16 Passed on i686 with linux-2.6.13 Passed on i686 with linux-2.6.14 
Passed on i686 with linux-2.6.19 Passed on i686 with linux-2.6.12 Passed on ia64 with linux-2.6.18 Passed on ia64 with linux-2.6.15 Passed on ia64 with linux-2.6.13 Passed on ia64 with linux-2.6.19 Passed on x86_64 with linux-2.6.22.5-31-default Passed on ia64 with linux-2.6.12 Passed on x86_64 with linux-2.6.21.1 Passed on powerpc with linux-2.6.13 Passed on x86_64 with linux-2.6.18-8.el5 Passed on ia64 with linux-2.6.17 Passed on x86_64 with linux-2.6.17 Passed on ia64 with linux-2.6.14 Passed on ia64 with linux-2.6.22 Passed on powerpc with linux-2.6.12 Passed on x86_64 with linux-2.6.16 Passed on ia64 with linux-2.6.21.1 Passed on ppc64 with linux-2.6.16 Passed on ppc64 with linux-2.6.14 Passed on ppc64 with linux-2.6.12 Passed on x86_64 with linux-2.6.18-53.el5 Passed on ia64 with linux-2.6.23 Passed on x86_64 with linux-2.6.9-55.ELsmp Passed on x86_64 with linux-2.6.18 Passed on ppc64 with linux-2.6.18-8.el5 Passed on ppc64 with linux-2.6.15 Passed on powerpc with linux-2.6.14 Passed on x86_64 with linux-2.6.13 Passed on x86_64 with linux-2.6.15 Passed on powerpc with linux-2.6.15 Passed on x86_64 with linux-2.6.12 Passed on x86_64 with linux-2.6.20 Passed on x86_64 with linux-2.6.14 Passed on x86_64 with linux-2.6.9-42.ELsmp Passed on ia64 with linux-2.6.16 Passed on ppc64 with linux-2.6.13 Passed on x86_64 with linux-2.6.19 Passed on ppc64 with linux-2.6.19 Passed on ppc64 with linux-2.6.17 Passed on ppc64 with linux-2.6.18 Passed on x86_64 with linux-2.6.18-1.2798.fc6 Passed on x86_64 with linux-2.6.16.43-0.3-smp Passed on x86_64 with linux-2.6.16.21-0.8-smp Passed on x86_64 with linux-2.6.22 Passed on ia64 with linux-2.6.16.21-0.8-default Failed: From jackm at dev.mellanox.co.il Thu Jan 24 04:29:34 2008 From: jackm at dev.mellanox.co.il (Jack Morgenstein) Date: Thu, 24 Jan 2008 14:29:34 +0200 Subject: [ofa-general] [PATCH v2] libmlx4: avoid memcpy in blueflame post_sends In-Reply-To: References: <200801091223.14155.jackm@dev.mellanox.co.il> Message-ID: 
<200801241429.35292.jackm@dev.mellanox.co.il> Do not use memcpy when copying to the BlueFlame buffer. memcpy implementations may use move-string-buffer (byte-wise copy) assembler instructions, which do not guarantee copy order into the blueflame buffer. NOTE: This is Roland's version of the fix. Signed-off-by: Jack Morgenstein --- Roland, I notice that you have not yet had a chance to apply this patch (your version, except that I changed sizeof (long) to sizeof (unsigned long) ). I'm just posting it "officially". - Jack diff --git a/src/qp.c b/src/qp.c index bced740..4322513 100644 --- a/src/qp.c +++ b/src/qp.c @@ -168,6 +168,20 @@ static void set_data_seg(struct mlx4_wqe_data_seg *dseg, struct ibv_sge *sg) dseg->byte_count = htonl(sg->length); } +/* + * Avoid using memcpy() to copy to BlueFlame page, since memcpy() + * implementations may use move-string-buffer assembler instructions, + * which do not guarantee order of copying. + */ +static void mlx4_bf_copy(unsigned long *dst, unsigned long *src, unsigned bytecnt) +{ + while (bytecnt > 0) { + *dst++ = *src++; + *dst++ = *src++; + bytecnt -= 2 * sizeof (unsigned long); + } +} + int mlx4_post_send(struct ibv_qp *ibqp, struct ibv_send_wr *wr, struct ibv_send_wr **bad_wr) { @@ -391,7 +405,8 @@ out: pthread_spin_lock(&ctx->bf_lock); - memcpy(ctx->bf_page + ctx->bf_offset, ctrl, align(size * 16, 64)); + mlx4_bf_copy(ctx->bf_page + ctx->bf_offset, (unsigned long *) ctrl, + align(size * 16, 64)); wc_wmb(); ctx->bf_offset ^= ctx->bf_buf_size; 
From dotanb at dev.mellanox.co.il Thu Jan 24 05:43:09 2008 From: dotanb at dev.mellanox.co.il (Dotan Barak) Date: Thu, 24 Jan 2008 15:43:09 +0200 Subject: [ofa-general] Re: why does ibv_device_attr.*_guid are in network order? In-Reply-To: References: <4797052D.4070502@dev.mellanox.co.il> Message-ID: <479895ED.2040606@dev.mellanox.co.il> Roland Dreier wrote: > > I noticed that the following attributes in ibv_device_attr are in > > network order: > > node_guid > > sys_image_guid > > > > What is the reason for this? > > It's hard for me to see any reason to byte-swap them. GUIDs are > really byte-strings and it seems sensible to keep them into network > order, since you're not likely to do any arithmetic with them. > Another point in favor of network byte order is that GIDs are IPv6 > addresses, and so they should be stored in network byte order by the > verbs just as IP addresses are kept in network byte order everywhere > else. Given that GIDs are in network byte order, I think it would be > very confusing and error-prone if GUIDs were in host byte order. > > I don't know if this mail thread is the place for it, but why does ibv_query_pkey return the pkey value in network order? thanks Dotan From gstreiff at NetEffect.com Thu Jan 24 05:54:36 2008 From: gstreiff at NetEffect.com (Glenn Streiff) Date: Thu, 24 Jan 2008 07:54:36 -0600 Subject: [ofa-general] RE: InfiniBand/RDMA merge plans for 2.6.25 In-Reply-To: Message-ID: <5E701717F2B2ED4EA60F87C8AA57B7CC0794FEB4@venom2> > From: Roland Dreier [mailto:rdreier at cisco.com] > To: Christoph Hellwig; Glenn Streiff > > Rather than arguing over whether we have to have sparse clean code, I > decided to annotate the code myself. Here's a patch that fixes most > of the sparse warnings in the nes driver. There's still some stuff > that actually looks buggy, like the way hte_index stuff is handled. 
> You initialize hte_index_mask as: > > hte_index_mask = ((u32)1 << ((u32temp & 0x001f)+1))-1; > nesadapter->hte_index_mask = hte_index_mask; > > but then compute hte_index stuff with: > > nesqp->hte_index = cpu_to_be32( > crc32c(~0, (void *)&nes_quad, > sizeof(nes_quad)) ^ 0xffffffff); > > and then do: > > nesqp->hte_index &= nesadapter->hte_index_mask; > > which seems odd to say the least (hte_index is big-endian, > hte_index_mask is cpu-endian). > > And also, there's code with the loc_addr/rem_addr etc that seem very > confused. For example > > cm_info->loc_addr = htonl(cm_info->loc_addr); > cm_info->rem_addr = htonl(cm_info->rem_addr); > cm_info->loc_port = htons(cm_info->loc_port); > cm_info->rem_port = htons(cm_info->rem_port); > > which is obviously impossible to annotate correctly, and I couldn't > keep track of the endianness stuff elsewhere. Thanks for the additional review and patch. I take your point. The part is little endian and the driver is functional for little and big endian platforms. There may have been some expedience with the declarations there. I think it can be improved. Let me take it up with the person who wrote that code. Also, I want everyone to understand that my skill set is weighted more towards build/install/config. And I guess I'll be patch wrangling as well. So I'll rely on input from my developers for issues that drill down or I'll have them post directly. I respect the work you guys do. For now, let me get some qa cycles with your patch across x86_64 and power (and probably a couple others). Regards, Glenn > > Anyway this is what I have in case the promised cleanups don't turn up > in time... 
> > Signed-off-by: Roland Dreier > > From eli at dev.mellanox.co.il Thu Jan 24 06:18:09 2008 From: eli at dev.mellanox.co.il (Eli Cohen) Date: Thu, 24 Jan 2008 16:18:09 +0200 Subject: [ofa-general] Bogus Receive Completions In-Reply-To: <4797DD87.8020808@dls.net> References: <475607AA.301@dls.net> <478B8172.2010104@mellanox.co.il> <1200988162.6925.170.camel@mtls03> <4797A6FD.2080708@dls.net> <4e6a6b3c0801231412t7970ce83u9a2bb9aa11a1b2cd@mail.gmail.com> <4797CDC3.6090707@dls.net> <4797DC47.6060104@dls.net> <4797DD87.8020808@dls.net> Message-ID: <1201184289.6755.7.camel@mtls03> I am re-sending the patches again, this time with the bug of being unable to handle QPs whose receive size is not a power of two fixed. Roman, can you tell whether from your perspective there has been an improvement in stability with the latest patches? From eli at mellanox.co.il Thu Jan 24 06:23:48 2008 From: eli at mellanox.co.il (Eli Cohen) Date: Thu, 24 Jan 2008 16:23:48 +0200 Subject: [ofa-general] [PATCH] ib/limthca: Remove an always true condition Message-ID: <1201184628.6755.9.camel@mtls03> Remove an always true condition srq->first_free can never be negative. 
Signed-off-by: Eli Cohen --- src/srq.c | 6 +----- 1 files changed, 1 insertions(+), 5 deletions(-) diff --git a/src/srq.c b/src/srq.c index f9fc006..72b7a0e 100644 --- a/src/srq.c +++ b/src/srq.c @@ -66,11 +66,7 @@ void mthca_free_srq_wqe(struct mthca_srq *srq, int ind) { pthread_spin_lock(&srq->lock); - if (srq->first_free >= 0) - *wqe_to_link(get_wqe(srq, srq->last_free)) = ind; - else - srq->first_free = ind; - + *wqe_to_link(get_wqe(srq, srq->last_free)) = ind; *wqe_to_link(get_wqe(srq, ind)) = -1; srq->last_free = ind; -- 1.5.3.8 From eli at mellanox.co.il Thu Jan 24 06:27:11 2008 From: eli at mellanox.co.il (Eli Cohen) Date: Thu, 24 Jan 2008 16:27:11 +0200 Subject: [ofa-general] [PATCH] IB/libmthca: Pre link receive WQEs in Tavor mode Message-ID: <1201184831.6755.14.camel@mtls03> Pre link receive WQEs in Tavor mode Tavor mode requires that each WQE in a posted list of receive WQEs will have a valid NDA field. This requirement holds true for regular QPs as well as for SRQs. This patch prelinks the receive queue in a regular QP and keeps the free list in SRQ always properly linked. 
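As a rough illustration of the prelinking described above, here is a minimal standalone sketch. It is not the libmthca code: demo_next_seg, demo_get_wqe, demo_prelink_ring and demo_read_nda are hypothetical names, and the two-field layout is a simplifying assumption. It only shows the idea of giving every receive WQE a valid NDA that points at its successor, with the last entry wrapping back to the first. The modulo is used so that a queue size that is not a power of two still wraps correctly.

```c
#include <arpa/inet.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the "next" segment of a WQE. */
struct demo_next_seg {
	uint32_t nda_op;	/* big-endian: (byte offset of next WQE) | 1 */
	uint32_t ee_nds;
};

/* WQE number i in a buffer made of (1 << wqe_shift)-byte WQEs. */
static struct demo_next_seg *demo_get_wqe(void *buf, int i, int wqe_shift)
{
	return (struct demo_next_seg *)((uint8_t *)buf + ((size_t)i << wqe_shift));
}

/* Link every WQE to its successor once, at allocation time. */
static void demo_prelink_ring(void *buf, int max, int wqe_shift)
{
	for (int i = 0; i < max; ++i) {
		/* Successor index wraps to 0 after the last WQE. */
		uint32_t next_off = (uint32_t)((i + 1) % max) << wqe_shift;

		/* Low bit set, mirroring the "| 1" used in the patch. */
		demo_get_wqe(buf, i, wqe_shift)->nda_op = htonl(next_off | 1);
	}
}

/* Read back the NDA field of WQE i in host byte order. */
static uint32_t demo_read_nda(void *buf, int i, int wqe_shift)
{
	return ntohl(demo_get_wqe(buf, i, wqe_shift)->nda_op);
}
```

With the ring linked up front, the post-receive path in this sketch would no longer need to write nda_op at all, which matches the lines the patch removes.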
Signed-off-by: Eli Cohen Reviewed-by: Jack Morgenstein --- src/qp.c | 14 ++++++++------ src/srq.c | 24 +++++++++++++++--------- 2 files changed, 23 insertions(+), 15 deletions(-) diff --git a/src/qp.c b/src/qp.c index 841e316..3c5f049 100644 --- a/src/qp.c +++ b/src/qp.c @@ -360,7 +360,6 @@ int mthca_tavor_post_recv(struct ibv_qp *ibqp, struct ibv_recv_wr *wr, prev_wqe = qp->rq.last; qp->rq.last = wqe; - ((struct mthca_next_seg *) wqe)->nda_op = 0; ((struct mthca_next_seg *) wqe)->ee_nds = htonl(MTHCA_NEXT_DBD); ((struct mthca_next_seg *) wqe)->flags = @@ -388,9 +387,6 @@ int mthca_tavor_post_recv(struct ibv_qp *ibqp, struct ibv_recv_wr *wr, qp->wrid[ind + qp->sq.max] = wr->wr_id; - ((struct mthca_next_seg *) prev_wqe)->nda_op = - htonl((ind << qp->rq.wqe_shift) | 1); - wmb(); ((struct mthca_next_seg *) prev_wqe)->ee_nds = htonl(MTHCA_NEXT_DBD | size); @@ -786,6 +782,8 @@ int mthca_alloc_qp_buf(struct ibv_pd *pd, struct ibv_qp_cap *cap, { int size; int max_sq_sge; + struct mthca_next_seg *next; + int i; qp->rq.max_gs = cap->max_recv_sge; qp->sq.max_gs = cap->max_send_sge; @@ -860,9 +858,7 @@ int mthca_alloc_qp_buf(struct ibv_pd *pd, struct ibv_qp_cap *cap, memset(qp->buf.buf, 0, qp->buf_size); if (mthca_is_memfree(pd->context)) { - struct mthca_next_seg *next; struct mthca_data_seg *scatter; - int i; uint32_t sz; sz = htonl((sizeof (struct mthca_next_seg) + @@ -886,6 +882,12 @@ int mthca_alloc_qp_buf(struct ibv_pd *pd, struct ibv_qp_cap *cap, qp->sq.wqe_shift) + qp->send_wqe_offset); } + } else { + for (i = 0; i < qp->rq.max; ++i) { + next = get_recv_wqe(qp, i); + next->nda_op = htonl((((i + 1) % qp->rq.max) << + qp->rq.wqe_shift) | 1); + } } qp->sq.last = get_send_wqe(qp, qp->sq.max - 1); diff --git a/src/srq.c b/src/srq.c index 72b7a0e..1d326b8 100644 --- a/src/srq.c +++ b/src/srq.c @@ -64,9 +64,13 @@ static inline int *wqe_to_link(void *wqe) void mthca_free_srq_wqe(struct mthca_srq *srq, int ind) { + struct mthca_next_seg *last_free; + 
pthread_spin_lock(&srq->lock); - *wqe_to_link(get_wqe(srq, srq->last_free)) = ind; + last_free = get_wqe(srq, srq->last_free); + *wqe_to_link(last_free) = ind; + last_free->nda_op = htonl((ind << srq->wqe_shift) | 1); *wqe_to_link(get_wqe(srq, ind)) = -1; srq->last_free = ind; @@ -113,7 +117,6 @@ int mthca_tavor_post_srq_recv(struct ibv_srq *ibsrq, prev_wqe = srq->last; srq->last = wqe; - ((struct mthca_next_seg *) wqe)->nda_op = 0; ((struct mthca_next_seg *) wqe)->ee_nds = 0; /* flags field will always remain 0 */ @@ -142,9 +145,6 @@ int mthca_tavor_post_srq_recv(struct ibv_srq *ibsrq, ((struct mthca_data_seg *) wqe)->addr = 0; } - ((struct mthca_next_seg *) prev_wqe)->nda_op = - htonl((ind << srq->wqe_shift) | 1); - wmb(); ((struct mthca_next_seg *) prev_wqe)->ee_nds = htonl(MTHCA_NEXT_DBD); @@ -218,8 +218,6 @@ int mthca_arbel_post_srq_recv(struct ibv_srq *ibsrq, break; } - ((struct mthca_next_seg *) wqe)->nda_op = - htonl((next_ind << srq->wqe_shift) | 1); ((struct mthca_next_seg *) wqe)->ee_nds = 0; /* flags field will always remain 0 */ @@ -302,9 +300,17 @@ int mthca_alloc_srq_buf(struct ibv_pd *pd, struct ibv_srq_attr *attr, */ for (i = 0; i < srq->max; ++i) { - wqe = get_wqe(srq, i); + struct mthca_next_seg *next; - *wqe_to_link(wqe) = i < srq->max - 1 ? i + 1 : -1; + next = wqe = get_wqe(srq, i); + + if (i < srq->max - 1) { + *wqe_to_link(wqe) = i + 1; + next->nda_op = htonl(((i + 1) << srq->wqe_shift) | 1); + } else { + *wqe_to_link(wqe) = -1; + next->nda_op = 0; + } for (scatter = wqe + sizeof (struct mthca_next_seg); (void *) scatter < wqe + (1 << srq->wqe_shift); -- 1.5.3.8 From eli at mellanox.co.il Thu Jan 24 06:33:40 2008 From: eli at mellanox.co.il (Eli Cohen) Date: Thu, 24 Jan 2008 16:33:40 +0200 Subject: [ofa-general] [PATCH] IB/ib_mthca: Remove an always true condition Message-ID: <1201185220.6755.17.camel@mtls03> Remove an always true condition srq->first_free can never be negative. 
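The simplification relies on the free list never being empty. Below is a minimal sketch of the resulting tail append, using a plain index array instead of real WQEs. demo_srq, demo_free_wqe and the link array are hypothetical and only mirror the shape of the patched function, under the assumption stated in the commit message that first_free is never negative.

```c
#define DEMO_MAX 4

/* Hypothetical free list kept as indices; -1 marks the end of the list. */
struct demo_srq {
	int link[DEMO_MAX];	/* link[i] = index of the next free entry */
	int first_free;
	int last_free;
};

/*
 * Append entry ind to the tail of the free list.
 * Assumes the list is non-empty, so last_free is always a valid index
 * and no special case for an empty list is needed.
 */
static void demo_free_wqe(struct demo_srq *srq, int ind)
{
	srq->link[srq->last_free] = ind;
	srq->link[ind] = -1;
	srq->last_free = ind;
}
```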
Signed-off-by: Eli Cohen --- drivers/infiniband/hw/mthca/mthca_srq.c | 6 +----- 1 files changed, 1 insertions(+), 5 deletions(-) diff --git a/drivers/infiniband/hw/mthca/mthca_srq.c b/drivers/infiniband/hw/mthca/mthca_srq.c index 553d681..782b478 100644 --- a/drivers/infiniband/hw/mthca/mthca_srq.c +++ b/drivers/infiniband/hw/mthca/mthca_srq.c @@ -475,11 +475,7 @@ void mthca_free_srq_wqe(struct mthca_srq *srq, u32 wqe_addr) spin_lock(&srq->lock); - if (likely(srq->first_free >= 0)) - *wqe_to_link(get_wqe(srq, srq->last_free)) = ind; - else - srq->first_free = ind; - + *wqe_to_link(get_wqe(srq, srq->last_free)) = ind; *wqe_to_link(get_wqe(srq, ind)) = -1; srq->last_free = ind; -- 1.5.3.8 From eli at mellanox.co.il Thu Jan 24 06:35:31 2008 From: eli at mellanox.co.il (Eli Cohen) Date: Thu, 24 Jan 2008 16:35:31 +0200 Subject: [ofa-general] PATCH]: IB/ib_mthca: Pre link receive WQEs in Tavor mode Message-ID: <1201185331.6755.20.camel@mtls03> Pre link receive WQEs in Tavor mode Tavor mode requires that each WQE in a posted list of receive WQEs will have a valid NDA field. This requirement holds true for regular QPs as well as for SRQs. This patch prelinks the receive queue in a regular QP and keeps the free list in SRQ always properly linked. 
Signed-off-by: Eli Cohen Reviewed-by: Jack Morgenstein --- drivers/infiniband/hw/mthca/mthca_qp.c | 13 ++++++++----- drivers/infiniband/hw/mthca/mthca_srq.c | 23 ++++++++++++++--------- 2 files changed, 22 insertions(+), 14 deletions(-) diff --git a/drivers/infiniband/hw/mthca/mthca_qp.c b/drivers/infiniband/hw/mthca/mthca_qp.c index 86aa732..8433897 100644 --- a/drivers/infiniband/hw/mthca/mthca_qp.c +++ b/drivers/infiniband/hw/mthca/mthca_qp.c @@ -1175,6 +1175,7 @@ static int mthca_alloc_qp_common(struct mthca_dev *dev, { int ret; int i; + struct mthca_next_seg *next; qp->refcount = 1; init_waitqueue_head(&qp->wait); @@ -1217,7 +1218,6 @@ static int mthca_alloc_qp_common(struct mthca_dev *dev, } if (mthca_is_memfree(dev)) { - struct mthca_next_seg *next; struct mthca_data_seg *scatter; int size = (sizeof (struct mthca_next_seg) + qp->rq.max_gs * sizeof (struct mthca_data_seg)) / 16; @@ -1240,6 +1240,13 @@ static int mthca_alloc_qp_common(struct mthca_dev *dev, qp->sq.wqe_shift) + qp->send_wqe_offset); } + } else { + for (i = 0; i < qp->rq.max; ++i) { + next = get_recv_wqe(qp, i); + next->nda_op = htonl((((i + 1) % qp->rq.max) << + qp->rq.wqe_shift) | 1); + } + } qp->sq.last = get_send_wqe(qp, qp->sq.max - 1); @@ -1863,7 +1870,6 @@ int mthca_tavor_post_receive(struct ib_qp *ibqp, struct ib_recv_wr *wr, prev_wqe = qp->rq.last; qp->rq.last = wqe; - ((struct mthca_next_seg *) wqe)->nda_op = 0; ((struct mthca_next_seg *) wqe)->ee_nds = cpu_to_be32(MTHCA_NEXT_DBD); ((struct mthca_next_seg *) wqe)->flags = 0; @@ -1885,9 +1891,6 @@ int mthca_tavor_post_receive(struct ib_qp *ibqp, struct ib_recv_wr *wr, qp->wrid[ind] = wr->wr_id; - ((struct mthca_next_seg *) prev_wqe)->nda_op = - cpu_to_be32((ind << qp->rq.wqe_shift) | 1); - wmb(); ((struct mthca_next_seg *) prev_wqe)->ee_nds = cpu_to_be32(MTHCA_NEXT_DBD | size); diff --git a/drivers/infiniband/hw/mthca/mthca_srq.c b/drivers/infiniband/hw/mthca/mthca_srq.c index 782b478..af8483c 100644 --- 
a/drivers/infiniband/hw/mthca/mthca_srq.c +++ b/drivers/infiniband/hw/mthca/mthca_srq.c @@ -175,9 +175,17 @@ static int mthca_alloc_srq_buf(struct mthca_dev *dev, struct mthca_pd *pd, * scatter list L_Keys to the sentry value of 0x100. */ for (i = 0; i < srq->max; ++i) { - wqe = get_wqe(srq, i); + struct mthca_next_seg *next; - *wqe_to_link(wqe) = i < srq->max - 1 ? i + 1 : -1; + next = wqe = get_wqe(srq, i); + + if (i < srq->max - 1) { + *wqe_to_link(wqe) = i + 1; + next->nda_op = htonl(((i + 1) << srq->wqe_shift) | 1); + } else { + *wqe_to_link(wqe) = -1; + next->nda_op = 0; + } for (scatter = wqe + sizeof (struct mthca_next_seg); (void *) scatter < wqe + (1 << srq->wqe_shift); @@ -470,12 +478,15 @@ out: void mthca_free_srq_wqe(struct mthca_srq *srq, u32 wqe_addr) { int ind; + struct mthca_next_seg *last_free; ind = wqe_addr >> srq->wqe_shift; spin_lock(&srq->lock); - *wqe_to_link(get_wqe(srq, srq->last_free)) = ind; + last_free = get_wqe(srq, srq->last_free); + *wqe_to_link(last_free) = ind; + last_free->nda_op = htonl((ind << srq->wqe_shift) | 1); *wqe_to_link(get_wqe(srq, ind)) = -1; srq->last_free = ind; @@ -524,7 +535,6 @@ int mthca_tavor_post_srq_recv(struct ib_srq *ibsrq, struct ib_recv_wr *wr, prev_wqe = srq->last; srq->last = wqe; - ((struct mthca_next_seg *) wqe)->nda_op = 0; ((struct mthca_next_seg *) wqe)->ee_nds = 0; /* flags field will always remain 0 */ @@ -545,9 +555,6 @@ int mthca_tavor_post_srq_recv(struct ib_srq *ibsrq, struct ib_recv_wr *wr, if (i < srq->max_gs) mthca_set_data_seg_inval(wqe); - ((struct mthca_next_seg *) prev_wqe)->nda_op = - cpu_to_be32((ind << srq->wqe_shift) | 1); - wmb(); ((struct mthca_next_seg *) prev_wqe)->ee_nds = cpu_to_be32(MTHCA_NEXT_DBD); @@ -629,8 +636,6 @@ int mthca_arbel_post_srq_recv(struct ib_srq *ibsrq, struct ib_recv_wr *wr, break; } - ((struct mthca_next_seg *) wqe)->nda_op = - cpu_to_be32((next_ind << srq->wqe_shift) | 1); ((struct mthca_next_seg *) wqe)->ee_nds = 0; /* flags field will always remain 0 */ 
-- 
1.5.3.8

From dotanb at dev.mellanox.co.il Thu Jan 24 06:53:40 2008
From: dotanb at dev.mellanox.co.il (Dotan Barak)
Date: Thu, 24 Jan 2008 16:53:40 +0200
Subject: [ofa-general] Re: modify QP from SQE -> RTS (after query QP) fails without IBV_QP_CUR_STATE
In-Reply-To: 
References: <47973EDC.5030504@dev.mellanox.co.il>
Message-ID: <4798A674.2070509@dev.mellanox.co.il>

Roland Dreier wrote:
>  > When I tried to recover the QP and modify its state to RTS I had to
>  > use the flag IBV_QP_CUR_STATE
>  > because the internal QP structure assumed that the QP state is RTS
>  > (although query QP noticed otherwise).
>  >
>  > I created and tested 2 patches for both mthca and mlx4 drivers to
>  > update the internal QP state when a
>  > successful query QP was executed.
>  >
>  > Will you accept those patches?
>
> Sounds fine, although I think we need to take a little care about how
> we update the QP state (locking) now that there are two ways to do
> it.  And obviously it's hard to make a definitive statement about
> patches I've never seen...
>
>
I just wanted to know how you feel about this patch (and if it is a
useful code change).

I didn't find time to do it today; I guess I'll send you the patches on Sunday.

thanks
Dotan

From weikuan.yu at gmail.com Thu Jan 24 07:08:37 2008
From: weikuan.yu at gmail.com (Weikuan Yu)
Date: Thu, 24 Jan 2008 10:08:37 -0500
Subject: [ofa-general] RE: [ewg] Not seeing any SDP performance changes inOFED 1.3 beta, and I get Oops when enabling sdp_zcopy_thresh
In-Reply-To: 
References: <47445630.10000@dev.mellanox.co.il> <000001c8338c$206e80d0$614b8270$@rr.com>
Message-ID: <4798A9F5.7030109@gmail.com>

Hi, Scott,

I have been running SDP tests across two woodcrest nodes with 4x DDR
cards using OFED-1.2.5.4. The card/firmware info is below.
CA 'mthca0'
         CA type: MT25208
         Number of ports: 2
         Firmware version: 5.1.400
         Hardware version: a0
         Node GUID: 0x0002c90200228e0c
         System image GUID: 0x0002c90200228e0f

I could not get a bandwidth more than 5Gbps like you have shown here.
Wonder if I need to upgrade to the latest software or firmware? Any
suggestions?

Thanks,
--Weikuan

TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.225.77 (192.168.225.77) port 0 AF_INET
Recv   Send    Send                          Utilization       Service Demand
Socket Socket  Message  Elapsed              Send     Recv     Send    Recv
Size   Size    Size     Time     Throughput  local    remote   local   remote
bytes  bytes   bytes    secs.    10^6bits/s  % S      % S      us/KB   us/KB

131072 131072 131072    10.00      4918.95   21.29    24.99    1.418   1.665

Scott Weitzenkamp (sweitzen) wrote:
> Jim,
>
> I am trying OFED-1.3-20071231-0600 and RHEL4 x86_64 on a dual CPU
> (single core each CPU) Xeon system.  I do not see any performance
> improvement (either throughput or CPU utilization) using netperf when I
> set /sys/module/ib_sdp/sdp_zcopy_thresh to 16384.  Can you elaborate on
> your HCA type, and performance improvement you see?
>
> Here's an example netperf command line when using a Cheetah DDR HCA and
> 1.2.917 firmware (I have also tried ConnectX and 2.3.000 firmware too):
>
> [releng at svbu-qa1850-2 ~]$ LD_PRELOAD=libsdp.so netperf241 -v2 -4 -H
> 192.168.1.201 -l 30 -t TCP_STREAM -c -C -- -m 65536
> TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.1.201
> (192.168.1.201) port 0 AF_INET : histogram : demo
>
> Recv   Send    Send                          Utilization       Service Demand
> Socket Socket  Message  Elapsed              Send     Recv     Send    Recv
> Size   Size    Size     Time     Throughput  local    remote   local   remote
> bytes  bytes   bytes    secs.    10^6bits/s  % S      % S      us/KB   us/KB
>
>  87380  16384  65536    30.01      7267.70   55.06    61.27    1.241   1.381
>
> Alignment      Offset         Bytes    Bytes       Sends   Bytes    Recvs
> Local  Remote  Local  Remote  Xfered   Per                 Per
> Send   Recv    Send   Recv             Send (avg)          Recv (avg)
>     8       8      0       0 2.726e+10  65536.00    415942   48106.01   566648
>

From sashak at voltaire.com Thu Jan 24 07:48:45 2008
From: sashak at voltaire.com (Sasha Khapyorsky)
Date: Thu, 24 Jan 2008 15:48:45 +0000
Subject: [ofa-general] Re: [PATCH] opensm/opensm/osm_subnet.c: add a comment of valid "force_link_speed" values
In-Reply-To: <20080123124028.75708ab0.weiny2@llnl.gov>
References: <20080123124028.75708ab0.weiny2@llnl.gov>
Message-ID: <20080124154845.GD11277@sashak.voltaire.com>

On 12:40 Wed 23 Jan     , Ira Weiny wrote:
> From 020618d66bdcecba6f49bc7f48ae40485d657437 Mon Sep 17 00:00:00 2001
> From: Ira K. Weiny
> Date: Wed, 23 Jan 2008 12:39:05 -0800
> Subject: [PATCH] opensm/opensm/osm_subnet.c: add a comment of valid "force_link_speed" values
>
>
> Signed-off-by: Ira K. Weiny

Applied. Thanks.

Sasha

From jim at mellanox.com Thu Jan 24 07:56:39 2008
From: jim at mellanox.com (Jim Mott)
Date: Thu, 24 Jan 2008 07:56:39 -0800
Subject: [ofa-general] RE: [ewg] Not seeing any SDP performance changes inOFED 1.3 beta, and I get Oops when enabling sdp_zcopy_thresh
In-Reply-To: <4798A9F5.7030109@gmail.com>
References: <47445630.10000@dev.mellanox.co.il> <000001c8338c$206e80d0$614b8270$@rr.com> <4798A9F5.7030109@gmail.com>
Message-ID: 

Hi,
  64K is borderline for seeing bzcopy effect.  Using an AMD 6000+ (3 Ghz
dual core) in Asus M2A-VM motherboard with ConnectX running 2.3 firmware
and OFED 1.3-rc3 stack running on 2.6.23.8 kernel.org kernel, I ran the
test for 128K:
  5546  sdp_zcopy_thresh=0 (off)
  8709  sdp_zcopy_thresh=65536

For these tests, I just have LD_PRELOAD set in my environment.

=======================

I see that TCP_MAXSEG is not being handled by libsdp and will look into
it.
[root at dirk ~]# modprobe ib_sdp [root at dirk ~]# netperf -v2 -4 -H 193.168.10.198 -l 30 -t TCP_STREAM -c -C -- -m 128K TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 193.168.10.198 (193.168.10.198) port 0 AF_INET netperf: get_tcp_info: getsockopt TCP_MAXSEG: errno 92 Recv Send Send Utilization Service Demand Socket Socket Message Elapsed Send Recv Send Recv Size Size Size Time Throughput local remote local remote bytes bytes bytes secs. 10^6bits/s % S % S us/KB us/KB 87380 16384 131072 30.01 5545.69 51.47 14.43 1.521 1.706 Alignment Offset Bytes Bytes Sends Bytes Recvs Local Remote Local Remote Xfered Per Per Send Recv Send Recv Send (avg) Recv (avg) 8 8 0 0 2.08e+10 131072.00 158690 33135.60 627718 Maximum Segment Size (bytes) -1 [root at dirk ~]# echo 65536 >/sys/module/ib_sdp/parameters/sdp_zcopy_thresh [root at dirk ~]# netperf -v2 -4 -H 193.168.10.198 -l 30 -t TCP_STREAM -c -C -- -m 128K TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 193.168.10.198 (193.168.10.198) port 0 AF_INET netperf: get_tcp_info: getsockopt TCP_MAXSEG: errno 92 Recv Send Send Utilization Service Demand Socket Socket Message Elapsed Send Recv Send Recv Size Size Size Time Throughput local remote local remote bytes bytes bytes secs. 10^6bits/s % S % S us/KB us/KB 87380 16384 131072 30.01 8708.58 50.63 14.55 0.953 1.095 Alignment Offset Bytes Bytes Sends Bytes Recvs Local Remote Local Remote Xfered Per Per Send Recv Send Recv Send (avg) Recv (avg) 8 8 0 0 3.267e+10 131072.00 249228 26348.30 1239807 Maximum Segment Size (bytes) -1 Thanks, JIm Jim Mott Mellanox Technologies Ltd. 
mail: jim at mellanox.com Phone: 512-294-5481 -----Original Message----- From: Weikuan Yu [mailto:weikuan.yu at gmail.com] Sent: Thursday, January 24, 2008 9:09 AM To: Scott Weitzenkamp (sweitzen) Cc: Jim Mott; ewg at lists.openfabrics.org; general at lists.openfabrics.org Subject: Re: [ofa-general] RE: [ewg] Not seeing any SDP performance changes inOFED 1.3 beta, and I get Oops when enabling sdp_zcopy_thresh Hi, Scott, I have been running SDP tests across two woodcrest nodes with 4x DDR cards using OFED-1.2.5.4. The card/firmware info is below. CA 'mthca0' CA type: MT25208 Number of ports: 2 Firmware version: 5.1.400 Hardware version: a0 Node GUID: 0x0002c90200228e0c System image GUID: 0x0002c90200228e0f I could not get a bandwidth more than 5Gbps like you have shown here. Wonder if I need to upgrade to the latest software or firmware? Any suggestions? Thanks, --Weikuan TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.225.77 (192.168 .225.77) port 0 AF_INET Recv Send Send Utilization Service Demand Socket Socket Message Elapsed Send Recv Send Recv Size Size Size Time Throughput local remote local remote bytes bytes bytes secs. 10^6bits/s % S % S us/KB us/KB 131072 131072 131072 10.00 4918.95 21.29 24.99 1.418 1.665 Scott Weitzenkamp (sweitzen) wrote: > Jim, > > I am trying OFED-1.3-20071231-0600 and RHEL4 x86_64 on a dual CPU > (single core each CPU) Xeon system. I do not see any performance > improvement (either throughput or CPU utilization) using netperf when I > set /sys/module/ib_sdp/sdp_zcopy_thresh to 16384. Can you elaborate on > your HCA type, and performance improvement you see? 
> > Here's an example netperf command line when using a Cheetah DDR HCA and > 1.2.917 firmware (I have also tried ConnectX and 2.3.000 firmware too): > > [releng at svbu-qa1850-2 ~]$ LD_PRELOAD=libsdp.so netperf241 -v2 -4 -H > 192.168.1.201 -l 30 -t TCP_STREAM -c -C -- -m 65536 > TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.1.201 > (192.168.1.201) port 0 AF_INET : histogram : demo > > Recv Send Send Utilization Service > Demand > Socket Socket Message Elapsed Send Recv Send > Recv > Size Size Size Time Throughput local remote local > remote > bytes bytes bytes secs. 10^6bits/s % S % S us/KB > us/KB > > 87380 16384 65536 30.01 7267.70 55.06 61.27 1.241 > 1.381 > > Alignment Offset Bytes Bytes Sends Bytes > Recvs > Local Remote Local Remote Xfered Per Per > Send Recv Send Recv Send (avg) Recv (avg) > 8 8 0 0 2.726e+10 65536.00 415942 48106.01 > 566648 > From kononov at dls.net Thu Jan 24 08:01:07 2008 From: kononov at dls.net (Roman Kononov) Date: Thu, 24 Jan 2008 10:01:07 -0600 Subject: [ofa-general] Bogus Receive Completions In-Reply-To: <1201184289.6755.7.camel@mtls03> References: <475607AA.301@dls.net> <478B8172.2010104@mellanox.co.il> <1200988162.6925.170.camel@mtls03> <4797A6FD.2080708@dls.net> <4e6a6b3c0801231412t7970ce83u9a2bb9aa11a1b2cd@mail.gmail.com> <4797CDC3.6090707@dls.net> <4797DC47.6060104@dls.net> <4797DD87.8020808@dls.net> <1201184289.6755.7.camel@mtls03> Message-ID: <4798B643.9060508@dls.net> On 2008-01-24 08:18 Eli Cohen said the following: > I am re-sending the patches again, this time with the bug of being > unable to handle QPs whose receive size is not a power of two fixed. .../libmthca/src>egrep '&.*qp->.q\.m.*-' *.c qp.c: ind = qp->sq.head & (qp->sq.max - 1); qp.c: ind = qp->rq.head & (qp->rq.max - 1); qp.c: next->nda_op = htonl(((i + 1) & (qp->rq.max - 1)) << qp.c: next->nda_op = htonl((((i + 1) & (qp->sq.max - 1)) << Should these change as well? 
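The concern behind this question can be shown with a small standalone sketch: wrapping an index with "& (max - 1)" gives the same result as "% max" only when max is a power of two. The helper names below are hypothetical and are not taken from libmthca.

```c
/* Wrap an index with a mask; only correct when max is a power of two. */
static unsigned demo_wrap_mask(unsigned i, unsigned max)
{
	return i & (max - 1);
}

/* Wrap an index with a modulo; correct for any non-zero max. */
static unsigned demo_wrap_mod(unsigned i, unsigned max)
{
	return i % max;
}

/* True if n is a non-zero power of two. */
static int demo_is_pow2(unsigned n)
{
	return n != 0 && (n & (n - 1)) == 0;
}
```

For max = 8 the two helpers agree for every index. For max = 6 they do not: index 6 masks to 4 (binary 110 & 101) instead of wrapping to 0, so any remaining mask-based indexing would only be safe if the queue size is guaranteed to be rounded up to a power of two.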
In mthca_create_qp() there are assignments:
	qp->sq.max = align_queue_size(pd->context, attr->cap.max_send_wr, 0);
	qp->rq.max = align_queue_size(pd->context, attr->cap.max_recv_wr, 0);
And later:
	qp->sq.max = attr->cap.max_send_wr;
	qp->rq.max = attr->cap.max_recv_wr;
This is suspicious.

>
> Roman,
> can you tell whether from your perspective there has been an improvement
> in stability with the latest patches?

Since I use 'nice' queue sizes, the very last patches made no difference.

Roman

From rdreier at cisco.com Thu Jan 24 08:09:18 2008
From: rdreier at cisco.com (Roland Dreier)
Date: Thu, 24 Jan 2008 08:09:18 -0800
Subject: [ofa-general] InfiniBand/RDMA merge plans for 2.6.25
In-Reply-To: (Bart Van Assche's message of "Thu, 24 Jan 2008 08:13:20 +0100")
References: 
Message-ID: 

 > > Anyway, here are all the pending things that I'm aware of.  As usual,
 > > if something isn't already in my tree and isn't listed below, I
 > > probably missed it or dropped it by mistake.  Please remind me again
 > > in that case.
 >
 > Are there any plans to merge the SDP (Sockets Direct Protocol) implementation ?

No.  To my knowledge, it has never been proposed for merging or been
posted for review.

From ssufficool at rov.sbcounty.gov Thu Jan 24 08:36:21 2008
From: ssufficool at rov.sbcounty.gov (Sufficool, Stanley)
Date: Thu, 24 Jan 2008 08:36:21 -0800
Subject: [ofa-general] RE: [Scst-devel] [Stgt-devel] Performance of SCST versus STGT
In-Reply-To: 
Message-ID: 

Can I surmise from all this benchmarking of the ib_srpt module that this
is now stable? Last I checked the git version still had a data
corruption bug. I saw some patches submitted to fix this, but the OFED
git repository (http://www.openfabrics.org/git/~vu/srpt.git) shows no
recent commits.

With all these impressive numbers, I am eager to see SRPT/SCST
performance in the WinOF srp initiator environment using database loads.

Props to Vu and Vlad for this outstanding storage transport!

On Jan 24, 2008 8:06 AM, Robin Humble wrote:
> On Tue, Jan 22, 2008 at 01:32:08PM +0100, Bart Van Assche wrote:
> >
> >...........................................................................................
> >.                           .  STGT read     SCST read     .  STGT read     SCST read     .
> >.                           .  performance   performance   .  performance   performance   .
> >.                           .  (0.5K, MB/s)  (0.5K, MB/s)  .  (1 MB, MB/s)  (1 MB, MB/s)  .
> >...........................................................................................
> >. Ethernet (1 Gb/s network) .  77            78            .  77            89            .
> >. IPoIB    (8 Gb/s network) .  163           185           .  201           239           .
> >. iSER     (8 Gb/s network) .  250           N/A           .  360           N/A           .
> >. SRP      (8 Gb/s network) .  N/A           421           .  N/A           683           .
> >...........................................................................................
>

Results with /dev/ram0 configured as backing store on the target (buffered I/O):

               Read          Write         Read          Write
               performance   performance   performance   performance
               (0.5K, MB/s)  (0.5K, MB/s)  (1 MB, MB/s)  (1 MB, MB/s)
STGT + iSER    250           48            349           781
SCST + SRP     411           66            659           746

Results with /dev/ram0 configured as backing store on the target (direct I/O):

               Read          Write         Read          Write
               performance   performance   performance   performance
               (0.5K, MB/s)  (0.5K, MB/s)  (1 MB, MB/s)  (1 MB, MB/s)
STGT + iSER    7.9           9.8           589           647
SCST + SRP     12.3          9.7           811           794

Bart.

From fenkes at de.ibm.com Thu Jan 24 08:59:08 2008
From: fenkes at de.ibm.com (Joachim Fenkes)
Date: Thu, 24 Jan 2008 17:59:08 +0100
Subject: [ofa-general] [PATCH] IB/ehca: Prevent sending UD packets to QP0
Message-ID: <200801241759.09065.fenkes@de.ibm.com>

IB spec doesn't allow packets to QP0 sent on any other VL than VL15.
Hardware doesn't filter those packets on the send side, so we need to do this
in the driver and firmware.

As eHCA doesn't support QP0, we can just filter out all traffic going to QP0,
regardless of SL or VL.
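The send-side filter described above amounts to rejecting a UD work request whose destination QP number is 0 before it is written to the send queue. A minimal standalone sketch of that check follows; demo_ud_wr and demo_check_ud_dest are hypothetical stand-ins and not the ehca structures or functions.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, reduced view of a UD send work request. */
struct demo_ud_wr {
	const void *ah;		/* address handle, must not be NULL */
	uint32_t remote_qpn;	/* destination QP number */
};

/*
 * Validate the destination of a UD send.
 * Returns 0 if the request may be posted, -EINVAL otherwise.
 * QP0 is rejected unconditionally, regardless of SL or VL.
 */
static int demo_check_ud_dest(const struct demo_ud_wr *wr)
{
	if (wr->ah == NULL)
		return -EINVAL;
	if (wr->remote_qpn == 0)
		return -EINVAL;
	return 0;
}
```

Rejecting the request outright, rather than inspecting SL or VL, is only reasonable under the premise stated in the description that the device does not support QP0 at all.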
Signed-off-by: Joachim Fenkes --- drivers/infiniband/hw/ehca/ehca_reqs.c | 4 ++++ 1 files changed, 4 insertions(+), 0 deletions(-) diff --git a/drivers/infiniband/hw/ehca/ehca_reqs.c b/drivers/infiniband/hw/ehca/ehca_reqs.c index 3aacc8c..2ce8cff 100644 --- a/drivers/infiniband/hw/ehca/ehca_reqs.c +++ b/drivers/infiniband/hw/ehca/ehca_reqs.c @@ -209,6 +209,10 @@ static inline int ehca_write_swqe(struct ehca_qp *qp, ehca_gen_err("wr.ud.ah is NULL. qp=%p", qp); return -EINVAL; } + if (unlikely(send_wr->wr.ud.remote_qpn == 0)) { + ehca_gen_err("dest QP# is 0. qp=%x", qp->real_qp_num); + return -EINVAL; + } my_av = container_of(send_wr->wr.ud.ah, struct ehca_av, ib_ah); wqe_p->u.ud_av.ud_av = my_av->av; -- 1.5.2 From eli at mellanox.co.il Thu Jan 24 08:57:07 2008 From: eli at mellanox.co.il (Eli Cohen) Date: Thu, 24 Jan 2008 18:57:07 +0200 Subject: [ofa-general] [PATCH 3/16 v2] IB/ib_core: Add checksum support to ib core Message-ID: <1201193827.6755.62.camel@mtls03> Add checksum support to ib core Signed-off-by: Eli Cohen --- changes: update documentation to exclude IPv6 include/rdma/ib_verbs.h | 13 +++++++++++-- 1 files changed, 11 insertions(+), 2 deletions(-) diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h index 11f3960..e35cc29 100644 --- a/include/rdma/ib_verbs.h +++ b/include/rdma/ib_verbs.h @@ -95,7 +95,14 @@ enum ib_device_cap_flags { IB_DEVICE_N_NOTIFY_CQ = (1<<14), IB_DEVICE_ZERO_STAG = (1<<15), IB_DEVICE_SEND_W_INV = (1<<16), - IB_DEVICE_MEM_WINDOW = (1<<17) + IB_DEVICE_MEM_WINDOW = (1<<17), + /* + * devices which publish this capability must support insertion of UDP + * and TCP checksum on outgoing packets and can verify the validity of + * checksum for incoming packets. Setting this flag implies the driver + * may set NETIF_F_IP_CSUM. 
+ */ + IB_DEVICE_IP_CSUM = (1<<18), }; enum ib_atomic_cap { @@ -431,6 +438,7 @@ struct ib_wc { u8 sl; u8 dlid_path_bits; u8 port_num; /* valid only for DR SMPs on switches */ + int csum_ok; }; enum ib_cq_notify_flags { @@ -615,7 +623,8 @@ enum ib_send_flags { IB_SEND_FENCE = 1, IB_SEND_SIGNALED = (1<<1), IB_SEND_SOLICITED = (1<<2), - IB_SEND_INLINE = (1<<3) + IB_SEND_INLINE = (1<<3), + IB_SEND_IP_CSUM = (1<<4) }; struct ib_sge { -- 1.5.3.8 From eli at mellanox.co.il Thu Jan 24 08:57:11 2008 From: eli at mellanox.co.il (Eli Cohen) Date: Thu, 24 Jan 2008 18:57:11 +0200 Subject: [ofa-general] [PATCH 4/16 v2] IB/ipoib: Add checksum offload support for ipoib Message-ID: <1201193831.6755.63.camel@mtls03> Add checksum offload support for ipoib Signed-off-by: Eli Cohen --- changes: remove ipv6 support add call to skb_reset_network_header() drivers/infiniband/ulp/ipoib/ipoib.h | 1 + drivers/infiniband/ulp/ipoib/ipoib_cm.c | 9 +++++++++ drivers/infiniband/ulp/ipoib/ipoib_ib.c | 20 ++++++++++++++++++++ drivers/infiniband/ulp/ipoib/ipoib_main.c | 15 +++++++++++++++ 4 files changed, 45 insertions(+), 0 deletions(-) diff --git a/drivers/infiniband/ulp/ipoib/ipoib.h b/drivers/infiniband/ulp/ipoib/ipoib.h index 6729c14..f0876dc 100644 --- a/drivers/infiniband/ulp/ipoib/ipoib.h +++ b/drivers/infiniband/ulp/ipoib/ipoib.h @@ -86,6 +86,7 @@ enum { IPOIB_MCAST_STARTED = 8, IPOIB_FLAG_ADMIN_CM = 9, IPOIB_FLAG_UMCAST = 10, + IPOIB_FLAG_CSUM = 11, IPOIB_MAX_BACKOFF_SECONDS = 16, diff --git a/drivers/infiniband/ulp/ipoib/ipoib_cm.c b/drivers/infiniband/ulp/ipoib/ipoib_cm.c index 8485fde..1c1e446 100644 --- a/drivers/infiniband/ulp/ipoib/ipoib_cm.c +++ b/drivers/infiniband/ulp/ipoib/ipoib_cm.c @@ -1234,6 +1234,11 @@ static ssize_t set_mode(struct device *d, struct device_attribute *attr, set_bit(IPOIB_FLAG_ADMIN_CM, &priv->flags); ipoib_warn(priv, "enabling connected mode " "will cause multicast packet drops\n"); + + dev->features &= ~NETIF_F_IP_CSUM; + + priv->tx_wr.send_flags &= 
~IB_SEND_IP_CSUM; + ipoib_flush_paths(dev); return count; } @@ -1242,6 +1247,10 @@ static ssize_t set_mode(struct device *d, struct device_attribute *attr, clear_bit(IPOIB_FLAG_ADMIN_CM, &priv->flags); dev->mtu = min(priv->mcast_mtu, dev->mtu); ipoib_flush_paths(dev); + + if (priv->ca->flags & IB_DEVICE_IP_CSUM) + dev->features |= NETIF_F_IP_CSUM; + return count; } diff --git a/drivers/infiniband/ulp/ipoib/ipoib_ib.c b/drivers/infiniband/ulp/ipoib/ipoib_ib.c index 680c27f..875eba0 100644 --- a/drivers/infiniband/ulp/ipoib/ipoib_ib.c +++ b/drivers/infiniband/ulp/ipoib/ipoib_ib.c @@ -37,6 +37,7 @@ #include #include +#include #include @@ -231,6 +232,19 @@ static void ipoib_ib_handle_rx_wc(struct net_device *dev, struct ib_wc *wc) skb->dev = dev; /* XXX get correct PACKET_ type here */ skb->pkt_type = PACKET_HOST; + + /* check rx csum */ + if (test_bit(IPOIB_FLAG_CSUM, &priv->flags) && likely(wc->csum_ok)) { + /* + * Note: this is a specific requirement for Mellanox + * HW but since it is the only HW currently supporting + * checksum offload I put it here + */ + skb_reset_network_header(skb); + if (ip_hdr(skb)->ihl == 5) + skb->ip_summed = CHECKSUM_UNNECESSARY; + } + netif_receive_skb(skb); repost: @@ -394,6 +408,12 @@ void ipoib_send(struct net_device *dev, struct sk_buff *skb, return; } + if (dev->flags & NETIF_F_IP_CSUM && + skb->ip_summed == CHECKSUM_PARTIAL) + priv->tx_wr.send_flags |= IB_SEND_IP_CSUM; + else + priv->tx_wr.send_flags &= ~IB_SEND_IP_CSUM; + if (unlikely(post_send(priv, priv->tx_head & (ipoib_sendq_size - 1), address->ah, qpn, tx_req->mapping, skb_headlen(skb), diff --git a/drivers/infiniband/ulp/ipoib/ipoib_main.c b/drivers/infiniband/ulp/ipoib/ipoib_main.c index b40e0f7..c0e6c01 100644 --- a/drivers/infiniband/ulp/ipoib/ipoib_main.c +++ b/drivers/infiniband/ulp/ipoib/ipoib_main.c @@ -1106,6 +1106,20 @@ int ipoib_add_pkey_attr(struct net_device *dev) return device_create_file(&dev->dev, &dev_attr_pkey); } +static void set_csum(struct net_device 
*dev, struct ib_device *hca) +{ + struct ipoib_dev_priv *priv = netdev_priv(dev); + + if (test_bit(IPOIB_FLAG_ADMIN_CM, &priv->flags)) + return; + + if (!(hca->flags & IB_DEVICE_IP_CSUM)) + return; + + dev->features |= NETIF_F_SG | NETIF_F_IP_CSUM; + set_bit(IPOIB_FLAG_CSUM, &priv->flags); +} + static struct net_device *ipoib_add_port(const char *format, struct ib_device *hca, u8 port) { @@ -1144,6 +1158,7 @@ static struct net_device *ipoib_add_port(const char *format, } else memcpy(priv->dev->dev_addr + 4, priv->local_gid.raw, sizeof (union ib_gid)); + set_csum(priv->dev, hca); result = ipoib_dev_init(priv->dev, hca, port); if (result < 0) { -- 1.5.3.8 From eli at mellanox.co.il Thu Jan 24 08:57:19 2008 From: eli at mellanox.co.il (Eli Cohen) Date: Thu, 24 Jan 2008 18:57:19 +0200 Subject: [ofa-general] [PATCH 5/16 v2] IB/mlx4_ib: Add checksum offload support to mlx4 Message-ID: <1201193839.6755.64.camel@mtls03> Add checksum offload support to mlx4 Signed-off-by: Eli Cohen Signed-off-by: Ali Ayub --- changes: remove support for ipv6 modify csum_ok condition to provide a reliable indication drivers/infiniband/hw/mlx4/cq.c | 9 +++++++++ drivers/infiniband/hw/mlx4/main.c | 5 +++++ drivers/infiniband/hw/mlx4/qp.c | 3 +++ drivers/net/mlx4/fw.c | 3 +++ include/linux/mlx4/cq.h | 4 ++-- include/linux/mlx4/qp.h | 2 ++ 6 files changed, 24 insertions(+), 2 deletions(-) diff --git a/drivers/infiniband/hw/mlx4/cq.c b/drivers/infiniband/hw/mlx4/cq.c index 9d32c49..308fe2a 100644 --- a/drivers/infiniband/hw/mlx4/cq.c +++ b/drivers/infiniband/hw/mlx4/cq.c @@ -314,6 +314,11 @@ static int mlx4_ib_poll_one(struct mlx4_ib_cq *cq, int is_send; int is_error; u16 wqe_ctr; + __be32 status; + +#define CSUM_MASK_BITS cpu_to_be32(0x13c00000) +#define CSUM_VAL_BITS cpu_to_be32(0x10400000) +#define CSUM_MASK2_BITS cpu_to_be32(0x0c000000) cqe = next_cqe_sw(cq); if (!cqe) @@ -431,6 +436,10 @@ static int mlx4_ib_poll_one(struct mlx4_ib_cq *cq, wc->wc_flags |= be32_to_cpu(cqe->g_mlpath_rqpn) & 
0x80000000 ? IB_WC_GRH : 0; wc->pkey_index = be32_to_cpu(cqe->immed_rss_invalid) & 0x7f; + status = cqe->ipoib_status; + wc->csum_ok = (status & CSUM_MASK_BITS) == CSUM_VAL_BITS && + (status & CSUM_MASK2_BITS) && + cqe->checksum == 0xffff; } return 0; diff --git a/drivers/infiniband/hw/mlx4/main.c b/drivers/infiniband/hw/mlx4/main.c index d8287d9..8ce94a1 100644 --- a/drivers/infiniband/hw/mlx4/main.c +++ b/drivers/infiniband/hw/mlx4/main.c @@ -99,6 +99,8 @@ static int mlx4_ib_query_device(struct ib_device *ibdev, props->device_cap_flags |= IB_DEVICE_AUTO_PATH_MIG; if (dev->dev->caps.flags & MLX4_DEV_CAP_FLAG_UD_AV_PORT) props->device_cap_flags |= IB_DEVICE_UD_AV_PORT_ENFORCE; + if (dev->dev->caps.flags & MLX4_DEV_CAP_FLAG_IPOIB_CSUM) + props->device_cap_flags |= IB_DEVICE_IP_CSUM; props->vendor_id = be32_to_cpup((__be32 *) (out_mad->data + 36)) & 0xffffff; @@ -612,6 +614,9 @@ static void *mlx4_ib_add(struct mlx4_dev *dev) ibdev->ib_dev.unmap_fmr = mlx4_ib_unmap_fmr; ibdev->ib_dev.dealloc_fmr = mlx4_ib_fmr_dealloc; + if (ibdev->dev->caps.flags & MLX4_DEV_CAP_FLAG_IPOIB_CSUM) + ibdev->ib_dev.flags |= IB_DEVICE_IP_CSUM; + if (init_node_data(ibdev)) goto err_map; diff --git a/drivers/infiniband/hw/mlx4/qp.c b/drivers/infiniband/hw/mlx4/qp.c index 8cba9c5..ca7cd04 100644 --- a/drivers/infiniband/hw/mlx4/qp.c +++ b/drivers/infiniband/hw/mlx4/qp.c @@ -1307,6 +1307,9 @@ int mlx4_ib_post_send(struct ib_qp *ibqp, struct ib_send_wr *wr, cpu_to_be32(MLX4_WQE_CTRL_CQ_UPDATE) : 0) | (wr->send_flags & IB_SEND_SOLICITED ? cpu_to_be32(MLX4_WQE_CTRL_SOLICITED) : 0) | + ((wr->send_flags & IB_SEND_IP_CSUM) ? 
+ cpu_to_be32(MLX4_WQE_CTRL_IP_CSUM | + MLX4_WQE_CTRL_TCP_UDP_CSUM) : 0) | qp->sq_signal_bits; if (wr->opcode == IB_WR_SEND_WITH_IMM || diff --git a/drivers/net/mlx4/fw.c b/drivers/net/mlx4/fw.c index 5064873..d6c2851 100644 --- a/drivers/net/mlx4/fw.c +++ b/drivers/net/mlx4/fw.c @@ -736,6 +736,9 @@ int mlx4_INIT_HCA(struct mlx4_dev *dev, struct mlx4_init_hca_param *param) MLX4_PUT(inbox, (u8) (PAGE_SHIFT - 12), INIT_HCA_UAR_PAGE_SZ_OFFSET); MLX4_PUT(inbox, param->log_uar_sz, INIT_HCA_LOG_UAR_SZ_OFFSET); + if (dev->caps.flags & MLX4_DEV_CAP_FLAG_IPOIB_CSUM) + *(inbox + INIT_HCA_FLAGS_OFFSET / 4) |= cpu_to_be32(1 << 3); + err = mlx4_cmd(dev, mailbox->dma, 0, 0, MLX4_CMD_INIT_HCA, 10000); if (err) diff --git a/include/linux/mlx4/cq.h b/include/linux/mlx4/cq.h index 0181e0a..5fdc859 100644 --- a/include/linux/mlx4/cq.h +++ b/include/linux/mlx4/cq.h @@ -45,11 +45,11 @@ struct mlx4_cqe { u8 sl; u8 reserved1; __be16 rlid; - u32 reserved2; + __be32 ipoib_status; __be32 byte_cnt; __be16 wqe_index; __be16 checksum; - u8 reserved3[3]; + u8 reserved2[3]; u8 owner_sr_opcode; }; diff --git a/include/linux/mlx4/qp.h b/include/linux/mlx4/qp.h index 3968b94..b4eb921 100644 --- a/include/linux/mlx4/qp.h +++ b/include/linux/mlx4/qp.h @@ -158,6 +158,8 @@ enum { MLX4_WQE_CTRL_FENCE = 1 << 6, MLX4_WQE_CTRL_CQ_UPDATE = 3 << 2, MLX4_WQE_CTRL_SOLICITED = 1 << 1, + MLX4_WQE_CTRL_IP_CSUM = 1 << 4, + MLX4_WQE_CTRL_TCP_UDP_CSUM = 1 << 5, }; struct mlx4_wqe_ctrl_seg { -- 1.5.3.8 From eli at mellanox.co.il Thu Jan 24 08:57:25 2008 From: eli at mellanox.co.il (Eli Cohen) Date: Thu, 24 Jan 2008 18:57:25 +0200 Subject: [ofa-general] [PATCH 9/16 v2] IB/ipoib: Add LSO support to ipoib Message-ID: <1201193845.6755.65.camel@mtls03> Add LSO support to ipoib Signed-off-by: Eli Cohen --- changes: modified to catch up with changes in previous patches drivers/infiniband/ulp/ipoib/ipoib.h | 54 ++++++++++++------- drivers/infiniband/ulp/ipoib/ipoib_cm.c | 7 ++- drivers/infiniband/ulp/ipoib/ipoib_ib.c | 80 
+++++++++++++++++++++------ drivers/infiniband/ulp/ipoib/ipoib_main.c | 8 +++- drivers/infiniband/ulp/ipoib/ipoib_verbs.c | 3 +- 5 files changed, 111 insertions(+), 41 deletions(-) diff --git a/drivers/infiniband/ulp/ipoib/ipoib.h b/drivers/infiniband/ulp/ipoib/ipoib.h index f0876dc..e15884c 100644 --- a/drivers/infiniband/ulp/ipoib/ipoib.h +++ b/drivers/infiniband/ulp/ipoib/ipoib.h @@ -151,31 +151,40 @@ static inline int ipoib_dma_map_tx(struct ib_device *ca, { struct sk_buff *skb = tx_req->skb; u64 *mapping = tx_req->mapping; - int frags; int i; - - mapping[0] = ib_dma_map_single(ca, skb->data, skb_headlen(skb), - DMA_TO_DEVICE); - if (unlikely(ib_dma_mapping_error(ca, mapping[0]))) - return -EIO; - - frags = skb_shinfo(skb)->nr_frags; - for (i = 0; i < frags; ++i) { + int nfrags; + int off; + + if (skb_headlen(skb)) { + mapping[0] = ib_dma_map_single(ca, skb->data, skb_headlen(skb), + DMA_TO_DEVICE); + if (unlikely(ib_dma_mapping_error(ca, mapping[0]))) + return -EIO; + off = 1; + } else + off = 0; + + nfrags = skb_shinfo(skb)->nr_frags; + for (i = 0; i < nfrags; ++i) { skb_frag_t *frag = &skb_shinfo(skb)->frags[i]; - mapping[i + 1] = ib_dma_map_page(ca, frag->page, - frag->page_offset, frag->size, - DMA_TO_DEVICE); - if (unlikely(ib_dma_mapping_error(ca, mapping[i + 1]))) + mapping[i + off] = ib_dma_map_page(ca, frag->page, frag->page_offset, + frag->size, DMA_TO_DEVICE); + if (unlikely(ib_dma_mapping_error(ca, mapping[i + off]))) goto partial_error; } return 0; partial_error: - ib_dma_unmap_single(ca, mapping[0], skb_headlen(skb), DMA_TO_DEVICE); + if (skb_headlen(skb)) { + ib_dma_unmap_single(ca, mapping[0], skb_headlen(skb), DMA_TO_DEVICE); + off = 0; + } else + off = 1; for (; i > 0; --i) { skb_frag_t *frag = &skb_shinfo(skb)->frags[i - 1]; - ib_dma_unmap_page(ca, mapping[i], frag->size, DMA_TO_DEVICE); + ib_dma_unmap_page(ca, mapping[i - off], frag->size, + DMA_TO_DEVICE); } return -EIO; } @@ -185,15 +194,20 @@ static inline void ipoib_dma_unmap_tx(struct 
ib_device *ca, { struct sk_buff *skb = tx_req->skb; u64 *mapping = tx_req->mapping; - int frags; int i; + int nfrags; + int off; - ib_dma_unmap_single(ca, mapping[0], skb_headlen(skb), DMA_TO_DEVICE); + if (skb_headlen(skb)) { + ib_dma_unmap_single(ca, mapping[0], skb_headlen(skb), DMA_TO_DEVICE); + off = 1; + } else + off = 0; - frags = skb_shinfo(skb)->nr_frags; - for (i = 0; i < frags; ++i) { + nfrags = skb_shinfo(skb)->nr_frags; + for (i = 0; i < nfrags; ++i) { skb_frag_t *frag = &skb_shinfo(skb)->frags[i]; - ib_dma_unmap_page(ca, mapping[i + 1], frag->size, + ib_dma_unmap_page(ca, mapping[i + off], frag->size, DMA_TO_DEVICE); } } diff --git a/drivers/infiniband/ulp/ipoib/ipoib_cm.c b/drivers/infiniband/ulp/ipoib/ipoib_cm.c index 1c1e446..5d2b38a 100644 --- a/drivers/infiniband/ulp/ipoib/ipoib_cm.c +++ b/drivers/infiniband/ulp/ipoib/ipoib_cm.c @@ -1235,7 +1235,7 @@ static ssize_t set_mode(struct device *d, struct device_attribute *attr, ipoib_warn(priv, "enabling connected mode " "will cause multicast packet drops\n"); - dev->features &= ~NETIF_F_IP_CSUM; + dev->features &= ~(NETIF_F_IP_CSUM | NETIF_F_TSO); priv->tx_wr.send_flags &= ~IB_SEND_IP_CSUM; @@ -1251,6 +1251,11 @@ static ssize_t set_mode(struct device *d, struct device_attribute *attr, if (priv->ca->flags & IB_DEVICE_IP_CSUM) dev->features |= NETIF_F_IP_CSUM; + + if (priv->dev->features & NETIF_F_SG && + priv->ca->flags & IB_DEVICE_TCP_TSO) + priv->dev->features |= NETIF_F_TSO; + return count; } diff --git a/drivers/infiniband/ulp/ipoib/ipoib_ib.c b/drivers/infiniband/ulp/ipoib/ipoib_ib.c index 875eba0..92a7162 100644 --- a/drivers/infiniband/ulp/ipoib/ipoib_ib.c +++ b/drivers/infiniband/ulp/ipoib/ipoib_ib.c @@ -38,6 +38,7 @@ #include #include #include +#include #include @@ -354,24 +355,40 @@ void ipoib_ib_completion(struct ib_cq *cq, void *dev_ptr) static inline int post_send(struct ipoib_dev_priv *priv, unsigned int wr_id, struct ib_ah *address, u32 qpn, - u64 *mapping, int headlen, - skb_frag_t 
*frags, - int nr_frags) + struct ipoib_tx_buf *tx_req, + void *head, int hlen) { struct ib_send_wr *bad_wr; - int i; + int i, off; + struct sk_buff *skb = tx_req->skb; + skb_frag_t *frags = skb_shinfo(skb)->frags; + int nr_frags = skb_shinfo(skb)->nr_frags; + u64 *mapping = tx_req->mapping; + + if (skb_headlen(skb)) { + priv->tx_sge[0].addr = mapping[0]; + priv->tx_sge[0].length = skb_headlen(skb); + off = 1; + } else + off = 0; - priv->tx_sge[0].addr = mapping[0]; - priv->tx_sge[0].length = headlen; for (i = 0; i < nr_frags; ++i) { - priv->tx_sge[i + 1].addr = mapping[i + 1]; - priv->tx_sge[i + 1].length = frags[i].size; + priv->tx_sge[i + off].addr = mapping[i + off]; + priv->tx_sge[i + off].length = frags[i].size; } - priv->tx_wr.num_sge = nr_frags + 1; + priv->tx_wr.num_sge = nr_frags + off; priv->tx_wr.wr_id = wr_id; priv->tx_wr.wr.ud.remote_qpn = qpn; priv->tx_wr.wr.ud.ah = address; + if (head) { + priv->tx_wr.wr.ud.mss = skb_shinfo(skb)->gso_size; + priv->tx_wr.wr.ud.header = head; + priv->tx_wr.wr.ud.hlen = hlen; + priv->tx_wr.opcode = IB_WR_LSO; + } else + priv->tx_wr.opcode = IB_WR_SEND; + return ib_post_send(priv->qp, &priv->tx_wr, &bad_wr); } @@ -380,14 +397,36 @@ void ipoib_send(struct net_device *dev, struct sk_buff *skb, { struct ipoib_dev_priv *priv = netdev_priv(dev); struct ipoib_tx_buf *tx_req; + int hlen; + void *phead; + + if (!skb_is_gso(skb)) { + if (unlikely(skb->len > priv->mcast_mtu + IPOIB_ENCAP_LEN)) { + ipoib_warn(priv, "packet len %d (> %d) too long to send, dropping\n", + skb->len, priv->mcast_mtu + IPOIB_ENCAP_LEN); + ++dev->stats.tx_dropped; + ++dev->stats.tx_errors; + ipoib_cm_skb_too_long(dev, skb, priv->mcast_mtu); + return; + } + phead = 0; + hlen = 0; + } else { + /* + * LSO header is limited to max 60 bytes + */ + if (unlikely((ip_hdr(skb)->ihl + tcp_hdr(skb)->doff) > 15)) { + ipoib_warn(priv, "ip(%d) and tcp(%d) headers too long, dropping skb\n", + ip_hdr(skb)->ihl << 2, tcp_hdr(skb)->doff << 2); + goto drop; + } - if 
(unlikely(skb->len > priv->mcast_mtu + IPOIB_ENCAP_LEN)) { - ipoib_warn(priv, "packet len %d (> %d) too long to send, dropping\n", - skb->len, priv->mcast_mtu + IPOIB_ENCAP_LEN); - ++dev->stats.tx_dropped; - ++dev->stats.tx_errors; - ipoib_cm_skb_too_long(dev, skb, priv->mcast_mtu); - return; + hlen = ((ip_hdr(skb)->ihl + tcp_hdr(skb)->doff) << 2) + IPOIB_ENCAP_LEN; + phead = skb->data; + if (unlikely(!skb_pull(skb, hlen))) { + ipoib_warn(priv, "linear data too small\n"); + goto drop; + } } ipoib_dbg_data(priv, "sending packet, length=%d address=%p qpn=0x%06x\n", @@ -416,8 +455,7 @@ void ipoib_send(struct net_device *dev, struct sk_buff *skb, if (unlikely(post_send(priv, priv->tx_head & (ipoib_sendq_size - 1), address->ah, qpn, - tx_req->mapping, skb_headlen(skb), - skb_shinfo(skb)->frags, skb_shinfo(skb)->nr_frags))) { + tx_req, phead, hlen))) { ipoib_warn(priv, "post_send failed\n"); ++dev->stats.tx_errors; ipoib_dma_unmap_tx(priv->ca, tx_req); @@ -433,6 +471,12 @@ void ipoib_send(struct net_device *dev, struct sk_buff *skb, netif_stop_queue(dev); } } + return; + +drop: + ++dev->stats.tx_errors; + dev_kfree_skb_any(skb); + return; } static void __ipoib_reap_ah(struct net_device *dev) diff --git a/drivers/infiniband/ulp/ipoib/ipoib_main.c b/drivers/infiniband/ulp/ipoib/ipoib_main.c index c0e6c01..27beb4a 100644 --- a/drivers/infiniband/ulp/ipoib/ipoib_main.c +++ b/drivers/infiniband/ulp/ipoib/ipoib_main.c @@ -713,7 +713,9 @@ static int ipoib_start_xmit(struct sk_buff *skb, struct net_device *dev) goto out; } - ipoib_send(dev, skb, neigh->ah, IPOIB_QPN(skb->dst->neighbour->ha)); + ipoib_send(dev, skb, neigh->ah, + IPOIB_QPN(skb->dst->neighbour->ha)); + goto out; } @@ -1177,6 +1179,10 @@ static struct net_device *ipoib_add_port(const char *format, goto event_failed; } + if (priv->dev->features & NETIF_F_SG && priv->ca->flags & IB_DEVICE_TCP_TSO) + priv->dev->features |= NETIF_F_TSO; + + result = register_netdev(priv->dev); if (result) { printk(KERN_WARNING "%s: 
couldn't register ipoib port %d; error %d\n", diff --git a/drivers/infiniband/ulp/ipoib/ipoib_verbs.c b/drivers/infiniband/ulp/ipoib/ipoib_verbs.c index a6f5f65..f2289c6 100644 --- a/drivers/infiniband/ulp/ipoib/ipoib_verbs.c +++ b/drivers/infiniband/ulp/ipoib/ipoib_verbs.c @@ -153,7 +153,8 @@ int ipoib_transport_dev_init(struct net_device *dev, struct ib_device *ca) .max_recv_sge = 1 }, .sq_sig_type = IB_SIGNAL_ALL_WR, - .qp_type = IB_QPT_UD + .qp_type = IB_QPT_UD, + .create_flags = QP_CREATE_LSO, }; int i, ret, size; -- 1.5.3.8 From eli at mellanox.co.il Thu Jan 24 08:57:35 2008 From: eli at mellanox.co.il (Eli Cohen) Date: Thu, 24 Jan 2008 18:57:35 +0200 Subject: [ofa-general] [PATCH 13/16 v2] IB/ipoib: Add support for modify CQ Message-ID: <1201193855.6755.66.camel@mtls03> Add support for modify CQ Add support for modifying CQ parameters for controlling event generation moderation. Signed-off-by: Eli Cohen --- changes: Fix spelling mistakes Fix function documentation drivers/infiniband/core/verbs.c | 7 +++++++ include/rdma/ib_verbs.h | 11 +++++++++++ 2 files changed, 18 insertions(+), 0 deletions(-) diff --git a/drivers/infiniband/core/verbs.c b/drivers/infiniband/core/verbs.c index 86ed8af..84709ed 100644 --- a/drivers/infiniband/core/verbs.c +++ b/drivers/infiniband/core/verbs.c @@ -628,6 +628,13 @@ struct ib_cq *ib_create_cq(struct ib_device *device, } EXPORT_SYMBOL(ib_create_cq); +int ib_modify_cq(struct ib_cq *cq, u16 cq_count, u16 cq_period) +{ + return cq->device->modify_cq ? 
+		cq->device->modify_cq(cq, cq_count, cq_period) : -ENOSYS;
+}
+EXPORT_SYMBOL(ib_modify_cq);
+
 int ib_destroy_cq(struct ib_cq *cq)
 {
 	if (atomic_read(&cq->usecnt))
diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index 6ef1729..a8f94a9 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -984,6 +984,8 @@ struct ib_device {
 						int comp_vector,
 						struct ib_ucontext *context,
 						struct ib_udata *udata);
+	int (*modify_cq)(struct ib_cq *cq, u16 cq_count,
+			 u16 cq_period);
 	int (*destroy_cq)(struct ib_cq *cq);
 	int (*resize_cq)(struct ib_cq *cq, int cqe,
 			 struct ib_udata *udata);
@@ -1389,6 +1391,15 @@ struct ib_cq *ib_create_cq(struct ib_device *device,
 int ib_resize_cq(struct ib_cq *cq, int cqe);
 
 /**
+ * ib_modify_cq - Modifies moderation params of the CQ
+ * @cq: The CQ to modify.
+ * @cq_count: number of CQEs that will trigger an event
+ * @cq_period: max period of time in usec before triggering an event
+ *
+ */
+int ib_modify_cq(struct ib_cq *cq, u16 cq_count, u16 cq_period);
+
+/**
  * ib_destroy_cq - Destroys the specified CQ.
  * @cq: The CQ to destroy.
  */
-- 
1.5.3.8

From eli at mellanox.co.il Thu Jan 24 08:57:43 2008
From: eli at mellanox.co.il (Eli Cohen)
Date: Thu, 24 Jan 2008 18:57:43 +0200
Subject: [ofa-general] [PATCH 14/16 v2] IB/ipoib: Support modifying IPOIB CQ moderation params
Message-ID: <1201193863.6755.67.camel@mtls03>

Support modifying IPOIB CQ moderation params

This can be used to tune at run time the parameters controlling the
event (interrupt) generation rate and thus reduce the overhead incurred
by handling interrupts resulting in better throughput.
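[Illustration, not part of the patch] ib_modify_cq() as declared in patch 13/16 takes 16-bit count and period arguments, while ethtool coalescing fields are 32-bit, so a caller has to reject anything above 0xffff before narrowing. The following is only a user-space sketch of that range check; the names coalesce_request and coalesce_to_cq_params are hypothetical and do not appear in the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Largest value that fits the u16 arguments of ib_modify_cq(). */
#define CQ_MODERATION_MAX 0xffffu

/* Hypothetical stand-in for the two ethtool coalescing fields used here. */
struct coalesce_request {
	uint32_t usecs;       /* requested coalescing period, microseconds */
	uint32_t max_frames;  /* requested number of completions per event */
};

/*
 * Narrow a 32-bit request to the 16-bit CQ moderation parameters.
 * Returns 0 and fills the outputs on success, or -1 (outputs untouched)
 * when either value does not fit in 16 bits.
 */
static int coalesce_to_cq_params(const struct coalesce_request *req,
				 uint16_t *cq_count, uint16_t *cq_period)
{
	assert(req != 0 && cq_count != 0 && cq_period != 0);

	if (req->usecs > CQ_MODERATION_MAX ||
	    req->max_frames > CQ_MODERATION_MAX)
		return -1;

	*cq_count = (uint16_t)req->max_frames;
	*cq_period = (uint16_t)req->usecs;
	return 0;
}
```

A request of 10 usecs and 16 frames passes and yields count 16, period 10; a request of 0x10000 usecs is rejected, which corresponds to the -EINVAL path in the patch that follows.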
Signed-off-by: Eli Cohen
---
changes:
Bug fix in reporting tx coalescing

 drivers/infiniband/ulp/ipoib/ipoib.h       |    6 ++++
 drivers/infiniband/ulp/ipoib/ipoib_etool.c |   42 ++++++++++++++++++++++++++++
 2 files changed, 48 insertions(+), 0 deletions(-)

diff --git a/drivers/infiniband/ulp/ipoib/ipoib.h b/drivers/infiniband/ulp/ipoib/ipoib.h
index 6783936..b22b0c7 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib.h
+++ b/drivers/infiniband/ulp/ipoib/ipoib.h
@@ -303,6 +303,11 @@ struct ipoib_cm_dev_priv {
 	struct ib_recv_wr rx_wr;
 };
 
+struct ipoib_ethtool_st {
+	u16 coalesce_usecs;
+	u16 max_coalesced_frames;
+};
+
 /*
  * Device private locking: tx_lock protects members used in TX fast
  * path (and we use LLTX so upper layers don't do extra locking).
@@ -380,6 +385,7 @@ struct ipoib_dev_priv {
 	struct dentry *mcg_dentry;
 	struct dentry *path_dentry;
 #endif
+	struct ipoib_ethtool_st etool;
 };
 
 struct ipoib_ah {
diff --git a/drivers/infiniband/ulp/ipoib/ipoib_etool.c b/drivers/infiniband/ulp/ipoib/ipoib_etool.c
index 913aea0..958229a 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib_etool.c
+++ b/drivers/infiniband/ulp/ipoib/ipoib_etool.c
@@ -44,9 +44,51 @@ static void ipoib_get_drvinfo(struct net_device *netdev,
 	strncpy(drvinfo->driver, "ipoib", sizeof(drvinfo->driver) - 1);
 }
 
+static int ipoib_get_coalesce(struct net_device *dev,
+			      struct ethtool_coalesce *coal)
+{
+	struct ipoib_dev_priv *priv = netdev_priv(dev);
+
+	coal->rx_coalesce_usecs = priv->etool.coalesce_usecs;
+	coal->tx_coalesce_usecs = priv->etool.coalesce_usecs;
+	coal->rx_max_coalesced_frames = priv->etool.max_coalesced_frames;
+	coal->tx_max_coalesced_frames = priv->etool.max_coalesced_frames;
+
+	return 0;
+}
+
+static int ipoib_set_coalesce(struct net_device *dev,
+			      struct ethtool_coalesce *coal)
+{
+	struct ipoib_dev_priv *priv = netdev_priv(dev);
+	int ret;
+
+	if (coal->rx_coalesce_usecs > 0xffff ||
+	    coal->tx_coalesce_usecs > 0xffff ||
+	    coal->rx_max_coalesced_frames > 0xffff ||
+	    coal->tx_max_coalesced_frames >
0xffff)
+		return -EINVAL;
+
+	ret = ib_modify_cq(priv->cq, coal->rx_max_coalesced_frames,
+			   coal->rx_coalesce_usecs);
+	if (ret) {
+		ipoib_dbg(priv, "failed modifying CQ\n");
+		return ret;
+	}
+
+	coal->tx_coalesce_usecs = coal->rx_coalesce_usecs;
+	priv->etool.coalesce_usecs = coal->rx_coalesce_usecs;
+	coal->tx_max_coalesced_frames = coal->rx_max_coalesced_frames;
+	priv->etool.max_coalesced_frames = coal->rx_max_coalesced_frames;
+
+	return 0;
+}
+
 static const struct ethtool_ops ipoib_ethtool_ops = {
 	.get_drvinfo = ipoib_get_drvinfo,
 	.get_tso = ethtool_op_get_tso,
+	.get_coalesce = ipoib_get_coalesce,
+	.set_coalesce = ipoib_set_coalesce,
 };
 
 void ipoib_set_ethtool_ops(struct net_device *dev)
-- 
1.5.3.8

From vuhuong at mellanox.com Thu Jan 24 09:06:24 2008
From: vuhuong at mellanox.com (Vu Pham)
Date: Thu, 24 Jan 2008 09:06:24 -0800
Subject: [ofa-general] Re: [Scst-devel] [Stgt-devel] Performance of SCST versus STGT
In-Reply-To:
References:
Message-ID: <4798C590.8090605@mellanox.com>

Sufficool, Stanley wrote:
> Can I surmise from all this benchmarking of the ib_srpt module that this
> is now stable? Last I checked the git version still had a data
> corruption bug.
>
> I saw some patches submitted to fix this, but the OFED git repository
> (http://www.openfabrics.org/git/~vu/srpt.git) shows no recent commits.
>
>
I'll check it in with some other patches before the weekend

You should use ofed-1.3 (srpt is in there).
http://www.openfabrics.org/git/~vu/srpt.git is for older ofed version
(ie. 1.2.5 and 1.2)

> With all these impressive numbers, I am eager to see SRPT/SCST
> performance in the WinOF srp initiator environment using database loads.
>
> Props to Vu and Vlad for this outstanding storage transport!
>
>
Kudos to Vlad - great job on scst, iscsi_scst...!

>
> On Jan 24, 2008 8:06 AM, Robin Humble wrote:
>> On Tue, Jan 22, 2008 at 01:32:08PM +0100, Bart Van Assche wrote:
>>> .......................................................................................
>>> .                           . STGT read     SCST read    . STGT read     SCST read    .
>>> .                           . performance   performance  . performance   performance  .
>>> .                           . (0.5K, MB/s)  (0.5K, MB/s) . (1 MB, MB/s)  (1 MB, MB/s) .
>>> .......................................................................................
>>> . Ethernet (1 Gb/s network) . 77            78           . 77            89           .
>>> . IPoIB (8 Gb/s network)    . 163           185          . 201           239          .
>>> . iSER (8 Gb/s network)     . 250           N/A          . 360           N/A          .
>>> . SRP (8 Gb/s network)      . N/A           421          . N/A           683          .
>>> .......................................................................................
>
> Results with /dev/ram0 configured as backing store on the target
> (buffered I/O):
>                 Read          Write         Read          Write
>                 performance   performance   performance   performance
>                 (0.5K, MB/s)  (0.5K, MB/s)  (1 MB, MB/s)  (1 MB, MB/s)
> STGT + iSER     250           48            349           781
> SCST + SRP      411           66            659           746
>
> Results with /dev/ram0 configured as backing store on the target
> (direct I/O):
>                 Read          Write         Read          Write
>                 performance   performance   performance   performance
>                 (0.5K, MB/s)  (0.5K, MB/s)  (1 MB, MB/s)  (1 MB, MB/s)
> STGT + iSER     7.9           9.8           589           647
> SCST + SRP      12.3          9.7           811           794
>
> Bart.
>

From sweitzen at cisco.com Thu Jan 24 09:16:45 2008
From: sweitzen at cisco.com (Scott Weitzenkamp (sweitzen))
Date: Thu, 24 Jan 2008 09:16:45 -0800
Subject: [ofa-general] RE: [ewg] Not seeing any SDP performance changes inOFED 1.3 beta, and I get Oops when enabling sdp_zcopy_thresh
In-Reply-To:
References: <47445630.10000@dev.mellanox.co.il> <000001c8338c$206e80d0$614b8270$@rr.com> <4798A9F5.7030109@gmail.com>
Message-ID:

I've tested on RHEL4 and RHEL5, and see no sdp_zcopy_thresh improvement
for any message size, as measured with netperf, for any Arbel or
ConnectX HCA.

Scott

> -----Original Message-----
> From: Jim Mott [mailto:jim at mellanox.com]
> Sent: Thursday, January 24, 2008 7:57 AM
> To: Weikuan Yu; Scott Weitzenkamp (sweitzen)
> Cc: ewg at lists.openfabrics.org; general at lists.openfabrics.org
> Subject: RE: [ofa-general] RE: [ewg] Not seeing any SDP
> performance changes inOFED 1.3 beta, and I get Oops when
> enabling sdp_zcopy_thresh
>
> Hi,
>   64K is borderline for seeing bzcopy effect. Using an AMD 6000+ (3 Ghz
> dual core) in Asus M2A-VM motherboard with ConnectX running 2.3 firmware
> and OFED 1.3-rc3 stack running on 2.6.23.8 kernel.org kernel, I ran the
> test for 128K:
>   5546 sdp_zcopy_thresh=0 (off)
>   8709 sdp_zcopy_thresh=65536
>
> For these tests, I just have LD_PRELOAD set in my environment.
>
> =======================
>
> I see that TCP_MAXSEG is not being handled by libsdp and will look into
> it.
>
>
> [root at dirk ~]# modprobe ib_sdp
> [root at dirk ~]# netperf -v2 -4 -H 193.168.10.198 -l 30 -t TCP_STREAM -c -C -- -m 128K
> TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 193.168.10.198 (193.168.10.198) port 0 AF_INET
> netperf: get_tcp_info: getsockopt TCP_MAXSEG: errno 92
> Recv   Send   Send                         Utilization      Service Demand
> Socket Socket Message Elapsed              Send    Recv     Send    Recv
> Size   Size   Size    Time     Throughput  local   remote   local   remote
> bytes  bytes  bytes   secs.    10^6bits/s  % S     % S      us/KB   us/KB
>
>  87380  16384 131072  30.01    5545.69     51.47   14.43    1.521   1.706
>
> Alignment      Offset         Bytes     Bytes       Sends    Bytes      Recvs
> Local  Remote  Local  Remote  Xfered    Per                  Per
> Send   Recv    Send   Recv              Send (avg)           Recv (avg)
>     8      8       0      0   2.08e+10  131072.00   158690   33135.60   627718
>
> Maximum
> Segment
> Size (bytes)
>     -1
> [root at dirk ~]# echo 65536 >/sys/module/ib_sdp/parameters/sdp_zcopy_thresh
> [root at dirk ~]# netperf -v2 -4 -H 193.168.10.198 -l 30 -t TCP_STREAM -c -C -- -m 128K
> TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 193.168.10.198 (193.168.10.198) port 0 AF_INET
> netperf: get_tcp_info: getsockopt TCP_MAXSEG: errno 92
> Recv   Send   Send                         Utilization      Service Demand
> Socket Socket Message Elapsed              Send    Recv     Send    Recv
> Size   Size   Size    Time     Throughput  local   remote   local   remote
> bytes  bytes  bytes   secs.    10^6bits/s  % S     % S      us/KB   us/KB
>
>  87380  16384 131072  30.01    8708.58     50.63   14.55    0.953   1.095
>
> Alignment      Offset         Bytes     Bytes       Sends    Bytes      Recvs
> Local  Remote  Local  Remote  Xfered    Per                  Per
> Send   Recv    Send   Recv              Send (avg)           Recv (avg)
>     8      8       0      0   3.267e+10 131072.00   249228   26348.30   1239807
>
> Maximum
> Segment
> Size (bytes)
>     -1
>
> Thanks,
> Jim
>
> Jim Mott
> Mellanox Technologies Ltd.
> mail: jim at mellanox.com
> Phone: 512-294-5481
>
>
> -----Original Message-----
> From: Weikuan Yu [mailto:weikuan.yu at gmail.com]
> Sent: Thursday, January 24, 2008 9:09 AM
> To: Scott Weitzenkamp (sweitzen)
> Cc: Jim Mott; ewg at lists.openfabrics.org; general at lists.openfabrics.org
> Subject: Re: [ofa-general] RE: [ewg] Not seeing any SDP performance
> changes inOFED 1.3 beta, and I get Oops when enabling sdp_zcopy_thresh
>
> Hi, Scott,
>
> I have been running SDP tests across two woodcrest nodes with 4x DDR
> cards using OFED-1.2.5.4. The card/firmware info is below.
>
> CA 'mthca0'
>         CA type: MT25208
>         Number of ports: 2
>         Firmware version: 5.1.400
>         Hardware version: a0
>         Node GUID: 0x0002c90200228e0c
>         System image GUID: 0x0002c90200228e0f
>
> I could not get a bandwidth more than 5Gbps like you have shown here.
> Wonder if I need to upgrade to the latest software or firmware? Any
> suggestions?
>
> Thanks,
> --Weikuan
>
>
> TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.225.77 (192.168.225.77) port 0 AF_INET
> Recv   Send   Send                         Utilization      Service Demand
> Socket Socket Message Elapsed              Send    Recv     Send    Recv
> Size   Size   Size    Time     Throughput  local   remote   local   remote
> bytes  bytes  bytes   secs.    10^6bits/s  % S     % S      us/KB   us/KB
>
> 131072 131072 131072  10.00    4918.95     21.29   24.99    1.418   1.665
>
>
> Scott Weitzenkamp (sweitzen) wrote:
> > Jim,
> >
> > I am trying OFED-1.3-20071231-0600 and RHEL4 x86_64 on a dual CPU
> > (single core each CPU) Xeon system. I do not see any performance
> > improvement (either throughput or CPU utilization) using netperf when I
> > set /sys/module/ib_sdp/sdp_zcopy_thresh to 16384. Can you elaborate on
> > your HCA type, and performance improvement you see?
> >
> > Here's an example netperf command line when using a Cheetah DDR HCA and
> > 1.2.917 firmware (I have also tried ConnectX and 2.3.000 firmware too):
> >
> > [releng at svbu-qa1850-2 ~]$ LD_PRELOAD=libsdp.so netperf241 -v2 -4 -H 192.168.1.201 -l 30 -t TCP_STREAM -c -C -- -m 65536
> > TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.1.201 (192.168.1.201) port 0 AF_INET : histogram : demo
> >
> > Recv   Send   Send                         Utilization      Service Demand
> > Socket Socket Message Elapsed              Send    Recv     Send    Recv
> > Size   Size   Size    Time     Throughput  local   remote   local   remote
> > bytes  bytes  bytes   secs.    10^6bits/s  % S     % S      us/KB   us/KB
> >
> >  87380  16384  65536  30.01    7267.70     55.06   61.27    1.241   1.381
> >
> > Alignment      Offset         Bytes     Bytes       Sends    Bytes      Recvs
> > Local  Remote  Local  Remote  Xfered    Per                  Per
> > Send   Recv    Send   Recv              Send (avg)           Recv (avg)
> >     8      8       0      0   2.726e+10 65536.00    415942   48106.01   566648
> >
>

From eli at dev.mellanox.co.il Thu Jan 24 09:18:40 2008
From: eli at dev.mellanox.co.il (Eli Cohen)
Date: Thu, 24 Jan 2008 19:18:40 +0200
Subject: [ofa-general] Bogus Receive Completions
In-Reply-To: <4798B643.9060508@dls.net>
References: <475607AA.301@dls.net> <478B8172.2010104@mellanox.co.il> <1200988162.6925.170.camel@mtls03> <4797A6FD.2080708@dls.net> <4e6a6b3c0801231412t7970ce83u9a2bb9aa11a1b2cd@mail.gmail.com> <4797CDC3.6090707@dls.net> <4797DC47.6060104@dls.net> <4797DD87.8020808@dls.net> <1201184289.6755.7.camel@mtls03> <4798B643.9060508@dls.net>
Message-ID: <1201195120.6755.75.camel@mtls03>

On Thu, 2008-01-24 at 10:01 -0600, Roman Kononov wrote:
> On 2008-01-24 08:18 Eli Cohen said the following:
> > I am re-sending the patches again, this time with the bug of being
> > unable to handle QPs whose receive size is not a power of two fixed.
>
> .../libmthca/src>egrep '&.*qp->.q\.m.*-' *.c
> qp.c: ind = qp->sq.head & (qp->sq.max - 1);
> qp.c: ind = qp->rq.head & (qp->rq.max - 1);
> qp.c: next->nda_op = htonl(((i + 1) & (qp->rq.max - 1)) <<
> qp.c: next->nda_op = htonl((((i + 1) & (qp->sq.max - 1)) <<
>
> Should these change as well?

No, these are for memfree where the driver assures queue sizes are a
power of 2.

>
> In mthca_create_qp() there are assignments:
>	qp->sq.max = align_queue_size(pd->context, attr->cap.max_send_wr, 0);
>	qp->rq.max = align_queue_size(pd->context, attr->cap.max_recv_wr, 0);
> And later:
>	qp->sq.max = attr->cap.max_send_wr;
>	qp->rq.max = attr->cap.max_recv_wr;
> This is suspicious.

I agree this does not look too nice.
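
[Illustration, not part of the original message] The `head & (max - 1)` idiom in the grep output above wraps a ring index correctly only when max is a power of two, which is what the reply says the memfree path guarantees. A minimal user-space sketch of the difference follows; the helper names are hypothetical and are not taken from libmthca.

```c
#include <assert.h>
#include <stdint.h>

/* Nonzero when n is a power of two (zero is not). */
static int is_pow2(uint32_t n)
{
	return n != 0 && (n & (n - 1)) == 0;
}

/*
 * Wrap a running head counter into a ring of ring_size entries using a
 * mask. Only valid when ring_size is a power of two, hence the assert.
 */
static uint32_t ring_index_mask(uint32_t head, uint32_t ring_size)
{
	assert(is_pow2(ring_size));
	return head & (ring_size - 1);
}

/* General form: works for any nonzero ring size, at the cost of a division. */
static uint32_t ring_index_mod(uint32_t head, uint32_t ring_size)
{
	assert(ring_size != 0);
	return head % ring_size;
}
```

With a ring of 8 entries both helpers map head 9 to slot 1. With a ring of 6 entries, the raw expression `7 & (6 - 1)` evaluates to 5 while `7 % 6` is 1, so a mask applied to a non-power-of-two size would address the wrong slot.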
But going through the code I could not see how the kernel could return
a different value than what it got from userspace. Roland what do you
think?

> >
> >
> > Roman,
> > can you tell whether from your perspective there has been an improvement
> > in stability with the latest patches?
>
> Since I use 'nice' queue sizes, the very last patches made no difference.
>

I meant to ask whether there was improvement between that time before
the patches were posted to now.

From sashak at voltaire.com Thu Jan 24 09:38:47 2008
From: sashak at voltaire.com (Sasha Khapyorsky)
Date: Thu, 24 Jan 2008 17:38:47 +0000
Subject: [ofa-general] [PATCH] opensm/scripts: fix sldd.sh paths
In-Reply-To: <20080123191235.179886dd.weiny2@llnl.gov>
References: <20080123191235.179886dd.weiny2@llnl.gov>
Message-ID: <20080124173847.GE11277@sashak.voltaire.com>

Fix sldd.sh path in startup script and path to config file.

Signed-off-by: Sasha Khapyorsky
---
 opensm/scripts/opensmd |    2 +-
 opensm/scripts/sldd.sh |    2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/opensm/scripts/opensmd b/opensm/scripts/opensmd
index e5c734e..549e206 100755
--- a/opensm/scripts/opensmd
+++ b/opensm/scripts/opensmd
@@ -43,7 +43,7 @@ prog=/usr/bin/opensm
 bin=${prog##*/}
 
 # Handover daemon for updating guid2lid cache file
-sldd_prog=/usr/bin/sldd.sh
+sldd_prog=/usr/sbin/sldd.sh
 sldd_bin=${sldd_prog##*/}
 sldd_pid_file=/var/run/sldd.pid
 
diff --git a/opensm/scripts/sldd.sh b/opensm/scripts/sldd.sh
index cfb8953..21f6126 100755
--- a/opensm/scripts/sldd.sh
+++ b/opensm/scripts/sldd.sh
@@ -41,7 +41,7 @@
 # config: /etc/opensm.conf
 
 [ -f /etc/sysconfig/opensm.conf ] && CONFIG=/etc/sysconfig/opensm.conf
-[ -f /etc/ofa/opensm.conf ] && CONFIG=/etc/sysconfig/opensm.conf
+[ -f /etc/ofa/opensm.conf ] && CONFIG=/etc/ofa/opensm.conf
 
 SLDD_DEBUG=${SLDD_DEBUG:-0}
 
-- 
1.5.4.rc2.60.gb2e62

From sashak at voltaire.com Thu Jan 24 09:40:36 2008
From: sashak at voltaire.com (Sasha Khapyorsky)
Date: Thu, 24 Jan 2008 17:40:36 +0000
Subject: [ofa-general] Re: Is the opensm/scripts/sldd.sh script still needed?
In-Reply-To: <20080123191235.179886dd.weiny2@llnl.gov>
References: <20080123191235.179886dd.weiny2@llnl.gov>
Message-ID: <20080124174036.GF11277@sashak.voltaire.com>

Hi Ira,

On 19:12 Wed 23 Jan , Ira Weiny wrote:
> I was going through the start up script provided in the opensm package and
> found a reference to the sldd.sh script. Is this still needed? It still
> exists in opensm/scripts/sldd.sh but is _not_ in the spec.in file?
>
> The opensm/scripts/redhat-opensm.init will try and run this if the
> "HONORE_GUID2LID_FLAG" is set. So should we add it to the spec file or remove
> the usage in the start up script?

I think we need to install this script if it could be used. Will send
the patch.

Sasha

From sashak at voltaire.com Thu Jan 24 09:41:27 2008
From: sashak at voltaire.com (Sasha Khapyorsky)
Date: Thu, 24 Jan 2008 17:41:27 +0000
Subject: [ofa-general] [PATCH] opensm/opensm.spec: install scripts/sldd.sh
In-Reply-To: <20080124174036.GF11277@sashak.voltaire.com>
References: <20080123191235.179886dd.weiny2@llnl.gov> <20080124174036.GF11277@sashak.voltaire.com>
Message-ID: <20080124174127.GG11277@sashak.voltaire.com>

It is used in startup script when HONORE_GUID2LID is enabled.

Signed-off-by: Sasha Khapyorsky
---
 opensm/opensm.spec.in |    2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/opensm/opensm.spec.in b/opensm/opensm.spec.in
index cde005f..72c8eba 100644
--- a/opensm/opensm.spec.in
+++ b/opensm/opensm.spec.in
@@ -98,6 +98,7 @@ mkdir -p $etc/{init.d,ofa,logrotate.d}
 install -m 755 scripts/${REDHAT}opensm.init $etc/init.d/opensm
 install -m 644 scripts/opensm.conf $etc/ofa/opensm.conf
 install -m 644 scripts/opensm.logrotate $etc/logrotate.d/opensm
+install -m 755 scripts/sldd.sh $RPM_BUILD_ROOT%{_sbindir}/sldd.sh
 
 %clean
 rm -rf $RPM_BUILD_ROOT
@@ -126,6 +127,7 @@ fi
 %{_mandir}/man8/*
 %doc AUTHORS COPYING README
 %{_sysconfdir}/init.d/opensm
+%{_sbindir}/sldd.sh
 %config(noreplace) %{_sysconfdir}/ofa/opensm.conf
 %config(noreplace) %{_sysconfdir}/logrotate.d/opensm
 %dir /var/cache/opensm
-- 
1.5.4.rc2.60.gb2e62

From swise at opengridcomputing.com Thu Jan 24 09:30:52 2008
From: swise at opengridcomputing.com (Steve Wise)
Date: Thu, 24 Jan 2008 11:30:52 -0600
Subject: [ofa-general] SDP and iWARP
In-Reply-To: <47977262.1060906@hpc.ufl.edu>
References: <4783A5B0.6040603@hpc.ufl.edu><4783B3F5.20600@opengridcomputing.com><4783BDD5.7000702@hpc.ufl.edu><4783C326.3070306@opengridcomputing.com><478634A5.3080204@hpc.ufl.edu><47863794.9080709@opengridcomputing.com><47865A4A.4070603@hpc.ufl.edu><47865E5B.4030607@opengridcomputing.com><4787936E.5010603@hpc.ufl.edu> <4787977E.509@opengridcomputing.com> <479765AC.1040600@hpc.ufl.edu> <8A71B368A89016469F72CD08050AD33401FCDA2F@maui.asicdesigners.com> <47977262.1060906@hpc.ufl.edu>
Message-ID: <4798CB4C.7070706@opengridcomputing.com>

Are these recv buffers user memory or kernel memory?

I just submitted a fix for a bug in build_phys_page_list(). Perhaps
you're hitting this? It would hit it if these are buffers allocated by
the sdp kernel module and registered via ib_reg_phys_mr().
Also: If sdp is using ib_get_dma_mr() to access all of memory, then it won't work with the chelsio driver, which has a 4GB limit on MRs. So cxgb3 creates dma_mrs that map only addresses 0..4GB-1. This just doesn't work at all if there is an iommu mapping bus addresses above 4GB. Steve. Craig Prescott wrote: > > Hi Felix; > > Here are the last 4 WRs: > > ... > Entering iwch_post_receive > iwch_post_receive: Dumping built work request before ring_doorbell: > iwch_post_receive: WQE ffff810241d59e00: 17c001008000000d > iwch_post_receive: WQE ffff810241d59e08: 0000000000000000 > iwch_post_receive: WQE ffff810241d59e10: 0000000000000001 > iwch_post_receive: WQE ffff810241d59e18: 000002ff00000810 > iwch_post_receive: WQE ffff810241d59e20: 000000044eac3000 > iwch_post_receive: WQE ffff810241d59e28: 0000000000000000 > iwch_post_receive: WQE ffff810241d59e30: 0000000000000000 > iwch_post_receive: WQE ffff810241d59e38: 0000000000000000 > iwch_post_receive: WQE ffff810241d59e40: 0000000000000000 > iwch_post_receive: WQE ffff810241d59e48: 0000000000000000 > iwch_post_receive: WQE ffff810241d59e50: 0000000000000000 > iwch_post_receive: WQE ffff810241d59e58: 0000000000000000 > iwch_post_receive: WQE ffff810241d59e60: 0000000000000000 > iwch_post_receive: returning 0 > Entering iwch_post_receive > iwch_post_receive: Dumping built work request before ring_doorbell: > iwch_post_receive: WQE ffff810241d59e80: 17c001008000000d > iwch_post_receive: WQE ffff810241d59e88: 0000000000000000 > iwch_post_receive: WQE ffff810241d59e90: 0000000000000001 > iwch_post_receive: WQE ffff810241d59e98: 000002ff00000810 > iwch_post_receive: WQE ffff810241d59ea0: 000000044eac4000 > iwch_post_receive: WQE ffff810241d59ea8: 0000000000000000 > iwch_post_receive: WQE ffff810241d59eb0: 0000000000000000 > iwch_post_receive: WQE ffff810241d59eb8: 0000000000000000 > iwch_post_receive: WQE ffff810241d59ec0: 0000000000000000 > iwch_post_receive: WQE ffff810241d59ec8: 0000000000000000 > iwch_post_receive: WQE 
ffff810241d59ed0: 0000000000000000 > iwch_post_receive: WQE ffff810241d59ed8: 0000000000000000 > iwch_post_receive: WQE ffff810241d59ee0: 0000000000000000 > iwch_post_receive: returning 0 > Entering iwch_post_receive > iwch_post_receive: Dumping built work request before ring_doorbell: > iwch_post_receive: WQE ffff810241d59f00: 17c001008000000d > iwch_post_receive: WQE ffff810241d59f08: 0000000000000000 > iwch_post_receive: WQE ffff810241d59f10: 0000000000000001 > iwch_post_receive: WQE ffff810241d59f18: 000002ff00000810 > iwch_post_receive: WQE ffff810241d59f20: 000000044eac5000 > iwch_post_receive: WQE ffff810241d59f28: 0000000000000000 > iwch_post_receive: WQE ffff810241d59f30: 0000000000000000 > iwch_post_receive: WQE ffff810241d59f38: 0000000000000000 > iwch_post_receive: WQE ffff810241d59f40: 0000000000000000 > iwch_post_receive: WQE ffff810241d59f48: 0000000000000000 > iwch_post_receive: WQE ffff810241d59f50: 0000000000000000 > iwch_post_receive: WQE ffff810241d59f58: 0000000000000000 > iwch_post_receive: WQE ffff810241d59f60: 0000000000000000 > iwch_post_receive: returning 0 > Entering iwch_post_receive > iwch_post_receive: Dumping built work request before ring_doorbell: > iwch_post_receive: WQE ffff810241d59f80: 17c001008000000d > iwch_post_receive: WQE ffff810241d59f88: 0000000000000000 > iwch_post_receive: WQE ffff810241d59f90: 0000000000000001 > iwch_post_receive: WQE ffff810241d59f98: 000002ff00000810 > iwch_post_receive: WQE ffff810241d59fa0: 000000044eac6000 > iwch_post_receive: WQE ffff810241d59fa8: 0000000000000000 > iwch_post_receive: WQE ffff810241d59fb0: 0000000000000000 > iwch_post_receive: WQE ffff810241d59fb8: 0000000000000000 > iwch_post_receive: WQE ffff810241d59fc0: 0000000000000000 > iwch_post_receive: WQE ffff810241d59fc8: 0000000000000000 > iwch_post_receive: WQE ffff810241d59fd0: 0000000000000000 > iwch_post_receive: WQE ffff810241d59fd8: 0000000000000000 > iwch_post_receive: WQE ffff810241d59fe0: 0000000000000000 > iwch_post_receive: 
returning 0 > > Thanks, > Craig > > > Felix Marti wrote: >> Hi Craig, >> >> Can you please dump not only the last, but the last 4 WRs? >> >> Thanks, >> felix >> >>> -----Original Message----- >>> From: general-bounces at lists.openfabrics.org [mailto:general- >>> bounces at lists.openfabrics.org] On Behalf Of Craig Prescott >>> Sent: Wednesday, January 23, 2008 8:05 AM >>> To: Steve Wise >>> Cc: general at lists.openfabrics.org >>> Subject: Re: [ofa-general] SDP and iWARP >>> >>> Steve Wise wrote: >>>> Craig Prescott wrote: >>>>> Steve Wise wrote: >>>>>> Craig Prescott wrote: >>>>>>> Steve Wise wrote: >>>>>>>> Craig Prescott wrote: >>>>>>>>> The above call also emits a couple of messages >>>>>>>>> into the listener's syslog now : >>>>>>>>> >>>>>>>>> Jan 9 21:53:54 tebow2 kernel: iwch_ev_dispatch - CQE Err qpid >>>>>>>>> 0x20 opcode 14 status 0x6 type 1 wrid.hi 0x0 wrid.lo 0x80000000 >>>>>>>>> Jan 9 21:53:54 tebow2 kernel: post_qp_event - AE qpid 0x20 >>> opcode >>>>>>>>> 14 status 0x6 type 1 wrid.hi 0x0 wrid.lo 0x80000000 >>>>>>>>> >>>>>>>> This is an async event generated due to a failure processing a >> SQ >>>>>>>> WR, I think. opcodes and status codes for iw_cxgb3 are in >>> cxio_wr.h. >>>>>>>> type 1 means it was an egress (SQ) failure >>>>>>>> status 0x6 is a base/bounds violation, >>>>>>>> but 14 seems incorrect. That's not a valid T3 opcode. ???? >>>>>>>> >>>>>>> Ok, thanks! I guess I'm not sure what to make of that yet, >>> though. >>>>>> See where in iwch_accept_cr() the failure is happening. It >> doesn't >>>>>> look like send_mpa_reply() is being called. >>>>>> >>>>> The ECONNRESET is coming from here in iwch_accept_cr(): >>>>> >>>>> ... >>>>> /* wait for wr_ack */ >>>>> wait_event(ep->com.waitq, ep->com.rpl_done); >>>>> err = ep->com.rpl_err; >>>>> ... >>>>> >>>>> Is that what you thought was happening? >>>> I don't know exactly what is going on! 
But the code above means >> that >>>> the firmware never successfully sent the last streaming message (the >>>> mpa-start reply) and never transitioned the connection into rdma >>> mode. >>>> And the async error might indicate that some WR was posted prior to >>>> doing the rdma_accept() and that WR had problems. >>> Ok. I'm sorry for such a slow response. >>> >>>> a few questions: >>>> >>>> What firmware are you running? ethtool -i will tell you. >>> [root at tebow1 ~]# ethtool -i eth4 >>> driver: cxgb3 >>> version: 1.0-ko >>> firmware-version: T 5.0.0 TP 1.1.0 >>> bus-info: 0000:86:00.0 >>> >>>> What ofed version exactly? >>> OFED 1.3 daily from a few weeks back now: OFED-1.3-20080107-0942 >>> >>>> Does sdp post a SQ or RQ WR prior to doing the rdma_accept()? Can >>> you >>>> dump that work request? Maybe in iwch_post_send and iwch_post_recv, >>>> dump the work request after it is built and before the code rings >> the >>>> doorbell. You can dump it as 8B flits, and be sure an put the flits >>> in >>>> host byte order. See cxio_dump_wqe() in cxio_dbg.c... 
>>> The following is the last work request seen before rdma_accept(): >>> >>> iwch_post_receive: Dumping built work request before ring_doorbell: >>> iwch_post_receive: WQE ffff810241d59f80: 17c001008000000d >>> iwch_post_receive: WQE ffff810241d59f88: 0000000000000000 >>> iwch_post_receive: WQE ffff810241d59f90: 0000000000000001 >>> iwch_post_receive: WQE ffff810241d59f98: 000002ff00000810 >>> iwch_post_receive: WQE ffff810241d59fa0: 000000044eac6000 >>> iwch_post_receive: WQE ffff810241d59fa8: 0000000000000000 >>> iwch_post_receive: WQE ffff810241d59fb0: 0000000000000000 >>> iwch_post_receive: WQE ffff810241d59fb8: 0000000000000000 >>> iwch_post_receive: WQE ffff810241d59fc0: 0000000000000000 >>> iwch_post_receive: WQE ffff810241d59fc8: 0000000000000000 >>> iwch_post_receive: WQE ffff810241d59fd0: 0000000000000000 >>> iwch_post_receive: WQE ffff810241d59fd8: 0000000000000000 >>> iwch_post_receive: WQE ffff810241d59fe0: 0000000000000000 >>> iwch_post_receive: returning 0 >>> >>> This comes from sdp_init_qp(), via sdp_connect_handler(). >>> There are a total of 64 work requests (all from >>> iwch_post_receive()) generated while the netserver is >>> trying to handle the RDMA_CM_EVENT_CONNECT_REQUEST. >>> >>> Can you help me decode the above work request? 
>>> >>> Thanks, >>> Craig >>> >>> >>> >>> _______________________________________________ >>> general mailing list >>> general at lists.openfabrics.org >>> http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general >>> >>> To unsubscribe, please visit >> http://openib.org/mailman/listinfo/openib- >>> general From kononov at dls.net Thu Jan 24 09:39:51 2008 From: kononov at dls.net (Roman Kononov) Date: Thu, 24 Jan 2008 11:39:51 -0600 Subject: [ofa-general] Bogus Receive Completions In-Reply-To: <1201195120.6755.75.camel@mtls03> References: <475607AA.301@dls.net> <478B8172.2010104@mellanox.co.il> <1200988162.6925.170.camel@mtls03> <4797A6FD.2080708@dls.net> <4e6a6b3c0801231412t7970ce83u9a2bb9aa11a1b2cd@mail.gmail.com> <4797CDC3.6090707@dls.net> <4797DC47.6060104@dls.net> <4797DD87.8020808@dls.net> <1201184289.6755.7.camel@mtls03> <4798B643.9060508@dls.net> <1201195120.6755.75.camel@mtls03> Message-ID: <4798CD67.8040503@dls.net> On 2008-01-24 11:18 Eli Cohen said the following: >>> Roman, >>> can you tell whether from your perspective there has been an improvement >>> in stability with the latest patches? >> Since I use 'nice' queue sizes, the very last patches made no difference. >> > I meant to ask whether there was improvement between that time before > the patches were posted to now. > Difficult to say. Before the patches, I got a few errors per day. After the patches, I get different errors with same frequency. Question: is there an easy way to inject transmission errors into the fabric? 
Roman From rdreier at cisco.com Thu Jan 24 09:45:43 2008 From: rdreier at cisco.com (Roland Dreier) Date: Thu, 24 Jan 2008 09:45:43 -0800 Subject: [ofa-general] SDP and iWARP In-Reply-To: <4798CB4C.7070706@opengridcomputing.com> (Steve Wise's message of "Thu, 24 Jan 2008 11:30:52 -0600") References: <4783A5B0.6040603@hpc.ufl.edu> <4783B3F5.20600@opengridcomputing.com> <4783BDD5.7000702@hpc.ufl.edu> <4783C326.3070306@opengridcomputing.com> <478634A5.3080204@hpc.ufl.edu> <47863794.9080709@opengridcomputing.com> <47865A4A.4070603@hpc.ufl.edu> <47865E5B.4030607@opengridcomputing.com> <4787936E.5010603@hpc.ufl.edu> <4787977E.509@opengridcomputing.com> <479765AC.1040600@hpc.ufl.edu> <8A71B368A89016469F72CD08050AD33401FCDA2F@maui.asicdesigners.com> <47977262.1060906@hpc.ufl.edu> <4798CB4C.7070706@opengridcomputing.com> Message-ID: Sorry to come into this thread so late, but does it make sense to try the current SDP code over iWARP? As I understand things, the RDMA consortium has its own spec for SDP on iWARP, which may not precisely correspond to the IBA SDP annex. So probably the SDP code would need updating to work over iWARP. (And don't all the iWARP vendors have TCP offload socket stuff for their adapters anyway, which is a simpler solution to the same problem that SDP solves?) - R. From jim at mellanox.com Thu Jan 24 09:46:42 2008 From: jim at mellanox.com (Jim Mott) Date: Thu, 24 Jan 2008 09:46:42 -0800 Subject: [ofa-general] RE: [ewg] Not seeing any SDP performance changes inOFED 1.3 beta, and I get Oops when enabling sdp_zcopy_thresh In-Reply-To: References: <47445630.10000@dev.mellanox.co.il> <000001c8338c$206e80d0$614b8270$@rr.com> <4798A9F5.7030109@gmail.com> Message-ID: I am really puzzled. The majority of my testing has been between Rhat4U4 and Rhat5. 
Using netperf command lines of the form:

  netperf -C -c -P 0 -t TCP_RR -H 193.168.10.143 -l 60 -- -r 64
  netperf -C -c -P 0 -t TCP_STREAM -H 193.168.10.143 -l 60 -- -r 1000000

and a process of:

  - set sdp_zcopy_thresh=0, run bandwidth test
  - set sdp_zcopy_thresh=size, run bandwidth test

I repeatedly get results that look like this:

  size       SDP        Bzcopy
  65536      7375.00    7515.98
  131072     7465.70    8105.58
  1000000    6541.87    9948.76

These numbers are from high end (2-socket, quad-core) machines. When you use smaller machines, like the AMD dual-core shown below, the differences between SDP with and without bzcopy are more striking.

The process to start the netserver is:

  export LD_LIBRARY_PATH=/usr/local/ofed/lib64:/usr/local/ofed/lib
  export LD_PRELOAD=libsdp.so
  export LIBSDP_CONFIG_FILE=/etc/infiniband/libsdp.conf
  netserver

The process to start the netperf is similar:

  export LD_LIBRARY_PATH=/usr/local/ofed/lib64:/usr/local/ofed/lib
  export LD_PRELOAD=libsdp.so
  export LIBSDP_CONFIG_FILE=/etc/infiniband/libsdp.conf
  netperf -C -c -P 0 -t TCP_STREAM -H 193.168.10.143 -l 60 -- -r 1000000

You can unload and reload ib_sdp between tests, but I just echo 0 and echo size into sdp_zcopy_thresh on the sending side. Note that it is in a different place on Rhat4u4 and Rhat5.

My libsdp.conf is the default that ships with OFED. Stripping the comments (grep -v), it is just:

  log min-level 9 destination file libsdp.log
  use both server * *:*
  use both client * *:*

Note that if you build locally:

  cd /tmp/openib_gen2/xxxx/ofa_1_3_dev_kernel
  make install

the libsdp.conf file seems to get lost. You must restore it by hand.

I have a shell script that automates this testing for a wide range of message sizes:

  64 128 512 1024 2048 4096 8192 16000 32768 65536 131072 1000000

on multiple transports:

  IP      both        "echo datagram > /sys/class/net/ib0/mode"
  IP-CM   both        "echo connected > /sys/class/net/ib0/mode"
  SDP     both
  Bzcopy  TCP_STREAM

where "both" means TCP_RR and TCP_STREAM testing.

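For anyone who wants to try a similar sweep, here is a rough sketch of what such a loop could look like. It is not the script mentioned above, only an illustration under stated assumptions: the variable names (NETSERVER_HOST, DURATION, THRESH_FILE, RUN) and the dry-run switch are made up for this example, the sysfs path is the one used on the newer kernels and differs on older ones, and it assumes -m for the TCP_STREAM send size and -r for the TCP_RR request size.

```shell
#!/bin/sh
# Sketch of a message-size sweep. It only prints commands unless RUN=1 is set.
# NETSERVER_HOST, DURATION, THRESH_FILE and RUN are hypothetical names for this example.
NETSERVER_HOST=${NETSERVER_HOST:-193.168.10.143}
DURATION=${DURATION:-60}
THRESH_FILE=${THRESH_FILE:-/sys/module/ib_sdp/parameters/sdp_zcopy_thresh}
SIZES="64 128 512 1024 2048 4096 8192 16000 32768 65536 131072 1000000"

# Build one netperf command line for a test type and message size.
# Assumption: -r sets the TCP_RR request size, -m the TCP_STREAM send size.
netperf_cmd() {
    test_type=$1
    size=$2
    case "$test_type" in
        TCP_RR)     opt="-r $size" ;;
        TCP_STREAM) opt="-m $size" ;;
        *) echo "unknown test: $test_type" >&2; return 1 ;;
    esac
    echo "netperf -C -c -P 0 -t $test_type -H $NETSERVER_HOST -l $DURATION -- $opt"
}

# For every size, run (or just print) the bandwidth test twice:
# once with sdp_zcopy_thresh=0 (plain SDP) and once with it set to the size (Bzcopy).
sweep() {
    for size in $SIZES; do
        for thresh in 0 "$size"; do
            cmd=$(netperf_cmd TCP_STREAM "$size") || return 1
            if [ "${RUN:-0}" = "1" ]; then
                # Needs root, a loaded ib_sdp module and a running netserver.
                echo "$thresh" > "$THRESH_FILE"
                LD_PRELOAD=libsdp.so $cmd
            else
                echo "sdp_zcopy_thresh=$thresh: $cmd"
            fi
        done
    done
}
```

Calling sweep with RUN unset only prints the 24 command lines, which makes it easy to check the plan before touching the module parameter; with RUN=1 it writes the threshold and runs netperf under libsdp.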
The variance in SDP bandwidth results can be 10%-15% between runs. The difference between Bzcopy and non-Bzcopy is always very visible for 128K and up tests though. Could some other people please try to run some of these tests? If only to help me know whether I am crazy. Thanks, Jim Jim Mott Mellanox Technologies Ltd. mail: jim at mellanox.com Phone: 512-294-5481 -----Original Message----- From: Scott Weitzenkamp (sweitzen) [mailto:sweitzen at cisco.com] Sent: Thursday, January 24, 2008 11:17 AM To: Jim Mott; Weikuan Yu Cc: ewg at lists.openfabrics.org; general at lists.openfabrics.org Subject: RE: [ofa-general] RE: [ewg] Not seeing any SDP performance changes inOFED 1.3 beta, and I get Oops when enabling sdp_zcopy_thresh I've tested on RHEL4 and RHEL5, and see no sdp_zcopy_thresh improvement for any message size, as measured with netperf, for any Arbel or ConnectX HCA. Scott > -----Original Message----- > From: Jim Mott [mailto:jim at mellanox.com] > Sent: Thursday, January 24, 2008 7:57 AM > To: Weikuan Yu; Scott Weitzenkamp (sweitzen) > Cc: ewg at lists.openfabrics.org; general at lists.openfabrics.org > Subject: RE: [ofa-general] RE: [ewg] Not seeing any SDP > performance changes inOFED 1.3 beta, and I get Oops when > enabling sdp_zcopy_thresh > > Hi, > 64K is borderline for seeing bzcopy effect. Using an AMD > 6000+ (3 Ghz > dual core) in Asus M2A-VM motherboard with ConnectX running > 2.3 firmware > and OFED 1.3-rc3 stack running on 2.6.23.8 kernel.org kernel, > I ran the > test for 128K: > 5546 sdp_zcopy_thresh=0 (off) > 8709 sdp_zcopy_thresh=65536 > > For these tests, I just have LD_PRELOAD set in my environment. > > ======================= > > I see that TCP_MAXSEG is not being handled by libsdp and will > look into > it. 
> > > [root at dirk ~]# modprobe ib_sdp > [root at dirk ~]# netperf -v2 -4 -H 193.168.10.198 -l 30 -t TCP_STREAM -c > -C -- -m 128K > TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to > 193.168.10.198 > (193.168.10.198) port 0 AF_INET > netperf: get_tcp_info: getsockopt TCP_MAXSEG: errno 92 > Recv Send Send Utilization Service > Demand > Socket Socket Message Elapsed Send Recv Send > Recv > Size Size Size Time Throughput local remote local > remote > bytes bytes bytes secs. 10^6bits/s % S % S us/KB > us/KB > > 87380 16384 131072 30.01 5545.69 51.47 14.43 1.521 > 1.706 > > Alignment Offset Bytes Bytes Sends Bytes > Recvs > Local Remote Local Remote Xfered Per Per > Send Recv Send Recv Send (avg) Recv (avg) > 8 8 0 0 2.08e+10 131072.00 158690 33135.60 > 627718 > > Maximum > Segment > Size (bytes) > -1 > [root at dirk ~]# echo 65536 > >/sys/module/ib_sdp/parameters/sdp_zcopy_thresh > [root at dirk ~]# netperf -v2 -4 -H 193.168.10.198 -l 30 -t TCP_STREAM -c > -C -- -m 128K > TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to > 193.168.10.198 > (193.168.10.198) port 0 AF_INET > netperf: get_tcp_info: getsockopt TCP_MAXSEG: errno 92 > Recv Send Send Utilization Service > Demand > Socket Socket Message Elapsed Send Recv Send > Recv > Size Size Size Time Throughput local remote local > remote > bytes bytes bytes secs. 10^6bits/s % S % S us/KB > us/KB > > 87380 16384 131072 30.01 8708.58 50.63 14.55 0.953 > 1.095 > > Alignment Offset Bytes Bytes Sends Bytes > Recvs > Local Remote Local Remote Xfered Per Per > Send Recv Send Recv Send (avg) Recv (avg) > 8 8 0 0 3.267e+10 131072.00 249228 26348.30 > 1239807 > > Maximum > Segment > Size (bytes) > -1 > > Thanks, > JIm > > Jim Mott > Mellanox Technologies Ltd. 
> mail: jim at mellanox.com > Phone: 512-294-5481 > > > -----Original Message----- > From: Weikuan Yu [mailto:weikuan.yu at gmail.com] > Sent: Thursday, January 24, 2008 9:09 AM > To: Scott Weitzenkamp (sweitzen) > Cc: Jim Mott; ewg at lists.openfabrics.org; general at lists.openfabrics.org > Subject: Re: [ofa-general] RE: [ewg] Not seeing any SDP performance > changes inOFED 1.3 beta, and I get Oops when enabling sdp_zcopy_thresh > > Hi, Scott, > > I have been running SDP tests across two woodcrest nodes with 4x DDR > cards using OFED-1.2.5.4. The card/firmware info is below. > > CA 'mthca0' > CA type: MT25208 > Number of ports: 2 > Firmware version: 5.1.400 > Hardware version: a0 > Node GUID: 0x0002c90200228e0c > System image GUID: 0x0002c90200228e0f > > I could not get a bandwidth more than 5Gbps like you have shown here. > Wonder if I need to upgrade to the latest software or firmware? Any > suggestions? > > Thanks, > --Weikuan > > > TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to > 192.168.225.77 > (192.168 > .225.77) port 0 AF_INET > Recv Send Send Utilization > Service > Demand > Socket Socket Message Elapsed Send Recv Send > Recv > Size Size Size Time Throughput local remote local > remote > bytes bytes bytes secs. 10^6bits/s % S % S us/KB > us/KB > > 131072 131072 131072 10.00 4918.95 21.29 24.99 1.418 > 1.665 > > > Scott Weitzenkamp (sweitzen) wrote: > > Jim, > > > > I am trying OFED-1.3-20071231-0600 and RHEL4 x86_64 on a dual CPU > > (single core each CPU) Xeon system. I do not see any performance > > improvement (either throughput or CPU utilization) using > netperf when > I > > set /sys/module/ib_sdp/sdp_zcopy_thresh to 16384. Can you elaborate > on > > your HCA type, and performance improvement you see? 
> > > > Here's an example netperf command line when using a Cheetah DDR HCA > and > > 1.2.917 firmware (I have also tried ConnectX and 2.3.000 firmware > too): > > > > [releng at svbu-qa1850-2 ~]$ LD_PRELOAD=libsdp.so netperf241 -v2 -4 -H > > 192.168.1.201 -l 30 -t TCP_STREAM -c -C -- -m 65536 > > TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to > 192.168.1.201 > > (192.168.1.201) port 0 AF_INET : histogram : demo > > > > Recv Send Send Utilization > Service > > Demand > > Socket Socket Message Elapsed Send Recv Send > > Recv > > Size Size Size Time Throughput local remote local > > remote > > bytes bytes bytes secs. 10^6bits/s % S % S us/KB > > us/KB > > > > 87380 16384 65536 30.01 7267.70 55.06 61.27 1.241 > > 1.381 > > > > Alignment Offset Bytes Bytes Sends Bytes > > Recvs > > Local Remote Local Remote Xfered Per Per > > Send Recv Send Recv Send (avg) > Recv (avg) > > 8 8 0 0 2.726e+10 65536.00 415942 > 48106.01 > > 566648 > > > From swise at opengridcomputing.com Thu Jan 24 09:54:26 2008 From: swise at opengridcomputing.com (Steve Wise) Date: Thu, 24 Jan 2008 11:54:26 -0600 Subject: [ofa-general] SDP and iWARP In-Reply-To: References: <4783A5B0.6040603@hpc.ufl.edu> <4783B3F5.20600@opengridcomputing.com> <4783BDD5.7000702@hpc.ufl.edu> <4783C326.3070306@opengridcomputing.com> <478634A5.3080204@hpc.ufl.edu> <47863794.9080709@opengridcomputing.com> <47865A4A.4070603@hpc.ufl.edu> <47865E5B.4030607@opengridcomputing.com> <4787936E.5010603@hpc.ufl.edu> <4787977E.509@opengridcomputing.com> <479765AC.1040600@hpc.ufl.edu> <8A71B368A89016469F72CD08050AD33401FCDA2F@maui.asicdesigners.com> <47977262.1060906@hpc.ufl.edu> <4798CB4C.7070706@opengridcomputing.com> Message-ID: <4798D0D2.5070103@opengridcomputing.com> Roland Dreier wrote: > Sorry to come into this thread so late, but does it make sense to try > the current SDP code over iWARP? 
As I understand things, the RDMA > consortium has its own spec for SDP on iWARP, which may not precisely > correspond to the IBA SDP annex. So probably the SDP code would need > updating to work over iWARP. > I didn't think they were that different, but I don't know for sure. However, unless the IB-SDP uses atomics or some other IB-specific work request, it just might work. > (And don't all the iWARP vendors have TCP offload socket stuff for > their adapters anyway, which is a simpler solution to the same problem > that SDP solves?) > Dunno about other vendors, but Chelsio's TOE code is available on their web site (service.chelsio.com). It will be more efficient than SDP over iWARP (no SDP/iWARP headers needed). I assume Craig is doing some research to perhaps determine exactly how SDP performs over iwarp vs TOE or standard TCP. Steve. From emax1005 at 012.net.il Thu Jan 24 07:00:27 2008 From: emax1005 at 012.net.il (=?windows-1255?Q?=E4=EE=F8=EB=E6_=EC=E4=EB=F9=F8=FA_=EE=F0=E4=EC=E9=ED?=) Date: Thu, 24 Jan 2008 17:00:27 +0200 Subject: [ofa-general] =?windows-1255?b?7uQg4+X09yDs6iDg+iDk8vH3Pw==?= Message-ID: <1a08d5627e8479aef60540f4001c2d43@012.net.il> An HTML attachment was scrubbed... URL: From sashak at voltaire.com Thu Jan 24 10:15:30 2008 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Thu, 24 Jan 2008 18:15:30 +0000 Subject: [ofa-general] Opensm compatibility with rate=1 In-Reply-To: <571f1a060801231518i76c3383l38e3fe9cc32e3cd8@mail.gmail.com> References: <571f1a060801231518i76c3383l38e3fe9cc32e3cd8@mail.gmail.com> Message-ID: <20080124181530.GH11277@sashak.voltaire.com> Hi Greg, On 15:18 Wed 23 Jan , Greg Kurtzer wrote: > > We recently updated OFED (among other things) on one of our IB test > beds that use older cards. 
Something broke recently with an error in > dmesg like: > > kernel: ib0: multicast join failed for > ff12:401b:ffff:0000:0000:0000:ffff:ffff, status -22 > > We used to fix this by defining our partition to be: > > Default=0x7fff,ipoib,rate=1:ALL=full; > > But this no longer seems to work. > > In the opensm source code I see the following: > > > /* following v1 ver1.2 p901 */ > #define IB_PATH_RECORD_RATE_2_5_GBS 2 > #define IB_PATH_RECORD_RATE_10_GBS 3 > #define IB_PATH_RECORD_RATE_30_GBS 4 > #define IB_PATH_RECORD_RATE_5_GBS 5 > #define IB_PATH_RECORD_RATE_20_GBS 6 > #define IB_PATH_RECORD_RATE_40_GBS 7 > #define IB_PATH_RECORD_RATE_60_GBS 8 > #define IB_PATH_RECORD_RATE_80_GBS 9 > #define IB_PATH_RECORD_RATE_120_GBS 10 > > #define IB_MIN_RATE IB_PATH_RECORD_RATE_2_5_GBS > #define IB_MAX_RATE IB_PATH_RECORD_RATE_120_GBS > > Which forces the lowest possible rate to be 2 which doesn't work with > our test bed. By kludging IB_MIN_RATE to be set to 1, things seem to > be working but chances are supporting only rates >= 2 was done on > purpose. Is there a better workaround or solution to this, or a way of > continuing support for rate=1? What is the purpose of rate=1 in your setup? According to IBA spec the value '1' for rate is "reserved". Sasha From mashirle at us.ibm.com Thu Jan 24 00:14:51 2008 From: mashirle at us.ibm.com (Shirley Ma) Date: Thu, 24 Jan 2008 00:14:51 -0800 Subject: [ofa-general] [PATCH] IPoIB UD 4K MTU support In-Reply-To: <4e6a6b3c0801231354v24baa6d8q5034a1cdaf3c704f@mail.gmail.com> References: <1201025321.756.33.camel@localhost.localdomain> <1201102346.6925.215.camel@mtls03> <1201068498.756.59.camel@localhost.localdomain> <4e6a6b3c0801231354v24baa6d8q5034a1cdaf3c704f@mail.gmail.com> Message-ID: <1201162492.9739.21.camel@localhost.localdomain> Hello Eli, Below is the updated patch against Roland's for-2.6.25 tree. This patch allows IPoIB UD MTU up to 4K when HCA is capable. 
To simplify this patch, the IPoIB MTU size is limited to 4K - IB_GRH_BYTES - IPOIB_ENCAP_LEN - 4 bytes (padding to align the IP header), so that skb buffer allocation can be limited to one page. The node's IPoIB link MTU is the minimum of the admin-configured MTU (set through ifconfig) and the MTU of the default IPoIB broadcast group. When the Subnet Manager enables the default broadcast group during start up, the IPoIB link MTU of the subnet is the MTU of that default broadcast group. A node whose IB MTU is smaller than this value can't join the IPoIB subnet. A node whose IB MTU is greater than this value joins the IPoIB subnet and uses this value as its link MTU. If the Subnet Manager disables the default broadcast group during start up, the first node brought up in the subnet creates the default IPoIB broadcast group based on negotiation with the Subnet Manager. Signed-off-by: Shirley Ma --- drivers/infiniband/ulp/ipoib/ipoib.h | 11 ++++++++--- drivers/infiniband/ulp/ipoib/ipoib_ib.c | 21 ++++++++++++++------- drivers/infiniband/ulp/ipoib/ipoib_main.c | 19 ++++++++++++++----- drivers/infiniband/ulp/ipoib/ipoib_multicast.c | 4 +--- 4 files changed, 37 insertions(+), 18 deletions(-) diff --git a/drivers/infiniband/ulp/ipoib/ipoib.h b/drivers/infiniband/ulp/ipoib/ipoib.h index fe250c6..af11e2c 100644 --- a/drivers/infiniband/ulp/ipoib/ipoib.h +++ b/drivers/infiniband/ulp/ipoib/ipoib.h @@ -56,9 +56,6 @@ /* constants */ enum { - IPOIB_PACKET_SIZE = 2048, - IPOIB_BUF_SIZE = IPOIB_PACKET_SIZE + IB_GRH_BYTES, - IPOIB_ENCAP_LEN = 4, IPOIB_CM_MTU = 0x10000 - 0x10, /* padding to align header to 16 */ @@ -319,6 +316,7 @@ struct ipoib_dev_priv { struct dentry *mcg_dentry; struct dentry *path_dentry; #endif + unsigned int max_ib_mtu; }; struct ipoib_ah { @@ -424,6 +422,13 @@ int ipoib_mcast_stop_thread(struct net_device *dev, int flush); void ipoib_mcast_dev_down(struct net_device *dev); void ipoib_mcast_dev_flush(struct net_device *dev); +/* padding packet to fit one page 
size for 4K IB mtu */ +static inline int ipoib_ud_mtu(unsigned int ib_mtu) +{ + return (ib_mtu < 4096) ? (ib_mtu - IPOIB_ENCAP_LEN) : + (ib_mtu - IB_GRH_BYTES - IPOIB_ENCAP_LEN - 4); +} + #ifdef CONFIG_INFINIBAND_IPOIB_DEBUG struct ipoib_mcast_iter *ipoib_mcast_iter_init(struct net_device *dev); int ipoib_mcast_iter_next(struct ipoib_mcast_iter *iter); diff --git a/drivers/infiniband/ulp/ipoib/ipoib_ib.c b/drivers/infiniband/ulp/ipoib/ipoib_ib.c index 52bc2bd..662ec8e 100644 --- a/drivers/infiniband/ulp/ipoib/ipoib_ib.c +++ b/drivers/infiniband/ulp/ipoib/ipoib_ib.c @@ -87,6 +87,13 @@ void ipoib_free_ah(struct kref *kref) spin_unlock_irqrestore(&priv->lock, flags); } +/* padding packet to fit one page size for 4K IB mtu */ +static int ipoib_ud_buf_size(unsigned int max_ib_mtu) +{ + return (max_ib_mtu < 4096) ? (max_ib_mtu + IB_GRH_BYTES) : + (max_ib_mtu - 4); +} + static int ipoib_ib_post_receive(struct net_device *dev, int id) { struct ipoib_dev_priv *priv = netdev_priv(dev); @@ -96,7 +103,7 @@ static int ipoib_ib_post_receive(struct net_device *dev, int id) int ret; list.addr = priv->rx_ring[id].mapping; - list.length = IPOIB_BUF_SIZE; + list.length = ipoib_ud_buf_size(priv->max_ib_mtu); list.lkey = priv->mr->lkey; param.next = NULL; @@ -108,7 +115,7 @@ static int ipoib_ib_post_receive(struct net_device *dev, int id) if (unlikely(ret)) { ipoib_warn(priv, "receive failed for buf %d (%d)\n", id, ret); ib_dma_unmap_single(priv->ca, priv->rx_ring[id].mapping, - IPOIB_BUF_SIZE, DMA_FROM_DEVICE); + ipoib_ud_buf_size(priv->max_ib_mtu), DMA_FROM_DEVICE); dev_kfree_skb_any(priv->rx_ring[id].skb); priv->rx_ring[id].skb = NULL; } @@ -122,7 +129,7 @@ static int ipoib_alloc_rx_skb(struct net_device *dev, int id) struct sk_buff *skb; u64 addr; - skb = dev_alloc_skb(IPOIB_BUF_SIZE + 4); + skb = dev_alloc_skb(ipoib_ud_buf_size(priv->max_ib_mtu) + 4); if (!skb) return -ENOMEM; @@ -133,7 +140,7 @@ static int ipoib_alloc_rx_skb(struct net_device *dev, int id) */ skb_reserve(skb, 4); 
- addr = ib_dma_map_single(priv->ca, skb->data, IPOIB_BUF_SIZE, + addr = ib_dma_map_single(priv->ca, skb->data, ipoib_ud_buf_size(priv->max_ib_mtu), DMA_FROM_DEVICE); if (unlikely(ib_dma_mapping_error(priv->ca, addr))) { dev_kfree_skb_any(skb); @@ -190,7 +197,7 @@ static void ipoib_ib_handle_rx_wc(struct net_device *dev, struct ib_wc *wc) "(status=%d, wrid=%d vend_err %x)\n", wc->status, wr_id, wc->vendor_err); ib_dma_unmap_single(priv->ca, addr, - IPOIB_BUF_SIZE, DMA_FROM_DEVICE); + ipoib_ud_buf_size(priv->max_ib_mtu), DMA_FROM_DEVICE); dev_kfree_skb_any(skb); priv->rx_ring[wr_id].skb = NULL; return; @@ -215,7 +222,7 @@ static void ipoib_ib_handle_rx_wc(struct net_device *dev, struct ib_wc *wc) ipoib_dbg_data(priv, "received %d bytes, SLID 0x%04x\n", wc->byte_len, wc->slid); - ib_dma_unmap_single(priv->ca, addr, IPOIB_BUF_SIZE, DMA_FROM_DEVICE); + ib_dma_unmap_single(priv->ca, addr, ipoib_ud_buf_size(priv->max_ib_mtu), DMA_FROM_DEVICE); skb_put(skb, wc->byte_len); skb_pull(skb, IB_GRH_BYTES); @@ -632,7 +639,7 @@ int ipoib_ib_dev_stop(struct net_device *dev, int flush) continue; ib_dma_unmap_single(priv->ca, rx_req->mapping, - IPOIB_BUF_SIZE, + ipoib_ud_buf_size(priv->max_ib_mtu), DMA_FROM_DEVICE); dev_kfree_skb_any(rx_req->skb); rx_req->skb = NULL; diff --git a/drivers/infiniband/ulp/ipoib/ipoib_main.c b/drivers/infiniband/ulp/ipoib/ipoib_main.c index a082466..b7192ca 100644 --- a/drivers/infiniband/ulp/ipoib/ipoib_main.c +++ b/drivers/infiniband/ulp/ipoib/ipoib_main.c @@ -194,7 +194,7 @@ static int ipoib_change_mtu(struct net_device *dev, int new_mtu) return 0; } - if (new_mtu > IPOIB_PACKET_SIZE - IPOIB_ENCAP_LEN) + if (new_mtu > ipoib_ud_mtu(priv->max_ib_mtu)) return -EINVAL; priv->admin_mtu = new_mtu; @@ -968,10 +968,6 @@ static void ipoib_setup(struct net_device *dev) dev->tx_queue_len = ipoib_sendq_size * 2; dev->features = NETIF_F_VLAN_CHALLENGED | NETIF_F_LLTX; - /* MTU will be reset when mcast join happens */ - dev->mtu = IPOIB_PACKET_SIZE - 
IPOIB_ENCAP_LEN; - priv->mcast_mtu = priv->admin_mtu = dev->mtu; - memcpy(dev->broadcast, ipv4_bcast_addr, INFINIBAND_ALEN); netif_carrier_off(dev); @@ -1103,6 +1099,7 @@ static struct net_device *ipoib_add_port(const char *format, struct ib_device *hca, u8 port) { struct ipoib_dev_priv *priv; + struct ib_port_attr attr; int result = -ENOMEM; priv = ipoib_intf_alloc(format); @@ -1111,6 +1108,18 @@ static struct net_device *ipoib_add_port(const char *format, SET_NETDEV_DEV(priv->dev, hca->dma_device); + if (!ib_query_port(hca, port, &attr)) + priv->max_ib_mtu = ib_mtu_enum_to_int(attr.max_mtu); + else { + printk(KERN_WARNING "%s: ib_query_port %d failed\n", + hca->name, port); + goto device_init_failed; + } + + /* MTU will be reset when mcast join happens */ + priv->dev->mtu = ipoib_ud_mtu(priv->max_ib_mtu); + priv->mcast_mtu = priv->admin_mtu = priv->dev->mtu; + result = ib_query_pkey(hca, port, 0, &priv->pkey); if (result) { printk(KERN_WARNING "%s: ib_query_pkey port %d failed (ret = %d)\n", diff --git a/drivers/infiniband/ulp/ipoib/ipoib_multicast.c b/drivers/infiniband/ulp/ipoib/ipoib_multicast.c index 2628339..0661e87 100644 --- a/drivers/infiniband/ulp/ipoib/ipoib_multicast.c +++ b/drivers/infiniband/ulp/ipoib/ipoib_multicast.c @@ -567,9 +567,7 @@ void ipoib_mcast_join_task(struct work_struct *work) return; } - priv->mcast_mtu = ib_mtu_enum_to_int(priv->broadcast->mcmember.mtu) - - IPOIB_ENCAP_LEN; - + priv->mcast_mtu = ipoib_ud_mtu(ib_mtu_enum_to_int(priv->broadcast->mcmember.mtu)); if (!ipoib_cm_admin_enabled(dev)) dev->mtu = min(priv->mcast_mtu, priv->admin_mtu); From jimmott at austin.rr.com Thu Jan 24 10:29:27 2008 From: jimmott at austin.rr.com (Jim Mott) Date: Thu, 24 Jan 2008 12:29:27 -0600 Subject: [ofa-general] RE: [ewg] Not seeing any SDP performance changes In-Reply-To: References: <47445630.10000@dev.mellanox.co.il> <000001c8338c$206e80d0$614b8270$@rr.com> <4798A9F5.7030109@gmail.com> Message-ID: <000001c85eb7$0dc54f80$294fee80$@rr.com> Attached 
is the output of a run of the shell script I described. The source machine was the big 2-socket dual core, while the target (running netserver) was the AMD 6000+ system. I know it is a little dense, but the format of the data is pretty self-explanatory. In this run you see that for the 64 byte latency test: SDP 59628.17 TPS with 41.4 uS/transaction (Local 22.604 + Remote 18.796) IP 17703.55 TPS with 50.06 uS/transaction (Local 15.680 + Remote 34.380) SDP provided about 3.37x total transactions/second with 17% less CPU/transaction. On the Bandwidth side for 1,000,000 byte messages you get: IP-CM 3802.80 Mb/sec with 2.4 uS/KB (local 1.297 + remote 1.103) SDP 6841.07 Mb/sec with 2.6 uS/KB (local 1.346 + remote 1.217) Bzcopy 10573.41 Mb/sec with 1.5 uS/KB (local 0.733 + remote 0.796) Bzcopy provided about 2.8x the bandwidth of IPoIB-CM with 38% less CPU/KB. Bzcopy provided about 1.5x the bandwidth of SDP with 42% less CPU/KB. These are the sorts of numbers I keep getting. # date ==> Thu Jan 24 11:38:37 CST 2008 # Tests run for 60 seconds # Source IP is 193.168.10.198 remote IP is 193.168.10.125 # Local OS is 2.6.9-42.ELsmp x86_64 # sdp_zcopy_thresh is 0 ofa_1_3_dev-20080121-0855 User: libibverbs: git://git.openfabrics.org/ofed_1_3/libibverbs.git ofed_1_3 commit 018c44a44ff0344dfe7cf5f6598f81d81769164e libmthca: git://git.openfabrics.org/ofed_1_3/libmthca.git ofed_1_3 commit ec00c5b0887888bb62515961205a1e6e61bfea5f libmlx4: git://git.openfabrics.org/ofed_1_3/libmlx4.git ofed_1_3 commit e3b9b75bdb024cf4af9ecac96b5aa14488ea5c72 libehca: git://git.openfabrics.org/ofed_1_3/libehca.git ofed_1_3 commit f159085910b42d7118e536b0ad40b8fc2b8e5c27 libipathverbs: git://git.openfabrics.org/~ralphc/libipathverbs/.git master commit d47f13b02acab6129e719155f4f90d743229685a libcxgb3: git://git.openfabrics.org/ofed_1_3/libcxgb3.git ofed_1_3 commit 10893a7a45e6913483d90023bb90bf9cb2420384 libnes: git://git.openfabrics.org/ofed_1_3/libnes.git ofed_1_3 commit
27ebf415cb65918237c7c21fd8b28bc1dbf4fca6 libibcm: git://git.openfabrics.org/~shefty/libibcm.git master commit a45e43483ac29a26c1803f217ca21a07534494c4 librdmacm: git://git.openfabrics.org/ofed_1_3/librdmacm.git ofed_1_3 commit afe87c16f40fe4a3622f231672737950c0ebf9fa dapl: git://git.openfabrics.org/~ardavis/dapl.git master commit 6dcf1763c153c27c29ba76bac35be4f6935ddd96 libsdp: git://git.openfabrics.org/ofed_1_3/libsdp.git ofed_1_3 commit 47801f8f1e2168c34690b93edaccadc2ece936ef sdpnetstat: git://git.openfabrics.org/ofed_1_3/sdpnetstat.git ofed_1_3 commit 3341620a7259c4f7bdd4180864b98e260c3dc223 srptools: git://git.openfabrics.org/~ishai/srptools.git master commit 79ce808b9e181559c08495e1698c58bd49155ae4 perftest: git://git.openfabrics.org/~tziporet/perftest.git master commit 07343734cb4ae15ea5b6aaabcd3ad57e2a36806b qlvnictools: git://git.openfabrics.org/ofed_1_3/qlvnictools.git ofed_1_3 commit 41a148393a602810df80109e71086970a91c1d8d tvflash: git://git.openfabrics.org/~rdreier/tvflash.git master commit 39a63301f0344b6b3d45bc4b16d76be81f4377c0 mstflint: git://git.openfabrics.org/~orenk/mstflint.git master commit 3c711303e6474186920a24aadcc262f6fa6c9177 qperf: git://git.openfabrics.org/ofed_1_3/qperf.git ofed_1_3 commit 317ca959ec2bd978e6c51fd304ac546fdf5397d8 management: git://git.openfabrics.org/ofed_1_3/management.git ofed_1_3 commit 88b853b0f2a463a8335f7451c809d5ae5d1e14ee ibutils: git://git.openfabrics.org/~orenk/ibutils.git master commit 0225143c82416d02d6f00cf93bb0f38915557a12 imgen: git://git.openfabrics.org/~mst/imgen.git master commit a309109bebcc1ae94720c6bb8be5b0b974b93324 ofed_scripts: git://git.openfabrics.org/ofed_1_3/ofascripts.git ofed_1_3 commit efcd4c7ab3d3a4b0b8d75165d101825cb73093ff Kernel: Git: git://git.openfabrics.org/ofed_1_3/linux-2.6.git ofed_kernel commit dbdcdc7b6c699f8634c58f18e83c1d2caa75f4c2 CA 'mlx4_0' CA type: MT25418 Number of ports: 2 Firmware version: 2.3.914 Hardware version: 0 Node GUID: 0x0002c90300002078 System image GUID: 
0x0002c9030000207b Port 1: State: Active Physical state: LinkUp Rate: 20 Base lid: 1 LMC: 0 SM lid: 8 Capability mask: 0x02510868 Port GUID: 0x0002c90300002079 Port 2: State: Down Physical state: Polling Rate: 10 Base lid: 0 LMC: 0 SM lid: 0 Capability mask: 0x02510868 Port GUID: 0x0002c9030000207a # Format of latency record # LAT, mode, sdp, size, transaction/sec, local CPU, rmt CPU, local uS/tr, rmt uS/tr # Format of bandwith record # BW, mode, sdp, size, bandwith (Mb/sec), local CPU, rmt CPU, local uS/KB, rmt uS/KB ----- Start ----- LAT, IPoIB, No SDP, 64, 17703.55, 3.47, 30.43, 15.680, 34.380 LAT, IPoIB-CM, No SDP, 64, 16147.11, 3.48, 29.73, 17.220, 36.826 LAT, IPoIB-CM, SDP, 64, 59628.17, 16.85, 56.04, 22.604, 18.796 BW, IPoIB, No SDP, 64, 731.88, 15.49, 29.44, 13.873, 6.592 BW, IPoIB-CM, No SDP, 64, 832.68, 13.91, 21.11, 10.950, 4.155 BW, IPoIB-CM, SDP, 64, 149.59, 17.22, 54.04, 75.452, 59.186 BW, IPoIB-CM, SDP-Bzcopy(64), 64, 75.73, 16.62, 56.30, 143.795, 121.818 LAT, IPoIB, No SDP, 128, 17531.94, 2.60, 28.50, 11.855, 32.514 LAT, IPoIB-CM, No SDP, 128, 16148.53, 2.42, 29.23, 11.994, 36.205 LAT, IPoIB-CM, SDP, 128, 59596.38, 15.87, 55.70, 21.305, 18.694 BW, IPoIB, No SDP, 128, 1500.18, 13.61, 35.66, 5.944, 3.895 BW, IPoIB-CM, No SDP, 128, 1843.86, 13.44, 28.70, 4.776, 2.551 BW, IPoIB-CM, SDP, 128, 312.79, 17.96, 52.81, 37.625, 27.664 BW, IPoIB-CM, SDP-Bzcopy(128), 128, 137.33, 17.18, 55.45, 82.003, 66.151 LAT, IPoIB, No SDP, 512, 15950.67, 3.56, 28.34, 17.840, 35.533 LAT, IPoIB-CM, No SDP, 512, 15248.76, 3.53, 30.09, 18.534, 39.466 LAT, IPoIB-CM, SDP, 512, 58716.25, 15.46, 55.70, 21.061, 18.974 BW, IPoIB, No SDP, 512, 2649.21, 13.25, 40.64, 3.277, 2.513 BW, IPoIB-CM, No SDP, 512, 3672.23, 12.55, 28.03, 2.239, 1.251 BW, IPoIB-CM, SDP, 512, 1606.05, 17.83, 51.98, 7.277, 5.303 BW, IPoIB-CM, SDP-Bzcopy(512), 512, 531.12, 17.24, 55.45, 21.276, 17.104 LAT, IPoIB, No SDP, 1024, 15460.70, 3.41, 27.34, 17.668, 35.366 LAT, IPoIB-CM, No SDP, 1024, 14610.25, 3.52, 29.12, 
19.247, 39.856 LAT, IPoIB-CM, SDP, 1024, 52837.54, 14.88, 55.30, 22.526, 20.933 BW, IPoIB, No SDP, 1024, 3202.60, 12.85, 44.53, 2.630, 2.278 BW, IPoIB-CM, No SDP, 1024, 4323.63, 12.02, 30.21, 1.822, 1.145 BW, IPoIB-CM, SDP, 1024, 2670.95, 17.53, 47.30, 4.302, 2.901 BW, IPoIB-CM, SDP-Bzcopy(1024), 1024, 957.09, 16.81, 54.80, 11.511, 9.380 LAT, IPoIB, No SDP, 2048, 13176.44, 4.21, 31.66, 25.547, 48.053 LAT, IPoIB-CM, No SDP, 2048, 13654.42, 3.10, 27.27, 18.165, 39.936 LAT, IPoIB-CM, SDP, 2048, 41861.31, 15.71, 54.07, 30.031, 25.831 BW, IPoIB, No SDP, 2048, 3375.68, 13.23, 44.40, 2.568, 2.155 BW, IPoIB-CM, No SDP, 2048, 4790.28, 10.69, 32.74, 1.463, 1.120 BW, IPoIB-CM, SDP, 2048, 3066.77, 18.10, 44.72, 3.868, 2.389 BW, IPoIB-CM, SDP-Bzcopy(2048), 2048, 1456.18, 14.98, 54.94, 6.744, 6.181 LAT, IPoIB, No SDP, 4096, 11710.43, 4.33, 29.50, 29.592, 50.377 LAT, IPoIB-CM, No SDP, 4096, 11723.73, 3.79, 28.05, 25.878, 47.855 LAT, IPoIB-CM, SDP, 4096, 31206.42, 14.57, 54.01, 37.355, 34.616 BW, IPoIB, No SDP, 4096, 3564.14, 12.81, 45.62, 2.356, 2.097 BW, IPoIB-CM, No SDP, 4096, 4711.42, 9.81, 33.34, 1.365, 1.159 BW, IPoIB-CM, SDP, 4096, 4036.12, 17.60, 48.64, 2.857, 1.974 BW, IPoIB-CM, SDP-Bzcopy(4096), 4096, 2716.29, 14.92, 54.36, 3.600, 3.279 LAT, IPoIB, No SDP, 8192, 9327.63, 5.38, 29.40, 46.101, 63.036 LAT, IPoIB-CM, No SDP, 8192, 9585.17, 3.50, 26.79, 29.188, 55.891 LAT, IPoIB-CM, SDP, 8192, 21154.69, 14.24, 54.07, 53.849, 51.120 BW, IPoIB, No SDP, 8192, 3936.99, 12.67, 47.47, 2.109, 1.975 BW, IPoIB-CM, No SDP, 8192, 4716.53, 9.58, 33.50, 1.330, 1.164 BW, IPoIB-CM, SDP, 8192, 5688.79, 15.64, 51.24, 1.802, 1.476 BW, IPoIB-CM, SDP-Bzcopy(8192), 8192, 4582.61, 14.39, 53.10, 2.058, 1.898 LAT, IPoIB, No SDP, 16000, 7486.21, 7.13, 37.44, 76.157, 100.034 LAT, IPoIB-CM, No SDP, 16000, 7328.59, 2.69, 26.01, 29.416, 70.975 LAT, IPoIB-CM, SDP, 16000, 13124.20, 14.93, 53.32, 91.027, 81.259 BW, IPoIB, No SDP, 16000, 3621.30, 13.00, 46.12, 2.353, 2.087 BW, IPoIB-CM, No SDP, 16000, 
5281.13, 9.55, 35.28, 1.186, 1.095 BW, IPoIB-CM, SDP, 16000, 6851.09, 15.01, 52.01, 1.436, 1.244 BW, IPoIB-CM, SDP-Bzcopy(16000), 16000, 6633.81, 14.23, 52.49, 1.406, 1.296 LAT, IPoIB, No SDP, 32768, 4328.65, 7.04, 33.37, 130.028, 154.172 LAT, IPoIB-CM, No SDP, 32768, 4540.68, 3.63, 25.96, 63.984, 114.345 LAT, IPoIB-CM, SDP, 32768, 8449.69, 13.97, 53.27, 132.230, 126.086 BW, IPoIB, No SDP, 32768, 3932.05, 12.57, 47.40, 2.096, 1.975 BW, IPoIB-CM, No SDP, 32768, 4790.61, 9.34, 33.79, 1.278, 1.156 BW, IPoIB-CM, SDP, 32768, 7700.25, 15.04, 51.26, 1.280, 1.091 BW, IPoIB-CM, SDP-Bzcopy(32768), 32768, 7552.72, 13.40, 52.51, 1.163, 1.139 LAT, IPoIB, No SDP, 65536, 2821.14, 8.01, 34.68, 227.079, 245.879 LAT, IPoIB-CM, No SDP, 65536, 2494.38, 3.52, 30.69, 112.742, 246.042 LAT, IPoIB-CM, SDP, 65536, 4972.83, 15.00, 52.92, 241.327, 212.849 BW, IPoIB, No SDP, 65536, 3759.02, 12.83, 46.25, 2.236, 2.016 BW, IPoIB-CM, No SDP, 65536, 4824.50, 8.37, 34.40, 1.138, 1.168 BW, IPoIB-CM, SDP, 65536, 8388.52, 13.00, 51.51, 1.015, 1.006 BW, IPoIB-CM, SDP-Bzcopy(65536), 65536, 9070.82, 13.10, 52.44, 0.946, 0.947 LAT, IPoIB, No SDP, 131072, 1941.87, 10.31, 38.14, 424.560, 392.832 LAT, IPoIB-CM, No SDP, 131072, 1559.19, 5.05, 38.05, 259.204, 488.052 LAT, IPoIB-CM, SDP, 131072, 2897.77, 12.64, 49.54, 349.040, 341.949 BW, IPoIB, No SDP, 131072, 3761.74, 12.54, 46.65, 2.185, 2.032 BW, IPoIB-CM, No SDP, 131072, 3749.70, 7.69, 27.40, 1.345, 1.197 BW, IPoIB-CM, SDP, 131072, 6462.24, 14.25, 51.22, 1.445, 1.299 BW, IPoIB-CM, SDP-Bzcopy(131072), 131072, 9807.54, 13.94, 52.07, 0.931, 0.870 LAT, IPoIB, No SDP, 1000000, 203.55, 11.29, 42.55, 4437.527, 4180.463 LAT, IPoIB-CM, No SDP, 1000000, 164.28, 4.53, 35.14, 2205.818, 4278.296 LAT, IPoIB-CM, SDP, 1000000, 470.33, 13.48, 49.18, 2292.254, 2091.181 BW, IPoIB, No SDP, 1000000, 3795.27, 12.94, 46.85, 2.234, 2.022 BW, IPoIB-CM, No SDP, 1000000, 3802.80, 7.52, 25.61, 1.297, 1.103 BW, IPoIB-CM, SDP, 1000000, 6841.07, 14.05, 50.82, 1.346, 1.217 BW, IPoIB-CM, 
SDP-Bzcopy(1000000), 1000000, 10573.41, 11.82, 51.34, 0.733, 0.796 -----Original Message----- From: general-bounces at lists.openfabrics.org [mailto:general-bounces at lists.openfabrics.org] On Behalf Of Jim Mott Sent: Thursday, January 24, 2008 11:47 AM To: Scott Weitzenkamp (sweitzen); Weikuan Yu Cc: general at lists.openfabrics.org Subject: RE: [ofa-general] RE: [ewg] Not seeing any SDP performance changes inOFED 1.3 beta, and I get Oops when enabling sdp_zcopy_thresh I am really puzzled. The majority of my testing has been between Rhat4U4 and Rhat5. Using netperf command lines of the form: netperf -C -c -P 0 -t TCP_RR -H 193.168.10.143 -l 60 ---r 64 netperf -C -c -P 0 -t TCP_STREAM -H 193.168.10.143 -l 60 ---r 1000000 and a process of: - set sdp_zcopy_thresh=0, run bandwidth test - set sdp_zcopy_thresh=size, run bandwidth test I repeatedly get results that look like this: size SDP Bzcopy 65536 7375.00 7515.98 131072 7465.70 8105.58 1000000 6541.87 9948.76 These numbers are from high-end (2-socket, quad-core) machines. When you use smaller machines, like the AMD dual-core shown below, the differences between SDP with and without bzcopy are more striking. The process to start the netserver is: export LD_LIBRARY_PATH=/usr/local/ofed/lib64:/usr/local/ofed/lib export LD_PRELOAD=libsdp.so export LIBSDP_CONFIG_FILE=/etc/infiniband/libsdp.conf netserver The process to start the netperf is similar: export LD_LIBRARY_PATH=/usr/local/ofed/lib64:/usr/local/ofed/lib export LD_PRELOAD=libsdp.so export LIBSDP_CONFIG_FILE=/etc/infiniband/libsdp.conf netperf -C -c -P 0 -t TCP_STREAM -H 193.168.10.143 -l 60 ---r 1000000 You can unload and reload ib_sdp between tests, but I just echo 0 and echo size into sdp_zcopy_thresh on the sending side. Note that it is in a different place on Rhat4U4 and Rhat5. My libsdp.conf is the default that ships with OFED.
Stripping the comments (grep -v), it is just: log min-level 9 destination file libsdp.log use both server * *:* use both client * *:* Note that if you build locally: cd /tmp/openib_gen2/xxxx/ofa_1_3_dev_kernel make install the libsdp.conf file seems to get lost. You must restore it by hand. I have a shell script that automates this testing for a wide range of message sizes: 64 128 512 1024 2048 4096 8192 16000 32768 65536 131072 1000000 on multiple transports: IP both "echo datagram > /sys/class/net/ib0/mode" IP-CM both "echo connected > /sys/class/net/ib0/mode" SDP both Bzcopy TCP_STREAM Where both is TCP_RR and TCP_STREAM testing. The variance in SDP bandwidth results can be 10%-15% between runs. The difference between Bzcopy and non-Bzcopy is always very visible for 128K and up tests though. Could some other people please try to run some of these tests? If only help me know if I am crazy? Thanks, JIm Jim Mott Mellanox Technologies Ltd. mail: jim at mellanox.com Phone: 512-294-5481 -----Original Message----- From: Scott Weitzenkamp (sweitzen) [mailto:sweitzen at cisco.com] Sent: Thursday, January 24, 2008 11:17 AM To: Jim Mott; Weikuan Yu Cc: ewg at lists.openfabrics.org; general at lists.openfabrics.org Subject: RE: [ofa-general] RE: [ewg] Not seeing any SDP performance changes inOFED 1.3 beta, and I get Oops when enabling sdp_zcopy_thresh I've tested on RHEL4 and RHEL5, and see no sdp_zcopy_thresh improvement for any message size, as measured with netperf, for any Arbel or ConnectX HCA. Scott > -----Original Message----- > From: Jim Mott [mailto:jim at mellanox.com] > Sent: Thursday, January 24, 2008 7:57 AM > To: Weikuan Yu; Scott Weitzenkamp (sweitzen) > Cc: ewg at lists.openfabrics.org; general at lists.openfabrics.org > Subject: RE: [ofa-general] RE: [ewg] Not seeing any SDP > performance changes inOFED 1.3 beta, and I get Oops when > enabling sdp_zcopy_thresh > > Hi, > 64K is borderline for seeing bzcopy effect. 
Using an AMD > 6000+ (3 Ghz > dual core) in Asus M2A-VM motherboard with ConnectX running > 2.3 firmware > and OFED 1.3-rc3 stack running on 2.6.23.8 kernel.org kernel, > I ran the > test for 128K: > 5546 sdp_zcopy_thresh=0 (off) > 8709 sdp_zcopy_thresh=65536 > > For these tests, I just have LD_PRELOAD set in my environment. > > ======================= > > I see that TCP_MAXSEG is not being handled by libsdp and will > look into > it. > > > [root at dirk ~]# modprobe ib_sdp > [root at dirk ~]# netperf -v2 -4 -H 193.168.10.198 -l 30 -t TCP_STREAM -c > -C -- -m 128K > TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to > 193.168.10.198 > (193.168.10.198) port 0 AF_INET > netperf: get_tcp_info: getsockopt TCP_MAXSEG: errno 92 > Recv Send Send Utilization Service > Demand > Socket Socket Message Elapsed Send Recv Send > Recv > Size Size Size Time Throughput local remote local > remote > bytes bytes bytes secs. 10^6bits/s % S % S us/KB > us/KB > > 87380 16384 131072 30.01 5545.69 51.47 14.43 1.521 > 1.706 > > Alignment Offset Bytes Bytes Sends Bytes > Recvs > Local Remote Local Remote Xfered Per Per > Send Recv Send Recv Send (avg) Recv (avg) > 8 8 0 0 2.08e+10 131072.00 158690 33135.60 > 627718 > > Maximum > Segment > Size (bytes) > -1 > [root at dirk ~]# echo 65536 > >/sys/module/ib_sdp/parameters/sdp_zcopy_thresh > [root at dirk ~]# netperf -v2 -4 -H 193.168.10.198 -l 30 -t TCP_STREAM -c > -C -- -m 128K > TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to > 193.168.10.198 > (193.168.10.198) port 0 AF_INET > netperf: get_tcp_info: getsockopt TCP_MAXSEG: errno 92 > Recv Send Send Utilization Service > Demand > Socket Socket Message Elapsed Send Recv Send > Recv > Size Size Size Time Throughput local remote local > remote > bytes bytes bytes secs. 
10^6bits/s % S % S us/KB > us/KB > > 87380 16384 131072 30.01 8708.58 50.63 14.55 0.953 > 1.095 > > Alignment Offset Bytes Bytes Sends Bytes > Recvs > Local Remote Local Remote Xfered Per Per > Send Recv Send Recv Send (avg) Recv (avg) > 8 8 0 0 3.267e+10 131072.00 249228 26348.30 > 1239807 > > Maximum > Segment > Size (bytes) > -1 > > Thanks, > JIm > > Jim Mott > Mellanox Technologies Ltd. > mail: jim at mellanox.com > Phone: 512-294-5481 > > > -----Original Message----- > From: Weikuan Yu [mailto:weikuan.yu at gmail.com] > Sent: Thursday, January 24, 2008 9:09 AM > To: Scott Weitzenkamp (sweitzen) > Cc: Jim Mott; ewg at lists.openfabrics.org; general at lists.openfabrics.org > Subject: Re: [ofa-general] RE: [ewg] Not seeing any SDP performance > changes inOFED 1.3 beta, and I get Oops when enabling sdp_zcopy_thresh > > Hi, Scott, > > I have been running SDP tests across two woodcrest nodes with 4x DDR > cards using OFED-1.2.5.4. The card/firmware info is below. > > CA 'mthca0' > CA type: MT25208 > Number of ports: 2 > Firmware version: 5.1.400 > Hardware version: a0 > Node GUID: 0x0002c90200228e0c > System image GUID: 0x0002c90200228e0f > > I could not get a bandwidth more than 5Gbps like you have shown here. > Wonder if I need to upgrade to the latest software or firmware? Any > suggestions? > > Thanks, > --Weikuan > > > TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to > 192.168.225.77 > (192.168 > .225.77) port 0 AF_INET > Recv Send Send Utilization > Service > Demand > Socket Socket Message Elapsed Send Recv Send > Recv > Size Size Size Time Throughput local remote local > remote > bytes bytes bytes secs. 10^6bits/s % S % S us/KB > us/KB > > 131072 131072 131072 10.00 4918.95 21.29 24.99 1.418 > 1.665 > > > Scott Weitzenkamp (sweitzen) wrote: > > Jim, > > > > I am trying OFED-1.3-20071231-0600 and RHEL4 x86_64 on a dual CPU > > (single core each CPU) Xeon system. 
I do not see any performance > > improvement (either throughput or CPU utilization) using > netperf when > I > > set /sys/module/ib_sdp/sdp_zcopy_thresh to 16384. Can you elaborate > on > > your HCA type, and performance improvement you see? > > > > Here's an example netperf command line when using a Cheetah DDR HCA > and > > 1.2.917 firmware (I have also tried ConnectX and 2.3.000 firmware > too): > > > > [releng at svbu-qa1850-2 ~]$ LD_PRELOAD=libsdp.so netperf241 -v2 -4 -H > > 192.168.1.201 -l 30 -t TCP_STREAM -c -C -- -m 65536 > > TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to > 192.168.1.201 > > (192.168.1.201) port 0 AF_INET : histogram : demo > > > > Recv Send Send Utilization > Service > > Demand > > Socket Socket Message Elapsed Send Recv Send > > Recv > > Size Size Size Time Throughput local remote local > > remote > > bytes bytes bytes secs. 10^6bits/s % S % S us/KB > > us/KB > > > > 87380 16384 65536 30.01 7267.70 55.06 61.27 1.241 > > 1.381 > > > > Alignment Offset Bytes Bytes Sends Bytes > > Recvs > > Local Remote Local Remote Xfered Per Per > > Send Recv Send Recv Send (avg) > Recv (avg) > > 8 8 0 0 2.726e+10 65536.00 415942 > 48106.01 > > 566648 > > > _______________________________________________ general mailing list general at lists.openfabrics.org http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general From eli at dev.mellanox.co.il Thu Jan 24 10:30:48 2008 From: eli at dev.mellanox.co.il (Eli Cohen) Date: Thu, 24 Jan 2008 20:30:48 +0200 Subject: [ofa-general] Bogus Receive Completions In-Reply-To: <4798CD67.8040503@dls.net> References: <475607AA.301@dls.net> <4e6a6b3c0801231412t7970ce83u9a2bb9aa11a1b2cd@mail.gmail.com> <4797CDC3.6090707@dls.net> <4797DC47.6060104@dls.net> <4797DD87.8020808@dls.net> <1201184289.6755.7.camel@mtls03> <4798B643.9060508@dls.net> <1201195120.6755.75.camel@mtls03> <4798CD67.8040503@dls.net> Message-ID: 
<4e6a6b3c0801241030n182ef2fbu5485be1a60a3074d@mail.gmail.com> On 1/24/08, Roman Kononov wrote: > > On 2008-01-24 11:18 Eli Cohen said the following: > >>> Roman, > >>> can you tell whether from your perspective there has been an > improvement > >>> in stability with the latest patches? > >> Since I use 'nice' queue sizes, the very last patches made no > difference. > >> > > I meant to ask whether there was improvement between that time before > > the patches were posted to now. > > > > Difficult to say. Before the patches, I got a few errors per day. After > the patches, I get different errors with same frequency. > > Question: is there an easy way to inject transmission errors into the > fabric? I am not sure what exactly you mean by this. From kononov at dls.net Thu Jan 24 10:55:39 2008 From: kononov at dls.net (Roman Kononov) Date: Thu, 24 Jan 2008 12:55:39 -0600 Subject: [ofa-general] Bogus Receive Completions In-Reply-To: <4e6a6b3c0801241030n182ef2fbu5485be1a60a3074d@mail.gmail.com> References: <475607AA.301@dls.net> <4e6a6b3c0801231412t7970ce83u9a2bb9aa11a1b2cd@mail.gmail.com> <4797CDC3.6090707@dls.net> <4797DC47.6060104@dls.net> <4797DD87.8020808@dls.net> <1201184289.6755.7.camel@mtls03> <4798B643.9060508@dls.net> <1201195120.6755.75.camel@mtls03> <4798CD67.8040503@dls.net> <4e6a6b3c0801241030n182ef2fbu5485be1a60a3074d@mail.gmail.com> Message-ID: <4798DF2B.8060306@dls.net> On 2008-01-24 12:30 Eli Cohen said the following: > On 1/24/08, *Roman Kononov* > > wrote: > > Question: is there an easy way to inject transmission errors into the > fabric? > > I am not sure what exactly you mean by this. I wanted to artificially corrupt I/B packets or their CRC, to cause transmission errors. Under such conditions the packets would be re-sent, intensifying error correction activity, possibly resulting in more frequent occurrence of software or firmware bugs.
Roman From weiny2 at llnl.gov Thu Jan 24 10:59:19 2008 From: weiny2 at llnl.gov (Ira Weiny) Date: Thu, 24 Jan 2008 10:59:19 -0800 Subject: [ofa-general] [PATCH] opensm/scripts/redhat-opensm.init: fix starting opensm when using daemon mode Message-ID: <20080124105919.5b491029.weiny2@llnl.gov> When daemon mode was specified this script was reporting "failure" on start when it actually worked. This fixes it by just waiting for a valid pid of opensm. Ira >From 8f9b550c055b0fabf4c5fd6652060fae5dd7216b Mon Sep 17 00:00:00 2001 From: Ira K. Weiny Date: Thu, 24 Jan 2008 10:51:23 -0800 Subject: [PATCH] opensm/scripts/redhat-opensm.init: fix starting opensm when using daemon mode Signed-off-by: Ira K. Weiny --- opensm/scripts/redhat-opensm.init | 14 ++++++++++++-- 1 files changed, 12 insertions(+), 2 deletions(-) diff --git a/opensm/scripts/redhat-opensm.init b/opensm/scripts/redhat-opensm.init index 56dbb7c..3e00403 100755 --- a/opensm/scripts/redhat-opensm.init +++ b/opensm/scripts/redhat-opensm.init @@ -238,9 +238,19 @@ start() echo -n "Starting IB Subnet Manager" echo $PORT_FLAG | $prog $START_FLAGS > /dev/null 2>&1 & - OSM_PID=$! + cnt=0; alive=0 + while [ $cnt -lt 6 -a $alive -ne 1 ]; do + echo -n "."; + sleep 1 + alive=0 + OSM_PID=`pidof $prog` + if [ "$OSM_PID" != "" ]; then + alive=1 + fi + let cnt++; + done + echo $OSM_PID > $PID_FILE - sleep 1 checkpid $OSM_PID RC=$? [ $RC -eq 0 ] && echo_success || echo_failure -- 1.5.1 -------------- next part -------------- A non-text attachment was scrubbed... 
Name: 0001-opensm-scripts-redhat-opensm.init-fix-starting-open.patch Type: application/octet-stream Size: 1177 bytes Desc: not available URL: From mhagen at iol.unh.edu Thu Jan 24 11:31:44 2008 From: mhagen at iol.unh.edu (Mikkel Hagen) Date: Thu, 24 Jan 2008 14:31:44 -0500 Subject: [ofa-general] Upcoming OFA-IWG Interop Event In-Reply-To: <46E5841B.9020507@iol.unh.edu> References: <46E5841B.9020507@iol.unh.edu> Message-ID: <4798E7A0.1050202@iol.unh.edu> The University of New Hampshire InterOperability Lab and Open Fabrics Alliance Interoperability Working Group would like to extend an invitation to all members to attend the upcoming Interoperability Event hosted at UNH-IOL facility. We will be performing the interoperability test plan developed within the OFA-IWG and granting logos to all qualified participants shortly after the event.
All required information can be found at the following link regarding logistics, registration, test plan, etc: http://www.iol.unh.edu/services/testing/ofa/events/index.php Please download the Quick Start Guide (QSG) for all information and then feel free to forward any further questions to myself (mhagen at iol.unh.edu) or interop-wg at list.openfabrics.org. Thanks! Mikkel Hagen Project Assistant - Fibre Channel/SAS/SATA Consortiums Research and Development Engineer - iWARP Consortium FC/SAS/SATA:1-603-862-0701 iWARP:1-603-862-5083 Fax:1-603-862-4181 UNH-IOL 121 Technology Drive, Suite 2 Durham, NH 03824 From transter at gmail.com Thu Jan 24 11:55:28 2008 From: transter at gmail.com (lbt) Date: Thu, 24 Jan 2008 11:55:28 -0800 Subject: [ofa-general] QP types supported with iWarp? Message-ID: Hello, I noticed that the Chelsio and NetEffect iWarp drivers in OFED 1.3 only seem to support RC QP's (i.e. IB_QPT_RC type). Is there any plan to support other QP types? Thanks! Lan From swise at opengridcomputing.com Thu Jan 24 11:56:01 2008 From: swise at opengridcomputing.com (Steve Wise) Date: Thu, 24 Jan 2008 13:56:01 -0600 Subject: [ofa-general] QP types supported with iWarp? In-Reply-To: References: Message-ID: <4798ED51.8070206@opengridcomputing.com> lbt wrote: > Hello, > > I noticed that the Chelsio and NetEffect iWarp drivers in OFED 1.3 only > seem to support RC QP's (i.e. IB_QPT_RC type). Is there any plan to > support other QP types? > > Thanks! > Lan > Currently the iWARP protocols and RDMA Consortium verbs only device point to point / RC... What type are you interested in? From swise at opengridcomputing.com Thu Jan 24 12:03:07 2008 From: swise at opengridcomputing.com (Steve Wise) Date: Thu, 24 Jan 2008 14:03:07 -0600 Subject: [ofa-general] QP types supported with iWarp?
In-Reply-To: <4798ED51.8070206@opengridcomputing.com> References: <4798ED51.8070206@opengridcomputing.com> Message-ID: <4798EEFB.7020807@opengridcomputing.com> Steve Wise wrote: > lbt wrote: >> Hello, >> >> I noticed that the Chelsio and NetEffect iWarp drivers in OFED 1.3 >> only seem to support RC QP's (i.e. IB_QPT_RC type). Is there any plan >> to support other QP types? >> >> Thanks! >> Lan >> > > Currently the iWARP protocols and RDMA Consortium verbs only device ^^^^^^^^^^^ that's "only define" > point to point / RC... > > What type are you interested in? From xma at us.ibm.com Thu Jan 24 12:06:30 2008 From: xma at us.ibm.com (Shirley Ma) Date: Thu, 24 Jan 2008 12:06:30 -0800 Subject: [ofa-general] [PATCH 2/16] ib/ipoib: Add s/g support for IPOIB In-Reply-To: <1200501454.13546.71.camel@mtls03> Message-ID: Hello Eli, Can you make the IPoIB-CM rx S/G functions more generic so they can be reused here for IPoIB UD tx S/G? Thanks Shirley Eli Cohen wrote on 01/16/08 08:37 AM (To: Roland Dreier; cc: openfabrics; Sent by: general-bounces at lists.openfabrics.org; Subject: [ofa-general] [PATCH 2/16] ib/ipoib: Add s/g support for IPOIB): Add s/g support for IPOIB This patch acts as a preparation for using checksum offload for IB devices capable of inserting/verifying checksum in IP packets. The patch does not actually turn on NETIF_F_SG but rather defers the role to the patches adding checksum offload capabilities. Support is added only for datagram mode since Mellanox HW does not support checksum offload on connected QPs. Signed-off-by: Michael S.
Tsirkin Signed-off-by: Eli Cohen --- drivers/infiniband/ulp/ipoib/ipoib.h | 56 +++++++++++++++++++++++++++- drivers/infiniband/ulp/ipoib/ipoib_cm.c | 10 ++-- drivers/infiniband/ulp/ipoib/ipoib_ib.c | 41 ++++++++++---------- drivers/infiniband/ulp/ipoib/ipoib_verbs.c | 10 ++-- 4 files changed, 85 insertions(+), 32 deletions(-) diff --git a/drivers/infiniband/ulp/ipoib/ipoib.h b/drivers/infiniband/ulp/ipoib/ipoib.h index eb7edab..6729c14 100644 --- a/drivers/infiniband/ulp/ipoib/ipoib.h +++ b/drivers/infiniband/ulp/ipoib/ipoib.h @@ -142,9 +142,61 @@ struct ipoib_rx_buf { struct ipoib_tx_buf { struct sk_buff *skb; - u64 mapping; + u64 mapping[MAX_SKB_FRAGS + 1]; }; +static inline int ipoib_dma_map_tx(struct ib_device *ca, + struct ipoib_tx_buf *tx_req) +{ + struct sk_buff *skb = tx_req->skb; + u64 *mapping = tx_req->mapping; + int frags; + int i; + + mapping[0] = ib_dma_map_single(ca, skb->data, skb_headlen(skb), + DMA_TO_DEVICE); + if (unlikely(ib_dma_mapping_error(ca, mapping[0]))) + return -EIO; + + frags = skb_shinfo(skb)->nr_frags; + for (i = 0; i < frags; ++i) { + skb_frag_t *frag = &skb_shinfo(skb)->frags[i]; + mapping[i + 1] = ib_dma_map_page(ca, frag->page, + frag->page_offset, frag->size, + DMA_TO_DEVICE); + if (unlikely(ib_dma_mapping_error(ca, mapping[i + 1]))) + goto partial_error; + } + return 0; + +partial_error: + ib_dma_unmap_single(ca, mapping[0], skb_headlen(skb), DMA_TO_DEVICE); + + for (; i > 0; --i) { + skb_frag_t *frag = &skb_shinfo(skb)->frags[i - 1]; + ib_dma_unmap_page(ca, mapping[i], frag->size, DMA_TO_DEVICE); + } + return -EIO; +} + +static inline void ipoib_dma_unmap_tx(struct ib_device *ca, + struct ipoib_tx_buf *tx_req) +{ + struct sk_buff *skb = tx_req->skb; + u64 *mapping = tx_req->mapping; + int frags; + int i; + + ib_dma_unmap_single(ca, mapping[0], skb_headlen(skb), DMA_TO_DEVICE); + + frags = skb_shinfo(skb)->nr_frags; + for (i = 0; i < frags; ++i) { + skb_frag_t *frag = &skb_shinfo(skb)->frags[i]; + ib_dma_unmap_page(ca, 
mapping[i + 1], frag->size, + DMA_TO_DEVICE); + } +} + struct ib_cm_id; struct ipoib_cm_data { @@ -290,7 +342,7 @@ struct ipoib_dev_priv { struct ipoib_tx_buf *tx_ring; unsigned tx_head; unsigned tx_tail; - struct ib_sge tx_sge; + struct ib_sge tx_sge[MAX_SKB_FRAGS + 1]; struct ib_send_wr tx_wr; unsigned tx_outstanding; diff --git a/drivers/infiniband/ulp/ipoib/ipoib_cm.c b/drivers/infiniband/ulp/ipoib/ipoib_cm.c index 059cf92..8485fde 100644 --- a/drivers/infiniband/ulp/ipoib/ipoib_cm.c +++ b/drivers/infiniband/ulp/ipoib/ipoib_cm.c @@ -495,8 +495,8 @@ static inline int post_send(struct ipoib_dev_priv *priv, { struct ib_send_wr *bad_wr; - priv->tx_sge.addr = addr; - priv->tx_sge.length = len; + priv->tx_sge[0].addr = addr; + priv->tx_sge[0].length = len; priv->tx_wr.wr_id = wr_id | IPOIB_OP_CM; @@ -537,7 +537,7 @@ void ipoib_cm_send(struct net_device *dev, struct sk_buff *skb, struct ipoib_cm_ return; } - tx_req->mapping = addr; + tx_req->mapping[0] = addr; if (unlikely(post_send(priv, tx, tx->tx_head & (ipoib_sendq_size - 1), addr, skb->len))) { @@ -576,7 +576,7 @@ void ipoib_cm_handle_tx_wc(struct net_device *dev, struct ib_wc *wc) tx_req = &tx->tx_ring[wr_id]; - ib_dma_unmap_single(priv->ca, tx_req->mapping, tx_req->skb->len, DMA_TO_DEVICE); + ib_dma_unmap_single(priv->ca, tx_req->mapping[0], tx_req->skb->len, DMA_TO_DEVICE); /* FIXME: is this right? Shouldn't we only increment on success? 
*/ ++dev->stats.tx_packets; @@ -954,7 +954,7 @@ timeout: while ((int) p->tx_tail - (int) p->tx_head < 0) { tx_req = &p->tx_ring[p->tx_tail & (ipoib_sendq_size - 1)]; - ib_dma_unmap_single(priv->ca, tx_req->mapping, tx_req->skb->len, + ib_dma_unmap_single(priv->ca, tx_req->mapping[0], tx_req->skb->len, DMA_TO_DEVICE); dev_kfree_skb_any(tx_req->skb); ++p->tx_tail; diff --git a/drivers/infiniband/ulp/ipoib/ipoib_ib.c b/drivers/infiniband/ulp/ipoib/ipoib_ib.c index 5063dd5..680c27f 100644 --- a/drivers/infiniband/ulp/ipoib/ipoib_ib.c +++ b/drivers/infiniband/ulp/ipoib/ipoib_ib.c @@ -257,8 +257,7 @@ static void ipoib_ib_handle_tx_wc(struct net_device *dev, struct ib_wc *wc) tx_req = &priv->tx_ring[wr_id]; - ib_dma_unmap_single(priv->ca, tx_req->mapping, - tx_req->skb->len, DMA_TO_DEVICE); + ipoib_dma_unmap_tx(priv->ca, tx_req); ++dev->stats.tx_packets; dev->stats.tx_bytes += tx_req->skb->len; @@ -341,16 +340,23 @@ void ipoib_ib_completion(struct ib_cq *cq, void *dev_ptr) static inline int post_send(struct ipoib_dev_priv *priv, unsigned int wr_id, struct ib_ah *address, u32 qpn, - u64 addr, int len) + u64 *mapping, int headlen, + skb_frag_t *frags, + int nr_frags) { struct ib_send_wr *bad_wr; + int i; - priv->tx_sge.addr = addr; - priv->tx_sge.length = len; - - priv->tx_wr.wr_id = wr_id; - priv->tx_wr.wr.ud.remote_qpn = qpn; - priv->tx_wr.wr.ud.ah = address; + priv->tx_sge[0].addr = mapping[0]; + priv->tx_sge[0].length = headlen; + for (i = 0; i < nr_frags; ++i) { + priv->tx_sge[i + 1].addr = mapping[i + 1]; + priv->tx_sge[i + 1].length = frags[i].size; + } + priv->tx_wr.num_sge = nr_frags + 1; + priv->tx_wr.wr_id = wr_id; + priv->tx_wr.wr.ud.remote_qpn = qpn; + priv->tx_wr.wr.ud.ah = address; return ib_post_send(priv->qp, &priv->tx_wr, &bad_wr); } @@ -360,7 +366,6 @@ void ipoib_send(struct net_device *dev, struct sk_buff *skb, { struct ipoib_dev_priv *priv = netdev_priv(dev); struct ipoib_tx_buf *tx_req; - u64 addr; if (unlikely(skb->len > priv->mcast_mtu + 
IPOIB_ENCAP_LEN)) { ipoib_warn(priv, "packet len %d (> %d) too long to send, dropping\n", @@ -383,20 +388,19 @@ void ipoib_send(struct net_device *dev, struct sk_buff *skb, */ tx_req = &priv->tx_ring[priv->tx_head & (ipoib_sendq_size - 1)]; tx_req->skb = skb; - addr = ib_dma_map_single(priv->ca, skb->data, skb->len, - DMA_TO_DEVICE); - if (unlikely(ib_dma_mapping_error(priv->ca, addr))) { + if (unlikely(ipoib_dma_map_tx(priv->ca, tx_req))) { ++dev->stats.tx_errors; dev_kfree_skb_any(skb); return; } - tx_req->mapping = addr; if (unlikely(post_send(priv, priv->tx_head & (ipoib_sendq_size - 1), - address->ah, qpn, addr, skb->len))) { + address->ah, qpn, + tx_req->mapping, skb_headlen(skb), + skb_shinfo(skb)->frags, skb_shinfo(skb)->nr_frags))) { ipoib_warn(priv, "post_send failed\n"); ++dev->stats.tx_errors; - ib_dma_unmap_single(priv->ca, addr, skb->len, DMA_TO_DEVICE); + ipoib_dma_unmap_tx(priv->ca, tx_req); dev_kfree_skb_any(skb); } else { dev->trans_start = jiffies; @@ -615,10 +619,7 @@ int ipoib_ib_dev_stop(struct net_device *dev, int flush) while ((int) priv->tx_tail - (int) priv->tx_head < 0) { tx_req = &priv->tx_ring[priv->tx_tail & (ipoib_sendq_size - 1)]; - ib_dma_unmap_single(priv->ca, - tx_req->mapping, - tx_req->skb->len, - DMA_TO_DEVICE); + ipoib_dma_unmap_tx(priv->ca, tx_req); dev_kfree_skb_any(tx_req->skb); ++priv->tx_tail; --priv->tx_outstanding; diff --git a/drivers/infiniband/ulp/ipoib/ipoib_verbs.c b/drivers/infiniband/ulp/ipoib/ipoib_verbs.c index 3c6e45d..a6f5f65 100644 --- a/drivers/infiniband/ulp/ipoib/ipoib_verbs.c +++ b/drivers/infiniband/ulp/ipoib/ipoib_verbs.c @@ -149,14 +149,14 @@ int ipoib_transport_dev_init(struct net_device *dev, struct ib_device *ca) .cap = { .max_send_wr = ipoib_sendq_size, .max_recv_wr = ipoib_recvq_size, - .max_send_sge = 1, + .max_send_sge = dev->features & NETIF_F_SG ? 
MAX_SKB_FRAGS + 1 : 1, .max_recv_sge = 1 }, .sq_sig_type = IB_SIGNAL_ALL_WR, .qp_type = IB_QPT_UD }; - int ret, size; + int i, ret, size; priv->pd = ib_alloc_pd(priv->ca); if (IS_ERR(priv->pd)) { @@ -197,11 +197,11 @@ int ipoib_transport_dev_init(struct net_device *dev, struct ib_device *ca) priv->dev->dev_addr[2] = (priv->qp->qp_num >> 8) & 0xff; priv->dev->dev_addr[3] = (priv->qp->qp_num ) & 0xff; - priv->tx_sge.lkey = priv->mr->lkey; + for (i = 0; i < MAX_SKB_FRAGS + 1; ++i) + priv->tx_sge[i].lkey = priv->mr->lkey; priv->tx_wr.opcode = IB_WR_SEND; - priv->tx_wr.sg_list = &priv->tx_sge; - priv->tx_wr.num_sge = 1; + priv->tx_wr.sg_list = priv->tx_sge; priv->tx_wr.send_flags = IB_SEND_SIGNALED; return 0; -- 1.5.3.8 _______________________________________________ general mailing list general at lists.openfabrics.org http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general From meier3 at llnl.gov Thu Jan 24 12:28:46 2008 From: meier3 at llnl.gov (Timothy A. Meier) Date: Thu, 24 Jan 2008 12:28:46 -0800 Subject: [ofa-general] [PATCH] opensm: osm_subnet.c log and print the path of the cached option file Message-ID: <4798F4FE.5010101@llnl.gov> Sasha, A trivial patch. During development, we (at LLNL) sometimes use different options/configurations. This provides a way to know which one is active.
From d60408b4dc1cb0c917e2eb33d6a3f62ac6bb9b5c Mon Sep 17 00:00:00 2001 From: Tim Meier Date: Thu, 24 Jan 2008 11:51:08 -0800 Subject: [PATCH] opensm: osm_subnet.c log and print the path of the cached option file Logged (syslog and print) the path to the option file that is cached and parsed. This is helpful when something other than the default path is used. Signed-off-by: Tim Meier --- opensm/opensm/osm_subnet.c | 4 ++++ 1 files changed, 4 insertions(+), 0 deletions(-) diff --git a/opensm/opensm/osm_subnet.c b/opensm/opensm/osm_subnet.c index 7b14e0c..bae087d 100644 --- a/opensm/opensm/osm_subnet.c +++ b/opensm/opensm/osm_subnet.c @@ -1175,6 +1175,10 @@ ib_api_status_t osm_subn_parse_conf_file(IN osm_subn_opt_t * const p_opts) file_name, strerror(errno)); return IB_ERROR; } + sprintf(line, " Reading Cached Option File: %s\n", file_name); + printf(line); + cl_log_event("OpenSM", CL_LOG_INFO, line, NULL, 0); + while (fgets(line, 1023, opts_file) != NULL) { /* get the first token */ -- 1.5.1 -- Timothy A. Meier Computer Scientist ICCD/High Performance Computing 925.422.3341 meier3 at llnl.gov -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: 0001-opensm-osm_subnet.c-log-and-print-the-path-of-the.patch URL: From worleys at gmail.com Thu Jan 24 12:39:03 2008 From: worleys at gmail.com (Chris Worley) Date: Thu, 24 Jan 2008 13:39:03 -0700 Subject: [ofa-general] How to inspect routing? Message-ID: I'm working w/ a subnet manager that gets confused after a few hours/day. Is there any way to dump routing information from a node, and verify that the routes that will be used by a node don't lead to the proper destination? From swise at opengridcomputing.com Thu Jan 24 12:43:05 2008 From: swise at opengridcomputing.com (Steve Wise) Date: Thu, 24 Jan 2008 14:43:05 -0600 Subject: [ofa-general] [GIT PULL ofed-1.3] - Tag the ofed cxgb3 driver version. 
Message-ID: <4798F859.60003@opengridcomputing.com> Vlad, Please pull the following patch from: git://git.openfabrics.org/~swise/ofed-1.3 ofed_kernel This patch must have gotten lost from 1.2.5 -> 1.3. Thanks, Steve. ----- Tag -ofed for cxgb3 driver version. This keeps kernel.org vs ofed driver versions unique. Signed-off-by: Steve Wise --- .../fixes/cxgb3_00300_add_ofed_version_tag.patch | 13 +++++++++++++ 1 files changed, 13 insertions(+), 0 deletions(-) diff --git a/kernel_patches/fixes/cxgb3_00300_add_ofed_version_tag.patch b/kernel_patches/fixes/cxgb3_00300_add_ofed_version_tag.patch new file mode 100644 index 0000000..ffee40a --- /dev/null +++ b/kernel_patches/fixes/cxgb3_00300_add_ofed_version_tag.patch @@ -0,0 +1,13 @@ +diff --git a/drivers/net/cxgb3/version.h b/drivers/net/cxgb3/version.h +index ef1c633..ef2405a 100644 +--- a/drivers/net/cxgb3/version.h ++++ b/drivers/net/cxgb3/version.h +@@ -35,7 +35,7 @@ + #define DRV_DESC "Chelsio T3 Network Driver" + #define DRV_NAME "cxgb3" + /* Driver version */ +-#define DRV_VERSION "1.0-ko" ++#define DRV_VERSION "1.0-ofed" + + /* Firmware version */ + #define FW_VERSION_MAJOR 4 From prescott at hpc.ufl.edu Thu Jan 24 12:56:30 2008 From: prescott at hpc.ufl.edu (Craig Prescott) Date: Thu, 24 Jan 2008 15:56:30 -0500 Subject: [ofa-general] SDP and iWARP In-Reply-To: <4798CB4C.7070706@opengridcomputing.com> References: <4783A5B0.6040603@hpc.ufl.edu><4783B3F5.20600@opengridcomputing.com><4783BDD5.7000702@hpc.ufl.edu><4783C326.3070306@opengridcomputing.com><478634A5.3080204@hpc.ufl.edu><47863794.9080709@opengridcomputing.com><47865A4A.4070603@hpc.ufl.edu><47865E5B.4030607@opengridcomputing.com><4787936E.5010603@hpc.ufl.edu> <4787977E.509@opengridcomputing.com> <479765AC.1040600@hpc.ufl.edu> <8A71B368A89016469F72CD08050AD33401FCDA2F@maui.asicdesigners.com> <47977262.1060906@hpc.ufl.edu> <4798CB4C.7070706@opengridcomputing.com> Message-ID: <4798FB7E.8050200@hpc.ufl.edu> Hi Steve; The SDP socket gets an associated mr when 
sdp_init_qp() calls ib_get_dma_mr(). It looks to me like this drills down into the provider layer, which will ultimately end up calling build_phys_page_list() from iwch_register_phys_mem(). Unfortunately, when I try to look at the ib_mr_attrs via ib_query_mr(), the call fails. When sdp_post_recv() calls ib_post_recv(), it looks to me like a DMA mapping has been set up between the SDP private receive buffers and the card. The receive buffers are kmalloc'd in sdp_init_qp(). I hope I have this right. But it sounds like it is possible I am hitting both issues you describe. I guess one way to check is to drop my test nodes down to 4GB or less, right? They currently have 16GB. Thanks again, Craig Steve Wise wrote: > Are these recv buffers user memory or kernel memory? I just submitted a > fix for a bug in build_phys_page_list(). Perhaps you're hitting this? > It would hit it if these are buffers allocated by the sdp kernel module > and registered via ib_reg_phys_mr(). > > Also: If sdp is using ib_get_dma_mr() to access all of memory, then > it won't work with the chelsio driver, which has a 4GB limit on MRs. So > cxgb3 creates dma_mrs that map only address 0..4GB-1. This just > doesn't work at all if there is an iommu mapping bus addresses above 4GB. > > Steve. > > > > Craig Prescott wrote: >> >> Hi Felix; >> >> Here are the last 4 WRs: >> >> ...
>> Entering iwch_post_receive >> iwch_post_receive: Dumping built work request before ring_doorbell: >> iwch_post_receive: WQE ffff810241d59e00: 17c001008000000d >> iwch_post_receive: WQE ffff810241d59e08: 0000000000000000 >> iwch_post_receive: WQE ffff810241d59e10: 0000000000000001 >> iwch_post_receive: WQE ffff810241d59e18: 000002ff00000810 >> iwch_post_receive: WQE ffff810241d59e20: 000000044eac3000 >> iwch_post_receive: WQE ffff810241d59e28: 0000000000000000 >> iwch_post_receive: WQE ffff810241d59e30: 0000000000000000 >> iwch_post_receive: WQE ffff810241d59e38: 0000000000000000 >> iwch_post_receive: WQE ffff810241d59e40: 0000000000000000 >> iwch_post_receive: WQE ffff810241d59e48: 0000000000000000 >> iwch_post_receive: WQE ffff810241d59e50: 0000000000000000 >> iwch_post_receive: WQE ffff810241d59e58: 0000000000000000 >> iwch_post_receive: WQE ffff810241d59e60: 0000000000000000 >> iwch_post_receive: returning 0 >> Entering iwch_post_receive >> iwch_post_receive: Dumping built work request before ring_doorbell: >> iwch_post_receive: WQE ffff810241d59e80: 17c001008000000d >> iwch_post_receive: WQE ffff810241d59e88: 0000000000000000 >> iwch_post_receive: WQE ffff810241d59e90: 0000000000000001 >> iwch_post_receive: WQE ffff810241d59e98: 000002ff00000810 >> iwch_post_receive: WQE ffff810241d59ea0: 000000044eac4000 >> iwch_post_receive: WQE ffff810241d59ea8: 0000000000000000 >> iwch_post_receive: WQE ffff810241d59eb0: 0000000000000000 >> iwch_post_receive: WQE ffff810241d59eb8: 0000000000000000 >> iwch_post_receive: WQE ffff810241d59ec0: 0000000000000000 >> iwch_post_receive: WQE ffff810241d59ec8: 0000000000000000 >> iwch_post_receive: WQE ffff810241d59ed0: 0000000000000000 >> iwch_post_receive: WQE ffff810241d59ed8: 0000000000000000 >> iwch_post_receive: WQE ffff810241d59ee0: 0000000000000000 >> iwch_post_receive: returning 0 >> Entering iwch_post_receive >> iwch_post_receive: Dumping built work request before ring_doorbell: >> iwch_post_receive: WQE 
ffff810241d59f00: 17c001008000000d >> iwch_post_receive: WQE ffff810241d59f08: 0000000000000000 >> iwch_post_receive: WQE ffff810241d59f10: 0000000000000001 >> iwch_post_receive: WQE ffff810241d59f18: 000002ff00000810 >> iwch_post_receive: WQE ffff810241d59f20: 000000044eac5000 >> iwch_post_receive: WQE ffff810241d59f28: 0000000000000000 >> iwch_post_receive: WQE ffff810241d59f30: 0000000000000000 >> iwch_post_receive: WQE ffff810241d59f38: 0000000000000000 >> iwch_post_receive: WQE ffff810241d59f40: 0000000000000000 >> iwch_post_receive: WQE ffff810241d59f48: 0000000000000000 >> iwch_post_receive: WQE ffff810241d59f50: 0000000000000000 >> iwch_post_receive: WQE ffff810241d59f58: 0000000000000000 >> iwch_post_receive: WQE ffff810241d59f60: 0000000000000000 >> iwch_post_receive: returning 0 >> Entering iwch_post_receive >> iwch_post_receive: Dumping built work request before ring_doorbell: >> iwch_post_receive: WQE ffff810241d59f80: 17c001008000000d >> iwch_post_receive: WQE ffff810241d59f88: 0000000000000000 >> iwch_post_receive: WQE ffff810241d59f90: 0000000000000001 >> iwch_post_receive: WQE ffff810241d59f98: 000002ff00000810 >> iwch_post_receive: WQE ffff810241d59fa0: 000000044eac6000 >> iwch_post_receive: WQE ffff810241d59fa8: 0000000000000000 >> iwch_post_receive: WQE ffff810241d59fb0: 0000000000000000 >> iwch_post_receive: WQE ffff810241d59fb8: 0000000000000000 >> iwch_post_receive: WQE ffff810241d59fc0: 0000000000000000 >> iwch_post_receive: WQE ffff810241d59fc8: 0000000000000000 >> iwch_post_receive: WQE ffff810241d59fd0: 0000000000000000 >> iwch_post_receive: WQE ffff810241d59fd8: 0000000000000000 >> iwch_post_receive: WQE ffff810241d59fe0: 0000000000000000 >> iwch_post_receive: returning 0 >> >> Thanks, >> Craig >> >> >> Felix Marti wrote: >>> Hi Craig, >>> >>> Can you please dump not only the last, but the last 4 WRs? 
>>> >>> Thanks, >>> felix >>> >>>> -----Original Message----- >>>> From: general-bounces at lists.openfabrics.org [mailto:general- >>>> bounces at lists.openfabrics.org] On Behalf Of Craig Prescott >>>> Sent: Wednesday, January 23, 2008 8:05 AM >>>> To: Steve Wise >>>> Cc: general at lists.openfabrics.org >>>> Subject: Re: [ofa-general] SDP and iWARP >>>> >>>> Steve Wise wrote: >>>>> Craig Prescott wrote: >>>>>> Steve Wise wrote: >>>>>>> Craig Prescott wrote: >>>>>>>> Steve Wise wrote: >>>>>>>>> Craig Prescott wrote: >>>>>>>>>> The above call also emits a couple of messages >>>>>>>>>> into the listener's syslog now : >>>>>>>>>> >>>>>>>>>> Jan 9 21:53:54 tebow2 kernel: iwch_ev_dispatch - CQE Err qpid >>>>>>>>>> 0x20 opcode 14 status 0x6 type 1 wrid.hi 0x0 wrid.lo 0x80000000 >>>>>>>>>> Jan 9 21:53:54 tebow2 kernel: post_qp_event - AE qpid 0x20 >>>> opcode >>>>>>>>>> 14 status 0x6 type 1 wrid.hi 0x0 wrid.lo 0x80000000 >>>>>>>>>> >>>>>>>>> This is an async event generated due to a failure processing a >>> SQ >>>>>>>>> WR, I think. opcodes and status codes for iw_cxgb3 are in >>>> cxio_wr.h. >>>>>>>>> type 1 means it was an egress (SQ) failure >>>>>>>>> status 0x6 is a base/bounds violation, >>>>>>>>> but 14 seems incorrect. That's not a valid T3 opcode. ???? >>>>>>>>> >>>>>>>> Ok, thanks! I guess I'm not sure what to make of that yet, >>>> though. >>>>>>> See where in iwch_accept_cr() the failure is happening. It >>> doesn't >>>>>>> look like send_mpa_reply() is being called. >>>>>>> >>>>>> The ECONNRESET is coming from here in iwch_accept_cr(): >>>>>> >>>>>> ... >>>>>> /* wait for wr_ack */ >>>>>> wait_event(ep->com.waitq, ep->com.rpl_done); >>>>>> err = ep->com.rpl_err; >>>>>> ... >>>>>> >>>>>> Is that what you thought was happening? >>>>> I don't know exactly what is going on! But the code above means >>> that >>>>> the firmware never successfully sent the last streaming message (the >>>>> mpa-start reply) and never transitioned the connection into rdma >>>> mode. 
>>>>> And the async error might indicate that some WR was posted prior to >>>>> doing the rdma_accept() and that WR had problems. >>>> Ok. I'm sorry for such a slow response. >>>> >>>>> a few questions: >>>>> >>>>> What firmware are you running? ethtool -i will tell you. >>>> [root at tebow1 ~]# ethtool -i eth4 >>>> driver: cxgb3 >>>> version: 1.0-ko >>>> firmware-version: T 5.0.0 TP 1.1.0 >>>> bus-info: 0000:86:00.0 >>>> >>>>> What ofed version exactly? >>>> OFED 1.3 daily from a few weeks back now: OFED-1.3-20080107-0942 >>>> >>>>> Does sdp post a SQ or RQ WR prior to doing the rdma_accept()? Can >>>> you >>>>> dump that work request? Maybe in iwch_post_send and iwch_post_recv, >>>>> dump the work request after it is built and before the code rings >>> the >>>>> doorbell. You can dump it as 8B flits, and be sure an put the flits >>>> in >>>>> host byte order. See cxio_dump_wqe() in cxio_dbg.c... >>>> The following is the last work request seen before rdma_accept(): >>>> >>>> iwch_post_receive: Dumping built work request before ring_doorbell: >>>> iwch_post_receive: WQE ffff810241d59f80: 17c001008000000d >>>> iwch_post_receive: WQE ffff810241d59f88: 0000000000000000 >>>> iwch_post_receive: WQE ffff810241d59f90: 0000000000000001 >>>> iwch_post_receive: WQE ffff810241d59f98: 000002ff00000810 >>>> iwch_post_receive: WQE ffff810241d59fa0: 000000044eac6000 >>>> iwch_post_receive: WQE ffff810241d59fa8: 0000000000000000 >>>> iwch_post_receive: WQE ffff810241d59fb0: 0000000000000000 >>>> iwch_post_receive: WQE ffff810241d59fb8: 0000000000000000 >>>> iwch_post_receive: WQE ffff810241d59fc0: 0000000000000000 >>>> iwch_post_receive: WQE ffff810241d59fc8: 0000000000000000 >>>> iwch_post_receive: WQE ffff810241d59fd0: 0000000000000000 >>>> iwch_post_receive: WQE ffff810241d59fd8: 0000000000000000 >>>> iwch_post_receive: WQE ffff810241d59fe0: 0000000000000000 >>>> iwch_post_receive: returning 0 >>>> >>>> This comes from sdp_init_qp(), via sdp_connect_handler(). 
>>>> There are a total of 64 work requests (all from >>>> iwch_post_receive()) generated while the netserver is >>>> trying to handle the RDMA_CM_EVENT_CONNECT_REQUEST. >>>> >>>> Can you help me decode the above work request? >>>> >>>> Thanks, >>>> Craig >>>> >>>> >>>> >>>> _______________________________________________ >>>> general mailing list >>>> general at lists.openfabrics.org >>>> http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general >>>> >>>> To unsubscribe, please visit >>> http://openib.org/mailman/listinfo/openib- >>>> general > From rdreier at cisco.com Thu Jan 24 13:00:37 2008 From: rdreier at cisco.com (Roland Dreier) Date: Thu, 24 Jan 2008 13:00:37 -0800 Subject: [ofa-general] Re: [PATCH] ib/limthca: Remove an always true condition In-Reply-To: <1201184628.6755.9.camel@mtls03> (Eli Cohen's message of "Thu, 24 Jan 2008 16:23:48 +0200") References: <1201184628.6755.9.camel@mtls03> Message-ID: Thanks for splitting this up... it makes review much easier, and indeed it seems either there is a bug in the existing code, or this patch is wrong: > - if (srq->first_free >= 0) > - *wqe_to_link(get_wqe(srq, srq->last_free)) = ind; > - else > - srq->first_free = ind; > - > + *wqe_to_link(get_wqe(srq, srq->last_free)) = ind; why is first_free always >= 0? I don't see anything that guarantees that, and in fact mthca_tavor_post_srq_recv and mthca_arbel_post_srq_recv both have code: ind = srq->first_free; if (ind < 0) { so if first_free is always non-negative, we could also delete these checks from the fast path. I do see the SRQ create code adds a spare entry: srq->max = align_queue_size(pd->context, attr->attr.max_wr, 1); but it seems we don't prevent the consumer from using this entry. However I can't recreate the reasoning why we need the spare entry... The most straightforward fix is to change the check for the SRQ being full in the post_srq_recv functions so that it keeps the spare entry, but I'd like to understand the code again first ;) - R. 
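For readers following the SRQ free-list question above, here is a minimal sketch of the idea being discussed. It is a toy model, not the mthca driver code: the names toy_srq, toy_srq_post and toy_srq_free and the size TOY_SRQ_MAX are all made up for illustration. It assumes the "keep the spare entry" check in the post path that the message suggests as a possible fix. Under that assumption the free list is never empty, so first_free never goes negative and the free path can link after last_free unconditionally.

```c
#define TOY_SRQ_MAX 8   /* hypothetical queue size, including one spare entry */

/* Toy stand-in for an SRQ: entries are linked by index, -1 ends the list. */
struct toy_srq {
    int next[TOY_SRQ_MAX];  /* next[i]: free entry following i, or -1 */
    int first_free;         /* head of the free list */
    int last_free;          /* tail of the free list */
};

/* Start with every entry on the free list, in index order. */
void toy_srq_init(struct toy_srq *srq)
{
    int i;

    for (i = 0; i < TOY_SRQ_MAX; ++i)
        srq->next[i] = (i + 1 < TOY_SRQ_MAX) ? i + 1 : -1;
    srq->first_free = 0;
    srq->last_free = TOY_SRQ_MAX - 1;
}

/*
 * Take one entry for a posted receive.  The "full" check refuses to hand
 * out the last remaining entry, so one spare always stays on the list.
 * Returns the entry index, or -1 if only the spare is left.
 */
int toy_srq_post(struct toy_srq *srq)
{
    int ind = srq->first_free;

    if (srq->next[ind] < 0)     /* ind is the spare: treat the queue as full */
        return -1;
    srq->first_free = srq->next[ind];
    return ind;
}

/*
 * Give a completed entry back.  Because the list always holds at least
 * the spare, last_free is valid and no "is the list empty?" branch is
 * needed before linking.
 */
void toy_srq_free(struct toy_srq *srq, int ind)
{
    srq->next[ind] = -1;
    srq->next[srq->last_free] = ind;
    srq->last_free = ind;
}
```

In this model at most TOY_SRQ_MAX - 1 entries can be outstanding at once. Without the check in toy_srq_post() the list could drain completely, and the unconditional link in toy_srq_free() would then write through a stale tail, which is the case the removed branch used to handle.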
From swise at opengridcomputing.com Thu Jan 24 13:00:15 2008 From: swise at opengridcomputing.com (Steve Wise) Date: Thu, 24 Jan 2008 15:00:15 -0600 Subject: [ofa-general] SDP and iWARP In-Reply-To: <4798FB7E.8050200@hpc.ufl.edu> References: <4783A5B0.6040603@hpc.ufl.edu><4783B3F5.20600@opengridcomputing.com><4783BDD5.7000702@hpc.ufl.edu><4783C326.3070306@opengridcomputing.com><478634A5.3080204@hpc.ufl.edu><47863794.9080709@opengridcomputing.com><47865A4A.4070603@hpc.ufl.edu><47865E5B.4030607@opengridcomputing.com><4787936E.5010603@hpc.ufl.edu> <4787977E.509@opengridcomputing.com> <479765AC.1040600@hpc.ufl.edu> <8A71B368A89016469F72CD08050AD33401FCDA2F@maui.asicdesigners.com> <47977262.1060906@hpc.ufl.edu> <4798CB4C.7070706@opengridcomputing.com> <4798FB7E.8050200@hpc.ufl.edu> Message-ID: <4798FC5F.3060807@opengridcomputing.com> Craig Prescott wrote: > > Hi Steve; > > The SDP socket gets an associated mr when sdp_init_qp() calls > ib_get_dma_mr(). It looks to me like this drills down into > the provider layer, which will ultimately end up calling > build_phys_page_list() from iwch_register_phys_mem(). > > Unfortunately, when I try to look at the ib_mr_attrs via > ib_query_mr(), the call fails. > > When sdp_post_recv() calls ib_post_recv(), it looks to me > like a DMA mapping has been set up between the SDP private > receive buffers and card. The receive buffers are kmalloc'd > in sdp_init_qp(). > > I hope I have this right. But it sounds like it is possible > I am hitting both issues you describe. > > I guess one way to check is to drop my test nodes down to 4GB > or less, right? They currently have 16GB. > Drop them down to 1 or 2GB and try it. 4GB still requires the iommu to remap things above 4GB. Sorry about that. I forgot about the 4GB limitation and get_dma_mr(). I guess the chelsio driver should really just fail the get_dma_mr() call since it doesn't properly support it. There is one other experiment you could try. 
You could try using lkey 0 for any sgl used in a send or receive work request. This maps to the zero stag in iwarp lingo. But I haven't tested that yet :) Steve. From weiny2 at llnl.gov Thu Jan 24 13:26:37 2008 From: weiny2 at llnl.gov (Ira Weiny) Date: Thu, 24 Jan 2008 13:26:37 -0800 Subject: [ofa-general] How to inspect routing? In-Reply-To: References: Message-ID: <20080124132637.5033cfe6.weiny2@llnl.gov> On Thu, 24 Jan 2008 13:39:03 -0700 "Chris Worley" wrote: > I'm working w/ a subnet manager that gets confused after a few hours/day. > > Is there any way to dump routing information from a node, and verify > that the routes that will be used by a node don't lead to the proper > destination? At the lowest level there are: dump_lfts.sh dump_mfts.sh Which dump the forwarding tables from all the switches in the fabric. Also, to check for a route from node A to B you can use ibtracert. ibutils might have something which displays this in a gui, however, I am not familiar with those tools. Hope this helps, Ira From rdreier at cisco.com Thu Jan 24 13:29:07 2008 From: rdreier at cisco.com (Roland Dreier) Date: Thu, 24 Jan 2008 13:29:07 -0800 Subject: [ofa-general] [PATCH] IPoIB UD 4K MTU support In-Reply-To: <1201162492.9739.21.camel@localhost.localdomain> (Shirley Ma's message of "Thu, 24 Jan 2008 00:14:51 -0800") References: <1201025321.756.33.camel@localhost.localdomain> <1201102346.6925.215.camel@mtls03> <1201068498.756.59.camel@localhost.localdomain> <4e6a6b3c0801231354v24baa6d8q5034a1cdaf3c704f@mail.gmail.com> <1201162492.9739.21.camel@localhost.localdomain> Message-ID: OK, this is some half-baked thinking based on reading the patch. I don't know the right answer here -- I am hoping to spark discussion that makes the correct thing clear: > +static inline int ipoib_ud_mtu(unsigned int ib_mtu) > +{ > + return (ib_mtu < 4096) ? 
(ib_mtu - IPOIB_ENCAP_LEN) : > + (ib_mtu - IB_GRH_BYTES - IPOIB_ENCAP_LEN - 4); > +} reading this, my first reaction is that the magic 4096 constant should have a name. And in fact the most obvious name for it is PAGE_SIZE. However, this means that (assuming everyone can handle an IB MTU of 4096), systems with PAGE_SIZE > 4096 would come up with a different IPoIB MTU than systems with PAGE_SIZE == 4096. And I'm not sure whether that would cause problems or not. (eg TCP should be OK) But then in general, if we use the approach here (which is very appealing because it's so simple), Linux will potentially have an MTU different from other OSes that might choose a different way to handle an IB MTU of 4096. So does that mean that we should use a more complicated approach to get the max possible MTU of 4096 - 4? (As a side note, that magic constant of 4 above, which comes from: /* * IB will leave a 40 byte gap for a GRH and IPoIB adds a 4 byte * header. So we need 4 more bytes to get to 48 and align the * IP header to a multiple of 16. */ skb_reserve(skb, 4); probably wants a name too, in the interest of maintainability) - R. From hrosenstock at xsigo.com Thu Jan 24 13:33:42 2008 From: hrosenstock at xsigo.com (Hal Rosenstock) Date: Thu, 24 Jan 2008 13:33:42 -0800 Subject: [ofa-general] [PATCH] opensm/opensm/osm_subnet.c: add a comment of valid "force_link_speed" values In-Reply-To: <20080123124028.75708ab0.weiny2@llnl.gov> References: <20080123124028.75708ab0.weiny2@llnl.gov> Message-ID: <1201210422.25913.94.camel@hrosenstock-ws.xsigo.com> On Wed, 2008-01-23 at 12:40 -0800, Ira Weiny wrote: > >From 020618d66bdcecba6f49bc7f48ae40485d657437 Mon Sep 17 00:00:00 2001 > From: Ira K. Weiny > Date: Wed, 23 Jan 2008 12:39:05 -0800 > Subject: [PATCH] opensm/opensm/osm_subnet.c: add a comment of valid "force_link_speed" values > > > Signed-off-by: Ira K. 
Weiny > --- > opensm/opensm/osm_subnet.c | 13 +++++++++++-- > 1 files changed, 11 insertions(+), 2 deletions(-) > > diff --git a/opensm/opensm/osm_subnet.c b/opensm/opensm/osm_subnet.c > index c9b4d57..7b14e0c 100644 > --- a/opensm/opensm/osm_subnet.c > +++ b/opensm/opensm/osm_subnet.c > @@ -1469,10 +1469,19 @@ ib_api_status_t osm_subn_write_conf_file(IN osm_subn_opt_t * const p_opts) > "leaf_head_of_queue_lifetime 0x%02x\n\n" > "# Limit the maximal operational VLs\n" > "max_op_vls %u\n\n" > - "# Force link speed enable on switch links\n" > + "# Force PortInfo:LinkSpeedEnabled on switch ports\n" > "# If 0, don't modify PortInfo:LinkSpeedEnabled on switch port\n" > "# Otherwise, use value for PortInfo:LinkSpeedEnabled on switch port\n" > - "# Default is 15 (to set to PortInfo:LinkSpeedSupported)\n\n" > + "# Values are (IB Spec 1.2, 14.2.5.6 Table 145 \"PortInfo\")\n" > + "# 1: 2.5 Gbps\n" > + "# 2: 5.0 Gbps\n" > + "# 3: 2.5 or 5.0 Gbps\n" > + "# 4: 10.0 Gbps\n" > + "# 5: 2.5 or 10.0 Gbps\n" > + "# 6: 5.0 or 10.0 Gbps\n" > + "# 7: 2.5 or 5.0 or 10.0 Gbps\n" Should the values be per IBA 1.2.1 r.t. 1.2 ? 
> + "# 8-14: Reserved\n" > + "# Default 15: set to PortInfo:LinkSpeedSupported\n\n" > "force_link_speed %u\n\n" > "# The subnet_timeout code that will be set for all the ports\n" > "# The actual timeout is 4.096usec * 2^\n" > _______________________________________________ > general mailing list > general at lists.openfabrics.org > http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general > > To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general From hrosenstock at xsigo.com Thu Jan 24 13:34:48 2008 From: hrosenstock at xsigo.com (Hal Rosenstock) Date: Thu, 24 Jan 2008 13:34:48 -0800 Subject: [ofa-general] [PATCH] IB/ehca: Prevent sending UD packets to QP0 In-Reply-To: <200801241759.09065.fenkes@de.ibm.com> References: <200801241759.09065.fenkes@de.ibm.com> Message-ID: <1201210488.25913.97.camel@hrosenstock-ws.xsigo.com> On Thu, 2008-01-24 at 17:59 +0100, Joachim Fenkes wrote: > IB spec doesn't allow packets to QP0 sent on any other VL than VL15. > Hardware doesn't filter those packets on the send side, so we need to do > this in the driver and firmware. > > As eHCA doesn't support QP0, we can just filter out all traffic going to > QP0, regardless of SL or VL. Is this a hardware or software limitation ? If it is software, is there any plan to enable QP0 support ? -- Hal > Signed-off-by: Joachim Fenkes > --- > drivers/infiniband/hw/ehca/ehca_reqs.c | 4 ++++ > 1 files changed, 4 insertions(+), 0 deletions(-) > > diff --git a/drivers/infiniband/hw/ehca/ehca_reqs.c b/drivers/infiniband/hw/ehca/ehca_reqs.c > index 3aacc8c..2ce8cff 100644 > --- a/drivers/infiniband/hw/ehca/ehca_reqs.c > +++ b/drivers/infiniband/hw/ehca/ehca_reqs.c > @@ -209,6 +209,10 @@ static inline int ehca_write_swqe(struct ehca_qp *qp, > ehca_gen_err("wr.ud.ah is NULL. qp=%p", qp); > return -EINVAL; > } > + if (unlikely(send_wr->wr.ud.remote_qpn == 0)) { > + ehca_gen_err("dest QP# is 0. 
qp=%x", qp->real_qp_num); > + return -EINVAL; > + } > my_av = container_of(send_wr->wr.ud.ah, struct ehca_av, ib_ah); > wqe_p->u.ud_av.ud_av = my_av->av; > From weiny2 at llnl.gov Thu Jan 24 13:40:53 2008 From: weiny2 at llnl.gov (Ira Weiny) Date: Thu, 24 Jan 2008 13:40:53 -0800 Subject: [ofa-general] [PATCH] opensm/opensm/osm_subnet.c: add a comment of valid "force_link_speed" values In-Reply-To: <1201210422.25913.94.camel@hrosenstock-ws.xsigo.com> References: <20080123124028.75708ab0.weiny2@llnl.gov> <1201210422.25913.94.camel@hrosenstock-ws.xsigo.com> Message-ID: <20080124134053.428a5492.weiny2@llnl.gov> On Thu, 24 Jan 2008 13:33:42 -0800 Hal Rosenstock wrote: > On Wed, 2008-01-23 at 12:40 -0800, Ira Weiny wrote: > > >From 020618d66bdcecba6f49bc7f48ae40485d657437 Mon Sep 17 00:00:00 2001 > > From: Ira K. Weiny > > Date: Wed, 23 Jan 2008 12:39:05 -0800 > > Subject: [PATCH] opensm/opensm/osm_subnet.c: add a comment of valid "force_link_speed" values > > > > > > Signed-off-by: Ira K. Weiny > > --- > > opensm/opensm/osm_subnet.c | 13 +++++++++++-- > > 1 files changed, 11 insertions(+), 2 deletions(-) > > > > diff --git a/opensm/opensm/osm_subnet.c b/opensm/opensm/osm_subnet.c > > index c9b4d57..7b14e0c 100644 > > --- a/opensm/opensm/osm_subnet.c > > +++ b/opensm/opensm/osm_subnet.c > > @@ -1469,10 +1469,19 @@ ib_api_status_t osm_subn_write_conf_file(IN osm_subn_opt_t * const p_opts) > > "leaf_head_of_queue_lifetime 0x%02x\n\n" > > "# Limit the maximal operational VLs\n" > > "max_op_vls %u\n\n" > > - "# Force link speed enable on switch links\n" > > + "# Force PortInfo:LinkSpeedEnabled on switch ports\n" > > "# If 0, don't modify PortInfo:LinkSpeedEnabled on switch port\n" > > "# Otherwise, use value for PortInfo:LinkSpeedEnabled on switch port\n" > > - "# Default is 15 (to set to PortInfo:LinkSpeedSupported)\n\n" > > + "# Values are (IB Spec 1.2, 14.2.5.6 Table 145 \"PortInfo\")\n" > > + "# 1: 2.5 Gbps\n" > > + "# 2: 5.0 Gbps\n" > > + "# 3: 2.5 or 5.0 Gbps\n" > > + 
"# 4: 10.0 Gbps\n" > > + "# 5: 2.5 or 10.0 Gbps\n" > > + "# 6: 5.0 or 10.0 Gbps\n" > > + "# 7: 2.5 or 5.0 or 10.0 Gbps\n" > > Should the values be per IBA 1.2.1 r.t. 1.2 ? I'm sorry did I miss the release of 1.2.1? I got those out of the 1.2 Release PDF. Are they different in 1.2.1? Ira > > > + "# 8-14: Reserved\n" > > + "# Default 15: set to PortInfo:LinkSpeedSupported\n\n" > > "force_link_speed %u\n\n" > > "# The subnet_timeout code that will be set for all the ports\n" > > "# The actual timeout is 4.096usec * 2^\n" > > _______________________________________________ > > general mailing list > > general at lists.openfabrics.org > > http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general > > > > To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general From worleys at gmail.com Thu Jan 24 13:41:30 2008 From: worleys at gmail.com (Chris Worley) Date: Thu, 24 Jan 2008 14:41:30 -0700 Subject: [ofa-general] How to inspect routing? In-Reply-To: <1201210401.25913.92.camel@hrosenstock-ws.xsigo.com> References: <1201210401.25913.92.camel@hrosenstock-ws.xsigo.com> Message-ID: On Jan 24, 2008 2:33 PM, Hal Rosenstock wrote: > On Thu, 2008-01-24 at 13:39 -0700, Chris Worley wrote: > > I'm working w/ a subnet manager that gets confused after a few hours/day. > > Is it OpenSM or something other SM ? Another switch-specific SM that claims to be compatible w/ OFED on the nodes. Nodes "drop-out" after a few hours or days: they can no longer talk to some other nodes. This behavior doesn't happen w/ OpenSM, but we need to test the switches SM. Note the intentional lack of brand names. Thanks for the info. I'll use it on the next attempt to use the alternate SM. 
Chris From hrosenstock at xsigo.com Thu Jan 24 13:41:53 2008 From: hrosenstock at xsigo.com (Hal Rosenstock) Date: Thu, 24 Jan 2008 13:41:53 -0800 Subject: [ofa-general] [PATCH] opensm/opensm/osm_subnet.c: add a comment of valid "force_link_speed" values In-Reply-To: <20080124134053.428a5492.weiny2@llnl.gov> References: <20080123124028.75708ab0.weiny2@llnl.gov> <1201210422.25913.94.camel@hrosenstock-ws.xsigo.com> <20080124134053.428a5492.weiny2@llnl.gov> Message-ID: <1201210913.25913.105.camel@hrosenstock-ws.xsigo.com> On Thu, 2008-01-24 at 13:40 -0800, Ira Weiny wrote: > On Thu, 24 Jan 2008 13:33:42 -0800 > Hal Rosenstock wrote: > > > On Wed, 2008-01-23 at 12:40 -0800, Ira Weiny wrote: > > > >From 020618d66bdcecba6f49bc7f48ae40485d657437 Mon Sep 17 00:00:00 2001 > > > From: Ira K. Weiny > > > Date: Wed, 23 Jan 2008 12:39:05 -0800 > > > Subject: [PATCH] opensm/opensm/osm_subnet.c: add a comment of valid "force_link_speed" values > > > > > > > > > Signed-off-by: Ira K. Weiny > > > --- > > > opensm/opensm/osm_subnet.c | 13 +++++++++++-- > > > 1 files changed, 11 insertions(+), 2 deletions(-) > > > > > > diff --git a/opensm/opensm/osm_subnet.c b/opensm/opensm/osm_subnet.c > > > index c9b4d57..7b14e0c 100644 > > > --- a/opensm/opensm/osm_subnet.c > > > +++ b/opensm/opensm/osm_subnet.c > > > @@ -1469,10 +1469,19 @@ ib_api_status_t osm_subn_write_conf_file(IN osm_subn_opt_t * const p_opts) > > > "leaf_head_of_queue_lifetime 0x%02x\n\n" > > > "# Limit the maximal operational VLs\n" > > > "max_op_vls %u\n\n" > > > - "# Force link speed enable on switch links\n" > > > + "# Force PortInfo:LinkSpeedEnabled on switch ports\n" > > > "# If 0, don't modify PortInfo:LinkSpeedEnabled on switch port\n" > > > "# Otherwise, use value for PortInfo:LinkSpeedEnabled on switch port\n" > > > - "# Default is 15 (to set to PortInfo:LinkSpeedSupported)\n\n" > > > + "# Values are (IB Spec 1.2, 14.2.5.6 Table 145 \"PortInfo\")\n" > > > + "# 1: 2.5 Gbps\n" > > > + "# 2: 5.0 Gbps\n" > > > + "# 3: 
2.5 or 5.0 Gbps\n" > > > + "# 4: 10.0 Gbps\n" > > > + "# 5: 2.5 or 10.0 Gbps\n" > > > + "# 6: 5.0 or 10.0 Gbps\n" > > > + "# 7: 2.5 or 5.0 or 10.0 Gbps\n" > > > > Should the values be per IBA 1.2.1 r.t. 1.2 ? > > I'm sorry did I miss the release of 1.2.1? Look on the IBTA site. vol 1's been released. > I got those out of the 1.2 Release > PDF. Are they different in 1.2.1? Yes. > Ira > > > > > > + "# 8-14: Reserved\n" > > > + "# Default 15: set to PortInfo:LinkSpeedSupported\n\n" > > > "force_link_speed %u\n\n" > > > "# The subnet_timeout code that will be set for all the ports\n" > > > "# The actual timeout is 4.096usec * 2^\n" > > > _______________________________________________ > > > general mailing list > > > general at lists.openfabrics.org > > > http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general > > > > > > To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general From transter at gmail.com Thu Jan 24 13:42:04 2008 From: transter at gmail.com (lbt) Date: Thu, 24 Jan 2008 13:42:04 -0800 Subject: [ofa-general] QP types supported with iWarp? In-Reply-To: <4798EEFB.7020807@opengridcomputing.com> References: <4798ED51.8070206@opengridcomputing.com> <4798EEFB.7020807@opengridcomputing.com> Message-ID: I was interested in UD QP's so that I could set up multicast groups. Let me take a closer look at the RDMA Consortium; kind of new to this stuff, so thanks for the pointer Steve! Lan On 1/24/08, Steve Wise wrote: > > Steve Wise wrote: > > lbt wrote: > >> Hello, > >> > >> I noticed that the Chelsio and NetEffect iWarp drivers in OFED 1.3 > >> only seem to support RC QP's (i.e. IB_QPT_RC type). Is there any plan > >> to support other QP types? > >> > >> Thanks! > >> Lan > >> > > > > Currently the iWARP protocols and RDMA Consortium verbs only device > ^^^^^^^^^^^ > that's "only define" > > point to point / RC... > > > > What type are you interested in? 
From swise at opengridcomputing.com Thu Jan 24 13:42:37 2008 From: swise at opengridcomputing.com (Steve Wise) Date: Thu, 24 Jan 2008 15:42:37 -0600 Subject: [ofa-general] help on /sys/class structure and rdma Message-ID: <4799064D.4060705@opengridcomputing.com> Hey Roland, I just found a bug in a drop I did for libcxgb3. I changed the code to read the adapter fw version sysfs file to attempt to check for incompatible fw/lib cases. I was reading this file in cxgb3_driver_init(): /sys/class/infiniband_verbs/uverbs0/device/infiniband:cxgb3_0/fw_ver I'm using ibv_read_sysfs_file() and the uverbs_sys_path passed in, but building the string to dive down into the device/infiniband:blah directory. However this infiniband:cxgb3_0 link apparently is not present on a rhel4u5 distro. I assumed this stuff would always be consistent for OFED installations. Is that not true? What is the preferred way for a lib to get to the device's sysfs files? Thanks! Steve. From hrosenstock at xsigo.com Thu Jan 24 13:33:21 2008 From: hrosenstock at xsigo.com (Hal Rosenstock) Date: Thu, 24 Jan 2008 13:33:21 -0800 Subject: [ofa-general] How to inspect routing? In-Reply-To: References: Message-ID: <1201210401.25913.92.camel@hrosenstock-ws.xsigo.com> On Thu, 2008-01-24 at 13:39 -0700, Chris Worley wrote: > I'm working w/ a subnet manager that gets confused after a few hours/day. Is it OpenSM or some other SM? > Is there any way to dump routing information from a node, and verify > that the routes that will be used by a node don't lead to the proper > destination? 
I think there are several ways to go about this: management has dump_lfts.sh (and dump_mfts.sh) to dump the forwarding tables in the switches in the subnet. (They can also be loaded into ibsim). There are also tools like ibtracert which will walk a path from source to dest. ibutils has ibdiagnet which I think analyzes this (I'm basing this on some output I've seen). There's also ibdiagpath. -- Hal From sweitzen at cisco.com Thu Jan 24 13:58:20 2008 From: sweitzen at cisco.com (Scott Weitzenkamp (sweitzen)) Date: Thu, 24 Jan 2008 13:58:20 -0800 Subject: [ofa-general] RE: [ewg] Not seeing any SDP performance changes inOFED 1.3 beta, and I get Oops when enabling sdp_zcopy_thresh In-Reply-To: References: <47445630.10000@dev.mellanox.co.il> <000001c8338c$206e80d0$614b8270$@rr.com> <4798A9F5.7030109@gmail.com> Message-ID: Jim, Like I've said before, I don't see any change in throughput with SDP zcopy, plus the throughput bounces around. When you run netperf with -D, do you see variations in throughput? Here's an example from a dual socket Xeon 5355 quad core RHEL5 x86_64 system with ConnectX 2.3.0 firmware; the interim throughput results bounce around between 4 and 7 Gbps. 
[releng at svbu-qaclus-98 ~]$ cat /sys/module/ib_sdp/parameters/sdp_zcopy_thresh 16384 [releng at svbu-qaclus-98 ~]$ LD_PRELOAD=libsdp.so netperf241 -C -c -P 0 -t TCP_STREAM -H 192.168.1.127 -D -l 60 -- -m 1000000 Interim result: 6676.50 10^6bits/s over 1.00 seconds Interim result: 6674.47 10^6bits/s over 1.00 seconds Interim result: 6687.89 10^6bits/s over 1.00 seconds Interim result: 7075.40 10^6bits/s over 1.00 seconds Interim result: 7065.08 10^6bits/s over 1.00 seconds Interim result: 7074.69 10^6bits/s over 1.00 seconds Interim result: 6667.10 10^6bits/s over 1.06 seconds Interim result: 4492.29 10^6bits/s over 1.48 seconds Interim result: 4503.65 10^6bits/s over 1.00 seconds Interim result: 4481.25 10^6bits/s over 1.01 seconds Interim result: 4495.91 10^6bits/s over 1.00 seconds Interim result: 4521.51 10^6bits/s over 1.00 seconds Interim result: 4466.58 10^6bits/s over 1.01 seconds Interim result: 4482.09 10^6bits/s over 1.00 seconds Interim result: 4480.21 10^6bits/s over 1.00 seconds Interim result: 4490.07 10^6bits/s over 1.00 seconds Interim result: 4479.47 10^6bits/s over 1.00 seconds Interim result: 4480.30 10^6bits/s over 1.00 seconds Interim result: 4489.14 10^6bits/s over 1.00 seconds Interim result: 4484.38 10^6bits/s over 1.00 seconds Interim result: 4473.64 10^6bits/s over 1.00 seconds Interim result: 4479.71 10^6bits/s over 1.00 seconds Interim result: 4486.54 10^6bits/s over 1.00 seconds Interim result: 4456.65 10^6bits/s over 1.01 seconds Interim result: 4483.70 10^6bits/s over 1.00 seconds Interim result: 4486.41 10^6bits/s over 1.00 seconds Interim result: 4489.58 10^6bits/s over 1.00 seconds Interim result: 4478.15 10^6bits/s over 1.00 seconds Interim result: 4476.67 10^6bits/s over 1.00 seconds Interim result: 4496.49 10^6bits/s over 1.00 seconds Interim result: 4489.26 10^6bits/s over 1.00 seconds Interim result: 4479.86 10^6bits/s over 1.00 seconds Interim result: 4500.97 10^6bits/s over 1.00 seconds Interim result: 4473.96 10^6bits/s 
over 1.00 seconds Interim result: 7346.56 10^6bits/s over 1.00 seconds Interim result: 7524.94 10^6bits/s over 1.00 seconds Interim result: 7540.16 10^6bits/s over 1.00 seconds Interim result: 7553.53 10^6bits/s over 1.00 seconds Interim result: 7552.08 10^6bits/s over 1.00 seconds Interim result: 7550.08 10^6bits/s over 1.00 seconds Interim result: 7554.35 10^6bits/s over 1.00 seconds Interim result: 7550.85 10^6bits/s over 1.00 seconds Interim result: 7557.27 10^6bits/s over 1.00 seconds Interim result: 7568.28 10^6bits/s over 1.00 seconds Interim result: 7497.24 10^6bits/s over 1.01 seconds Interim result: 7436.44 10^6bits/s over 1.01 seconds Interim result: 6098.26 10^6bits/s over 1.22 seconds Interim result: 5644.82 10^6bits/s over 1.08 seconds Interim result: 5639.07 10^6bits/s over 1.00 seconds Interim result: 5636.32 10^6bits/s over 1.00 seconds Interim result: 5640.45 10^6bits/s over 1.00 seconds Interim result: 6319.06 10^6bits/s over 1.00 seconds Interim result: 7324.10 10^6bits/s over 1.00 seconds Interim result: 7323.53 10^6bits/s over 1.00 seconds Interim result: 7333.88 10^6bits/s over 1.00 seconds Interim result: 7172.70 10^6bits/s over 1.02 seconds Interim result: 4488.97 10^6bits/s over 1.60 seconds Interim result: 4492.37 10^6bits/s over 1.00 seconds 87380 16384 1000000 60.00 5701.15 17.41 16.26 2.001 1.870 Scott Weitzenkamp SQA and Release Manager Server Virtualization Business Unit Cisco Systems > -----Original Message----- > From: Jim Mott [mailto:jim at mellanox.com] > Sent: Thursday, January 24, 2008 9:47 AM > To: Scott Weitzenkamp (sweitzen); Weikuan Yu > Cc: general at lists.openfabrics.org > Subject: RE: [ofa-general] RE: [ewg] Not seeing any SDP > performance changes inOFED 1.3 beta, and I get Oops when > enabling sdp_zcopy_thresh > > I am really puzzled. The majority of my testing has been between > Rhat4U4 and Rhat5. 
Using netperf command lines of the form: > netperf -C -c -P 0 -t TCP_RR -H 193.168.10.143 -l 60 -- -r 64 > netperf -C -c -P 0 -t TCP_STREAM -H 193.168.10.143 -l 60 > -- -r 1000000 > and a process of: > - set sdp_zcopy_thresh=0, run bandwidth test > - set sdp_zcopy_thresh=size, run bandwidth test > I repeatedly get results that look like this: > size SDP Bzcopy > 65536 7375.00 7515.98 > 131072 7465.70 8105.58 > 1000000 6541.87 9948.76 > > These numbers are from high end (2-socket, quad-core) machines. When > you > use smaller machines, like the AMD dual-core shown below, the > differences > between SDP with and without bzcopy are more striking. > > The process to start the netserver is: > export LD_LIBRARY_PATH=/usr/local/ofed/lib64:/usr/local/ofed/lib > export LD_PRELOAD=libsdp.so > export LIBSDP_CONFIG_FILE=/etc/infiniband/libsdp.conf > netserver > > The process to start the netperf is similar: > export LD_LIBRARY_PATH=/usr/local/ofed/lib64:/usr/local/ofed/lib > export LD_PRELOAD=libsdp.so > export LIBSDP_CONFIG_FILE=/etc/infiniband/libsdp.conf > netperf -C -c -P 0 -t TCP_STREAM -H 193.168.10.143 -l 60 > -- -r 1000000 > > You can unload and reload ib_sdp between tests, but I just echo 0 and > echo size into sdp_zcopy_thresh on the sending side. Note that it is > in a different place on Rhat4u4 and Rhat5. > > My libsdp.conf is the default that ships with OFED. Stripping the > comments (grep -v), it is just: > log min-level 9 destination file libsdp.log > use both server * *:* > use both client * *:* > Note that if you build locally: > cd /tmp/openib_gen2/xxxx/ofa_1_3_dev_kernel > make install > the libsdp.conf file seems to get lost. You must restore it by > hand. 
> > I have a shell script that automates this testing for a > wide range of message sizes: > 64 128 512 1024 2048 4096 8192 16000 32768 65536 131072 1000000 > on multiple transports: > IP both "echo datagram > /sys/class/net/ib0/mode" > IP-CM both "echo connected > /sys/class/net/ib0/mode" > SDP both > Bzcopy TCP_STREAM > Where both is TCP_RR and TCP_STREAM testing. > > The variance in SDP bandwidth results can be 10%-15% between > runs. The > difference between Bzcopy and non-Bzcopy is always very > visible for 128K > and up tests though. > > Could some other people please try to run some of these > tests? If only to > help me know if I am crazy? > > Thanks, > Jim > > Jim Mott > Mellanox Technologies Ltd. > mail: jim at mellanox.com > Phone: 512-294-5481 > > > -----Original Message----- > From: Scott Weitzenkamp (sweitzen) [mailto:sweitzen at cisco.com] > Sent: Thursday, January 24, 2008 11:17 AM > To: Jim Mott; Weikuan Yu > Cc: ewg at lists.openfabrics.org; general at lists.openfabrics.org > Subject: RE: [ofa-general] RE: [ewg] Not seeing any SDP performance > changes inOFED 1.3 beta, and I get Oops when enabling sdp_zcopy_thresh > > I've tested on RHEL4 and RHEL5, and see no sdp_zcopy_thresh > improvement > for any message size, as measured with netperf, for any Arbel or > ConnectX HCA. > > Scott > > > > -----Original Message----- > > From: Jim Mott [mailto:jim at mellanox.com] > > Sent: Thursday, January 24, 2008 7:57 AM > > To: Weikuan Yu; Scott Weitzenkamp (sweitzen) > > Cc: ewg at lists.openfabrics.org; general at lists.openfabrics.org > > Subject: RE: [ofa-general] RE: [ewg] Not seeing any SDP > > performance changes inOFED 1.3 beta, and I get Oops when > > enabling sdp_zcopy_thresh > > > > Hi, > > 64K is borderline for seeing bzcopy effect. 
Using an AMD > > 6000+ (3 Ghz > > dual core) in Asus M2A-VM motherboard with ConnectX running > > 2.3 firmware > > and OFED 1.3-rc3 stack running on 2.6.23.8 kernel.org kernel, > > I ran the > > test for 128K: > > 5546 sdp_zcopy_thresh=0 (off) > > 8709 sdp_zcopy_thresh=65536 > > > > For these tests, I just have LD_PRELOAD set in my environment. > > > > ======================= > > > > I see that TCP_MAXSEG is not being handled by libsdp and will > > look into > > it. > > > > > > [root at dirk ~]# modprobe ib_sdp > > [root at dirk ~]# netperf -v2 -4 -H 193.168.10.198 -l 30 -t > TCP_STREAM -c > > -C -- -m 128K > > TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to > > 193.168.10.198 > > (193.168.10.198) port 0 AF_INET > > netperf: get_tcp_info: getsockopt TCP_MAXSEG: errno 92 > > Recv Send Send Utilization > Service > > Demand > > Socket Socket Message Elapsed Send Recv Send > > Recv > > Size Size Size Time Throughput local remote local > > remote > > bytes bytes bytes secs. 10^6bits/s % S % S us/KB > > us/KB > > > > 87380 16384 131072 30.01 5545.69 51.47 14.43 1.521 > > 1.706 > > > > Alignment Offset Bytes Bytes Sends Bytes > > Recvs > > Local Remote Local Remote Xfered Per Per > > Send Recv Send Recv Send (avg) > Recv (avg) > > 8 8 0 0 2.08e+10 131072.00 158690 > 33135.60 > > 627718 > > > > Maximum > > Segment > > Size (bytes) > > -1 > > [root at dirk ~]# echo 65536 > > >/sys/module/ib_sdp/parameters/sdp_zcopy_thresh > > [root at dirk ~]# netperf -v2 -4 -H 193.168.10.198 -l 30 -t > TCP_STREAM -c > > -C -- -m 128K > > TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to > > 193.168.10.198 > > (193.168.10.198) port 0 AF_INET > > netperf: get_tcp_info: getsockopt TCP_MAXSEG: errno 92 > > Recv Send Send Utilization > Service > > Demand > > Socket Socket Message Elapsed Send Recv Send > > Recv > > Size Size Size Time Throughput local remote local > > remote > > bytes bytes bytes secs. 
10^6bits/s % S % S us/KB > > us/KB > > > > 87380 16384 131072 30.01 8708.58 50.63 14.55 0.953 > > 1.095 > > > > Alignment Offset Bytes Bytes Sends Bytes > > Recvs > > Local Remote Local Remote Xfered Per Per > > Send Recv Send Recv Send (avg) > Recv (avg) > > 8 8 0 0 3.267e+10 131072.00 249228 > 26348.30 > > 1239807 > > > > Maximum > > Segment > > Size (bytes) > > -1 > > > > Thanks, > > JIm > > > > Jim Mott > > Mellanox Technologies Ltd. > > mail: jim at mellanox.com > > Phone: 512-294-5481 > > > > > > -----Original Message----- > > From: Weikuan Yu [mailto:weikuan.yu at gmail.com] > > Sent: Thursday, January 24, 2008 9:09 AM > > To: Scott Weitzenkamp (sweitzen) > > Cc: Jim Mott; ewg at lists.openfabrics.org; > general at lists.openfabrics.org > > Subject: Re: [ofa-general] RE: [ewg] Not seeing any SDP performance > > changes inOFED 1.3 beta, and I get Oops when enabling > sdp_zcopy_thresh > > > > Hi, Scott, > > > > I have been running SDP tests across two woodcrest nodes > with 4x DDR > > cards using OFED-1.2.5.4. The card/firmware info is below. > > > > CA 'mthca0' > > CA type: MT25208 > > Number of ports: 2 > > Firmware version: 5.1.400 > > Hardware version: a0 > > Node GUID: 0x0002c90200228e0c > > System image GUID: 0x0002c90200228e0f > > > > I could not get a bandwidth more than 5Gbps like you have > shown here. > > Wonder if I need to upgrade to the latest software or firmware? Any > > suggestions? > > > > Thanks, > > --Weikuan > > > > > > TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to > > 192.168.225.77 > > (192.168 > > .225.77) port 0 AF_INET > > Recv Send Send Utilization > > Service > > Demand > > Socket Socket Message Elapsed Send Recv Send > > Recv > > Size Size Size Time Throughput local > remote local > > remote > > bytes bytes bytes secs. 
10^6bits/s % S % S us/KB > > us/KB > > > > 131072 131072 131072 10.00 4918.95 21.29 24.99 > 1.418 > > 1.665 > > > > > > Scott Weitzenkamp (sweitzen) wrote: > > > Jim, > > > > > > I am trying OFED-1.3-20071231-0600 and RHEL4 x86_64 on a dual CPU > > > (single core each CPU) Xeon system. I do not see any performance > > > improvement (either throughput or CPU utilization) using > > netperf when > > I > > > set /sys/module/ib_sdp/sdp_zcopy_thresh to 16384. Can > you elaborate > > on > > > your HCA type, and performance improvement you see? > > > > > > Here's an example netperf command line when using a > Cheetah DDR HCA > > and > > > 1.2.917 firmware (I have also tried ConnectX and 2.3.000 firmware > > too): > > > > > > [releng at svbu-qa1850-2 ~]$ LD_PRELOAD=libsdp.so netperf241 > -v2 -4 -H > > > 192.168.1.201 -l 30 -t TCP_STREAM -c -C -- -m 65536 > > > TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to > > 192.168.1.201 > > > (192.168.1.201) port 0 AF_INET : histogram : demo > > > > > > Recv Send Send Utilization > > Service > > > Demand > > > Socket Socket Message Elapsed Send > Recv Send > > > Recv > > > Size Size Size Time Throughput local > remote local > > > remote > > > bytes bytes bytes secs. 
10^6bits/s % S % S > us/KB > > > us/KB > > > > > > 87380 16384 65536 30.01 7267.70 55.06 > 61.27 1.241 > > > 1.381 > > > > > > Alignment Offset Bytes Bytes Sends Bytes > > > Recvs > > > Local Remote Local Remote Xfered Per Per > > > Send Recv Send Recv Send (avg) > > Recv (avg) > > > 8 8 0 0 2.726e+10 65536.00 415942 > > 48106.01 > > > 566648 > > > > > > From prescott at hpc.ufl.edu Thu Jan 24 14:01:03 2008 From: prescott at hpc.ufl.edu (Craig Prescott) Date: Thu, 24 Jan 2008 17:01:03 -0500 Subject: [ofa-general] SDP and iWARP In-Reply-To: <4798FC5F.3060807@opengridcomputing.com> References: <4783A5B0.6040603@hpc.ufl.edu><4783B3F5.20600@opengridcomputing.com><4783BDD5.7000702@hpc.ufl.edu><4783C326.3070306@opengridcomputing.com><478634A5.3080204@hpc.ufl.edu><47863794.9080709@opengridcomputing.com><47865A4A.4070603@hpc.ufl.edu><47865E5B.4030607@opengridcomputing.com><4787936E.5010603@hpc.ufl.edu> <4787977E.509@opengridcomputing.com> <479765AC.1040600@hpc.ufl.edu> <8A71B368A89016469F72CD08050AD33401FCDA2F@maui.asicdesigners.com> <47977262.1060906@hpc.ufl.edu> <4798CB4C.7070706@opengridcomputing.com> <4798FB7E.8050200@hpc.ufl.edu> <4798FC5F.3060807@opengridcomputing.com> Message-ID: <47990A9F.2070707@hpc.ufl.edu> Steve Wise wrote: > Craig Prescott wrote: >> >> Hi Steve; >> >> The SDP socket gets an associated mr when sdp_init_qp() calls >> ib_get_dma_mr(). It looks to me like this drills down into >> the provider layer, which will ultimately end up calling >> build_phys_page_list() from iwch_register_phys_mem(). >> >> Unfortunately, when I try to look at the ib_mr_attrs via >> ib_query_mr(), the call fails. >> >> When sdp_post_recv() calls ib_post_recv(), it looks to me >> like a DMA mapping has been set up between the SDP private >> receive buffers and card. The receive buffers are kmalloc'd >> in sdp_init_qp(). >> >> I hope I have this right. But it sounds like it is possible >> I am hitting both issues you describe. 
>> >> I guess one way to check is to drop my test nodes down to 4GB >> or less, right? They currently have 16GB. >> > > Drop them down to 1 or 2GB and try it. 4GB still requires the iommu to > remap things above 4GB. Awesome ;-) rdma_accept() now returns zero and no more complaints about opcodes and such, and the server now gets to RDMA_CM_EVENT_ESTABLISHED. Thanks! Of course, the client panic'd at this point. Little by little... > Sorry about that. I forgot about the 4GB limitation and get_dma_mr(). I > guess the chelsio driver should really just fail the get_dma_mr() call > since it doesn't properly support it. > > There is one other experiment you could try. You could try using lkey 0 > for any sgl used in a send or receive work request. This maps to the > zero stag in iwarp lingo. But I haven't tested that yet :) I'll try to do this tomorrow. Thanks again! Craig From weiny2 at llnl.gov Thu Jan 24 14:01:29 2008 From: weiny2 at llnl.gov (Ira Weiny) Date: Thu, 24 Jan 2008 14:01:29 -0800 Subject: [ofa-general] [PATCH] opensm/opensm/osm_subnet.c: update valid force_link_speed values to v1.2.1 of the spec Message-ID: <20080124140129.14498e34.weiny2@llnl.gov> >From c4aa70c9ec5642d214fa0da06fe5301454e99760 Mon Sep 17 00:00:00 2001 From: Ira K. Weiny Date: Thu, 24 Jan 2008 13:56:22 -0800 Subject: [PATCH] opensm/opensm/osm_subnet.c: update valid force_link_speed values to v1.2.1 of the spec Signed-off-by: Ira K. 
Weiny --- opensm/opensm/osm_subnet.c | 7 ++----- 1 files changed, 2 insertions(+), 5 deletions(-) diff --git a/opensm/opensm/osm_subnet.c b/opensm/opensm/osm_subnet.c index 7b14e0c..cd96aed 100644 --- a/opensm/opensm/osm_subnet.c +++ b/opensm/opensm/osm_subnet.c @@ -1472,15 +1472,12 @@ ib_api_status_t osm_subn_write_conf_file(IN osm_subn_opt_t * const p_opts) "# Force PortInfo:LinkSpeedEnabled on switch ports\n" "# If 0, don't modify PortInfo:LinkSpeedEnabled on switch port\n" "# Otherwise, use value for PortInfo:LinkSpeedEnabled on switch port\n" - "# Values are (IB Spec 1.2, 14.2.5.6 Table 145 \"PortInfo\")\n" + "# Values are (IB Spec 1.2.1, 14.2.5.6 Table 146 \"PortInfo\")\n" "# 1: 2.5 Gbps\n" - "# 2: 5.0 Gbps\n" "# 3: 2.5 or 5.0 Gbps\n" - "# 4: 10.0 Gbps\n" "# 5: 2.5 or 10.0 Gbps\n" - "# 6: 5.0 or 10.0 Gbps\n" "# 7: 2.5 or 5.0 or 10.0 Gbps\n" - "# 8-14: Reserved\n" + "# 2,4,6,8-14 Reserved\n" "# Default 15: set to PortInfo:LinkSpeedSupported\n\n" "force_link_speed %u\n\n" "# The subnet_timeout code that will be set for all the ports\n" -- 1.5.1 -------------- next part -------------- A non-text attachment was scrubbed... Name: 0001-opensm-opensm-osm_subnet.c-update-valid-force_link_.patch Type: application/octet-stream Size: 1431 bytes Desc: not available URL: From swise at opengridcomputing.com Thu Jan 24 14:30:16 2008 From: swise at opengridcomputing.com (Steve Wise) Date: Thu, 24 Jan 2008 16:30:16 -0600 Subject: [ofa-general] [PATCH 2.6.25] RDMA/cxgb3: Fix the T3A workaround checks. Message-ID: <20080124223016.772.18293.stgit@dell3.ogc.int> RDMA/cxgb3: Fix the T3A workaround checks. Correctly work around T3A issues by checking "hwtype != T3A" instead of "hwtype == T3B". Needed for new hw types. 
Signed-off-by: Steve Wise --- drivers/infiniband/hw/cxgb3/cxio_hal.c | 2 +- drivers/infiniband/hw/cxgb3/iwch_cm.c | 4 ++-- drivers/infiniband/hw/cxgb3/iwch_provider.c | 2 +- 3 files changed, 4 insertions(+), 4 deletions(-) diff --git a/drivers/infiniband/hw/cxgb3/cxio_hal.c b/drivers/infiniband/hw/cxgb3/cxio_hal.c index eec6a30..e220b44 100644 --- a/drivers/infiniband/hw/cxgb3/cxio_hal.c +++ b/drivers/infiniband/hw/cxgb3/cxio_hal.c @@ -179,7 +179,7 @@ int cxio_create_cq(struct cxio_rdev *rdev_p, struct t3_cq *cq) setup.size = 1UL << cq->size_log2; setup.credits = 65535; setup.credit_thres = 1; - if (rdev_p->t3cdev_p->type == T3B) + if (rdev_p->t3cdev_p->type != T3A) setup.ovfl_mode = 0; else setup.ovfl_mode = 1; diff --git a/drivers/infiniband/hw/cxgb3/iwch_cm.c b/drivers/infiniband/hw/cxgb3/iwch_cm.c index 20ba372..f8cb0fe 100644 --- a/drivers/infiniband/hw/cxgb3/iwch_cm.c +++ b/drivers/infiniband/hw/cxgb3/iwch_cm.c @@ -1118,7 +1118,7 @@ static int act_open_rpl(struct t3cdev *tdev, struct sk_buff *skb, void *ctx) status2errno(rpl->status)); connect_reply_upcall(ep, status2errno(rpl->status)); state_set(&ep->com, DEAD); - if (ep->com.tdev->type == T3B && act_open_has_tid(rpl->status)) + if (ep->com.tdev->type != T3A && act_open_has_tid(rpl->status)) release_tid(ep->com.tdev, GET_TID(rpl), NULL); cxgb3_free_atid(ep->com.tdev, ep->atid); dst_release(ep->dst); @@ -1249,7 +1249,7 @@ static void reject_cr(struct t3cdev *tdev, u32 hwtid, __be32 peer_ip, skb_trim(skb, sizeof(struct cpl_tid_release)); skb_get(skb); - if (tdev->type == T3B) + if (tdev->type != T3A) release_tid(tdev, hwtid, skb); else { struct cpl_pass_accept_rpl *rpl; diff --git a/drivers/infiniband/hw/cxgb3/iwch_provider.c b/drivers/infiniband/hw/cxgb3/iwch_provider.c index 69b1204..df1838f 100644 --- a/drivers/infiniband/hw/cxgb3/iwch_provider.c +++ b/drivers/infiniband/hw/cxgb3/iwch_provider.c @@ -646,7 +646,7 @@ static struct ib_mr *iwch_reg_user_mr(struct ib_pd *pd, u64 start, u64 length, if (err) goto 
err; - if (udata && t3b_device(rhp)) { + if (udata && !t3a_device(rhp)) { uresp.pbl_addr = (mhp->attr.pbl_addr - rhp->rdev.rnic_info.pbl_base) >> 3; PDBG("%s user resp pbl_addr 0x%x\n", __FUNCTION__, From meier3 at llnl.gov Thu Jan 24 14:59:18 2008 From: meier3 at llnl.gov (Timothy A. Meier) Date: Thu, 24 Jan 2008 14:59:18 -0800 Subject: [ofa-general] [PATCH] opensm: diags add DR path support to some utils Message-ID: <47991846.7050700@llnl.gov> Sasha, At LLNL, we find these -D options useful (some diagnostic messages give us only direct paths). From 3c681566514f0c948cfb5002f7536af9ca563e33 Mon Sep 17 00:00:00 2001 From: Tim Meier Date: Thu, 24 Jan 2008 14:49:02 -0800 Subject: [PATCH] opensm: diags add DR path support to some utils Added direct route support to iblinkinfo.pl and ibqueryerrors.pl. 
Signed-off-by: Tim Meier --- infiniband-diags/scripts/IBswcountlimits.pm | 51 +++++++++++++++++++++++++++ infiniband-diags/scripts/iblinkinfo.pl | 12 +++++- infiniband-diags/scripts/ibqueryerrors.pl | 12 +++++- 3 files changed, 71 insertions(+), 4 deletions(-) diff --git a/infiniband-diags/scripts/IBswcountlimits.pm b/infiniband-diags/scripts/IBswcountlimits.pm index 6985750..1ada8a8 100755 --- a/infiniband-diags/scripts/IBswcountlimits.pm +++ b/infiniband-diags/scripts/IBswcountlimits.pm @@ -373,3 +373,54 @@ sub get_num_ports return ($num_ports); } +# ========================================================================= +# convert_dr_to_guid(direct_route) +# +sub convert_dr_to_guid +{ + my $guid = undef; + + my $data = `smpquery nodeinfo -D $_[0]`; + my @lines = split("\n", $data); + foreach my $line (@lines) { + if ($line =~ /^PortGuid:\.+(.*)/) { $guid = $1; } + } + $guid; +} + +# ========================================================================= +# get_node_type(guid_or_direct_route) +# +sub get_node_type +{ + my $type = undef; + my $query_arg = "smpquery nodeinfo "; + if($_[0] =~ /x/) + { + # assume arg is a guid if contains an x + $query_arg .= "-G " . $_[0]; + } + else + { + # assume arg is a direct path + $query_arg .= "-D " . 
$_[0]; + } + + my $data = `$query_arg`; + my @lines = split("\n", $data); + foreach my $line (@lines) + { + if ($line =~ /^NodeType:\.+(.*)/) { $type = $1; } + } + $type; +} + +# ========================================================================= +# is_switch(guid_or_direct_route) +# +sub is_switch +{ + my $node_type = &get_node_type($_[0]); + ($node_type =~ /Switch/); +} + diff --git a/infiniband-diags/scripts/iblinkinfo.pl b/infiniband-diags/scripts/iblinkinfo.pl index 6d02eac..f21c31c 100755 --- a/infiniband-diags/scripts/iblinkinfo.pl +++ b/infiniband-diags/scripts/iblinkinfo.pl @@ -43,10 +43,11 @@ use IBswcountlimits; sub usage_and_exit { my $prog = $_[0]; - print "Usage: $prog [-Rhclp -S -C -P ]\n"; + print "Usage: $prog [-Rhclp -S -D -C -P ]\n"; print " Report link speed and connection for each port of each switch which is active\n"; print " -h This help message\n"; print " -R Recalculate ibnetdiscover information (Default is to reuse ibnetdiscover output)\n"; + print " -D output only the switch specified by direct route path\n"; print " -S output only the switch specified by guid\n"; print " -d print only down links\n"; print " -l (line mode) print all information for each link on each line\n"; @@ -60,6 +61,7 @@ sub usage_and_exit my $argv0 = `basename $0`; my $regenerate_map = undef; my $single_switch = undef; +my $direct_route = undef; my $line_mode = undef; my $print_add_switch = undef; my $print_extended_cap = undef; @@ -68,8 +70,9 @@ my $ca_name = ""; my $ca_port = ""; chomp $argv0; -if (!getopts("hcpldRS:C:P:")) { usage_and_exit $argv0; } +if (!getopts("hcpldRS:D:C:P:")) { usage_and_exit $argv0; } if (defined $Getopt::Std::opt_h) { usage_and_exit $argv0; } +if (defined $Getopt::Std::opt_D) { $direct_route = $Getopt::Std::opt_D; } if (defined $Getopt::Std::opt_R) { $regenerate_map = $Getopt::Std::opt_R; } if (defined $Getopt::Std::opt_S) { $single_switch = $Getopt::Std::opt_S; } if (defined $Getopt::Std::opt_d) { $only_down_links = 
$Getopt::Std::opt_d; } @@ -84,6 +87,11 @@ my $extra_smpquery_params = get_ca_name_port_param_string($ca_name, $ca_port); sub main { get_link_ends($regenerate_map, $ca_name, $ca_port); + if ($direct_route) + { + # convert DR to guid, then use original single_switch option + $single_switch = $IBswcountlimits::convert_dr_to_guid{$direct_route}; + } foreach my $switch (sort (keys (%IBswcountlimits::link_ends))) { if ($single_switch && $switch ne $single_switch) { diff --git a/infiniband-diags/scripts/ibqueryerrors.pl b/infiniband-diags/scripts/ibqueryerrors.pl index bdb458d..ca899c7 100755 --- a/infiniband-diags/scripts/ibqueryerrors.pl +++ b/infiniband-diags/scripts/ibqueryerrors.pl @@ -136,13 +136,14 @@ sub get_switches sub usage_and_exit { my $prog = $_[0]; - print "Usage: $prog [-a -c -r -R -s -S -d -C -P ]\n"; + print "Usage: $prog [-a -c -r -R -s -S -D -d -C -P ]\n"; print " Report counters on all switches in subnet\n"; print " -a Report an action to take\n"; print " -c suppress some of the common counters\n"; print " -r report port configuration information\n"; print " -R Recalculate ibnetdiscover information\n"; print " -s suppress errors listed\n"; + print " -D output only the switch specified by direct route path\n"; print " -S query only \n"; print " -d include the data counters in the output\n"; print " -C use selected Channel Adaptor name for queries\n"; @@ -153,11 +154,12 @@ sub usage_and_exit my $argv0 = `basename $0`; my $regenerate_map = undef; my $single_switch = undef; +my $direct_route = undef; my $ca_name = ""; my $ca_port = ""; chomp $argv0; -if (!getopts("has:crRS:dC:P:")) { usage_and_exit $argv0; } +if (!getopts("has:crRS:D:dC:P:")) { usage_and_exit $argv0; } if (defined $Getopt::Std::opt_h) { usage_and_exit $argv0; } if (defined $Getopt::Std::opt_a) { $print_action = "yes"; } if (defined $Getopt::Std::opt_s) { @IBswcountlimits::suppress_errors = split (",", $Getopt::Std::opt_s); } @@ -167,6 +169,7 @@ if (defined $Getopt::Std::opt_c) } if 
(defined $Getopt::Std::opt_r) { $report_port_info = $Getopt::Std::opt_r; } if (defined $Getopt::Std::opt_R) { $regenerate_map = $Getopt::Std::opt_R; } +if (defined $Getopt::Std::opt_D) { $direct_route = $Getopt::Std::opt_D; } if (defined $Getopt::Std::opt_S) { $single_switch = $Getopt::Std::opt_S; } if (defined $Getopt::Std::opt_d) { $include_data_counters = $Getopt::Std::opt_d; } if (defined $Getopt::Std::opt_C) { $ca_name = $Getopt::Std::opt_C; } @@ -183,6 +186,11 @@ sub main } get_link_ends($regenerate_map, $ca_name, $ca_port); get_switches; + if ($direct_route) + { + # convert DR to guid, then use original single_switch option + $single_switch = $IBswcountlimits::convert_dr_to_guid{$direct_route}; + } foreach my $sw_addr (keys %switches) { if ($single_switch && $sw_addr ne "$single_switch") { next; } -- 1.5.1 -- Timothy A. Meier Computer Scientist ICCD/High Performance Computing 925.422.3341 meier3 at llnl.gov -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: 0002-opensm-diags-add-DR-path-support-to-some-utils.patch URL: From pradeeps at linux.vnet.ibm.com Thu Jan 24 15:06:39 2008 From: pradeeps at linux.vnet.ibm.com (Pradeep Satyanarayana) Date: Thu, 24 Jan 2008 15:06:39 -0800 Subject: [ofa-general] [PATCH] IPoIB UD 4K MTU support In-Reply-To: References: <1201025321.756.33.camel@localhost.localdomain> <1201102346.6925.215.camel@mtls03> <1201068498.756.59.camel@localhost.localdomain> <4e6a6b3c0801231354v24baa6d8q5034a1cdaf3c704f@mail.gmail.com> <1201162492.9739.21.camel@localhost.localdomain> Message-ID: <479919FF.4070602@linux.vnet.ibm.com> Roland Dreier wrote: > OK, this is some half-baked thinking based on reading the patch. I > don't know the right answer here -- I am hoping to spark discussion > that makes the correct thing clear: > > > +static inline int ipoib_ud_mtu(unsigned int ib_mtu) > > +{ > > + return (ib_mtu < 4096) ? 
(ib_mtu - IPOIB_ENCAP_LEN) : > > + (ib_mtu - IB_GRH_BYTES - IPOIB_ENCAP_LEN - 4); > > +} > > reading this, my first reaction is that the magic 4096 constant should > have a name. And in fact the most obvious name for it is PAGE_SIZE. > However, this means that (assuming everyone can handle an IB MTU of > 4096), systems with PAGE_SIZE > 4096 would come up with a different > IPoIB MTU than systems with PAGE_SIZE == 4096. And I'm not sure > whether that would cause problems or not. (eg TCP should be OK) > > But then in general, if we use the approach here (which is very > appealing because it's so simple), Linux will potentially have an MTU > different from other OSes that might choose a different way to handle > an IB MTU of 4096. So does that mean that we should use a more > complicated approach to get the max possible MTU of 4096 - 4? I am not sure I understand -are you concerned that other OSes might use 4096 as the payload size of a packet? SM will dictate the max size of the packet to be used and so packets with a payload size of 4096 will get split into two and interoperability should not be an issue. Pradeep From Ashish.Batwara at lsi.com Thu Jan 24 15:17:40 2008 From: Ashish.Batwara at lsi.com (Batwara, Ashish) Date: Thu, 24 Jan 2008 16:17:40 -0700 Subject: [ofa-general] Direct connect and SM Message-ID: <01B9E81EECACE94DBBD0A556E768FB8A01FCFBBC@NAMAIL2.ad.lsil.com> Hello, I have a small confusion on the below scenario: 1. I have two hosts and each is having single port HCA 2. Port from each HCA is directly connected (No switch) with two ports on the same target 3. SM is running on each host. 4. I see that both the ports on target gets assigned the same LID from different SMs running on different hosts. Is it a valid configuration? Should target handle this case having two ports with same LID? 
Best Regards ================= Ashish Batwara, PMP | Firmware Architect | Mobile: +1 316 253 9784 | email: ashish.batwara at lsi.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From meier3 at llnl.gov Thu Jan 24 15:27:41 2008 From: meier3 at llnl.gov (Timothy A. Meier) Date: Thu, 24 Jan 2008 15:27:41 -0800 Subject: [ofa-general] [PATCH] opensm: update man pages for diags DR support Message-ID: <47991EED.9000100@llnl.gov> Sasha, Sorry, I should have included this in my previous patch. Small man page changes. From 920349140562c3ab44c48dd6775ba3e0beca63c4 Mon Sep 17 00:00:00 2001 From: Tim Meier Date: Thu, 24 Jan 2008 15:20:31 -0800 Subject: [PATCH] opensm: update man pages for diags DR support Added man -D information for ibquerryerrors and iblinkinfo Signed-off-by: Tim Meier --- infiniband-diags/man/iblinkinfo.8 | 7 +++++-- infiniband-diags/man/ibqueryerrors.8 | 7 +++++-- 2 files changed, 10 insertions(+), 4 deletions(-) diff --git a/infiniband-diags/man/iblinkinfo.8 b/infiniband-diags/man/iblinkinfo.8 index 943ef8f..fe01af3 100644 --- a/infiniband-diags/man/iblinkinfo.8 +++ b/infiniband-diags/man/iblinkinfo.8 @@ -1,11 +1,11 @@ -.TH IBLINKINFO 8 "May 22, 2007" "OpenIB" "OpenIB Diagnostics" +.TH IBLINKINFO 8 "Jan 24, 2008" "OpenIB" "OpenIB Diagnostics" .SH NAME iblinkinfo.pl \- report link info for all links in the fabric .SH SYNOPSIS .B iblinkinfo.pl - [-Rhcdl -C -P -v -S ] + [-Rhcdl -C -P -v -S -D ] .SH DESCRIPTION .PP @@ -25,6 +25,9 @@ fabric has changed. \fB\-S \fR Output only the switch specified by .TP +\fB\-D \fR +Output only the switch specified by the direct route path. +.TP \fB\-l\fR Print all information for each link on one line. Default is to print a header with the switch information and then a list for each port (useful for grep\'ing output). 
diff --git a/infiniband-diags/man/ibqueryerrors.8 b/infiniband-diags/man/ibqueryerrors.8 index 5de484d..8cde440 100644 --- a/infiniband-diags/man/ibqueryerrors.8 +++ b/infiniband-diags/man/ibqueryerrors.8 @@ -1,11 +1,11 @@ -.TH IBQUERYERRORS 8 "May 22, 2007" "OpenIB" "OpenIB Diagnostics" +.TH IBQUERYERRORS 8 "Jan 24, 2008" "OpenIB" "OpenIB Diagnostics" .SH NAME ibqueryerrors.pl \- query and report non-zero IB port counters .SH SYNOPSIS .B ibqueryerrors.pl -[-a -c -r -R -C -P -s -S -d] +[-a -c -r -R -C -P -s -S -D -d] .SH DESCRIPTION .PP @@ -45,6 +45,9 @@ Suppress the errors listed in the comma separated list provided. \fB\-S \fR Report results only for the switch specified. .TP +\fB\-D \fR +Report results only for the switch specified by the direct route path. +.TP \fB\-d\fR Include the optional transmit and receive data counters. .TP -- 1.5.1 -- Timothy A. Meier Computer Scientist ICCD/High Performance Computing 925.422.3341 meier3 at llnl.gov -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: 0003-opensm-update-man-pages-for-diags-DR-support.patch URL: From mashirle at us.ibm.com Thu Jan 24 05:34:15 2008 From: mashirle at us.ibm.com (Shirley Ma) Date: Thu, 24 Jan 2008 05:34:15 -0800 Subject: [ofa-general] [PATCH] IPoIB UD 4K MTU support In-Reply-To: References: <1201025321.756.33.camel@localhost.localdomain> <1201102346.6925.215.camel@mtls03> <1201068498.756.59.camel@localhost.localdomain> <4e6a6b3c0801231354v24baa6d8q5034a1cdaf3c704f@mail.gmail.com> <1201162492.9739.21.camel@localhost.localdomain> Message-ID: <1201181656.9739.42.camel@localhost.localdomain> On Thu, 2008-01-24 at 13:29 -0800, Roland Dreier wrote: > OK, this is some half-baked thinking based on reading the patch. I > don't know the right answer here -- I am hoping to spark discussion > that makes the correct thing clear: > > > +static inline int ipoib_ud_mtu(unsigned int ib_mtu) > > +{ > > + return (ib_mtu < 4096) ? 
(ib_mtu - IPOIB_ENCAP_LEN) : > > + (ib_mtu - IB_GRH_BYTES - IPOIB_ENCAP_LEN - 4); > > +} > > reading this, my first reaction is that the magic 4096 constant should > have a name. And in fact the most obvious name for it is PAGE_SIZE. > However, this means that (assuming everyone can handle an IB MTU of > 4096), systems with PAGE_SIZE > 4096 would come up with a different > IPoIB MTU than systems with PAGE_SIZE == 4096. And I'm not sure > whether that would cause problems or not. (eg TCP should be OK) We could use ib_mtu_enum_to_int(IB_MTU_4096) here. TCP would be OK since it negotiates the MSS value, but UDP would not be if one node has a PAGE_SIZE bigger than 4096. > But then in general, if we use the approach here (which is very > appealing because it's so simple), Linux will potentially have an MTU > different from other OSes that might choose a different way to handle > an IB MTU of 4096. So does that mean that we should use a more > complicated approach to get the max possible MTU of 4096 - 4? Actually I thought about this when I came to this simple implementation. If we use 4096-48, a patch in IPoIB to generate an ICMP error could help with this issue by sending the 4096-48 MTU back so the source knows how big its packets can be. Do you think this is a good idea? thanks Shirley From hrosenstock at xsigo.com Thu Jan 24 15:47:39 2008 From: hrosenstock at xsigo.com (Hal Rosenstock) Date: Thu, 24 Jan 2008 15:47:39 -0800 Subject: [ofa-general] Direct connect and SM In-Reply-To: <01B9E81EECACE94DBBD0A556E768FB8A01FCFBBC@NAMAIL2.ad.lsil.com> References: <01B9E81EECACE94DBBD0A556E768FB8A01FCFBBC@NAMAIL2.ad.lsil.com> Message-ID: <1201218459.25913.148.camel@hrosenstock-ws.xsigo.com> On Thu, 2008-01-24 at 16:17 -0700, Batwara, Ashish wrote: > Hello, > > I have a small confusion on the below scenario: > > 1. I have two hosts and each is having single port HCA > 2. Port from each HCA is directly connected (No switch) with two > ports on the same target > 3. SM is running on each host. > 4.
I see that both the ports on target gets assigned the same LID > from different SMs running on different hosts. > > > > Is it a valid configuration? Should target handle this case having two > ports with same LID? Yes, they are technically two different IB subnets. -- Hal > Best Regards > > ================= > > Ashish Batwara, PMP | Firmware Architect | Mobile: +1 316 253 9784 | > > > email: ashish.batwara at lsi.com > > > > > _______________________________________________ > general mailing list > general at lists.openfabrics.org > http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general > > To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general From rdreier at cisco.com Thu Jan 24 15:48:09 2008 From: rdreier at cisco.com (Roland Dreier) Date: Thu, 24 Jan 2008 15:48:09 -0800 Subject: [ofa-general] Direct connect and SM In-Reply-To: <01B9E81EECACE94DBBD0A556E768FB8A01FCFBBC@NAMAIL2.ad.lsil.com> (Ashish Batwara's message of "Thu, 24 Jan 2008 16:17:40 -0700") References: <01B9E81EECACE94DBBD0A556E768FB8A01FCFBBC@NAMAIL2.ad.lsil.com> Message-ID: > Is it a valid configuration? Should target handle this case having two > ports with same LID? Yes, it is valid. Each port is in a different subnet and hence the subnet-local addresses (LIDs) may end up being the same. - R. From rdreier at cisco.com Thu Jan 24 15:49:39 2008 From: rdreier at cisco.com (Roland Dreier) Date: Thu, 24 Jan 2008 15:49:39 -0800 Subject: [ofa-general] Re: [PATCH 2.6.25] RDMA/cxgb3: Fix the T3A workaround checks. In-Reply-To: <20080124223016.772.18293.stgit@dell3.ogc.int> (Steve Wise's message of "Thu, 24 Jan 2008 16:30:16 -0600") References: <20080124223016.772.18293.stgit@dell3.ogc.int> Message-ID: thanks, applied. 
From rdreier at cisco.com Thu Jan 24 15:50:08 2008 From: rdreier at cisco.com (Roland Dreier) Date: Thu, 24 Jan 2008 15:50:08 -0800 Subject: [ofa-general] Re: [PATCH v2] libmlx4: avoid memcpy in blueflame post_sends In-Reply-To: <200801241429.35292.jackm@dev.mellanox.co.il> (Jack Morgenstein's message of "Thu, 24 Jan 2008 14:29:34 +0200") References: <200801091223.14155.jackm@dev.mellanox.co.il> <200801241429.35292.jackm@dev.mellanox.co.il> Message-ID: sorry, I had this in my tree but forgot to push it out. done now. From rdreier at cisco.com Thu Jan 24 16:03:57 2008 From: rdreier at cisco.com (Roland Dreier) Date: Thu, 24 Jan 2008 16:03:57 -0800 Subject: [ofa-general] help on /sys/class structure and rdma In-Reply-To: <4799064D.4060705@opengridcomputing.com> (Steve Wise's message of "Thu, 24 Jan 2008 15:42:37 -0600") References: <4799064D.4060705@opengridcomputing.com> Message-ID: > I just found a bug in a drop I did for libcxgb3. I changed the code > to read the adapter fw version sysfs file to attempt to check for > incompatible fw/lib cases. I was reading this file in > cxgb3_driver_init(): > > /sys/class/infiniband_verbs/uverbs0/device/infiniband:cxgb3_0/fw_ver I don't have any file like that in my system either. Not sure where it might be coming from for you. However, I do have /sys/class/infiniband_verbs/uverbs1/ibdev (which should be present everywhere), and that has "cxgb3_0" in it, which you could use to get to /sys/class/infiniband/cxgb3_0/. libibverbs does set up ibv_device.ibdev_path with that path, but it happens too late for your library initialization function unfortunately. So you can either wait until someone tries to create a context and fail/warn then about firmware version, or you can duplicate the libibverbs code to come up with that ibdev_path. - R. 
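The second option mentioned above (duplicating the lookup that libibverbs performs) can be sketched roughly as follows. This is only a minimal sketch under stated assumptions, not the libibverbs or libcxgb3 code: the helper names ibdev_name_to_path() and uverbs_to_ibdev_path() are hypothetical, and the only facts taken from the message are that the uverbsN directory contains an "ibdev" attribute holding the device name and that the device directory lives under /sys/class/infiniband/.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not a libibverbs function): turn the contents of an
 * "ibdev" sysfs attribute, e.g. "cxgb3_0\n", into the matching
 * /sys/class/infiniband/<name> path.
 * Returns 0 on success, -1 on an empty/oversized name or truncated output.
 */
int ibdev_name_to_path(const char *ibdev_attr, char *out, size_t outlen)
{
    char name[64];
    size_t n = strcspn(ibdev_attr, "\n");   /* length up to the newline */
    int len;

    if (n == 0 || n >= sizeof(name))
        return -1;
    memcpy(name, ibdev_attr, n);
    name[n] = '\0';

    len = snprintf(out, outlen, "/sys/class/infiniband/%s", name);
    return (len < 0 || (size_t)len >= outlen) ? -1 : 0;
}

/*
 * Hypothetical helper: read <uverbs_dir>/ibdev (for example
 * /sys/class/infiniband_verbs/uverbs0/ibdev) and build the ibdev path.
 * Returns 0 on success, -1 if the attribute cannot be read.
 */
int uverbs_to_ibdev_path(const char *uverbs_dir, char *out, size_t outlen)
{
    char attr_path[256];
    char attr[64];
    FILE *f;
    int len;

    len = snprintf(attr_path, sizeof(attr_path), "%s/ibdev", uverbs_dir);
    if (len < 0 || (size_t)len >= sizeof(attr_path))
        return -1;

    f = fopen(attr_path, "r");
    if (!f)
        return -1;
    if (!fgets(attr, sizeof(attr), f)) {
        fclose(f);
        return -1;
    }
    fclose(f);

    return ibdev_name_to_path(attr, out, outlen);
}
```

A driver init routine could call uverbs_to_ibdev_path() on its uverbs directory and then read an attribute such as fw_ver from the resulting directory; whether fw_ver is present there for a given device is an assumption that would need checking on the target system.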
From rdreier at cisco.com Thu Jan 24 16:05:34 2008 From: rdreier at cisco.com (Roland Dreier) Date: Thu, 24 Jan 2008 16:05:34 -0800 Subject: [ofa-general] SDP and iWARP In-Reply-To: <4798D0D2.5070103@opengridcomputing.com> (Steve Wise's message of "Thu, 24 Jan 2008 11:54:26 -0600") References: <4783A5B0.6040603@hpc.ufl.edu> <4783B3F5.20600@opengridcomputing.com> <4783BDD5.7000702@hpc.ufl.edu> <4783C326.3070306@opengridcomputing.com> <478634A5.3080204@hpc.ufl.edu> <47863794.9080709@opengridcomputing.com> <47865A4A.4070603@hpc.ufl.edu> <47865E5B.4030607@opengridcomputing.com> <4787936E.5010603@hpc.ufl.edu> <4787977E.509@opengridcomputing.com> <479765AC.1040600@hpc.ufl.edu> <8A71B368A89016469F72CD08050AD33401FCDA2F@maui.asicdesigners.com> <47977262.1060906@hpc.ufl.edu> <4798CB4C.7070706@opengridcomputing.com> <4798D0D2.5070103@opengridcomputing.com> Message-ID: > I didn't think they were that different, but I don't know for > sure. However, unless the IB-SDP uses atomics or some other > IB-specific work request, it just might work. It might work, but are the wire formats the same? ie you might end up with IB SDP running on iWARP, rather than something that could talk to a real implementation of RDMA consortium SDP. I'd be especially worried about connection establishment, but there could be differences just about anywhere else too. - R.
From krause at cup.hp.com Thu Jan 24 15:57:57 2008 From: krause at cup.hp.com (Michael Krause) Date: Thu, 24 Jan 2008 15:57:57 -0800 Subject: [ofa-general] SDP and iWARP In-Reply-To: <4798D0D2.5070103@opengridcomputing.com> References: <4783A5B0.6040603@hpc.ufl.edu> <4783B3F5.20600@opengridcomputing.com> <4783BDD5.7000702@hpc.ufl.edu> <4783C326.3070306@opengridcomputing.com> <478634A5.3080204@hpc.ufl.edu> <47863794.9080709@opengridcomputing.com> <47865A4A.4070603@hpc.ufl.edu> <47865E5B.4030607@opengridcomputing.com> <4787936E.5010603@hpc.ufl.edu> <4787977E.509@opengridcomputing.com> <479765AC.1040600@hpc.ufl.edu> <8A71B368A89016469F72CD08050AD33401FCDA2F@maui.asicdesigners.com> <47977262.1060906@hpc.ufl.edu> <4798CB4C.7070706@opengridcomputing.com> <4798D0D2.5070103@opengridcomputing.com> Message-ID: <6.2.0.14.2.20080124155545.06b312d8@esmail.cup.hp.com> An HTML attachment was scrubbed... URL: From gmkurtzer at gmail.com Thu Jan 24 16:18:39 2008 From: gmkurtzer at gmail.com (Greg Kurtzer) Date: Thu, 24 Jan 2008 16:18:39 -0800 Subject: [ofa-general] Opensm compatibility with rate=1 In-Reply-To: <20080124181530.GH11277@sashak.voltaire.com> References: <571f1a060801231518i76c3383l38e3fe9cc32e3cd8@mail.gmail.com> <20080124181530.GH11277@sashak.voltaire.com> Message-ID: <571f1a060801241618t3545d7d7k533256da3425294c@mail.gmail.com> On Jan 24, 2008 10:15 AM, Sasha Khapyorsky wrote: > Hi Greg, > > > On 15:18 Wed 23 Jan , Greg Kurtzer wrote: > > > > We recently updated OFED (among other things) on one of our IB test > > beds that use older cards. Something broke recently with an error in > > dmesg like: > > > > kernel: ib0: multicast join failed for > > ff12:401b:ffff:0000:0000:0000:ffff:ffff, status -22 > > > > We used to fix this by defining our partition to be: > > > > Default=0x7fff,ipoib,rate=1:ALL=full; > > > > But this no longer seems to work. 
> > > > In the opensm source code I see the following: > > > > > > /* following v1 ver1.2 p901 */ > > #define IB_PATH_RECORD_RATE_2_5_GBS 2 > > #define IB_PATH_RECORD_RATE_10_GBS 3 > > #define IB_PATH_RECORD_RATE_30_GBS 4 > > #define IB_PATH_RECORD_RATE_5_GBS 5 > > #define IB_PATH_RECORD_RATE_20_GBS 6 > > #define IB_PATH_RECORD_RATE_40_GBS 7 > > #define IB_PATH_RECORD_RATE_60_GBS 8 > > #define IB_PATH_RECORD_RATE_80_GBS 9 > > #define IB_PATH_RECORD_RATE_120_GBS 10 > > > > #define IB_MIN_RATE IB_PATH_RECORD_RATE_2_5_GBS > > #define IB_MAX_RATE IB_PATH_RECORD_RATE_120_GBS > > > > Which forces the lowest possible rate to be 2 which doesn't work with > > our test bed. By kludging IB_MIN_RATE to be set to 1, things seem to > > be working but chances are supporting only rates >= 2 was done on > > purpose. Is there a better workaround or solution to this, or a way of > > continuing support for rate=1? > > What is the purpose of rate=1 in your setup? According to IBA spec the > value '1' for rate is "reserved". Hrmm, that is interesting. The command "ibstatus" shows that the rate is 2.5 Gb/sec (1X) yet "ibstat" shows the rate being 2. The opensm -V logs show that it was setting the rate to 2 as well (which didn't work). Even with IB_MIN_RATE forced to 1, the output is the same, but opensm just works now with no more error messages. I can't tell you anything more than that except that earlier versions of opensm worked with a rate of 1. Thanks!
Greg -- Greg Kurtzer http://www.runlevelzero.net/ From kliteyn at mellanox.co.il Thu Jan 24 17:38:19 2008 From: kliteyn at mellanox.co.il (kliteyn at mellanox.co.il) Date: 25 Jan 2008 03:38:19 +0200 Subject: [ofa-general] nightly osm_sim report 2008-01-25:normal completion Message-ID: OSM Simulation Regression Summary [Generated mail - please do NOT reply] OpenSM binary date = 2008-01-24 OpenSM git rev = Sun_Jan_20_20:18:24_2008 [9b093e04dedb54c78d74d0567e85b3a59f88badd] ibutils git rev = Mon_Dec_24_10:42:01_2007 [675bec82306d6920555dd0b5e2f664983e27e60f] Total=400 Pass=398 Fail=2 Pass: 30 Stability IS1-16.topo 30 Pkey IS1-16.topo 30 OsmTest IS1-16.topo 30 OsmStress IS1-16.topo 30 Multicast IS1-16.topo 30 LidMgr IS1-16.topo 10 Stability IS3-loop.topo 10 Stability IS3-128.topo 10 Pkey IS3-128.topo 10 OsmTest IS3-loop.topo 10 OsmTest IS3-128.topo 10 OsmStress IS3-128.topo 10 Multicast IS3-loop.topo 10 Multicast IS3-128.topo 10 FatTree merge-roots-4-ary-2-tree.topo 10 FatTree merge-root-4-ary-3-tree.topo 10 FatTree gnu-stallion-64.topo 10 FatTree blend-4-ary-2-tree.topo 10 FatTree RhinoDDR.topo 10 FatTree FullGnu.topo 10 FatTree 4-ary-2-tree.topo 10 FatTree 2-ary-4-tree.topo 10 FatTree 12-node-spaced.topo 10 FTreeFail 4-ary-2-tree-missing-sw-link.topo 10 FTreeFail 4-ary-2-tree-links-at-same-rank-2.topo 10 FTreeFail 4-ary-2-tree-links-at-same-rank-1.topo 10 FTreeFail 4-ary-2-tree-diff-num-pgroups.topo 8 LidMgr IS3-128.topo Failures: 2 LidMgr IS3-128.topo From swise at opengridcomputing.com Thu Jan 24 18:43:26 2008 From: swise at opengridcomputing.com (Steve Wise) Date: Thu, 24 Jan 2008 20:43:26 -0600 Subject: [ofa-general] SDP and iWARP In-Reply-To: <6.2.0.14.2.20080124155545.06b312d8@esmail.cup.hp.com> References: <4783A5B0.6040603@hpc.ufl.edu> <4783B3F5.20600@opengridcomputing.com> <4783BDD5.7000702@hpc.ufl.edu> <4783C326.3070306@opengridcomputing.com> <478634A5.3080204@hpc.ufl.edu> <47863794.9080709@opengridcomputing.com> <47865A4A.4070603@hpc.ufl.edu> 
<47865E5B.4030607@opengridcomputing.com> <4787936E.5010603@hpc.ufl.edu> <4787977E.509@opengridcomputing.com> <479765AC.1040600@hpc.ufl.edu> <8A71B368A89016469F72CD08050AD33401FCDA2F@maui.asicdesigners.com> <47977262.1060906@hpc.ufl.edu> <4798CB4C.7070706@opengridcomputing.com> <4798D0D2.5070103@opengridcomputing.com> <6.2.0.14.2.20080124155545.06b312d8@esmail.cup.hp.com> Message-ID: <47994CCE.2000104@opengridcomputing.com> Perhaps we can eventually make the ofed sdp module support both protocols based on the device transport type... Michael Krause wrote: > > The IETF defines SDP over iWARP. To a very large extent, the main data > path operations are the same. There are differences in the port mapper > and hello exchanges. For the most part, the IETF specification learned > from the problems implementing SDP / IB as well as dealt with the > constraints imposed by the IETF MPA specification. Done right, the > differences should be transparent to applications with only minor > changes required within the underlying library / infrastructure. > > Mike > > At 09:54 AM 1/24/2008, Steve Wise wrote: >> Roland Dreier wrote: >>> Sorry to come into this thread so late, but does it make sense to try >>> the current SDP code over iWARP? As I understand things, the RDMA >>> consortium has its own spec for SDP on iWARP, which may not precisely >>> correspond to the IBA SDP annex. So probably the SDP code would need >>> updating to work over iWARP. >> >> I didn't think they were that different, but I don't know for sure. >> However, unless the IB-SDP uses atomics or some other IB-specific work >> request, it just might work. >> >>> (And don't all the iWARP vendors have TCP offload socket stuff for >>> their adapters anyway, which is a simpler solution to the same problem >>> that SDP solves?) >> >> Dunno about other vendors, but Chelsio's TOE code is available on >> their web site (service.chelsio.com). It will be more efficient than >> SDP over iWARP (no SDP/iWARP headers needed). 
>> >> I assume Craig is doing some research to perhaps determine exactly how >> SDP performs over iwarp vs TOE or standard TCP. >> >> >> Steve. >> >> >> >> >> _______________________________________________ >> general mailing list >> general at lists.openfabrics.org >> http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general >> >> To unsubscribe, please visit >> http://openib.org/mailman/listinfo/openib-general > > ------------------------------------------------------------------------ > > _______________________________________________ > general mailing list > general at lists.openfabrics.org > http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general > > To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general From weiny2 at llnl.gov Thu Jan 24 19:02:27 2008 From: weiny2 at llnl.gov (Ira Weiny) Date: Thu, 24 Jan 2008 19:02:27 -0800 Subject: [ofa-general] [PATCH] infiniband-diags/scripts/iblinkinfo.pl: Fix switch to switch output with new ibnetdiscover output. Message-ID: <20080124190227.22a3af36.weiny2@llnl.gov> I am not sure when the "port guid output" format change to ibnetdiscover happened, but I just realized that switch to switch links were not being parsed correctly by IBswcountlimits.pm. This caused issues with some of the perl diags. I have fixed this. Furthermore, Erez mentioned that he would like port guid output so I threw that option in as well. (Since I have to parse that info out of ibnetdiscover now.) One might wonder why I did not catch this before? If you mainly test on a 1 switch system you don't get to see what switch to switch links look like in diag output... :-( I have learned my lesson, sorry. Sasha, this needs to be in 1.3 as well. Sorry, :-( Ira From 89887aff392b0f40acf4035c19e06359271dfe6a Mon Sep 17 00:00:00 2001 From: Ira K. Weiny Date: Thu, 24 Jan 2008 18:50:38 -0800 Subject: [PATCH] infiniband-diags/scripts/iblinkinfo.pl: Fix switch to switch output with new ibnetdiscover output.
ibnetdiscover now prints port guids. This format change cause the regex to break when parsing switch to switch links. Fix this by parsing for the remote port guid. And while we are at it add an option to print the port guids parsed. Signed-off-by: Ira K. Weiny --- infiniband-diags/scripts/IBswcountlimits.pm | 41 ++++++++++++++++---------- infiniband-diags/scripts/iblinkinfo.pl | 9 ++++- 2 files changed, 32 insertions(+), 18 deletions(-) diff --git a/infiniband-diags/scripts/IBswcountlimits.pm b/infiniband-diags/scripts/IBswcountlimits.pm index 6985750..c698ed1 100755 --- a/infiniband-diags/scripts/IBswcountlimits.pm +++ b/infiniband-diags/scripts/IBswcountlimits.pm @@ -288,62 +288,71 @@ sub get_link_ends if ( $in_switch eq "yes" ) { my $rec = undef; - if ($line =~ /^\[(\d+)\]\s+\"[HSR]-(.+)\"\[(\d+)\]\(.+\)\s+#.*\"(.*)\"\.* lid (\d+).*/) + if ($line =~ /^\[(\d+)\]\s+\"[HSR]-(.+)\"\[(\d+)\](\(.+\))?\s+#.*\"(.*)\"\.* lid (\d+).*/) { $loc_port = $1; my $rem_guid = $2; my $rem_port = $3; - my $rem_desc = $4; - my $rem_lid = $5; + my $rem_port_guid = $4; + my $rem_desc = $5; + my $rem_lid = $6; $rec = { loc_guid => "0x$guid", loc_port => $loc_port, loc_ext_port => "", loc_desc => $desc, loc_sw_lid => $loc_sw_lid, rem_guid => "0x$rem_guid", rem_lid => $rem_lid, - rem_port => $rem_port, rem_ext_port => "", rem_desc => $rem_desc }; + rem_port => $rem_port, rem_ext_port => "", rem_desc => $rem_desc, + rem_port_guid => $rem_port_guid }; } - if ($line =~ /^\[(\d+)\]\[ext (\d+)\]\s+\"[HSR]-(.+)\"\[(\d+)\]\(.+\)\s+#.*\"(.*)\"\.* lid (\d+).*/) + if ($line =~ /^\[(\d+)\]\[ext (\d+)\]\s+\"[HSR]-(.+)\"\[(\d+)\](\(.+\))?\s+#.*\"(.*)\"\.* lid (\d+).*/) { $loc_port = $1; my $loc_ext_port = $2; my $rem_guid = $3; my $rem_port = $4; - my $rem_desc = $5; - my $rem_lid = $6; + my $rem_port_guid = $5; + my $rem_desc = $6; + my $rem_lid = $7; $rec = { loc_guid => "0x$guid", loc_port => $loc_port, loc_ext_port => $loc_ext_port, loc_desc => $desc, loc_sw_lid => $loc_sw_lid, rem_guid => 
"0x$rem_guid", rem_lid => $rem_lid, - rem_port => $rem_port, rem_ext_port => "", rem_desc => $rem_desc }; + rem_port => $rem_port, rem_ext_port => "", rem_desc => $rem_desc, + rem_port_guid => $rem_port_guid }; } - if ($line =~ /^\[(\d+)\]\s+\"[HSR]-(.+)\"\[(\d+)\]\[ext (\d+)\]\(.+\)\s+#.*\"(.*)\"\.* lid (\d+).*/) + if ($line =~ /^\[(\d+)\]\s+\"[HSR]-(.+)\"\[(\d+)\]\[ext (\d+)\](\(.+\))?\s+#.*\"(.*)\"\.* lid (\d+).*/) { $loc_port = $1; my $rem_guid = $2; my $rem_port = $3; my $rem_ext_port = $4; - my $rem_desc = $5; - my $rem_lid = $6; + my $rem_port_guid = $5; + my $rem_desc = $6; + my $rem_lid = $7; $rec = { loc_guid => "0x$guid", loc_port => $loc_port, loc_ext_port => "", loc_desc => $desc, loc_sw_lid => $loc_sw_lid, rem_guid => "0x$rem_guid", rem_lid => $rem_lid, rem_port => $rem_port, rem_ext_port => $rem_ext_port, - rem_desc => $rem_desc }; + rem_desc => $rem_desc, + rem_port_guid => $rem_port_guid }; } - if ($line =~ /^\[(\d+)\]\[ext (\d+)\]\s+\"[HSR]-(.+)\"\[(\d+)\]\[ext (\d+)\]\(.+\)\s+#.*\"(.*)\"\.* lid (\d+).*/) + if ($line =~ /^\[(\d+)\]\[ext (\d+)\]\s+\"[HSR]-(.+)\"\[(\d+)\]\[ext (\d+)\](\(.+\))?\s+#.*\"(.*)\"\.* lid (\d+).*/) { $loc_port = $1; my $loc_ext_port = $2; my $rem_guid = $3; my $rem_port = $4; my $rem_ext_port = $5; - my $rem_desc = $6; - my $rem_lid = $7; + my $rem_port_guid = $6; + my $rem_desc = $7; + my $rem_lid = $8; $rec = { loc_guid => "0x$guid", loc_port => $loc_port, loc_ext_port => $loc_ext_port, loc_desc => $desc, loc_sw_lid => $loc_sw_lid, rem_guid => "0x$rem_guid", rem_lid => $rem_lid, rem_port => $rem_port, rem_ext_port => $rem_ext_port, - rem_desc => $rem_desc }; + rem_desc => $rem_desc, + rem_port_guid => $rem_port_guid }; } if ($rec) { + $rec->{rem_port_guid} =~ s/\((.*)\)/$1/; $IBswcountlimits::link_ends{"0x$guid"}{$loc_port} = $rec; } } diff --git a/infiniband-diags/scripts/iblinkinfo.pl b/infiniband-diags/scripts/iblinkinfo.pl index 6d02eac..764e92d 100755 --- a/infiniband-diags/scripts/iblinkinfo.pl +++ 
b/infiniband-diags/scripts/iblinkinfo.pl @@ -54,6 +54,7 @@ sub usage_and_exit print " -c print port capabilities (enabled/supported values)\n"; print " -C use selected Channel Adaptor name for queries\n"; print " -P use selected channel adaptor port for queries\n"; + print " -g print port guids instead of node guids\n"; exit 0; } @@ -66,9 +67,10 @@ my $print_extended_cap = undef; my $only_down_links = undef; my $ca_name = ""; my $ca_port = ""; +my $print_port_guids = undef; chomp $argv0; -if (!getopts("hcpldRS:C:P:")) { usage_and_exit $argv0; } +if (!getopts("hcpldRS:C:P:g")) { usage_and_exit $argv0; } if (defined $Getopt::Std::opt_h) { usage_and_exit $argv0; } if (defined $Getopt::Std::opt_R) { $regenerate_map = $Getopt::Std::opt_R; } if (defined $Getopt::Std::opt_S) { $single_switch = $Getopt::Std::opt_S; } @@ -78,6 +80,7 @@ if (defined $Getopt::Std::opt_p) { $print_add_switch = $Getopt::Std::opt_p; } if (defined $Getopt::Std::opt_c) { $print_extended_cap = $Getopt::Std::opt_c; } if (defined $Getopt::Std::opt_C) { $ca_name = $Getopt::Std::opt_C; } if (defined $Getopt::Std::opt_P) { $ca_port = $Getopt::Std::opt_P; } +if (defined $Getopt::Std::opt_g) { $print_port_guids = $Getopt::Std::opt_g; } my $extra_smpquery_params = get_ca_name_port_param_string($ca_name, $ca_port); @@ -145,7 +148,6 @@ sub main if ($line =~ /^VLStallCount:\.+(.*)/) { $vl_stall = $1; } if ($line =~ /^PhysLinkState:\.+(.*)/) { $phy_link_state = $1; } } - my $rem_guid = $hr->{rem_guid}; my $rem_port = $hr->{rem_port}; my $rem_lid = $hr->{rem_lid}; my $rem_speed_sup = ""; @@ -216,6 +218,9 @@ sub main { my $line_begin = sprintf ("%18s \"%30s\"%s", $switch, $hr->{loc_desc}, $pkt_life_prompt); my $ext_guid = sprintf ("%18s", $hr->{rem_guid}); + if ($print_port_guids && $hr->{rem_port_guid} ne "") { + $ext_guid = sprintf ("0x%016s", $hr->{rem_port_guid}); + } push (@output_lines, sprintf ("%s %6s %4s[%2s] ==%s%s==> %18s %6s %4s[%2s] \"%s\" ( %s %s)\n", $line_begin, $hr->{loc_sw_lid}, $port, 
$hr->{loc_ext_port}, -- 1.5.1 -------------- next part -------------- A non-text attachment was scrubbed... Name: 0001-infiniband-diags-scripts-iblinkinfo.pl-Fix-switch-t.patch Type: application/octet-stream Size: 7196 bytes Desc: not available URL: From rajouri.jammu at gmail.com Thu Jan 24 20:07:46 2008 From: rajouri.jammu at gmail.com (Rajouri Jammu) Date: Thu, 24 Jan 2008 20:07:46 -0800 Subject: [ofa-general] Zero byte rdma read causes REM_OP_ERROR Message-ID: <3307cdf90801242007y3ace39ccrb72d5f35c3a937e4@mail.gmail.com> When I try doing a zero byte rdma read I get a 10 (IB_WC_REM_ACCESS_ERR ) error completion. Is that expected? Non-zero byte reads complete successfully. I'm using OFED-1.2.5.4. -------------- next part -------------- An HTML attachment was scrubbed... URL: From bart.vanassche at gmail.com Thu Jan 24 23:51:32 2008 From: bart.vanassche at gmail.com (Bart Van Assche) Date: Fri, 25 Jan 2008 08:51:32 +0100 Subject: [ofa-general] Feature request: ability to enable SDP system-wide In-Reply-To: <000601c85dc7$95b48550$c11d8ff0$@rr.com> References: <000601c85dc7$95b48550$c11d8ff0$@rr.com> Message-ID: On Jan 23, 2008 2:55 PM, Jim Mott wrote: > There are no current plans to do this. Can this please be reconsidered ? The current approach (letting libsdp.so open two sockets every time socket(AF_INET...) is called) needs a complex implementation and is error prone. My opinion is that instead of letting libsdp.so create two sockets to enable IPoIB + SDP support, the kernel should be modified such that both IPoIB + SDP context information can be associated with a single socket descriptor. Once this functionality is implemented there will no longer be a need for a separate address family for SDP (AF_INET_SDP). I noticed there is strong opposition on LKML against introduction of AF_INET_SDP (http://lkml.org/lkml/2006/3/6/70). An example of an issue I encountered with libsdp.so: the STGT iSCSI target daemon does not start properly when preloading libsdp.so. 
Without libsdp.so tgtd runs fine: # /usr/sbin/tgtd -f (no error messages) With libsdp.so preloaded an error occurs: # LD_PRELOAD=/usr/lib/libsdp.so /usr/sbin/tgtd -f (null): iscsi_tcp_init(202) unable to bind server socket, Address already in use Bart. From RAISCH at de.ibm.com Fri Jan 25 00:44:36 2008 From: RAISCH at de.ibm.com (Christoph Raisch) Date: Fri, 25 Jan 2008 09:44:36 +0100 Subject: [ofa-general] [PATCH] IB/ehca: Prevent sending UD packets to QP0 In-Reply-To: <1201210488.25913.97.camel@hrosenstock-ws.xsigo.com> References: <200801241759.09065.fenkes@de.ibm.com> <1201210488.25913.97.camel@hrosenstock-ws.xsigo.com> Message-ID: > Is this a hardware or software limitation ? > > -- Hal > >From the behavior of a POWER system you can see that QP0 is supported, otherwise it would be a little bit difficult to get LIDs assigned... Building QP0 support for partitions is not straight forward with virtualized adapters. And as you already know this is currently not available to the partitions. Gruss / Regards Christoph Raisch
From sashak at voltaire.com Fri Jan 25 02:59:03 2008 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Fri, 25 Jan 2008 10:59:03 +0000 Subject: [ofa-general] Opensm compatibility with rate=1 In-Reply-To: <571f1a060801241618t3545d7d7k533256da3425294c@mail.gmail.com> References: <571f1a060801231518i76c3383l38e3fe9cc32e3cd8@mail.gmail.com> <20080124181530.GH11277@sashak.voltaire.com> <571f1a060801241618t3545d7d7k533256da3425294c@mail.gmail.com> Message-ID: <20080125105903.GA13079@sashak.voltaire.com> On 16:18 Thu 24 Jan , Greg Kurtzer wrote: > > Hrmm, that is interesting. The command "ibstatus" shows that the rate > is 2.5 Gb/sec (1X) yet "ibstat" shows the rate being 2. The opensm -V > logs shows that it was setting the rate to 2 as well (which didn't > work). Why does rate=2 not work? What are errors? Sasha From sashak at voltaire.com Fri Jan 25 03:06:39 2008 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Fri, 25 Jan 2008 11:06:39 +0000 Subject: [ofa-general] Re: [PATCH] opensm/scripts/redhat-opensm.init: fix starting opensm when using daemon mode In-Reply-To: <20080124105919.5b491029.weiny2@llnl.gov> References: <20080124105919.5b491029.weiny2@llnl.gov> Message-ID: <20080125110639.GB13079@sashak.voltaire.com> On 10:59 Thu 24 Jan , Ira Weiny wrote: > When daemon mode was specified this script was reporting "failure" on start > when it actually worked. This fixes it by just waiting for a valid pid of > opensm. > > Ira > > > From 8f9b550c055b0fabf4c5fd6652060fae5dd7216b Mon Sep 17 00:00:00 2001 > From: Ira K. Weiny > Date: Thu, 24 Jan 2008 10:51:23 -0800 > Subject: [PATCH] opensm/scripts/redhat-opensm.init: fix starting opensm when using daemon mode > > > Signed-off-by: Ira K. Weiny Applied. Thanks.
Sasha From sashak at voltaire.com Fri Jan 25 03:13:38 2008 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Fri, 25 Jan 2008 11:13:38 +0000 Subject: [ofa-general] Re: [PATCH] opensm: osm_subnet.c log and print the path of the cached option file In-Reply-To: <4798F4FE.5010101@llnl.gov> References: <4798F4FE.5010101@llnl.gov> Message-ID: <20080125111338.GC13079@sashak.voltaire.com> On 12:28 Thu 24 Jan , Timothy A. Meier wrote: > Sasha, > > A trivial patch. During development, we (at LLNL) sometimes use different > options/configurations. This provides a way to know which one is active.
> > > From d60408b4dc1cb0c917e2eb33d6a3f62ac6bb9b5c Mon Sep 17 00:00:00 2001 > From: Tim Meier > Date: Thu, 24 Jan 2008 11:51:08 -0800 > Subject: [PATCH] opensm: osm_subnet.c log and print the path of the cached > option file > > Logged (syslog and print) the path to the option file that is cached and > parsed. > This is helpful when something other than the default path is used. > > Signed-off-by: Tim Meier Applied. Thanks. Sasha From vlad at lists.openfabrics.org Fri Jan 25 03:16:40 2008 From: vlad at lists.openfabrics.org (Vladimir Sokolovsky Mellanox) Date: Fri, 25 Jan 2008 03:16:40 -0800 (PST) Subject: [ofa-general] ofa_1_3_kernel 20080125-0200 daily build status Message-ID: <20080125111640.60E6AE60813@openfabrics.org> This email was generated automatically, please do not reply git_url: git://git.openfabrics.org/ofed_1_3/linux-2.6.git git_branch: ofed_kernel Common build parameters: --with-ipoib-mod --with-sdp-mod --with-srp-mod --with-user_mad-mod --with-user_access-mod --with-mthca-mod --with-mlx4-mod --with-core-mod --with-addr_trans-mod --with-rds-mod --with-cxgb3-mod --with-nes-mod Passed: Passed on i686 with 2.6.15-23-server Passed on i686 with linux-2.6.22 Passed on i686 with linux-2.6.21.1 Passed on i686 with linux-2.6.18 Passed on i686 with linux-2.6.17 Passed on i686 with linux-2.6.16 Passed on i686 with linux-2.6.13 Passed on i686 with linux-2.6.15 Passed on i686 with linux-2.6.14 Passed on i686 with linux-2.6.19 Passed on i686 with linux-2.6.12 Passed on x86_64 with linux-2.6.22 Passed on x86_64 with linux-2.6.21.1 Passed on x86_64 with linux-2.6.22.5-31-default Passed on ia64 with linux-2.6.22 Passed on ia64 with linux-2.6.18 Passed on ppc64 with linux-2.6.18 Passed on
x86_64 with linux-2.6.18 Passed on ia64 with linux-2.6.19 Passed on x86_64 with linux-2.6.12 Passed on ia64 with linux-2.6.17 Passed on ppc64 with linux-2.6.19 Passed on ia64 with linux-2.6.15 Passed on powerpc with linux-2.6.14 Passed on powerpc with linux-2.6.13 Passed on x86_64 with linux-2.6.18-53.el5 Passed on ia64 with linux-2.6.14 Passed on ia64 with linux-2.6.23 Passed on powerpc with linux-2.6.12 Passed on ppc64 with linux-2.6.15 Passed on ppc64 with linux-2.6.14 Passed on x86_64 with linux-2.6.17 Passed on ppc64 with linux-2.6.16 Passed on x86_64 with linux-2.6.19 Passed on x86_64 with linux-2.6.20 Passed on x86_64 with linux-2.6.16 Passed on ia64 with linux-2.6.12 Passed on ia64 with linux-2.6.13 Passed on x86_64 with linux-2.6.16.21-0.8-smp Passed on powerpc with linux-2.6.15 Passed on ppc64 with linux-2.6.17 Passed on ppc64 with linux-2.6.12 Passed on ia64 with linux-2.6.16 Passed on x86_64 with linux-2.6.9-42.ELsmp Passed on x86_64 with linux-2.6.13 Passed on x86_64 with linux-2.6.15 Passed on x86_64 with linux-2.6.14 Passed on x86_64 with linux-2.6.9-55.ELsmp Passed on x86_64 with linux-2.6.18-1.2798.fc6 Passed on ppc64 with linux-2.6.13 Passed on x86_64 with linux-2.6.16.43-0.3-smp Passed on ia64 with linux-2.6.21.1 Passed on ia64 with linux-2.6.16.21-0.8-default Passed on x86_64 with linux-2.6.18-8.el5 Passed on ppc64 with linux-2.6.18-8.el5 Failed: From eli at dev.mellanox.co.il Fri Jan 25 03:43:53 2008 From: eli at dev.mellanox.co.il (Eli Cohen) Date: Fri, 25 Jan 2008 13:43:53 +0200 Subject: [ofa-general] Re: [PATCH] ib/limthca: Remove an always true condition In-Reply-To: References: <1201184628.6755.9.camel@mtls03> Message-ID: <4e6a6b3c0801250343y2b551817k4fb62bff9445542e@mail.gmail.com> On 1/24/08, Roland Dreier wrote: > Thanks for splitting this up... 
it makes review much easier, and > indeed it seems either there is a bug in the existing code, or this > patch is wrong: > > > - if (srq->first_free >= 0) > > - *wqe_to_link(get_wqe(srq, srq->last_free)) = ind; > > - else > > - srq->first_free = ind; > > - > > + *wqe_to_link(get_wqe(srq, srq->last_free)) = ind; > > why is first_free always >= 0? I don't see anything that guarantees > that, The following two subsequent ifs guarantee that. if (ind < 0) { err = -1; *bad_wr = wr; break; } wqe = get_wqe(srq, ind); next_ind = *wqe_to_link(wqe); if (next_ind < 0) { err = -1; *bad_wr = wr; break; } > and in fact mthca_tavor_post_srq_recv and mthca_arbel_post_srq_recv > both have code: > > ind = srq->first_free; > > if (ind < 0) { > > so if first_free is always non-negative, we could also delete these > checks from the fast path. I do see the SRQ create code adds a spare entry: > > srq->max = align_queue_size(pd->context, attr->attr.max_wr, 1); > > but it seems we don't prevent the consumer from using this entry. The two ifs I mentioned above guarantee that the consumer cannot post to the last entry, and they also make it possible to assert the requirement that all posted WQEs have a valid NDA field. > However I can't recreate the reasoning why we need the spare entry... > > The most straightforward fix is to change the check for the SRQ being > full in the post_srq_recv functions so that it keeps the spare entry, > but I'd like to understand the code again first ;) > > - R.
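The free-list argument in this exchange can be illustrated with a small stand-alone model. This is only a sketch under assumptions: struct toy_srq, toy_init, toy_post and toy_free are hypothetical names, and a plain index array stands in for the link field of each WQE; it is not the mthca code. It shows that when the post path refuses to hand out an entry whose link is negative, the free list always keeps at least one entry, so first_free never goes negative and the free path can link at last_free unconditionally.

```c
#include <stdio.h>

#define TOY_ENTRIES 4   /* hypothetical queue size, spare entry included */

struct toy_srq {
    int link[TOY_ENTRIES];  /* link[i] = next free index, -1 = end of list */
    int first_free;
    int last_free;
};

/* Chain all entries into one free list: 0 -> 1 -> ... -> N-1 -> -1. */
void toy_init(struct toy_srq *s)
{
    for (int i = 0; i < TOY_ENTRIES; i++)
        s->link[i] = (i + 1 < TOY_ENTRIES) ? i + 1 : -1;
    s->first_free = 0;
    s->last_free = TOY_ENTRIES - 1;
}

/* Take the head of the free list; returns the index used, or -1 if "full".
 * The second check keeps the last remaining free entry out of reach. */
int toy_post(struct toy_srq *s)
{
    int ind = s->first_free;
    if (ind < 0)
        return -1;
    int next_ind = s->link[ind];
    if (next_ind < 0)
        return -1;
    s->first_free = next_ind;
    return ind;
}

/* Return an entry to the tail. No emptiness test is needed because
 * toy_post never drains the list completely. */
void toy_free(struct toy_srq *s, int ind)
{
    s->link[s->last_free] = ind;
    s->link[ind] = -1;
    s->last_free = ind;
}
```

With four entries, three posts succeed and the fourth is refused while first_free still points at the remaining entry; freeing any posted entry makes that remaining entry usable again.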
> _______________________________________________ > general mailing list > general at lists.openfabrics.org > http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general > > To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general > From eli at dev.mellanox.co.il Fri Jan 25 03:48:25 2008 From: eli at dev.mellanox.co.il (Eli Cohen) Date: Fri, 25 Jan 2008 13:48:25 +0200 Subject: [ofa-general] Bogus Receive Completions In-Reply-To: <4798DF2B.8060306@dls.net> References: <475607AA.301@dls.net> <4797DC47.6060104@dls.net> <4797DD87.8020808@dls.net> <1201184289.6755.7.camel@mtls03> <4798B643.9060508@dls.net> <1201195120.6755.75.camel@mtls03> <4798CD67.8040503@dls.net> <4e6a6b3c0801241030n182ef2fbu5485be1a60a3074d@mail.gmail.com> <4798DF2B.8060306@dls.net> Message-ID: <4e6a6b3c0801250348r17cbf0bay3c315a0a02f14ab9@mail.gmail.com> On 1/24/08, Roman Kononov wrote: > On 2008-01-24 12:30 Eli Cohen said the following: > > On 1/24/08, *Roman Kononov* > > > wrote: > > > > Question: is there an easy way to inject transmission errors into the > > fabric? > > > > I am not sure what exactly you mean by this. > > I wanted to artificially corrupt I/B packets or their CRC, to cause > transmission errors. Under such conditions the packets would be re-sent, > intensifying error correction activity, possibly resulting in more > frequent occurrence of software or firmware bugs. > There is no simple way to do that. From dotanb at dev.mellanox.co.il Fri Jan 25 04:45:43 2008 From: dotanb at dev.mellanox.co.il (Dotan Barak) Date: Fri, 25 Jan 2008 14:45:43 +0200 Subject: [ofa-general] Zero byte rdma read causes REM_OP_ERROR In-Reply-To: <3307cdf90801242007y3ace39ccrb72d5f35c3a937e4@mail.gmail.com> References: <3307cdf90801242007y3ace39ccrb72d5f35c3a937e4@mail.gmail.com> Message-ID: <4799D9F7.4030607@dev.mellanox.co.il> Rajouri Jammu wrote: > When I try doing a zero byte rdma read I get a 10 > (IB_WC_REM_ACCESS_ERR ) error completion. > > Is that expected? 
> > Non-zero byte reads complete successfully. > > I'm using OFED-1.2.5.4. > Did you enable RDMA Read in the qp_access_flags (in modify QP RESET->INIT)? Dotan From sashak at voltaire.com Fri Jan 25 05:06:59 2008 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Fri, 25 Jan 2008 13:06:59 +0000 Subject: [ofa-general] Re: [PATCH] opensm/opensm/osm_subnet.c: update valid force_link_speed values to v1.2.1 of the spec In-Reply-To: <20080124140129.14498e34.weiny2@llnl.gov> References: <20080124140129.14498e34.weiny2@llnl.gov> Message-ID: <20080125130659.GH13079@sashak.voltaire.com> On 14:01 Thu 24 Jan , Ira Weiny wrote: > From c4aa70c9ec5642d214fa0da06fe5301454e99760 Mon Sep 17 00:00:00 2001 > From: Ira K. Weiny > Date: Thu, 24 Jan 2008 13:56:22 -0800 > Subject: [PATCH] opensm/opensm/osm_subnet.c: update valid force_link_speed values to v1.2.1 of > the spec > > Signed-off-by: Ira K. Weiny Applied. Thanks. Sasha From gmkurtzer at gmail.com Fri Jan 25 06:17:28 2008 From: gmkurtzer at gmail.com (Greg Kurtzer) Date: Fri, 25 Jan 2008 06:17:28 -0800 Subject: [ofa-general] Opensm compatibility with rate=1 In-Reply-To: <20080125105903.GA13079@sashak.voltaire.com> References: <571f1a060801231518i76c3383l38e3fe9cc32e3cd8@mail.gmail.com> <20080124181530.GH11277@sashak.voltaire.com> <571f1a060801241618t3545d7d7k533256da3425294c@mail.gmail.com> <20080125105903.GA13079@sashak.voltaire.com> Message-ID: <571f1a060801250617vde79132w38eed9de07e46ef@mail.gmail.com> We tried rate=2, but it didn't fix the errors we were getting: kernel: ib0: multicast join failed for ff12:401b:ffff:0000:0000:0000:ffff:ffff, status -22 Adding rate=1 and my kludge suppressed the error. I can't give specifics as to why any other rates did not work. Thanks! On Jan 25, 2008 2:59 AM, Sasha Khapyorsky wrote: > On 16:18 Thu 24 Jan , Greg Kurtzer wrote: > > > > Hrmm, that is interesting. The command "ibstatus" shows that the rate > > is 2.5 Gb/sec (1X) yet "ibstat" shows the rate being 2.
The opensm -V > > logs shows that it was setting the rate to 2 as well (which didn't > > work). > > Why does rate=2 not work? What are errors? > > Sasha > -- Greg Kurtzer http://www.runlevelzero.net/ From jimmott at austin.rr.com Fri Jan 25 06:23:32 2008 From: jimmott at austin.rr.com (Jim Mott) Date: Fri, 25 Jan 2008 08:23:32 -0600 Subject: [ofa-general] RE: [ewg] Not seeing any SDP performance changes inOFED 1.3 beta, and I get Oops when enabling sdp_zcopy_thresh In-Reply-To: References: <47445630.10000@dev.mellanox.co.il> <000001c8338c$206e80d0$614b8270$@rr.com> <4798A9F5.7030109@gmail.com> Message-ID: <005301c85f5d$e03e36b0$a0baa410$@rr.com> I had to rebuild netperf (netperf-2.4.4.tar.gz) to enable demo mode, so I do not know how these results related to what I have seen before due to differences in the test program. Running from the AMD machine against a 2-socket/quad-core Intel running SLES10-SP1-RealTime (sorry; I am testing other things...) with sdp_zcopy_thresh=64K and a fresh default netperf-2.4.4 build gives me: [root at dirk unit_tests]# netperf -C -c -P 0 -t TCP_STREAM -H 193.168.10.125 -D -l 30 -- -r 1000000 Interim result: 2945.95 10^6bits/s over 1.00 seconds Interim result: 6440.04 10^6bits/s over 1.00 seconds Interim result: 7974.69 10^6bits/s over 1.00 seconds Interim result: 8007.55 10^6bits/s over 1.00 seconds Interim result: 7807.40 10^6bits/s over 1.03 seconds Interim result: 7947.53 10^6bits/s over 1.01 seconds Interim result: 7857.82 10^6bits/s over 1.01 seconds Interim result: 7551.87 10^6bits/s over 1.04 seconds Interim result: 6939.62 10^6bits/s over 1.09 seconds Interim result: 7270.93 10^6bits/s over 1.00 seconds Interim result: 7978.83 10^6bits/s over 1.00 seconds Interim result: 7948.33 10^6bits/s over 1.00 seconds Interim result: 7964.85 10^6bits/s over 1.00 seconds Interim result: 7784.14 10^6bits/s over 1.02 seconds Interim result: 8126.98 10^6bits/s over 1.01 seconds Interim result: 7999.08 10^6bits/s over 1.02 seconds Interim 
result: 7535.65 10^6bits/s over 1.06 seconds Interim result: 7934.04 10^6bits/s over 1.00 seconds Interim result: 7846.96 10^6bits/s over 1.01 seconds Interim result: 7813.65 10^6bits/s over 1.00 seconds Interim result: 7768.61 10^6bits/s over 1.01 seconds Interim result: 7905.22 10^6bits/s over 1.00 seconds Interim result: 8038.71 10^6bits/s over 1.00 seconds Interim result: 7233.60 10^6bits/s over 1.11 seconds Interim result: 7884.16 10^6bits/s over 1.00 seconds Interim result: 7711.27 10^6bits/s over 1.02 seconds Interim result: 7708.41 10^6bits/s over 1.00 seconds Interim result: 7150.47 10^6bits/s over 1.08 seconds Interim result: 7736.27 10^6bits/s over 1.00 seconds 8388608 16384 16384 30.00 7545.98 52.81 12.84 1.147 1.115 After turning off bzcopy: [root at dirk unit_tests]# echo 0 > /sys/module/ib_sdp/parameters/sdp_zcopy_thresh [root at dirk unit_tests]# netperf -C -c -P 0 -t TCP_STREAM -H 193.168.10.125 -D -l 30 -- -r 1000000 Interim result: 2198.71 10^6bits/s over 1.39 seconds Interim result: 3198.83 10^6bits/s over 1.00 seconds Interim result: 3547.62 10^6bits/s over 1.00 seconds Interim result: 7633.09 10^6bits/s over 1.00 seconds Interim result: 7647.69 10^6bits/s over 1.00 seconds Interim result: 7932.82 10^6bits/s over 1.00 seconds Interim result: 8027.73 10^6bits/s over 1.00 seconds Interim result: 7642.15 10^6bits/s over 1.05 seconds Interim result: 7304.84 10^6bits/s over 1.05 seconds Interim result: 7800.30 10^6bits/s over 1.02 seconds Interim result: 7808.03 10^6bits/s over 1.00 seconds Interim result: 7875.02 10^6bits/s over 1.00 seconds Interim result: 8031.41 10^6bits/s over 1.00 seconds Interim result: 7963.28 10^6bits/s over 1.01 seconds Interim result: 7842.06 10^6bits/s over 1.02 seconds Interim result: 7783.37 10^6bits/s over 1.01 seconds Interim result: 7994.32 10^6bits/s over 1.00 seconds Interim result: 7281.62 10^6bits/s over 1.10 seconds Interim result: 7478.55 10^6bits/s over 1.01 seconds Interim result: 7993.52 10^6bits/s over 
1.00 seconds Interim result: 7914.57 10^6bits/s over 1.01 seconds Interim result: 7809.30 10^6bits/s over 1.01 seconds Interim result: 7996.93 10^6bits/s over 1.00 seconds Interim result: 7867.03 10^6bits/s over 1.02 seconds Interim result: 7880.61 10^6bits/s over 1.01 seconds Interim result: 8095.24 10^6bits/s over 1.00 seconds Interim result: 7942.04 10^6bits/s over 1.02 seconds Interim result: 7818.13 10^6bits/s over 1.02 seconds Interim result: 7301.62 10^6bits/s over 1.07 seconds 8388608 16384 16384 30.01 7228.62 52.21 13.03 1.183 1.181 So I see your results (sort of). I have been using the netperf that ships with the OS (Rhat4u4 and Rhat5 mostly) or is built with default options. Maybe that is the difference. Target in this case is SLES10 SP1 Real Time. I notice far less variance in my results than in what you have shown. We might have a Linux kernel scheduling issue that accounts for the variance you see (and I have seen). That would be wonderful news because the variance has been driving me crazy. ======= There is something very different in the netperf 2.4.4 I built to enable demo mode and the ones I have been using. Even after I rebuilt it without demo mode, using the 2.4.4 netserver as a target resulted in much lower numbers than I have seen before. There was also a problem with IPoIB tests reporting no bandwidth (0.01 Mb/sec), possibly due to MTU problems that were being ignored by earlier versions. Using the rebuilt (non-demo mode) 2.4.4 netperf code against a Rhat5 native netserver more or less replicates my previous results for everything except bzcopy bandwidth. The bzcopy bandwidth reported is really bad compared to everything else. Also ifconfig on the ib devices shows errors and dropped packets. I will look into the test code next chance I get. For now, it appears to me that the differences we are seeing in performance relate to netperf differences.
I suspect that later versions of netperf are trying to optimize around transport geometries (device reported MTUs, transport reported buffer sizes, etc.) and SDP is probably not putting its best buffer forward. If you could possibly rerun some of your tests with the OS provided netperf (fresh OS install, fresh OFED install, simple command line test), I would be interested to know if it shows different behavior from what you have been reporting. -----Original Message----- From: general-bounces at lists.openfabrics.org [mailto:general-bounces at lists.openfabrics.org] On Behalf Of Scott Weitzenkamp (sweitzen) Sent: Thursday, January 24, 2008 3:58 PM To: Jim Mott; Weikuan Yu Cc: general at lists.openfabrics.org Subject: RE: [ofa-general] RE: [ewg] Not seeing any SDP performance changes inOFED 1.3 beta, and I get Oops when enabling sdp_zcopy_thresh Jim, Like I've said before, I don't see any change in throughput with SDP zcopy, plus the throughput bounces around. When you run netperf with -D, do you see variations in throughput? Here's an example from a dual socket Xeon 5355 quad core RHEL5 x86_64 system with ConnectX 2.3.0 firmware, the interim results throughput bounces around between 4-7 Gbps.
[releng at svbu-qaclus-98 ~]$ cat /sys/module/ib_sdp/parameters/sdp_zcopy_thresh 16384 [releng at svbu-qaclus-98 ~]$ LD_PRELOAD=libsdp.so netperf241 -C -c -P 0 -t TCP_STREAM -H 192.168.1.127 -D -l 60 -- -m 1000000 Interim result: 6676.50 10^6bits/s over 1.00 seconds Interim result: 6674.47 10^6bits/s over 1.00 seconds Interim result: 6687.89 10^6bits/s over 1.00 seconds Interim result: 7075.40 10^6bits/s over 1.00 seconds Interim result: 7065.08 10^6bits/s over 1.00 seconds Interim result: 7074.69 10^6bits/s over 1.00 seconds Interim result: 6667.10 10^6bits/s over 1.06 seconds Interim result: 4492.29 10^6bits/s over 1.48 seconds Interim result: 4503.65 10^6bits/s over 1.00 seconds Interim result: 4481.25 10^6bits/s over 1.01 seconds Interim result: 4495.91 10^6bits/s over 1.00 seconds Interim result: 4521.51 10^6bits/s over 1.00 seconds Interim result: 4466.58 10^6bits/s over 1.01 seconds Interim result: 4482.09 10^6bits/s over 1.00 seconds Interim result: 4480.21 10^6bits/s over 1.00 seconds Interim result: 4490.07 10^6bits/s over 1.00 seconds Interim result: 4479.47 10^6bits/s over 1.00 seconds Interim result: 4480.30 10^6bits/s over 1.00 seconds Interim result: 4489.14 10^6bits/s over 1.00 seconds Interim result: 4484.38 10^6bits/s over 1.00 seconds Interim result: 4473.64 10^6bits/s over 1.00 seconds Interim result: 4479.71 10^6bits/s over 1.00 seconds Interim result: 4486.54 10^6bits/s over 1.00 seconds Interim result: 4456.65 10^6bits/s over 1.01 seconds Interim result: 4483.70 10^6bits/s over 1.00 seconds Interim result: 4486.41 10^6bits/s over 1.00 seconds Interim result: 4489.58 10^6bits/s over 1.00 seconds Interim result: 4478.15 10^6bits/s over 1.00 seconds Interim result: 4476.67 10^6bits/s over 1.00 seconds Interim result: 4496.49 10^6bits/s over 1.00 seconds Interim result: 4489.26 10^6bits/s over 1.00 seconds Interim result: 4479.86 10^6bits/s over 1.00 seconds Interim result: 4500.97 10^6bits/s over 1.00 seconds Interim result: 4473.96 10^6bits/s 
over 1.00 seconds Interim result: 7346.56 10^6bits/s over 1.00 seconds Interim result: 7524.94 10^6bits/s over 1.00 seconds Interim result: 7540.16 10^6bits/s over 1.00 seconds Interim result: 7553.53 10^6bits/s over 1.00 seconds Interim result: 7552.08 10^6bits/s over 1.00 seconds Interim result: 7550.08 10^6bits/s over 1.00 seconds Interim result: 7554.35 10^6bits/s over 1.00 seconds Interim result: 7550.85 10^6bits/s over 1.00 seconds Interim result: 7557.27 10^6bits/s over 1.00 seconds Interim result: 7568.28 10^6bits/s over 1.00 seconds Interim result: 7497.24 10^6bits/s over 1.01 seconds Interim result: 7436.44 10^6bits/s over 1.01 seconds Interim result: 6098.26 10^6bits/s over 1.22 seconds Interim result: 5644.82 10^6bits/s over 1.08 seconds Interim result: 5639.07 10^6bits/s over 1.00 seconds Interim result: 5636.32 10^6bits/s over 1.00 seconds Interim result: 5640.45 10^6bits/s over 1.00 seconds Interim result: 6319.06 10^6bits/s over 1.00 seconds Interim result: 7324.10 10^6bits/s over 1.00 seconds Interim result: 7323.53 10^6bits/s over 1.00 seconds Interim result: 7333.88 10^6bits/s over 1.00 seconds Interim result: 7172.70 10^6bits/s over 1.02 seconds Interim result: 4488.97 10^6bits/s over 1.60 seconds Interim result: 4492.37 10^6bits/s over 1.00 seconds 87380 16384 1000000 60.00 5701.15 17.41 16.26 2.001 1.870 Scott Weitzenkamp SQA and Release Manager Server Virtualization Business Unit Cisco Systems > -----Original Message----- > From: Jim Mott [mailto:jim at mellanox.com] > Sent: Thursday, January 24, 2008 9:47 AM > To: Scott Weitzenkamp (sweitzen); Weikuan Yu > Cc: general at lists.openfabrics.org > Subject: RE: [ofa-general] RE: [ewg] Not seeing any SDP > performance changes inOFED 1.3 beta, and I get Oops when > enabling sdp_zcopy_thresh > > I am really puzzled. The majority of my testing has been between > Rhat4U4 and Rhat5. 
Using netperf command lines of the form: > netperf -C -c -P 0 -t TCP_RR -H 193.168.10.143 -l 60 -- -r 64 > netperf -C -c -P 0 -t TCP_STREAM -H 193.168.10.143 -l 60 > -- -r 1000000 > and a process of: > - set sdp_zcopy_thresh=0, run bandwidth test > - set sdp_zcopy_thresh=size, run bandwidth test > I repeatedly get results that look like this: > size SDP Bzcopy > 65536 7375.00 7515.98 > 131072 7465.70 8105.58 > 1000000 6541.87 9948.76 > > These numbers are from high end (2-socket, quad-core) machines. When > you > use smaller machines, like the AMD dual-core shown below, the > differences > between SDP with and without bzcopy are more striking. > > The process to start the netserver is: > export LD_LIBRARY_PATH=/usr/local/ofed/lib64:/usr/local/ofed/lib > export LD_PRELOAD=libsdp.so > export LIBSDP_CONFIG_FILE=/etc/infiniband/libsdp.conf > netserver > > The process to start the netperf is similar: > export LD_LIBRARY_PATH=/usr/local/ofed/lib64:/usr/local/ofed/lib > export LD_PRELOAD=libsdp.so > export LIBSDP_CONFIG_FILE=/etc/infiniband/libsdp.conf > netperf -C -c -P 0 -t TCP_STREAM -H 193.168.10.143 -l 60 > -- -r 1000000 > > You can unload and reload ib_sdp between tests, but I just echo 0 and > echo size into sdp_zcopy_thresh on the sending side. Note that it is > in a different place on Rhat4u4 and Rhat5. > > My libsdp.conf is the default that ships with OFED. Stripping the > comments (grep -v), it is just: > log min-level 9 destination file libsdp.log > use both server * *:* > use both client * *:* > Note that if you build locally: > cd /tmp/openib_gen2/xxxx/ofa_1_3_dev_kernel > make install > the libsdp.conf file seems to get lost. You must restore it by > hand.
> > I have a shell script that automates this testing for a > wide range of message sizes: > 64 128 512 1024 2048 4096 8192 16000 32768 65536 131072 1000000 > on multiple transports: > IP both "echo datagram > /sys/class/net/ib0/mode" > IP-CM both "echo connected > /sys/class/net/ib0/mode" > SDP both > Bzcopy TCP_STREAM > Where both is TCP_RR and TCP_STREAM testing. > > The variance in SDP bandwidth results can be 10%-15% between > runs. The > difference between Bzcopy and non-Bzcopy is always very > visible for 128K > and up tests though. > > Could some other people please try to run some of these > tests? If only > help me know if I am crazy? > > Thanks, > JIm > > Jim Mott > Mellanox Technologies Ltd. > mail: jim at mellanox.com > Phone: 512-294-5481 > > > -----Original Message----- > From: Scott Weitzenkamp (sweitzen) [mailto:sweitzen at cisco.com] > Sent: Thursday, January 24, 2008 11:17 AM > To: Jim Mott; Weikuan Yu > Cc: ewg at lists.openfabrics.org; general at lists.openfabrics.org > Subject: RE: [ofa-general] RE: [ewg] Not seeing any SDP performance > changes inOFED 1.3 beta, and I get Oops when enabling sdp_zcopy_thresh > > I've tested on RHEL4 and RHEL5, and see no sdp_zcopy_thresh > improvement > for any message size, as measured with netperf, for any Arbel or > ConnectX HCA. > > Scott > > > > -----Original Message----- > > From: Jim Mott [mailto:jim at mellanox.com] > > Sent: Thursday, January 24, 2008 7:57 AM > > To: Weikuan Yu; Scott Weitzenkamp (sweitzen) > > Cc: ewg at lists.openfabrics.org; general at lists.openfabrics.org > > Subject: RE: [ofa-general] RE: [ewg] Not seeing any SDP > > performance changes inOFED 1.3 beta, and I get Oops when > > enabling sdp_zcopy_thresh > > > > Hi, > > 64K is borderline for seeing bzcopy effect. 
Using an AMD > > 6000+ (3 Ghz > > dual core) in Asus M2A-VM motherboard with ConnectX running > > 2.3 firmware > > and OFED 1.3-rc3 stack running on 2.6.23.8 kernel.org kernel, > > I ran the > > test for 128K: > > 5546 sdp_zcopy_thresh=0 (off) > > 8709 sdp_zcopy_thresh=65536 > > > > For these tests, I just have LD_PRELOAD set in my environment. > > > > ======================= > > > > I see that TCP_MAXSEG is not being handled by libsdp and will > > look into > > it. > > > > > > [root at dirk ~]# modprobe ib_sdp > > [root at dirk ~]# netperf -v2 -4 -H 193.168.10.198 -l 30 -t > TCP_STREAM -c > > -C -- -m 128K > > TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to > > 193.168.10.198 > > (193.168.10.198) port 0 AF_INET > > netperf: get_tcp_info: getsockopt TCP_MAXSEG: errno 92 > > Recv Send Send Utilization > Service > > Demand > > Socket Socket Message Elapsed Send Recv Send > > Recv > > Size Size Size Time Throughput local remote local > > remote > > bytes bytes bytes secs. 10^6bits/s % S % S us/KB > > us/KB > > > > 87380 16384 131072 30.01 5545.69 51.47 14.43 1.521 > > 1.706 > > > > Alignment Offset Bytes Bytes Sends Bytes > > Recvs > > Local Remote Local Remote Xfered Per Per > > Send Recv Send Recv Send (avg) > Recv (avg) > > 8 8 0 0 2.08e+10 131072.00 158690 > 33135.60 > > 627718 > > > > Maximum > > Segment > > Size (bytes) > > -1 > > [root at dirk ~]# echo 65536 > > >/sys/module/ib_sdp/parameters/sdp_zcopy_thresh > > [root at dirk ~]# netperf -v2 -4 -H 193.168.10.198 -l 30 -t > TCP_STREAM -c > > -C -- -m 128K > > TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to > > 193.168.10.198 > > (193.168.10.198) port 0 AF_INET > > netperf: get_tcp_info: getsockopt TCP_MAXSEG: errno 92 > > Recv Send Send Utilization > Service > > Demand > > Socket Socket Message Elapsed Send Recv Send > > Recv > > Size Size Size Time Throughput local remote local > > remote > > bytes bytes bytes secs. 
10^6bits/s % S % S us/KB > > us/KB > > > > 87380 16384 131072 30.01 8708.58 50.63 14.55 0.953 > > 1.095 > > > > Alignment Offset Bytes Bytes Sends Bytes > > Recvs > > Local Remote Local Remote Xfered Per Per > > Send Recv Send Recv Send (avg) > Recv (avg) > > 8 8 0 0 3.267e+10 131072.00 249228 > 26348.30 > > 1239807 > > > > Maximum > > Segment > > Size (bytes) > > -1 > > > > Thanks, > > JIm > > > > Jim Mott > > Mellanox Technologies Ltd. > > mail: jim at mellanox.com > > Phone: 512-294-5481 > > > > > > -----Original Message----- > > From: Weikuan Yu [mailto:weikuan.yu at gmail.com] > > Sent: Thursday, January 24, 2008 9:09 AM > > To: Scott Weitzenkamp (sweitzen) > > Cc: Jim Mott; ewg at lists.openfabrics.org; > general at lists.openfabrics.org > > Subject: Re: [ofa-general] RE: [ewg] Not seeing any SDP performance > > changes inOFED 1.3 beta, and I get Oops when enabling > sdp_zcopy_thresh > > > > Hi, Scott, > > > > I have been running SDP tests across two woodcrest nodes > with 4x DDR > > cards using OFED-1.2.5.4. The card/firmware info is below. > > > > CA 'mthca0' > > CA type: MT25208 > > Number of ports: 2 > > Firmware version: 5.1.400 > > Hardware version: a0 > > Node GUID: 0x0002c90200228e0c > > System image GUID: 0x0002c90200228e0f > > > > I could not get a bandwidth more than 5Gbps like you have > shown here. > > Wonder if I need to upgrade to the latest software or firmware? Any > > suggestions? > > > > Thanks, > > --Weikuan > > > > > > TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to > > 192.168.225.77 > > (192.168 > > .225.77) port 0 AF_INET > > Recv Send Send Utilization > > Service > > Demand > > Socket Socket Message Elapsed Send Recv Send > > Recv > > Size Size Size Time Throughput local > remote local > > remote > > bytes bytes bytes secs. 
10^6bits/s % S % S us/KB > > us/KB > > > > 131072 131072 131072 10.00 4918.95 21.29 24.99 > 1.418 > > 1.665 > > > > > > Scott Weitzenkamp (sweitzen) wrote: > > > Jim, > > > > > > I am trying OFED-1.3-20071231-0600 and RHEL4 x86_64 on a dual CPU > > > (single core each CPU) Xeon system. I do not see any performance > > > improvement (either throughput or CPU utilization) using > > netperf when > > I > > > set /sys/module/ib_sdp/sdp_zcopy_thresh to 16384. Can > you elaborate > > on > > > your HCA type, and performance improvement you see? > > > > > > Here's an example netperf command line when using a > Cheetah DDR HCA > > and > > > 1.2.917 firmware (I have also tried ConnectX and 2.3.000 firmware > > too): > > > > > > [releng at svbu-qa1850-2 ~]$ LD_PRELOAD=libsdp.so netperf241 > -v2 -4 -H > > > 192.168.1.201 -l 30 -t TCP_STREAM -c -C -- -m 65536 > > > TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to > > 192.168.1.201 > > > (192.168.1.201) port 0 AF_INET : histogram : demo > > > > > > Recv Send Send Utilization > > Service > > > Demand > > > Socket Socket Message Elapsed Send > Recv Send > > > Recv > > > Size Size Size Time Throughput local > remote local > > > remote > > > bytes bytes bytes secs. 
10^6bits/s % S % S > us/KB > > > us/KB > > > > > > 87380 16384 65536 30.01 7267.70 55.06 > 61.27 1.241 > > > 1.381 > > > > > > Alignment Offset Bytes Bytes Sends Bytes > > > Recvs > > > Local Remote Local Remote Xfered Per Per > > > Send Recv Send Recv Send (avg) > > Recv (avg) > > > 8 8 0 0 2.726e+10 65536.00 415942 > > 48106.01 > > > 566648 > > > > > > _______________________________________________ general mailing list general at lists.openfabrics.org http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general From sweitzen at cisco.com Fri Jan 25 08:35:42 2008 From: sweitzen at cisco.com (Scott Weitzenkamp (sweitzen)) Date: Fri, 25 Jan 2008 08:35:42 -0800 Subject: [ofa-general] RE: [ewg] Not seeing any SDP performance changes inOFED 1.3 beta, and I get Oops when enabling sdp_zcopy_thresh In-Reply-To: <005301c85f5d$e03e36b0$a0baa410$@rr.com> References: <47445630.10000@dev.mellanox.co.il> <000001c8338c$206e80d0$614b8270$@rr.com> <4798A9F5.7030109@gmail.com> <005301c85f5d$e03e36b0$a0baa410$@rr.com> Message-ID: > So I see your results (sort of). I have been using the > netperf that ships with the OS (Rhat4u4 and Rhat5 mostly) or > is built with > default options. Maybe that is the difference. Jim, AFAIK Red Hat does not ship netperf with RHEL. Scott From vuhuong at mellanox.com Fri Jan 25 09:00:21 2008 From: vuhuong at mellanox.com (Vu Pham) Date: Fri, 25 Jan 2008 09:00:21 -0800 Subject: [ofa-general] Re: [PATCH] drivers/infiniband/ulp/srpt: Fix target data corruption In-Reply-To: <4787991F.mailCSZ16MSG5@systemfabricworks.com> References: <4787991F.mailCSZ16MSG5@systemfabricworks.com> Message-ID: <479A15A5.3000809@mellanox.com> davem at systemfabricworks.com wrote: > Change the local buffer allocator to use a spin-lock protected linked > list instead of an array of atomic_t used/free variables. The atomic_t > code was open to a multi-thread race between test and set. 
This has > been observed with the result that the same data buffer was used for > more than one SCSI operation, either writing the wrong data to the disk > or sending the wrong data to the initiator. > > Signed-off-by: Robert Pearson > Signed-off-by: David A. McMillen > Applied. Thanks From vuhuong at mellanox.com Fri Jan 25 09:00:36 2008 From: vuhuong at mellanox.com (Vu Pham) Date: Fri, 25 Jan 2008 09:00:36 -0800 Subject: [ofa-general] Re: [PATCH] drivers/infiniband/ulp/srpt: Fix target data corruption In-Reply-To: <4787991F.mailCSZ16MSG5@systemfabricworks.com> References: <4787991F.mailCSZ16MSG5@systemfabricworks.com> Message-ID: <479A15B4.7030309@mellanox.com> davem at systemfabricworks.com wrote: > Change the local buffer allocator to use a spin-lock protected linked > list instead of an array of atomic_t used/free variables. The atomic_t > code was open to a multi-thread race between test and set. This has > been observed with the result that the same data buffer was used for > more than one SCSI operation, either writing the wrong data to the disk > or sending the wrong data to the initiator. > > Signed-off-by: Robert Pearson > Signed-off-by: David A. McMillen > Applied. Thanks
From rdreier at cisco.com Fri Jan 25 09:13:07 2008 From: rdreier at cisco.com (Roland Dreier) Date: Fri, 25 Jan 2008 09:13:07 -0800 Subject: [ofa-general] Re: [PATCH] ib/limthca: Remove an always true condition In-Reply-To: <4e6a6b3c0801250343y2b551817k4fb62bff9445542e@mail.gmail.com> (Eli Cohen's message of "Fri, 25 Jan 2008 13:43:53 +0200") References: <1201184628.6755.9.camel@mtls03> <4e6a6b3c0801250343y2b551817k4fb62bff9445542e@mail.gmail.com> Message-ID: > > why is first_free always >= 0? I don't see anything that guarantees > > that, > > The following two subsequent ifs gurantees that. > > if (ind < 0) { > err = -1; > *bad_wr = wr; > break; > } > > wqe = get_wqe(srq, ind); > next_ind = *wqe_to_link(wqe); > > if (next_ind < 0) { > err = -1; > *bad_wr = wr; > break; > } Duh... I missed that. Thanks for the clue. but now am I wrong to think that we could remove the first test of ind (not next_ind) in the fast path? the second test guarantees that ind never becomes negative, as you pointed out. - R. From swise at opengridcomputing.com Fri Jan 25 09:22:59 2008 From: swise at opengridcomputing.com (Steve Wise) Date: Fri, 25 Jan 2008 11:22:59 -0600 Subject: [ofa-general] [Fwd: [PATCH 2.6.25] RDMA/cxgb3: Fix the T3A workaround checks.] Message-ID: <479A1AF3.9020808@opengridcomputing.com> Vlad, Please pull this patch from git://www.openfabrics.org/~swise/ofed-1.3 ofed_kernel This has been accepted upstream and is needed for ofed-1.3 to support new device types. Thanks, Steve. -------- Original Message -------- Subject: [PATCH 2.6.25] RDMA/cxgb3: Fix the T3A workaround checks. Date: Thu, 24 Jan 2008 16:30:16 -0600 From: Steve Wise To: rdreier at cisco.com CC: netdev at vger.kernel.org, linux-kernel at vger.kernel.org, general at lists.openfabrics.org RDMA/cxgb3: Fix the T3A workaround checks. Correctly work around T3A issues by checking "hwtype != T3A" instead of "hwtype == T3B". Needed for new hw types.
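The reasoning in the description above can be sketched as follows. This is only an illustration, not driver code: the enum values are stand-ins for the driver's real hardware type constants, and HW_T3C is a hypothetical later type that the patch does not name. The two helpers mirror the ovfl_mode selection in cxio_create_cq() before and after the change.

```c
#include <assert.h>

/* Stand-in hardware types (assumption). T3A and T3B appear in the
 * patch; HW_T3C is a hypothetical newer type used for illustration. */
enum hw_type { HW_T3A, HW_T3B, HW_T3C };

/* Old check: only T3B is treated as fixed hardware, so any newer
 * type falls through to the T3A workaround branch. */
static int ovfl_mode_old(enum hw_type type)
{
	return type == HW_T3B ? 0 : 1;
}

/* New check: only T3A takes the workaround branch. */
static int ovfl_mode_new(enum hw_type type)
{
	return type != HW_T3A ? 0 : 1;
}

/* Compare both checks across the three stand-in types. */
static void check_t3a_workaround(void)
{
	/* Both checks agree on the two types that exist today. */
	assert(ovfl_mode_old(HW_T3A) == 1 && ovfl_mode_new(HW_T3A) == 1);
	assert(ovfl_mode_old(HW_T3B) == 0 && ovfl_mode_new(HW_T3B) == 0);

	/* They differ only for a later type: the old check would apply
	 * the T3A workaround to it, the new check does not. */
	assert(ovfl_mode_old(HW_T3C) == 1);
	assert(ovfl_mode_new(HW_T3C) == 0);
}
```

Calling check_t3a_workaround() from a small test program exercises all three cases. The same pattern applies to the other three call sites the patch touches.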
Signed-off-by: Steve Wise --- drivers/infiniband/hw/cxgb3/cxio_hal.c | 2 +- drivers/infiniband/hw/cxgb3/iwch_cm.c | 4 ++-- drivers/infiniband/hw/cxgb3/iwch_provider.c | 2 +- 3 files changed, 4 insertions(+), 4 deletions(-) diff --git a/drivers/infiniband/hw/cxgb3/cxio_hal.c b/drivers/infiniband/hw/cxgb3/cxio_hal.c index eec6a30..e220b44 100644 --- a/drivers/infiniband/hw/cxgb3/cxio_hal.c +++ b/drivers/infiniband/hw/cxgb3/cxio_hal.c @@ -179,7 +179,7 @@ int cxio_create_cq(struct cxio_rdev *rdev_p, struct t3_cq *cq) setup.size = 1UL << cq->size_log2; setup.credits = 65535; setup.credit_thres = 1; - if (rdev_p->t3cdev_p->type == T3B) + if (rdev_p->t3cdev_p->type != T3A) setup.ovfl_mode = 0; else setup.ovfl_mode = 1; diff --git a/drivers/infiniband/hw/cxgb3/iwch_cm.c b/drivers/infiniband/hw/cxgb3/iwch_cm.c index 20ba372..f8cb0fe 100644 --- a/drivers/infiniband/hw/cxgb3/iwch_cm.c +++ b/drivers/infiniband/hw/cxgb3/iwch_cm.c @@ -1118,7 +1118,7 @@ static int act_open_rpl(struct t3cdev *tdev, struct sk_buff *skb, void *ctx) status2errno(rpl->status)); connect_reply_upcall(ep, status2errno(rpl->status)); state_set(&ep->com, DEAD); - if (ep->com.tdev->type == T3B && act_open_has_tid(rpl->status)) + if (ep->com.tdev->type != T3A && act_open_has_tid(rpl->status)) release_tid(ep->com.tdev, GET_TID(rpl), NULL); cxgb3_free_atid(ep->com.tdev, ep->atid); dst_release(ep->dst); @@ -1249,7 +1249,7 @@ static void reject_cr(struct t3cdev *tdev, u32 hwtid, __be32 peer_ip, skb_trim(skb, sizeof(struct cpl_tid_release)); skb_get(skb); - if (tdev->type == T3B) + if (tdev->type != T3A) release_tid(tdev, hwtid, skb); else { struct cpl_pass_accept_rpl *rpl; diff --git a/drivers/infiniband/hw/cxgb3/iwch_provider.c b/drivers/infiniband/hw/cxgb3/iwch_provider.c index 69b1204..df1838f 100644 --- a/drivers/infiniband/hw/cxgb3/iwch_provider.c +++ b/drivers/infiniband/hw/cxgb3/iwch_provider.c @@ -646,7 +646,7 @@ static struct ib_mr *iwch_reg_user_mr(struct ib_pd *pd, u64 start, u64 length, if (err) goto 
err; - if (udata && t3b_device(rhp)) { + if (udata && !t3a_device(rhp)) { uresp.pbl_addr = (mhp->attr.pbl_addr - rhp->rdev.rnic_info.pbl_base) >> 3; PDBG("%s user resp pbl_addr 0x%x\n", __FUNCTION__, -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo at vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/ From jeff at splitrockpr.com Fri Jan 25 10:21:22 2008 From: jeff at splitrockpr.com (Jeffrey Scott) Date: Fri, 25 Jan 2008 10:21:22 -0800 Subject: [ofa-general] invitation to OFA Sonoma Workshop Message-ID: <80283F0A16C147E695D4767FD71843D9@Gaucho> The OFA is hosting the 4th Annual International Sonoma Workshop from April 7-9 at The Lodge at Sonoma. To kickoff the event, OFA is hosting a cocktail reception on the evening of April 6. Registration for the Sonoma Workshop costs $595. An Early Bird discount is available through February 29. The discounted rate is $495. You may register for the event and book a hotel room here . An invitation to the event is attached. Please forward this to co-workers, professional acquaintances, and anyone else who might be interested in attending the event. ----------------------------------- Jeffrey Scott Split Rock Communications 408-884-4017 408-348-3651 Mobile 408-884-3900 Fax www.SplitRockPR.com -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: Sonoma Invite 2008.pdf Type: application/pdf Size: 662978 bytes Desc: not available URL: From swise at opengridcomputing.com Fri Jan 25 10:58:35 2008 From: swise at opengridcomputing.com (Steve Wise) Date: Fri, 25 Jan 2008 12:58:35 -0600 Subject: [ofa-general] help on /sys/class structure and rdma In-Reply-To: References: <4799064D.4060705@opengridcomputing.com> Message-ID: <479A315B.4010900@opengridcomputing.com> Roland Dreier wrote: > I don't have any file like that in my system either. Not sure where > it might be coming from for you. > > However, I do have /sys/class/infiniband_verbs/uverbs1/ibdev (which > should be present everywhere), and that has "cxgb3_0" in it, which you > could use to get to /sys/class/infiniband/cxgb3_0/. > > libibverbs does set up ibv_device.ibdev_path with that path, but it > happens too late for your library initialization function > unfortunately. So you can either wait until someone tries to create a > context and fail/warn then about firmware version, or you can > duplicate the libibverbs code to come up with that ibdev_path. > > - R. How does this look? > /* > * Verify that the firmware major number matches. Major number > * mismatches are fatal. Minor number mismatches are tolerated. 
> */
> if (ibv_read_sysfs_file(uverbs_sys_path, "ibdev",
>                         ibdev, sizeof ibdev) < 0)
>         return NULL;
>
> memset(devstr, 0, sizeof devstr);
> snprintf(devstr, sizeof devstr, "%s/class/infiniband/%s",
>          ibv_get_sysfs_path(), ibdev);
> if (ibv_read_sysfs_file(devstr, "fw_ver", value, sizeof value) < 0)
>         return NULL;
>

From jim at mellanox.com Fri Jan 25 11:09:15 2008
From: jim at mellanox.com (Jim Mott)
Date: Fri, 25 Jan 2008 11:09:15 -0800
Subject: [ofa-general] RE: [ewg] Not seeing any SDP performance changes inOFED 1.3 beta, and I get Oops when enabling sdp_zcopy_thresh
In-Reply-To:
References: <47445630.10000@dev.mellanox.co.il> <000001c8338c$206e80d0$614b8270$@rr.com> <4798A9F5.7030109@gmail.com> <005301c85f5d$e03e36b0$a0baa410$@rr.com>
Message-ID:

Right you are (as usual).

Hunting around these systems shows that I have been using netperf-2.4.3
for testing. No configuration options; just ./configure; make; make
install.

To try and understand version differences, I installed 2.4.1 (your
version?), 2.4.3, and 2.4.4. Built them with default options and ran
the tests using each.

Using netperf-2.4.1 and reran "netperf -v2 -4 -H 193.168.10.143 -l 30 -t
TCP_STREAM -c -C -- -m size" with target AMD and driver as 8-processor
Intel:

         64K       128K      1M
SDP      7749.66   6925.68   6281.17
BZCOPY   8492.85   9867.06   11105.50

I tried running these tests a few times and saw a lot of variance in the
reported results. Reloading 2.4.3 and running the same tests:

         64K       128K      1M
SDP      7553.77   6747.58   5986.42
BZCOPY   8839.46   9572.49   10654.52

and finally, I tried 2.4.4 and running the same tests:

         64K       128K      1M
SDP      7935.97   6325.69   7682.65
BZCOPY   8905.94   9935.45   10615.03

At this point, I am confused. The difference between SDP with and
without Bzcopy is obvious in all three sets of numbers. I can not
explain why you see something different.

If you could try a vanilla netperf build, it would be interesting to see
if you get any different results.

Thanks,
JIm

Jim Mott
Mellanox Technologies Ltd.
mail: jim at mellanox.com Phone: 512-294-5481 -----Original Message----- From: Scott Weitzenkamp (sweitzen) [mailto:sweitzen at cisco.com] Sent: Friday, January 25, 2008 10:36 AM To: Jim Mott; Jim Mott; Weikuan Yu Cc: general at lists.openfabrics.org Subject: RE: [ofa-general] RE: [ewg] Not seeing any SDP performance changes inOFED 1.3 beta, and I get Oops when enabling sdp_zcopy_thresh > So I see your results (sort of). I have been using the > netperf that ships with the OS (Rhat4u4 and Rhat5 mostly) or > is built with > default options. Maybe that is the difference. Jim, AFAIK Red Hat does not ship netperf with RHEL. Scott From fenkes at de.ibm.com Fri Jan 25 12:11:11 2008 From: fenkes at de.ibm.com (Joachim Fenkes) Date: Fri, 25 Jan 2008 21:11:11 +0100 Subject: [ofa-general] [PATCH 0/2] IB/ehca: PMA support and a minor fix Message-ID: <200801252111.11915.fenkes@de.ibm.com> This patchset will fix a minor issue and then add support for Performance MADs, which redirects all PMA queries to the actual PMA QP. [1/2] adds a missing query_pma_attr() [2/2] adds PMA redirection code The patches will apply, in order, on top of Roland's for-2.6.25 branch. Please review them and apply for 2.6.25 if you think they're okay. Thanks and regards, Joachim -- Joachim Fenkes -- eHCA Linux Driver Developer and Hardware Tamer IBM Deutschland Entwicklung GmbH -- Dept. 3627 (I/O Firmware Dev. 
2) Schoenaicher Strasse 220 -- 71032 Boeblingen -- Germany eMail: fenkes at de.ibm.com From fenkes at de.ibm.com Fri Jan 25 12:12:39 2008 From: fenkes at de.ibm.com (Joachim Fenkes) Date: Fri, 25 Jan 2008 21:12:39 +0100 Subject: [ofa-general] [PATCH 1/2] IB/ehca: Update sma_attr also in case of disruptive config change In-Reply-To: <200801252111.11915.fenkes@de.ibm.com> References: <200801252111.11915.fenkes@de.ibm.com> Message-ID: <200801252112.39557.fenkes@de.ibm.com> Signed-off-by: Joachim Fenkes --- drivers/infiniband/hw/ehca/ehca_irq.c | 2 ++ 1 files changed, 2 insertions(+), 0 deletions(-) diff --git a/drivers/infiniband/hw/ehca/ehca_irq.c b/drivers/infiniband/hw/ehca/ehca_irq.c index 863b34f..b5ca94c 100644 --- a/drivers/infiniband/hw/ehca/ehca_irq.c +++ b/drivers/infiniband/hw/ehca/ehca_irq.c @@ -403,6 +403,8 @@ static void parse_ec(struct ehca_shca *shca, u64 eqe) sport->port_state = IB_PORT_ACTIVE; dispatch_port_event(shca, port, IB_EVENT_PORT_ACTIVE, "is active"); + ehca_query_sma_attr(shca, port, + &sport->saved_attr); } else notify_port_conf_change(shca, port); break; -- 1.5.2 From fenkes at de.ibm.com Fri Jan 25 12:18:27 2008 From: fenkes at de.ibm.com (Joachim Fenkes) Date: Fri, 25 Jan 2008 21:18:27 +0100 Subject: [ofa-general] [PATCH 2/2] IB/ehca: Add PMA support In-Reply-To: <200801252111.11915.fenkes@de.ibm.com> References: <200801252111.11915.fenkes@de.ibm.com> Message-ID: <200801252118.28122.fenkes@de.ibm.com> From: Hoang-Nam Nguyen This patch enables ehca to redirect any PMA queries to the actual PMA QP. 
Signed-off-by: Hoang-Nam Nguyen Reviewed-by: Joachim Fenkes Reviewed-by: Christoph Raisch --- drivers/infiniband/hw/ehca/ehca_classes.h | 1 + drivers/infiniband/hw/ehca/ehca_iverbs.h | 5 ++ drivers/infiniband/hw/ehca/ehca_main.c | 2 +- drivers/infiniband/hw/ehca/ehca_sqp.c | 91 +++++++++++++++++++++++++++++ 4 files changed, 98 insertions(+), 1 deletions(-) diff --git a/drivers/infiniband/hw/ehca/ehca_classes.h b/drivers/infiniband/hw/ehca/ehca_classes.h index f281d16..92cce8a 100644 --- a/drivers/infiniband/hw/ehca/ehca_classes.h +++ b/drivers/infiniband/hw/ehca/ehca_classes.h @@ -101,6 +101,7 @@ struct ehca_sport { spinlock_t mod_sqp_lock; enum ib_port_state port_state; struct ehca_sma_attr saved_attr; + u32 pma_qp_nr; }; #define HCA_CAP_MR_PGSIZE_4K 0x80000000 diff --git a/drivers/infiniband/hw/ehca/ehca_iverbs.h b/drivers/infiniband/hw/ehca/ehca_iverbs.h index c469bfd..a8a2ea5 100644 --- a/drivers/infiniband/hw/ehca/ehca_iverbs.h +++ b/drivers/infiniband/hw/ehca/ehca_iverbs.h @@ -187,6 +187,11 @@ int ehca_dealloc_ucontext(struct ib_ucontext *context); int ehca_mmap(struct ib_ucontext *context, struct vm_area_struct *vma); +int ehca_process_mad(struct ib_device *ibdev, int mad_flags, u8 port_num, + struct ib_wc *in_wc, struct ib_grh *in_grh, + struct ib_mad *in_mad, + struct ib_mad *out_mad); + void ehca_poll_eqs(unsigned long data); int ehca_calc_ipd(struct ehca_shca *shca, int port, diff --git a/drivers/infiniband/hw/ehca/ehca_main.c b/drivers/infiniband/hw/ehca/ehca_main.c index 0fe0c84..33b5bac 100644 --- a/drivers/infiniband/hw/ehca/ehca_main.c +++ b/drivers/infiniband/hw/ehca/ehca_main.c @@ -472,7 +472,7 @@ int ehca_init_device(struct ehca_shca *shca) shca->ib_device.dealloc_fmr = ehca_dealloc_fmr; shca->ib_device.attach_mcast = ehca_attach_mcast; shca->ib_device.detach_mcast = ehca_detach_mcast; - /* shca->ib_device.process_mad = ehca_process_mad; */ + shca->ib_device.process_mad = ehca_process_mad; shca->ib_device.mmap = ehca_mmap; if 
(EHCA_BMASK_GET(HCA_CAP_SRQ, shca->hca_cap)) { diff --git a/drivers/infiniband/hw/ehca/ehca_sqp.c b/drivers/infiniband/hw/ehca/ehca_sqp.c index 79e72b2..706d97a 100644 --- a/drivers/infiniband/hw/ehca/ehca_sqp.c +++ b/drivers/infiniband/hw/ehca/ehca_sqp.c @@ -39,12 +39,18 @@ * POSSIBILITY OF SUCH DAMAGE. */ +#include #include "ehca_classes.h" #include "ehca_tools.h" #include "ehca_iverbs.h" #include "hcp_if.h" +#define IB_MAD_STATUS_REDIRECT __constant_htons(0x0002) +#define IB_MAD_STATUS_UNSUP_VERSION __constant_htons(0x0004) +#define IB_MAD_STATUS_UNSUP_METHOD __constant_htons(0x0008) + +#define IB_PMA_CLASS_PORT_INFO __constant_htons(0x0001) /** * ehca_define_sqp - Defines special queue pair 1 (GSI QP). When special queue @@ -83,6 +89,9 @@ u64 ehca_define_sqp(struct ehca_shca *shca, port, ret); return ret; } + shca->sport[port - 1].pma_qp_nr = pma_qp_nr; + ehca_dbg(&shca->ib_device, "port=%x pma_qp_nr=%x", + port, pma_qp_nr); break; default: ehca_err(&shca->ib_device, "invalid qp_type=%x", @@ -109,3 +118,85 @@ u64 ehca_define_sqp(struct ehca_shca *shca, return H_SUCCESS; } + +struct ib_perf { + struct ib_mad_hdr mad_hdr; + u8 reserved[40]; + u8 data[192]; +} __attribute__ ((packed)); + + +static int ehca_process_perf(struct ib_device *ibdev, u8 port_num, + struct ib_mad *in_mad, struct ib_mad *out_mad) +{ + struct ib_perf *in_perf = (struct ib_perf *)in_mad; + struct ib_perf *out_perf = (struct ib_perf *)out_mad; + struct ib_class_port_info *poi = + (struct ib_class_port_info *)out_perf->data; + struct ehca_shca *shca = + container_of(ibdev, struct ehca_shca, ib_device); + struct ehca_sport *sport = &shca->sport[port_num - 1]; + + ehca_dbg(ibdev, "method=%x", in_perf->mad_hdr.method); + + *out_mad = *in_mad; + + if (in_perf->mad_hdr.class_version != 1) { + ehca_warn(ibdev, "Unsupported class_version=%x", + in_perf->mad_hdr.class_version); + out_perf->mad_hdr.status = IB_MAD_STATUS_UNSUP_VERSION; + goto perf_reply; + } + + switch (in_perf->mad_hdr.method) { + 
case IB_MGMT_METHOD_GET: + case IB_MGMT_METHOD_SET: + /* set class port info for redirection */ + out_perf->mad_hdr.attr_id = IB_PMA_CLASS_PORT_INFO; + out_perf->mad_hdr.status = IB_MAD_STATUS_REDIRECT; + memset(poi, 0, sizeof(*poi)); + poi->base_version = 1; + poi->class_version = 1; + poi->resp_time_value = 18; + poi->redirect_lid = sport->saved_attr.lid; + poi->redirect_qp = sport->pma_qp_nr; + poi->redirect_qkey = IB_QP1_QKEY; + poi->redirect_pkey = IB_DEFAULT_PKEY_FULL; + + ehca_dbg(ibdev, "ehca_pma_lid=%x ehca_pma_qp=%x", + sport->saved_attr.lid, sport->pma_qp_nr); + break; + + case IB_MGMT_METHOD_GET_RESP: + return IB_MAD_RESULT_FAILURE; + + default: + out_perf->mad_hdr.status = IB_MAD_STATUS_UNSUP_METHOD; + break; + } + +perf_reply: + out_perf->mad_hdr.method = IB_MGMT_METHOD_GET_RESP; + + return IB_MAD_RESULT_SUCCESS | IB_MAD_RESULT_REPLY; +} + +int ehca_process_mad(struct ib_device *ibdev, int mad_flags, u8 port_num, + struct ib_wc *in_wc, struct ib_grh *in_grh, + struct ib_mad *in_mad, + struct ib_mad *out_mad) +{ + int ret; + + if (!port_num || port_num > ibdev->phys_port_cnt) + return IB_MAD_RESULT_FAILURE; + + /* accept only pma request */ + if (in_mad->mad_hdr.mgmt_class != IB_MGMT_CLASS_PERF_MGMT) + return IB_MAD_RESULT_SUCCESS; + + ehca_dbg(ibdev, "port_num=%x src_qp=%x", port_num, in_wc->src_qp); + ret = ehca_process_perf(ibdev, port_num, in_mad, out_mad); + + return ret; +} -- 1.5.2 From rajouri.jammu at gmail.com Fri Jan 25 12:54:20 2008 From: rajouri.jammu at gmail.com (Rajouri Jammu) Date: Fri, 25 Jan 2008 12:54:20 -0800 Subject: [ofa-general] Zero byte rdma read causes REM_OP_ERROR In-Reply-To: <4799D9F7.4030607@dev.mellanox.co.il> References: <3307cdf90801242007y3ace39ccrb72d5f35c3a937e4@mail.gmail.com> <4799D9F7.4030607@dev.mellanox.co.il> Message-ID: <3307cdf90801251254p5983b62x687549bb793db39d@mail.gmail.com> I'm using rdma_cm and I don't set the qp_access_flags explicitly. 
I presume they are set correctly since non-zero length rdma reads complete successfully. I have also verified the data. the only place I set the privileges is when registering the memory region and I have them set at IBV_ACCESS_LOCAL_WRITE, _REMOTE_READ and _REMOTE_WRITE On Jan 25, 2008 4:45 AM, Dotan Barak wrote: > Rajouri Jammu wrote: > > When I try doing a zero byte rdma read I get a 10 > > (IB_WC_REM_ACCESS_ERR ) error completion. > > > > Is that expected? > > > > Non-zero byte reads complete successfully. > > > > I'm using OFED-1.2.5.4. > > > Did you enabled RDMA Read in the qp_access_flags (in modify QP > RESET->INIT)? > > Dotan >
From sweitzen at cisco.com Fri Jan 25 13:39:13 2008 From: sweitzen at cisco.com (Scott Weitzenkamp (sweitzen)) Date: Fri, 25 Jan 2008 13:39:13 -0800 Subject: [ofa-general] RE: [ewg] Not seeing any SDP performance changes inOFED 1.3 beta, and I get Oops when enabling sdp_zcopy_thresh In-Reply-To: References: <47445630.10000@dev.mellanox.co.il> <000001c8338c$206e80d0$614b8270$@rr.com> <4798A9F5.7030109@gmail.com> <005301c85f5d$e03e36b0$a0baa410$@rr.com> Message-ID: Jim, what kernel and HCA are these numbers for?
Scott > -----Original Message----- > From: Jim Mott [mailto:jim at mellanox.com] > Sent: Friday, January 25, 2008 11:09 AM > To: Scott Weitzenkamp (sweitzen); Weikuan Yu > Cc: general at lists.openfabrics.org > Subject: RE: [ofa-general] RE: [ewg] Not seeing any SDP > performance changes inOFED 1.3 beta, and I get Oops when > enabling sdp_zcopy_thresh > > Right you are (as usual). > > Hunting around these systems shows that I have been using > netperf-2.4.3 > for testing. No configuration options; just ./configure; make; make > install. > > To try and understand version differences, I installed 2.4.1 (your > version?), 2.4.3, and 2.4.4. Built them with default options and ran > the tests using each. > > Using netperf-2.4.1 and reran "netperf -v2 -4 -H > 193.168.10.143 -l 30 -t > TCP_STREAM -c -C -- -m size" with target AMD and driver as > 8-processor > Intel: > > 64K 128K 1M > SDP 7749.66 6925.68 6281.17 > BZCOPY 8492.85 9867.06 11105.50 > > I tried running these tests a few times and saw a lot of > variance in the > reported results. Reloading 2.4.3 and running the same tests: > > 64K 128K 1M > SDP 7553.77 6747.58 5986.42 > BZCOPY 8839.46 9572.49 10654.52 > > and finally, I tried 2.4.4 and running the same tests: > > 64K 128K 1M > SDP 7935.97 6325.69 7682.65 > BZCOPY 8905.94 9935.45 10615.03 > > At this point, I am confused. The difference between SDP with and > without Bzcopy is obvious in all three sets of numbers. I can not > explain why you see something different. > > If you could try a vanilla netperf build, it would be > interesting to see > if you get any different results. > > Thanks, > JIm > > Jim Mott > Mellanox Technologies Ltd. 
> mail: jim at mellanox.com > Phone: 512-294-5481 > > > -----Original Message----- > From: Scott Weitzenkamp (sweitzen) [mailto:sweitzen at cisco.com] > Sent: Friday, January 25, 2008 10:36 AM > To: Jim Mott; Jim Mott; Weikuan Yu > Cc: general at lists.openfabrics.org > Subject: RE: [ofa-general] RE: [ewg] Not seeing any SDP performance > changes inOFED 1.3 beta, and I get Oops when enabling sdp_zcopy_thresh > > > So I see your results (sort of). I have been using the > > netperf that ships with the OS (Rhat4u4 and Rhat5 mostly) or > > is built with > > default options. Maybe that is the difference. > > Jim, AFAIK Red Hat does not ship netperf with RHEL. > > Scott > From jim at mellanox.com Fri Jan 25 13:57:40 2008 From: jim at mellanox.com (Jim Mott) Date: Fri, 25 Jan 2008 13:57:40 -0800 Subject: [ofa-general] RE: [ewg] Not seeing any SDP performance changes inOFED 1.3 beta, and I get Oops when enabling sdp_zcopy_thresh In-Reply-To: References: <47445630.10000@dev.mellanox.co.il> <000001c8338c$206e80d0$614b8270$@rr.com> <4798A9F5.7030109@gmail.com> <005301c85f5d$e03e36b0$a0baa410$@rr.com> Message-ID: Receive side: - 2.6.23.8 kernel.org kernel on Rhat5 distro - HCA is MLX4 with 2.3.914 I get the same number on released 2.3 firmware Send side: - 2.6.9-42.ELsmp x86_64 (Rhat4u4) - HCA is MLX4 with 2.3.914 I get the same trends (SDP < BZCOPY if message_size > 64K) on unmodifed Rhat5, Rhat4u4, and SLES10-SP1-RT distros. I also see it on kernel.org kernels 2.6.23.12, 2.6.24-rc2, 2.6.23, and 2.6.22.9. I am in the midst of testing some things, so I do not have all the machines available right now to repeat most of the tests though. Thanks, JIm Jim Mott Mellanox Technologies Ltd. 
mail: jim at mellanox.com Phone: 512-294-5481 -----Original Message----- From: Scott Weitzenkamp (sweitzen) [mailto:sweitzen at cisco.com] Sent: Friday, January 25, 2008 3:39 PM To: Jim Mott; Weikuan Yu Cc: general at lists.openfabrics.org Subject: RE: [ofa-general] RE: [ewg] Not seeing any SDP performance changes inOFED 1.3 beta, and I get Oops when enabling sdp_zcopy_thresh Jim, what kernel and HCA are these numbers for? Scott > -----Original Message----- > From: Jim Mott [mailto:jim at mellanox.com] > Sent: Friday, January 25, 2008 11:09 AM > To: Scott Weitzenkamp (sweitzen); Weikuan Yu > Cc: general at lists.openfabrics.org > Subject: RE: [ofa-general] RE: [ewg] Not seeing any SDP > performance changes inOFED 1.3 beta, and I get Oops when > enabling sdp_zcopy_thresh > > Right you are (as usual). > > Hunting around these systems shows that I have been using > netperf-2.4.3 > for testing. No configuration options; just ./configure; make; make > install. > > To try and understand version differences, I installed 2.4.1 (your > version?), 2.4.3, and 2.4.4. Built them with default options and ran > the tests using each. > > Using netperf-2.4.1 and reran "netperf -v2 -4 -H > 193.168.10.143 -l 30 -t > TCP_STREAM -c -C -- -m size" with target AMD and driver as > 8-processor > Intel: > > 64K 128K 1M > SDP 7749.66 6925.68 6281.17 > BZCOPY 8492.85 9867.06 11105.50 > > I tried running these tests a few times and saw a lot of > variance in the > reported results. Reloading 2.4.3 and running the same tests: > > 64K 128K 1M > SDP 7553.77 6747.58 5986.42 > BZCOPY 8839.46 9572.49 10654.52 > > and finally, I tried 2.4.4 and running the same tests: > > 64K 128K 1M > SDP 7935.97 6325.69 7682.65 > BZCOPY 8905.94 9935.45 10615.03 > > At this point, I am confused. The difference between SDP with and > without Bzcopy is obvious in all three sets of numbers. I can not > explain why you see something different. 
> > If you could try a vanilla netperf build, it would be > interesting to see > if you get any different results. > > Thanks, > JIm > > Jim Mott > Mellanox Technologies Ltd. > mail: jim at mellanox.com > Phone: 512-294-5481 > > > -----Original Message----- > From: Scott Weitzenkamp (sweitzen) [mailto:sweitzen at cisco.com] > Sent: Friday, January 25, 2008 10:36 AM > To: Jim Mott; Jim Mott; Weikuan Yu > Cc: general at lists.openfabrics.org > Subject: RE: [ofa-general] RE: [ewg] Not seeing any SDP performance > changes inOFED 1.3 beta, and I get Oops when enabling sdp_zcopy_thresh > > > So I see your results (sort of). I have been using the > > netperf that ships with the OS (Rhat4u4 and Rhat5 mostly) or > > is built with > > default options. Maybe that is the difference. > > Jim, AFAIK Red Hat does not ship netperf with RHEL. > > Scott > From jim at mellanox.com Fri Jan 25 14:07:20 2008 From: jim at mellanox.com (Jim Mott) Date: Fri, 25 Jan 2008 14:07:20 -0800 Subject: [ofa-general] RE: [ewg] Not seeing any SDP performance changes inOFED 1.3 beta, and I get Oops when enabling sdp_zcopy_thresh In-Reply-To: References: <47445630.10000@dev.mellanox.co.il> <000001c8338c$206e80d0$614b8270$@rr.com> <4798A9F5.7030109@gmail.com> <005301c85f5d$e03e36b0$a0baa410$@rr.com> Message-ID: Not today, but I will give it a shot next time I get a free machine. I have tested between Rhat4u4 MLX4 and Rhat4u4 mthca and seen the same trend though. Thanks, JIm Jim Mott Mellanox Technologies Ltd. mail: jim at mellanox.com Phone: 512-294-5481 -----Original Message----- From: Scott Weitzenkamp (sweitzen) [mailto:sweitzen at cisco.com] Sent: Friday, January 25, 2008 4:03 PM To: Jim Mott; Weikuan Yu Cc: general at lists.openfabrics.org Subject: RE: [ofa-general] RE: [ewg] Not seeing any SDP performance changes inOFED 1.3 beta, and I get Oops when enabling sdp_zcopy_thresh Is there any way you can make sender and receiver the same RHEL kernel? 
> -----Original Message----- > From: Jim Mott [mailto:jim at mellanox.com] > Sent: Friday, January 25, 2008 1:58 PM > To: Scott Weitzenkamp (sweitzen); Weikuan Yu > Cc: general at lists.openfabrics.org > Subject: RE: [ofa-general] RE: [ewg] Not seeing any SDP > performance changes inOFED 1.3 beta, and I get Oops when > enabling sdp_zcopy_thresh > > Receive side: > - 2.6.23.8 kernel.org kernel on Rhat5 distro > - HCA is MLX4 with 2.3.914 > I get the same number on released 2.3 firmware > > Send side: > - 2.6.9-42.ELsmp x86_64 (Rhat4u4) > - HCA is MLX4 with 2.3.914 > > I get the same trends (SDP < BZCOPY if message_size > 64K) on > unmodifed > Rhat5, Rhat4u4, and SLES10-SP1-RT distros. I also see it on > kernel.org > kernels 2.6.23.12, 2.6.24-rc2, 2.6.23, and 2.6.22.9. I am in > the midst > of testing some things, so I do not have all the machines available > right now to repeat most of the tests though. > > > Thanks, > JIm > > Jim Mott > Mellanox Technologies Ltd. > mail: jim at mellanox.com > Phone: 512-294-5481 > > > -----Original Message----- > From: Scott Weitzenkamp (sweitzen) [mailto:sweitzen at cisco.com] > Sent: Friday, January 25, 2008 3:39 PM > To: Jim Mott; Weikuan Yu > Cc: general at lists.openfabrics.org > Subject: RE: [ofa-general] RE: [ewg] Not seeing any SDP performance > changes inOFED 1.3 beta, and I get Oops when enabling sdp_zcopy_thresh > > Jim, what kernel and HCA are these numbers for? > > Scott > > > > > -----Original Message----- > > From: Jim Mott [mailto:jim at mellanox.com] > > Sent: Friday, January 25, 2008 11:09 AM > > To: Scott Weitzenkamp (sweitzen); Weikuan Yu > > Cc: general at lists.openfabrics.org > > Subject: RE: [ofa-general] RE: [ewg] Not seeing any SDP > > performance changes inOFED 1.3 beta, and I get Oops when > > enabling sdp_zcopy_thresh > > > > Right you are (as usual). > > > > Hunting around these systems shows that I have been using > > netperf-2.4.3 > > for testing. 
No configuration options; just ./configure; make; make > > install. > > > > To try and understand version differences, I installed 2.4.1 (your > > version?), 2.4.3, and 2.4.4. Built them with default > options and ran > > the tests using each. > > > > Using netperf-2.4.1 and reran "netperf -v2 -4 -H > > 193.168.10.143 -l 30 -t > > TCP_STREAM -c -C -- -m size" with target AMD and driver as > > 8-processor > > Intel: > > > > 64K 128K 1M > > SDP 7749.66 6925.68 6281.17 > > BZCOPY 8492.85 9867.06 11105.50 > > > > I tried running these tests a few times and saw a lot of > > variance in the > > reported results. Reloading 2.4.3 and running the same tests: > > > > 64K 128K 1M > > SDP 7553.77 6747.58 5986.42 > > BZCOPY 8839.46 9572.49 10654.52 > > > > and finally, I tried 2.4.4 and running the same tests: > > > > 64K 128K 1M > > SDP 7935.97 6325.69 7682.65 > > BZCOPY 8905.94 9935.45 10615.03 > > > > At this point, I am confused. The difference between SDP with and > > without Bzcopy is obvious in all three sets of numbers. I can not > > explain why you see something different. > > > > If you could try a vanilla netperf build, it would be > > interesting to see > > if you get any different results. > > > > Thanks, > > JIm > > > > Jim Mott > > Mellanox Technologies Ltd. > > mail: jim at mellanox.com > > Phone: 512-294-5481 > > > > > > -----Original Message----- > > From: Scott Weitzenkamp (sweitzen) [mailto:sweitzen at cisco.com] > > Sent: Friday, January 25, 2008 10:36 AM > > To: Jim Mott; Jim Mott; Weikuan Yu > > Cc: general at lists.openfabrics.org > > Subject: RE: [ofa-general] RE: [ewg] Not seeing any SDP performance > > changes inOFED 1.3 beta, and I get Oops when enabling > sdp_zcopy_thresh > > > > > So I see your results (sort of). I have been using the > > > netperf that ships with the OS (Rhat4u4 and Rhat5 mostly) or > > > is built with > > > default options. Maybe that is the difference. > > > > Jim, AFAIK Red Hat does not ship netperf with RHEL. 
> > > > Scott > > > From sweitzen at cisco.com Fri Jan 25 14:02:39 2008 From: sweitzen at cisco.com (Scott Weitzenkamp (sweitzen)) Date: Fri, 25 Jan 2008 14:02:39 -0800 Subject: [ofa-general] RE: [ewg] Not seeing any SDP performance changes inOFED 1.3 beta, and I get Oops when enabling sdp_zcopy_thresh In-Reply-To: References: <47445630.10000@dev.mellanox.co.il> <000001c8338c$206e80d0$614b8270$@rr.com> <4798A9F5.7030109@gmail.com> <005301c85f5d$e03e36b0$a0baa410$@rr.com> Message-ID: Is there any way you can make sender and receiver the same RHEL kernel? > -----Original Message----- > From: Jim Mott [mailto:jim at mellanox.com] > Sent: Friday, January 25, 2008 1:58 PM > To: Scott Weitzenkamp (sweitzen); Weikuan Yu > Cc: general at lists.openfabrics.org > Subject: RE: [ofa-general] RE: [ewg] Not seeing any SDP > performance changes inOFED 1.3 beta, and I get Oops when > enabling sdp_zcopy_thresh > > Receive side: > - 2.6.23.8 kernel.org kernel on Rhat5 distro > - HCA is MLX4 with 2.3.914 > I get the same number on released 2.3 firmware > > Send side: > - 2.6.9-42.ELsmp x86_64 (Rhat4u4) > - HCA is MLX4 with 2.3.914 > > I get the same trends (SDP < BZCOPY if message_size > 64K) on > unmodifed > Rhat5, Rhat4u4, and SLES10-SP1-RT distros. I also see it on > kernel.org > kernels 2.6.23.12, 2.6.24-rc2, 2.6.23, and 2.6.22.9. I am in > the midst > of testing some things, so I do not have all the machines available > right now to repeat most of the tests though. > > > Thanks, > JIm > > Jim Mott > Mellanox Technologies Ltd. > mail: jim at mellanox.com > Phone: 512-294-5481 > > > -----Original Message----- > From: Scott Weitzenkamp (sweitzen) [mailto:sweitzen at cisco.com] > Sent: Friday, January 25, 2008 3:39 PM > To: Jim Mott; Weikuan Yu > Cc: general at lists.openfabrics.org > Subject: RE: [ofa-general] RE: [ewg] Not seeing any SDP performance > changes inOFED 1.3 beta, and I get Oops when enabling sdp_zcopy_thresh > > Jim, what kernel and HCA are these numbers for? 
> > Scott > > > > > -----Original Message----- > > From: Jim Mott [mailto:jim at mellanox.com] > > Sent: Friday, January 25, 2008 11:09 AM > > To: Scott Weitzenkamp (sweitzen); Weikuan Yu > > Cc: general at lists.openfabrics.org > > Subject: RE: [ofa-general] RE: [ewg] Not seeing any SDP > > performance changes inOFED 1.3 beta, and I get Oops when > > enabling sdp_zcopy_thresh > > > > Right you are (as usual). > > > > Hunting around these systems shows that I have been using > > netperf-2.4.3 > > for testing. No configuration options; just ./configure; make; make > > install. > > > > To try and understand version differences, I installed 2.4.1 (your > > version?), 2.4.3, and 2.4.4. Built them with default > options and ran > > the tests using each. > > > > Using netperf-2.4.1 and reran "netperf -v2 -4 -H > > 193.168.10.143 -l 30 -t > > TCP_STREAM -c -C -- -m size" with target AMD and driver as > > 8-processor > > Intel: > > > > 64K 128K 1M > > SDP 7749.66 6925.68 6281.17 > > BZCOPY 8492.85 9867.06 11105.50 > > > > I tried running these tests a few times and saw a lot of > > variance in the > > reported results. Reloading 2.4.3 and running the same tests: > > > > 64K 128K 1M > > SDP 7553.77 6747.58 5986.42 > > BZCOPY 8839.46 9572.49 10654.52 > > > > and finally, I tried 2.4.4 and running the same tests: > > > > 64K 128K 1M > > SDP 7935.97 6325.69 7682.65 > > BZCOPY 8905.94 9935.45 10615.03 > > > > At this point, I am confused. The difference between SDP with and > > without Bzcopy is obvious in all three sets of numbers. I can not > > explain why you see something different. > > > > If you could try a vanilla netperf build, it would be > > interesting to see > > if you get any different results. > > > > Thanks, > > JIm > > > > Jim Mott > > Mellanox Technologies Ltd. 
> > mail: jim at mellanox.com > > Phone: 512-294-5481 > > > > > > -----Original Message----- > > From: Scott Weitzenkamp (sweitzen) [mailto:sweitzen at cisco.com] > > Sent: Friday, January 25, 2008 10:36 AM > > To: Jim Mott; Jim Mott; Weikuan Yu > > Cc: general at lists.openfabrics.org > > Subject: RE: [ofa-general] RE: [ewg] Not seeing any SDP performance > > changes inOFED 1.3 beta, and I get Oops when enabling > sdp_zcopy_thresh > > > > > So I see your results (sort of). I have been using the > > > netperf that ships with the OS (Rhat4u4 and Rhat5 mostly) or > > > is built with > > > default options. Maybe that is the difference. > > > > Jim, AFAIK Red Hat does not ship netperf with RHEL. > > > > Scott > > > From rdreier at cisco.com Fri Jan 25 14:22:42 2008 From: rdreier at cisco.com (Roland Dreier) Date: Fri, 25 Jan 2008 14:22:42 -0800 Subject: [ofa-general] [GIT PULL] First InfiniBand/RDMA merge Message-ID: Linus, if you haven't headed off to the airport, please pull from master.kernel.org:/pub/scm/linux/kernel/git/roland/infiniband.git for-linus This tree is also available from kernel.org mirrors at: git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband.git for-linus To get the first batch of InfiniBand/RDMA merges: Adrian Bunk (1): IB/mthca: Remove MSI support as scheduled Anton Blanchard (1): IB/ehca: Use round_jiffies() for EQ polling timer Arthur Jones (2): IB/ipath: Better comment for rmb() in ipath_intr() IB/ipath: Add ipath_read_ireg() abstraction Dave Olson (9): IB/ipath: Improve interrupt handler cache footprint IB/ipath: Generalize some xxx_SHIFT macros IB/ipath: Changes for fields moving from devdata to portdata IB/ipath: Clean up some comments IB/ipath: Drop support for the original QHT7040 board IB/ipath: Remove unused MDIO interface code IB/ipath: Add new chip-specific functions to older chips, consistent init IB/ipath: Minor cleanup of unused fields and chip-specific errors IB/ipath: Changes to support PIO bandwidth check on IBA7220 
David Dillow (3): IB/srp: Respect target credit limit IB/srp: Enable SG list chaining IB/srp: Add identifying information to log messages Erez Zilber (3): IB/iser: update URLs of iSER docs IB/iser: Print information about unhandled RDMA CM events IB/iser: Add change_queue_depth method Hoang-Nam Nguyen (4): IB/ehca: Forward event client-reregister-required to registered clients IB/ehca: Remove CQ-QP-link before destroying QP in error path of create_qp() IB/ehca: Define array to store SMI/GSI QPs IB/ehca: Add "port connection autodetect mode" Jack Morgenstein (1): mlx4_core: Fix max_eqs masking in QUERY_DEV_CAP Jan Engelhardt (2): IPoIB: Constify seq_operations function pointer tables IB/ipath: Remove unnecessary cast Joachim Fenkes (1): IB/ehca: Prevent RDMA-related connection failures on some eHCA2 hardware Joe Perches (2): drivers/infiniband: Add missing "space" IB: Spelling fixes in comments John Gregor (1): IB/ipath: Fix sendctrl locking Krishna Kumar (1): IPoIB: Remove redundant check of netif_queue_stopped() in xmit handler Matthias Kaehlcke (1): IB/ipath: Convert ipath_eep_sem semaphore to a mutex Michael Albaugh (1): IB/ipath: New sysfs entries to control 7220 features Nick Piggin (1): IB/ipath: Convert from .nopage to .fault Olaf Kirch (2): IB/fmr_pool: Flush serial numbers can get out of sync IB/fmr_pool: ib_fmr_pool_flush() should flush all dirty FMRs Oliver Pinter (1): IB/iser: Typo fix (s/destory/destroy/) Pradeep Satyanarayana (2): IPoIB/cm: Add connected mode support for devices without SRQs IPoIB/CM: Enable SRQ support on HCAs that support fewer than 16 SG entries Ralph Campbell (16): IB/mad: Remove redundant NULL pointer check in ib_mad_recv_done_handler() IB/ipath: Enable loopback of DR SMP responses from userspace IB/ipath: Remove dead code for user process waiting for send buffer IB/ipath: Fix error returned from ib_resize_cq if new size smaller than # entries IB/ipath: Fix comments for ipath_create_srq() IB/ipath: Add the work completion error 
code to the QP error debug output IB/ipath: Fix RNR NAK handling IB/ipath: Cleanup ipath_get_egrbuf() IB/ipath: kreceive uses portdata rather than devdata IB/ipath: MAD performance sampling registers support IB/ipath: Export hardware counters more consistently IB/ipath: Allow more flexible user register alignments IB/ipath: Port config has on-chip effects for 7220 IB/ipath: Add flag and handling for chips with swapped register bug IB/ipath: Add mappings from HW register to PortInfo port physical state IB/ipath: Trivial simplification of ipath_make_ud_req() Roland Dreier (10): IB/ipath: Fix crash on unload introduced by sysfs changes IPoIB: Trivial formatting cleanups IPoIB/cm: Factor out ipoib_cm_free_rx_ring() IPoIB/cm: Factor out ipoib_cm_create_srq() IPoIB/cm: Factor out ipoib_cm_free_rx_reap_list() IB/mlx4: Micro-optimize mlx4_ib_poll_one() RDMA/cxgb3: Endianness annotation for irs field IB/ipath: Fix some sparse warnings about shadowed symbols IB/umad: Simplify and fix locking IB/mthca: Update latest "native Arbel" firmware revision Rolf Manderscheid (1): IPoIB: improve IPv4/IPv6 to IB mcast mapping functions Sean Hefty (6): IB/multicast: Report errors on multicast groups if P_key changes IB/mad: Report number of times a mad was retried IB/cm: Add basic performance counters IB/mad: Fix incorrect access to items on local_list RDMA/cma: add support for rdma_migrate_id() RDMA/cma: Override default responder_resources with user value Steve Welch (1): IB/mad: Enable loopback of DR SMP responses from userspace Steve Wise (7): RDMA/iwcm: Set initiator depth and responder resources to device max values RDMA/cxgb3: Hold rtnl_lock() around ethtool get_drvinfo call RDMA/cxgb3: Support version 5.0 firmware RDMA/cxgb3: Flush the receive queue when closing RDMA/cxgb3: Fix page shift calculation in build_phys_page_list() RDMA/cxgb3: Mark QP as privileged based on user capabilities RDMA/cxgb3: Fix the T3A workaround checks Vladimir Sokolovsky (1): RDMA/cma: Reenable device 
removal on passive side Documentation/feature-removal-schedule.txt | 10 - drivers/infiniband/core/cm.c | 306 ++++++++++++++++- drivers/infiniband/core/cma.c | 60 ++-- drivers/infiniband/core/fmr_pool.c | 33 ++- drivers/infiniband/core/mad.c | 26 +- drivers/infiniband/core/mad_priv.h | 3 +- drivers/infiniband/core/mad_rmpp.c | 2 +- drivers/infiniband/core/multicast.c | 55 +++- drivers/infiniband/core/smi.h | 18 +- drivers/infiniband/core/ucm.c | 37 +-- drivers/infiniband/core/ucma.c | 92 +++++ drivers/infiniband/core/user_mad.c | 115 +++---- drivers/infiniband/hw/cxgb3/cxio_hal.c | 4 +- drivers/infiniband/hw/cxgb3/cxio_wr.h | 5 +- drivers/infiniband/hw/cxgb3/iwch_cm.c | 4 +- drivers/infiniband/hw/cxgb3/iwch_mem.c | 7 + drivers/infiniband/hw/cxgb3/iwch_provider.c | 7 +- drivers/infiniband/hw/cxgb3/iwch_qp.c | 29 +-- drivers/infiniband/hw/ehca/ehca_av.c | 2 +- drivers/infiniband/hw/ehca/ehca_classes.h | 23 ++- drivers/infiniband/hw/ehca/ehca_cq.c | 2 +- drivers/infiniband/hw/ehca/ehca_irq.c | 38 ++- drivers/infiniband/hw/ehca/ehca_iverbs.h | 2 + drivers/infiniband/hw/ehca/ehca_main.c | 15 +- drivers/infiniband/hw/ehca/ehca_qp.c | 180 +++++++++- drivers/infiniband/hw/ehca/ehca_reqs.c | 112 +++++-- drivers/infiniband/hw/ehca/ehca_sqp.c | 6 +- drivers/infiniband/hw/ipath/ipath_common.h | 35 ++- drivers/infiniband/hw/ipath/ipath_cq.c | 2 +- drivers/infiniband/hw/ipath/ipath_debug.h | 4 +- drivers/infiniband/hw/ipath/ipath_driver.c | 180 ++++------ drivers/infiniband/hw/ipath/ipath_eeprom.c | 23 +- drivers/infiniband/hw/ipath/ipath_file_ops.c | 94 +++--- drivers/infiniband/hw/ipath/ipath_fs.c | 14 +- drivers/infiniband/hw/ipath/ipath_iba6110.c | 395 +++++++++++++++++++--- drivers/infiniband/hw/ipath/ipath_iba6120.c | 439 +++++++++++++++++++----- drivers/infiniband/hw/ipath/ipath_init_chip.c | 67 ++--- drivers/infiniband/hw/ipath/ipath_intr.c | 81 ++--- drivers/infiniband/hw/ipath/ipath_kernel.h | 201 ++++++++--- drivers/infiniband/hw/ipath/ipath_keys.c | 5 +- 
drivers/infiniband/hw/ipath/ipath_mad.c | 123 ++++--- drivers/infiniband/hw/ipath/ipath_qp.c | 6 +- drivers/infiniband/hw/ipath/ipath_rc.c | 18 +- drivers/infiniband/hw/ipath/ipath_registers.h | 33 +- drivers/infiniband/hw/ipath/ipath_ruc.c | 13 +- drivers/infiniband/hw/ipath/ipath_srq.c | 4 +- drivers/infiniband/hw/ipath/ipath_stats.c | 24 +- drivers/infiniband/hw/ipath/ipath_sysfs.c | 364 ++++++++++++++++++++ drivers/infiniband/hw/ipath/ipath_ud.c | 3 +- drivers/infiniband/hw/ipath/ipath_verbs.c | 55 ++- drivers/infiniband/hw/ipath/ipath_verbs.h | 12 + drivers/infiniband/hw/mlx4/cq.c | 9 +- drivers/infiniband/hw/mthca/mthca_dev.h | 13 +- drivers/infiniband/hw/mthca/mthca_eq.c | 6 +- drivers/infiniband/hw/mthca/mthca_main.c | 40 +-- drivers/infiniband/ulp/ipoib/ipoib.h | 184 ++++++---- drivers/infiniband/ulp/ipoib/ipoib_cm.c | 376 +++++++++++++++------ drivers/infiniband/ulp/ipoib/ipoib_fs.c | 4 +- drivers/infiniband/ulp/ipoib/ipoib_ib.c | 8 +- drivers/infiniband/ulp/ipoib/ipoib_main.c | 60 ++-- drivers/infiniband/ulp/ipoib/ipoib_multicast.c | 8 +- drivers/infiniband/ulp/ipoib/ipoib_verbs.c | 18 +- drivers/infiniband/ulp/iser/Kconfig | 4 +- drivers/infiniband/ulp/iser/iscsi_iser.c | 1 + drivers/infiniband/ulp/iser/iser_initiator.c | 2 +- drivers/infiniband/ulp/iser/iser_verbs.c | 8 +- drivers/infiniband/ulp/srp/ib_srp.c | 131 +++++--- drivers/infiniband/ulp/srp/ib_srp.h | 5 + drivers/net/mlx4/fw.c | 2 +- include/net/if_inet6.h | 11 +- include/net/ip.h | 10 +- include/rdma/ib_mad.h | 4 +- include/rdma/rdma_user_cm.h | 13 +- net/ipv4/arp.c | 2 +- net/ipv6/ndisc.c | 2 +- 75 files changed, 3124 insertions(+), 1185 deletions(-) From swise at opengridcomputing.com Fri Jan 25 14:42:39 2008 From: swise at opengridcomputing.com (Steve Wise) Date: Fri, 25 Jan 2008 16:42:39 -0600 Subject: [ofa-general] [GIT PULL ofed-1.2.5] - cxgb3 fixes In-Reply-To: References: Message-ID: <479A65DF.2070407@opengridcomputing.com> Vlad, Please pull these fixes for ofed-1.2.5 from: 
git://git.openfabrics.org/~swise/ofed-1.2.5 ofed_1_2_c > RDMA/cxgb3: Flush the receive queue when closing > RDMA/cxgb3: Fix page shift calculation in build_phys_page_list() > RDMA/cxgb3: Mark QP as privileged based on user capabilities > RDMA/cxgb3: Fix the T3A workaround checks These are all going upstream and in ofed-1.3 and I want to keep ofed-1.2.5 up to date as well. Can these make 1.2.5.5 by chance? Thanks, Steve. From rosnbrg at us.ibm.com Fri Jan 25 14:50:24 2008 From: rosnbrg at us.ibm.com (Bryan S Rosenburg) Date: Fri, 25 Jan 2008 17:50:24 -0500 Subject: [ofa-general] [PATCH 2/3] RDMA/cxgb3: fix page shift calculation in build_phys_page_list() In-Reply-To: Message-ID: On Mon Jan 21 12:39:36 PST 2008, Steve Wise wrote: > RDMA/cxgb3: fix page shift calculation in build_phys_page_list() > > The existing logic incorrectly maps this buffer list: > > 0: addr 0x10001000, size 0x1000 > 1: addr 0x10002000, size 0x1000 > > To this bogus page list: > > 0: 0x10000000 > 1: 0x10002000 > > The shift calculation must also take into account the address of the first > entry masked by the page_mask as well as the last address+size rounded > up to the next page size. I think the problem can still occur, even with the patch, if the buffer list has just one entry. A single entry (addr 0x10001000, size 0x2000) will get converted to page address 0x10000000 with a page size of 0x4000. The patch as it stands doesn't address the single buffer case, but in fact it allows the subsequent single-buffer special case to be eliminated entirely. 
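[Editorial sketch, not part of the original message.] The single-entry case described above can be reproduced with a small stand-alone program fragment. It is only a sketch under stated assumptions: `struct phys_buf`, `shift_single_special()` and `shift_from_mask()` are hypothetical names, the page size is assumed to be 4 KiB, and the way `mask` is accumulated is inferred from the patch description quoted in this thread (first address masked to a page boundary, last address+size rounded up to a page), not copied from the driver.

```c
#include <stdint.h>

#define PAGE_SHIFT 12                       /* assumption: 4 KiB base pages */
#define PAGE_SIZE  (1ULL << PAGE_SHIFT)
#define PAGE_MASK  (~(PAGE_SIZE - 1))

/* Hypothetical stand-in for the kernel's struct ib_phys_buf. */
struct phys_buf {
	uint64_t addr;
	uint64_t size;
};

/*
 * Single-buffer special case as it appears in the removed lines of the
 * diff: pick the smallest page that covers the whole buffer, counted
 * from the start of that page.
 */
static int shift_single_special(const struct phys_buf *b)
{
	int shift;

	for (shift = PAGE_SHIFT; shift < 27; ++shift)
		if ((1ULL << shift) >=
		    b->size + (b->addr & ((1ULL << shift) - 1)))
			break;
	return shift;
}

/*
 * General case: OR together every address that must fall on a page
 * boundary, then take the lowest set bit at or above PAGE_SHIFT.
 * The mask accumulation is an assumption based on the patch description.
 */
static int shift_from_mask(const struct phys_buf *list, int n)
{
	uint64_t mask = 0;
	int i, shift;

	for (i = 0; i < n; ++i) {
		/* start of each buffer (first one rounded down to a page) */
		mask |= (i == 0) ? (list[i].addr & PAGE_MASK) : list[i].addr;
		/* end of each buffer (last one rounded up to a page) */
		if (i != n - 1)
			mask |= list[i].addr + list[i].size;
		else
			mask |= (list[i].addr + list[i].size + PAGE_SIZE - 1) &
				PAGE_MASK;
	}
	for (shift = PAGE_SHIFT; shift < 27; ++shift)
		if ((1ULL << shift) & mask)
			break;
	return shift;
}
```

For the single entry (addr 0x10001000, size 0x2000), the special case yields a shift of 14, i.e. a 0x4000-byte page based at 0x10000000, while the mask-based loop yields a shift of 12 under these assumptions.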
Because the mask now includes the (page adjusted) starting and ending addresses, the general case works for the single buffer case as well: ================================================================================ diff --git a/drivers/infiniband/hw/cxgb3/iwch_mem.c b/drivers/infiniband/hw/cxgb3/iwch_mem.c index 73bfd16..b8797c6 100644 --- a/drivers/infiniband/hw/cxgb3/iwch_mem.c +++ b/drivers/infiniband/hw/cxgb3/iwch_mem.c @@ -136,14 +136,8 @@ int build_phys_page_list(struct ib_phys_buf *buffer_list, /* Find largest page shift we can use to cover buffers */ for (*shift = PAGE_SHIFT; *shift < 27; ++(*shift)) - if (num_phys_buf > 1) { - if ((1ULL << *shift) & mask) - break; - } else - if (1ULL << *shift >= - buffer_list[0].size + - (buffer_list[0].addr & ((1ULL << *shift) - 1))) - break; + if ((1ULL << *shift) & mask) + break; buffer_list[0].size += buffer_list[0].addr & ((1ULL << *shift) - 1); buffer_list[0].addr &= ~0ull << *shift; ================================================================================ Don't try this without applying Steve's patch first. Incidentally, I've been tracking down exactly the bug that Steve fixed, but in mthca_reg_phys_mr() rather than in the cxgb3 build_phys_page_list(). I'll submit a patch for mthca, unless someone else applies Steve's fix there soon. - Bryan Rosenburg - IBM Research From swise at opengridcomputing.com Fri Jan 25 15:04:24 2008 From: swise at opengridcomputing.com (Steve Wise) Date: Fri, 25 Jan 2008 17:04:24 -0600 Subject: [ofa-general] [GIT PULL ofed-1.2.5 / ofed-1.3] - libcxgb3-1.1.3 release Message-ID: <479A6AF8.4010306@opengridcomputing.com> Vlad, Please pull version 1.1.3 of libcxgb3 for ofed-1.2.5 and ofed-1.3. This release fixes problems with running libcxgb3 on RH4U5 and other distros.
Pull from: git://git.openfabrics.org/~swise/libcxgb3 ofed_1_2_5 and git://git.openfabrics.org/~swise/libcxgb3 ofed_1_3 Also, the release can be downloaded from: http://www.openfabrics.org/downloads/cxgb3/libcxgb3-1.1.3.tar.gz Thanks, Steve. From kliteyn at mellanox.co.il Fri Jan 25 17:09:32 2008 From: kliteyn at mellanox.co.il (kliteyn at mellanox.co.il) Date: 26 Jan 2008 03:09:32 +0200 Subject: [ofa-general] nightly osm_sim report 2008-01-26:normal completion Message-ID: OSM Simulation Regression Summary [Generated mail - please do NOT reply] OpenSM binary date = 2008-01-25 OpenSM git rev = Sun_Jan_20_20:18:24_2008 [9b093e04dedb54c78d74d0567e85b3a59f88badd] ibutils git rev = Mon_Dec_24_10:42:01_2007 [675bec82306d6920555dd0b5e2f664983e27e60f] Total=400 Pass=400 Fail=0 Pass: 30 Stability IS1-16.topo 30 Pkey IS1-16.topo 30 OsmTest IS1-16.topo 30 OsmStress IS1-16.topo 30 Multicast IS1-16.topo 30 LidMgr IS1-16.topo 10 Stability IS3-loop.topo 10 Stability IS3-128.topo 10 Pkey IS3-128.topo 10 OsmTest IS3-loop.topo 10 OsmTest IS3-128.topo 10 OsmStress IS3-128.topo 10 Multicast IS3-loop.topo 10 Multicast IS3-128.topo 10 LidMgr IS3-128.topo 10 FatTree merge-roots-4-ary-2-tree.topo 10 FatTree merge-root-4-ary-3-tree.topo 10 FatTree gnu-stallion-64.topo 10 FatTree blend-4-ary-2-tree.topo 10 FatTree RhinoDDR.topo 10 FatTree FullGnu.topo 10 FatTree 4-ary-2-tree.topo 10 FatTree 2-ary-4-tree.topo 10 FatTree 12-node-spaced.topo 10 FTreeFail 4-ary-2-tree-missing-sw-link.topo 10 FTreeFail 4-ary-2-tree-links-at-same-rank-2.topo 10 FTreeFail 4-ary-2-tree-links-at-same-rank-1.topo 10 FTreeFail 4-ary-2-tree-diff-num-pgroups.topo Failures:
From rdreier at cisco.com Fri Jan 25 20:47:26 2008 From: rdreier at cisco.com (Roland Dreier) Date: Fri, 25 Jan 2008 20:47:26 -0800 Subject: [ofa-general] help on /sys/class structure and rdma In-Reply-To: <479A315B.4010900@opengridcomputing.com> (Steve Wise's message of "Fri, 25 Jan 2008 12:58:35 -0600") References: <4799064D.4060705@opengridcomputing.com> <479A315B.4010900@opengridcomputing.com> Message-ID: > /* > * Verify that the firmware major number matches. Major number > * mismatches are fatal. Minor number mismatches are tolerated. > */ > if (ibv_read_sysfs_file(uverbs_sys_path, "ibdev", > ibdev, sizeof ibdev) < 0) > return NULL; > memset(devstr, 0, sizeof devstr); > snprintf(devstr, sizeof devstr, "%s/class/infiniband/%s", > ibv_get_sysfs_path(), ibdev); > if (ibv_read_sysfs_file(devstr, "fw_ver", value, sizeof value) < 0) > return NULL; looks fine assuming it works, although I'm a fan of asprintf() ;) - R. From rdreier at cisco.com Fri Jan 25 21:04:21 2008 From: rdreier at cisco.com (Roland Dreier) Date: Fri, 25 Jan 2008 21:04:21 -0800 Subject: [ofa-general] [PATCH 2/3] RDMA/cxgb3: fix page shift calculation in build_phys_page_list() In-Reply-To: (Bryan S.
Rosenburg's message of "Fri, 25 Jan 2008 17:50:24 -0500") References: Message-ID: > Incidentally, I've been tracking down exactly the bug that Steve fixed, > but in mthca_reg_phys_mr() rather than in the cxgb3 > build_phys_page_list(). I'll submit a patch for mthca, unless someone > else applies Steve's fix there soon. Strange... is this causing problems in practice? I do agree that the mthca code, when passed a list like > 0: addr 0x10001000, size 0x1000 > 1: addr 0x10002000, size 0x1000 will end up with a page list > 0: 0x10000000 > 1: 0x10002000 and a page size of 0x2000, but it seems to me that should work fine -- the memory region will end up starting at an offset of 0x1000 and having size 0x2000, which gives the region precisely as intended (just not in the most obvious way). But I'm probably missing something, and obviously theory is no good at all if there's a bug in practice... - R. From rdreier at cisco.com Fri Jan 25 21:12:41 2008 From: rdreier at cisco.com (Roland Dreier) Date: Fri, 25 Jan 2008 21:12:41 -0800 Subject: [ofa-general] [PATCH] IPoIB UD 4K MTU support In-Reply-To: <1201181656.9739.42.camel@localhost.localdomain> (Shirley Ma's message of "Thu, 24 Jan 2008 05:34:15 -0800") References: <1201025321.756.33.camel@localhost.localdomain> <1201102346.6925.215.camel@mtls03> <1201068498.756.59.camel@localhost.localdomain> <4e6a6b3c0801231354v24baa6d8q5034a1cdaf3c704f@mail.gmail.com> <1201162492.9739.21.camel@localhost.localdomain> <1201181656.9739.42.camel@localhost.localdomain> Message-ID: > Actually I thought about this when I came to this simple implementation. > If we use 4096-48, a patch in IPoIB to generate ICMP error could help > this issue by sending the 4096-48 mtu back so the source knows how big > its packets could be. Do you this this is a good idea? It seems like this would be a mess. 
Now that I think about it some more, if a system with the 4096-4 MTU sent a full-sized packet to a system with the 4096-48 MTU, then the system with the smaller MTU would get a local length error completion, the QP would transition to error, and it would be a pain to recover. However I came up with a tricky approach that might work well. We would use two-element scatter lists for the receives, and post a 40-byte dummy buffer first and then a 4096 byte buffer for the actual packet. Since the only thing we do with the first 40 bytes is throw them away, we wouldn't even have to make the 40 bytes part of the skb; in fact we could have one buffer that every receive uses and never even touch the first entry of the scatter list after initialization. It would even save the skb_pull(skb, IB_GRH_BYTES); we currently do after receiving messages. What do you think? - R. From sashak at voltaire.com Sat Jan 26 02:20:37 2008 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Sat, 26 Jan 2008 10:20:37 +0000 Subject: [ofa-general] Opensm compatibility with rate=1 In-Reply-To: <571f1a060801250617vde79132w38eed9de07e46ef@mail.gmail.com> References: <571f1a060801231518i76c3383l38e3fe9cc32e3cd8@mail.gmail.com> <20080124181530.GH11277@sashak.voltaire.com> <571f1a060801241618t3545d7d7k533256da3425294c@mail.gmail.com> <20080125105903.GA13079@sashak.voltaire.com> <571f1a060801250617vde79132w38eed9de07e46ef@mail.gmail.com> Message-ID: <20080126102037.GI13079@sashak.voltaire.com> On 06:17 Fri 25 Jan , Greg Kurtzer wrote: > We tried rate=2, but it didn't fix the errors we were getting: > > kernel: ib0: multicast join failed for > ff12:401b:ffff:0000:0000:0000:ffff:ffff, status -22 And what is in the OpenSM log? 
Sasha From sashak at voltaire.com Sat Jan 26 05:37:19 2008 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Sat, 26 Jan 2008 13:37:19 +0000 Subject: [ofa-general] the OFA server is out of free space Message-ID: <20080126133719.GP13079@sashak.voltaire.com> Hi, Recently I tried to git-push and got 'No space left on device' error: $ git push Counting objects: 28, done. Compressing objects: 100% (19/19), done. Writing objects: 100% (19/19), 3.45 KiB, done. Total 19 (delta 16), reused 0 (delta 0) error: file write error (No space left on device) fatal: unable to write sha1 file ... Fast look shows: $ df / Filesystem 1K-blocks Used Available Use% Mounted on /dev/sda1 151873632 144158744 76 100% / $ du -s /tmp 3030328 /tmp/ (and most big things there are from 2007) $ time du -s /home 106945588 /home real 36m43.422s ( <- yes, this is true) user 0m7.660s sys 0m49.910s Guys! What about to clean things up? Thanks! Sasha From sashak at voltaire.com Sat Jan 26 04:22:00 2008 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Sat, 26 Jan 2008 12:22:00 +0000 Subject: [ofa-general] Re: [PATCH] infiniband-diags/scripts/iblinkinfo.pl: Fix switch to switch output with new ibnetdiscover output. In-Reply-To: <20080124190227.22a3af36.weiny2@llnl.gov> References: <20080124190227.22a3af36.weiny2@llnl.gov> Message-ID: <20080126122200.GN13079@sashak.voltaire.com> On 19:02 Thu 24 Jan , Ira Weiny wrote: > I am not sure when the "port guid output" format change to ibnetdiscover > happened, but I just realized that switch to switch links were not being parsed > correctly by IBswcountlimits.pm. This caused issues with some of the perl > diags. I have fixed this. Furthermore, Erez mentioned that he would like port > guid output so I threw that option in as well. (Since I have to parse that > info out of ibnetdiscover now.) > > One might wonder why I did not catch this before? 
If you mainly test on a 1 > switch system you don't get to see what switch to switch links look like in > diag output... :-( I have learned my lesson, sorry. > > Sasha, this needs to be in 1.3 as well. > > Sorry, :-( > Ira > > > From 89887aff392b0f40acf4035c19e06359271dfe6a Mon Sep 17 00:00:00 2001 > From: Ira K. Weiny > Date: Thu, 24 Jan 2008 18:50:38 -0800 > Subject: [PATCH] infiniband-diags/scripts/iblinkinfo.pl: Fix switch to switch output with new > ibnetdiscover output. > > ibnetdiscover now prints port guids. This format change cause the regex to > break when parsing switch to switch links. Fix this by parsing for the > remote port guid. And while we are at it add an option to print the port > guids parsed. > > Signed-off-by: Ira K. Weiny Applied. Thanks. Sasha From sashak at voltaire.com Sat Jan 26 03:09:55 2008 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Sat, 26 Jan 2008 11:09:55 +0000 Subject: [ofa-general] Re: [PATCH] opensm: update man pages for diags DR support In-Reply-To: <47991EED.9000100@llnl.gov> References: <47991EED.9000100@llnl.gov> Message-ID: <20080126110955.GM13079@sashak.voltaire.com> On 15:27 Thu 24 Jan , Timothy A. Meier wrote: > Sasha, > > Sorry, I should have included this in my previous patch. Small man page > changes. > > From 920349140562c3ab44c48dd6775ba3e0beca63c4 Mon Sep 17 00:00:00 2001 > From: Tim Meier > Date: Thu, 24 Jan 2008 15:20:31 -0800 > Subject: [PATCH] opensm: update man pages for diags DR support > > Added man -D information for ibquerryerrors and iblinkinfo > > Signed-off-by: Tim Meier Applied. Thanks. Sasha From sashak at voltaire.com Sat Jan 26 03:07:36 2008 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Sat, 26 Jan 2008 11:07:36 +0000 Subject: [ofa-general] Re: [PATCH] opensm: diags add DR path support to some utils In-Reply-To: <47991846.7050700@llnl.gov> References: <47991846.7050700@llnl.gov> Message-ID: <20080126110736.GL13079@sashak.voltaire.com> On 14:59 Thu 24 Jan , Timothy A. 
Meier wrote: > Sasha, > > At LLNL, we find these -D options useful (some diagnostic messages give us > only direct paths). > > From 3c681566514f0c948cfb5002f7536af9ca563e33 Mon Sep 17 00:00:00 2001 > From: Tim Meier > Date: Thu, 24 Jan 2008 14:49:02 -0800 > Subject: [PATCH] opensm: diags add DR path support to some utils > > Added direct route support to iblinkinfo.pl and ibqueryerrors.pl. > > Signed-off-by: Tim Meier Applied. Thanks. Please note that inlined patch has mangled whitespaces and didn't apply (I used attached version). Also a formatting nit is below. > --- > infiniband-diags/scripts/IBswcountlimits.pm | 51 > +++++++++++++++++++++++++++ > infiniband-diags/scripts/iblinkinfo.pl | 12 +++++- > infiniband-diags/scripts/ibqueryerrors.pl | 12 +++++- > 3 files changed, 71 insertions(+), 4 deletions(-) > > diff --git a/infiniband-diags/scripts/IBswcountlimits.pm > b/infiniband-diags/scripts/IBswcountlimits.pm > index 6985750..1ada8a8 100755 > --- a/infiniband-diags/scripts/IBswcountlimits.pm > +++ b/infiniband-diags/scripts/IBswcountlimits.pm > @@ -373,3 +373,54 @@ sub get_num_ports > return ($num_ports); > } > > +# ========================================================================= > +# convert_dr_to_guid(direct_route) > +# > +sub convert_dr_to_guid > +{ > + my $guid = undef; > + > + my $data = `smpquery nodeinfo -D $_[0]`; > + my @lines = split("\n", $data); > + foreach my $line (@lines) { > + if ($line =~ /^PortGuid:\.+(.*)/) { $guid = $1; } > + } > + $guid; > +} > + > +# ========================================================================= > +# get_node_type(guid_or_direct_route) > +# > +sub get_node_type > +{ > + my $type = undef; > + my $query_arg = "smpquery nodeinfo "; > + if($_[0] =~ /x/) > + { > + # assume arg is a guid if contains an x > + $query_arg .= "-G " . $_[0]; > + } > + else > + { > + # assume arg is a direct path > + $query_arg .= "-D " . 
$_[0]; > + } > + > + my $data = `$query_arg`; > + my @lines = split("\n", $data); > + foreach my $line (@lines) > + { > + if ($line =~ /^NodeType:\.+(.*)/) { $type = $1; } > + } > + $type; > +} > + > +# ========================================================================= > +# is_switch(guid_or_direct_route) > +# > +sub is_switch > +{ > + my $node_type = &get_node_type($_[0]); > + ($node_type =~ /Switch/); > +} > + What about to use unified indentation for scripts too? Sasha
From rosnbrg at us.ibm.com Sat Jan 26 10:05:15 2008 From: rosnbrg at us.ibm.com (Bryan S Rosenburg) Date: Sat, 26 Jan 2008 13:05:15 -0500 Subject: [ofa-general] [PATCH 2/3] RDMA/cxgb3: fix page shift calculation in build_phys_page_list() In-Reply-To: Message-ID: Roland Dreier wrote on 01/26/2008 12:04:21 AM: > > Incidentally, I've been tracking down exactly the bug that Steve fixed, > > but in mthca_reg_phys_mr() rather than in the cxgb3 > > build_phys_page_list(). I'll submit a patch for mthca, unless someone > > else applies Steve's fix there soon. > > Strange... is this causing problems in practice? I do agree that the > mthca code, when passed a list like > > > 0: addr 0x10001000, size 0x1000 > > 1: addr 0x10002000, size 0x1000 > > will end up with a page list > > > 0: 0x10000000 > > 1: 0x10002000 > > and a page size of 0x2000, but it seems to me that should work fine -- > the memory region will end up starting at an offset of 0x1000 and > having size 0x2000, which gives the region precisely as intended (just > not in the most obvious way). Roland, you're quite right that the non-obvious page list is not necessarily a problem. It causes a failure only if the virtual address that is eventually mapped to this region has an alignment with respect to large-page boundaries that is different from the alignment of the physical address. To be concrete, the page list works as you expect if and only if ((*iova_start) & ((1ULL << shift) - 1)) == (buffer_list[0].addr & ((1ULL << shift) - 1)). With the example above, the existing algorithm works fine if, for example, the virtual address is 0x80001000, but it does not work if the va is 0x80000000. In the latter case, incoming rdma bytes destined for va 0x80000000 wind up at physical address 0x10000000, not at 0x10001000 where they belong. To answer your first question, yes, I've seen exactly this behavior in practice.
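[Editorial sketch, not part of the original message.] The alignment condition stated above can be checked numerically with the two virtual addresses from the example. This is a minimal sketch, not driver code: `iova_alignment_ok()` and `modeled_phys_for_iova_start()` are hypothetical helpers, and the second one assumes a simplified model in which the offset into the first large page is taken from the low-order bits of the virtual address. That model is an assumption made for illustration; it is not taken from any hardware specification.

```c
#include <stdint.h>

/*
 * The condition from the message: the low-order bits (below the chosen
 * page shift) of the I/O virtual address must equal those of the first
 * buffer's physical address.
 */
static int iova_alignment_ok(uint64_t iova_start, uint64_t first_addr,
			     int shift)
{
	uint64_t low = (1ULL << shift) - 1;

	return (iova_start & low) == (first_addr & low);
}

/*
 * Simplified model (assumption): the first page-list entry is the first
 * buffer's address rounded down to a (1 << shift) boundary, and the
 * offset within that page comes from the low bits of the virtual
 * address. Returns the physical address the first byte would map to.
 */
static uint64_t modeled_phys_for_iova_start(uint64_t iova_start,
					    uint64_t first_addr, int shift)
{
	uint64_t low = (1ULL << shift) - 1;

	return (first_addr & ~low) + (iova_start & low);
}
```

With a shift of 13 (0x2000-byte pages) and a first buffer at 0x10001000, a virtual address of 0x80001000 satisfies the condition and maps to 0x10001000 in this model, whereas 0x80000000 fails the condition and maps to 0x10000000, matching the misplacement described in the message. With a shift of 12 both virtual addresses satisfy the condition.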
I don't have a hardware spec for mthca, but from observation it seems to me that the offset from the base of the physical region is being derived from the low-order bits of the virtual address. So Steve's fix is overly conservative, but it's straightforward and it has the benefit that it removes the need for the complex special case for single-element buffer lists. Before I saw the patch, I was toying with ways to incorporate the va alignment into the shift calculation, but I'm not sure the complexity is worth it. Also, I see that in cxgb3 the code is factored in such a way that the va isn't even available during the shift calculation. That's not a problem for mthca, but there are advantages to keeping things consistent. For your amusement, the first time I hit this problem, I happened to have a buffer list in which the pages were adjacent but out of order: 0: addr 0x10001000, size 0x1000 1: addr 0x10000000, size 0x1000 resulting in the following list of 0x2000-byte pages: 0: 0x10000000 1: 0x10000000 !!! To my surprise, even this bizarre page list works properly if the virtual address is aligned properly. - Bryan -------------- next part -------------- An HTML attachment was scrubbed... URL: From swise at opengridcomputing.com Sat Jan 26 08:19:43 2008 From: swise at opengridcomputing.com (Steve Wise) Date: Sat, 26 Jan 2008 10:19:43 -0600 Subject: [ofa-general] [PATCH 2/3] RDMA/cxgb3: fix page shift calculation in build_phys_page_list() In-Reply-To: References: Message-ID: <479B5D9F.4010302@opengridcomputing.com> Roland Dreier wrote: > > Incidentally, I've been tracking down exactly the bug that Steve fixed, > > but in mthca_reg_phys_mr() rather than in the cxgb3 > > build_phys_page_list(). I'll submit a patch for mthca, unless someone > > else applies Steve's fix there soon. > > Strange... is this causing problems in practice? 
I do agree that the > mthca code, when passed a list like > >> 0: addr 0x10001000, size 0x1000 >> 1: addr 0x10002000, size 0x1000 > > will end up with a page list > >> 0: 0x10000000 >> 1: 0x10002000 > > and a page size of 0x2000, but it seems to me that should work fine -- > the memory region will end up starting at an offset of 0x1000 and > having size 0x2000, which gives the region precisely as intended (just > not in the most obvious way). > > But I'm probably missing something, and obviously theory is no good at > all if there's a bug in practice... > As long as the first byte offset for the MR is 0x1000 and the MR length doesn't allow accesses >= 0x10003000. From gshipman at ornl.gov Sat Jan 26 10:59:37 2008 From: gshipman at ornl.gov (Shipman, Galen M.) Date: Sat, 26 Jan 2008 13:59:37 -0500 Subject: [ofa-general] rdma_create_qp fails with -12 In-Reply-To: References: <1200936237.23538.15.camel@obelisk.thedillows.org> <4795D149.3010405@voltaire.com> Message-ID: > physically contiguous). Removing this limitation would make the code > more complex, and in general supporting huge queue depths hasn't > seemed that important. Currently lnet 1.4.11 (lustre networking layer) uses 4K work requests (page sizes). Up to 256 of these are chained together using the next pointer so that a single call to ib_post_send is made for up to a 1MB xfer. The number of work requests allocated for the QP is controlled by number of concurrent sends * 256. At 16 concurrent sends there is no problem. At 64 there is (once we allocate recv work requests as well). It sounds like this can be alleviated by using FMR. - Galen From pawel.dziekonski at pwr.wroc.pl Sat Jan 26 11:30:35 2008 From: pawel.dziekonski at pwr.wroc.pl (Pawel Dziekonski) Date: Sat, 26 Jan 2008 20:30:35 +0100 Subject: [ofa-general] Status of NFS-RDMA ? 
In-Reply-To: <4797AD59.2000206@mellanox.co.il> References: <20080123194728.GA10437@cefeid.wcss.wroc.pl> <4797AD59.2000206@mellanox.co.il> Message-ID: <20080126193035.GA21209@cefeid.wcss.wroc.pl> On Wed, 23 Jan 2008 at 11:10:49PM +0200, Tziporet Koren wrote: > James Lentini wrote: >> On Wed, 23 Jan 2008, Pawel Dziekonski wrote: >> >> >>> hi, >>> >> The name of Tom's tree changed last Wednesday. I'll update the docs with >> the new address. >> >> The new URL is: >> git://git.linux-nfs.org/projects/tomtucker/xprt-switch-2.6.git >> >>> My hardware will be Mellanox HBAa and Flextronics switch. I already >>> know that NFS-RDMA client is in official kernel. What about server? >>> Should I use OFED 1.2 or try 1.3? Should I use infiniband drivers from >>> kernel or OFED? > > Note that NFS-RDMA is not part of OFED now (will be only in 1.4) however > Tom prepared backport patches that enable it to work on distros with OFED > 1.2 > You can look at http://www.mellanox.com/products/nfs_rdma_sdk.php for more > info on this Hi, thanks for answers. I'm still writing to this list because nfs-rdma-devel at lists.sourceforge.net seems to be dead... :( I pulled Tom's tree from new url and build a kernel. then I downloaded OFED from http://www.mellanox.com/downloads/NFSoRDMA/OFED-1.2-NFS-RDMA.gz, however ofa-kernel fails to build. whatever I do I always got in ofa-kernel: make[1]: Entering directory `/usr/src/ib/xprt-switch-2.6' test -e include/linux/autoconf.h -a -e include/config/auto.conf || ( \ echo; \ echo " ERROR: Kernel configuration is invalid."; \ echo " include/linux/autoconf.h or include/config/auto.conf are missing."; \ echo " Run 'make oldconfig && make prepare' on kernel src to fix it."; \ echo; \ /bin/false) obviously, doing 'make oldconfig && make prepare' does not help. 
anyway, above mentioned files do exist: # ls -la /usr/src/ib/xprt-switch-2.6/{include/linux/autoconf.h,include/config/auto.conf} -rw-r--r-- 1 root root 10156 Jan 25 17:42 /usr/src/ib/xprt-switch-2.6/include/config/auto.conf -rw-r--r-- 1 root root 14733 Jan 25 17:42 /usr/src/ib/xprt-switch-2.6/include/linux/autoconf.h despite of above, compilation continues but fails with: gcc -Wp,-MD,/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2/drivers/infiniband/core/.mad.o.d -nostdinc -isystem /usr/lib/gcc/x86_64-redhat-linux/3.4.6/include -D__KERNEL__ -I/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2/include -I/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2/drivers/infiniband/include -Iinclude -include include/linux/autoconf.h -include /var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2/include/linux/autoconf.h -Wall -Wundef -Wstrict-prototypes -Wno-trigraphs -fno-strict-aliasing -fno-common -Werror-implicit-function-declaration -Os -m64 -mno-red-zone -mcmodel=kernel -pipe -Wno-sign-compare -fno-asynchronous-unwind-tables -funit-at-a-time -mno-sse -mno-mmx -mno-sse2 -mno-3dnow -maccumulate-outgoing-args -DCONFIG_AS_CFI=1 -DCONFIG_AS_CFI_SIGNAL_FRAME=1 -fomit-frame-pointer -Wdeclaration-after-statement -DMODULE -D"KBUILD_STR(s)=#s" -D"KBUILD_BASENAME=KBUILD_STR(mad)" -D"KBUILD_MODNAME=KBUILD_STR(ib_mad)" -c -o /var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2/drivers/infiniband/core/.tmp_mad.o /var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2/drivers/infiniband/core/mad.c /var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2/drivers/infiniband/core/mad.c: In function `ib_mad_init_module': /var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2/drivers/infiniband/core/mad.c:2966: error: too many arguments to function `kmem_cache_create' make[4]: *** [/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2/drivers/infiniband/core/mad.o] Error 1 make[3]: *** [/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2/drivers/infiniband/core] Error 2 make[2]: *** [/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2/drivers/infiniband] Error 2 make[1]: *** [_module_/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2] Error 2 make[1]: Leaving 
directory `/usr/src/ib/xprt-switch-2.6' make: *** [kernel] Error 2 error: Bad exit status from /var/tmp/rpm-tmp.3877 (%install) full log: https://cefeid.wcss.wroc.pl/d/tmp/OFED.build.32122.log thanks in advance, P -- Pawel Dziekonski Wroclaw Centre for Networking & Supercomputing, HPC Department Politechnika Wr., pl. Grunwaldzki 9, bud. D2/101, 50-377 Wroclaw, POLAND phone: +48 71 3202043, fax: +48 71 3225797, http://www.wcss.wroc.pl From mashirle at us.ibm.com Sat Jan 26 01:44:17 2008 From: mashirle at us.ibm.com (Shirley Ma) Date: Sat, 26 Jan 2008 01:44:17 -0800 Subject: [ofa-general] [PATCH] IPoIB UD 4K MTU support In-Reply-To: References: <1201025321.756.33.camel@localhost.localdomain> <1201102346.6925.215.camel@mtls03> <1201068498.756.59.camel@localhost.localdomain> <4e6a6b3c0801231354v24baa6d8q5034a1cdaf3c704f@mail.gmail.com> <1201162492.9739.21.camel@localhost.localdomain> <1201181656.9739.42.camel@localhost.localdomain> Message-ID: <1201340657.10918.8.camel@dyn9047018117.beaverton.ibm.com> > However I came up with a tricky approach that might work well. We > would use two-element scatter lists for the receives, and post a > 40-byte dummy buffer first and then a 4096 byte buffer for the actual > packet. Since the only thing we do with the first 40 bytes is throw > them away, we wouldn't even have to make the 40 bytes part of the skb; > in fact we could have one buffer that every receive uses and never > even touch the first entry of the scatter list after initialization. > It would even save the skb_pull(skb, IB_GRH_BYTES); we currently do > after receiving messages. > > What do you think? I thought the same thing before for one buffer allocation, I had a little bit concern about whether IB_GRH could be used later. I have done scatter-gather list patch already. It's based on the PAGE_SIZE whether to use one buffer or two buffer, similar as IPoIB-CM S/G code. It's under testing. 
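The two-element scatter list idea quoted above can be illustrated outside the kernel. The following is a rough user-space sketch only: struct sge, scatter() and demo_payload_at_offset_zero() are hypothetical names, and scatter() merely imitates how a receive would fill the list entries in order. It shows that when a 40-byte dummy entry absorbs the GRH, the packet starts at offset 0 of the second buffer, so no pull of the GRH bytes is needed.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define GRH_BYTES 40	/* size of the GRH that precedes a UD payload */

/* Hypothetical stand-in for one scatter/gather entry. */
struct sge {
	unsigned char *addr;
	size_t length;
};

/*
 * Copy msg into the S/G entries in order, the way a receive fills
 * them. Returns the number of bytes placed.
 */
static size_t scatter(const unsigned char *msg, size_t len,
		      struct sge *sg, int num_sge)
{
	size_t done = 0;

	for (int i = 0; i < num_sge && done < len; i++) {
		size_t n = len - done < sg[i].length ? len - done : sg[i].length;

		memcpy(sg[i].addr, msg + done, n);
		done += n;
	}
	return done;
}

/* Returns 1 if the payload ends up at the start of the data buffer. */
static int demo_payload_at_offset_zero(void)
{
	static unsigned char grh_dummy[GRH_BYTES];	/* shared, never read */
	unsigned char data[4096];
	unsigned char wire[GRH_BYTES + 8];
	struct sge sg[2] = {
		{ grh_dummy, sizeof grh_dummy },
		{ data, sizeof data },
	};
	size_t placed;

	memset(wire, 0xAA, GRH_BYTES);			/* fake GRH */
	memcpy(wire + GRH_BYTES, "payload", 8);		/* 7 chars + NUL */

	placed = scatter(wire, sizeof wire, sg, 2);
	assert(placed <= sizeof grh_dummy + sizeof data);
	return placed == sizeof wire && memcmp(data, "payload", 8) == 0;
}
```

Because the first entry is only written and never read, one dummy buffer could in principle be shared by every receive, which is the saving the quoted proposal describes.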
The only thing I haven't finished is making the S/G code more generic and merging the IPoIB-CM S/G and IPoIB-UD S/G buffer allocation together. IBM eHCA does support 4K MTU, and we would like our customers to use this feature in the OFED-1.3 release. If I merge the IPoIB-CM S/G code and the IPoIB-UD S/G code, testing would take much longer. I wonder whether it's OK to push IPoIB-UD S/G first and then merge IPoIB-UD and IPoIB-CM later. Thanks, Shirley From Ashish.Batwara at lsi.com Sat Jan 26 12:16:08 2008 From: Ashish.Batwara at lsi.com (Batwara, Ashish) Date: Sat, 26 Jan 2008 13:16:08 -0700 Subject: [ofa-general] How to configure opensm to assign LIDs from a specific range Message-ID: <01B9E81EECACE94DBBD0A556E768FB8A0201622C@NAMAIL2.ad.lsil.com> Best Regards ================= Ashish Batwara, PMP | Firmware Architect | Mobile: +1 316 253 9784 | email: ashish.batwara at lsi.com -------------- next part -------------- An HTML attachment was scrubbed...
URL: From sashak at voltaire.com Sat Jan 26 12:44:25 2008 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Sat, 26 Jan 2008 20:44:25 +0000 Subject: [ofa-general] How to configure opensm to assign LIDs from a specific range In-Reply-To: <01B9E81EECACE94DBBD0A556E768FB8A0201622C@NAMAIL2.ad.lsil.com> References: <01B9E81EECACE94DBBD0A556E768FB8A0201622C@NAMAIL2.ad.lsil.com> Message-ID: <20080126204425.GD24344@sashak.voltaire.com> On 13:16 Sat 26 Jan , Batwara, Ashish wrote: > I think you can edit your guid2lid file and use --honor_guid2lid option. Sasha From eli at dev.mellanox.co.il Sat Jan 26 12:52:55 2008 From: eli at dev.mellanox.co.il (Eli Cohen) Date: Sat, 26 Jan 2008 22:52:55 +0200 Subject: [ofa-general] Re: [PATCH] ib/limthca: Remove an always true condition In-Reply-To: References: <1201184628.6755.9.camel@mtls03> <4e6a6b3c0801250343y2b551817k4fb62bff9445542e@mail.gmail.com> Message-ID: <20080126205254.GB16296@eli-laptop> On Fri, Jan 25, 2008 at 09:13:07AM -0800, Roland Dreier wrote: > > > why is first_free always >= 0? I don't see anything that guarantees > > > that, > > > > The following two subsequent ifs gurantees that. > > > > if (ind < 0) { > > err = -1; > > *bad_wr = wr; > > break; > > } > > > > wqe = get_wqe(srq, ind); > > next_ind = *wqe_to_link(wqe); > > > > if (next_ind < 0) { > > err = -1; > > *bad_wr = wr; > > break; > > } > > Duh... I missed that. Thanks for the clue. > > but now am I wrong to think that we could remove the first test of ind > (not next_ind) in the fast path? the second test guarantees that ind > never becomes negative, as you pointed out. > I think you're right. The first "if" can go away.
From eli at dev.mellanox.co.il Sat Jan 26 14:14:41 2008 From: eli at dev.mellanox.co.il (Eli Cohen) Date: Sun, 27 Jan 2008 00:14:41 +0200 Subject: [ofa-general] [PATCH 2/16] ib/ipoib: Add s/g support for IPOIB In-Reply-To: References: <1200501454.13546.71.camel@mtls03> Message-ID: <20080126221440.GB16552@eli-laptop> On Thu, Jan 24, 2008 at 12:06:30PM -0800, Shirley Ma wrote: > Can you make IPoIB-CM rx S/G functions more generic so it can be > reused here for IPoIB UD tx S/G? > I am not sure I understand your intention. This patch provides an API for mapping/unmapping tx UD SKBs: static inline int ipoib_dma_map_tx(struct ib_device *ca, struct ipoib_tx_buf *tx_req); static inline void ipoib_dma_unmap_tx(struct ib_device *ca, struct ipoib_tx_buf *tx_req); From Ashish.Batwara at lsi.com Sat Jan 26 15:21:54 2008 From: Ashish.Batwara at lsi.com (Batwara, Ashish) Date: Sat, 26 Jan 2008 16:21:54 -0700 Subject: [ofa-general] Kernel Crash Message-ID: <01B9E81EECACE94DBBD0A556E768FB8A02016244@NAMAIL2.ad.lsil.com> Hello, I see the crash below quite frequently. I am running OFED-1.2. Has it been fixed in later kernels? I am using 2.6.9-55.
Jan 26 17:18:37 bm3850a kernel: SRP abort called Jan 26 17:18:37 bm3850a kernel: SRP reset_device called Jan 26 17:18:37 bm3850a kernel: ib_srp: SRP reset_host called Jan 26 17:18:37 bm3850a kernel: Unable to handle kernel NULL pointer dereference at virtual address 00000000 Jan 26 17:18:37 bm3850a kernel: printing eip: Jan 26 17:18:37 bm3850a kernel: 00000000 Jan 26 17:18:37 bm3850a kernel: *pde = 35610001 Jan 26 17:18:37 bm3850a kernel: Oops: 0010 [#1] Jan 26 17:18:37 bm3850a kernel: SMP Jan 26 17:18:37 bm3850a kernel: Modules linked in: ib_ipoib(U) ib_srp(U) rdma_ucm(U) rdma_cm(U) iw_cm(U) ib_addr(U) ib_local_sa(U) ib_mthca(U) ib_umad(U) ib_uverbs(U) ib_cm(U) ib_sa(U) ib_mad(U) ib_core(U) parport_pc lp parport autofs4 i2c_dev i2c_core sunrpc loop button battery ac md5 joydev ipv6 uhci_hcd ehci_hcd bnx2(U) dm_snapshot dm_zero dm_mirror ext3 jbd dm_mod ata_piix libata mptsas mptscsi mptbase sd_mod scsi_mod Jan 26 17:18:37 bm3850a kernel: CPU: 7 Jan 26 17:18:37 bm3850a kernel: EIP: 0060:[<00000000>] Not tainted VLI Jan 26 17:18:37 bm3850a kernel: EFLAGS: 00010006 (2.6.9-55.ELsmp) Jan 26 17:18:37 bm3850a kernel: EIP is at 0x0 Jan 26 17:18:37 bm3850a kernel: eax: f175fe40 ebx: f4f566cc ecx: f7fff640 edx: f175fe40 Jan 26 17:18:37 bm3850a kernel: esi: f4f56258 edi: 00000000 ebp: c58bbd40 esp: f159af34 Jan 26 17:18:37 bm3850a kernel: ds: 007b es: 007b ss: 0068 Jan 26 17:18:37 bm3850a kernel: Process scsi_eh_2 (pid: 8034, threadinfo=f159a000 task=f6ced2f0) Jan 26 17:18:37 bm3850a kernel: Stack: f8c838b6 f4f56710 f4f56258 f8c83996 f4f56258 00002003 f159afac f159afa4 Jan 26 17:18:37 bm3850a kernel: f8c855e0 f8c86fff f1f7b640 00000286 f88746db f1f7b640 f159afac f159afac Jan 26 17:18:37 bm3850a kernel: f8874875 f159afa4 f4f56000 f159afac f159afac f8874c07 f4f56048 f4f56000 Jan 26 17:18:37 bm3850a kernel: Call Trace: Jan 26 17:18:37 bm3850a kernel: [] srp_reset_req+0x1e/0x26 [ib_srp] Jan 26 17:18:37 bm3850a kernel: [] srp_reconnect_target+0xd8/0x1d2 [ib_srp] Jan 26 17:18:37 
bm3850a kernel: [] srp_reset_host+0x2e/0x49 [ib_srp] Jan 26 17:18:37 bm3850a kernel: [] scsi_try_host_reset+0x59/0xbd [scsi_mod] Jan 26 17:18:37 bm3850a kernel: [] scsi_eh_host_reset+0x44/0xc4 [scsi_mod] Jan 26 17:18:37 bm3850a kernel: [] scsi_eh_ready_devs+0x39/0x4d [scsi_mod] Jan 26 17:18:37 bm3850a kernel: [] scsi_unjam_host+0x15a/0x16b [scsi_mod] Jan 26 17:18:37 bm3850a kernel: [] __scsi_iterate_devices+0x50/0x58 [scsi_mod] Jan 26 17:18:38 bm3850a kernel: [] scsi_error_handler+0x112/0x15a [scsi_mod] Best Regards ================= Ashish Batwara, PMP | Firmware Architect | Mobile: +1 316 253 9784 | email: ashish.batwara at lsi.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From kliteyn at mellanox.co.il Sat Jan 26 17:32:47 2008 From: kliteyn at mellanox.co.il (kliteyn at mellanox.co.il) Date: 27 Jan 2008 03:32:47 +0200 Subject: [ofa-general] nightly osm_sim report 2008-01-27:normal completion Message-ID: OSM Simulation Regression Summary [Generated mail - please do NOT reply] OpenSM binary date = 2008-01-26 OpenSM git rev = Thu_Jan_24_14:01:29_2008 [3dffa989fdf84475a8c1bf53813e155c30696bb8] ibutils git rev = Mon_Dec_24_10:42:01_2007 [675bec82306d6920555dd0b5e2f664983e27e60f] Total=400 Pass=398 Fail=2 Pass: 30 Stability IS1-16.topo 30 Pkey IS1-16.topo 30 OsmTest IS1-16.topo 30 OsmStress IS1-16.topo 30 Multicast IS1-16.topo 30 LidMgr IS1-16.topo 10 Stability IS3-loop.topo 10 Stability IS3-128.topo 10 Pkey IS3-128.topo 10 OsmTest IS3-loop.topo 10 OsmTest IS3-128.topo 10 OsmStress IS3-128.topo 10 Multicast IS3-loop.topo 10 Multicast IS3-128.topo 10 FatTree merge-roots-4-ary-2-tree.topo 10 FatTree merge-root-4-ary-3-tree.topo 10 FatTree gnu-stallion-64.topo 10 FatTree blend-4-ary-2-tree.topo 10 FatTree RhinoDDR.topo 10 FatTree FullGnu.topo 10 FatTree 4-ary-2-tree.topo 10 FatTree 2-ary-4-tree.topo 10 FatTree 12-node-spaced.topo 10 FTreeFail 4-ary-2-tree-missing-sw-link.topo 10 FTreeFail 4-ary-2-tree-links-at-same-rank-2.topo 10 
FTreeFail 4-ary-2-tree-links-at-same-rank-1.topo 10 FTreeFail 4-ary-2-tree-diff-num-pgroups.topo 8 LidMgr IS3-128.topo Failures: 2 LidMgr IS3-128.topo From gstreiff at NetEffect.com Sat Jan 26 17:38:53 2008 From: gstreiff at NetEffect.com (Glenn Streiff) Date: Sat, 26 Jan 2008 19:38:53 -0600 Subject: [ofa-general] [GIT PULL ofed-1.3] neteffect updates In-Reply-To: <479A65DF.2070407@opengridcomputing.com> Message-ID: <5E701717F2B2ED4EA60F87C8AA57B7CC0794FED4@venom2> Vlad, Please pull from updated neteffect repository for latest ofed 1.3 release candidate: git://git.openfabrics.org/~glenn/linux-2.6.git ofed_kernel This reflects content accepted into the upstream by Roland, plus: * updated MAINTAINTERS file * kernel.h backport (which you reviewed last week) Let me know if you want this posted as a patch to the community. * iw_nes_[1-3]00_*.patch backports commit Noticed these were listed as untracked in the previous maintainer's working respository (Glenn Grundstrom). The check build was failing without the commit. Passed build_ofa_kernel.sh. Thanks, From mashirle at us.ibm.com Sat Jan 26 08:28:44 2008 From: mashirle at us.ibm.com (Shirley Ma) Date: Sat, 26 Jan 2008 08:28:44 -0800 Subject: [ofa-general] [PATCH] IPoIB UD 4K MTU support In-Reply-To: <1201340657.10918.8.camel@dyn9047018117.beaverton.ibm.com> References: <1201025321.756.33.camel@localhost.localdomain> <1201102346.6925.215.camel@mtls03> <1201068498.756.59.camel@localhost.localdomain> <4e6a6b3c0801231354v24baa6d8q5034a1cdaf3c704f@mail.gmail.com> <1201162492.9739.21.camel@localhost.localdomain> <1201181656.9739.42.camel@localhost.localdomain> <1201340657.10918.8.camel@dyn9047018117.beaverton.ibm.com> Message-ID: <1201364924.10918.12.camel@dyn9047018117.beaverton.ibm.com> I am working on the patch now. I will do some touch test and submit the patch for review this weekend. The patch will be included: 1. make IPoIB-CM S/G buffer allocation more generic 2. 
enable IPoIB-UD S/G buffer allocation support for 4K MTU Thanks Shirley
From mashirle at us.ibm.com Sat Jan 26 13:42:15 2008 From: mashirle at us.ibm.com (Shirley Ma) Date: Sat, 26 Jan 2008 13:42:15 -0800 Subject: [ofa-general] [RFC] IPoIB-UD S/G 4K MTU support against ofa-1.3-rc2 In-Reply-To: <1201162492.9739.21.camel@localhost.localdomain> References: <1201025321.756.33.camel@localhost.localdomain> <1201102346.6925.215.camel@mtls03> <1201068498.756.59.camel@localhost.localdomain> <4e6a6b3c0801231354v24baa6d8q5034a1cdaf3c704f@mail.gmail.com> <1201162492.9739.21.camel@localhost.localdomain> Message-ID: <1201383736.10918.34.camel@dyn9047018117.beaverton.ibm.com> This patch is built against OFED-1.3-rc2. A single-node touch test has been done. I am sending it out for early review comments while more testing is going on, and I will integrate your comments into my testing immediately. I will break this patch into several smaller ones for the 2.6.25 submission later. This patch allows an IPoIB-UD MTU of up to 4092 (4K - IPOIB_ENCAP_LEN) when the HCA supports a 4K MTU. In this patch, the APIs for S/G buffer allocation in IPoIB-CM mode have been made generic so that IPoIB-UD and IPoIB-CM can share the S/G code. When PAGE_SIZE is equal to or greater than IPOIB_UD_BUF_SIZE plus the padding bytes that align the IP header, only one buffer is needed for the 4K MTU buffer allocation; otherwise, two buffers are needed in the S/G list. The node's IPoIB link MTU is the minimum of the admin-configurable MTU (set through ifconfig) and the MTU of the IPoIB default broadcast group. When the Subnet Manager enables the default broadcast group during start-up, the subnet's IPoIB link MTU is the MTU of the default broadcast group. A node whose IB MTU is smaller than this value cannot join the IPoIB subnet. A node whose IB MTU is greater than this value joins the IPoIB subnet, and this value is set as its IPoIB link MTU.
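The size arithmetic in this description can be worked through in plain C. The sketch below is a user-space illustration under stated assumptions, not the patch's code: ud_mtu(), ud_buf_size(), ud_rx_sg() and link_mtu() are hypothetical function versions of the macros the patch adds to ipoib.h, the page size is passed in as a parameter instead of using PAGE_SIZE, and the extra alignment padding mentioned above is ignored.

```c
#include <assert.h>

#define IB_GRH_BYTES	40	/* GRH delivered ahead of every UD payload */
#define IPOIB_ENCAP_LEN	4	/* IPoIB encapsulation header */

/* Largest IP MTU that fits in one IB payload of ib_mtu bytes. */
static unsigned int ud_mtu(unsigned int ib_mtu)
{
	return ib_mtu - IPOIB_ENCAP_LEN;
}

/* Receive buffer bytes needed: IB payload plus the GRH. */
static unsigned int ud_buf_size(unsigned int ib_mtu)
{
	return ib_mtu + IB_GRH_BYTES;
}

/* Number of page-sized S/G entries needed to hold one receive. */
static unsigned int ud_rx_sg(unsigned int ib_mtu, unsigned int page_size)
{
	assert(page_size > 0);
	return (ud_buf_size(ib_mtu) + page_size - 1) / page_size;
}

/* Link MTU: the smaller of the admin setting and the broadcast group MTU. */
static unsigned int link_mtu(unsigned int admin_mtu, unsigned int bcast_mtu)
{
	return admin_mtu < bcast_mtu ? admin_mtu : bcast_mtu;
}
```

For a 4096-byte IB MTU the receive needs 4136 bytes, so ud_rx_sg(4096, 4096) is 2 while ud_rx_sg(4096, 65536) is 1, which matches the one-buffer versus two-buffer cases described above; ud_mtu(4096) is 4092.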
If Subnet Manager disables default broadcast group during start up, the first bring up node in this subnet will create the default IPoIB broadcast group based on the negotiation with the Subnet Manager. Sign-off-by: Shirley Ma --- diff -urpN ipoib-orig/ipoib_cm.c ipoib-4kmtu/ipoib_cm.c --- ipoib-orig/ipoib_cm.c 2008-01-26 20:52:49.000000000 -0600 +++ ipoib-4kmtu/ipoib_cm.c 2008-01-26 23:52:42.000000000 -0600 @@ -72,17 +72,6 @@ static struct ib_send_wr ipoib_cm_rx_dra static int ipoib_cm_tx_handler(struct ib_cm_id *cm_id, struct ib_cm_event *event); -static void ipoib_cm_dma_unmap_rx(struct ipoib_dev_priv *priv, int frags, - u64 mapping[IPOIB_CM_RX_SG]) -{ - int i; - - ib_dma_unmap_single(priv->ca, mapping[0], IPOIB_CM_HEAD_SIZE, DMA_FROM_DEVICE); - - for (i = 0; i < frags; ++i) - ib_dma_unmap_single(priv->ca, mapping[i + 1], PAGE_SIZE, DMA_FROM_DEVICE); -} - static int ipoib_cm_post_receive(struct net_device *dev, int id) { struct ipoib_dev_priv *priv = netdev_priv(dev); @@ -97,8 +86,9 @@ static int ipoib_cm_post_receive(struct ret = ib_post_srq_recv(priv->cm.srq, &priv->cm.rx_wr, &bad_wr); if (unlikely(ret)) { ipoib_warn(priv, "post srq failed for buf %d (%d)\n", id, ret); - ipoib_cm_dma_unmap_rx(priv, IPOIB_CM_RX_SG - 1, - priv->cm.srq_ring[id].mapping); + ipoib_dma_unmap_rx(priv, IPOIB_CM_RX_SG - 1, + IPOIB_CM_HEAD_SIZE, + priv->cm.srq_ring[id].mapping); dev_kfree_skb_any(priv->cm.srq_ring[id].skb); priv->cm.srq_ring[id].skb = NULL; } @@ -106,57 +96,6 @@ static int ipoib_cm_post_receive(struct return ret; } -static struct sk_buff *ipoib_cm_alloc_rx_skb(struct net_device *dev, int id, int frags, - u64 mapping[IPOIB_CM_RX_SG]) -{ - struct ipoib_dev_priv *priv = netdev_priv(dev); - struct sk_buff *skb; - int i; - - skb = dev_alloc_skb(IPOIB_CM_HEAD_SIZE + 12); - if (unlikely(!skb)) - return NULL; - - /* - * IPoIB adds a 4 byte header. So we need 12 more bytes to align the - * IP header to a multiple of 16. 
- */ - skb_reserve(skb, 12); - - mapping[0] = ib_dma_map_single(priv->ca, skb->data, IPOIB_CM_HEAD_SIZE, - DMA_FROM_DEVICE); - if (unlikely(ib_dma_mapping_error(priv->ca, mapping[0]))) { - dev_kfree_skb_any(skb); - return NULL; - } - - for (i = 0; i < frags; i++) { - struct page *page = alloc_page(GFP_ATOMIC); - - if (!page) - goto partial_error; - skb_fill_page_desc(skb, i, page, 0, PAGE_SIZE); - - mapping[i + 1] = ib_dma_map_page(priv->ca, skb_shinfo(skb)->frags[i].page, - 0, PAGE_SIZE, DMA_FROM_DEVICE); - if (unlikely(ib_dma_mapping_error(priv->ca, mapping[i + 1]))) - goto partial_error; - } - - priv->cm.srq_ring[id].skb = skb; - return skb; - -partial_error: - - ib_dma_unmap_single(priv->ca, mapping[0], IPOIB_CM_HEAD_SIZE, DMA_FROM_DEVICE); - - for (; i > 0; --i) - ib_dma_unmap_single(priv->ca, mapping[i], PAGE_SIZE, DMA_FROM_DEVICE); - - dev_kfree_skb_any(skb); - return NULL; -} - static void ipoib_cm_start_rx_drain(struct ipoib_dev_priv* priv) { struct ib_send_wr *bad_wr; @@ -367,38 +306,6 @@ static int ipoib_cm_rx_handler(struct ib return 0; } } -/* Adjust length of skb with fragments to match received data */ -static void skb_put_frags(struct sk_buff *skb, unsigned int hdr_space, - unsigned int length, struct sk_buff *toskb) -{ - int i, num_frags; - unsigned int size; - - /* put header into skb */ - size = min(length, hdr_space); - skb->tail += size; - skb->len += size; - length -= size; - - num_frags = skb_shinfo(skb)->nr_frags; - for (i = 0; i < num_frags; i++) { - skb_frag_t *frag = &skb_shinfo(skb)->frags[i]; - - if (length == 0) { - /* don't need this page */ - skb_fill_page_desc(toskb, i, frag->page, 0, PAGE_SIZE); - --skb_shinfo(skb)->nr_frags; - } else { - size = min(length, (unsigned) PAGE_SIZE); - - frag->size = size; - skb->data_len += size; - skb->truesize += size; - skb->len += size; - length -= size; - } - } -} void ipoib_cm_handle_rx_wc(struct net_device *dev, struct ib_wc *wc) { @@ -453,7 +360,7 @@ void ipoib_cm_handle_rx_wc(struct net_de 
frags = PAGE_ALIGN(wc->byte_len - min(wc->byte_len, (unsigned)IPOIB_CM_HEAD_SIZE)) / PAGE_SIZE; - newskb = ipoib_cm_alloc_rx_skb(dev, wr_id, frags, mapping); + newskb = ipoib_alloc_rx_skb(dev, wr_id, frags, IPOIB_CM_HEAD_SIZE, 12, mapping); if (unlikely(!newskb)) { /* * If we can't allocate a new RX buffer, dump @@ -464,7 +371,9 @@ void ipoib_cm_handle_rx_wc(struct net_de goto repost; } - ipoib_cm_dma_unmap_rx(priv, frags, priv->cm.srq_ring[wr_id].mapping); + priv->cm.srq_ring[wr_id].skb = newskb; + ipoib_dma_unmap_rx(priv, frags, IPOIB_CM_HEAD_SIZE, + priv->cm.srq_ring[wr_id].mapping); memcpy(priv->cm.srq_ring[wr_id].mapping, mapping, (frags + 1) * sizeof *mapping); ipoib_dbg_data(priv, "received %d bytes, SLID 0x%04x\n", @@ -1334,8 +1243,10 @@ int ipoib_cm_dev_init(struct net_device priv->cm.rx_wr.num_sge = IPOIB_CM_RX_SG; for (i = 0; i < ipoib_recvq_size; ++i) { - if (!ipoib_cm_alloc_rx_skb(dev, i, IPOIB_CM_RX_SG - 1, - priv->cm.srq_ring[i].mapping)) { + priv->cm.srq_ring[i].skb = ipoib_alloc_rx_skb(dev, i, IPOIB_CM_RX_SG - 1, + IPOIB_CM_HEAD_SIZE, 12, + priv->cm.srq_ring[i].mapping); + if (!priv->cm.srq_ring[i].skb) { ipoib_warn(priv, "failed to allocate receive buffer %d\n", i); ipoib_cm_dev_cleanup(dev); return -ENOMEM; @@ -1370,8 +1281,9 @@ void ipoib_cm_dev_cleanup(struct net_dev return; for (i = 0; i < ipoib_recvq_size; ++i) if (priv->cm.srq_ring[i].skb) { - ipoib_cm_dma_unmap_rx(priv, IPOIB_CM_RX_SG - 1, - priv->cm.srq_ring[i].mapping); + ipoib_dma_unmap_rx(priv, IPOIB_CM_RX_SG - 1, + IPOIB_CM_HEAD_SIZE, + priv->cm.srq_ring[i].mapping); dev_kfree_skb_any(priv->cm.srq_ring[i].skb); priv->cm.srq_ring[i].skb = NULL; } diff -urpN ipoib-orig/ipoib.h ipoib-4kmtu/ipoib.h --- ipoib-orig/ipoib.h 2008-01-26 20:52:49.000000000 -0600 +++ ipoib-4kmtu/ipoib.h 2008-01-26 21:28:03.000000000 -0600 @@ -56,10 +56,9 @@ /* constants */ enum { - IPOIB_PACKET_SIZE = 2048, - IPOIB_BUF_SIZE = IPOIB_PACKET_SIZE + IB_GRH_BYTES, - IPOIB_ENCAP_LEN = 4, + IPOIB_MAX_IB_MTU = 4096, /* 
max ib device payload is 4096 */ + IPOIB_UD_MAX_RX_SG = ALIGN(IPOIB_MAX_IB_MTU + IB_GRH_BYTES + 4, PAGE_SIZE) / PAGE_SIZE, /* padding to align IP header */ IPOIB_CM_MTU = 0x10000 - 0x10, /* padding to align header to 16 */ IPOIB_CM_BUF_SIZE = IPOIB_CM_MTU + IPOIB_ENCAP_LEN, @@ -141,7 +140,7 @@ struct ipoib_mcast { struct ipoib_rx_buf { struct sk_buff *skb; - u64 mapping; + u64 mapping[MAX_SKB_FRAGS + 1]; }; struct ipoib_tx_buf { @@ -281,14 +280,9 @@ struct ipoib_cm_tx { struct ib_wc ibwc[IPOIB_NUM_WC]; }; -struct ipoib_cm_rx_buf { - struct sk_buff *skb; - u64 mapping[IPOIB_CM_RX_SG]; -}; - struct ipoib_cm_dev_priv { struct ib_srq *srq; - struct ipoib_cm_rx_buf *srq_ring; + struct ipoib_rx_buf *srq_ring; struct ib_cm_id *id; struct list_head passive_ids; /* state: LIVE */ struct list_head rx_error_list; /* state: ERROR */ @@ -391,6 +385,9 @@ struct ipoib_dev_priv { struct dentry *path_dentry; #endif struct ipoib_ethtool_st etool; + unsigned int max_ib_mtu; + struct ib_sge rx_sge[IPOIB_UD_MAX_RX_SG]; + struct ib_recv_wr rx_wr; }; struct ipoib_ah { @@ -487,6 +484,14 @@ int ipoib_ib_dev_stop(struct net_device int ipoib_dev_init(struct net_device *dev, struct ib_device *ca, int port); void ipoib_dev_cleanup(struct net_device *dev); +void ipoib_dma_unmap_rx(struct ipoib_dev_priv *priv, int frags, int head_size, + u64 *mapping); +void skb_put_frags(struct sk_buff *skb, unsigned int hdr_space, + unsigned int length, struct sk_buff *toskb); +struct sk_buff *ipoib_alloc_rx_skb(struct net_device *dev, + int id, int frags, int head_size, + int pad, u64 *mapping); + void ipoib_mcast_join_task(struct work_struct *work); void ipoib_mcast_send(struct net_device *dev, void *mgid, struct sk_buff *skb); @@ -536,6 +541,11 @@ void ipoib_drain_cq(struct net_device *d void ipoib_set_ethtool_ops(struct net_device *dev); +#define IPOIB_UD_MTU(ib_mtu) (ib_mtu - IPOIB_ENCAP_LEN) +#define IPOIB_UD_BUF_SIZE(ib_mtu) (ib_mtu + IB_GRH_BYTES) /* padding to align IP header */ +#define 
IPOIB_UD_HEAD_SIZE(ib_mtu) (IPOIB_UD_BUF_SIZE(ib_mtu)) % PAGE_SIZE +#define IPOIB_UD_RX_SG(ib_mtu) ALIGN(IPOIB_UD_BUF_SIZE(ib_mtu), PAGE_SIZE) / PAGE_SIZE + #ifdef CONFIG_INFINIBAND_IPOIB_CM #define IPOIB_FLAGS_RC 0x80 diff -urpN ipoib-orig/ipoib_ib.c ipoib-4kmtu/ipoib_ib.c --- ipoib-orig/ipoib_ib.c 2008-01-26 20:52:49.000000000 -0600 +++ ipoib-4kmtu/ipoib_ib.c 2008-01-26 22:48:41.000000000 -0600 @@ -90,63 +90,118 @@ void ipoib_free_ah(struct kref *kref) spin_unlock_irqrestore(&priv->lock, flags); } -static int ipoib_ib_post_receive(struct net_device *dev, int id) +/* Adjust length of skb with fragments to match received data */ +void skb_put_frags(struct sk_buff *skb, unsigned int hdr_space, + unsigned int length, struct sk_buff *toskb) { - struct ipoib_dev_priv *priv = netdev_priv(dev); - struct ib_sge list; - struct ib_recv_wr param; - struct ib_recv_wr *bad_wr; - int ret; + int i, num_frags; + unsigned int size; - list.addr = priv->rx_ring[id].mapping; - list.length = IPOIB_BUF_SIZE; - list.lkey = priv->mr->lkey; - - param.next = NULL; - param.wr_id = id | IPOIB_OP_RECV; - param.sg_list = &list; - param.num_sge = 1; + /* put header into skb */ + size = min(length, hdr_space); + skb->tail += size; + skb->len += size; + length -= size; - ret = ib_post_recv(priv->qp, &param, &bad_wr); - if (unlikely(ret)) { - ipoib_warn(priv, "receive failed for buf %d (%d)\n", id, ret); - ib_dma_unmap_single(priv->ca, priv->rx_ring[id].mapping, - IPOIB_BUF_SIZE, DMA_FROM_DEVICE); - dev_kfree_skb_any(priv->rx_ring[id].skb); - priv->rx_ring[id].skb = NULL; + num_frags = skb_shinfo(skb)->nr_frags; + for (i = 0; i < num_frags; i++) { + skb_frag_t *frag = &skb_shinfo(skb)->frags[i]; + + if (length == 0) { + /* don't need this page */ + skb_fill_page_desc(toskb, i, frag->page, 0, PAGE_SIZE); + --skb_shinfo(skb)->nr_frags; + } else { + size = min(length, (unsigned) PAGE_SIZE); + + frag->size = size; + skb->data_len += size; + skb->truesize += size; + skb->len += size; + length -= size; + } 
} +} - return ret; +void ipoib_dma_unmap_rx(struct ipoib_dev_priv *priv, int frags, int head_size, u64 *mapping) +{ + int i; + ib_dma_unmap_single(priv->ca, mapping[0], head_size, + DMA_FROM_DEVICE); + for (i = 0; i < frags; i++) + ib_dma_unmap_single(priv->ca, mapping[i+1], PAGE_SIZE, + DMA_FROM_DEVICE); } -static int ipoib_alloc_rx_skb(struct net_device *dev, int id) +struct sk_buff *ipoib_alloc_rx_skb(struct net_device *dev, int id, int frags, + int head_size, int pad, u64 *mapping) { struct ipoib_dev_priv *priv = netdev_priv(dev); struct sk_buff *skb; - u64 addr; + int i; - skb = dev_alloc_skb(IPOIB_BUF_SIZE + 4); - if (!skb) - return -ENOMEM; + skb = dev_alloc_skb(head_size + pad); + if (unlikely(!skb)) + return NULL; /* - * IB will leave a 40 byte gap for a GRH and IPoIB adds a 4 byte - * header. So we need 4 more bytes to get to 48 and align the + * IPoIB adds a 4 byte header. So we need 12 more bytes to align the * IP header to a multiple of 16. */ - skb_reserve(skb, 4); + skb_reserve(skb, pad); - addr = ib_dma_map_single(priv->ca, skb->data, IPOIB_BUF_SIZE, - DMA_FROM_DEVICE); - if (unlikely(ib_dma_mapping_error(priv->ca, addr))) { + mapping[0] = ib_dma_map_single(priv->ca, skb->data, head_size, + DMA_FROM_DEVICE); + if (unlikely(ib_dma_mapping_error(priv->ca, mapping[0]))) { dev_kfree_skb_any(skb); - return -EIO; + return NULL; } - priv->rx_ring[id].skb = skb; - priv->rx_ring[id].mapping = addr; + for (i = 0; i < frags; i++) { + struct page *page = alloc_page(GFP_ATOMIC); - return 0; + if (!page) + goto partial_error; + skb_fill_page_desc(skb, i, page, 0, PAGE_SIZE); + + mapping[i + 1] = ib_dma_map_page(priv->ca, skb_shinfo(skb)->frags[i].page, + 0, PAGE_SIZE, DMA_FROM_DEVICE); + if (unlikely(ib_dma_mapping_error(priv->ca, mapping[i + 1]))) + goto partial_error; + } + + return skb; + +partial_error: + + ib_dma_unmap_single(priv->ca, mapping[0], head_size, DMA_FROM_DEVICE); + + for (; i > 0; --i) + ib_dma_unmap_single(priv->ca, mapping[i], PAGE_SIZE, 
DMA_FROM_DEVICE); + + dev_kfree_skb_any(skb); + return NULL; +} + +static int ipoib_ib_post_receive(struct net_device *dev, int id) +{ + struct ipoib_dev_priv *priv = netdev_priv(dev); + struct ib_recv_wr *bad_wr; + int ret, i; + + priv->rx_wr.wr_id = id | IPOIB_OP_RECV; + for (i = 0; i < IPOIB_UD_RX_SG(priv->max_ib_mtu); ++i) + priv->rx_sge[i].addr = priv->rx_ring[id].mapping[i]; + ret = ib_post_recv(priv->qp, &priv->rx_wr, &bad_wr); + if (unlikely(ret)) { + ipoib_warn(priv, "receive failed for buf %d (%d)\n", id, ret); + ipoib_dma_unmap_rx(priv, IPOIB_UD_RX_SG(priv->max_ib_mtu) - 1, + IPOIB_UD_HEAD_SIZE(priv->max_ib_mtu), priv->rx_ring[id].mapping); + dev_kfree_skb_any(priv->rx_ring[id].skb); + priv->rx_ring[id].skb = NULL; + } + + return ret; } static int ipoib_ib_post_receives(struct net_device *dev) @@ -154,13 +209,24 @@ static int ipoib_ib_post_receives(struct struct ipoib_dev_priv *priv = netdev_priv(dev); int i; + for (i = 0; i < IPOIB_UD_RX_SG(priv->max_ib_mtu); ++i) + priv->rx_sge[i].lkey = priv->mr->lkey; + priv->rx_sge[0].length = IPOIB_UD_HEAD_SIZE(priv->max_ib_mtu); + for (i = 0; i < IPOIB_UD_RX_SG(priv->max_ib_mtu) - 1; ++i) + priv->rx_sge[i+1].length = PAGE_SIZE; + priv->rx_wr.num_sge = IPOIB_UD_RX_SG(priv->max_ib_mtu); + priv->rx_wr.next = NULL; + priv->rx_wr.sg_list = priv->rx_sge; + for (i = 0; i < ipoib_recvq_size; ++i) { - if (ipoib_alloc_rx_skb(dev, i)) { - ipoib_warn(priv, "failed to allocate receive buffer %d\n", i); + priv->rx_ring[i].skb = ipoib_alloc_rx_skb(dev, i, IPOIB_UD_RX_SG(priv->max_ib_mtu) - 1, + IPOIB_UD_HEAD_SIZE(priv->max_ib_mtu), 4, + priv->rx_ring[i].mapping); + if (!priv->rx_ring[i].skb) return -ENOMEM; - } if (ipoib_ib_post_receive(dev, i)) { ipoib_warn(priv, "ipoib_ib_post_receive failed for buf %d\n", i); + ipoib_dev_cleanup(dev); return -EIO; } } @@ -172,9 +238,9 @@ static void ipoib_ib_handle_rx_wc(struct { struct ipoib_dev_priv *priv = netdev_priv(dev); unsigned int wr_id = wc->wr_id & ~IPOIB_OP_RECV; - struct sk_buff 
*skb; + struct sk_buff *skb, *newskb; + u64 mapping[IPOIB_UD_RX_SG(priv->max_ib_mtu)]; struct ipoib_header *header; - u64 addr; ipoib_dbg_data(priv, "recv completion: id %d, status: %d\n", wr_id, wc->status); @@ -186,15 +252,15 @@ static void ipoib_ib_handle_rx_wc(struct } skb = priv->rx_ring[wr_id].skb; - addr = priv->rx_ring[wr_id].mapping; if (unlikely(wc->status != IB_WC_SUCCESS)) { if (wc->status != IB_WC_WR_FLUSH_ERR) ipoib_warn(priv, "failed recv event " "(status=%d, wrid=%d vend_err %x)\n", wc->status, wr_id, wc->vendor_err); - ib_dma_unmap_single(priv->ca, addr, - IPOIB_BUF_SIZE, DMA_FROM_DEVICE); + ipoib_dma_unmap_rx(priv, IPOIB_UD_RX_SG(priv->max_ib_mtu) - 1, + IPOIB_UD_HEAD_SIZE(priv->max_ib_mtu), + priv->rx_ring[wr_id].mapping); dev_kfree_skb_any(skb); priv->rx_ring[wr_id].skb = NULL; return; @@ -211,17 +277,23 @@ static void ipoib_ib_handle_rx_wc(struct * If we can't allocate a new RX buffer, dump * this packet and reuse the old buffer. */ - if (unlikely(ipoib_alloc_rx_skb(dev, wr_id))) { + newskb = ipoib_alloc_rx_skb(dev, wr_id, IPOIB_UD_RX_SG(priv->max_ib_mtu) - 1, + IPOIB_UD_HEAD_SIZE(priv->max_ib_mtu), 4, mapping); + if (unlikely(!newskb)) { + ipoib_dbg(priv, "failed to allocate receive buffer %d\n", wr_id); ++priv->stats.rx_dropped; goto repost; } + priv->rx_ring[wr_id].skb = newskb; ipoib_dbg_data(priv, "received %d bytes, SLID 0x%04x\n", wc->byte_len, wc->slid); - ib_dma_unmap_single(priv->ca, addr, IPOIB_BUF_SIZE, DMA_FROM_DEVICE); + ipoib_dma_unmap_rx(priv, IPOIB_UD_RX_SG(priv->max_ib_mtu) - 1, + IPOIB_UD_HEAD_SIZE(priv->max_ib_mtu), priv->rx_ring[wr_id].mapping); + memcpy(priv->rx_ring[wr_id].mapping, mapping, IPOIB_UD_RX_SG(priv->max_ib_mtu) * sizeof *mapping); - skb_put(skb, wc->byte_len); + skb_put_frags(skb, IPOIB_UD_HEAD_SIZE(priv->max_ib_mtu), wc->byte_len, newskb); skb_pull(skb, IB_GRH_BYTES); header = (struct ipoib_header *)skb->data; @@ -692,10 +764,10 @@ int ipoib_ib_dev_stop(struct net_device rx_req = &priv->rx_ring[i]; if 
(!rx_req->skb) continue; - ib_dma_unmap_single(priv->ca, - rx_req->mapping, - IPOIB_BUF_SIZE, - DMA_FROM_DEVICE); + ipoib_dma_unmap_rx(priv, + IPOIB_UD_RX_SG(priv->max_ib_mtu) - 1, + IPOIB_UD_HEAD_SIZE(priv->max_ib_mtu), + priv->rx_ring[i].mapping); dev_kfree_skb_any(rx_req->skb); rx_req->skb = NULL; } diff -urpN ipoib-orig/ipoib_main.c ipoib-4kmtu/ipoib_main.c --- ipoib-orig/ipoib_main.c 2008-01-26 20:52:49.000000000 -0600 +++ ipoib-4kmtu/ipoib_main.c 2008-01-26 21:28:03.000000000 -0600 @@ -193,7 +193,7 @@ static int ipoib_change_mtu(struct net_d return 0; } - if (new_mtu > IPOIB_PACKET_SIZE - IPOIB_ENCAP_LEN) { + if (new_mtu > IPOIB_UD_MTU(priv->max_ib_mtu)) { return -EINVAL; } @@ -1053,10 +1053,6 @@ static void ipoib_setup(struct net_devic set_bit(IPOIB_FLAG_HW_CSUM, &priv->flags); } - /* MTU will be reset when mcast join happens */ - dev->mtu = IPOIB_PACKET_SIZE - IPOIB_ENCAP_LEN; - priv->mcast_mtu = priv->admin_mtu = dev->mtu; - memcpy(dev->broadcast, ipv4_bcast_addr, INFINIBAND_ALEN); netif_carrier_off(dev); @@ -1208,6 +1204,7 @@ static struct net_device *ipoib_add_port struct ib_device *hca, u8 port) { struct ipoib_dev_priv *priv; + struct ib_port_attr attr; int result = -ENOMEM; priv = ipoib_intf_alloc(format); @@ -1218,6 +1215,18 @@ static struct net_device *ipoib_add_port priv->dev->features |= NETIF_F_HIGHDMA; + if (!ib_query_port(hca, port, &attr)) + priv->max_ib_mtu = ib_mtu_enum_to_int(attr.max_mtu); + else { + printk(KERN_WARNING "%s: ib_query_port %d failed\n", + hca->name, port); + goto device_init_failed; + } + + /* MTU will be reset when mcast join happens */ + priv->dev->mtu = IPOIB_UD_MTU(priv->max_ib_mtu); + priv->mcast_mtu = priv->admin_mtu = priv->dev->mtu; + result = ib_query_pkey(hca, port, 0, &priv->pkey); if (result) { printk(KERN_WARNING "%s: ib_query_pkey port %d failed (ret = %d)\n", diff -urpN ipoib-orig/ipoib_multicast.c ipoib-4kmtu/ipoib_multicast.c --- ipoib-orig/ipoib_multicast.c 2008-01-26 20:52:49.000000000 -0600 +++ 
ipoib-4kmtu/ipoib_multicast.c 2008-01-26 21:28:03.000000000 -0600 @@ -567,9 +567,7 @@ void ipoib_mcast_join_task(struct work_s return; } - priv->mcast_mtu = ib_mtu_enum_to_int(priv->broadcast->mcmember.mtu) - - IPOIB_ENCAP_LEN; - + priv->mcast_mtu = IPOIB_UD_MTU(ib_mtu_enum_to_int(priv->broadcast->mcmember.mtu)); if (!ipoib_cm_admin_enabled(dev)) dev->mtu = min(priv->mcast_mtu, priv->admin_mtu); diff -urpN ipoib-orig/ipoib_verbs.c ipoib-4kmtu/ipoib_verbs.c --- ipoib-orig/ipoib_verbs.c 2008-01-26 20:52:49.000000000 -0600 +++ ipoib-4kmtu/ipoib_verbs.c 2008-01-26 21:28:03.000000000 -0600 @@ -150,7 +150,7 @@ int ipoib_transport_dev_init(struct net_ .max_send_wr = ipoib_sendq_size, .max_recv_wr = ipoib_recvq_size, .max_send_sge = dev->features & NETIF_F_SG ? MAX_SKB_FRAGS + 1 : 1, - .max_recv_sge = 1 + .max_recv_sge = IPOIB_UD_RX_SG(priv->max_ib_mtu) }, .sq_sig_type = IB_SIGNAL_ALL_WR, .qp_type = IB_QPT_UD, @@ -208,6 +208,16 @@ int ipoib_transport_dev_init(struct net_ priv->tx_wr.sg_list = priv->tx_sge; priv->tx_wr.send_flags = IB_SEND_SIGNALED; + priv->rx_sge[0].length = IPOIB_UD_HEAD_SIZE(priv->max_ib_mtu); + for (i = 0; i < IPOIB_UD_RX_SG(priv->max_ib_mtu); ++i) { + priv->rx_sge[i].lkey = priv->mr->lkey; + priv->rx_sge[i+1].length = PAGE_SIZE; + } + priv->rx_sge[i+1].length = PAGE_SIZE; + priv->rx_wr.num_sge = IPOIB_UD_RX_SG(priv->max_ib_mtu); + priv->rx_wr.next = NULL; + priv->rx_wr.sg_list = priv->rx_sge; + return 0; out_free_cq: 
From ogerlitz at voltaire.com Sun Jan 27 01:09:44 2008 From: ogerlitz at voltaire.com (Or Gerlitz) Date: Sun, 27 Jan 2008 11:09:44 +0200 Subject: [ofa-general] [PATCH 9/16] ib/ipoib: Add LSO support to ipoib In-Reply-To: <1200501486.13546.78.camel@mtls03> References: <1200501486.13546.78.camel@mtls03> Message-ID: <479C4A58.7050409@voltaire.com> Eli Cohen wrote: > Add LSO support to ipoib > --- a/drivers/infiniband/ulp/ipoib/ipoib_verbs.c > +++ b/drivers/infiniband/ulp/ipoib/ipoib_verbs.c > @@ -153,7 +153,8 @@ int ipoib_transport_dev_init(struct net_device *dev, struct ib_device *ca) > .max_recv_sge = 1 > }, > .sq_sig_type = IB_SIGNAL_ALL_WR, > - .qp_type = IB_QPT_UD > + .qp_type = IB_QPT_UD, > + .create_flags = QP_CREATE_LSO, This creation flag should be set only for devices supporting the IB_DEVICE_TCP_TSO capability Or From kliteyn at dev.mellanox.co.il Sun Jan 27 01:46:05 2008 From: kliteyn at dev.mellanox.co.il (Yevgeny Kliteynik) Date: Sun, 27 Jan 2008 11:46:05 +0200 Subject: [ofa-general] [PATCH] opensm/QoS: fixing RDS handling in QoS policy Message-ID: <479C52DD.20909@dev.mellanox.co.il> Sasha, Please apply the following patch to ofed_1_3 and master. Unlike SDP, RDS opens an RC pair of QPs per pair of IPs, so it has a predefined port number that in included in the service id. 
Removing all the code that was handling per-port service id for RDS and replacing it with the right port number. Signed-off-by: Yevgeny Kliteynik --- opensm/include/opensm/osm_qos_policy.h | 1 + opensm/opensm/osm_qos_parser.l | 2 - opensm/opensm/osm_qos_parser.y | 47 +++----------------------------- 3 files changed, 5 insertions(+), 45 deletions(-) diff --git a/opensm/include/opensm/osm_qos_policy.h b/opensm/include/opensm/osm_qos_policy.h index 82b6258..f5815d8 100644 --- a/opensm/include/opensm/osm_qos_policy.h +++ b/opensm/include/opensm/osm_qos_policy.h @@ -59,6 +59,7 @@ #define OSM_QOS_POLICY_ULP_SDP_SERVICE_ID 0x0000000000010000ULL #define OSM_QOS_POLICY_ULP_RDS_SERVICE_ID 0x0000000001060000ULL +#define OSM_QOS_POLICY_ULP_RDS_PORT 0x48CA #define OSM_QOS_POLICY_ULP_ISER_SERVICE_ID 0x0000000001060000ULL #define OSM_QOS_POLICY_ULP_ISER_PORT 0x035C diff --git a/opensm/opensm/osm_qos_parser.l b/opensm/opensm/osm_qos_parser.l index 41f8720..de59621 100644 --- a/opensm/opensm/osm_qos_parser.l +++ b/opensm/opensm/osm_qos_parser.l @@ -110,7 +110,6 @@ static void reset_new_line_flags(); #define START_ULP_SDP_DEFAULT {in_single_number = TRUE;} /* single number */ #define START_ULP_SDP_PORT {in_list_of_num_ranges = TRUE;} /* comma-separated list of hex or dec num ranges */ #define START_ULP_RDS_DEFAULT {in_single_number = TRUE;} /* single number */ -#define START_ULP_RDS_PORT {in_list_of_num_ranges = TRUE;} /* comma-separated list of hex or dec num ranges */ #define START_ULP_ISER_DEFAULT {in_single_number = TRUE;} /* single number */ #define START_ULP_ISER_PORT {in_list_of_num_ranges = TRUE;} /* comma-separated list of hex or dec num ranges */ #define START_ULP_SRP_GUID {in_list_of_num_ranges = TRUE;} /* comma-separated list of hex or dec num ranges */ @@ -278,7 +277,6 @@ QUOTED_TEXT \"[^\"]*\" {ULP_SDP}{WHITE_COMMA_WHITE}{PORT_NUM} { SAVE_POS; HANDLE_IF_IN_DESCRIPTION; START_ULP_SDP_PORT; return TK_ULP_SDP_PORT; } {ULP_RDS}{WHITE_DOTDOT_WHITE} { SAVE_POS; 
HANDLE_IF_IN_DESCRIPTION; START_ULP_RDS_DEFAULT; return TK_ULP_RDS_DEFAULT; } -{ULP_RDS}{WHITE_COMMA_WHITE}{PORT_NUM} { SAVE_POS; HANDLE_IF_IN_DESCRIPTION; START_ULP_RDS_PORT; return TK_ULP_RDS_PORT; } {ULP_ISER}{WHITE_DOTDOT_WHITE} { SAVE_POS; HANDLE_IF_IN_DESCRIPTION; START_ULP_SDP_DEFAULT; return TK_ULP_ISER_DEFAULT; } {ULP_ISER}{WHITE_COMMA_WHITE}{PORT_NUM} { SAVE_POS; HANDLE_IF_IN_DESCRIPTION; START_ULP_SDP_PORT; return TK_ULP_ISER_PORT; } diff --git a/opensm/opensm/osm_qos_parser.y b/opensm/opensm/osm_qos_parser.y index 8cae5f3..b1eecc0 100644 --- a/opensm/opensm/osm_qos_parser.y +++ b/opensm/opensm/osm_qos_parser.y @@ -261,7 +261,6 @@ static cl_list_t __ulp_match_rules; %token TK_ULP_SDP_DEFAULT %token TK_ULP_SDP_PORT %token TK_ULP_RDS_DEFAULT -%token TK_ULP_RDS_PORT %token TK_ULP_ISER_DEFAULT %token TK_ULP_ISER_PORT %token TK_ULP_SRP_GUID @@ -295,8 +294,7 @@ qos_policy_entry: qos_ulps_section * sdp, port-num 10000-20000 : 2 * sdp : 0 #default SL for SDP * srp, target-port-guid 0x1234 : 2 - * rds, port-num 25000 : 2 #SL for RDS when destination port is 25000 - * rds, : 0 #default SL for RDS + * rds : 0 #SL for RDS * iser, port-num 900 : 5 #SL for iSER where target port is 900 * iser : 4 #default SL for iSER * ipoib, pkey 0x0001 : 5 #SL for IPoIB on partition with pkey 0x0001 @@ -620,7 +618,6 @@ qos_match_rule_entry: qos_match_rule_use * sdp * sdp with port-num * rds - * rds with port-num * srp with port-guid * iser * iser with port-num @@ -773,51 +770,18 @@ qos_ulp: TK_ULP_DEFAULT single_number { } qos_ulp_sl | qos_ulp_type_rds_default { - /* "rds : sl" - default SL for RDS */ + /* "rds : sl" - SL for RDS */ uint64_t ** range_arr = (uint64_t **)malloc(sizeof(uint64_t *)); range_arr[0] = (uint64_t *)malloc(2*sizeof(uint64_t)); - range_arr[0][0] = OSM_QOS_POLICY_ULP_RDS_SERVICE_ID; - range_arr[0][1] = OSM_QOS_POLICY_ULP_RDS_SERVICE_ID + 0xFFFF; + range_arr[0][0] = range_arr[0][1] = + OSM_QOS_POLICY_ULP_RDS_SERVICE_ID + OSM_QOS_POLICY_ULP_RDS_PORT; 
p_current_qos_match_rule->service_id_range_arr = range_arr; p_current_qos_match_rule->service_id_range_len = 1; } qos_ulp_sl - | qos_ulp_type_rds_port list_of_ranges TK_DOTDOT { - /* rds with port numbers */ - uint64_t ** range_arr; - unsigned range_len; - unsigned i; - - if (!cl_list_count(&tmp_parser_struct.num_pair_list)) - { - __qos_parser_error("RDS ULP rule doesn't have port numbers"); - return 1; - } - - /* get all the port ranges */ - __rangelist2rangearr( &tmp_parser_struct.num_pair_list, - &range_arr, - &range_len ); - /* now translate these port numbers into service ids */ - for (i = 0; i < range_len; i++) - { - if (range_arr[i][0] > 0xFFFF || range_arr[i][1] > 0xFFFF) - { - __qos_parser_error("SDP port number out of range"); - return 1; - } - range_arr[i][0] += OSM_QOS_POLICY_ULP_RDS_SERVICE_ID; - range_arr[i][1] += OSM_QOS_POLICY_ULP_RDS_SERVICE_ID; - } - - p_current_qos_match_rule->service_id_range_arr = range_arr; - p_current_qos_match_rule->service_id_range_len = range_len; - - } qos_ulp_sl - | qos_ulp_type_iser_default { /* "iSER : sl" - default SL for iSER */ uint64_t ** range_arr = @@ -969,9 +933,6 @@ qos_ulp_type_sdp_port: TK_ULP_SDP_PORT qos_ulp_type_rds_default: TK_ULP_RDS_DEFAULT { __parser_ulp_match_rule_start(); }; -qos_ulp_type_rds_port: TK_ULP_RDS_PORT - { __parser_ulp_match_rule_start(); }; - qos_ulp_type_iser_default: TK_ULP_ISER_DEFAULT { __parser_ulp_match_rule_start(); }; -- 1.5.1.4 From ogerlitz at voltaire.com Sun Jan 27 02:02:21 2008 From: ogerlitz at voltaire.com (Or Gerlitz) Date: Sun, 27 Jan 2008 12:02:21 +0200 Subject: [ofa-general] Re: [PATCH 13/16 v2] IB/ipoib: Add support for modify CQ In-Reply-To: <1201193855.6755.66.camel@mtls03> References: <1201193855.6755.66.camel@mtls03> Message-ID: <479C56AD.5020404@voltaire.com> Eli Cohen wrote: > Add support for modify CQ Eli, The subject is wrong, it should be (as in V1) [PATCH 13/16] ib/core: Add support for modify CQ Or > Add support for modifying CQ parameters for controlling > 
event generation moderation. > > Signed-off-by: Eli Cohen > --- > > changes: > Fix spelling mistakes > Fix function documentation > > drivers/infiniband/core/verbs.c | 7 +++++++ > include/rdma/ib_verbs.h | 11 +++++++++++ > 2 files changed, 18 insertions(+), 0 deletions(-) > > diff --git a/drivers/infiniband/core/verbs.c b/drivers/infiniband/core/verbs.c > index 86ed8af..84709ed 100644 > --- a/drivers/infiniband/core/verbs.c > +++ b/drivers/infiniband/core/verbs.c > @@ -628,6 +628,13 @@ struct ib_cq *ib_create_cq(struct ib_device *device, > } > EXPORT_SYMBOL(ib_create_cq); > > +int ib_modify_cq(struct ib_cq *cq, u16 cq_count, u16 cq_period) > +{ > + return cq->device->modify_cq ? > + cq->device->modify_cq(cq, cq_count, cq_period) : -ENOSYS; > +} > +EXPORT_SYMBOL(ib_modify_cq); > + > int ib_destroy_cq(struct ib_cq *cq) > { > if (atomic_read(&cq->usecnt)) > diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h > index 6ef1729..a8f94a9 100644 > --- a/include/rdma/ib_verbs.h > +++ b/include/rdma/ib_verbs.h > @@ -984,6 +984,8 @@ struct ib_device { > int comp_vector, > struct ib_ucontext *context, > struct ib_udata *udata); > + int (*modify_cq)(struct ib_cq *cq, u16 cq_count, > + u16 cq_period); > int (*destroy_cq)(struct ib_cq *cq); > int (*resize_cq)(struct ib_cq *cq, int cqe, > struct ib_udata *udata); > @@ -1389,6 +1391,15 @@ struct ib_cq *ib_create_cq(struct ib_device *device, > int ib_resize_cq(struct ib_cq *cq, int cqe); > > /** > + * ib_modify_cq - Modifies moderation params of the CQ > + * @cq: The CQ to modify. > + * @cq_count: number of CQEs that will trigger an event > + * @cq_period: max period of time in usec before triggering an event > + * > + */ > +int ib_modify_cq(struct ib_cq *cq, u16 cq_count, u16 cq_period); > + > +/** > * ib_destroy_cq - Destroys the specified CQ. > * @cq: The CQ to destroy. 
> */ From ogerlitz at voltaire.com Sun Jan 27 02:03:07 2008 From: ogerlitz at voltaire.com (Or Gerlitz) Date: Sun, 27 Jan 2008 12:03:07 +0200 Subject: [ofa-general] Re: [PATCH 14/16 v2] IB/ipoib: Support modifying IPOIB CQ moderation params In-Reply-To: <1201193863.6755.67.camel@mtls03> References: <1201193863.6755.67.camel@mtls03> Message-ID: <479C56DB.7030000@voltaire.com> Eli Cohen wrote: > Support modifying IPOIB CQ moderation params > > This can be used to tune at run time the parameters controlling > the event (interrupt) generation rate and thus reduce the overhead > incurred by handling interrupts resulting in better throughput. Eli, Since IPoIB has one CQ, I am fine with the approach taken by this patch to let the rx lead and report also on tx moderation using the rx moderation params provided by the user. So my sole request here is that you just document this behavior in the change-log and in ipoib_get_coalesce Or From kliteyn at dev.mellanox.co.il Sun Jan 27 02:17:41 2008 From: kliteyn at dev.mellanox.co.il (Yevgeny Kliteynik) Date: Sun, 27 Jan 2008 12:17:41 +0200 Subject: [ofa-general] Standard RDS port number Message-ID: <479C5A45.2010701@dev.mellanox.co.il> Hello Olaf, I'm working on QoS management in OFED. I noticed the following in the rds.h: /* * XXX randomly chosen, but at least seems to be unused: * # 18464-18768 Unassigned * We should do better. We want a reserved port to discourage unpriv'ed * userspace from listening. * * port 18633 was the version that had ack frames on the wire. */ #define RDS_PORT 18634 I'm using this port number to recognize RDS connections in the QoS manager (OpenSM). How 'solid' is this RDS_PORT definition? Will it be standardized somehow? Do you have any plans to change it? 
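As a rough sketch of why the port value matters here, assuming the RDS service ID is simply the RDS base value from osm_qos_policy.h plus the 16-bit port (the helper name rds_service_id is made up for illustration and is not part of OpenSM):

```c
#include <assert.h>
#include <stdint.h>

/* Values as quoted from osm_qos_policy.h and rds.h. */
#define OSM_QOS_POLICY_ULP_RDS_SERVICE_ID 0x0000000001060000ULL
#define RDS_PORT 18634 /* 0x48CA */

/* Hypothetical helper, not an OpenSM function: base service ID plus port. */
static uint64_t rds_service_id(unsigned int port)
{
	assert(port <= 0xFFFF); /* the port has to fit in the low 16 bits */
	return OSM_QOS_POLICY_ULP_RDS_SERVICE_ID + port;
}
```

With RDS_PORT this gives 0x00000000010648CA, so a QoS match rule for RDS only needs that single service ID. If the port ever changes, the value recognized by the QoS manager has to change with it.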
Thanks -- Yevgeny From kliteyn at dev.mellanox.co.il Sun Jan 27 02:29:23 2008 From: kliteyn at dev.mellanox.co.il (Yevgeny Kliteynik) Date: Sun, 27 Jan 2008 12:29:23 +0200 Subject: [ofa-general] [PATCH v2] opensm/QoS: fixing RDS handling in QoS policy Message-ID: <479C5D03.2080104@dev.mellanox.co.il> Sasha, Please apply the following patch to ofed_1_3 and master. Unlike SDP, RDS opens an RC pair of QPs per pair of IPs, so it has a predefined port number that is included in the service id. Signed-off-by: Yevgeny Kliteynik --- opensm/include/opensm/osm_qos_policy.h | 1 + opensm/opensm/osm_qos_parser.y | 4 ++-- 2 files changed, 3 insertions(+), 2 deletions(-) diff --git a/opensm/include/opensm/osm_qos_policy.h b/opensm/include/opensm/osm_qos_policy.h index 82b6258..f5815d8 100644 --- a/opensm/include/opensm/osm_qos_policy.h +++ b/opensm/include/opensm/osm_qos_policy.h @@ -59,6 +59,7 @@ #define OSM_QOS_POLICY_ULP_SDP_SERVICE_ID 0x0000000000010000ULL #define OSM_QOS_POLICY_ULP_RDS_SERVICE_ID 0x0000000001060000ULL +#define OSM_QOS_POLICY_ULP_RDS_PORT 0x48CA #define OSM_QOS_POLICY_ULP_ISER_SERVICE_ID 0x0000000001060000ULL #define OSM_QOS_POLICY_ULP_ISER_PORT 0x035C diff --git a/opensm/opensm/osm_qos_parser.y b/opensm/opensm/osm_qos_parser.y index 8cae5f3..ef97d9f 100644 --- a/opensm/opensm/osm_qos_parser.y +++ b/opensm/opensm/osm_qos_parser.y @@ -777,8 +777,8 @@ qos_ulp: TK_ULP_DEFAULT single_number { uint64_t ** range_arr = (uint64_t **)malloc(sizeof(uint64_t *)); range_arr[0] = (uint64_t *)malloc(2*sizeof(uint64_t)); - range_arr[0][0] = OSM_QOS_POLICY_ULP_RDS_SERVICE_ID; - range_arr[0][1] = OSM_QOS_POLICY_ULP_RDS_SERVICE_ID + 0xFFFF; + range_arr[0][0] = range_arr[0][1] = + OSM_QOS_POLICY_ULP_RDS_SERVICE_ID + OSM_QOS_POLICY_ULP_RDS_PORT; p_current_qos_match_rule->service_id_range_arr = range_arr; p_current_qos_match_rule->service_id_range_len = 1; -- 1.5.1.4 
From vlad at lists.openfabrics.org Sun Jan 27 03:14:28 2008 From: vlad at lists.openfabrics.org (Vladimir Sokolovsky Mellanox) Date: Sun, 27 Jan 2008 03:14:28 -0800 (PST) Subject: [ofa-general] ofa_1_3_kernel 20080127-0200 daily build status Message-ID: <20080127111428.AD272E60A65@openfabrics.org> This email was generated automatically, please do not reply git_url: git://git.openfabrics.org/ofed_1_3/linux-2.6.git git_branch: ofed_kernel Common build parameters: --with-ipoib-mod --with-sdp-mod --with-srp-mod --with-user_mad-mod --with-user_access-mod --with-mthca-mod --with-mlx4-mod --with-core-mod --with-addr_trans-mod --with-rds-mod --with-cxgb3-mod --with-nes-mod Passed: Passed on i686 with 2.6.15-23-server Passed on i686 with linux-2.6.21.1 Passed on i686 with linux-2.6.22 Passed on i686 with linux-2.6.18 Passed on i686 with linux-2.6.17 Passed on i686 with linux-2.6.16 Passed on i686 with linux-2.6.14 Passed on i686 with linux-2.6.19 Passed on i686 with linux-2.6.15 Passed on i686 with linux-2.6.12 Passed on i686 with linux-2.6.13 Passed on ia64 with linux-2.6.12 Passed on ia64 with linux-2.6.13 Passed on x86_64 with linux-2.6.12 Passed on ia64 with linux-2.6.18 Passed on x86_64 with linux-2.6.22.5-31-default Passed on ia64 with linux-2.6.23 Passed on ppc64 with linux-2.6.16 Passed on x86_64 with linux-2.6.18 Passed on ppc64 with linux-2.6.15 Passed on ppc64 with linux-2.6.14 Passed on x86_64 with linux-2.6.16 Passed on ppc64 with linux-2.6.17 Passed on x86_64 with linux-2.6.19 Passed on powerpc with linux-2.6.13 
Passed on powerpc with linux-2.6.14 Passed on ia64 with linux-2.6.16 Passed on x86_64 with linux-2.6.14 Passed on ppc64 with linux-2.6.12 Passed on x86_64 with linux-2.6.13 Passed on ia64 with linux-2.6.19 Passed on powerpc with linux-2.6.12 Passed on powerpc with linux-2.6.15 Passed on ia64 with linux-2.6.14 Passed on ppc64 with linux-2.6.13 Passed on ia64 with linux-2.6.15 Passed on x86_64 with linux-2.6.20 Passed on ppc64 with linux-2.6.18 Passed on ia64 with linux-2.6.17 Passed on x86_64 with linux-2.6.15 Passed on x86_64 with linux-2.6.21.1 Passed on x86_64 with linux-2.6.17 Passed on x86_64 with linux-2.6.9-42.ELsmp Passed on x86_64 with linux-2.6.22 Passed on ia64 with linux-2.6.22 Passed on x86_64 with linux-2.6.16.43-0.3-smp Passed on ia64 with linux-2.6.21.1 Passed on x86_64 with linux-2.6.9-55.ELsmp Passed on ppc64 with linux-2.6.19 Passed on x86_64 with linux-2.6.18-53.el5 Passed on x86_64 with linux-2.6.16.21-0.8-smp Passed on x86_64 with linux-2.6.18-8.el5 Passed on ppc64 with linux-2.6.18-8.el5 Passed on ia64 with linux-2.6.16.21-0.8-default Passed on x86_64 with linux-2.6.18-1.2798.fc6 Failed: From ogerlitz at voltaire.com Sun Jan 27 04:51:39 2008 From: ogerlitz at voltaire.com (Or Gerlitz) Date: Sun, 27 Jan 2008 14:51:39 +0200 Subject: [ofa-general] Re: [PATCH 4/16 v2] IB/ipoib: Add checksum offload support for ipoib In-Reply-To: <1201193831.6755.63.camel@mtls03> References: <1201193831.6755.63.camel@mtls03> Message-ID: <479C7E5B.8040104@voltaire.com> Eli Cohen wrote: > --- a/drivers/infiniband/ulp/ipoib/ipoib_cm.c > +++ b/drivers/infiniband/ulp/ipoib/ipoib_cm.c > @@ -1234,6 +1234,11 @@ static ssize_t set_mode(struct device *d, struct device_attribute *attr, > set_bit(IPOIB_FLAG_ADMIN_CM, &priv->flags); > ipoib_warn(priv, "enabling connected mode " > "will cause multicast packet drops\n"); > + > + dev->features &= ~NETIF_F_IP_CSUM; if adding NETIF_F_IP_CSUM brings in NETIF_F_SG, why not ANDing here with ~NETIF_F_SG as well? 
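A sketch of what that could look like, untested and only under the assumption that scatter/gather really should be dropped together with checksum offload; the flag values below are stand-ins for the real definitions in netdevice.h, and cm_clear_offloads is a hypothetical helper:

```c
#include <assert.h>

/* Stand-in values for illustration; the real ones live in <linux/netdevice.h>. */
#define NETIF_F_SG      0x1u
#define NETIF_F_IP_CSUM 0x2u

/* Hypothetical helper: clear checksum offload and scatter/gather in one step. */
static unsigned int cm_clear_offloads(unsigned int features)
{
	unsigned int cleared = features & ~(NETIF_F_IP_CSUM | NETIF_F_SG);

	/* neither bit may survive the mask */
	assert(!(cleared & (NETIF_F_IP_CSUM | NETIF_F_SG)));
	return cleared;
}
```

In set_mode() this would amount to masking dev->features with both bits instead of NETIF_F_IP_CSUM alone.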
> + > + priv->tx_wr.send_flags &= ~IB_SEND_IP_CSUM; From vlad at dev.mellanox.co.il Sun Jan 27 04:52:05 2008 From: vlad at dev.mellanox.co.il (Vladimir Sokolovsky) Date: Sun, 27 Jan 2008 14:52:05 +0200 Subject: [ofa-general] [ANNOUNCE] ofed_1_3/linux-2.6.git updated to 2.6.24 Message-ID: <479C7E75.1090406@dev.mellanox.co.il> FYI, git://git.openfabrics.org/ofed_1_3/linux-2.6.git ofed_kernel I've merged in 2.6.24. Regards, Vladimir From vlad at dev.mellanox.co.il Sun Jan 27 04:52:32 2008 From: vlad at dev.mellanox.co.il (Vladimir Sokolovsky) Date: Sun, 27 Jan 2008 14:52:32 +0200 Subject: [ofa-general] Re: [Fwd: [PATCH 2.6.25] RDMA/cxgb3: Fix the T3A workaround checks.] In-Reply-To: <479A1AF3.9020808@opengridcomputing.com> References: <479A1AF3.9020808@opengridcomputing.com> Message-ID: <479C7E90.4020702@dev.mellanox.co.il> Steve Wise wrote: > Vlad, > > Please pull this patch from > > git://www.openfabrics.org/~swise/ofed-1.3 ofed_kernel > > This has been accepted upstream and is needed for ofed-1.3 to support > new device types. > > Thanks, > > Steve. Done, Regards, Vladimir From vlad at dev.mellanox.co.il Sun Jan 27 04:59:34 2008 From: vlad at dev.mellanox.co.il (Vladimir Sokolovsky) Date: Sun, 27 Jan 2008 14:59:34 +0200 Subject: [ofa-general] Re: [GIT PULL ofed-1.2.5] - cxgb3 fixes In-Reply-To: <479A65DF.2070407@opengridcomputing.com> References: <479A65DF.2070407@opengridcomputing.com> Message-ID: <479C8036.9010308@dev.mellanox.co.il> Steve Wise wrote: > Vlad, > > Please pull these fixes for ofed-1.2.5 from: > > git://git.openfabrics.org/~swise/ofed-1.2.5 ofed_1_2_c > >> RDMA/cxgb3: Flush the receive queue when closing >> RDMA/cxgb3: Fix page shift calculation in build_phys_page_list() >> RDMA/cxgb3: Mark QP as privileged based on user capabilities >> RDMA/cxgb3: Fix the T3A workaround checks > > These are all going upstream and in ofed-1.3 and I want to keep > ofed-1.2.5 up to date as well. Can these make 1.2.5.5 by chance? > > > Thanks, > > Steve. 
Done, Regards, Vladimir From dotanb at dev.mellanox.co.il Sun Jan 27 05:06:16 2008 From: dotanb at dev.mellanox.co.il (Dotan Barak) Date: Sun, 27 Jan 2008 15:06:16 +0200 Subject: [ofa-general] Zero byte rdma read causes REM_OP_ERROR In-Reply-To: <3307cdf90801251254p5983b62x687549bb793db39d@mail.gmail.com> References: <3307cdf90801242007y3ace39ccrb72d5f35c3a937e4@mail.gmail.com> <4799D9F7.4030607@dev.mellanox.co.il> <3307cdf90801251254p5983b62x687549bb793db39d@mail.gmail.com> Message-ID: <479C81C8.6060106@dev.mellanox.co.il> Rajouri Jammu wrote: > I'm using rdma_cm and I don't set the qp_access_flags explicitly. > > I presume they are set correctly since non-zero length rdma reads > complete successfully. I have also verified the data. > > the only place I set the privileges is when registering the memory > region and I have them set at > IBV_ACCESS_LOCAL_WRITE, _REMOTE_READ and _REMOTE_WRITE Can you share with us/me the code that fails? Dotan From vlad at dev.mellanox.co.il Sun Jan 27 05:08:33 2008 From: vlad at dev.mellanox.co.il (Vladimir Sokolovsky) Date: Sun, 27 Jan 2008 15:08:33 +0200 Subject: [ofa-general] Re: [GIT PULL ofed-1.2.5 / ofed-1.3] - libcxgb3-1.1.3 release In-Reply-To: <479A6AF8.4010306@opengridcomputing.com> References: <479A6AF8.4010306@opengridcomputing.com> Message-ID: <479C8251.2050009@dev.mellanox.co.il> Steve Wise wrote: > Vlad, > > Please pull version 1.1.3 of libcxgb3 for ofed-1.2.5 and ofed-1.3. 
This > release fixes problems with running libcxgb3 on RH4U5 and other distros. > > Pull from: > > git://git.openfabrics.org/~swise/libcxgb3 ofed_1_2_5 > > and > > git://git.openfabrics.org/~swise/libcxgb3 ofed_1_3 > > > Also, the release can be downloaded from: > > http://www.openfabrics.org/downloads/cxgb3/libcxgb3-1.1.3.tar.gz > > > Thanks, > > Steve. Done, Regards, Vladimir From eli at mellanox.co.il Sun Jan 27 05:08:48 2008 From: eli at mellanox.co.il (Eli Cohen) Date: Sun, 27 Jan 2008 15:08:48 +0200 Subject: [ofa-general] CM Enable SRQ for less than 16 s/g - bug Message-ID: <1201439328.9219.15.camel@mtls03> This commit b150c30c28976f0dcf96bb28780ae62897264c54 introduces a problem in IPOIB CM. failure description: test hangs. bug was found by Mellanox regression. test info: server: ttcpv -s -r -p 19033 -l 100000 client: ttcpv -s -t 11.4.3.112 -p 19033 -l 100000 -n 8192 host info: Name =sw112 Serie =PowerEdge 2850 IP =10.4.3.112 CpuNum =4 CpuVendor =GenuineIntel CpuModel = Intel(R) Xeon(TM) CPU 3.20GHz CpuMhz =3192.303 MemSizeKb =4049452 MachType =x86_64 KernelRev =2.6.16.21-0.8-smp ChipSet =Intel Corporation E7520 Memory Controller Hub (rev 09) Os =Welcome to SUSE Linux Enterprise Server 10 (x86_64) - Kernel \r (\l). IBDevsNum =1 HCA0Name =mlx4_0 HCA0Desc =sw112 HCA-1 HCA0Type =MT25418 HCA0FWVer =2.3.914 HCA0PSID =MT_04A0110002 HCA0GUIDS =NODE:0x0000000000004024;SYS:0x0000000000004027 HCA0Ports =1:0x0000000000004025:0x8:11.4.3.112:ACTIVE;2:0x0000000000004026:0x0:12.4.3.112:DOWN IBStackVer =ofa_1_3_dev-20080122-0855 Can you take a look at this? 
From dotanb at dev.mellanox.co.il Sun Jan 27 05:08:30 2008 From: dotanb at dev.mellanox.co.il (Dotan Barak) Date: Sun, 27 Jan 2008 15:08:30 +0200 Subject: [ofa-general] [PATCH] sdpnetstat: fix compilation warnings on x86_64 machine Message-ID: <200801271508.31178.dotanb@dev.mellanox.co.il> Fix the following compilation warnings on an x86_64 machine: interface.c:431: warning: cast from pointer to integer of different size interface.c:438: warning: cast from pointer to integer of different size ifconfig.c: In function `main': ifconfig.c:463: warning: cast to pointer from integer of different size ifconfig.c:477: warning: cast to pointer from integer of different size hostname.c: In function `sethname': hostname.c:104: warning: implicit declaration of function `exit' Signed-off-by: Dotan Barak --- Index: ofa_1_3_dev_user/src/userspace/sdpnetstat/hostname.c =================================================================== --- ofa_1_3_dev_user.orig/src/userspace/sdpnetstat/hostname.c 2008-01-27 11:20:40.000000000 +0200 +++ ofa_1_3_dev_user/src/userspace/sdpnetstat/hostname.c 2008-01-27 14:59:34.000000000 +0200 @@ -31,6 +31,7 @@ * your option) any later version.
*/ #include +#include #include #include #include Index: ofa_1_3_dev_user/src/userspace/sdpnetstat/ifconfig.c =================================================================== --- ofa_1_3_dev_user.orig/src/userspace/sdpnetstat/ifconfig.c 2008-01-27 11:20:40.000000000 +0200 +++ ofa_1_3_dev_user/src/userspace/sdpnetstat/ifconfig.c 2008-01-27 14:59:35.000000000 +0200 @@ -460,7 +460,7 @@ int main(int argc, char **argv) if (!strcmp(*spp, "keepalive")) { if (*++spp == NULL) usage(); - ifr.ifr_data = (caddr_t) atoi(*spp); + ifr.ifr_data = (caddr_t)(unsigned long) atoi(*spp); if (ioctl(skfd, SIOCSKEEPALIVE, &ifr) < 0) { fprintf(stderr, "SIOCSKEEPALIVE: %s\n", strerror(errno)); goterr = 1; @@ -474,7 +474,7 @@ int main(int argc, char **argv) if (!strcmp(*spp, "outfill")) { if (*++spp == NULL) usage(); - ifr.ifr_data = (caddr_t) atoi(*spp); + ifr.ifr_data = (caddr_t)(unsigned long) atoi(*spp); if (ioctl(skfd, SIOCSOUTFILL, &ifr) < 0) { fprintf(stderr, "SIOCSOUTFILL: %s\n", strerror(errno)); goterr = 1; Index: ofa_1_3_dev_user/src/userspace/sdpnetstat/lib/interface.c =================================================================== --- ofa_1_3_dev_user.orig/src/userspace/sdpnetstat/lib/interface.c 2008-01-27 11:20:40.000000000 +0200 +++ ofa_1_3_dev_user/src/userspace/sdpnetstat/lib/interface.c 2008-01-27 14:59:45.000000000 +0200 @@ -428,14 +428,14 @@ int if_fetch(struct interface *ife) if (ioctl(skfd, SIOCGOUTFILL, &ifr) < 0) ife->outfill = 0; else - ife->outfill = (unsigned int) ifr.ifr_data; + ife->outfill = (unsigned int)(unsigned long) ifr.ifr_data; #endif #ifdef SIOCGKEEPALIVE strcpy(ifr.ifr_name, ifname); if (ioctl(skfd, SIOCGKEEPALIVE, &ifr) < 0) ife->keepalive = 0; else - ife->keepalive = (unsigned int) ifr.ifr_data; + ife->keepalive = (unsigned int)(unsigned long) ifr.ifr_data; #endif } #endif From kliteyn at dev.mellanox.co.il Sun Jan 27 07:27:05 2008 From: kliteyn at dev.mellanox.co.il (Yevgeny Kliteynik) Date: Sun, 27 Jan 2008 17:27:05 +0200 Subject: 
[ofa-general] [PATCH] opensm/osm_ucast_ftree.c: ignore port 0 and loopbacks on switches Message-ID: <479CA2C9.6090402@dev.mellanox.co.il> Hi Sasha, Fat-tree routing should ignore port 0 and loopback connections on switches when populating its db. Please apply to ofed_1_3 and master. Signed-off-by: Yevgeny Kliteynik --- opensm/opensm/osm_ucast_ftree.c | 15 ++++++++++++++- 1 files changed, 14 insertions(+), 1 deletions(-) diff --git a/opensm/opensm/osm_ucast_ftree.c b/opensm/opensm/osm_ucast_ftree.c index dcbdc44..904a9c5 100644 --- a/opensm/opensm/osm_ucast_ftree.c +++ b/opensm/opensm/osm_ucast_ftree.c @@ -3113,7 +3113,7 @@ static int __osm_ftree_fabric_construct_sw_ports(IN ftree_fabric_t * p_ftree, CL_ASSERT(osm_node_get_type(p_node) == IB_NODE_TYPE_SWITCH); - for (i = 0; i < osm_node_get_num_physp(p_node); i++) { + for (i = 1; i < osm_node_get_num_physp(p_node); i++) { osm_physp_t *p_osm_port = osm_node_get_physp_ptr(p_node, i); if (!osm_physp_is_valid(p_osm_port)) @@ -3158,6 +3158,19 @@ static int __osm_ftree_fabric_construct_sw_ports(IN ftree_fabric_t * p_ftree, __osm_ftree_fabric_get_sw_by_guid(p_ftree, remote_node_guid); CL_ASSERT(p_remote_sw); + + /* ignore any loopback connection on switch */ + if (p_sw == p_remote_sw) { + osm_log(&p_ftree->p_osm->log, OSM_LOG_DEBUG, + "__osm_ftree_fabric_construct_sw_ports: " + "Ignoring loopback on switch 0x%016" PRIx64 + ", LID 0x%04x, rank %u\n", + __osm_ftree_sw_get_guid_ho(p_sw), + cl_ntoh16(p_sw->base_lid), + p_sw->rank); + continue; + } + p_remote_hca_or_sw = (void *)p_remote_sw; if (abs(p_sw->rank - p_remote_sw->rank) != 1) { -- 1.5.1.4 From sashak at voltaire.com Sun Jan 27 07:37:50 2008 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Sun, 27 Jan 2008 15:37:50 +0000 Subject: [ofa-general] [PATCH] opensm: simplify osm_port_t setup procedure Message-ID: <20080127153750.GG24344@sashak.voltaire.com> This simplifies osm_port_t setup procedure - it will always have valid p_physp pointer (for switches it will be
initialized at a node creation time), we will not need to run over node's physp list anymore. Signed-off-by: Sasha Khapyorsky --- opensm/include/opensm/osm_port.h | 29 ----------------------- opensm/opensm/osm_node.c | 19 +++++++++++++++ opensm/opensm/osm_port.c | 45 +++++------------------------------- opensm/opensm/osm_port_info_rcv.c | 2 - 4 files changed, 26 insertions(+), 69 deletions(-) diff --git a/opensm/include/opensm/osm_port.h b/opensm/include/opensm/osm_port.h index bba4e44..1bf737c 100644 --- a/opensm/include/opensm/osm_port.h +++ b/opensm/include/opensm/osm_port.h @@ -1429,35 +1429,6 @@ osm_get_port_by_base_lid(IN const osm_subn_t * const p_subn, * Port *********/ -/****f* OpenSM: Port/osm_port_add_new_physp -* NAME -* osm_port_add_new_physp -* -* DESCRIPTION -* Adds a new physical port to the logical collection owned by the Port. -* Physical Ports added here must share the same GUID as the Port. -* -* SYNOPSIS -*/ -void -osm_port_add_new_physp(IN osm_port_t * const p_port, IN const uint8_t port_num); -/* -* PARAMETERS -* p_port -* [in] Pointer to a Port object. -* -* port_num -* [in] Port number to add. -* -* RETURN VALUE -* None. 
-* -* NOTES -* -* SEE ALSO -* Port -*********/ - /****f* OpenSM: Port/osm_port_add_mgrp * NAME * osm_port_add_mgrp diff --git a/opensm/opensm/osm_node.c b/opensm/opensm/osm_node.c index 39f4181..176f916 100644 --- a/opensm/opensm/osm_node.c +++ b/opensm/opensm/osm_node.c @@ -86,6 +86,23 @@ osm_node_init_physp(IN osm_node_t * const p_node, /********************************************************************** **********************************************************************/ +static void node_init_physp0(IN osm_node_t * const p_node, + IN const osm_madw_t * const p_madw) +{ + ib_smp_t *p_smp; + ib_node_info_t *p_ni; + + p_smp = osm_madw_get_smp_ptr(p_madw); + p_ni = (ib_node_info_t *) ib_smp_get_payload_ptr(p_smp); + + osm_physp_init(&p_node->physp_table[0], + p_ni->port_guid, 0, p_node, + osm_madw_get_bind_handle(p_madw), + p_smp->hop_count, p_smp->initial_path); +} + +/********************************************************************** + **********************************************************************/ osm_node_t *osm_node_new(IN const osm_madw_t * const p_madw) { osm_node_t *p_node; @@ -132,6 +149,8 @@ osm_node_t *osm_node_new(IN const osm_madw_t * const p_madw) osm_physp_construct(&p_node->physp_table[i]); osm_node_init_physp(p_node, p_madw); + if (p_ni->node_type == IB_NODE_TYPE_SWITCH) + node_init_physp0(p_node, p_madw); p_node->print_desc = strdup(""); return (p_node); diff --git a/opensm/opensm/osm_port.c b/opensm/opensm/osm_port.c index 79f42ac..ffc4fb0 100644 --- a/opensm/opensm/osm_port.c +++ b/opensm/opensm/osm_port.c @@ -153,9 +153,9 @@ osm_port_init(IN osm_port_t * const p_port, IN const ib_node_info_t * p_ni, IN const osm_node_t * const p_parent_node) { - uint32_t port_index; ib_net64_t port_guid; osm_physp_t *p_physp; + uint8_t port_num; CL_ASSERT(p_port); CL_ASSERT(p_ni); @@ -166,27 +166,18 @@ osm_port_init(IN osm_port_t * const p_port, p_port->p_node = (struct _osm_node *)p_parent_node; port_guid = p_ni->port_guid; p_port->guid = 
port_guid; + port_num = p_ni->node_type == IB_NODE_TYPE_SWITCH ? + 0 : ib_node_info_get_local_port_num(p_ni); /* Get the pointers to the physical node objects "owned" by this logical port GUID. - For switches, all the ports are owned; for HCA's and routers, + For switches, port '0' is owned; for HCA's and routers, only the singular part that has this GUID is owned. */ - for (port_index = 0; port_index < p_parent_node->physp_tbl_size; - port_index++) { - p_physp = osm_node_get_physp_ptr(p_parent_node, port_index); - /* - Because much of the PortInfo data is only valid - for port 0 on switches, try to keep the lowest - possible value of default_port_num. - */ - if (osm_physp_is_valid(p_physp) && - port_guid == osm_physp_get_port_guid(p_physp)) { - p_port->p_physp = p_physp; - break; - } - } + p_physp = osm_node_get_physp_ptr(p_parent_node, port_num); + CL_ASSERT(port_guid == osm_physp_get_port_guid(p_physp)); + p_port->p_physp = p_physp; } /********************************************************************** @@ -258,28 +249,6 @@ osm_get_port_by_base_lid(IN const osm_subn_t * const p_subn, /********************************************************************** **********************************************************************/ -void -osm_port_add_new_physp(IN osm_port_t * const p_port, IN const uint8_t port_num) -{ - osm_physp_t *p_physp; - - p_physp = osm_node_get_physp_ptr(p_port->p_node, port_num); - CL_ASSERT(osm_physp_is_valid(p_physp)); - CL_ASSERT(osm_physp_get_port_guid(p_physp) == p_port->guid); - - /* - For switches, we generally want to use Port 0, which is - the management port as the default Physical Port. - The LID value in the PortInfo for example, is only valid - for port 0 on switches. 
- */ - if (!osm_physp_is_valid(p_port->p_physp) || - port_num < p_port->p_physp->port_num) - p_port->p_physp = p_physp; -} - -/********************************************************************** - **********************************************************************/ ib_api_status_t osm_port_add_mgrp(IN osm_port_t * const p_port, IN const ib_net16_t mlid) { diff --git a/opensm/opensm/osm_port_info_rcv.c b/opensm/opensm/osm_port_info_rcv.c index 8cc33c5..b91d1a5 100644 --- a/opensm/opensm/osm_port_info_rcv.c +++ b/opensm/opensm/osm_port_info_rcv.c @@ -652,8 +652,6 @@ void osm_pi_rcv_process(IN void *context, IN void *data) p_node, osm_madw_get_bind_handle(p_madw), p_smp->hop_count, p_smp->initial_path); - - osm_port_add_new_physp(p_port, port_num); } else { /* Update the directed route path to this port -- 1.5.4.rc5 From sashak at voltaire.com Sun Jan 27 07:38:45 2008 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Sun, 27 Jan 2008 15:38:45 +0000 Subject: [ofa-general] [PATCH] opensm: remove some unneeded assertions In-Reply-To: <20080127153750.GG24344@sashak.voltaire.com> References: <20080127153750.GG24344@sashak.voltaire.com> Message-ID: <20080127153845.GH24344@sashak.voltaire.com> Remove some duplicated CL_ASSERT()s and invalid run-time checks. 
Signed-off-by: Sasha Khapyorsky --- opensm/opensm/osm_node.c | 10 ---------- opensm/opensm/osm_node_info_rcv.c | 27 --------------------------- 2 files changed, 0 insertions(+), 37 deletions(-) diff --git a/opensm/opensm/osm_node.c b/opensm/opensm/osm_node.c index 176f916..4586ff5 100644 --- a/opensm/opensm/osm_node.c +++ b/opensm/opensm/osm_node.c @@ -65,13 +65,8 @@ osm_node_init_physp(IN osm_node_t * const p_node, ib_node_info_t *p_ni; uint8_t port_num; - CL_ASSERT(p_node); - CL_ASSERT(p_madw); - p_smp = osm_madw_get_smp_ptr(p_madw); - CL_ASSERT(p_smp->attr_id == IB_MAD_ATTR_NODE_INFO); - p_ni = (ib_node_info_t *) ib_smp_get_payload_ptr(p_smp); port_guid = p_ni->port_guid; port_num = ib_node_info_get_local_port_num(p_ni); @@ -111,12 +106,7 @@ osm_node_t *osm_node_new(IN const osm_madw_t * const p_madw) uint8_t i; uint32_t size; - CL_ASSERT(p_madw); - p_smp = osm_madw_get_smp_ptr(p_madw); - - CL_ASSERT(p_smp->attr_id == IB_MAD_ATTR_NODE_INFO); - p_ni = (ib_node_info_t *) ib_smp_get_payload_ptr(p_smp); /* diff --git a/opensm/opensm/osm_node_info_rcv.c b/opensm/opensm/osm_node_info_rcv.c index 50287dc..cfce437 100644 --- a/opensm/opensm/osm_node_info_rcv.c +++ b/opensm/opensm/osm_node_info_rcv.c @@ -281,9 +281,6 @@ __osm_ni_rcv_process_new_node(IN osm_sm_t * sm, OSM_LOG_ENTER(sm->p_log, __osm_ni_rcv_process_new_node); - CL_ASSERT(p_node); - CL_ASSERT(p_madw); - p_smp = osm_madw_get_smp_ptr(p_madw); p_ni = (ib_node_info_t *) ib_smp_get_payload_ptr(p_smp); port_num = ib_node_info_get_local_port_num(p_ni); @@ -298,11 +295,6 @@ __osm_ni_rcv_process_new_node(IN osm_sm_t * sm, */ p_physp = osm_node_get_physp_ptr(p_node, port_num); - CL_ASSERT(osm_physp_is_valid(p_physp)); - CL_ASSERT(osm_madw_get_bind_handle(p_madw) == - osm_dr_path_get_bind_handle(osm_physp_get_dr_path_ptr - (p_physp))); - context.pi_context.node_guid = p_ni->node_guid; context.pi_context.port_guid = p_ni->port_guid; context.pi_context.set_method = FALSE; @@ -339,9 +331,6 @@ __osm_ni_rcv_get_node_desc(IN 
osm_sm_t * sm, OSM_LOG_ENTER(sm->p_log, __osm_ni_rcv_get_node_desc); - CL_ASSERT(p_node); - CL_ASSERT(p_madw); - p_smp = osm_madw_get_smp_ptr(p_madw); p_ni = (ib_node_info_t *) ib_smp_get_payload_ptr(p_smp); port_num = ib_node_info_get_local_port_num(p_ni); @@ -356,11 +345,6 @@ __osm_ni_rcv_get_node_desc(IN osm_sm_t * sm, */ p_physp = osm_node_get_physp_ptr(p_node, port_num); - CL_ASSERT(osm_physp_is_valid(p_physp)); - CL_ASSERT(osm_madw_get_bind_handle(p_madw) == - osm_dr_path_get_bind_handle(osm_physp_get_dr_path_ptr - (p_physp))); - context.nd_context.node_guid = osm_node_get_node_guid(p_node); status = osm_req_get(sm, osm_physp_get_dr_path_ptr(p_physp), @@ -480,14 +464,6 @@ __osm_ni_rcv_process_existing_ca_or_router(IN osm_sm_t * sm, p_physp = osm_node_get_physp_ptr(p_node, port_num); } else { p_physp = osm_node_get_physp_ptr(p_node, port_num); - - if (!osm_physp_is_valid(p_physp)) { - osm_log(sm->p_log, OSM_LOG_ERROR, - "__osm_ni_rcv_process_existing_ca_or_router: ERR 0D19: " - "Invalid physical port. Aborting discovery\n"); - goto Exit; - } - /* Update the DR Path to the port, in case the old one is no longer available. 
@@ -532,9 +508,6 @@ __osm_ni_rcv_process_switch(IN osm_sm_t * sm, OSM_LOG_ENTER(sm->p_log, __osm_ni_rcv_process_switch); - CL_ASSERT(p_node); - CL_ASSERT(p_madw); - p_smp = osm_madw_get_smp_ptr(p_madw); osm_dr_path_init(&dr_path, -- 1.5.4.rc5 From vlad at dev.mellanox.co.il Sun Jan 27 07:30:46 2008 From: vlad at dev.mellanox.co.il (Vladimir Sokolovsky) Date: Sun, 27 Jan 2008 17:30:46 +0200 Subject: [ofa-general] Re: [GIT PULL ofed-1.3] neteffect updates In-Reply-To: <5E701717F2B2ED4EA60F87C8AA57B7CC0794FED4@venom2> References: <5E701717F2B2ED4EA60F87C8AA57B7CC0794FED4@venom2> Message-ID: <479CA3A6.7020008@dev.mellanox.co.il> Glenn Streiff wrote: > Vlad, > > Please pull from updated neteffect repository for latest ofed 1.3 > release candidate: > > git://git.openfabrics.org/~glenn/linux-2.6.git ofed_kernel > > This reflects content accepted into the upstream by Roland, plus: > > * updated MAINTAINERS file > * kernel.h backport (which you reviewed last week) > > Let me know if you want this posted as a patch to > the community. > > * iw_nes_[1-3]00_*.patch backports commit > > Noticed these were listed as untracked in the previous > maintainer's working repository (Glenn Grundstrom). > The check build was failing without the commit. > > Passed build_ofa_kernel.sh. > > Thanks, > Done, Regards, Vladimir From sashak at voltaire.com Sun Jan 27 07:40:44 2008 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Sun, 27 Jan 2008 15:40:44 +0000 Subject: [ofa-general] [PATCH] opensm: cleanup osm_physp_is_valid() use In-Reply-To: <20080127153845.GH24344@sashak.voltaire.com> References: <20080127153750.GG24344@sashak.voltaire.com> <20080127153845.GH24344@sashak.voltaire.com> Message-ID: <20080127154044.GI24344@sashak.voltaire.com> osm_node_get_physp_ptr() now returns a pointer only to an initialized osm_physp_t, and NULL otherwise. This simplifies many flows in OpenSM.
Signed-off-by: Sasha Khapyorsky --- opensm/include/opensm/osm_node.h | 10 +++--- opensm/include/opensm/osm_port.h | 2 +- opensm/opensm/osm_console.c | 2 +- opensm/opensm/osm_drop_mgr.c | 6 ++-- opensm/opensm/osm_dump.c | 9 +++--- opensm/opensm/osm_lid_mgr.c | 8 ++--- opensm/opensm/osm_link_mgr.c | 8 +---- opensm/opensm/osm_mcast_mgr.c | 23 +-------------- opensm/opensm/osm_node.c | 9 +++--- opensm/opensm/osm_perfmgr.c | 4 +- opensm/opensm/osm_pkey_mgr.c | 8 +++--- opensm/opensm/osm_pkey_rcv.c | 2 +- opensm/opensm/osm_port.c | 23 +++++++--------- opensm/opensm/osm_port_info_rcv.c | 9 +++--- opensm/opensm/osm_qos.c | 4 +- opensm/opensm/osm_qos_parser.y | 5 +-- opensm/opensm/osm_sa_guidinfo_record.c | 7 ++--- opensm/opensm/osm_sa_link_record.c | 44 +++++++----------------------- opensm/opensm/osm_sa_multipath_record.c | 3 -- opensm/opensm/osm_sa_node_record.c | 7 ++--- opensm/opensm/osm_sa_path_record.c | 3 -- opensm/opensm/osm_sa_pkey_record.c | 4 +- opensm/opensm/osm_sa_portinfo_record.c | 8 +++--- opensm/opensm/osm_sa_slvl_record.c | 4 +- opensm/opensm/osm_sa_vlarb_record.c | 4 +- opensm/opensm/osm_slvl_map_rcv.c | 4 +-- opensm/opensm/osm_state_mgr.c | 5 +-- opensm/opensm/osm_sw_info_rcv.c | 6 ---- opensm/opensm/osm_switch.c | 6 +--- opensm/opensm/osm_trap_rcv.c | 27 ++++++++---------- opensm/opensm/osm_ucast_ftree.c | 22 +++------------ opensm/opensm/osm_ucast_lash.c | 14 +++------ opensm/opensm/osm_ucast_updn.c | 7 +--- opensm/opensm/osm_vl_arb_rcv.c | 4 +-- 34 files changed, 107 insertions(+), 204 deletions(-) diff --git a/opensm/include/opensm/osm_node.h b/opensm/include/opensm/osm_node.h index a900e03..56e4dbb 100644 --- a/opensm/include/opensm/osm_node.h +++ b/opensm/include/opensm/osm_node.h @@ -213,13 +213,13 @@ osm_node_t *osm_node_new(IN const osm_madw_t * const p_madw); * * SYNOPSIS */ -static inline osm_physp_t *osm_node_get_physp_ptr(IN const osm_node_t * - const p_node, +static inline osm_physp_t *osm_node_get_physp_ptr(IN osm_node_t * const p_node, 
IN const uint32_t port_num) { CL_ASSERT(port_num < p_node->physp_tbl_size); - return ((osm_physp_t *) & p_node->physp_table[port_num]); + return osm_physp_is_valid(&p_node->physp_table[port_num]) ? + &p_node->physp_table[port_num] : NULL; } /* @@ -383,7 +383,7 @@ static inline uint8_t osm_node_get_num_physp(IN const osm_node_t * const p_node) * * SYNOPSIS */ -osm_node_t *osm_node_get_remote_node(IN const osm_node_t * const p_node, +osm_node_t *osm_node_get_remote_node(IN osm_node_t * const p_node, IN const uint8_t port_num, OUT uint8_t * p_remote_port_num); /* @@ -457,7 +457,7 @@ osm_node_get_base_lid(IN const osm_node_t * const p_node, * SYNOPSIS */ ib_net16_t -osm_node_get_remote_base_lid(IN const osm_node_t * const p_node, +osm_node_get_remote_base_lid(IN osm_node_t * const p_node, IN const uint32_t port_num); /* * PARAMETERS diff --git a/opensm/include/opensm/osm_port.h b/opensm/include/opensm/osm_port.h index 1bf737c..963e13b 100644 --- a/opensm/include/opensm/osm_port.h +++ b/opensm/include/opensm/osm_port.h @@ -1252,7 +1252,7 @@ void osm_port_delete(IN OUT osm_port_t ** const pp_port); * SYNOPSIS */ osm_port_t *osm_port_new(IN const ib_node_info_t * p_ni, - IN const struct _osm_node *const p_parent_node); + IN struct _osm_node *const p_parent_node); /* * PARAMETERS * p_ni diff --git a/opensm/opensm/osm_console.c b/opensm/opensm/osm_console.c index ced02e3..c0e7886 100644 --- a/opensm/opensm/osm_console.c +++ b/opensm/opensm/osm_console.c @@ -630,7 +630,7 @@ static void __get_stats(cl_map_item_t * const p_map_item, void *context) uint8_t port_state = ib_port_info_get_port_state(pi); uint8_t port_phys_state = ib_port_info_get_port_phys_state(pi); - if (!osm_physp_is_valid(phys)) + if (!phys) continue; if ((enabled_width ^ active_width) > active_width) { diff --git a/opensm/opensm/osm_drop_mgr.c b/opensm/opensm/osm_drop_mgr.c index 39ceaa1..2b8966c 100644 --- a/opensm/opensm/osm_drop_mgr.c +++ b/opensm/opensm/osm_drop_mgr.c @@ -136,7 +136,7 @@ 
drop_mgr_clean_physp(IN const osm_drop_mgr_t * const p_mgr, osm_port_t *p_remote_port; p_remote_physp = osm_physp_get_remote(p_physp); - if (p_remote_physp && osm_physp_is_valid(p_remote_physp)) { + if (p_remote_physp) { p_remote_port = osm_get_port_by_guid(p_mgr->p_subn, p_remote_physp->port_guid); @@ -383,7 +383,7 @@ __osm_drop_mgr_process_node(IN const osm_drop_mgr_t * const p_mgr, max_ports = osm_node_get_num_physp(p_node); for (port_num = 0; port_num < max_ports; port_num++) { p_physp = osm_node_get_physp_ptr(p_node, port_num); - if (osm_physp_is_valid(p_physp)) { + if (p_physp) { port_guid = osm_physp_get_port_guid(p_physp); p_port = osm_get_port_by_guid(p_mgr->p_subn, port_guid); @@ -454,7 +454,7 @@ __osm_drop_mgr_check_node(IN const osm_drop_mgr_t * const p_mgr, /* Make sure we have a port object for port zero */ p_physp = osm_node_get_physp_ptr(p_node, 0); - if (!osm_physp_is_valid(p_physp)) { + if (!p_physp) { osm_log(p_mgr->p_log, OSM_LOG_VERBOSE, "__osm_drop_mgr_check_node: " "Node 0x%016" PRIx64 " no valid physical port 0\n", diff --git a/opensm/opensm/osm_dump.c b/opensm/opensm/osm_dump.c index 43ae05e..f47c992 100644 --- a/opensm/opensm/osm_dump.c +++ b/opensm/opensm/osm_dump.c @@ -394,11 +394,11 @@ static void dump_topology_node(cl_map_item_t * p_map_item, void *cxt) uint8_t port_state; p_physp = osm_node_get_physp_ptr(p_node, cPort); - if (!osm_physp_is_valid(p_physp)) + if (!p_physp) continue; p_rphysp = p_physp->p_remote_physp; - if (!p_rphysp || !osm_physp_is_valid(p_rphysp)) + if (!p_rphysp) continue; CL_ASSERT(cPort == p_physp->port_num); @@ -503,7 +503,7 @@ static void print_node_report(cl_map_item_t * p_map_item, void *cxt) port_num = node_type == IB_NODE_TYPE_SWITCH ? 
0 : 1; for (; port_num < num_ports; port_num++) { p_physp = osm_node_get_physp_ptr(p_node, port_num); - if (!osm_physp_is_valid(p_physp)) + if (!p_physp) continue; osm_log_printf(log, OSM_LOG_VERBOSE, "%-11s : %s : %02X :", @@ -563,8 +563,7 @@ static void print_node_report(cl_map_item_t * p_map_item, void *cxt) if (port_num && (ib_port_info_get_port_state(p_pi) != IB_LINK_DOWN)) { p_remote_physp = osm_physp_get_remote(p_physp); - if (p_remote_physp - && osm_physp_is_valid(p_remote_physp)) + if (p_remote_physp) osm_log_printf(log, OSM_LOG_VERBOSE, " %016" PRIx64 " (%02X)", cl_ntoh64 diff --git a/opensm/opensm/osm_lid_mgr.c b/opensm/opensm/osm_lid_mgr.c index f248676..3ceb145 100644 --- a/opensm/opensm/osm_lid_mgr.c +++ b/opensm/opensm/osm_lid_mgr.c @@ -883,10 +883,8 @@ __osm_lid_mgr_set_remote_pi_state_to_init(IN osm_lid_mgr_t * const p_mgr, if (p_rem_physp == NULL) return; - if (osm_physp_is_valid(p_rem_physp)) - /* but in some rare cases the remote side might be irresponsive */ - ib_port_info_set_port_state(&p_rem_physp->port_info, - IB_LINK_INIT); + /* but in some rare cases the remote side might be irresponsive */ + ib_port_info_set_port_state(&p_rem_physp->port_info, IB_LINK_INIT); } /********************************************************************** @@ -914,7 +912,7 @@ __osm_lid_mgr_set_physp_pi(IN osm_lid_mgr_t * const p_mgr, Don't bother doing anything if this Physical Port is not valid. This allows simplified code in the caller. 
*/ - if (p_physp == NULL || !osm_physp_is_valid(p_physp)) + if (!p_physp) goto Exit; port_num = osm_physp_get_port_num(p_physp); diff --git a/opensm/opensm/osm_link_mgr.c b/opensm/opensm/osm_link_mgr.c index 3d38362..19cb27d 100644 --- a/opensm/opensm/osm_link_mgr.c +++ b/opensm/opensm/osm_link_mgr.c @@ -116,9 +116,6 @@ __osm_link_mgr_set_physp_pi(IN osm_link_mgr_t * const p_mgr, OSM_LOG_ENTER(p_mgr->p_log, __osm_link_mgr_set_physp_pi); - CL_ASSERT(p_physp); - CL_ASSERT(osm_physp_is_valid(p_physp)); - p_node = osm_physp_get_node_ptr(p_physp); port_num = osm_physp_get_port_num(p_physp); @@ -241,8 +238,7 @@ __osm_link_mgr_set_physp_pi(IN osm_link_mgr_t * const p_mgr, Several timeout mechanisms: */ p_remote_physp = osm_physp_get_remote(p_physp); - if (port_num != 0 && p_remote_physp && - osm_physp_is_valid(p_remote_physp)) { + if (port_num != 0 && p_remote_physp) { if (osm_node_get_type(osm_physp_get_node_ptr(p_physp)) == IB_NODE_TYPE_ROUTER) { ib_port_info_set_hoq_lifetime(p_pi, @@ -418,7 +414,7 @@ __osm_link_mgr_process_node(IN osm_link_mgr_t * const p_mgr, specified state. 
*/ p_physp = osm_node_get_physp_ptr(p_node, (uint8_t) i); - if (!osm_physp_is_valid(p_physp)) + if (!p_physp) continue; current_state = osm_physp_get_port_state(p_physp); diff --git a/opensm/opensm/osm_mcast_mgr.c b/opensm/opensm/osm_mcast_mgr.c index 0c0ab25..1178522 100644 --- a/opensm/opensm/osm_mcast_mgr.c +++ b/opensm/opensm/osm_mcast_mgr.c @@ -720,7 +720,7 @@ static osm_mtree_node_t *__osm_mcast_mgr_branch(osm_mcast_mgr_t * const p_mgr, for (i = 0; i < max_children; i++) { const osm_physp_t *p_physp; const osm_physp_t *p_remote_physp; - const osm_node_t *p_node; + osm_node_t *p_node; const osm_node_t *p_remote_node; p_port_list = &list_array[i]; @@ -767,11 +767,10 @@ static osm_mtree_node_t *__osm_mcast_mgr_branch(osm_mcast_mgr_t * const p_mgr, CL_ASSERT(p_remote_node->sw); p_physp = osm_node_get_physp_ptr(p_node, i); - CL_ASSERT(osm_physp_is_valid(p_physp)); + CL_ASSERT(p_physp); p_remote_physp = osm_physp_get_remote(p_physp); CL_ASSERT(p_remote_physp); - CL_ASSERT(osm_physp_is_valid(p_remote_physp)); p_mtn->child_array[i] = __osm_mcast_mgr_branch(p_mgr, p_mgrp, @@ -1068,15 +1067,6 @@ osm_mcast_mgr_process_single(IN osm_mcast_mgr_t * const p_mgr, goto Exit; } - if (!osm_physp_is_valid(p_physp)) { - osm_log(p_mgr->p_log, OSM_LOG_ERROR, - "osm_mcast_mgr_process_single: ERR 0A07: " - "Unable to acquire valid physical port object " - "for 0x%" PRIx64 "\n", cl_ntoh64(port_guid)); - status = IB_ERROR; - goto Exit; - } - p_remote_physp = osm_physp_get_remote(p_physp); if (p_remote_physp == NULL) { osm_log(p_mgr->p_log, OSM_LOG_ERROR, @@ -1087,15 +1077,6 @@ osm_mcast_mgr_process_single(IN osm_mcast_mgr_t * const p_mgr, goto Exit; } - if (!osm_physp_is_valid(p_remote_physp)) { - osm_log(p_mgr->p_log, OSM_LOG_ERROR, - "osm_mcast_mgr_process_single: ERR 0A21: " - "Unable to acquire valid remote physical port object " - "for 0x%" PRIx64 "\n", cl_ntoh64(port_guid)); - status = IB_ERROR; - goto Exit; - } - p_remote_node = osm_physp_get_node_ptr(p_remote_physp); 
CL_ASSERT(p_remote_node); diff --git a/opensm/opensm/osm_node.c b/opensm/opensm/osm_node.c index 4586ff5..85ea3c9 100644 --- a/opensm/opensm/osm_node.c +++ b/opensm/opensm/osm_node.c @@ -261,8 +261,7 @@ osm_node_link_has_valid_ports(IN osm_node_t * const p_node, p_physp = osm_node_get_physp_ptr(p_node, port_num); p_remote_physp = osm_node_get_physp_ptr(p_remote_node, remote_port_num); - return (osm_physp_is_valid(p_physp) && - osm_physp_is_valid(p_remote_physp)); + return (p_physp && p_remote_physp); } /********************************************************************** @@ -278,7 +277,7 @@ osm_node_has_any_link(IN osm_node_t * const p_node, IN const uint8_t port_num) /********************************************************************** **********************************************************************/ -osm_node_t *osm_node_get_remote_node(IN const osm_node_t * const p_node, +osm_node_t *osm_node_get_remote_node(IN osm_node_t * const p_node, IN const uint8_t port_num, OUT uint8_t * p_remote_port_num) { @@ -301,7 +300,7 @@ osm_node_t *osm_node_get_remote_node(IN const osm_node_t * const p_node, The lock must be held before calling this function. 
**********************************************************************/ ib_net16_t -osm_node_get_remote_base_lid(IN const osm_node_t * const p_node, +osm_node_get_remote_base_lid(IN osm_node_t * const p_node, IN const uint32_t port_num) { osm_physp_t *p_physp; @@ -309,7 +308,7 @@ osm_node_get_remote_base_lid(IN const osm_node_t * const p_node, CL_ASSERT(port_num < p_node->physp_tbl_size); p_physp = osm_node_get_physp_ptr(p_node, port_num); - if (osm_physp_is_valid(p_physp)) { + if (p_physp) { p_remote_physp = osm_physp_get_remote(p_physp); return (osm_physp_get_base_lid(p_remote_physp)); } diff --git a/opensm/opensm/osm_perfmgr.c b/opensm/opensm/osm_perfmgr.c index 860a20d..091b46a 100644 --- a/opensm/opensm/osm_perfmgr.c +++ b/opensm/opensm/osm_perfmgr.c @@ -515,7 +515,7 @@ __osm_perfmgr_query_counters(cl_map_item_t * const p_map_item, void *context) for (port = startport; port < num_ports; port++) { ib_net16_t lid; - if (!osm_physp_is_valid(osm_node_get_physp_ptr(node, port))) + if (!osm_node_get_physp_ptr(node, port)) continue; lid = get_lid(node, port, mon_node); @@ -592,7 +592,7 @@ static int sweep_hop_1(osm_sm_t * sm) p_physp = osm_node_get_physp_ptr(p_node, port_num); - CL_ASSERT(osm_physp_is_valid(p_physp)); + CL_ASSERT(p_physp); p_dr_path = osm_physp_get_dr_path_ptr(p_physp); h_bind = osm_dr_path_get_bind_handle(p_dr_path); diff --git a/opensm/opensm/osm_pkey_mgr.c b/opensm/opensm/osm_pkey_mgr.c index e098d9b..df17549 100644 --- a/opensm/opensm/osm_pkey_mgr.c +++ b/opensm/opensm/osm_pkey_mgr.c @@ -167,7 +167,7 @@ pkey_mgr_process_partition_table(osm_log_t * p_log, osm_sm_t * sm, i = i_next; i_next = cl_map_next(i); p_physp = cl_map_obj(i); - if (p_physp && osm_physp_is_valid(p_physp)) + if (p_physp) pkey_mgr_process_physical_port(p_log, sm, pkey, p_physp); } @@ -290,7 +290,7 @@ static boolean_t pkey_mgr_update_port(osm_log_t * p_log, osm_sm_t * sm, memset(&empty_block, 0, sizeof(ib_pkey_table_t)); p_physp = p_port->p_physp; - if 
(!osm_physp_is_valid(p_physp)) + if (!p_physp) return FALSE; p_node = osm_physp_get_node_ptr(p_physp); @@ -424,10 +424,10 @@ pkey_mgr_update_peer_port(osm_log_t * p_log, osm_sm_t * sm, memset(&empty_block, 0, sizeof(ib_pkey_table_t)); p_physp = p_port->p_physp; - if (!osm_physp_is_valid(p_physp)) + if (!p_physp) return FALSE; peer = osm_physp_get_remote(p_physp); - if (!peer || !osm_physp_is_valid(peer)) + if (!peer) return FALSE; p_node = osm_physp_get_node_ptr(peer); if (!p_node->sw || !p_node->sw->switch_info.enforce_cap) diff --git a/opensm/opensm/osm_pkey_rcv.c b/opensm/opensm/osm_pkey_rcv.c index c510ab5..a827e28 100644 --- a/opensm/opensm/osm_pkey_rcv.c +++ b/opensm/opensm/osm_pkey_rcv.c @@ -129,7 +129,7 @@ void osm_pkey_rcv_process(IN void *context, IN void *data) Determine if we encountered a new Physical Port. If so, ignore it. */ - if (!osm_physp_is_valid(p_physp)) { + if (!p_physp) { osm_log(sm->p_log, OSM_LOG_ERROR, "osm_pkey_rcv_process: ERR 4807: " "Got invalid port number 0x%X\n", port_num); diff --git a/opensm/opensm/osm_port.c b/opensm/opensm/osm_port.c index ffc4fb0..653212a 100644 --- a/opensm/opensm/osm_port.c +++ b/opensm/opensm/osm_port.c @@ -151,7 +151,7 @@ void osm_port_delete(IN OUT osm_port_t ** const pp_port) static void osm_port_init(IN osm_port_t * const p_port, IN const ib_node_info_t * p_ni, - IN const osm_node_t * const p_parent_node) + IN osm_node_t * const p_parent_node) { ib_net64_t port_guid; osm_physp_t *p_physp; @@ -183,7 +183,7 @@ osm_port_init(IN osm_port_t * const p_port, /********************************************************************** **********************************************************************/ osm_port_t *osm_port_new(IN const ib_node_info_t * p_ni, - IN const osm_node_t * const p_parent_node) + IN osm_node_t * const p_parent_node) { osm_port_t *p_port; @@ -318,7 +318,7 @@ osm_physp_calc_link_mtu(IN osm_log_t * p_log, IN const osm_physp_t * p_physp) OSM_LOG_ENTER(p_log, osm_physp_calc_link_mtu); 
p_remote_physp = osm_physp_get_remote(p_physp); - if (p_remote_physp && osm_physp_is_valid(p_remote_physp)) { + if (p_remote_physp) { /* use the available MTU */ mtu = ib_port_info_get_mtu_cap(&p_physp->port_info); @@ -383,7 +383,7 @@ osm_physp_calc_link_op_vls(IN osm_log_t * p_log, OSM_LOG_ENTER(p_log, osm_physp_calc_link_op_vls); p_remote_physp = osm_physp_get_remote(p_physp); - if (p_remote_physp && osm_physp_is_valid(p_remote_physp)) { + if (p_remote_physp) { /* use the available VLCap */ op_vls = ib_port_info_get_vl_cap(&p_physp->port_info); @@ -508,7 +508,7 @@ __osm_physp_get_dr_physp_set(IN osm_log_t * p_log, p_path->path[hop]); /* make sure we got a valid port and it has a remote port */ - if (!osm_physp_is_valid(p_physp)) { + if (!p_physp) { osm_log(p_log, OSM_LOG_ERROR, "__osm_physp_get_dr_nodes_set: ERR 4104: " "DR Traversal stopped on invalid port at hop:%u\n", @@ -643,7 +643,6 @@ osm_physp_replace_dr_path_with_alternate_dr_path(IN osm_log_t * p_log, p_physp = p_port->p_physp; CL_ASSERT(p_physp); - CL_ASSERT(osm_physp_is_valid(p_physp)); cl_list_insert_tail(p_nextPortsList, p_physp); @@ -675,12 +674,11 @@ osm_physp_replace_dr_path_with_alternate_dr_path(IN osm_log_t * p_log, /* make sure that all of the following occurred: 1. The port isn't NULL - 2. The port is a valid port - 3. This is not the port we came from - 4. The port is not in the physp_map - 5. This port haven't been visited before + 2. This is not the port we came from + 3. The port is not in the physp_map + 4. 
This port haven't been visited before */ - if (osm_physp_is_valid(p_remote_physp) && + if (p_remote_physp && p_remote_physp != p_physp && cl_map_get(&physp_map, __osm_ptr_to_key(p_remote_physp)) @@ -749,7 +747,7 @@ boolean_t osm_link_is_healthy(IN const osm_physp_t * const p_physp) CL_ASSERT(p_physp); p_remote_physp = p_physp->p_remote_physp; - if (p_remote_physp != NULL && osm_physp_is_valid(p_remote_physp)) + if (p_remote_physp != NULL) return ((p_physp->healthy) & (p_remote_physp->healthy)); /* the other side is not known - consider the link as healthy */ return (TRUE); @@ -766,7 +764,6 @@ osm_physp_set_pkey_tbl(IN osm_log_t * p_log, uint16_t max_blocks; CL_ASSERT(p_pkey_tbl); - CL_ASSERT(osm_physp_is_valid(p_physp)); /* (14.2.5.7) - the block number valid values are 0-2047, and are further limited by the size of the P_Key table specified by the PartitionCap on the diff --git a/opensm/opensm/osm_port_info_rcv.c b/opensm/opensm/osm_port_info_rcv.c index b91d1a5..e56ba51 100644 --- a/opensm/opensm/osm_port_info_rcv.c +++ b/opensm/opensm/osm_port_info_rcv.c @@ -241,8 +241,7 @@ __osm_pi_rcv_process_switch_port(IN osm_sm_t * sm, switch (ib_port_info_get_port_state(p_pi)) { case IB_LINK_DOWN: p_remote_physp = osm_physp_get_remote(p_physp); - if (p_remote_physp - && osm_physp_is_valid(p_remote_physp)) { + if (p_remote_physp) { p_remote_node = osm_physp_get_node_ptr(p_remote_physp); remote_port_num = @@ -475,7 +474,7 @@ osm_pi_rcv_process_set(IN osm_sm_t * sm, IN osm_node_t * const p_node, CL_ASSERT(p_node); p_physp = osm_node_get_physp_ptr(p_node, port_num); - CL_ASSERT(osm_physp_is_valid(p_physp)); + CL_ASSERT(p_physp); port_guid = osm_physp_get_port_guid(p_physp); @@ -639,13 +638,13 @@ void osm_pi_rcv_process(IN void *context, IN void *data) If so, initialize the new Physical Port then continue processing as normal. 
*/ - if (!osm_physp_is_valid(p_physp)) { + if (!p_physp) { if (osm_log_is_active(sm->p_log, OSM_LOG_VERBOSE)) osm_log(sm->p_log, OSM_LOG_VERBOSE, "osm_pi_rcv_process: " "Initializing port number 0x%X\n", port_num); - + p_physp = &p_node->physp_table[port_num]; osm_physp_init(p_physp, port_guid, port_num, diff --git a/opensm/opensm/osm_qos.c b/opensm/opensm/osm_qos.c index c437028..1a6cc05 100644 --- a/opensm/opensm/osm_qos.c +++ b/opensm/opensm/osm_qos.c @@ -311,7 +311,7 @@ osm_signal_t osm_qos_setup(osm_opensm_t * p_osm) num_physp = osm_node_get_num_physp(p_node); for (i = 1; i < num_physp; i++) { p_physp = osm_node_get_physp_ptr(p_node, i); - if (!osm_physp_is_valid(p_physp)) + if (!p_physp) continue; force_update = p_physp->need_update || p_osm->subn.need_update; @@ -332,7 +332,7 @@ osm_signal_t osm_qos_setup(osm_opensm_t * p_osm) cfg = &ca_config; p_physp = p_port->p_physp; - if (!osm_physp_is_valid(p_physp)) + if (!p_physp) continue; force_update = p_physp->need_update || p_osm->subn.need_update; diff --git a/opensm/opensm/osm_qos_parser.y b/opensm/opensm/osm_qos_parser.y index 8cae5f3..50cac63 100644 --- a/opensm/opensm/osm_qos_parser.y +++ b/opensm/opensm/osm_qos_parser.y @@ -2884,9 +2884,8 @@ static void __parser_add_port_to_port_map( cl_qmap_t * p_map, osm_physp_t * p_physp) { - if (p_physp && osm_physp_is_valid(p_physp) && - cl_qmap_get(p_map, cl_ntoh64( - osm_physp_get_port_guid(p_physp))) == cl_qmap_end(p_map)) + if (cl_qmap_get(p_map, cl_ntoh64(osm_physp_get_port_guid(p_physp))) == + cl_qmap_end(p_map)) { osm_qos_port_t * p_port = osm_qos_policy_port_create(p_physp); if (p_port) diff --git a/opensm/opensm/osm_sa_guidinfo_record.c b/opensm/opensm/osm_sa_guidinfo_record.c index a2c47bb..af8ba6e 100644 --- a/opensm/opensm/osm_sa_guidinfo_record.c +++ b/opensm/opensm/osm_sa_guidinfo_record.c @@ -125,7 +125,7 @@ __osm_gir_rcv_new_gir(IN osm_sa_t * sa, **********************************************************************/ static void 
__osm_sa_gir_create_gir(IN osm_sa_t * sa, - IN const osm_node_t * const p_node, + IN osm_node_t * const p_node, IN cl_qlist_t * const p_list, IN ib_net64_t const match_port_guid, IN ib_net16_t const match_lid, @@ -164,8 +164,7 @@ __osm_sa_gir_create_gir(IN osm_sa_t * sa, for (port_num = 0; port_num < num_ports; port_num++) { p_physp = osm_node_get_physp_ptr(p_node, port_num); - - if (!osm_physp_is_valid(p_physp)) + if (!p_physp) continue; /* Check to see if the found p_physp and the requester physp @@ -240,7 +239,7 @@ __osm_sa_gir_by_comp_mask_cb(IN cl_map_item_t * const p_map_item, { const osm_gir_search_ctxt_t *const p_ctxt = (osm_gir_search_ctxt_t *) context; - const osm_node_t *const p_node = (osm_node_t *) p_map_item; + osm_node_t *const p_node = (osm_node_t *) p_map_item; const ib_guidinfo_record_t *const p_rcvd_rec = p_ctxt->p_rcvd_rec; const osm_physp_t *const p_req_physp = p_ctxt->p_req_physp; osm_sa_t *sa = p_ctxt->sa; diff --git a/opensm/opensm/osm_sa_link_record.c b/opensm/opensm/osm_sa_link_record.c index 1b833eb..8c7e2e7 100644 --- a/opensm/opensm/osm_sa_link_record.c +++ b/opensm/opensm/osm_sa_link_record.c @@ -135,12 +135,7 @@ __osm_lr_rcv_get_physp_link(IN osm_sa_t * sa, the other side. */ if (p_src_physp) { - if (!osm_physp_is_valid(p_src_physp)) - goto Exit; - if (p_dest_physp) { - if (!osm_physp_is_valid(p_dest_physp)) - goto Exit; /* Ensure the two physp's are actually connected. If not, bail out. 
@@ -149,31 +144,18 @@ __osm_lr_rcv_get_physp_link(IN osm_sa_t * sa, goto Exit; } else { p_dest_physp = osm_physp_get_remote(p_src_physp); - if (p_dest_physp == NULL) goto Exit; - - if (!osm_physp_is_valid(p_dest_physp)) - goto Exit; } } else { if (p_dest_physp) { - if (!osm_physp_is_valid(p_dest_physp)) - goto Exit; - p_src_physp = osm_physp_get_remote(p_dest_physp); - if (p_src_physp == NULL) goto Exit; } else goto Exit; /* no physp's, so nothing to do */ } - CL_ASSERT(p_src_physp); - CL_ASSERT(p_dest_physp); - CL_ASSERT(osm_physp_is_valid(p_src_physp)); - CL_ASSERT(osm_physp_is_valid(p_dest_physp)); - /* Check that the p_src_physp, p_dest_physp and p_req_physp all share a pkey (doesn't have to be the same p_key). */ if (!osm_physp_share_pkey(sa->p_log, p_src_physp, p_dest_physp)) { @@ -284,8 +266,7 @@ __osm_lr_rcv_get_port_links(IN osm_sa_t * sa, p_node, dest_port_num); /* both physical ports should be with data */ - if (osm_physp_is_valid(p_src_physp) && - osm_physp_is_valid(p_dest_physp)) + if (p_src_physp && p_dest_physp) __osm_lr_rcv_get_physp_link (sa, p_lr, p_src_physp, p_dest_physp, comp_mask, @@ -306,7 +287,7 @@ __osm_lr_rcv_get_port_links(IN osm_sa_t * sa, osm_node_get_physp_ptr(p_src_port-> p_node, port_num); - if (osm_physp_is_valid(p_src_physp)) + if (p_src_physp) __osm_lr_rcv_get_physp_link (sa, p_lr, p_src_physp, NULL, comp_mask, p_list, @@ -321,7 +302,7 @@ __osm_lr_rcv_get_port_links(IN osm_sa_t * sa, osm_node_get_physp_ptr(p_src_port-> p_node, port_num); - if (osm_physp_is_valid(p_src_physp)) + if (p_src_physp) __osm_lr_rcv_get_physp_link (sa, p_lr, p_src_physp, NULL, comp_mask, p_list, @@ -344,7 +325,7 @@ __osm_lr_rcv_get_port_links(IN osm_sa_t * sa, osm_node_get_physp_ptr(p_dest_port-> p_node, port_num); - if (osm_physp_is_valid(p_dest_physp)) + if (p_dest_physp) __osm_lr_rcv_get_physp_link (sa, p_lr, NULL, p_dest_physp, comp_mask, @@ -359,7 +340,7 @@ __osm_lr_rcv_get_port_links(IN osm_sa_t * sa, osm_node_get_physp_ptr(p_dest_port-> p_node, 
port_num); - if (osm_physp_is_valid(p_dest_physp)) + if (p_dest_physp) __osm_lr_rcv_get_physp_link (sa, p_lr, NULL, p_dest_physp, comp_mask, @@ -380,15 +361,12 @@ __osm_lr_rcv_get_port_links(IN osm_sa_t * sa, scan all the ports of this node anyway. */ p_src_physp = osm_node_get_any_physp_ptr(p_node); - if (osm_physp_is_valid(p_src_physp)) { - p_src_port = (osm_port_t *) - cl_qmap_get(&sa->p_subn->port_guid_tbl, - osm_physp_get_port_guid(p_src_physp)); - __osm_lr_rcv_get_port_links(sa, p_lr, - p_src_port, NULL, - comp_mask, p_list, - p_req_physp); - } + p_src_port = osm_get_port_by_guid(sa->p_subn, + osm_physp_get_port_guid(p_src_physp)); + __osm_lr_rcv_get_port_links(sa, p_lr, + p_src_port, NULL, + comp_mask, p_list, + p_req_physp); p_node = (osm_node_t *) cl_qmap_next(&p_node-> map_item); } diff --git a/opensm/opensm/osm_sa_multipath_record.c b/opensm/opensm/osm_sa_multipath_record.c index 1fa81d6..032c297 100644 --- a/opensm/opensm/osm_sa_multipath_record.c +++ b/opensm/opensm/osm_sa_multipath_record.c @@ -354,7 +354,6 @@ __osm_mpr_rcv_get_path_parms(IN osm_sa_t * sa, Continue with the egress port on this switch. 
*/ p_physp = osm_switch_get_route_by_lid(p_node->sw, dest_lid); - if (p_physp == 0) { osm_log(sa->p_log, OSM_LOG_ERROR, "__osm_mpr_rcv_get_path_parms: ERR 4516: " @@ -365,8 +364,6 @@ __osm_mpr_rcv_get_path_parms(IN osm_sa_t * sa, goto Exit; } - CL_ASSERT(osm_physp_is_valid(p_physp)); - p_pi = &p_physp->port_info; if (mtu > ib_port_info_get_mtu_cap(p_pi)) diff --git a/opensm/opensm/osm_sa_node_record.c b/opensm/opensm/osm_sa_node_record.c index a9a3708..4af8e58 100644 --- a/opensm/opensm/osm_sa_node_record.c +++ b/opensm/opensm/osm_sa_node_record.c @@ -122,7 +122,7 @@ __osm_nr_rcv_new_nr(IN osm_sa_t * sa, **********************************************************************/ static void __osm_nr_rcv_create_nr(IN osm_sa_t * sa, - IN const osm_node_t * const p_node, + IN osm_node_t * const p_node, IN cl_qlist_t * const p_list, IN ib_net64_t const match_port_guid, IN ib_net16_t const match_lid, @@ -160,8 +160,7 @@ __osm_nr_rcv_create_nr(IN osm_sa_t * sa, for (port_num = 0; port_num < num_ports; port_num++) { p_physp = osm_node_get_physp_ptr(p_node, port_num); - - if (!osm_physp_is_valid(p_physp)) + if (!p_physp) continue; /* Check to see if the found p_physp and the requester physp @@ -210,7 +209,7 @@ __osm_nr_rcv_by_comp_mask(IN cl_map_item_t * const p_map_item, IN void *context) { const osm_nr_search_ctxt_t *const p_ctxt = (osm_nr_search_ctxt_t *) context; - const osm_node_t *const p_node = (osm_node_t *) p_map_item; + osm_node_t *const p_node = (osm_node_t *) p_map_item; const ib_node_record_t *const p_rcvd_rec = p_ctxt->p_rcvd_rec; const osm_physp_t *const p_req_physp = p_ctxt->p_req_physp; osm_sa_t *sa = p_ctxt->sa; diff --git a/opensm/opensm/osm_sa_path_record.c b/opensm/opensm/osm_sa_path_record.c index 816e8e2..cc87bc7 100644 --- a/opensm/opensm/osm_sa_path_record.c +++ b/opensm/opensm/osm_sa_path_record.c @@ -360,7 +360,6 @@ __osm_pr_rcv_get_path_parms(IN osm_sa_t * sa, Continue with the egress port on this switch. 
*/ p_physp = osm_switch_get_route_by_lid(p_node->sw, dest_lid); - if (p_physp == 0) { osm_log(sa->p_log, OSM_LOG_ERROR, "__osm_pr_rcv_get_path_parms: ERR 1F07: " @@ -371,8 +370,6 @@ __osm_pr_rcv_get_path_parms(IN osm_sa_t * sa, goto Exit; } - CL_ASSERT(osm_physp_is_valid(p_physp)); - p_pi = &p_physp->port_info; if (mtu > ib_port_info_get_mtu_cap(p_pi)) diff --git a/opensm/opensm/osm_sa_pkey_record.c b/opensm/opensm/osm_sa_pkey_record.c index e7547df..e21c8a8 100644 --- a/opensm/opensm/osm_sa_pkey_record.c +++ b/opensm/opensm/osm_sa_pkey_record.c @@ -181,7 +181,7 @@ __osm_sa_pkey_by_comp_mask(IN osm_sa_t * sa, osm_node_get_physp_ptr(p_port->p_node, port_num); /* Check that the p_physp is valid, and that is shares a pkey with the p_req_physp. */ - if (osm_physp_is_valid(p_physp) && + if (p_physp && (osm_physp_share_pkey (sa->p_log, p_req_physp, p_physp))) __osm_sa_pkey_check_physp(sa, p_physp, @@ -199,7 +199,7 @@ __osm_sa_pkey_by_comp_mask(IN osm_sa_t * sa, for (port_num = 0; port_num < num_ports; port_num++) { p_physp = osm_node_get_physp_ptr(p_port->p_node, port_num); - if (!osm_physp_is_valid(p_physp)) + if (!p_physp) continue; /* if the requester and the p_physp don't share a pkey - diff --git a/opensm/opensm/osm_sa_portinfo_record.c b/opensm/opensm/osm_sa_portinfo_record.c index 16dd852..0cac69c 100644 --- a/opensm/opensm/osm_sa_portinfo_record.c +++ b/opensm/opensm/osm_sa_portinfo_record.c @@ -407,7 +407,7 @@ __osm_sa_pir_check_physp(IN osm_sa_t * sa, **********************************************************************/ static void __osm_sa_pir_by_comp_mask(IN osm_sa_t * sa, - IN const osm_node_t * const p_node, + IN osm_node_t * const p_node, osm_pir_search_ctxt_t * const p_ctxt) { const ib_portinfo_record_t *p_rcvd_rec; @@ -432,7 +432,7 @@ __osm_sa_pir_by_comp_mask(IN osm_sa_t * sa, p_rcvd_rec->port_num); /* Check that the p_physp is valid, and that the p_physp and the p_req_physp share a pkey.
*/ - if (osm_physp_is_valid(p_physp) && + if (p_physp && osm_physp_share_pkey(sa->p_log, p_req_physp, p_physp)) __osm_sa_pir_check_physp(sa, p_physp, @@ -442,7 +442,7 @@ __osm_sa_pir_by_comp_mask(IN osm_sa_t * sa, for (port_num = 0; port_num < num_ports; port_num++) { p_physp = osm_node_get_physp_ptr(p_node, port_num); - if (!osm_physp_is_valid(p_physp)) + if (!p_physp) continue; /* if the requester and the p_physp don't share a pkey - @@ -464,7 +464,7 @@ static void __osm_sa_pir_by_comp_mask_cb(IN cl_map_item_t * const p_map_item, IN void *context) { - const osm_node_t *const p_node = (osm_node_t *) p_map_item; + osm_node_t *const p_node = (osm_node_t *) p_map_item; osm_pir_search_ctxt_t *const p_ctxt = (osm_pir_search_ctxt_t *) context; __osm_sa_pir_by_comp_mask(p_ctxt->sa, p_node, p_ctxt); diff --git a/opensm/opensm/osm_sa_slvl_record.c b/opensm/opensm/osm_sa_slvl_record.c index cc21765..ba13010 100644 --- a/opensm/opensm/osm_sa_slvl_record.c +++ b/opensm/opensm/osm_sa_slvl_record.c @@ -176,7 +176,7 @@ __osm_sa_slvl_by_comp_mask(IN osm_sa_t * sa, p_out_physp = osm_node_get_physp_ptr(p_port->p_node, out_port_num); - if (!osm_physp_is_valid(p_out_physp)) + if (!p_out_physp) continue; for (in_port_num = in_port_start; @@ -189,7 +189,7 @@ __osm_sa_slvl_by_comp_mask(IN osm_sa_t * sa, p_in_physp = osm_node_get_physp_ptr(p_port->p_node, in_port_num); - if (!osm_physp_is_valid(p_in_physp)) + if (!p_in_physp) continue; /* if the requester and the p_out_physp don't share a pkey - diff --git a/opensm/opensm/osm_sa_vlarb_record.c b/opensm/opensm/osm_sa_vlarb_record.c index 51bc517..3ada071 100644 --- a/opensm/opensm/osm_sa_vlarb_record.c +++ b/opensm/opensm/osm_sa_vlarb_record.c @@ -187,7 +187,7 @@ __osm_sa_vl_arb_by_comp_mask(IN osm_sa_t * sa, osm_node_get_physp_ptr(p_port->p_node, port_num); /* check that the p_physp is valid, and that the requester and the p_physp share a pkey. 
*/ - if (osm_physp_is_valid(p_physp) && + if (p_physp && osm_physp_share_pkey(sa->p_log, p_req_physp, p_physp)) __osm_sa_vl_arb_check_physp(sa, p_physp, @@ -205,7 +205,7 @@ __osm_sa_vl_arb_by_comp_mask(IN osm_sa_t * sa, for (port_num = 0; port_num < num_ports; port_num++) { p_physp = osm_node_get_physp_ptr(p_port->p_node, port_num); - if (!osm_physp_is_valid(p_physp)) + if (!p_physp) continue; /* if the requester and the p_physp don't share a pkey - diff --git a/opensm/opensm/osm_slvl_map_rcv.c b/opensm/opensm/osm_slvl_map_rcv.c index 3f9c88a..2af9be2 100644 --- a/opensm/opensm/osm_slvl_map_rcv.c +++ b/opensm/opensm/osm_slvl_map_rcv.c @@ -125,8 +125,6 @@ void osm_slvl_rcv_process(IN void *context, IN void *p_data) in_port_num = 0; } - CL_ASSERT(p_physp); - /* We do not mind if this is a result of a set or get - all we want is to update the subnet. @@ -145,7 +143,7 @@ void osm_slvl_rcv_process(IN void *context, IN void *p_data) Determine if we encountered a new Physical Port. If so, Ignore it. 
*/ - if (!osm_physp_is_valid(p_physp)) { + if (!p_physp) { osm_log(sm->p_log, OSM_LOG_ERROR, "osm_slvl_rcv_process: " "Got invalid port number 0x%X\n", out_port_num); diff --git a/opensm/opensm/osm_state_mgr.c b/opensm/opensm/osm_state_mgr.c index e4130cc..4b7dcac 100644 --- a/opensm/opensm/osm_state_mgr.c +++ b/opensm/opensm/osm_state_mgr.c @@ -720,7 +720,6 @@ static boolean_t __osm_state_mgr_is_sm_port_down(IN osm_state_mgr_t * p_physp = p_port->p_physp; CL_ASSERT(p_physp); - CL_ASSERT(osm_physp_is_valid(p_physp)); state = osm_physp_get_port_state(p_physp); CL_PLOCK_RELEASE(p_mgr->p_lock); @@ -789,7 +788,7 @@ static ib_api_status_t __osm_state_mgr_sweep_hop_1(IN osm_state_mgr_t * p_physp = osm_node_get_physp_ptr(p_node, port_num); - CL_ASSERT(osm_physp_is_valid(p_physp)); + CL_ASSERT(p_physp); p_dr_path = osm_physp_get_dr_path_ptr(p_physp); h_bind = osm_dr_path_get_bind_handle(p_dr_path); @@ -911,7 +910,7 @@ static ib_api_status_t __osm_state_mgr_light_sweep_start(IN osm_state_mgr_t * port_num++) { p_physp = osm_node_get_physp_ptr(p_node, port_num); - if (osm_physp_is_valid(p_physp) + if (p_physp && (osm_physp_get_port_state(p_physp) != IB_LINK_DOWN) && !osm_physp_get_remote(p_physp)) { diff --git a/opensm/opensm/osm_sw_info_rcv.c b/opensm/opensm/osm_sw_info_rcv.c index 962f6c7..dbf8b8c 100644 --- a/opensm/opensm/osm_sw_info_rcv.c +++ b/opensm/opensm/osm_sw_info_rcv.c @@ -96,8 +96,6 @@ __osm_si_rcv_get_port_info(IN osm_sm_t * sm, */ p_physp = osm_node_get_any_physp_ptr(p_node); - CL_ASSERT(osm_physp_is_valid(p_physp)); - context.pi_context.node_guid = osm_node_get_node_guid(p_node); context.pi_context.port_guid = osm_physp_get_port_guid(p_physp); context.pi_context.set_method = FALSE; @@ -152,8 +150,6 @@ __osm_si_rcv_get_fwd_tbl(IN osm_sm_t * sm, p_physp = osm_node_get_any_physp_ptr(p_node); - CL_ASSERT(osm_physp_is_valid(p_physp)); - context.lft_context.node_guid = osm_node_get_node_guid(p_node); context.lft_context.set_method = FALSE; @@ -223,8 +219,6 @@ 
__osm_si_rcv_get_mcast_fwd_tbl(IN osm_sm_t * sm, p_physp = osm_node_get_any_physp_ptr(p_node); p_tbl = osm_switch_get_mcast_tbl_ptr(p_sw); - CL_ASSERT(osm_physp_is_valid(p_physp)); - context.mft_context.node_guid = osm_node_get_node_guid(p_node); context.mft_context.set_method = FALSE; diff --git a/opensm/opensm/osm_switch.c b/opensm/opensm/osm_switch.c index 4cb6272..23429c7 100644 --- a/opensm/opensm/osm_switch.c +++ b/opensm/opensm/osm_switch.c @@ -316,8 +316,7 @@ osm_switch_recommend_path(IN const osm_switch_t * const p_sw, Verify that the port number is legal and that the LID is reachable through this port. */ - if (osm_physp_is_valid(p_physp) && - osm_physp_is_healthy(p_physp) && + if (p_physp && osm_physp_is_healthy(p_physp) && osm_physp_get_remote(p_physp)) { hops = osm_switch_get_hop_count(p_sw, base_lid, @@ -359,8 +358,7 @@ osm_switch_recommend_path(IN const osm_switch_t * const p_sw, /* let us make sure it is not down or unhealthy */ p_physp = osm_node_get_physp_ptr(p_sw->p_node, port_num); - if (!osm_physp_is_valid(p_physp) || - !osm_physp_is_healthy(p_physp) || + if (!p_physp || !osm_physp_is_healthy(p_physp) || /* we require all - non sma ports to be linked to be routed through diff --git a/opensm/opensm/osm_trap_rcv.c b/opensm/opensm/osm_trap_rcv.c index b7a8c40..53269b4 100644 --- a/opensm/opensm/osm_trap_rcv.c +++ b/opensm/opensm/osm_trap_rcv.c @@ -90,12 +90,11 @@ typedef struct _osm_trap_aging_tracker_context { /********************************************************************** **********************************************************************/ -static osm_physp_t *__get_physp_by_lid_and_num(IN osm_sm_t * sm, - IN uint16_t lid, IN uint8_t num) +static osm_physp_t *get_physp_by_lid_and_num(IN osm_sm_t * sm, + IN uint16_t lid, IN uint8_t num) { cl_ptr_vector_t *p_vec = &(sm->p_subn->port_lid_tbl); osm_port_t *p_port; - osm_physp_t *p_physp; if (lid > cl_ptr_vector_get_size(p_vec)) return NULL; @@ -107,9 +106,7 @@ static osm_physp_t 
*__get_physp_by_lid_and_num(IN osm_sm_t * sm, if (osm_node_get_num_physp(p_port->p_node) < num) return NULL; - p_physp = osm_node_get_physp_ptr(p_port->p_node, num); - - return osm_physp_is_valid(p_physp) ? p_physp : NULL; + return osm_node_get_physp_ptr(p_port->p_node, num); } /********************************************************************** @@ -132,7 +129,7 @@ osm_trap_rcv_aging_tracker_callback(IN uint64_t key, lid = cl_ntoh16((uint16_t) ((key & 0x0000FFFF00000000ULL) >> 32)); port_num = (uint8_t) ((key & 0x00FF000000000000ULL) >> 48); - p_physp = __get_physp_by_lid_and_num(sm, lid, port_num); + p_physp = get_physp_by_lid_and_num(sm, lid, port_num); if (!p_physp) osm_log(sm->p_log, OSM_LOG_VERBOSE, "osm_trap_rcv_aging_tracker_callback: " @@ -140,7 +137,7 @@ osm_trap_rcv_aging_tracker_callback(IN uint64_t key, port_num, lid); /* make sure the physp is still valid */ /* If the health port was false - set it to true */ - else if (osm_physp_is_valid(p_physp) && !osm_physp_is_healthy(p_physp)) { + else if (!osm_physp_is_healthy(p_physp)) { osm_log(sm->p_log, OSM_LOG_VERBOSE, "osm_trap_rcv_aging_tracker_callback: " "Clearing health bit of port num:%u with lid:%u\n", @@ -450,13 +447,13 @@ __osm_trap_rcv_process_request(IN osm_sm_t * sm, */ if (physp_change_trap == TRUE) { /* get the port */ - p_physp = __get_physp_by_lid_and_num(sm, - cl_ntoh16 - (p_ntci-> - data_details. - ntc_129_131. - lid), - port_num); + p_physp = get_physp_by_lid_and_num(sm, + cl_ntoh16 + (p_ntci-> + data_details. + ntc_129_131. 
+ lid), + port_num); if (!p_physp) osm_log(sm->p_log, OSM_LOG_ERROR, diff --git a/opensm/opensm/osm_ucast_ftree.c b/opensm/opensm/osm_ucast_ftree.c index dcbdc44..e9e00a4 100644 --- a/opensm/opensm/osm_ucast_ftree.c +++ b/opensm/opensm/osm_ucast_ftree.c @@ -2829,9 +2829,7 @@ __osm_ftree_rank_switches_from_leafs(IN ftree_fabric_t * p_ftree, /* note: skipping port 0 on switches */ for (i = 1; i < osm_node_get_num_physp(p_node); i++) { p_osm_port = osm_node_get_physp_ptr(p_node, i); - if (!osm_physp_is_valid(p_osm_port)) - continue; - if (!osm_link_is_healthy(p_osm_port)) + if (!p_osm_port || !osm_link_is_healthy(p_osm_port)) continue; p_remote_node = @@ -2883,9 +2881,7 @@ __osm_ftree_rank_leaf_switches(IN ftree_fabric_t * p_ftree, for (i = 0; i < osm_node_get_num_physp(p_osm_node); i++) { p_osm_port = osm_node_get_physp_ptr(p_osm_node, i); - if (!osm_physp_is_valid(p_osm_port)) - continue; - if (!osm_link_is_healthy(p_osm_port)) + if (!p_osm_port || !osm_link_is_healthy(p_osm_port)) continue; p_remote_osm_node = @@ -2989,10 +2985,7 @@ __osm_ftree_fabric_construct_hca_ports(IN ftree_fabric_t * p_ftree, for (i = 0; i < osm_node_get_num_physp(p_node); i++) { osm_physp_t *p_osm_port = osm_node_get_physp_ptr(p_node, i); - - if (!osm_physp_is_valid(p_osm_port)) - continue; - if (!osm_link_is_healthy(p_osm_port)) + if (!p_osm_port || !osm_link_is_healthy(p_osm_port)) continue; p_remote_osm_port = osm_physp_get_remote(p_osm_port); @@ -3115,10 +3108,7 @@ static int __osm_ftree_fabric_construct_sw_ports(IN ftree_fabric_t * p_ftree, for (i = 0; i < osm_node_get_num_physp(p_node); i++) { osm_physp_t *p_osm_port = osm_node_get_physp_ptr(p_node, i); - - if (!osm_physp_is_valid(p_osm_port)) - continue; - if (!osm_link_is_healthy(p_osm_port)) + if (!p_osm_port || !osm_link_is_healthy(p_osm_port)) continue; p_remote_osm_port = osm_physp_get_remote(p_osm_port); @@ -3291,9 +3281,7 @@ static int __osm_ftree_fabric_rank_from_roots(IN ftree_fabric_t * p_ftree) /* note: skipping port 0 on
switches */ for (i = 1; i < osm_node_get_num_physp(p_osm_node); i++) { p_osm_physp = osm_node_get_physp_ptr(p_osm_node, i); - if (!osm_physp_is_valid(p_osm_physp)) - continue; - if (!osm_link_is_healthy(p_osm_physp)) + if (!p_osm_physp || !osm_link_is_healthy(p_osm_physp)) continue; p_remote_osm_node = diff --git a/opensm/opensm/osm_ucast_lash.c b/opensm/opensm/osm_ucast_lash.c index cf9d701..65f688e 100644 --- a/opensm/opensm/osm_ucast_lash.c +++ b/opensm/opensm/osm_ucast_lash.c @@ -187,13 +187,12 @@ static uint8_t find_port_from_lid(IN const ib_net16_t lid_no, for (i = 1; i < port_count; i++) { p_current_physp = osm_node_get_physp_ptr(p_sw->p_node, i); - - if (!osm_physp_is_valid(p_current_physp)) + if (!p_current_physp) continue; p_remote_physp = p_current_physp->p_remote_physp; - if (p_remote_physp && osm_physp_is_valid(p_remote_physp)) { + if (p_remote_physp) { osm_node_t *p_opposite_node = osm_physp_get_node_ptr(p_remote_physp); @@ -1216,12 +1215,9 @@ static void osm_lash_process_switch(lash_t * p_lash, osm_switch_t * p_sw) for (i = 1; i < port_count; i++) { p_current_physp = osm_node_get_physp_ptr(p_sw->p_node, i); - - if (osm_physp_is_valid(p_current_physp)) { + if (p_current_physp) { p_remote_physp = p_current_physp->p_remote_physp; - - if (p_remote_physp && osm_physp_is_valid(p_remote_physp) - && p_remote_physp->p_node->sw) { + if (p_remote_physp && p_remote_physp->p_node->sw) { int physical_port_a_num = osm_physp_get_port_num(p_current_physp); int physical_port_b_num = @@ -1315,7 +1311,7 @@ static int discover_network_properties(lash_t * p_lash) osm_physp_t *p_current_physp = osm_node_get_physp_ptr(p_sw->p_node, i); - if (osm_physp_is_valid(p_current_physp) + if (p_current_physp && p_current_physp->p_remote_physp) { ib_port_info_t *p_port_info = diff --git a/opensm/opensm/osm_ucast_updn.c b/opensm/opensm/osm_ucast_updn.c index c77188f..411e15b 100644 --- a/opensm/opensm/osm_ucast_updn.c +++ b/opensm/opensm/osm_ucast_updn.c @@ -390,12 +390,9 @@ 
updn_subn_rank(IN unsigned num_guids, /* make sure that all the following occur on p_remote_physp: 1. The port isn't NULL - 2. The port is a valid port - 3. It is a switch + 2. It is a switch */ - if (p_remote_physp && - osm_physp_is_valid(p_remote_physp) && - p_remote_physp->p_node->sw) { + if (p_remote_physp && p_remote_physp->p_node->sw) { remote_u = p_remote_physp->p_node->sw->priv; port_guid = p_remote_physp->port_guid; diff --git a/opensm/opensm/osm_vl_arb_rcv.c b/opensm/opensm/osm_vl_arb_rcv.c index a88bf70..8a5b8b4 100644 --- a/opensm/opensm/osm_vl_arb_rcv.c +++ b/opensm/opensm/osm_vl_arb_rcv.c @@ -122,8 +122,6 @@ void osm_vla_rcv_process(IN void *context, IN void *data) port_num = p_physp->port_num; } - CL_ASSERT(p_physp); - /* We do not mind if this is a result of a set or get - all we want is to update the subnet. @@ -141,7 +139,7 @@ void osm_vla_rcv_process(IN void *context, IN void *data) Determine if we encountered a new Physical Port. If so, Ignore it. */ - if (!osm_physp_is_valid(p_physp)) { + if (!p_physp) { osm_log(sm->p_log, OSM_LOG_ERROR, "osm_vla_rcv_process: " "Got invalid port number 0x%X\n", port_num); -- 1.5.4.rc5 From sashak at voltaire.com Sun Jan 27 07:42:00 2008 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Sun, 27 Jan 2008 15:42:00 +0000 Subject: [ofa-general] [PATCH] opensm: remove osm_physp_get_mod_pkey_tbl() Message-ID: <20080127154200.GJ24344@sashak.voltaire.com> Remove osm_physp_get_mod_pkey_tbl() which is perfectly duplicated by the original osm_physp_get_pkey_tbl(). 
Signed-off-by: Sasha Khapyorsky --- opensm/include/opensm/osm_port.h | 34 ---------------------------------- opensm/opensm/osm_pkey_mgr.c | 6 +++--- 2 files changed, 3 insertions(+), 37 deletions(-) diff --git a/opensm/include/opensm/osm_port.h b/opensm/include/opensm/osm_port.h index 963e13b..a9bf78f 100644 --- a/opensm/include/opensm/osm_port.h +++ b/opensm/include/opensm/osm_port.h @@ -536,40 +536,6 @@ static inline const osm_pkey_tbl_t *osm_physp_get_pkey_tbl(IN const osm_physp_t * Port, Physical Port *********/ -/****f* OpenSM: Physical Port/osm_physp_get_mod_pkey_tbl -* NAME -* osm_physp_get_mod_pkey_tbl -* -* DESCRIPTION -* Returns a NON CONST pointer to the P_Key table object of the Physical Port object. -* -* SYNOPSIS -*/ -static inline osm_pkey_tbl_t *osm_physp_get_mod_pkey_tbl(IN osm_physp_t * - const p_physp) -{ - CL_ASSERT(osm_physp_is_valid(p_physp)); - /* - (14.2.5.7) - the block number valid values are 0-2047, and are further - limited by the size of the P_Key table specified by the PartitionCap on the node. - */ - return (&p_physp->pkeys); -}; - -/* -* PARAMETERS -* p_physp -* [in] Pointer to an osm_physp_t object. -* -* RETURN VALUES -* The pointer to the P_Key table object. 
-* -* NOTES -* -* SEE ALSO -* Port, Physical Port -*********/ - /****f* OpenSM: Physical Port/osm_physp_set_slvl_tbl * NAME * osm_physp_set_slvl_tbl diff --git a/opensm/opensm/osm_pkey_mgr.c b/opensm/opensm/osm_pkey_mgr.c index df17549..33eeb8b 100644 --- a/opensm/opensm/osm_pkey_mgr.c +++ b/opensm/opensm/osm_pkey_mgr.c @@ -97,7 +97,7 @@ pkey_mgr_process_physical_port(IN osm_log_t * p_log, char *stat = NULL; osm_pending_pkey_t *p_pending; - p_pkey_tbl = osm_physp_get_mod_pkey_tbl(p_physp); + p_pkey_tbl = &p_physp->pkeys; p_pending = (osm_pending_pkey_t *) malloc(sizeof(osm_pending_pkey_t)); if (!p_pending) { osm_log(p_log, OSM_LOG_ERROR, @@ -294,7 +294,7 @@ static boolean_t pkey_mgr_update_port(osm_log_t * p_log, osm_sm_t * sm, return FALSE; p_node = osm_physp_get_node_ptr(p_physp); - p_pkey_tbl = osm_physp_get_mod_pkey_tbl(p_physp); + p_pkey_tbl = &p_physp->pkeys; num_of_blocks = osm_pkey_tbl_get_num_blocks(p_pkey_tbl); max_num_of_blocks = pkey_mgr_get_physp_max_blocks(sm->p_subn, p_physp); @@ -434,7 +434,7 @@ pkey_mgr_update_peer_port(osm_log_t * p_log, osm_sm_t * sm, return FALSE; p_pkey_tbl = osm_physp_get_pkey_tbl(p_physp); - p_peer_pkey_tbl = osm_physp_get_mod_pkey_tbl(peer); + p_peer_pkey_tbl = &peer->pkeys; num_of_blocks = osm_pkey_tbl_get_num_blocks(p_pkey_tbl); peer_max_blocks = pkey_mgr_get_physp_max_blocks(p_subn, peer); if (peer_max_blocks < p_pkey_tbl->used_blocks) { -- 1.5.4.rc5 From sashak at voltaire.com Sun Jan 27 07:45:02 2008 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Sun, 27 Jan 2008 15:45:02 +0000 Subject: [ofa-general] Re: [PATCH v2] opensm/QoS: fixing RDS handling in QoS policy In-Reply-To: <479C5D03.2080104@dev.mellanox.co.il> References: <479C5D03.2080104@dev.mellanox.co.il> Message-ID: <20080127154502.GK24344@sashak.voltaire.com> On 12:29 Sun 27 Jan , Yevgeny Kliteynik wrote: > Sasha, > > Please apply the following patch to ofed_1_3 and master. 
> > Unlike SDP, RDS opens an RC pair of QPs per pair of IPs, so it > has a predefined port number that is included in the service id. > > Signed-off-by: Yevgeny Kliteynik Applied. Thanks. Sasha From sashak at voltaire.com Sun Jan 27 07:57:20 2008 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Sun, 27 Jan 2008 15:57:20 +0000 Subject: [ofa-general] Re: [PATCH] opensm/osm_ucast_ftree.c: ignore port 0 and loopbacks on swithces In-Reply-To: <479CA2C9.6090402@dev.mellanox.co.il> References: <479CA2C9.6090402@dev.mellanox.co.il> Message-ID: <20080127155720.GL24344@sashak.voltaire.com> Hi Yevgeny, On 17:27 Sun 27 Jan , Yevgeny Kliteynik wrote: > > Fat-tree routing should ignore port 0 and loopback > connections on switches when populating its db. > > Please apply to ofed_1_3 and master.
> > Signed-off-by: Yevgeny Kliteynik > --- > opensm/opensm/osm_ucast_ftree.c | 15 ++++++++++++++- > 1 files changed, 14 insertions(+), 1 deletions(-) > > diff --git a/opensm/opensm/osm_ucast_ftree.c b/opensm/opensm/osm_ucast_ftree.c > index dcbdc44..904a9c5 100644 > --- a/opensm/opensm/osm_ucast_ftree.c > +++ b/opensm/opensm/osm_ucast_ftree.c > @@ -3113,7 +3113,7 @@ static int __osm_ftree_fabric_construct_sw_ports(IN ftree_fabric_t * p_ftree, > > CL_ASSERT(osm_node_get_type(p_node) == IB_NODE_TYPE_SWITCH); > > - for (i = 0; i < osm_node_get_num_physp(p_node); i++) { > + for (i = 1; i < osm_node_get_num_physp(p_node); i++) { > osm_physp_t *p_osm_port = osm_node_get_physp_ptr(p_node, i); > > if (!osm_physp_is_valid(p_osm_port)) > @@ -3158,6 +3158,19 @@ static int __osm_ftree_fabric_construct_sw_ports(IN ftree_fabric_t * p_ftree, > __osm_ftree_fabric_get_sw_by_guid(p_ftree, > remote_node_guid); > CL_ASSERT(p_remote_sw); > + > + /* ignore any loopback connection on switch */ > + if (p_sw == p_remote_sw) { > + osm_log(&p_ftree->p_osm->log, OSM_LOG_DEBUG, > + "__osm_ftree_fabric_construct_sw_ports: " > + "Ignoring loopback on switch 0x%016" PRIx64 > + ", LID 0x%04x, rank %u\n", > + __osm_ftree_sw_get_guid_ho(p_sw), > + cl_ntoh16(p_sw->base_lid), > + p_sw->rank); > + continue; > + } > + What about to make it before remote switch resolving (5 lines above)? Something like: if (p_node == p_remote_node) { ..... continue; } should be faster. 
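The reordering suggested above can be sketched roughly as follows. This is only an illustration under simplified assumptions: struct node, struct sw, struct port, find_sw_by_guid() and count_sw_links() are hypothetical stand-ins, not OpenSM types or functions. The point is that comparing node pointers first means a loopback link never reaches the switch lookup.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the OpenSM node and fat-tree switch types. */
struct node { uint64_t guid; };
struct sw { uint64_t guid; };
struct port { struct node *remote_node; };

/* Counts how often the comparatively costly lookup is performed. */
static int lookups;

/* Hypothetical lookup standing in for resolving a switch by its GUID. */
static struct sw *find_sw_by_guid(struct sw *tbl, size_t n, uint64_t guid)
{
	size_t i;

	lookups++;
	for (i = 0; i < n; i++)
		if (tbl[i].guid == guid)
			return &tbl[i];
	return NULL;
}

/*
 * Count the usable switch-to-switch links of 'node', ignoring port 0 and
 * loopback links. The loopback test runs before the lookup, so a loopback
 * link costs one pointer comparison and nothing more.
 */
static int count_sw_links(struct node *node, struct port *ports,
			  size_t num_ports, struct sw *tbl, size_t n)
{
	int links = 0;
	size_t i;

	assert(node != NULL && ports != NULL);

	for (i = 1; i < num_ports; i++) {	/* start at 1: skip port 0 */
		struct node *remote = ports[i].remote_node;

		if (!remote)
			continue;	/* port is not connected */
		if (remote == node)
			continue;	/* loopback: skip before resolving */
		if (find_sw_by_guid(tbl, n, remote->guid))
			links++;
	}
	return links;
}
```

With one loopback port and one port wired to another switch, only the second port triggers a lookup.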
Sasha From kliteyn at dev.mellanox.co.il Sun Jan 27 07:53:23 2008 From: kliteyn at dev.mellanox.co.il (Yevgeny Kliteynik) Date: Sun, 27 Jan 2008 17:53:23 +0200 Subject: [ofa-general] Re: [PATCH] opensm/osm_ucast_ftree.c: ignore port 0 and loopbacks on swithces In-Reply-To: <20080127155720.GL24344@sashak.voltaire.com> References: <479CA2C9.6090402@dev.mellanox.co.il> <20080127155720.GL24344@sashak.voltaire.com> Message-ID: <479CA8F3.2070700@dev.mellanox.co.il> Sasha Khapyorsky wrote: > Hi Yevgeny, > > On 17:27 Sun 27 Jan , Yevgeny Kliteynik wrote: >> Fat-tree routing should ignore port 0 and loopback >> connections on switches when populating its db. >> >> Please apply to ofed_1_3 and master. >> >> Signed-off-by: Yevgeny Kliteynik >> --- >> opensm/opensm/osm_ucast_ftree.c | 15 ++++++++++++++- >> 1 files changed, 14 insertions(+), 1 deletions(-) >> >> diff --git a/opensm/opensm/osm_ucast_ftree.c b/opensm/opensm/osm_ucast_ftree.c >> index dcbdc44..904a9c5 100644 >> --- a/opensm/opensm/osm_ucast_ftree.c >> +++ b/opensm/opensm/osm_ucast_ftree.c >> @@ -3113,7 +3113,7 @@ static int __osm_ftree_fabric_construct_sw_ports(IN ftree_fabric_t * p_ftree, >> >> CL_ASSERT(osm_node_get_type(p_node) == IB_NODE_TYPE_SWITCH); >> >> - for (i = 0; i < osm_node_get_num_physp(p_node); i++) { >> + for (i = 1; i < osm_node_get_num_physp(p_node); i++) { >> osm_physp_t *p_osm_port = osm_node_get_physp_ptr(p_node, i); >> >> if (!osm_physp_is_valid(p_osm_port)) >> @@ -3158,6 +3158,19 @@ static int __osm_ftree_fabric_construct_sw_ports(IN ftree_fabric_t * p_ftree, >> __osm_ftree_fabric_get_sw_by_guid(p_ftree, >> remote_node_guid); >> CL_ASSERT(p_remote_sw); >> + >> + /* ignore any loopback connection on switch */ >> + if (p_sw == p_remote_sw) { >> + osm_log(&p_ftree->p_osm->log, OSM_LOG_DEBUG, >> + "__osm_ftree_fabric_construct_sw_ports: " >> + "Ignoring loopback on switch 0x%016" PRIx64 >> + ", LID 0x%04x, rank %u\n", >> + __osm_ftree_sw_get_guid_ho(p_sw), >> + cl_ntoh16(p_sw->base_lid), >> + 
p_sw->rank); >> + continue; >> + } >> + > > What about to make it before remote switch resolving (5 lines above)? > Something like: > > if (p_node == p_remote_node) { > ..... > continue; > } > > should be faster. Sure, why not. I'll repost the patch later. -- Yevgeny > Sasha > From hrosenstock at xsigo.com Sun Jan 27 08:01:40 2008 From: hrosenstock at xsigo.com (Hal Rosenstock) Date: Sun, 27 Jan 2008 08:01:40 -0800 Subject: [ofa-general] [PATCH] management: Update License: field in management spec files Message-ID: <1201449700.25913.277.camel@hrosenstock-ws.xsigo.com> Update License: field to match the exact format given in http://fedoraproject.org/wiki/Packaging/LicensingGuidelines for a package available under a choice of GPL or BSD license. Signed-off-by: Hal Rosenstock diff --git a/infiniband-diags/infiniband-diags.spec.in b/infiniband-diags/infiniband-diags.spec.in index 8d02498..889c98d 100644 --- a/infiniband-diags/infiniband-diags.spec.in +++ b/infiniband-diags/infiniband-diags.spec.in @@ -6,7 +6,7 @@ Summary: OpenFabrics Alliance InfiniBand Diagnostic Tools Name: infiniband-diags Version: @VERSION@ Release: %rel%{?dist} -License: GPL/BSD +License: GPLv2 or BSD Group: System Environment/Libraries BuildRoot: %{_tmppath}/%{name}-%{version}-%{release}-root-%(%{__id_u} -n) Source: http://www.openfabrics.org/downloads/management/@TARBALL@ diff --git a/libibcommon/libibcommon.spec.in b/libibcommon/libibcommon.spec.in index c75e643..bd328b0 100644 --- a/libibcommon/libibcommon.spec.in +++ b/libibcommon/libibcommon.spec.in @@ -6,7 +6,7 @@ Summary: OpenFabrics Alliance InfiniBand management common library Name: libibcommon Version: @VERSION@ Release: %rel%{?dist} -License: GPL/BSD +License: GPLv2 or BSD Group: System Environment/Libraries BuildRoot: %{_tmppath}/%{name}-%{version}-%{release}-root-%(%{__id_u} -n) Source: http://www.openfabrics.org/downloads/management/@TARBALL@ diff --git a/libibmad/libibmad.spec.in b/libibmad/libibmad.spec.in index 2895d9d..5fd10f6 
100644 --- a/libibmad/libibmad.spec.in +++ b/libibmad/libibmad.spec.in @@ -6,7 +6,7 @@ Summary: OpenFabrics Alliance InfiniBand MAD library Name: libibmad Version: @VERSION@ Release: %rel%{?dist} -License: GPL/BSD +License: GPLv2 or BSD Group: System Environment/Libraries BuildRoot: %{_tmppath}/%{name}-%{version}-%{release}-root-%(%{__id_u} -n) Source: http://www.openfabrics.org/downloads/management/@TARBALL@ diff --git a/libibumad/libibumad.spec.in b/libibumad/libibumad.spec.in index ac1a6aa..1b11d18 100644 --- a/libibumad/libibumad.spec.in +++ b/libibumad/libibumad.spec.in @@ -6,7 +6,7 @@ Summary: OpenFabrics Alliance InfiniBand umad (user MAD) library Name: libibumad Version: @VERSION@ Release: %rel%{?dist} -License: GPL/BSD +License: GPLv2 or BSD Group: System Environment/Libraries BuildRoot: %{_tmppath}/%{name}-%{version}-%{release}-root-%(%{__id_u} -n) Source: http://www.openfabrics.org/downloads/management/@TARBALL@ diff --git a/opensm/opensm.spec.in b/opensm/opensm.spec.in index 72c8eba..ec8bf58 100644 --- a/opensm/opensm.spec.in +++ b/opensm/opensm.spec.in @@ -25,7 +25,7 @@ Summary: InfiniBand subnet manager and administration Name: opensm Version: @VERSION@ Release: %rel%{?dist} -License: GPL/BSD +License: GPLv2 or BSD Group: System Environment/Daemons URL: http://openfabrics.org/ Source: http://www.openfabrics.org/downloads/management/@TARBALL@ From jackm at dev.mellanox.co.il Sun Jan 27 08:08:36 2008 From: jackm at dev.mellanox.co.il (Jack Morgenstein) Date: Sun, 27 Jan 2008 18:08:36 +0200 Subject: [ofa-general] [PATCH 1 of 2] mthca: mthca_QUERY_ADAPTER reads fields which are reserved in memfree Message-ID: <200801271808.36406.jackm@dev.mellanox.co.il> From jackm at dev.mellanox.co.il Sun Jan 27 08:13:20 2008 From: jackm at dev.mellanox.co.il (Jack Morgenstein) Date: Sun, 27 Jan 2008 18:13:20 +0200 Subject: [ofa-general] [PATCH 1 of 2] mthca: mthca_QUERY_ADAPTER reads fields which are reserved in memfree Message-ID: 
<200801271813.21048.jackm@dev.mellanox.co.il> mthca: For memfree devices, obtain revision ID from MAD_IFC, not QUERY_ADAPTER For memfree devices, the firmware QUERY_ADAPTER command does not return vendor_id, device_id, and revision_id; do not return these fields in the QUERY_ADAPTER function for memfree devices. In addition, for memfree devices, initialize the rev_id field of the mthca device via init_node_data (MAD IFC query), as is done in the query_device verb implementation. Signed-off-by: Jack Morgenstein --- Roland, I left the non-memfree implementation as it was before. It's possible that the memfree implementation is good for non-memfree as well, in which case the "if (mthca_is_memfree)" conditions can be eliminated (i.e., just use the memfree implementation unconditionally). Jack Index: ofed_kernel/drivers/infiniband/hw/mthca/mthca_cmd.c =================================================================== --- ofed_kernel.orig/drivers/infiniband/hw/mthca/mthca_cmd.c 2008-01-27 15:56:24.000000000 +0200 +++ ofed_kernel/drivers/infiniband/hw/mthca/mthca_cmd.c 2008-01-27 16:28:42.134053000 +0200 @@ -1254,10 +1254,14 @@ int mthca_QUERY_ADAPTER(struct mthca_dev if (err) goto out; - - MTHCA_GET(adapter->vendor_id, outbox, QUERY_ADAPTER_VENDOR_ID_OFFSET); - MTHCA_GET(adapter->device_id, outbox, QUERY_ADAPTER_DEVICE_ID_OFFSET); - MTHCA_GET(adapter->revision_id, outbox, QUERY_ADAPTER_REVISION_ID_OFFSET); + if (!mthca_is_memfree(dev)) { + MTHCA_GET(adapter->vendor_id, outbox, + QUERY_ADAPTER_VENDOR_ID_OFFSET); + MTHCA_GET(adapter->device_id, outbox, + QUERY_ADAPTER_DEVICE_ID_OFFSET); + MTHCA_GET(adapter->revision_id, outbox, + QUERY_ADAPTER_REVISION_ID_OFFSET); + } MTHCA_GET(adapter->inta_pin, outbox, QUERY_ADAPTER_INTA_PIN_OFFSET); get_board_id(outbox + QUERY_ADAPTER_VSD_OFFSET / 4, Index: ofed_kernel/drivers/infiniband/hw/mthca/mthca_main.c =================================================================== --- ofed_kernel.orig/drivers/infiniband/hw/mthca/mthca_main.c
2008-01-27 15:56:24.000000000 +0200 +++ ofed_kernel/drivers/infiniband/hw/mthca/mthca_main.c 2008-01-27 16:30:20.490528000 +0200 @@ -744,7 +744,8 @@ static int mthca_init_hca(struct mthca_d } mdev->eq_table.inta_pin = adapter.inta_pin; - mdev->rev_id = adapter.revision_id; + if (!mthca_is_memfree(mdev)) + mdev->rev_id = adapter.revision_id; memcpy(mdev->board_id, adapter.board_id, sizeof mdev->board_id); return 0; Index: ofed_kernel/drivers/infiniband/hw/mthca/mthca_provider.c =================================================================== --- ofed_kernel.orig/drivers/infiniband/hw/mthca/mthca_provider.c 2007-08-08 11:51:43.000000000 +0300 +++ ofed_kernel/drivers/infiniband/hw/mthca/mthca_provider.c 2008-01-27 16:37:12.097729000 +0200 @@ -1270,6 +1270,8 @@ static int mthca_init_node_data(struct m goto out; } + if (mthca_is_memfree(dev)) + dev->rev_id = be32_to_cpup((__be32 *) (out_mad->data + 32)); memcpy(&dev->ib_dev.node_guid, out_mad->data + 12, 8); out: From jackm at dev.mellanox.co.il Sun Jan 27 08:13:25 2008 From: jackm at dev.mellanox.co.il (Jack Morgenstein) Date: Sun, 27 Jan 2008 18:13:25 +0200 Subject: [ofa-general] [PATCH 2 of 2] mlx4: mlx4__QUERY_ADAPTER reads fields which are reserved Message-ID: <200801271813.25879.jackm@dev.mellanox.co.il> mlx4: remove eliminated fields from QUERY_ADAPTER. Get rev_id from MAD_IFC. The firmware QUERY_ADAPTER command does not return vendor_id, device_id, and revision_id; eliminate these fields from the query. In addition, initialize the rev_id field of the mlx4 device via init_node_data (MAD IFC query), as is done in the query_device verb implementation. 
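The MAD-based read described above amounts to pulling a big-endian 32-bit value out of the response payload. Below is a minimal userspace sketch of that step, assuming the offsets that appear in the patches (12 for the 8-byte node GUID, 32 for the revision). The names read_be32(), parse_node_info() and struct node_data are hypothetical and are not kernel or driver APIs; the kernel code uses be32_to_cpup() for the same conversion.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Offsets as used in the patches; the macro names are hypothetical. */
#define NODE_GUID_OFFSET	12
#define NODE_GUID_LEN		8
#define REVISION_OFFSET		32

/* Hypothetical container for the two fields the driver extracts. */
struct node_data {
	uint8_t node_guid[NODE_GUID_LEN];	/* kept in wire (big-endian) order */
	uint32_t rev_id;			/* converted to host order */
};

/* Portable big-endian 32-bit load; works on any host byte order. */
static uint32_t read_be32(const uint8_t *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Extract the node GUID and revision from a response payload. */
static void parse_node_info(const uint8_t *data, size_t len,
			    struct node_data *out)
{
	assert(data != NULL && out != NULL);
	assert(len >= REVISION_OFFSET + 4);	/* payload must cover both fields */

	memcpy(out->node_guid, data + NODE_GUID_OFFSET, NODE_GUID_LEN);
	out->rev_id = read_be32(data + REVISION_OFFSET);
}
```

Reading byte by byte avoids the unaligned pointer cast and makes the conversion independent of host endianness, which is why the sketch does not cast the buffer to a 32-bit pointer.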
Signed-off-by: Jack Morgenstein --- Index: ofed_kernel/drivers/net/mlx4/fw.h =================================================================== --- ofed_kernel.orig/drivers/net/mlx4/fw.h 2008-01-27 15:56:27.000000000 +0200 +++ ofed_kernel/drivers/net/mlx4/fw.h 2008-01-27 16:00:46.997316000 +0200 @@ -102,9 +102,6 @@ struct mlx4_dev_cap { }; struct mlx4_adapter { - u32 vendor_id; - u32 device_id; - u32 revision_id; char board_id[MLX4_BOARD_ID_LEN]; u8 inta_pin; }; Index: ofed_kernel/drivers/net/mlx4/main.c =================================================================== --- ofed_kernel.orig/drivers/net/mlx4/main.c 2008-01-27 15:56:28.000000000 +0200 +++ ofed_kernel/drivers/net/mlx4/main.c 2008-01-27 16:01:39.420447000 +0200 @@ -590,7 +590,6 @@ static int mlx4_init_hca(struct mlx4_dev } priv->eq_table.inta_pin = adapter.inta_pin; - dev->rev_id = adapter.revision_id; memcpy(dev->board_id, adapter.board_id, sizeof dev->board_id); return 0; Index: ofed_kernel/drivers/infiniband/hw/mlx4/main.c =================================================================== --- ofed_kernel.orig/drivers/infiniband/hw/mlx4/main.c 2008-01-27 16:19:43.000000000 +0200 +++ ofed_kernel/drivers/infiniband/hw/mlx4/main.c 2008-01-27 16:20:04.365703000 +0200 @@ -550,6 +550,7 @@ static int init_node_data(struct mlx4_ib if (err) goto out; + dev->dev->rev_id = be32_to_cpup((__be32 *) (out_mad->data + 32)); memcpy(&dev->ib_dev.node_guid, out_mad->data + 12, 8); out: Index: ofed_kernel/drivers/net/mlx4/fw.c =================================================================== --- ofed_kernel.orig/drivers/net/mlx4/fw.c 2008-01-27 15:56:29.000000000 +0200 +++ ofed_kernel/drivers/net/mlx4/fw.c 2008-01-27 16:21:15.835110000 +0200 @@ -637,9 +637,6 @@ int mlx4_QUERY_ADAPTER(struct mlx4_dev * int err; #define QUERY_ADAPTER_OUT_SIZE 0x100 -#define QUERY_ADAPTER_VENDOR_ID_OFFSET 0x00 -#define QUERY_ADAPTER_DEVICE_ID_OFFSET 0x04 -#define QUERY_ADAPTER_REVISION_ID_OFFSET 0x08 #define 
QUERY_ADAPTER_INTA_PIN_OFFSET 0x10 #define QUERY_ADAPTER_VSD_OFFSET 0x20 @@ -653,9 +650,6 @@ int mlx4_QUERY_ADAPTER(struct mlx4_dev * if (err) goto out; - MLX4_GET(adapter->vendor_id, outbox, QUERY_ADAPTER_VENDOR_ID_OFFSET); - MLX4_GET(adapter->device_id, outbox, QUERY_ADAPTER_DEVICE_ID_OFFSET); - MLX4_GET(adapter->revision_id, outbox, QUERY_ADAPTER_REVISION_ID_OFFSET); MLX4_GET(adapter->inta_pin, outbox, QUERY_ADAPTER_INTA_PIN_OFFSET); get_board_id(outbox + QUERY_ADAPTER_VSD_OFFSET / 4, From sashak at voltaire.com Sun Jan 27 08:30:29 2008 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Sun, 27 Jan 2008 16:30:29 +0000 Subject: [ofa-general] [PATCH] opensm: merge force_immediate/delayed_heavy_sweep flags Message-ID: <20080127163029.GM24344@sashak.voltaire.com> When the drop manager detects connectivity problems, we want to force a regular heavy sweep. No need to keep a separate force_delayed_heavy_sweep flag. Signed-off-by: Sasha Khapyorsky --- opensm/include/opensm/osm_subnet.h | 13 ++----------- opensm/opensm/main.c | 2 +- opensm/opensm/osm_console.c | 5 ++--- opensm/opensm/osm_drop_mgr.c | 4 ++-- opensm/opensm/osm_node_info_rcv.c | 4 ++-- opensm/opensm/osm_sm_state_mgr.c | 8 ++++---- opensm/opensm/osm_state_mgr.c | 18 ++++++------------ opensm/opensm/osm_trap_rcv.c | 4 ++-- 8 files changed, 21 insertions(+), 37 deletions(-) diff --git a/opensm/include/opensm/osm_subnet.h b/opensm/include/opensm/osm_subnet.h index e60cf91..b5f2b6d 100644 --- a/opensm/include/opensm/osm_subnet.h +++ b/opensm/include/opensm/osm_subnet.h @@ -559,8 +559,7 @@ typedef struct _osm_subn { uint8_t min_ca_rate; boolean_t ignore_existing_lfts; boolean_t subnet_initialization_error; - boolean_t force_immediate_heavy_sweep; - boolean_t force_delayed_heavy_sweep; + boolean_t force_heavy_sweep; boolean_t in_sweep_hop_0; boolean_t moved_to_master_state; boolean_t first_time_master_sweep; @@ -650,7 +649,7 @@ typedef struct _osm_subn { * that failed).
We want to declare the subnet as unhealthy, and force * another heavy sweep. * -* force_immediate_heavy_sweep +* force_heavy_sweep * If TRUE - we want to force a heavy sweep. This can be done either * due to receiving of trap - meaning there is some change on the subnet, * or we received a handover from a remote sm. @@ -658,14 +657,6 @@ typedef struct _osm_subn { * This will cause another heavy sweep to occure when the current sweep * is done. * -* force_delayed_heavy_sweep -* In some means - similar to the force_immediate_heavy_sweep flag, only -* it'll cause a heavy sweep in the next sweep. Note that this means that -* if we are running with -s 0 (no sweeps) - then this forced heavy sweep -* will not occur. -* If we had some trouble on the subnet, that caused a strange dropping -* of ports - we will try to do another heavy sweep on our next sweep. -* * in_sweep_hop_0 * When in_sweep_hop_0 flag is set to TRUE - this means we are * in sweep_hop_0 - meaning we do not want to continue beyond diff --git a/opensm/opensm/main.c b/opensm/opensm/main.c index de69f68..a21cfcc 100644 --- a/opensm/opensm/main.c +++ b/opensm/opensm/main.c @@ -1032,7 +1032,7 @@ int main(int argc, char *argv[]) if (osm_hup_flag) { osm_hup_flag = 0; /* a HUP signal should only start a new heavy sweep */ - osm.subn.force_immediate_heavy_sweep = TRUE; + osm.subn.force_heavy_sweep = TRUE; osm_opensm_sweep(&osm); } } diff --git a/opensm/opensm/osm_console.c b/opensm/opensm/osm_console.c index c0e7886..f699ec3 100644 --- a/opensm/opensm/osm_console.c +++ b/opensm/opensm/osm_console.c @@ -474,9 +474,8 @@ static void resweep_parse(char **p_last, osm_opensm_t * p_osm, FILE * out) fprintf(out, "Invalid resweep command\n"); help_resweep(out, 1); } else { - if (strcmp(p_cmd, "heavy") == 0) { - p_osm->subn.force_immediate_heavy_sweep = TRUE; - } + if (strcmp(p_cmd, "heavy") == 0) + p_osm->subn.force_heavy_sweep = TRUE; osm_opensm_sweep(p_osm); } } diff --git a/opensm/opensm/osm_drop_mgr.c 
b/opensm/opensm/osm_drop_mgr.c index 2b8966c..40534ab 100644 --- a/opensm/opensm/osm_drop_mgr.c +++ b/opensm/opensm/osm_drop_mgr.c @@ -151,12 +151,12 @@ drop_mgr_clean_physp(IN const osm_drop_mgr_t * const p_mgr, IB_LINK_ACTIVE) { osm_log(p_mgr->p_log, OSM_LOG_VERBOSE, "drop_mgr_clean_physp: " - "Forcing delayed heavy sweep. Remote " + "Forcing new heavy sweep. Remote " "port 0x%016" PRIx64 " port num: 0x%X " "was recognized in ACTIVE state\n", cl_ntoh64(p_remote_physp->port_guid), p_remote_physp->port_num); - p_mgr->p_subn->force_delayed_heavy_sweep = TRUE; + p_mgr->p_subn->force_heavy_sweep = TRUE; } /* If the remote node is ca or router - need to remove the remote port, diff --git a/opensm/opensm/osm_node_info_rcv.c b/opensm/opensm/osm_node_info_rcv.c index cfce437..3ac8d1f 100644 --- a/opensm/opensm/osm_node_info_rcv.c +++ b/opensm/opensm/osm_node_info_rcv.c @@ -183,7 +183,7 @@ __osm_ni_rcv_set_links(IN osm_sm_t * sm, } if (osm_node_has_any_link(p_node, port_num) && - sm->p_subn->force_immediate_heavy_sweep == FALSE && + sm->p_subn->force_heavy_sweep == FALSE && (!p_ni_context->dup_count || (p_ni_context->dup_node_guid == osm_node_get_node_guid(p_node) && p_ni_context->dup_port_num == port_num))) { @@ -207,7 +207,7 @@ __osm_ni_rcv_set_links(IN osm_sm_t * sm, report_duplicated_guid(sm, p_physp, p_neighbor_node, p_ni_context->port_num); - sm->p_subn->force_immediate_heavy_sweep = TRUE; + sm->p_subn->force_heavy_sweep = TRUE; } else if (p_node->sw) requery_dup_node_info(sm, p_physp->p_remote_physp, p_ni_context->dup_count + 1); diff --git a/opensm/opensm/osm_sm_state_mgr.c b/opensm/opensm/osm_sm_state_mgr.c index 8cd3276..4d0b026 100644 --- a/opensm/opensm/osm_sm_state_mgr.c +++ b/opensm/opensm/osm_sm_state_mgr.c @@ -589,9 +589,9 @@ osm_sm_state_mgr_process(IN osm_sm_state_mgr_t * const p_sm_mgr, if (p_sm_mgr->p_subn->first_time_master_sweep == FALSE) p_sm_mgr->p_subn->first_time_master_sweep = TRUE; - /* Turn on the force_immediate_heavy_sweep - we want a + /* 
Turn on the force_heavy_sweep - we want a * heavy sweep to occur on the first sweep of this SM. */ - p_sm_mgr->p_subn->force_immediate_heavy_sweep = TRUE; + p_sm_mgr->p_subn->force_heavy_sweep = TRUE; p_sm_mgr->p_subn->sm_state = IB_SMINFO_STATE_MASTER; /* @@ -659,10 +659,10 @@ osm_sm_state_mgr_process(IN osm_sm_state_mgr_t * const p_sm_mgr, */ osm_log(p_sm_mgr->p_log, OSM_LOG_VERBOSE, "osm_sm_state_mgr_process: " - "Forcing immediate heavy sweep. " + "Forcing heavy sweep. " "Received OSM_SM_SIGNAL_HANDOVER or OSM_SM_SIGNAL_POLLING_TIMEOUT\n"); p_sm_mgr->p_polling_sm = NULL; - p_sm_mgr->p_subn->force_immediate_heavy_sweep = TRUE; + p_sm_mgr->p_subn->force_heavy_sweep = TRUE; osm_sm_signal(&p_sm_mgr->p_subn->p_osm->sm, OSM_SIGNAL_SWEEP); break; diff --git a/opensm/opensm/osm_state_mgr.c b/opensm/opensm/osm_state_mgr.c index 4b7dcac..93fd880 100644 --- a/opensm/opensm/osm_state_mgr.c +++ b/opensm/opensm/osm_state_mgr.c @@ -1386,21 +1386,15 @@ void osm_state_mgr_process(IN osm_state_mgr_t * const p_mgr, IB_SMINFO_STATE_DISCOVERING && p_mgr->p_subn->opt.force_heavy_sweep == FALSE - && p_mgr->p_subn-> - force_immediate_heavy_sweep == FALSE - && p_mgr->p_subn-> - force_delayed_heavy_sweep == FALSE + && p_mgr->p_subn->force_heavy_sweep == FALSE && p_mgr->p_subn->subnet_initialization_error == FALSE) { if (__osm_state_mgr_light_sweep_start(p_mgr) == IB_SUCCESS) { p_mgr->state = OSM_SM_STATE_SWEEP_LIGHT; } } else { - /* First of all - if force_immediate_heavy_sweep is TRUE then + /* First of all - if force_heavy_sweep is TRUE then * need to unset it */ - p_mgr->p_subn->force_immediate_heavy_sweep = FALSE; - /* If force_delayed_heavy_sweep is TRUE then - * need to unset it */ - p_mgr->p_subn->force_delayed_heavy_sweep = FALSE; + p_mgr->p_subn->force_heavy_sweep = FALSE; /* If subnet_initialization_error is TRUE then * need to unset it. 
*/ p_mgr->p_subn->subnet_initialization_error = FALSE; @@ -1487,7 +1481,7 @@ void osm_state_mgr_process(IN osm_state_mgr_t * const p_mgr, switch (signal) { case OSM_SIGNAL_NO_PENDING_TRANSACTIONS: case OSM_SIGNAL_DONE: - if (p_mgr->p_subn->force_immediate_heavy_sweep) { + if (p_mgr->p_subn->force_heavy_sweep) { /* * Do not read next item from the idle queue. * Immediate heavy sweep is requested, so it's @@ -1631,7 +1625,7 @@ void osm_state_mgr_process(IN osm_state_mgr_t * const p_mgr, case OSM_SIGNAL_NO_PENDING_TRANSACTIONS: /* if new sweep requiested - don't bother with the rest */ - if (p_mgr->p_subn->force_immediate_heavy_sweep) { + if (p_mgr->p_subn->force_heavy_sweep) { p_mgr->state = OSM_SM_STATE_IDLE; signal = OSM_SIGNAL_SWEEP; break; @@ -2301,7 +2295,7 @@ void osm_state_mgr_process(IN osm_state_mgr_t * const p_mgr, /* if we got a signal to force immediate heavy sweep in the middle of the sweep - * try another sweep. */ - if ((p_mgr->p_subn->force_immediate_heavy_sweep) && + if ((p_mgr->p_subn->force_heavy_sweep) && (p_mgr->state == OSM_SM_STATE_IDLE)) { signal = OSM_SIGNAL_SWEEP; } diff --git a/opensm/opensm/osm_trap_rcv.c b/opensm/opensm/osm_trap_rcv.c index 53269b4..6c02791 100644 --- a/opensm/opensm/osm_trap_rcv.c +++ b/opensm/opensm/osm_trap_rcv.c @@ -607,11 +607,11 @@ __osm_trap_rcv_process_request(IN osm_sm_t * sm, run_heavy_sweep)) { osm_log(sm->p_log, OSM_LOG_VERBOSE, "__osm_trap_rcv_process_request: " - "Forcing immediate heavy sweep. " + "Forcing heavy sweep. 
" "Received trap:%u\n", cl_ntoh16(p_ntci->g_or_v.generic.trap_num)); - sm->p_subn->force_immediate_heavy_sweep = TRUE; + sm->p_subn->force_heavy_sweep = TRUE; } osm_sm_signal(&sm->p_subn->p_osm->sm, OSM_SIGNAL_SWEEP); } -- 1.5.4.rc5 From weikuan.yu at gmail.com Sun Jan 27 08:57:47 2008 From: weikuan.yu at gmail.com (Weikuan Yu) Date: Sun, 27 Jan 2008 11:57:47 -0500 Subject: [ofa-general] RE: [ewg] Not seeing any SDP performance changes inOFED 1.3 beta, and I get Oops when enabling sdp_zcopy_thresh In-Reply-To: References: <47445630.10000@dev.mellanox.co.il> <4798A9F5.7030109@gmail.com> <005301c85f5d$e03e36b0$a0baa410$@rr.com> Message-ID: <479CB80B.7040900@gmail.com> Hi, Jim and Scott, Just to provide some additional information, I have seen no performance improvement either. I tried both a pair of old 32-bit Xeons, and a pair of woodcrest 5100. I used a recent kernel (2.6.23.14) and the nightly tarball OFED-1.3-rc4. My HCAs were running in Tavor modes though. I am in the process of updating firmware and trying for connectX. --Weikuan Jim Mott wrote: > Not today, but I will give it a shot next time I get a free machine. I > have tested between Rhat4u4 MLX4 and Rhat4u4 mthca and seen the same > trend though. > > Thanks, > JIm > > Jim Mott > Mellanox Technologies Ltd. > mail: jim at mellanox.com > Phone: 512-294-5481 > > > -----Original Message----- > From: Scott Weitzenkamp (sweitzen) [mailto:sweitzen at cisco.com] > Sent: Friday, January 25, 2008 4:03 PM > To: Jim Mott; Weikuan Yu > Cc: general at lists.openfabrics.org > Subject: RE: [ofa-general] RE: [ewg] Not seeing any SDP performance > changes inOFED 1.3 beta, and I get Oops when enabling sdp_zcopy_thresh > > Is there any way you can make sender and receiver the same RHEL kernel? 
> >> -----Original Message----- >> From: Jim Mott [mailto:jim at mellanox.com] >> Sent: Friday, January 25, 2008 1:58 PM >> To: Scott Weitzenkamp (sweitzen); Weikuan Yu >> Cc: general at lists.openfabrics.org >> Subject: RE: [ofa-general] RE: [ewg] Not seeing any SDP >> performance changes inOFED 1.3 beta, and I get Oops when >> enabling sdp_zcopy_thresh >> >> Receive side: >> - 2.6.23.8 kernel.org kernel on Rhat5 distro >> - HCA is MLX4 with 2.3.914 >> I get the same number on released 2.3 firmware >> >> Send side: >> - 2.6.9-42.ELsmp x86_64 (Rhat4u4) >> - HCA is MLX4 with 2.3.914 >> >> I get the same trends (SDP < BZCOPY if message_size > 64K) on >> unmodifed >> Rhat5, Rhat4u4, and SLES10-SP1-RT distros. I also see it on >> kernel.org >> kernels 2.6.23.12, 2.6.24-rc2, 2.6.23, and 2.6.22.9. I am in >> the midst >> of testing some things, so I do not have all the machines available >> right now to repeat most of the tests though. >> >> >> Thanks, >> JIm >> >> Jim Mott >> Mellanox Technologies Ltd. >> mail: jim at mellanox.com >> Phone: 512-294-5481 >> >> >> -----Original Message----- >> From: Scott Weitzenkamp (sweitzen) [mailto:sweitzen at cisco.com] >> Sent: Friday, January 25, 2008 3:39 PM >> To: Jim Mott; Weikuan Yu >> Cc: general at lists.openfabrics.org >> Subject: RE: [ofa-general] RE: [ewg] Not seeing any SDP performance >> changes inOFED 1.3 beta, and I get Oops when enabling sdp_zcopy_thresh >> >> Jim, what kernel and HCA are these numbers for? >> >> Scott >> >> >> >>> -----Original Message----- >>> From: Jim Mott [mailto:jim at mellanox.com] >>> Sent: Friday, January 25, 2008 11:09 AM >>> To: Scott Weitzenkamp (sweitzen); Weikuan Yu >>> Cc: general at lists.openfabrics.org >>> Subject: RE: [ofa-general] RE: [ewg] Not seeing any SDP >>> performance changes inOFED 1.3 beta, and I get Oops when >>> enabling sdp_zcopy_thresh >>> >>> Right you are (as usual). >>> >>> Hunting around these systems shows that I have been using >>> netperf-2.4.3 >>> for testing. 
No configuration options; just ./configure; make; make >>> install. >>> >>> To try and understand version differences, I installed 2.4.1 (your >>> version?), 2.4.3, and 2.4.4. Built them with default >> options and ran >>> the tests using each. >>> >>> Using netperf-2.4.1 and reran "netperf -v2 -4 -H >>> 193.168.10.143 -l 30 -t >>> TCP_STREAM -c -C -- -m size" with target AMD and driver as >>> 8-processor >>> Intel: >>> >>> 64K 128K 1M >>> SDP 7749.66 6925.68 6281.17 >>> BZCOPY 8492.85 9867.06 11105.50 >>> >>> I tried running these tests a few times and saw a lot of >>> variance in the >>> reported results. Reloading 2.4.3 and running the same tests: >>> >>> 64K 128K 1M >>> SDP 7553.77 6747.58 5986.42 >>> BZCOPY 8839.46 9572.49 10654.52 >>> >>> and finally, I tried 2.4.4 and running the same tests: >>> >>> 64K 128K 1M >>> SDP 7935.97 6325.69 7682.65 >>> BZCOPY 8905.94 9935.45 10615.03 >>> >>> At this point, I am confused. The difference between SDP with and >>> without Bzcopy is obvious in all three sets of numbers. I can not >>> explain why you see something different. >>> >>> If you could try a vanilla netperf build, it would be >>> interesting to see >>> if you get any different results. >>> >>> Thanks, >>> JIm >>> >>> Jim Mott >>> Mellanox Technologies Ltd. >>> mail: jim at mellanox.com >>> Phone: 512-294-5481 >>> >>> >>> -----Original Message----- >>> From: Scott Weitzenkamp (sweitzen) [mailto:sweitzen at cisco.com] >>> Sent: Friday, January 25, 2008 10:36 AM >>> To: Jim Mott; Jim Mott; Weikuan Yu >>> Cc: general at lists.openfabrics.org >>> Subject: RE: [ofa-general] RE: [ewg] Not seeing any SDP performance >>> changes inOFED 1.3 beta, and I get Oops when enabling >> sdp_zcopy_thresh >>>> So I see your results (sort of). I have been using the >>>> netperf that ships with the OS (Rhat4u4 and Rhat5 mostly) or >>>> is built with >>>> default options. Maybe that is the difference. >>> Jim, AFAIK Red Hat does not ship netperf with RHEL. 
>>> >>> Scott >>> > From hrosenstock at xsigo.com Sun Jan 27 09:31:25 2008 From: hrosenstock at xsigo.com (Hal Rosenstock) Date: Sun, 27 Jan 2008 09:31:25 -0800 Subject: [ofa-general] [PATCH] infiniband-diags: Add missing COPYING file Message-ID: <1201455085.25913.299.camel@hrosenstock-ws.xsigo.com> infiniband-diags: Add missing COPYING (license) file and update spec accordingly Signed-off-by: Hal Rosenstock diff --git a/dev/null b/infiniband-diags/COPYING new file mode 100644 index 0000000..a017728 --- /dev/null +++ b/infiniband-diags/COPYING @@ -0,0 +1,384 @@ +This software with the exception of OpenSM is available to you +under a choice of one of two licenses. You may chose to be +licensed under the terms of the the OpenIB.org BSD license or +the GNU General Public License (GPL) Version 2, both included +below. + +OpenSM is licensed under either GNU General Public License (GPL) +Version 2, or Intel BSD + Patent license. See OpenSM for the +specific language for the latter licensing terms. + + +Copyright (c) 2004, 2005 Voltaire, Inc. All rights reserved. + +================================================================== + + OpenIB.org BSD license + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions +are met: + + * Redistributions of source code must retain the above copyright + notice, this list of conditions and the following disclaimer. + + * Redistributions in binary form must reproduce the above + copyright notice, this list of conditions and the following + disclaimer in the documentation and/or other materials provided + with the distribution. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS +"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT +LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS +FOR A PARTICULAR PURPOSE ARE DISCLAIMED. 
IN NO EVENT SHALL THE +COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, +INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, +BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; +LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER +CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT +LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN +ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE +POSSIBILITY OF SUCH DAMAGE. + +================================================================== + + GNU GENERAL PUBLIC LICENSE + Version 2, June 1991 + + Copyright (C) 1989, 1991 Free Software Foundation, Inc. + 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA + Everyone is permitted to copy and distribute verbatim copies + of this license document, but changing it is not allowed. + + Preamble + + The licenses for most software are designed to take away your +freedom to share and change it. By contrast, the GNU General Public +License is intended to guarantee your freedom to share and change free +software--to make sure the software is free for all its users. This +General Public License applies to most of the Free Software +Foundation's software and to any other program whose authors commit to +using it. (Some other Free Software Foundation software is covered by +the GNU Library General Public License instead.) You can apply it to +your programs, too. + + When we speak of free software, we are referring to freedom, not +price. Our General Public Licenses are designed to make sure that you +have the freedom to distribute copies of free software (and charge for +this service if you wish), that you receive source code or can get it +if you want it, that you can change the software or use pieces of it +in new free programs; and that you know you can do these things. 
+ + To protect your rights, we need to make restrictions that forbid +anyone to deny you these rights or to ask you to surrender the rights. +These restrictions translate to certain responsibilities for you if you +distribute copies of the software, or if you modify it. + + For example, if you distribute copies of such a program, whether +gratis or for a fee, you must give the recipients all the rights that +you have. You must make sure that they, too, receive or can get the +source code. And you must show them these terms so they know their +rights. + + We protect your rights with two steps: (1) copyright the software, and +(2) offer you this license which gives you legal permission to copy, +distribute and/or modify the software. + + Also, for each author's protection and ours, we want to make certain +that everyone understands that there is no warranty for this free +software. If the software is modified by someone else and passed on, we +want its recipients to know that what they have is not the original, so +that any problems introduced by others will not reflect on the original +authors' reputations. + + Finally, any free program is threatened constantly by software +patents. We wish to avoid the danger that redistributors of a free +program will individually obtain patent licenses, in effect making the +program proprietary. To prevent this, we have made it clear that any +patent must be licensed for everyone's free use or not licensed at all. + + The precise terms and conditions for copying, distribution and +modification follow. + + GNU GENERAL PUBLIC LICENSE + TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION + + 0. This License applies to any program or other work which contains +a notice placed by the copyright holder saying it may be distributed +under the terms of this General Public License. 
The "Program", below, +refers to any such program or work, and a "work based on the Program" +means either the Program or any derivative work under copyright law: +that is to say, a work containing the Program or a portion of it, +either verbatim or with modifications and/or translated into another +language. (Hereinafter, translation is included without limitation in +the term "modification".) Each licensee is addressed as "you". + +Activities other than copying, distribution and modification are not +covered by this License; they are outside its scope. The act of +running the Program is not restricted, and the output from the Program +is covered only if its contents constitute a work based on the +Program (independent of having been made by running the Program). +Whether that is true depends on what the Program does. + + 1. You may copy and distribute verbatim copies of the Program's +source code as you receive it, in any medium, provided that you +conspicuously and appropriately publish on each copy an appropriate +copyright notice and disclaimer of warranty; keep intact all the +notices that refer to this License and to the absence of any warranty; +and give any other recipients of the Program a copy of this License +along with the Program. + +You may charge a fee for the physical act of transferring a copy, and +you may at your option offer warranty protection in exchange for a fee. + + 2. You may modify your copy or copies of the Program or any portion +of it, thus forming a work based on the Program, and copy and +distribute such modifications or work under the terms of Section 1 +above, provided that you also meet all of these conditions: + + a) You must cause the modified files to carry prominent notices + stating that you changed the files and the date of any change. 
+ + b) You must cause any work that you distribute or publish, that in + whole or in part contains or is derived from the Program or any + part thereof, to be licensed as a whole at no charge to all third + parties under the terms of this License. + + c) If the modified program normally reads commands interactively + when run, you must cause it, when started running for such + interactive use in the most ordinary way, to print or display an + announcement including an appropriate copyright notice and a + notice that there is no warranty (or else, saying that you provide + a warranty) and that users may redistribute the program under + these conditions, and telling the user how to view a copy of this + License. (Exception: if the Program itself is interactive but + does not normally print such an announcement, your work based on + the Program is not required to print an announcement.) + +These requirements apply to the modified work as a whole. If +identifiable sections of that work are not derived from the Program, +and can be reasonably considered independent and separate works in +themselves, then this License, and its terms, do not apply to those +sections when you distribute them as separate works. But when you +distribute the same sections as part of a whole which is a work based +on the Program, the distribution of the whole must be on the terms of +this License, whose permissions for other licensees extend to the +entire whole, and thus to each and every part regardless of who wrote it. + +Thus, it is not the intent of this section to claim rights or contest +your rights to work written entirely by you; rather, the intent is to +exercise the right to control the distribution of derivative or +collective works based on the Program. 
+ +In addition, mere aggregation of another work not based on the Program +with the Program (or with a work based on the Program) on a volume of +a storage or distribution medium does not bring the other work under +the scope of this License. + + 3. You may copy and distribute the Program (or a work based on it, +under Section 2) in object code or executable form under the terms of +Sections 1 and 2 above provided that you also do one of the following: + + a) Accompany it with the complete corresponding machine-readable + source code, which must be distributed under the terms of Sections + 1 and 2 above on a medium customarily used for software interchange; or, + + b) Accompany it with a written offer, valid for at least three + years, to give any third party, for a charge no more than your + cost of physically performing source distribution, a complete + machine-readable copy of the corresponding source code, to be + distributed under the terms of Sections 1 and 2 above on a medium + customarily used for software interchange; or, + + c) Accompany it with the information you received as to the offer + to distribute corresponding source code. (This alternative is + allowed only for noncommercial distribution and only if you + received the program in object code or executable form with such + an offer, in accord with Subsection b above.) + +The source code for a work means the preferred form of the work for +making modifications to it. For an executable work, complete source +code means all the source code for all modules it contains, plus any +associated interface definition files, plus the scripts used to +control compilation and installation of the executable. 
However, as a +special exception, the source code distributed need not include +anything that is normally distributed (in either source or binary +form) with the major components (compiler, kernel, and so on) of the +operating system on which the executable runs, unless that component +itself accompanies the executable. + +If distribution of executable or object code is made by offering +access to copy from a designated place, then offering equivalent +access to copy the source code from the same place counts as +distribution of the source code, even though third parties are not +compelled to copy the source along with the object code. + + 4. You may not copy, modify, sublicense, or distribute the Program +except as expressly provided under this License. Any attempt +otherwise to copy, modify, sublicense or distribute the Program is +void, and will automatically terminate your rights under this License. +However, parties who have received copies, or rights, from you under +this License will not have their licenses terminated so long as such +parties remain in full compliance. + + 5. You are not required to accept this License, since you have not +signed it. However, nothing else grants you permission to modify or +distribute the Program or its derivative works. These actions are +prohibited by law if you do not accept this License. Therefore, by +modifying or distributing the Program (or any work based on the +Program), you indicate your acceptance of this License to do so, and +all its terms and conditions for copying, distributing or modifying +the Program or works based on it. + + 6. Each time you redistribute the Program (or any work based on the +Program), the recipient automatically receives a license from the +original licensor to copy, distribute or modify the Program subject to +these terms and conditions. You may not impose any further +restrictions on the recipients' exercise of the rights granted herein. 
+You are not responsible for enforcing compliance by third parties to +this License. + + 7. If, as a consequence of a court judgment or allegation of patent +infringement or for any other reason (not limited to patent issues), +conditions are imposed on you (whether by court order, agreement or +otherwise) that contradict the conditions of this License, they do not +excuse you from the conditions of this License. If you cannot +distribute so as to satisfy simultaneously your obligations under this +License and any other pertinent obligations, then as a consequence you +may not distribute the Program at all. For example, if a patent +license would not permit royalty-free redistribution of the Program by +all those who receive copies directly or indirectly through you, then +the only way you could satisfy both it and this License would be to +refrain entirely from distribution of the Program. + +If any portion of this section is held invalid or unenforceable under +any particular circumstance, the balance of the section is intended to +apply and the section as a whole is intended to apply in other +circumstances. + +It is not the purpose of this section to induce you to infringe any +patents or other property right claims or to contest validity of any +such claims; this section has the sole purpose of protecting the +integrity of the free software distribution system, which is +implemented by public license practices. Many people have made +generous contributions to the wide range of software distributed +through that system in reliance on consistent application of that +system; it is up to the author/donor to decide if he or she is willing +to distribute software through any other system and a licensee cannot +impose that choice. + +This section is intended to make thoroughly clear what is believed to +be a consequence of the rest of this License. + + 8. 
If the distribution and/or use of the Program is restricted in +certain countries either by patents or by copyrighted interfaces, the +original copyright holder who places the Program under this License +may add an explicit geographical distribution limitation excluding +those countries, so that distribution is permitted only in or among +countries not thus excluded. In such case, this License incorporates +the limitation as if written in the body of this License. + + 9. The Free Software Foundation may publish revised and/or new versions +of the General Public License from time to time. Such new versions will +be similar in spirit to the present version, but may differ in detail to +address new problems or concerns. + +Each version is given a distinguishing version number. If the Program +specifies a version number of this License which applies to it and "any +later version", you have the option of following the terms and conditions +either of that version or of any later version published by the Free +Software Foundation. If the Program does not specify a version number of +this License, you may choose any version ever published by the Free Software +Foundation. + + 10. If you wish to incorporate parts of the Program into other free +programs whose distribution conditions are different, write to the author +to ask for permission. For software which is copyrighted by the Free +Software Foundation, write to the Free Software Foundation; we sometimes +make exceptions for this. Our decision will be guided by the two goals +of preserving the free status of all derivatives of our free software and +of promoting the sharing and reuse of software generally. + + NO WARRANTY + + 11. BECAUSE THE PROGRAM IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY +FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW. 
EXCEPT WHEN +OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES +PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED +OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF +MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS +TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU. SHOULD THE +PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING, +REPAIR OR CORRECTION. + + 12. IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING +WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR +REDISTRIBUTE THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES, +INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING +OUT OF THE USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED +TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY +YOU OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER +PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE +POSSIBILITY OF SUCH DAMAGES. + + END OF TERMS AND CONDITIONS + + How to Apply These Terms to Your New Programs + + If you develop a new program, and you want it to be of the greatest +possible use to the public, the best way to achieve this is to make it +free software which everyone can redistribute and change under these terms. + + To do so, attach the following notices to the program. It is safest +to attach them to the start of each source file to most effectively +convey the exclusion of warranty; and each file should have at least +the "copyright" line and a pointer to where the full notice is found. + + + Copyright (C) + + This program is free software; you can redistribute it and/or modify + it under the terms of the GNU General Public License as published by + the Free Software Foundation; either version 2 of the License, or + (at your option) any later version. 
+ + This program is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with this program; if not, write to the Free Software + Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA + + +Also add information on how to contact you by electronic and paper mail. + +If the program is interactive, make it output a short notice like this +when it starts in an interactive mode: + + Gnomovision version 69, Copyright (C) year name of author + Gnomovision comes with ABSOLUTELY NO WARRANTY; for details type `show w'. + This is free software, and you are welcome to redistribute it + under certain conditions; type `show c' for details. + +The hypothetical commands `show w' and `show c' should show the appropriate +parts of the General Public License. Of course, the commands you use may +be called something other than `show w' and `show c'; they could even be +mouse-clicks or menu items--whatever suits your program. + +You should also get your employer (if you work as a programmer) or your +school, if any, to sign a "copyright disclaimer" for the program, if +necessary. Here is a sample; alter the names: + + Yoyodyne, Inc., hereby disclaims all copyright interest in the program + `Gnomovision' (which makes passes at compilers) written by James Hacker. + + , 1 April 1989 + Ty Coon, President of Vice + +This General Public License does not permit incorporating your program into +proprietary programs. If your program is a subroutine library, you may +consider it more useful to permit linking proprietary applications with the +library. If this is what you want to do, use the GNU Library General +Public License instead of this License. 
diff --git a/infiniband-diags/infiniband-diags.spec.in b/infiniband-diags/infiniband-diags.spec.in index 8d02498..ab62931 100644 --- a/infiniband-diags/infiniband-diags.spec.in +++ b/infiniband-diags/infiniband-diags.spec.in @@ -53,7 +53,7 @@ rm -rf $RPM_BUILD_ROOT %define _perldir %(perl -e 'use Config; $T=$Config{installsitearch}; $T=~/(.*)\\/site_perl.*/; print $1;') %{_perldir}/* %{_mandir}/man8/* -%doc README ChangeLog +%doc README COPYING ChangeLog %changelog * Wed Oct 31 2007 Ira Weiny - 1.3.2 From sashak at voltaire.com Sun Jan 27 09:56:44 2008 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Sun, 27 Jan 2008 17:56:44 +0000 Subject: [ofa-general] [PATCH] management: Update License: field in management spec files In-Reply-To: <1201449700.25913.277.camel@hrosenstock-ws.xsigo.com> References: <1201449700.25913.277.camel@hrosenstock-ws.xsigo.com> Message-ID: <20080127175644.GO24344@sashak.voltaire.com> On 08:01 Sun 27 Jan , Hal Rosenstock wrote: > Update License: field to match the exact format given in > http://fedoraproject.org/wiki/Packaging/LicensingGuidelines > for a package available under a choice of GPL or BSD license. > > Signed-off-by: Hal Rosenstock > > diff --git a/infiniband-diags/infiniband-diags.spec.in b/infiniband-diags/infiniband-diags.spec.in > index 8d02498..889c98d 100644 > --- a/infiniband-diags/infiniband-diags.spec.in > +++ b/infiniband-diags/infiniband-diags.spec.in > @@ -6,7 +6,7 @@ Summary: OpenFabrics Alliance InfiniBand Diagnostic Tools > Name: infiniband-diags > Version: @VERSION@ > Release: %rel%{?dist} > -License: GPL/BSD > +License: GPLv2 or BSD I don't see a problem with it, but do you know whether this could fail to meet the guidelines of Novell, Red Hat or any other RPM-based distro?
Sasha From hrosenstock at xsigo.com Sun Jan 27 09:51:44 2008 From: hrosenstock at xsigo.com (Hal Rosenstock) Date: Sun, 27 Jan 2008 09:51:44 -0800 Subject: [ofa-general] [PATCH] management: Update License: field in management spec files In-Reply-To: <20080127175644.GO24344@sashak.voltaire.com> References: <1201449700.25913.277.camel@hrosenstock-ws.xsigo.com> <20080127175644.GO24344@sashak.voltaire.com> Message-ID: <1201456304.25913.322.camel@hrosenstock-ws.xsigo.com> On Sun, 2008-01-27 at 17:56 +0000, Sasha Khapyorsky wrote: > On 08:01 Sun 27 Jan , Hal Rosenstock wrote: > > Update License: field to match the exact format given in > > http://fedoraproject.org/wiki/Packaging/LicensingGuidelines > > for a package available under a choice of GPL or BSD license. > > > > Signed-off-by: Hal Rosenstock > > > > diff --git a/infiniband-diags/infiniband-diags.spec.in b/infiniband-diags/infiniband-diags.spec.in > > index 8d02498..889c98d 100644 > > --- a/infiniband-diags/infiniband-diags.spec.in > > +++ b/infiniband-diags/infiniband-diags.spec.in > > @@ -6,7 +6,7 @@ Summary: OpenFabrics Alliance InfiniBand Diagnostic Tools > > Name: infiniband-diags > > Version: @VERSION@ > > Release: %rel%{?dist} > > -License: GPL/BSD > > +License: GPLv2 or BSD > > I don't see a problem with it, but do you know whether this could fail to meet > the guidelines of Novell, Red Hat or any other RPM-based distro? I don't know; I'm not familiar with Novell packaging requirements and how that is done. I do know that this is in keeping with the actual OpenFabrics license, which has been used for a long time now (specifically in management, which is the same as elsewhere in OpenFabrics), and this change appears in other areas of OpenFabrics (libibverbs and librdmacm).
-- Hal > Sasha From sashak at voltaire.com Sun Jan 27 10:03:40 2008 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Sun, 27 Jan 2008 18:03:40 +0000 Subject: [ofa-general] Re: [PATCH] infiniband-diags: Add missing COPYING file In-Reply-To: <1201455085.25913.299.camel@hrosenstock-ws.xsigo.com> References: <1201455085.25913.299.camel@hrosenstock-ws.xsigo.com> Message-ID: <20080127180340.GP24344@sashak.voltaire.com> On 09:31 Sun 27 Jan , Hal Rosenstock wrote: > infiniband-diags: Add missing COPYING (license) file and update spec > accordingly > > Signed-off-by: Hal Rosenstock Applied. Thanks. Sasha From sashak at voltaire.com Sun Jan 27 10:08:05 2008 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Sun, 27 Jan 2008 18:08:05 +0000 Subject: [ofa-general] [PATCH] management: Update License: field in management spec files In-Reply-To: <1201449700.25913.277.camel@hrosenstock-ws.xsigo.com> References: <1201449700.25913.277.camel@hrosenstock-ws.xsigo.com> Message-ID: <20080127180805.GQ24344@sashak.voltaire.com> On 08:01 Sun 27 Jan , Hal Rosenstock wrote: > Update License: field to match the exact format given in > http://fedoraproject.org/wiki/Packaging/LicensingGuidelines > for a package available under a choice of GPL or BSD license. > > Signed-off-by: Hal Rosenstock Applied. Thanks.
Sasha From pradeeps at linux.vnet.ibm.com Sun Jan 27 10:06:08 2008 From: pradeeps at linux.vnet.ibm.com (Pradeep Satyanarayana) Date: Sun, 27 Jan 2008 10:06:08 -0800 Subject: [ofa-general] Re: CM Enable SRQ for less than 16 s/g - bug In-Reply-To: <1201439328.9219.15.camel@mtls03> References: <1201439328.9219.15.camel@mtls03> Message-ID: <479CC810.7060100@linux.vnet.ibm.com> Eli Cohen wrote: > This commit b150c30c28976f0dcf96bb28780ae62897264c54 introduces a > problem in IPOIB CM. > > failure description: > test hangs. > > bug was found by Mellanox regression. > > test info: > server: > ttcpv -s -r -p 19033 -l 100000 > > client: > ttcpv -s -t 11.4.3.112 -p 19033 -l 100000 -n 8192 > ... > > > Can you take a look at this? > Sure, can you provide some more details about this hang like a stack trace? Bulk of the changes this patch introduces are in ipoib_cm_dev_init(). So, it should not affect the send and receive paths. Not sure why the hang occurs. Pradeep
From eli at mellanox.co.il Sun Jan 27 12:30:06 2008 From: eli at mellanox.co.il (Eli Cohen) Date: Sun, 27 Jan 2008 22:30:06 +0200 Subject: [ofa-general] RE: CM Enable SRQ for less than 16 s/g - bug In-Reply-To: <479CC810.7060100@linux.vnet.ibm.com> References: <1201439328.9219.15.camel@mtls03> <479CC810.7060100@linux.vnet.ibm.com> Message-ID: <6C2C79E72C305246B504CBA17B5500C903321E4C@mtlexch01.mtl.com> I meant that the test hangs but not the system. You can still ping hosts on the ipoib interface, it is just that the test never ends. You can press Ctrl C and restart the test again.
-----Original Message----- From: Pradeep Satyanarayana [mailto:pradeeps at linux.vnet.ibm.com] Sent: Sunday, 27 January 2008 20:06 To: Eli Cohen Cc: Shirley Ma; openfabrics; Dotan Barak Subject: Re: CM Enable SRQ for less than 16 s/g - bug Eli Cohen wrote: > This commit b150c30c28976f0dcf96bb28780ae62897264c54 introduces a > problem in IPOIB CM. > > failure description: > test hangs. > > bug was found by Mellanox regression. > > test info: > server: > ttcpv -s -r -p 19033 -l 100000 > > client: > ttcpv -s -t 11.4.3.112 -p 19033 -l 100000 -n 8192 > ... > > > Can you take a look at this? > Sure, can you provide some more details about this hang like a stack trace? Bulk of the changes this patch introduces are in ipoib_cm_dev_init(). So, it should not affect the send and receive paths. Not sure why the hang occurs. Pradeep From sashak at voltaire.com Sun Jan 27 13:40:55 2008 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Sun, 27 Jan 2008 21:40:55 +0000 Subject: [ofa-general] [PATCH] opensm: remove unused states and sm signals Message-ID: <20080127214055.GS24344@sashak.voltaire.com> Remove unused OSM_SM_STATE_LOST_NEGOTIATION state and OSM_SIGNAL_LOST_SM_NEGOTIATION sm signal.
Signed-off-by: Sasha Khapyorsky --- opensm/include/opensm/osm_base.h | 16 +++++++--------- opensm/opensm/osm_console.c | 2 -- opensm/opensm/osm_helper.c | 36 +++++++++++++++++------------------- 3 files changed, 24 insertions(+), 30 deletions(-) diff --git a/opensm/include/opensm/osm_base.h b/opensm/include/opensm/osm_base.h index f1f3491..aaf9930 100644 --- a/opensm/include/opensm/osm_base.h +++ b/opensm/include/opensm/osm_base.h @@ -777,7 +777,6 @@ typedef enum _osm_sm_state { OSM_SM_STATE_SET_ARMED_DONE, OSM_SM_STATE_SET_ACTIVE, OSM_SM_STATE_SET_ACTIVE_WAIT, - OSM_SM_STATE_LOST_NEGOTIATION, OSM_SM_STATE_STANDBY, OSM_SM_STATE_SUBNET_UP, OSM_SM_STATE_PROCESS_REQUEST, @@ -809,14 +808,13 @@ typedef enum _osm_sm_state { #define OSM_SIGNAL_NO_PENDING_TRANSACTIONS 3 #define OSM_SIGNAL_DONE 4 #define OSM_SIGNAL_DONE_PENDING 5 -#define OSM_SIGNAL_LOST_SM_NEGOTIATION 6 -#define OSM_SIGNAL_LIGHT_SWEEP_FAIL 7 -#define OSM_SIGNAL_IDLE_TIME_PROCESS 8 -#define OSM_SIGNAL_IDLE_TIME_PROCESS_REQUEST 9 -#define OSM_SIGNAL_MASTER_OR_HIGHER_SM_DETECTED 10 -#define OSM_SIGNAL_EXIT_STBY 11 -#define OSM_SIGNAL_PERFMGR_SWEEP 12 -#define OSM_SIGNAL_MAX 13 +#define OSM_SIGNAL_LIGHT_SWEEP_FAIL 6 +#define OSM_SIGNAL_IDLE_TIME_PROCESS 7 +#define OSM_SIGNAL_IDLE_TIME_PROCESS_REQUEST 8 +#define OSM_SIGNAL_MASTER_OR_HIGHER_SM_DETECTED 9 +#define OSM_SIGNAL_EXIT_STBY 10 +#define OSM_SIGNAL_PERFMGR_SWEEP 11 +#define OSM_SIGNAL_MAX 12 typedef uintn_t osm_signal_t; /***********/ diff --git a/opensm/opensm/osm_console.c b/opensm/opensm/osm_console.c index f699ec3..d0a632f 100644 --- a/opensm/opensm/osm_console.c +++ b/opensm/opensm/osm_console.c @@ -348,8 +348,6 @@ static char *sm_state_mgr_str(osm_sm_state_t state) return ("Set Active"); case OSM_SM_STATE_SET_ACTIVE_WAIT: return ("Set Active Wait"); - case OSM_SM_STATE_LOST_NEGOTIATION: - return ("Lost Negotiation"); case OSM_SM_STATE_STANDBY: return ("Standby"); case OSM_SM_STATE_SUBNET_UP: diff --git a/opensm/opensm/osm_helper.c 
b/opensm/opensm/osm_helper.c index 95c702c..1ea86b9 100644 --- a/opensm/opensm/osm_helper.c +++ b/opensm/opensm/osm_helper.c @@ -2086,17 +2086,16 @@ const char *const __osm_sm_state_str[] = { "OSM_SM_STATE_SET_ARMED_DONE", /* 24 */ "OSM_SM_STATE_SET_ACTIVE", /* 25 */ "OSM_SM_STATE_SET_ACTIVE_WAIT", /* 26 */ - "OSM_SM_STATE_LOST_NEGOTIATION", /* 27 */ - "OSM_SM_STATE_STANDBY", /* 28 */ - "OSM_SM_STATE_SUBNET_UP", /* 29 */ - "OSM_SM_STATE_PROCESS_REQUEST", /* 30 */ - "OSM_SM_STATE_PROCESS_REQUEST_WAIT", /* 31 */ - "OSM_SM_STATE_PROCESS_REQUEST_DONE", /* 32 */ - "OSM_SM_STATE_MASTER_OR_HIGHER_SM_DETECTED", /* 33 */ - "OSM_SM_STATE_SET_PKEY", /* 34 */ - "OSM_SM_STATE_SET_PKEY_WAIT", /* 35 */ - "OSM_SM_STATE_SET_PKEY_DONE", /* 36 */ - "UNKNOWN STATE!!" /* 37 */ + "OSM_SM_STATE_STANDBY", /* 27 */ + "OSM_SM_STATE_SUBNET_UP", /* 28 */ + "OSM_SM_STATE_PROCESS_REQUEST", /* 29 */ + "OSM_SM_STATE_PROCESS_REQUEST_WAIT", /* 30 */ + "OSM_SM_STATE_PROCESS_REQUEST_DONE", /* 31 */ + "OSM_SM_STATE_MASTER_OR_HIGHER_SM_DETECTED", /* 32 */ + "OSM_SM_STATE_SET_PKEY", /* 33 */ + "OSM_SM_STATE_SET_PKEY_WAIT", /* 34 */ + "OSM_SM_STATE_SET_PKEY_DONE", /* 35 */ + "UNKNOWN STATE!!" /* 36 */ }; const char *const __osm_sm_signal_str[] = { @@ -2106,14 +2105,13 @@ const char *const __osm_sm_signal_str[] = { "OSM_SIGNAL_NO_PENDING_TRANSACTIONS", /* 3 */ "OSM_SIGNAL_DONE", /* 4 */ "OSM_SIGNAL_DONE_PENDING", /* 5 */ - "OSM_SIGNAL_LOST_SM_NEGOTIATION", /* 6 */ - "OSM_SIGNAL_LIGHT_SWEEP_FAIL", /* 7 */ - "OSM_SIGNAL_IDLE_TIME_PROCESS", /* 8 */ - "OSM_SIGNAL_IDLE_TIME_PROCESS_REQUEST", /* 9 */ - "OSM_SIGNAL_MASTER_OR_HIGHER_SM_DETECTED", /* 10 */ - "OSM_SIGNAL_EXIT_STBY", /* 11 */ - "OSM_SIGNAL_PERFMGR_SWEEP", /* 12 */ - "UNKNOWN SIGNAL!!" 
/* 13 */ + "OSM_SIGNAL_LIGHT_SWEEP_FAIL", /* 6 */ + "OSM_SIGNAL_IDLE_TIME_PROCESS", /* 7 */ + "OSM_SIGNAL_IDLE_TIME_PROCESS_REQUEST", /* 8 */ + "OSM_SIGNAL_MASTER_OR_HIGHER_SM_DETECTED", /* 9 */ + "OSM_SIGNAL_EXIT_STBY", /* 10 */ + "OSM_SIGNAL_PERFMGR_SWEEP", /* 11 */ + "UNKNOWN SIGNAL!!" /* 12 */ }; /********************************************************************** -- 1.5.4.rc5 From rdreier at cisco.com Sun Jan 27 13:33:33 2008 From: rdreier at cisco.com (Roland Dreier) Date: Sun, 27 Jan 2008 13:33:33 -0800 Subject: [ofa-general] [PATCH 2/3] RDMA/cxgb3: fix page shift calculation in build_phys_page_list() In-Reply-To: (Bryan S. Rosenburg's message of "Sat, 26 Jan 2008 13:05:15 -0500") References: Message-ID: >Roland, you're quite right that the non-obvious page list is not >necessarily a problem. It causes a failure only if the virtual address >that is eventually mapped to this region has an alignment with respect to >large-page boundaries that is different from the alignment of the physical >address. To be concrete, the page list works as you expect if and only if > > ((*iova_start) & ((1ULL << shift) - 1)) == > (buffer_list[0].addr & ((1ULL << shift) - 1)). got it... I was tricking myself that the check for alignment at the start of the function was sufficient, but it's not once we start using a bigger value of shift. I think the patch below should be a fix for the problem, although I've only compile tested it. The idea is to stop increasing shift once it reaches a bit position where the first buffer and the iova differ. What do you think? If this works for you, I will merge it for 2.6.25. 
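The idea described in the message above (stop growing the shift at the first bit position where the first buffer address and the iova differ) can be sketched as a small standalone function. This is only an illustration under stated assumptions, not the driver code: the name largest_common_shift and its parameters are hypothetical, and it assumes the caller has already checked that the two addresses agree in their low min_shift bits, as the alignment check at the start of the driver function is said to do.

```c
#include <stdint.h>

/*
 * Hypothetical helper, not part of the mthca driver.
 *
 * Returns the largest shift in [min_shift, max_shift] such that iova and
 * phys have identical low `shift` bits.  Assumes the caller has already
 * verified that the low `min_shift` bits of the two addresses agree.
 */
unsigned largest_common_shift(uint64_t iova, uint64_t phys,
                              unsigned min_shift, unsigned max_shift)
{
    unsigned shift;

    for (shift = min_shift; shift < max_shift; ++shift) {
        /* Stop at the first bit position where the addresses differ. */
        if ((iova ^ phys) & (1ULL << shift))
            break;
    }
    return shift;
}
```

For any shift returned by this sketch, (iova & ((1ULL << shift) - 1)) equals (phys & ((1ULL << shift) - 1)), which is the condition quoted earlier in the message. The proposed patch that follows applies the same check inside the existing loop, together with the mask tests the loop already performs.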
diff --git a/drivers/infiniband/hw/mthca/mthca_provider.c b/drivers/infiniband/hw/mthca/mthca_provider.c index 6bcde1c..1a15129 100644 --- a/drivers/infiniband/hw/mthca/mthca_provider.c +++ b/drivers/infiniband/hw/mthca/mthca_provider.c @@ -948,7 +948,9 @@ static struct ib_mr *mthca_reg_phys_mr(struct ib_pd *pd, return ERR_PTR(-EINVAL); /* Find largest page shift we can use to cover buffers */ - for (shift = PAGE_SHIFT; shift < 31; ++shift) + for (shift = PAGE_SHIFT; shift < 31; ++shift) { + if ((buffer_list[0].addr ^ *iova_start) & (1ULL << shift)) + break; if (num_phys_buf > 1) { if ((1ULL << shift) & mask) break; @@ -958,6 +960,7 @@ static struct ib_mr *mthca_reg_phys_mr(struct ib_pd *pd, (buffer_list[0].addr & ((1ULL << shift) - 1))) break; } + } buffer_list[0].size += buffer_list[0].addr & ((1ULL << shift) - 1); buffer_list[0].addr &= ~0ull << shift; From kliteyn at dev.mellanox.co.il Sun Jan 27 15:07:09 2008 From: kliteyn at dev.mellanox.co.il (Yevgeny Kliteynik) Date: Mon, 28 Jan 2008 01:07:09 +0200 Subject: [ofa-general] [PATCH v2] opensm/osm_ucast_ftree.c: ignore port 0 and loopbacks on swithces Message-ID: <479D0E9D.7020005@dev.mellanox.co.il> Hi Sasha, Fat-tree routing should ignore port 0 and loopback connections on switches when populating its db. Please apply to ofed_1_3 and master.
Signed-off-by: Yevgeny Kliteynik --- opensm/opensm/osm_ucast_ftree.c | 17 +++++++++++++++-- 1 files changed, 15 insertions(+), 2 deletions(-) diff --git a/opensm/opensm/osm_ucast_ftree.c b/opensm/opensm/osm_ucast_ftree.c index dcbdc44..94d4d79 100644 --- a/opensm/opensm/osm_ucast_ftree.c +++ b/opensm/opensm/osm_ucast_ftree.c @@ -3113,7 +3113,7 @@ static int __osm_ftree_fabric_construct_sw_ports(IN ftree_fabric_t * p_ftree, CL_ASSERT(osm_node_get_type(p_node) == IB_NODE_TYPE_SWITCH); - for (i = 0; i < osm_node_get_num_physp(p_node); i++) { + for (i = 1; i < osm_node_get_num_physp(p_node); i++) { osm_physp_t *p_osm_port = osm_node_get_physp_ptr(p_node, i); if (!osm_physp_is_valid(p_osm_port)) @@ -3122,11 +3122,23 @@ static int __osm_ftree_fabric_construct_sw_ports(IN ftree_fabric_t * p_ftree, continue; p_remote_osm_port = osm_physp_get_remote(p_osm_port); + if (!p_remote_osm_port) + continue; + p_remote_node = osm_node_get_remote_node(p_node, i, &remote_port_num); - if (!p_remote_osm_port) + /* ignore any loopback connection on switch */ + if (p_node == p_remote_node) { + osm_log(&p_ftree->p_osm->log, OSM_LOG_DEBUG, + "__osm_ftree_fabric_construct_sw_ports: " + "Ignoring loopback on switch GUID 0x%016" PRIx64 + ", LID 0x%04x, rank %u\n", + __osm_ftree_sw_get_guid_ho(p_sw), + cl_ntoh16(p_sw->base_lid), + p_sw->rank); continue; + } remote_node_type = osm_node_get_type(p_remote_node); remote_node_guid = osm_node_get_node_guid(p_remote_node); @@ -3158,6 +3170,7 @@ static int __osm_ftree_fabric_construct_sw_ports(IN ftree_fabric_t * p_ftree, __osm_ftree_fabric_get_sw_by_guid(p_ftree, remote_node_guid); CL_ASSERT(p_remote_sw); + p_remote_hca_or_sw = (void *)p_remote_sw; if (abs(p_sw->rank - p_remote_sw->rank) != 1) { -- 1.5.1.4 From pradeeps at linux.vnet.ibm.com Sun Jan 27 15:17:01 2008 From: pradeeps at linux.vnet.ibm.com (Pradeep Satyanarayana) Date: Sun, 27 Jan 2008 15:17:01 -0800 Subject: [ofa-general] Re: CM Enable SRQ for less than 16 s/g - bug In-Reply-To: 
<6C2C79E72C305246B504CBA17B5500C903321E4C@mtlexch01.mtl.com> References: <1201439328.9219.15.camel@mtls03> <479CC810.7060100@linux.vnet.ibm.com> <6C2C79E72C305246B504CBA17B5500C903321E4C@mtlexch01.mtl.com> Message-ID: <479D10ED.6060107@linux.vnet.ibm.com> Eli Cohen wrote: > I meant that the test hangs but not the system. You can still ping hosts on the ipoib interface, it is just that the test never ends. You can press Ctrl C and restart the test again. > If one can Ctrl-C that means it is not hung in the kernel. Several things strike me: a) Is this a new version of the test? b) Was the system left in an "unclean" state from the previous test in the regression suite? c) Can this test hang be reproduced by just running this test on a freshly booted system? Right now I do not have access to the machines to run a test. I will try and do it next week. Pradeep > -----Original Message----- > From: Pradeep Satyanarayana [mailto:pradeeps at linux.vnet.ibm.com] > Sent: Sun, 27 January 2008 20:06 > To: Eli Cohen > Cc: Shirley Ma; openfabrics; Dotan Barak > Subject: Re: CM Enable SRQ for less than 16 s/g - bug > > Eli Cohen wrote: >> This commit b150c30c28976f0dcf96bb28780ae62897264c54 introduces a >> problem in IPOIB CM. >> >> failure description: >> test hangs. >> >> bug was found by Mellanox regression. >> >> test info: >> server: >> ttcpv -s -r -p 19033 -l 100000 >> >> client: >> ttcpv -s -t 11.4.3.112 -p 19033 -l 100000 -n 8192 >> > ... >> >> Can you take a look at this? >> > Sure, can you provide some more details about this hang like a stack trace? > Bulk of the changes this patch introduces are in ipoib_cm_dev_init(). So, it should not affect the send and receive paths. Not sure why the hang occurs.
> > Pradeep > > From sashak at voltaire.com Sun Jan 27 16:11:58 2008 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Mon, 28 Jan 2008 00:11:58 +0000 Subject: [ofa-general] [PATCH 1/3] opensm/perfmgr: wakeup perfmgr discover only after NO_PENDING_TRANSACTION was signaled Message-ID: <20080128001158.GT24344@sashak.voltaire.com> This fixes a potential race between perfmgr discovery wakeup and NO_PENDING_TRANSACTION delivery, in which the signal could remain uncleared and cause some error messages. Signed-off-by: Sasha Khapyorsky --- opensm/opensm/osm_sm_mad_ctrl.c | 4 ++-- 1 files changed, 2 insertions(+), 2 deletions(-) diff --git a/opensm/opensm/osm_sm_mad_ctrl.c b/opensm/opensm/osm_sm_mad_ctrl.c index 762b684..5981ca3 100644 --- a/opensm/opensm/osm_sm_mad_ctrl.c +++ b/opensm/opensm/osm_sm_mad_ctrl.c @@ -106,11 +106,11 @@ __osm_sm_mad_ctrl_retire_trans_mad(IN osm_sm_mad_ctrl_t * const p_ctrl, "__osm_sm_mad_ctrl_retire_trans_mad: " "signal OSM_SIGNAL_NO_PENDING_TRANSACTIONS\n"); + osm_sm_signal(&p_ctrl->p_subn->p_osm->sm, + OSM_SIGNAL_NO_PENDING_TRANSACTIONS); #ifdef ENABLE_OSM_PERF_MGR pthread_cond_signal(&p_ctrl->p_stats->cond); #endif - osm_sm_signal(&p_ctrl->p_subn->p_osm->sm, - OSM_SIGNAL_NO_PENDING_TRANSACTIONS); } OSM_LOG_EXIT(p_ctrl->p_log); -- 1.5.4.rc5 From sashak at voltaire.com Sun Jan 27 16:12:48 2008 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Mon, 28 Jan 2008 00:12:48 +0000 Subject: [ofa-general] [PATCH 2/3] opensm/perfmgr: break perfmgr discovery if osm_exit_flag is on In-Reply-To: <20080128001158.GT24344@sashak.voltaire.com> References: <20080128001158.GT24344@sashak.voltaire.com> Message-ID: <20080128001248.GU24344@sashak.voltaire.com> Function wait_for_pending_transactions() now reports interruption status (by returning the osm_exit_flag value), so the perfmgr discovery process can be aborted.
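A minimal standalone sketch of the wait-and-abort pattern described above, not taken from the patch. It assumes hypothetical stand-ins (struct stats, exit_flag, wait_for_pending(), discovery()) in place of OpenSM's osm_stats_t, osm_exit_flag, wait_for_pending_transactions() and perfmgr_discovery().

```c
#include <pthread.h>

/* Hypothetical stand-in for osm_stats_t: only the fields the wait needs. */
struct stats {
	pthread_mutex_t mutex;
	pthread_cond_t cond;
	unsigned outstanding; /* requests still in flight */
};

/* Hypothetical stand-in for osm_exit_flag; set when shutdown is requested. */
static volatile int exit_flag;

/* Block until nothing is outstanding or shutdown is requested.
 * Returns nonzero if the caller should stop what it is doing. */
static int wait_for_pending(struct stats *s)
{
	pthread_mutex_lock(&s->mutex);
	while (s->outstanding && !exit_flag)
		pthread_cond_wait(&s->cond, &s->mutex);
	pthread_mutex_unlock(&s->mutex);
	return exit_flag;
}

/* A multi-stage job that bails out as soon as a wait reports interruption. */
static int discovery(struct stats *s)
{
	if (wait_for_pending(s))
		return -1; /* aborted after the first stage */
	/* ... a second stage would issue more requests here ... */
	if (wait_for_pending(s))
		return -1; /* aborted after the second stage */
	return 0;
}
```

For the waiter to notice a shutdown request, whoever sets the flag also has to signal the condition variable.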
Signed-off-by: Sasha Khapyorsky --- opensm/opensm/osm_perfmgr.c | 8 +++++--- 1 files changed, 5 insertions(+), 3 deletions(-) diff --git a/opensm/opensm/osm_perfmgr.c b/opensm/opensm/osm_perfmgr.c index 091b46a..1d92b3b 100644 --- a/opensm/opensm/osm_perfmgr.c +++ b/opensm/opensm/osm_perfmgr.c @@ -723,7 +723,7 @@ static int wait_for_pending_transactions(osm_stats_t * stats) while (stats->qp0_mads_outstanding && !osm_exit_flag) pthread_cond_wait(&stats->cond, &stats->mutex); pthread_mutex_unlock(&stats->mutex); - return 0; + return osm_exit_flag; } static void reset_node_count(cl_map_item_t * const p_map_item, void *cxt) @@ -762,7 +762,8 @@ static int perfmgr_discovery(osm_opensm_t * osm) if (ret) goto _exit; - wait_for_pending_transactions(&osm->stats); + if (wait_for_pending_transactions(&osm->stats)) + goto _exit; if (is_sm_port_down(&osm->sm)) { osm_log(&osm->log, OSM_LOG_VERBOSE, "SM port is down\n"); @@ -775,7 +776,8 @@ static int perfmgr_discovery(osm_opensm_t * osm) if (ret) goto _exit; - wait_for_pending_transactions(&osm->stats); + if (wait_for_pending_transactions(&osm->stats)) + goto _exit; _drop: osm_drop_mgr_process(&osm->sm.drop_mgr); -- 1.5.4.rc5 From sashak at voltaire.com Sun Jan 27 16:14:00 2008 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Mon, 28 Jan 2008 00:14:00 +0000 Subject: [ofa-general] [PATCH 3/3] opensm/perfmgr: add support for non-pthread systems In-Reply-To: <20080128001248.GU24344@sashak.voltaire.com> References: <20080128001158.GT24344@sashak.voltaire.com> <20080128001248.GU24344@sashak.voltaire.com> Message-ID: <20080128001400.GV24344@sashak.voltaire.com> This makes it possible to use PerfMgr on systems that don't have the pthread library. Only compilation was tested.
Signed-off-by: Sasha Khapyorsky --- opensm/include/opensm/osm_stats.h | 10 +++++++++- opensm/opensm/main.c | 4 ++++ opensm/opensm/osm_perfmgr.c | 19 +++++++++++++++++++ opensm/opensm/osm_sm_mad_ctrl.c | 4 ++++ 4 files changed, 36 insertions(+), 1 deletions(-) diff --git a/opensm/include/opensm/osm_stats.h b/opensm/include/opensm/osm_stats.h index 51424d1..b5100f2 100644 --- a/opensm/include/opensm/osm_stats.h +++ b/opensm/include/opensm/osm_stats.h @@ -49,10 +49,14 @@ #define _OSM_STATS_H_ #ifdef ENABLE_OSM_PERF_MGR +#ifdef HAVE_LIBPTHREAD #include +#else +#include +#endif #endif -#include #include +#include #ifdef __cplusplus # define BEGIN_C_DECLS extern "C" { @@ -97,8 +101,12 @@ typedef struct _osm_stats { atomic32_t sa_mads_rcvd_unknown; atomic32_t sa_mads_ignored; #ifdef ENABLE_OSM_PERF_MGR +#ifdef HAVE_LIBPTHREAD pthread_mutex_t mutex; pthread_cond_t cond; +#else + cl_event_t event; +#endif #endif } osm_stats_t; /* diff --git a/opensm/opensm/main.c b/opensm/opensm/main.c index a21cfcc..a2c50f2 100644 --- a/opensm/opensm/main.c +++ b/opensm/opensm/main.c @@ -1044,7 +1044,11 @@ int main(int argc, char *argv[]) "There are still %u MADs out. 
Forcing the exit of the OpenSM application...\n", osm.mad_pool.mads_out); #ifdef ENABLE_OSM_PERF_MGR +#ifdef HAVE_LIBPTHREAD pthread_cond_signal(&osm.stats.cond); +#else + cl_event_signal(&osm.stats.event); +#endif #endif } diff --git a/opensm/opensm/osm_perfmgr.c b/opensm/opensm/osm_perfmgr.c index 1d92b3b..7c15e74 100644 --- a/opensm/opensm/osm_perfmgr.c +++ b/opensm/opensm/osm_perfmgr.c @@ -719,10 +719,19 @@ static int sweep_hop_0(osm_sm_t * const sm) static int wait_for_pending_transactions(osm_stats_t * stats) { +#ifdef HAVE_LIBPTHREAD pthread_mutex_lock(&stats->mutex); while (stats->qp0_mads_outstanding && !osm_exit_flag) pthread_cond_wait(&stats->cond, &stats->mutex); pthread_mutex_unlock(&stats->mutex); +#else + while (1) { + unsigned count = stats->qp0_mads_outstanding; + if (!count || osm_exit_flag) + break; + cl_event_wait_on(&stats->event, EVENT_NO_TIMEOUT, TRUE); + } +#endif return osm_exit_flag; } @@ -889,8 +898,12 @@ void osm_perfmgr_destroy(osm_perfmgr_t * const pm) free(pm->event_db_dump_file); perfmgr_db_destroy(pm->db); cl_timer_destroy(&pm->sweep_timer); +#ifdef HAVE_LIBPTHREAD pthread_cond_destroy(&pm->subn->p_osm->stats.cond); pthread_mutex_destroy(&pm->subn->p_osm->stats.mutex); +#else + cl_event_destroy(&pm->subn->p_osm->stats.event); +#endif OSM_LOG_EXIT(pm->log); } @@ -1285,8 +1298,14 @@ osm_perfmgr_init(osm_perfmgr_t * const pm, pm->max_outstanding_queries = p_opt->perfmgr_max_outstanding_queries; pm->event_plugin = event_plugin; +#ifdef HAVE_LIBPTHREAD pthread_mutex_init(&subn->p_osm->stats.mutex, NULL); pthread_cond_init(&subn->p_osm->stats.cond, NULL); +#else + status = cl_event_init(&subn->p_osm->stats.event, FALSE); + if (status != IB_SUCCESS) + goto Exit; +#endif status = cl_timer_init(&pm->sweep_timer, perfmgr_sweep, pm); if (status != IB_SUCCESS) diff --git a/opensm/opensm/osm_sm_mad_ctrl.c b/opensm/opensm/osm_sm_mad_ctrl.c index 5981ca3..2638357 100644 --- a/opensm/opensm/osm_sm_mad_ctrl.c +++ b/opensm/opensm/osm_sm_mad_ctrl.c @@ 
-109,7 +109,11 @@ __osm_sm_mad_ctrl_retire_trans_mad(IN osm_sm_mad_ctrl_t * const p_ctrl, osm_sm_signal(&p_ctrl->p_subn->p_osm->sm, OSM_SIGNAL_NO_PENDING_TRANSACTIONS); #ifdef ENABLE_OSM_PERF_MGR +#ifdef HAVE_LIBPTHREAD pthread_cond_signal(&p_ctrl->p_stats->cond); +#else + cl_event_signal(&p_ctrl->p_stats->event); +#endif #endif } -- 1.5.4.rc5 From kliteyn at mellanox.co.il Sun Jan 27 17:30:23 2008 From: kliteyn at mellanox.co.il (kliteyn at mellanox.co.il) Date: 28 Jan 2008 03:30:23 +0200 Subject: [ofa-general] nightly osm_sim report 2008-01-28:normal completion Message-ID: OSM Simulation Regression Summary [Generated mail - please do NOT reply] OpenSM binary date = 2008-01-27 OpenSM git rev = Sun_Jan_27_12:29:23_2008 [ee6a6e0b276ca62a84be71daadad6d3794c2d990] ibutils git rev = Mon_Dec_24_10:42:01_2007 [675bec82306d6920555dd0b5e2f664983e27e60f] Total=400 Pass=398 Fail=2 Pass: 30 Stability IS1-16.topo 30 Pkey IS1-16.topo 30 OsmTest IS1-16.topo 30 OsmStress IS1-16.topo 30 Multicast IS1-16.topo 30 LidMgr IS1-16.topo 10 Stability IS3-loop.topo 10 Stability IS3-128.topo 10 Pkey IS3-128.topo 10 OsmTest IS3-loop.topo 10 OsmTest IS3-128.topo 10 OsmStress IS3-128.topo 10 Multicast IS3-loop.topo 10 Multicast IS3-128.topo 10 FatTree merge-roots-4-ary-2-tree.topo 10 FatTree merge-root-4-ary-3-tree.topo 10 FatTree gnu-stallion-64.topo 10 FatTree blend-4-ary-2-tree.topo 10 FatTree RhinoDDR.topo 10 FatTree FullGnu.topo 10 FatTree 4-ary-2-tree.topo 10 FatTree 2-ary-4-tree.topo 10 FatTree 12-node-spaced.topo 10 FTreeFail 4-ary-2-tree-missing-sw-link.topo 10 FTreeFail 4-ary-2-tree-links-at-same-rank-2.topo 10 FTreeFail 4-ary-2-tree-links-at-same-rank-1.topo 10 FTreeFail 4-ary-2-tree-diff-num-pgroups.topo 8 LidMgr IS3-128.topo Failures: 2 LidMgr IS3-128.topo From dwsotarm at sotar.com Sun Jan 27 19:28:05 2008 From: dwsotarm at sotar.com (Yolanda Ratliff) Date: Mon, 28 Jan 2008 11:28:05 +0800 Subject: [ofa-general] Want to be a hero in bed? 
Message-ID: <01c861a0$d9651080$566a4dda@dwsotarm> From eli at dev.mellanox.co.il Sun Jan 27 23:57:52 2008 From: eli at dev.mellanox.co.il (Eli Cohen) Date: Mon, 28 Jan 2008 09:57:52 +0200 Subject: [ofa-general] Re: CM Enable SRQ for less than 16 s/g - bug In-Reply-To: <479D10ED.6060107@linux.vnet.ibm.com> References: <1201439328.9219.15.camel@mtls03> <479CC810.7060100@linux.vnet.ibm.com> <6C2C79E72C305246B504CBA17B5500C903321E4C@mtlexch01.mtl.com> <479D10ED.6060107@linux.vnet.ibm.com> Message-ID: <20080128075731.GA15124@mtls03> On Sun, Jan 27, 2008 at 03:17:01PM -0800, Pradeep Satyanarayana wrote: > a) Is this a new version of the test? We run this test on a regular basis. As I said, under the same conditions, with a driver based on a commit prior to the one I mentioned, we do not see the problem. > b) Was the system left in an "unclean" state from the previous test in > the regression suite? I don't think so - you can ping through this interface and you can ssh through it.
From tziporet at dev.mellanox.co.il Mon Jan 28 00:25:16 2008 From: tziporet at dev.mellanox.co.il (Tziporet Koren) Date: Mon, 28 Jan 2008 10:25:16 +0200 Subject: [ofa-general] Re: [ewg] Re: [GIT PULL ofed-1.2.5] - cxgb3 fixes In-Reply-To: <479C8036.9010308@dev.mellanox.co.il> References: <479A65DF.2070407@opengridcomputing.com> <479C8036.9010308@dev.mellanox.co.il> Message-ID: <479D916C.5070407@mellanox.co.il> > Steve Wise wrote: >> >> These are all going upstream and in ofed-1.3 and I want to keep >> ofed-1.2.5 up to date as well. Can these make 1.2.5.5 by chance? >> >> > You were lucky - we just built 1.2.5.5 yesterday so it's in :-) Tziporet From mashirle at us.ibm.com Sun Jan 27 14:30:31 2008 From: mashirle at us.ibm.com (Shirley Ma) Date: Sun, 27 Jan 2008 14:30:31 -0800 Subject: [ofa-general] Re: CM Enable SRQ for less than 16 s/g - bug In-Reply-To: <20080128075731.GA15124@mtls03> References: <1201439328.9219.15.camel@mtls03> <479CC810.7060100@linux.vnet.ibm.com> <6C2C79E72C305246B504CBA17B5500C903321E4C@mtlexch01.mtl.com> <479D10ED.6060107@linux.vnet.ibm.com> <20080128075731.GA15124@mtls03> Message-ID: <1201473031.9025.1.camel@localhost.localdomain> On Mon, 2008-01-28 at 09:57 +0200, Eli Cohen wrote: > On Sun, Jan 27, 2008 at 03:17:01PM -0800, Pradeep Satyanarayana wrote: > > a) Is this a new version of the test? > We run this test on a regular basis. As I said, under the same > conditions, with a driver based on a commit prior to the one I mentioned, > we do not see the problem. > > > b) Was the system left in an "unclean" state from the previous test in > > the regression suite? > I don't think so - you can ping through this interface and you can ssh > through it. Hello Eli, Could you please dump the hung stack so we can know where the test hung? It could be this patch changed the timing and caused this issue.
Thanks Shirley From eli at dev.mellanox.co.il Mon Jan 28 00:31:16 2008 From: eli at dev.mellanox.co.il (Eli Cohen) Date: Mon, 28 Jan 2008 10:31:16 +0200 Subject: [ofa-general] Re: CM Enable SRQ for less than 16 s/g - bug In-Reply-To: <1201473031.9025.1.camel@localhost.localdomain> References: <1201439328.9219.15.camel@mtls03> <479CC810.7060100@linux.vnet.ibm.com> <6C2C79E72C305246B504CBA17B5500C903321E4C@mtlexch01.mtl.com> <479D10ED.6060107@linux.vnet.ibm.com> <20080128075731.GA15124@mtls03> <1201473031.9025.1.camel@localhost.localdomain> Message-ID: <20080128083115.GA7477@mtls03> On Sun, Jan 27, 2008 at 02:30:31PM -0800, Shirley Ma wrote: > Hello Eli, > > Could you please dump the hung stack so we can know where the test > hung? It could be this patch changed the timing and caused this issue. > > Thanks > Shirley > I'll try to get that for you. From jackm at dev.mellanox.co.il Mon Jan 28 00:40:51 2008 From: jackm at dev.mellanox.co.il (Jack Morgenstein) Date: Mon, 28 Jan 2008 10:40:51 +0200 Subject: [ofa-general] [PATCH 1 of 2] IB/mlx4: For 64-bit systems, use large virtually contiguous queue buffers (vmap) Message-ID: <200801281040.52138.jackm@dev.mellanox.co.il> IB/mlx4: For 64-bit systems, use large virtually contiguous queue buffers (rather than 2-layer). Since kernel virtual memory is not a problem on 64-bit systems, there is no reason to use a 2-layer page mapping scheme on such systems. Instead, map the page list to a single virtually contiguous buffer, so that buffer memory can be accessed via direct indexing. Signed-off-by: Michael S.
Tsirkin Signed-off-by: Jack Morgenstein --- Roland, this 2-patch series is based on your for-2.6.25 tree, commit 9b73e76f3cf63379dcf45fcd4f112f5812418d0a Index: infiniband/drivers/net/mlx4/alloc.c =================================================================== --- infiniband.orig/drivers/net/mlx4/alloc.c 2008-01-22 18:41:14.000000000 +0200 +++ infiniband/drivers/net/mlx4/alloc.c 2008-01-27 10:52:21.000000000 +0200 @@ -151,6 +151,19 @@ int mlx4_buf_alloc(struct mlx4_dev *dev, memset(buf->u.page_list[i].buf, 0, PAGE_SIZE); } + + if (BITS_PER_LONG == 64) { + struct page **pages; + pages = kmalloc(sizeof *pages * buf->nbufs, GFP_KERNEL); + if (!pages) + goto err_free; + for (i = 0; i < buf->nbufs; ++i) + pages[i] = virt_to_page(buf->u.page_list[i].buf); + buf->u.direct.buf = vmap(pages, buf->nbufs, VM_MAP, PAGE_KERNEL); + kfree(pages); + if (!buf->u.direct.buf) + goto err_free; + } } return 0; @@ -170,6 +183,9 @@ void mlx4_buf_free(struct mlx4_dev *dev, dma_free_coherent(&dev->pdev->dev, size, buf->u.direct.buf, buf->u.direct.map); else { + if (BITS_PER_LONG == 64) + vunmap(buf->u.direct.buf); + for (i = 0; i < buf->nbufs; ++i) if (buf->u.page_list[i].buf) dma_free_coherent(&dev->pdev->dev, PAGE_SIZE, Index: infiniband/include/linux/mlx4/device.h =================================================================== --- infiniband.orig/include/linux/mlx4/device.h 2008-01-27 10:44:25.000000000 +0200 +++ infiniband/include/linux/mlx4/device.h 2008-01-27 10:52:21.000000000 +0200 @@ -189,7 +189,7 @@ struct mlx4_buf_list { }; struct mlx4_buf { - union { + struct { struct mlx4_buf_list direct; struct mlx4_buf_list *page_list; } u; Index: infiniband/drivers/infiniband/hw/mlx4/qp.c =================================================================== --- infiniband.orig/drivers/infiniband/hw/mlx4/qp.c 2008-01-27 10:44:25.000000000 +0200 +++ infiniband/drivers/infiniband/hw/mlx4/qp.c 2008-01-27 10:52:21.000000000 +0200 @@ -96,7 +96,7 @@ static int is_qp0(struct mlx4_ib_dev *de 
static void *get_wqe(struct mlx4_ib_qp *qp, int offset) { - if (qp->buf.nbufs == 1) + if (BITS_PER_LONG == 64 || qp->buf.nbufs == 1) return qp->buf.u.direct.buf + offset; else return qp->buf.u.page_list[offset >> PAGE_SHIFT].buf + From jackm at dev.mellanox.co.il Mon Jan 28 00:40:59 2008 From: jackm at dev.mellanox.co.il (Jack Morgenstein) Date: Mon, 28 Jan 2008 10:40:59 +0200 Subject: [ofa-general] [PATCH 2 of 2] IB/mlx4: shrinking WQE Message-ID: <200801281040.59398.jackm@dev.mellanox.co.il> IB/mlx4: shrinking WQE ConnectX supports shrinking wqe, such that a single WR can include multiple units of wqe_shift. This way, WRs can differ in size, and do not have to be a power of 2 in size, saving memory and speeding up send WR posting. Unfortunately, if we do this, the wqe_index field in the CQE can't be used to look up the WR ID anymore, so do this only if selective signalling is off. Further, on 32-bit platforms, we can't use vmap to make the QP buffer virtually contiguous. Thus we have to use constant-sized WRs to make sure a WR is always fully within a single page-sized chunk. Finally, we use a WR with NOP opcode to avoid wrap-around in the middle of a WR. We set the NoErrorCompletion (NEC) bit to avoid getting completions with error for NOP WRs. Since NEC is only supported starting with firmware 2.2.232, we use constant-sized WRs for older firmware. And, since MLX QPs only support SEND, we use constant-sized WRs in this case. When stamping during NOP posting, do the stamping after setting the NOP wqe valid bit. Signed-off-by: Michael S.
Tsirkin Signed-off-by: Jack Morgenstein --- Index: infiniband/drivers/infiniband/hw/mlx4/cq.c =================================================================== --- infiniband.orig/drivers/infiniband/hw/mlx4/cq.c 2008-01-27 10:44:25.000000000 +0200 +++ infiniband/drivers/infiniband/hw/mlx4/cq.c 2008-01-27 11:43:44.000000000 +0200 @@ -332,6 +332,12 @@ static int mlx4_ib_poll_one(struct mlx4_ is_error = (cqe->owner_sr_opcode & MLX4_CQE_OPCODE_MASK) == MLX4_CQE_OPCODE_ERROR; + if (unlikely((cqe->owner_sr_opcode & MLX4_CQE_OPCODE_MASK) == MLX4_OPCODE_NOP && + is_send)) { + printk(KERN_WARNING "Completion for NOP opcode detected!\n"); + return -EINVAL; + } + if (!*cur_qp || (be32_to_cpu(cqe->my_qpn) & 0xffffff) != (*cur_qp)->mqp.qpn) { /* @@ -354,8 +360,10 @@ static int mlx4_ib_poll_one(struct mlx4_ if (is_send) { wq = &(*cur_qp)->sq; - wqe_ctr = be16_to_cpu(cqe->wqe_index); - wq->tail += (u16) (wqe_ctr - (u16) wq->tail); + if (!(*cur_qp)->sq_signal_bits) { + wqe_ctr = be16_to_cpu(cqe->wqe_index); + wq->tail += (u16) (wqe_ctr - (u16) wq->tail); + } wc->wr_id = wq->wrid[wq->tail & (wq->wqe_cnt - 1)]; ++wq->tail; } else if ((*cur_qp)->ibqp.srq) { Index: infiniband/drivers/infiniband/hw/mlx4/mlx4_ib.h =================================================================== --- infiniband.orig/drivers/infiniband/hw/mlx4/mlx4_ib.h 2008-01-27 10:44:25.000000000 +0200 +++ infiniband/drivers/infiniband/hw/mlx4/mlx4_ib.h 2008-01-27 11:43:44.000000000 +0200 @@ -120,6 +120,8 @@ struct mlx4_ib_qp { u32 doorbell_qpn; __be32 sq_signal_bits; + unsigned sq_next_wqe; + int sq_max_wqes_per_wr; int sq_spare_wqes; struct mlx4_ib_wq sq; Index: infiniband/drivers/infiniband/hw/mlx4/qp.c =================================================================== --- infiniband.orig/drivers/infiniband/hw/mlx4/qp.c 2008-01-27 10:52:21.000000000 +0200 +++ infiniband/drivers/infiniband/hw/mlx4/qp.c 2008-01-27 11:43:44.000000000 +0200 @@ -30,6 +30,7 @@ * SOFTWARE. 
*/ +#include #include #include @@ -115,16 +116,88 @@ static void *get_send_wqe(struct mlx4_ib /* * Stamp a SQ WQE so that it is invalid if prefetched by marking the - * first four bytes of every 64 byte chunk with 0xffffffff, except for - * the very first chunk of the WQE. + * first four bytes of every 64 byte chunk with + * 0x7FFFFFFF | (invalid_ownership_value << 31). + * + * When max WR is less than or equal to the WQE size, + * as an optimization, we can stamp WQE with 0xffffffff, + * and skip the very first chunk of the WQE. */ -static void stamp_send_wqe(struct mlx4_ib_qp *qp, int n) +static void stamp_send_wqe(struct mlx4_ib_qp *qp, int n, int size) { - u32 *wqe = get_send_wqe(qp, n); + u32 *wqe; int i; + int s; + int ind; + void *buf; + __be32 stamp; + + s = roundup(size, 1 << qp->sq.wqe_shift); + if (qp->sq_max_wqes_per_wr > 1) { + for (i = 0; i < s; i += 64) { + ind = (i >> qp->sq.wqe_shift) + n; + stamp = ind & qp->sq.wqe_cnt ? cpu_to_be32(0x7fffffff) : + cpu_to_be32(0xffffffff); + buf = get_send_wqe(qp, ind & (qp->sq.wqe_cnt - 1)); + wqe = buf + (i & ((1 << qp->sq.wqe_shift) - 1)); + *wqe = stamp; + } + } else { + buf = get_send_wqe(qp, n & (qp->sq.wqe_cnt - 1)); + for (i = 64; i < s; i += 64) { + wqe = buf + i; + *wqe = 0xffffffff; + } + } +} + +static void post_nop_wqe(struct mlx4_ib_qp *qp, int n, int size) +{ + struct mlx4_wqe_ctrl_seg *ctrl; + struct mlx4_wqe_inline_seg *inl; + void *wqe; + int s; + + ctrl = wqe = get_send_wqe(qp, n & (qp->sq.wqe_cnt - 1)); + s = sizeof(struct mlx4_wqe_ctrl_seg); + + if (qp->ibqp.qp_type == IB_QPT_UD) { + struct mlx4_wqe_datagram_seg *dgram = wqe + sizeof *ctrl; + struct mlx4_av *av = (struct mlx4_av *)dgram->av; + memset(dgram, 0, sizeof *dgram); + av->port_pd = cpu_to_be32((qp->port << 24) | to_mpd(qp->ibqp.pd)->pdn); + s += sizeof(struct mlx4_wqe_datagram_seg); + } + + /* Pad the remainder of the WQE with an inline data segment.
*/ + if (size > s) { + inl = wqe + s; + inl->byte_count = cpu_to_be32(1 << 31 | (size - s - sizeof *inl)); + } + ctrl->srcrb_flags = 0; + ctrl->fence_size = size / 16; + /* + * Make sure descriptor is fully written before + * setting ownership bit (because HW can start + * executing as soon as we do). + */ + wmb(); - for (i = 16; i < 1 << (qp->sq.wqe_shift - 2); i += 16) - wqe[i] = 0xffffffff; + ctrl->owner_opcode = cpu_to_be32(MLX4_OPCODE_NOP | MLX4_WQE_CTRL_NEC) | + (n & qp->sq.wqe_cnt ? cpu_to_be32(1 << 31) : 0); + + stamp_send_wqe(qp, n + qp->sq_spare_wqes, size); +} + +/* Post NOP WQE to prevent wrap-around in the middle of WR */ +static inline unsigned pad_wraparound(struct mlx4_ib_qp *qp, int ind) +{ + unsigned s = qp->sq.wqe_cnt - (ind & (qp->sq.wqe_cnt - 1)); + if (unlikely(s < qp->sq_max_wqes_per_wr)) { + post_nop_wqe(qp, ind, s << qp->sq.wqe_shift); + ind += s; + } + return ind; } static void mlx4_ib_qp_event(struct mlx4_qp *qp, enum mlx4_event type) @@ -241,6 +314,8 @@ static int set_rq_size(struct mlx4_ib_de static int set_kernel_sq_size(struct mlx4_ib_dev *dev, struct ib_qp_cap *cap, enum ib_qp_type type, struct mlx4_ib_qp *qp) { + int s; + /* Sanity check SQ size before proceeding */ if (cap->max_send_wr > dev->dev->caps.max_wqes || cap->max_send_sge > dev->dev->caps.max_sq_sg || @@ -256,20 +331,69 @@ static int set_kernel_sq_size(struct mlx cap->max_send_sge + 2 > dev->dev->caps.max_sq_sg) return -EINVAL; - qp->sq.wqe_shift = ilog2(roundup_pow_of_two(max(cap->max_send_sge * - sizeof (struct mlx4_wqe_data_seg), - cap->max_inline_data + - sizeof (struct mlx4_wqe_inline_seg)) + - send_wqe_overhead(type))); - qp->sq.max_gs = ((1 << qp->sq.wqe_shift) - send_wqe_overhead(type)) / - sizeof (struct mlx4_wqe_data_seg); + s = max(cap->max_send_sge * sizeof (struct mlx4_wqe_data_seg), + cap->max_inline_data + sizeof (struct mlx4_wqe_inline_seg)) + + send_wqe_overhead(type); /* - * We need to leave 2 KB + 1 WQE of headroom in the SQ to - * allow HW to prefetch. 
+ * Hermon supports shrinking wqe, such that a single WR can include + * multiple units of wqe_shift. This way, WRs can differ in size, and + * do not have to be a power of 2 in size, saving memory and speeding up + * send WR posting. Unfortunately, if we do this, the wqe_index field in CQE + * can't be used to look up the WR ID anymore, so do this only if + * selective signalling is off. + * + * Further, on 32-bit platforms, we can't use vmap to make + * the QP buffer virtually contiguous. Thus we have to use + * constant-sized WRs to make sure a WR is always fully within + * a single page-sized chunk. + * + * Finally, we use NOP opcode to avoid wrap-around in the middle of WR. + * We set NEC bit to avoid getting completions with error for NOP WRs. + * Since NEC is only supported starting with firmware 2.2.232, + * we use constant-sized WRs for older firmware. + * + * And, since MLX QPs only support SEND, we use constant-sized WRs in this + * case. + * + * We look for the smallest value of wqe_shift such that the resulting + * number of wqes does not exceed device capabilities. + * + * We set WQE size to at least 64 bytes, this way stamping invalidates each WQE. */ - qp->sq_spare_wqes = (2048 >> qp->sq.wqe_shift) + 1; - qp->sq.wqe_cnt = roundup_pow_of_two(cap->max_send_wr + qp->sq_spare_wqes); + if (dev->dev->caps.fw_ver >= MLX4_FW_VER_WQE_CTRL_NEC && + qp->sq_signal_bits && BITS_PER_LONG == 64 && + type != IB_QPT_SMI && type != IB_QPT_GSI) + qp->sq.wqe_shift = ilog2(64); + else + qp->sq.wqe_shift = ilog2(roundup_pow_of_two(s)); + + for (;;) { + if (1 << qp->sq.wqe_shift > dev->dev->caps.max_sq_desc_sz) + return -EINVAL; + + qp->sq_max_wqes_per_wr = DIV_ROUND_UP(s, 1 << qp->sq.wqe_shift); + + /* + * We need to leave 2 KB + 1 WR of headroom in the SQ to + * allow HW to prefetch.
+ */ + qp->sq_spare_wqes = (2048 >> qp->sq.wqe_shift) + qp->sq_max_wqes_per_wr; + qp->sq.wqe_cnt = roundup_pow_of_two(cap->max_send_wr * + qp->sq_max_wqes_per_wr + + qp->sq_spare_wqes); + + if (qp->sq.wqe_cnt <= dev->dev->caps.max_wqes) + break; + + if (qp->sq_max_wqes_per_wr <= 1) + return -EINVAL; + + ++qp->sq.wqe_shift; + } + + qp->sq.max_gs = ((qp->sq_max_wqes_per_wr << qp->sq.wqe_shift) - + send_wqe_overhead(type)) / sizeof (struct mlx4_wqe_data_seg); qp->buf_size = (qp->rq.wqe_cnt << qp->rq.wqe_shift) + (qp->sq.wqe_cnt << qp->sq.wqe_shift); @@ -281,7 +405,8 @@ static int set_kernel_sq_size(struct mlx qp->sq.offset = 0; } - cap->max_send_wr = qp->sq.max_post = qp->sq.wqe_cnt - qp->sq_spare_wqes; + cap->max_send_wr = qp->sq.max_post = + (qp->sq.wqe_cnt - qp->sq_spare_wqes) / qp->sq_max_wqes_per_wr; cap->max_send_sge = qp->sq.max_gs; /* We don't support inline sends for kernel QPs (yet) */ cap->max_inline_data = 0; @@ -327,6 +452,12 @@ static int create_qp_common(struct mlx4_ qp->rq.tail = 0; qp->sq.head = 0; qp->sq.tail = 0; + qp->sq_next_wqe = 0; + + if (init_attr->sq_sig_type == IB_SIGNAL_ALL_WR) + qp->sq_signal_bits = cpu_to_be32(MLX4_WQE_CTRL_CQ_UPDATE); + else + qp->sq_signal_bits = 0; err = set_rq_size(dev, &init_attr->cap, !!pd->uobject, !!init_attr->srq, qp); if (err) @@ -417,11 +548,6 @@ static int create_qp_common(struct mlx4_ */ qp->doorbell_qpn = swab32(qp->mqp.qpn << 8); - if (init_attr->sq_sig_type == IB_SIGNAL_ALL_WR) - qp->sq_signal_bits = cpu_to_be32(MLX4_WQE_CTRL_CQ_UPDATE); - else - qp->sq_signal_bits = 0; - qp->mqp.event = mlx4_ib_qp_event; return 0; @@ -916,7 +1042,7 @@ static int __mlx4_ib_modify_qp(struct ib ctrl = get_send_wqe(qp, i); ctrl->owner_opcode = cpu_to_be32(1 << 31); - stamp_send_wqe(qp, i); + stamp_send_wqe(qp, i, 1 << qp->sq.wqe_shift); } } @@ -969,6 +1095,7 @@ static int __mlx4_ib_modify_qp(struct ib qp->rq.tail = 0; qp->sq.head = 0; qp->sq.tail = 0; + qp->sq_next_wqe = 0; if (!ibqp->srq) *qp->db.db = 0; } @@ -1278,13 
+1405,14 @@ int mlx4_ib_post_send(struct ib_qp *ibqp unsigned long flags; int nreq; int err = 0; - int ind; - int size; + unsigned ind; + int uninitialized_var(stamp); + int uninitialized_var(size); int i; spin_lock_irqsave(&qp->sq.lock, flags); - ind = qp->sq.head; + ind = qp->sq_next_wqe; for (nreq = 0; wr; ++nreq, wr = wr->next) { if (mlx4_wq_overflow(&qp->sq, nreq, qp->ibqp.send_cq)) { @@ -1300,7 +1428,7 @@ int mlx4_ib_post_send(struct ib_qp *ibqp } ctrl = wqe = get_send_wqe(qp, ind & (qp->sq.wqe_cnt - 1)); - qp->sq.wrid[ind & (qp->sq.wqe_cnt - 1)] = wr->wr_id; + qp->sq.wrid[(qp->sq.head + nreq) & (qp->sq.wqe_cnt - 1)] = wr->wr_id; ctrl->srcrb_flags = (wr->send_flags & IB_SEND_SIGNALED ? @@ -1413,16 +1541,23 @@ int mlx4_ib_post_send(struct ib_qp *ibqp ctrl->owner_opcode = mlx4_ib_opcode[wr->opcode] | (ind & qp->sq.wqe_cnt ? cpu_to_be32(1 << 31) : 0); + stamp = ind + qp->sq_spare_wqes; + ind += DIV_ROUND_UP(size * 16, 1 << qp->sq.wqe_shift); + /* * We can improve latency by not stamping the last * send queue WQE until after ringing the doorbell, so * only stamp here if there are still more WQEs to post. + * + * Same optimization applies to padding with NOP wqe + * in case of WQE shrinking (used to prevent wrap-around + * in the middle of WR). 
*/ - if (wr->next) - stamp_send_wqe(qp, (ind + qp->sq_spare_wqes) & - (qp->sq.wqe_cnt - 1)); + if (wr->next) { + stamp_send_wqe(qp, stamp, size * 16); + ind = pad_wraparound(qp, ind); + } - ++ind; } out: @@ -1444,8 +1579,10 @@ out: */ mmiowb(); - stamp_send_wqe(qp, (ind + qp->sq_spare_wqes - 1) & - (qp->sq.wqe_cnt - 1)); + stamp_send_wqe(qp, stamp, size * 16); + + ind = pad_wraparound(qp, ind); + qp->sq_next_wqe = ind; } spin_unlock_irqrestore(&qp->sq.lock, flags); Index: infiniband/include/linux/mlx4/device.h =================================================================== --- infiniband.orig/include/linux/mlx4/device.h 2008-01-27 10:52:21.000000000 +0200 +++ infiniband/include/linux/mlx4/device.h 2008-01-27 11:43:44.000000000 +0200 @@ -133,6 +133,11 @@ enum { MLX4_STAT_RATE_OFFSET = 5 }; +static inline u64 mlx4_fw_ver(u64 major, u64 minor, u64 subminor) +{ + return (major << 32) | (minor << 16) | subminor; +} + struct mlx4_caps { u64 fw_ver; int num_ports; Index: infiniband/include/linux/mlx4/qp.h =================================================================== --- infiniband.orig/include/linux/mlx4/qp.h 2008-01-27 10:44:25.000000000 +0200 +++ infiniband/include/linux/mlx4/qp.h 2008-01-27 11:43:44.000000000 +0200 @@ -154,7 +154,11 @@ struct mlx4_qp_context { u32 reserved5[10]; }; +/* Which firmware version adds support for NEC (NoErrorCompletion) bit */ +#define MLX4_FW_VER_WQE_CTRL_NEC mlx4_fw_ver(2, 2, 232) + enum { + MLX4_WQE_CTRL_NEC = 1 << 29, MLX4_WQE_CTRL_FENCE = 1 << 6, MLX4_WQE_CTRL_CQ_UPDATE = 3 << 2, MLX4_WQE_CTRL_SOLICITED = 1 << 1, From dwsolom at solo.cz Mon Jan 28 02:25:30 2008 From: dwsolom at solo.cz (Selma Whittaker) Date: Mon, 28
Jan 2008 13:25:30 +0300 Subject: [ofa-general] Medications that you need. Message-ID: <01c861b1$408ac100$92b79358@dwsolom> Buy Must Have medications at Canada based pharmacy. No prescription at all! Same quality! Save your money, buy pills immediately! http://geocities.com/andreamorse336 We provide confidential and secure purchase! From jackm at dev.mellanox.co.il Mon Jan 28 02:36:28 2008 From: jackm at dev.mellanox.co.il (Jack Morgenstein) Date: Mon, 28 Jan 2008 12:36:28 +0200 Subject: [ofa-general] [PATCH 7/8 V2] core: Add XRC receive-only qp support Message-ID: <200801281236.28417.jackm@dev.mellanox.co.il> ib/core: Implement XRC receive-only QPs for userspace apps. Added creation of XRC receive-only QPs for userspace, which reside in kernel space (user cannot post-to or poll these QPs). Motivation: MPI community requires XRC receive QPs which will not be destroyed when the creating process terminates. Solution: Userspace requests that a QP be created in kernel space. Each userspace process using that QP (i.e. receiving packets on an XRC SRQ via the qp), registers with that QP (-- the creator is also registered, whether or not it is a user of the QP). When the last userspace user unregisters with the QP, it is destroyed. Unregistration is also part of userspace process cleanup, so there is no leakage. This patch implements the kernel procedures to implement the following (new) libibverbs API: ibv_create_xrc_rcv_qp ibv_modify_xrc_rcv_qp ibv_query_xrc_rcv_qp ibv_reg_xrc_rcv_qp ibv_unreg_xrc_rcv_qp In addition, the patch implements the foundation for distributing XRC-receive-only QP events to userspace processes registered with that QP. Finally, the patch modifies ib_uverbs_close_xrc_domain() to return BUSY if any resources are still in use by the process, so that the XRC rcv-only QP cleanup can operate properly. V2: Fixed bug in ib_uverbs_close_xrc_domain. 
We need to allow the process to successfully close its copy of the domain, even if it still has undestroyed XRC QPs -- these will continue to operate, although it will not be possible to create new ones (there will be no Oops). However, we need to check that there are no outstanding xrc-qp-registrations: the cleanup procedure for this depends on the xrc domain still being accessible in this process in order to perform all needed un-registrations (and thus prevent resource leakage). Signed-off-by: Jack Morgenstein Index: infiniband/include/rdma/ib_verbs.h =================================================================== --- infiniband.orig/include/rdma/ib_verbs.h 2008-01-28 12:20:55.000000000 +0200 +++ infiniband/include/rdma/ib_verbs.h 2008-01-28 12:22:09.000000000 +0200 @@ -285,6 +285,10 @@ enum ib_event_type { IB_EVENT_CLIENT_REREGISTER }; +enum ib_event_flags { + IB_XRC_QP_EVENT_FLAG = 0x80000000, +}; + struct ib_event { struct ib_device *device; union { @@ -292,6 +296,7 @@ struct ib_event { struct ib_qp *qp; struct ib_srq *srq; u8 port_num; + u32 xrc_qp_num; } element; enum ib_event_type event; }; @@ -492,6 +497,7 @@ enum ib_qp_type { enum qp_create_flags { QP_CREATE_LSO = 1 << 0, + XRC_RCV_QP = 1 << 1, }; struct ib_qp_init_attr { @@ -723,6 +729,7 @@ struct ib_ucontext { struct list_head srq_list; struct list_head ah_list; struct list_head xrc_domain_list; + struct list_head xrc_reg_qp_list; int closing; }; @@ -744,6 +751,12 @@ struct ib_udata { size_t outlen; }; +struct ib_uxrc_rcv_object { + struct list_head list; /* link to context's list */ + u32 qp_num; + u32 domain_handle; +}; + struct ib_pd { struct ib_device *device; struct ib_uobject *uobject; @@ -1053,6 +1066,23 @@ struct ib_device { struct ib_ucontext *context, struct ib_udata *udata); int (*dealloc_xrcd)(struct ib_xrcd *xrcd); + int (*create_xrc_rcv_qp)(struct ib_qp_init_attr *init_attr, + u32* qp_num); + int (*modify_xrc_rcv_qp)(struct ib_xrcd *xrcd, + u32 qp_num, + struct ib_qp_attr *attr, + int 
attr_mask); + int (*query_xrc_rcv_qp)(struct ib_xrcd *xrcd, + u32 qp_num, + struct ib_qp_attr *attr, + int attr_mask, + struct ib_qp_init_attr *init_attr); + int (*reg_xrc_rcv_qp)(struct ib_xrcd *xrcd, + void *context, + u32 qp_num); + int (*unreg_xrc_rcv_qp)(struct ib_xrcd *xrcd, + void *context, + u32 qp_num); struct ib_dma_mapping_ops *dma_ops; Index: infiniband/drivers/infiniband/core/uverbs_main.c =================================================================== --- infiniband.orig/drivers/infiniband/core/uverbs_main.c 2008-01-28 12:20:55.000000000 +0200 +++ infiniband/drivers/infiniband/core/uverbs_main.c 2008-01-28 12:20:56.000000000 +0200 @@ -114,6 +114,11 @@ static ssize_t (*uverbs_cmd_table[])(str [IB_USER_VERBS_CMD_CREATE_XRC_SRQ] = ib_uverbs_create_xrc_srq, [IB_USER_VERBS_CMD_OPEN_XRC_DOMAIN] = ib_uverbs_open_xrc_domain, [IB_USER_VERBS_CMD_CLOSE_XRC_DOMAIN] = ib_uverbs_close_xrc_domain, + [IB_USER_VERBS_CMD_CREATE_XRC_RCV_QP] = ib_uverbs_create_xrc_rcv_qp, + [IB_USER_VERBS_CMD_MODIFY_XRC_RCV_QP] = ib_uverbs_modify_xrc_rcv_qp, + [IB_USER_VERBS_CMD_QUERY_XRC_RCV_QP] = ib_uverbs_query_xrc_rcv_qp, + [IB_USER_VERBS_CMD_REG_XRC_RCV_QP] = ib_uverbs_reg_xrc_rcv_qp, + [IB_USER_VERBS_CMD_UNREG_XRC_RCV_QP] = ib_uverbs_unreg_xrc_rcv_qp, }; static struct vfsmount *uverbs_event_mnt; @@ -191,6 +196,7 @@ static int ib_uverbs_cleanup_ucontext(st struct ib_ucontext *context) { struct ib_uobject *uobj, *tmp; + struct ib_uxrc_rcv_object *xrc_qp_obj, *tmp1; if (!context) return 0; @@ -251,6 +257,13 @@ static int ib_uverbs_cleanup_ucontext(st kfree(uobj); } + list_for_each_entry_safe(xrc_qp_obj, tmp1, &context->xrc_reg_qp_list, list) { + list_del(&xrc_qp_obj->list); + ib_uverbs_cleanup_xrc_rcv_qp(file, xrc_qp_obj->domain_handle, + xrc_qp_obj->qp_num); + kfree(xrc_qp_obj); + } + mutex_lock(&file->device->ib_dev->xrcd_table_mutex); list_for_each_entry_safe(uobj, tmp, &context->xrc_domain_list, list) { struct ib_xrcd *xrcd = uobj->object; @@ -506,6 +519,12 @@ void 
ib_uverbs_event_handler(struct ib_e NULL, NULL); } +void ib_uverbs_xrc_rcv_qp_event_handler(struct ib_event *event, void *context_ptr) +{ + ib_uverbs_async_handler(context_ptr, event->element.xrc_qp_num, + event->event, NULL, NULL); +} + struct file *ib_uverbs_alloc_event_file(struct ib_uverbs_file *uverbs_file, int is_async, int *fd) { Index: infiniband/drivers/infiniband/core/uverbs_cmd.c =================================================================== --- infiniband.orig/drivers/infiniband/core/uverbs_cmd.c 2008-01-28 12:20:55.000000000 +0200 +++ infiniband/drivers/infiniband/core/uverbs_cmd.c 2008-01-28 12:20:56.000000000 +0200 @@ -315,6 +315,7 @@ ssize_t ib_uverbs_get_context(struct ib_ INIT_LIST_HEAD(&ucontext->srq_list); INIT_LIST_HEAD(&ucontext->ah_list); INIT_LIST_HEAD(&ucontext->xrc_domain_list); + INIT_LIST_HEAD(&ucontext->xrc_reg_qp_list); ucontext->closing = 0; resp.num_comp_vectors = file->device->num_comp_vectors; @@ -1080,6 +1081,7 @@ ssize_t ib_uverbs_create_qp(struct ib_uv goto err_put; } + attr.create_flags = 0; attr.event_handler = ib_uverbs_qp_event_handler; attr.qp_context = file; attr.send_cq = scq; @@ -2561,6 +2563,7 @@ ssize_t ib_uverbs_close_xrc_domain(struc int out_len) { struct ib_uverbs_close_xrc_domain cmd; + struct ib_uxrc_rcv_object *tmp; struct ib_uobject *uobj; struct ib_xrcd *xrcd = NULL; struct inode *inode = NULL; @@ -2576,6 +2579,18 @@ ssize_t ib_uverbs_close_xrc_domain(struc goto err_unlock_mutex; } + mutex_lock(&file->mutex); + list_for_each_entry(tmp, &file->ucontext->xrc_reg_qp_list, list) + if (cmd.xrcd_handle == tmp->domain_handle) { + ret = -EBUSY; + break; + } + mutex_unlock(&file->mutex); + if (ret) { + put_uobj_write(uobj); + goto err_unlock_mutex; + } + xrcd = (struct ib_xrcd *) (uobj->object); inode = xrcd->inode; @@ -2611,7 +2626,7 @@ err_unlock_mutex: } void ib_uverbs_dealloc_xrcd(struct ib_device *ib_dev, - struct ib_xrcd *xrcd) + struct ib_xrcd *xrcd) { struct inode *inode = NULL; int ret = 0; @@ -2625,4 
+2640,353 @@ void ib_uverbs_dealloc_xrcd(struct ib_de xrcd_table_delete(ib_dev, inode); } +ssize_t ib_uverbs_create_xrc_rcv_qp(struct ib_uverbs_file *file, + const char __user *buf, int in_len, + int out_len) +{ + struct ib_uverbs_create_xrc_rcv_qp cmd; + struct ib_uverbs_create_xrc_rcv_qp_resp resp; + struct ib_uxrc_rcv_object *obj; + struct ib_qp_init_attr init_attr; + struct ib_xrcd *xrcd; + struct ib_uobject *xrcd_uobj; + u32 qp_num; + int err; + + if (out_len < sizeof resp) + return -ENOSPC; + + if (copy_from_user(&cmd, buf, sizeof cmd)) + return -EFAULT; + + obj = kmalloc(sizeof *obj, GFP_KERNEL); + if (!obj) + return -ENOMEM; + + xrcd = idr_read_xrcd(cmd.xrc_domain_handle, file->ucontext, &xrcd_uobj); + if (!xrcd) { + err = -EINVAL; + goto err_out; + } + + memset(&init_attr, 0, sizeof init_attr); + init_attr.event_handler = ib_uverbs_xrc_rcv_qp_event_handler; + init_attr.qp_context = file; + init_attr.srq = NULL; + init_attr.sq_sig_type = cmd.sq_sig_all ? IB_SIGNAL_ALL_WR : IB_SIGNAL_REQ_WR; + init_attr.qp_type = IB_QPT_XRC; + init_attr.xrc_domain = xrcd; + init_attr.create_flags = XRC_RCV_QP; + + init_attr.cap.max_send_wr = 1; + init_attr.cap.max_recv_wr = 0; + init_attr.cap.max_send_sge = 1; + init_attr.cap.max_recv_sge = 0; + init_attr.cap.max_inline_data = 0; + + err = xrcd->device->create_xrc_rcv_qp(&init_attr, &qp_num); + if (err) + goto err_put; + + memset(&resp, 0, sizeof resp); + resp.qpn = qp_num; + + if (copy_to_user((void __user *) (unsigned long) cmd.response, + &resp, sizeof resp)) { + err = -EFAULT; + goto err_destroy; + } + + atomic_inc(&xrcd->usecnt); + put_xrcd_read(xrcd_uobj); + obj->qp_num = qp_num; + obj->domain_handle = cmd.xrc_domain_handle; + mutex_lock(&file->mutex); + list_add_tail(&obj->list, &file->ucontext->xrc_reg_qp_list); + mutex_unlock(&file->mutex); + + return in_len; + +err_destroy: + xrcd->device->unreg_xrc_rcv_qp(xrcd, file, qp_num); +err_put: + put_xrcd_read(xrcd_uobj); +err_out: + kfree(obj); + return err; +} + +ssize_t 
ib_uverbs_modify_xrc_rcv_qp(struct ib_uverbs_file *file, + const char __user *buf, int in_len, + int out_len) +{ + struct ib_uverbs_modify_xrc_rcv_qp cmd; + struct ib_qp_attr *attr; + struct ib_xrcd *xrcd; + struct ib_uobject *xrcd_uobj; + int err; + + if (copy_from_user(&cmd, buf, sizeof cmd)) + return -EFAULT; + + attr = kmalloc(sizeof *attr, GFP_KERNEL); + if (!attr) + return -ENOMEM; + + xrcd = idr_read_xrcd(cmd.xrc_domain_handle, file->ucontext, &xrcd_uobj); + if (!xrcd) { + kfree(attr); + return -EINVAL; + } + + memset(attr, 0, sizeof *attr); + attr->qp_state = cmd.qp_state; + attr->cur_qp_state = cmd.cur_qp_state; + attr->qp_access_flags = cmd.qp_access_flags; + attr->pkey_index = cmd.pkey_index; + attr->port_num = cmd.port_num; + attr->path_mtu = cmd.path_mtu; + attr->path_mig_state = cmd.path_mig_state; + attr->qkey = cmd.qkey; + attr->rq_psn = cmd.rq_psn; + attr->sq_psn = cmd.sq_psn; + attr->dest_qp_num = cmd.dest_qp_num; + attr->alt_pkey_index = cmd.alt_pkey_index; + attr->en_sqd_async_notify = cmd.en_sqd_async_notify; + attr->max_rd_atomic = cmd.max_rd_atomic; + attr->max_dest_rd_atomic = cmd.max_dest_rd_atomic; + attr->min_rnr_timer = cmd.min_rnr_timer; + attr->port_num = cmd.port_num; + attr->timeout = cmd.timeout; + attr->retry_cnt = cmd.retry_cnt; + attr->rnr_retry = cmd.rnr_retry; + attr->alt_port_num = cmd.alt_port_num; + attr->alt_timeout = cmd.alt_timeout; + + memcpy(attr->ah_attr.grh.dgid.raw, cmd.dest.dgid, 16); + attr->ah_attr.grh.flow_label = cmd.dest.flow_label; + attr->ah_attr.grh.sgid_index = cmd.dest.sgid_index; + attr->ah_attr.grh.hop_limit = cmd.dest.hop_limit; + attr->ah_attr.grh.traffic_class = cmd.dest.traffic_class; + attr->ah_attr.dlid = cmd.dest.dlid; + attr->ah_attr.sl = cmd.dest.sl; + attr->ah_attr.src_path_bits = cmd.dest.src_path_bits; + attr->ah_attr.static_rate = cmd.dest.static_rate; + attr->ah_attr.ah_flags = cmd.dest.is_global ? 
IB_AH_GRH : 0; + attr->ah_attr.port_num = cmd.dest.port_num; + + memcpy(attr->alt_ah_attr.grh.dgid.raw, cmd.alt_dest.dgid, 16); + attr->alt_ah_attr.grh.flow_label = cmd.alt_dest.flow_label; + attr->alt_ah_attr.grh.sgid_index = cmd.alt_dest.sgid_index; + attr->alt_ah_attr.grh.hop_limit = cmd.alt_dest.hop_limit; + attr->alt_ah_attr.grh.traffic_class = cmd.alt_dest.traffic_class; + attr->alt_ah_attr.dlid = cmd.alt_dest.dlid; + attr->alt_ah_attr.sl = cmd.alt_dest.sl; + attr->alt_ah_attr.src_path_bits = cmd.alt_dest.src_path_bits; + attr->alt_ah_attr.static_rate = cmd.alt_dest.static_rate; + attr->alt_ah_attr.ah_flags = cmd.alt_dest.is_global ? IB_AH_GRH : 0; + attr->alt_ah_attr.port_num = cmd.alt_dest.port_num; + + err = xrcd->device->modify_xrc_rcv_qp(xrcd, cmd.qp_num, attr, cmd.attr_mask); + put_xrcd_read(xrcd_uobj); + kfree(attr); + return err; +} + +ssize_t ib_uverbs_query_xrc_rcv_qp(struct ib_uverbs_file *file, + const char __user *buf, int in_len, + int out_len) +{ + struct ib_uverbs_query_xrc_rcv_qp cmd; + struct ib_uverbs_query_qp_resp resp; + struct ib_qp_attr *attr; + struct ib_qp_init_attr *init_attr; + struct ib_xrcd *xrcd; + struct ib_uobject *xrcd_uobj; + int ret; + + if (copy_from_user(&cmd, buf, sizeof cmd)) + return -EFAULT; + + attr = kmalloc(sizeof *attr, GFP_KERNEL); + init_attr = kmalloc(sizeof *init_attr, GFP_KERNEL); + if (!attr || !init_attr) { + ret = -ENOMEM; + goto out; + } + + xrcd = idr_read_xrcd(cmd.xrc_domain_handle, file->ucontext, &xrcd_uobj); + if (!xrcd) { + ret = -EINVAL; + goto out; + } + + ret = xrcd->device->query_xrc_rcv_qp(xrcd, cmd.qp_num, attr, + cmd.attr_mask, init_attr); + + put_xrcd_read(xrcd_uobj); + + if (ret) + goto out; + + memset(&resp, 0, sizeof resp); + resp.qp_state = attr->qp_state; + resp.cur_qp_state = attr->cur_qp_state; + resp.path_mtu = attr->path_mtu; + resp.path_mig_state = attr->path_mig_state; + resp.qkey = attr->qkey; + resp.rq_psn = attr->rq_psn; + resp.sq_psn = attr->sq_psn; + resp.dest_qp_num = 
attr->dest_qp_num; + resp.qp_access_flags = attr->qp_access_flags; + resp.pkey_index = attr->pkey_index; + resp.alt_pkey_index = attr->alt_pkey_index; + resp.sq_draining = attr->sq_draining; + resp.max_rd_atomic = attr->max_rd_atomic; + resp.max_dest_rd_atomic = attr->max_dest_rd_atomic; + resp.min_rnr_timer = attr->min_rnr_timer; + resp.port_num = attr->port_num; + resp.timeout = attr->timeout; + resp.retry_cnt = attr->retry_cnt; + resp.rnr_retry = attr->rnr_retry; + resp.alt_port_num = attr->alt_port_num; + resp.alt_timeout = attr->alt_timeout; + + memcpy(resp.dest.dgid, attr->ah_attr.grh.dgid.raw, 16); + resp.dest.flow_label = attr->ah_attr.grh.flow_label; + resp.dest.sgid_index = attr->ah_attr.grh.sgid_index; + resp.dest.hop_limit = attr->ah_attr.grh.hop_limit; + resp.dest.traffic_class = attr->ah_attr.grh.traffic_class; + resp.dest.dlid = attr->ah_attr.dlid; + resp.dest.sl = attr->ah_attr.sl; + resp.dest.src_path_bits = attr->ah_attr.src_path_bits; + resp.dest.static_rate = attr->ah_attr.static_rate; + resp.dest.is_global = !!(attr->ah_attr.ah_flags & IB_AH_GRH); + resp.dest.port_num = attr->ah_attr.port_num; + + memcpy(resp.alt_dest.dgid, attr->alt_ah_attr.grh.dgid.raw, 16); + resp.alt_dest.flow_label = attr->alt_ah_attr.grh.flow_label; + resp.alt_dest.sgid_index = attr->alt_ah_attr.grh.sgid_index; + resp.alt_dest.hop_limit = attr->alt_ah_attr.grh.hop_limit; + resp.alt_dest.traffic_class = attr->alt_ah_attr.grh.traffic_class; + resp.alt_dest.dlid = attr->alt_ah_attr.dlid; + resp.alt_dest.sl = attr->alt_ah_attr.sl; + resp.alt_dest.src_path_bits = attr->alt_ah_attr.src_path_bits; + resp.alt_dest.static_rate = attr->alt_ah_attr.static_rate; + resp.alt_dest.is_global = !!(attr->alt_ah_attr.ah_flags & IB_AH_GRH); + resp.alt_dest.port_num = attr->alt_ah_attr.port_num; + + resp.max_send_wr = init_attr->cap.max_send_wr; + resp.max_recv_wr = init_attr->cap.max_recv_wr; + resp.max_send_sge = init_attr->cap.max_send_sge; + resp.max_recv_sge = 
init_attr->cap.max_recv_sge; + resp.max_inline_data = init_attr->cap.max_inline_data; + resp.sq_sig_all = init_attr->sq_sig_type == IB_SIGNAL_ALL_WR; + + if (copy_to_user((void __user *) (unsigned long) cmd.response, + &resp, sizeof resp)) + ret = -EFAULT; + +out: + kfree(attr); + kfree(init_attr); + + return ret ? ret : in_len; +} + +ssize_t ib_uverbs_reg_xrc_rcv_qp(struct ib_uverbs_file *file, + const char __user *buf, int in_len, + int out_len) +{ + struct ib_uverbs_reg_xrc_rcv_qp cmd; + struct ib_uxrc_rcv_object *qp_obj, *tmp; + struct ib_xrcd *xrcd; + struct ib_uobject *xrcd_uobj; + int ret; + + if (copy_from_user(&cmd, buf, sizeof cmd)) + return -EFAULT; + + qp_obj = kmalloc(sizeof *qp_obj, GFP_KERNEL); + if (!qp_obj) + return -ENOMEM; + + xrcd = idr_read_xrcd(cmd.xrc_domain_handle, file->ucontext, &xrcd_uobj); + if (!xrcd) { + ret = -EINVAL; + goto err_out; + } + + ret = xrcd->device->reg_xrc_rcv_qp(xrcd, file, cmd.qp_num); + if (ret) + goto err_put; + + atomic_inc(&xrcd->usecnt); + put_xrcd_read(xrcd_uobj); + mutex_lock(&file->mutex); + list_for_each_entry(tmp, &file->ucontext->xrc_reg_qp_list, list) + if (cmd.qp_num == tmp->qp_num) { + kfree(qp_obj); + mutex_unlock(&file->mutex); + put_xrcd_read(xrcd_uobj); + return 0; + } + qp_obj->qp_num = cmd.qp_num; + qp_obj->domain_handle = cmd.xrc_domain_handle; + list_add_tail(&qp_obj->list, &file->ucontext->xrc_reg_qp_list); + mutex_unlock(&file->mutex); + return 0; + +err_put: + put_xrcd_read(xrcd_uobj); +err_out: + + kfree(qp_obj); + return ret; +} + +int ib_uverbs_cleanup_xrc_rcv_qp(struct ib_uverbs_file *file, + u32 domain_handle, u32 qp_num) +{ + struct ib_xrcd *xrcd; + struct ib_uobject *xrcd_uobj; + int err; + + xrcd = idr_read_xrcd(domain_handle, file->ucontext, &xrcd_uobj); + if (!xrcd) + return -EINVAL; + err = xrcd->device->unreg_xrc_rcv_qp(xrcd, file, qp_num); + + if (!err) + atomic_dec(&xrcd->usecnt); + put_xrcd_read(xrcd_uobj); + return err; +} + +ssize_t ib_uverbs_unreg_xrc_rcv_qp(struct 
ib_uverbs_file *file, + const char __user *buf, int in_len, + int out_len) +{ + struct ib_uverbs_unreg_xrc_rcv_qp cmd; + struct ib_uxrc_rcv_object *qp_obj, *tmp; + int ret; + + if (copy_from_user(&cmd, buf, sizeof cmd)) + return -EFAULT; + + ret = ib_uverbs_cleanup_xrc_rcv_qp(file, cmd.xrc_domain_handle, cmd.qp_num); + if (ret) + return ret; + + mutex_lock(&file->mutex); + list_for_each_entry_safe(qp_obj, tmp, &file->ucontext->xrc_reg_qp_list, list) + if (cmd.qp_num == qp_obj->qp_num) { + list_del(&qp_obj->list); + kfree(qp_obj); + break; + } + mutex_unlock(&file->mutex); + return 0; + +} Index: infiniband/include/rdma/ib_user_verbs.h =================================================================== --- infiniband.orig/include/rdma/ib_user_verbs.h 2008-01-28 12:20:54.000000000 +0200 +++ infiniband/include/rdma/ib_user_verbs.h 2008-01-28 12:20:56.000000000 +0200 @@ -86,7 +86,12 @@ enum { IB_USER_VERBS_CMD_POST_SRQ_RECV, IB_USER_VERBS_CMD_CREATE_XRC_SRQ, IB_USER_VERBS_CMD_OPEN_XRC_DOMAIN, - IB_USER_VERBS_CMD_CLOSE_XRC_DOMAIN + IB_USER_VERBS_CMD_CLOSE_XRC_DOMAIN, + IB_USER_VERBS_CMD_CREATE_XRC_RCV_QP, + IB_USER_VERBS_CMD_MODIFY_XRC_RCV_QP, + IB_USER_VERBS_CMD_QUERY_XRC_RCV_QP, + IB_USER_VERBS_CMD_REG_XRC_RCV_QP, + IB_USER_VERBS_CMD_UNREG_XRC_RCV_QP, }; /* @@ -714,6 +719,76 @@ struct ib_uverbs_close_xrc_domain { __u64 driver_data[0]; }; +struct ib_uverbs_create_xrc_rcv_qp { + __u64 response; + __u64 user_handle; + __u32 xrc_domain_handle; + __u32 max_send_wr; + __u32 max_recv_wr; + __u32 max_send_sge; + __u32 max_recv_sge; + __u32 max_inline_data; + __u8 sq_sig_all; + __u8 qp_type; + __u8 reserved[2]; + __u64 driver_data[0]; +}; + +struct ib_uverbs_create_xrc_rcv_qp_resp { + __u32 qpn; + __u32 reserved; +}; + +struct ib_uverbs_modify_xrc_rcv_qp { + __u32 xrc_domain_handle; + __u32 qp_num; + struct ib_uverbs_qp_dest dest; + struct ib_uverbs_qp_dest alt_dest; + __u32 attr_mask; + __u32 qkey; + __u32 rq_psn; + __u32 sq_psn; + __u32 dest_qp_num; + __u32 qp_access_flags; 
+ __u16 pkey_index; + __u16 alt_pkey_index; + __u8 qp_state; + __u8 cur_qp_state; + __u8 path_mtu; + __u8 path_mig_state; + __u8 en_sqd_async_notify; + __u8 max_rd_atomic; + __u8 max_dest_rd_atomic; + __u8 min_rnr_timer; + __u8 port_num; + __u8 timeout; + __u8 retry_cnt; + __u8 rnr_retry; + __u8 alt_port_num; + __u8 alt_timeout; + __u8 reserved[2]; + __u64 driver_data[0]; +}; + +struct ib_uverbs_query_xrc_rcv_qp { + __u64 response; + __u32 xrc_domain_handle; + __u32 qp_num; + __u32 attr_mask; + __u64 driver_data[0]; +}; + +struct ib_uverbs_reg_xrc_rcv_qp { + __u32 xrc_domain_handle; + __u32 qp_num; + __u64 driver_data[0]; +}; + +struct ib_uverbs_unreg_xrc_rcv_qp { + __u32 xrc_domain_handle; + __u32 qp_num; + __u64 driver_data[0]; +}; #endif /* IB_USER_VERBS_H */ Index: infiniband/drivers/infiniband/core/uverbs.h =================================================================== --- infiniband.orig/drivers/infiniband/core/uverbs.h 2008-01-28 12:20:55.000000000 +0200 +++ infiniband/drivers/infiniband/core/uverbs.h 2008-01-28 12:20:56.000000000 +0200 @@ -163,8 +163,12 @@ void ib_uverbs_qp_event_handler(struct i void ib_uverbs_srq_event_handler(struct ib_event *event, void *context_ptr); void ib_uverbs_event_handler(struct ib_event_handler *handler, struct ib_event *event); +void ib_uverbs_xrc_rcv_qp_event_handler(struct ib_event *event, + void *context_ptr); void ib_uverbs_dealloc_xrcd(struct ib_device *ib_dev, struct ib_xrcd *xrcd); +int ib_uverbs_cleanup_xrc_rcv_qp(struct ib_uverbs_file *file, + u32 domain_handle, u32 qp_num); #define IB_UVERBS_DECLARE_CMD(name) \ ssize_t ib_uverbs_##name(struct ib_uverbs_file *file, \ @@ -202,6 +206,11 @@ IB_UVERBS_DECLARE_CMD(destroy_srq); IB_UVERBS_DECLARE_CMD(create_xrc_srq); IB_UVERBS_DECLARE_CMD(open_xrc_domain); IB_UVERBS_DECLARE_CMD(close_xrc_domain); +IB_UVERBS_DECLARE_CMD(create_xrc_rcv_qp); +IB_UVERBS_DECLARE_CMD(modify_xrc_rcv_qp); +IB_UVERBS_DECLARE_CMD(query_xrc_rcv_qp); +IB_UVERBS_DECLARE_CMD(reg_xrc_rcv_qp); 
+IB_UVERBS_DECLARE_CMD(unreg_xrc_rcv_qp); #endif /* UVERBS_H */ From jackm at dev.mellanox.co.il Mon Jan 28 02:36:36 2008 From: jackm at dev.mellanox.co.il (Jack Morgenstein) Date: Mon, 28 Jan 2008 12:36:36 +0200 Subject: [ofa-general] [PATCH 3/8 V2] mlx4: implement XRC qps (including XRC receive-only) Message-ID: <200801281236.36774.jackm@dev.mellanox.co.il> mlx4: Implements XRC support for userspace XRC QPs. Changes: Added support for XRC RCV-only QP (requested by userspace, but resides in kernel space). V2 changes: Added xrc_reg_mutex to the mlx4_ib_dev structure, since the qp mutex was not sufficient to protect against a reg_xrc/unreg_xrc race. Signed-off-by: Jack Morgenstein Index: infiniband/include/linux/mlx4/device.h =================================================================== --- infiniband.orig/include/linux/mlx4/device.h 2008-01-28 10:56:29.000000000 +0200 +++ infiniband/include/linux/mlx4/device.h 2008-01-28 12:12:55.000000000 +0200 @@ -56,6 +56,7 @@ enum { MLX4_DEV_CAP_FLAG_RC = 1 << 0, MLX4_DEV_CAP_FLAG_UC = 1 << 1, MLX4_DEV_CAP_FLAG_UD = 1 << 2, + MLX4_DEV_CAP_FLAG_XRC = 1 << 3, MLX4_DEV_CAP_FLAG_SRQ = 1 << 6, MLX4_DEV_CAP_FLAG_IPOIB_CSUM = 1 << 7, MLX4_DEV_CAP_FLAG_BAD_PKEY_CNTR = 1 << 8, @@ -176,6 +177,8 @@ struct mlx4_caps { int num_pds; int reserved_pds; int mtt_entry_sz; + int reserved_xrcds; + int max_xrcds; u32 max_msg_sz; u32 page_size_cap; u32 flags; @@ -312,6 +315,9 @@ void mlx4_buf_free(struct mlx4_dev *dev, int mlx4_pd_alloc(struct mlx4_dev *dev, u32 *pdn); void mlx4_pd_free(struct mlx4_dev *dev, u32 pdn); +int mlx4_xrcd_alloc(struct mlx4_dev *dev, u32 *xrcdn); +void mlx4_xrcd_free(struct mlx4_dev *dev, u32 xrcdn); + int mlx4_uar_alloc(struct mlx4_dev *dev, struct mlx4_uar *uar); void mlx4_uar_free(struct mlx4_dev *dev, struct mlx4_uar *uar); @@ -336,8 +342,8 @@ void mlx4_cq_free(struct mlx4_dev *dev, int mlx4_qp_alloc(struct mlx4_dev *dev, int sqpn, struct mlx4_qp *qp); void mlx4_qp_free(struct mlx4_dev *dev, struct mlx4_qp *qp); 
-int mlx4_srq_alloc(struct mlx4_dev *dev, u32 pdn, struct mlx4_mtt *mtt, - u64 db_rec, struct mlx4_srq *srq); +int mlx4_srq_alloc(struct mlx4_dev *dev, u32 pdn, u32 cqn, u16 xrcd, + struct mlx4_mtt *mtt, u64 db_rec, struct mlx4_srq *srq); void mlx4_srq_free(struct mlx4_dev *dev, struct mlx4_srq *srq); int mlx4_srq_arm(struct mlx4_dev *dev, struct mlx4_srq *srq, int limit_watermark); int mlx4_srq_query(struct mlx4_dev *dev, struct mlx4_srq *srq, int *limit_watermark); Index: infiniband/drivers/infiniband/hw/mlx4/main.c =================================================================== --- infiniband.orig/drivers/infiniband/hw/mlx4/main.c 2008-01-27 10:44:25.000000000 +0200 +++ infiniband/drivers/infiniband/hw/mlx4/main.c 2008-01-28 11:39:27.000000000 +0200 @@ -99,6 +99,8 @@ static int mlx4_ib_query_device(struct i props->device_cap_flags |= IB_DEVICE_AUTO_PATH_MIG; if (dev->dev->caps.flags & MLX4_DEV_CAP_FLAG_UD_AV_PORT) props->device_cap_flags |= IB_DEVICE_UD_AV_PORT_ENFORCE; + if (dev->dev->caps.flags & MLX4_DEV_CAP_FLAG_XRC) + props->device_cap_flags |= IB_DEVICE_XRC; props->vendor_id = be32_to_cpup((__be32 *) (out_mad->data + 36)) & 0xffffff; @@ -406,6 +408,7 @@ static struct ib_pd *mlx4_ib_alloc_pd(st if (!pd) return ERR_PTR(-ENOMEM); + memset(pd, 0, sizeof *pd); err = mlx4_pd_alloc(to_mdev(ibdev)->dev, &pd->pdn); if (err) { kfree(pd); @@ -442,6 +445,80 @@ static int mlx4_ib_mcg_detach(struct ib_ &to_mqp(ibqp)->mqp, gid->raw); } +static void mlx4_dummy_comp_handler(struct ib_cq *cq, void *cq_context) +{ +} + +static struct ib_xrcd *mlx4_ib_alloc_xrcd(struct ib_device *ibdev, + struct ib_ucontext *context, + struct ib_udata *udata) +{ + struct mlx4_ib_xrcd *xrcd; + struct mlx4_ib_dev *mdev = to_mdev(ibdev); + struct ib_pd *pd; + struct ib_cq *cq; + int err; + + if (!(mdev->dev->caps.flags & MLX4_DEV_CAP_FLAG_XRC)) + return ERR_PTR(-ENOSYS); + + xrcd = kmalloc(sizeof *xrcd, GFP_KERNEL); + if (!xrcd) + return ERR_PTR(-ENOMEM); + + err = mlx4_xrcd_alloc(mdev->dev, 
&xrcd->xrcdn); + if (err) + goto err_xrcd; + + pd = mlx4_ib_alloc_pd(ibdev,NULL,NULL); + if (IS_ERR(pd)) { + err = PTR_ERR(pd); + goto err_pd; + } + pd->device = ibdev; + + cq = mlx4_ib_create_cq(ibdev, 1, 0, NULL, NULL); + if (IS_ERR(cq)) { + err = PTR_ERR(cq); + goto err_cq; + } + cq->device = ibdev; + cq->comp_handler = mlx4_dummy_comp_handler; + + if (context) + if (ib_copy_to_udata(udata, &xrcd->xrcdn, sizeof (__u32))) { + err = -EFAULT; + goto err_copy; + } + + xrcd->cq = cq; + xrcd->pd = pd; + return &xrcd->ibxrcd; + +err_copy: + mlx4_ib_destroy_cq(cq); +err_cq: + mlx4_ib_dealloc_pd(pd); +err_pd: + mlx4_xrcd_free(mdev->dev, xrcd->xrcdn); +err_xrcd: + kfree(xrcd); + return ERR_PTR(err); +} + +static int mlx4_ib_dealloc_xrcd(struct ib_xrcd *xrcd) +{ + struct mlx4_ib_xrcd *mxrcd = to_mxrcd(xrcd); + + mlx4_ib_destroy_cq(mxrcd->cq); + mlx4_ib_dealloc_pd(mxrcd->pd); + mlx4_xrcd_free(to_mdev(xrcd->device)->dev, to_mxrcd(xrcd)->xrcdn); + kfree(xrcd); + + return 0; +} + + static int init_node_data(struct mlx4_ib_dev *dev) { struct ib_smp *in_mad = NULL; @@ -611,12 +688,32 @@ static void *mlx4_ib_add(struct mlx4_dev ibdev->ib_dev.map_phys_fmr = mlx4_ib_map_phys_fmr; ibdev->ib_dev.unmap_fmr = mlx4_ib_unmap_fmr; ibdev->ib_dev.dealloc_fmr = mlx4_ib_fmr_dealloc; + if (dev->caps.flags & MLX4_DEV_CAP_FLAG_XRC) { + ibdev->ib_dev.create_xrc_srq = mlx4_ib_create_xrc_srq; + ibdev->ib_dev.alloc_xrcd = mlx4_ib_alloc_xrcd; + ibdev->ib_dev.dealloc_xrcd = mlx4_ib_dealloc_xrcd; + ibdev->ib_dev.create_xrc_rcv_qp = mlx4_ib_create_xrc_rcv_qp; + ibdev->ib_dev.modify_xrc_rcv_qp = mlx4_ib_modify_xrc_rcv_qp; + ibdev->ib_dev.query_xrc_rcv_qp = mlx4_ib_query_xrc_rcv_qp; + ibdev->ib_dev.reg_xrc_rcv_qp = mlx4_ib_reg_xrc_rcv_qp; + ibdev->ib_dev.unreg_xrc_rcv_qp = mlx4_ib_unreg_xrc_rcv_qp; + ibdev->ib_dev.uverbs_cmd_mask |= + (1ull << IB_USER_VERBS_CMD_CREATE_XRC_SRQ) | + (1ull << IB_USER_VERBS_CMD_OPEN_XRC_DOMAIN) | + (1ull << IB_USER_VERBS_CMD_CLOSE_XRC_DOMAIN) | + (1ull << 
IB_USER_VERBS_CMD_CREATE_XRC_RCV_QP) | + (1ull << IB_USER_VERBS_CMD_MODIFY_XRC_RCV_QP) | + (1ull << IB_USER_VERBS_CMD_QUERY_XRC_RCV_QP) | + (1ull << IB_USER_VERBS_CMD_REG_XRC_RCV_QP) | + (1ull << IB_USER_VERBS_CMD_UNREG_XRC_RCV_QP); + } if (init_node_data(ibdev)) goto err_map; spin_lock_init(&ibdev->sm_lock); mutex_init(&ibdev->cap_mask_mutex); + mutex_init(&ibdev->xrc_reg_mutex); if (ib_register_device(&ibdev->ib_dev)) goto err_map; Index: infiniband/drivers/infiniband/hw/mlx4/mlx4_ib.h =================================================================== --- infiniband.orig/drivers/infiniband/hw/mlx4/mlx4_ib.h 2008-01-28 10:56:29.000000000 +0200 +++ infiniband/drivers/infiniband/hw/mlx4/mlx4_ib.h 2008-01-28 11:38:46.000000000 +0200 @@ -73,6 +73,13 @@ struct mlx4_ib_pd { u32 pdn; }; +struct mlx4_ib_xrcd { + struct ib_xrcd ibxrcd; + u32 xrcdn; + struct ib_pd *pd; + struct ib_cq *cq; +}; + struct mlx4_ib_cq_buf { struct mlx4_buf buf; struct mlx4_mtt mtt; @@ -127,6 +134,9 @@ struct mlx4_ib_qp { struct mlx4_mtt mtt; int buf_size; struct mutex mutex; + enum qp_create_flags create_flags; + struct list_head xrc_reg_list; + u16 xrcdn; u8 port; u8 alt_port; u8 atomic_rd_en; @@ -172,6 +182,7 @@ struct mlx4_ib_dev { spinlock_t sm_lock; struct mutex cap_mask_mutex; + struct mutex xrc_reg_mutex; }; static inline struct mlx4_ib_dev *to_mdev(struct ib_device *ibdev) @@ -189,6 +200,11 @@ static inline struct mlx4_ib_pd *to_mpd( return container_of(ibpd, struct mlx4_ib_pd, ibpd); } +static inline struct mlx4_ib_xrcd *to_mxrcd(struct ib_xrcd *ibxrcd) +{ + return container_of(ibxrcd, struct mlx4_ib_xrcd, ibxrcd); +} + static inline struct mlx4_ib_cq *to_mcq(struct ib_cq *ibcq) { return container_of(ibcq, struct mlx4_ib_cq, ibcq); @@ -263,6 +279,11 @@ int mlx4_ib_destroy_ah(struct ib_ah *ah) struct ib_srq *mlx4_ib_create_srq(struct ib_pd *pd, struct ib_srq_init_attr *init_attr, struct ib_udata *udata); +struct ib_srq *mlx4_ib_create_xrc_srq(struct ib_pd *pd, + struct ib_cq *xrc_cq, + 
struct ib_xrcd *xrcd, + struct ib_srq_init_attr *init_attr, + struct ib_udata *udata); int mlx4_ib_modify_srq(struct ib_srq *ibsrq, struct ib_srq_attr *attr, enum ib_srq_attr_mask attr_mask, struct ib_udata *udata); int mlx4_ib_query_srq(struct ib_srq *srq, struct ib_srq_attr *srq_attr); @@ -299,6 +320,16 @@ int mlx4_ib_map_phys_fmr(struct ib_fmr * u64 iova); int mlx4_ib_unmap_fmr(struct list_head *fmr_list); int mlx4_ib_fmr_dealloc(struct ib_fmr *fmr); +int mlx4_ib_create_xrc_rcv_qp(struct ib_qp_init_attr *init_attr, + u32 *qp_num); +int mlx4_ib_modify_xrc_rcv_qp(struct ib_xrcd *xrcd, u32 qp_num, + struct ib_qp_attr *attr, int attr_mask); +int mlx4_ib_query_xrc_rcv_qp(struct ib_xrcd *xrcd, u32 qp_num, + struct ib_qp_attr *attr, int attr_mask, + struct ib_qp_init_attr *init_attr); +int mlx4_ib_reg_xrc_rcv_qp(struct ib_xrcd *xrcd, void * context, u32 qp_num); +int mlx4_ib_unreg_xrc_rcv_qp(struct ib_xrcd *xrcd, void * context, u32 qp_num); + static inline int mlx4_ib_ah_grh_present(struct mlx4_ib_ah *ah) { Index: infiniband/drivers/net/mlx4/xrcd.c =================================================================== --- /dev/null 1970-01-01 00:00:00.000000000 +0000 +++ infiniband/drivers/net/mlx4/xrcd.c 2008-01-28 10:56:59.000000000 +0200 @@ -0,0 +1,70 @@ +/* + * Copyright (c) 2006, 2007 Cisco Systems, Inc. All rights reserved. + * Copyright (c) 2007 Mellanox Technologies. All rights reserved. + * + * This software is available to you under a choice of one of two + * licenses. 
You may choose to be licensed under the terms of the GNU + * General Public License (GPL) Version 2, available from the file + * COPYING in the main directory of this source tree, or the + * OpenIB.org BSD license below: + * + * Redistribution and use in source and binary forms, with or + * without modification, are permitted provided that the following + * conditions are met: + * + * - Redistributions of source code must retain the above + * copyright notice, this list of conditions and the following + * disclaimer. + * + * - Redistributions in binary form must reproduce the above + * copyright notice, this list of conditions and the following + * disclaimer in the documentation and/or other materials + * provided with the distribution. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, + * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF + * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND + * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS + * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN + * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN + * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE + * SOFTWARE. 
+ */ + +#include +#include + +#include "mlx4.h" + +int mlx4_xrcd_alloc(struct mlx4_dev *dev, u32 *xrcdn) +{ + struct mlx4_priv *priv = mlx4_priv(dev); + + *xrcdn = mlx4_bitmap_alloc(&priv->xrcd_bitmap); + if (*xrcdn == -1) + return -ENOMEM; + + return 0; +} +EXPORT_SYMBOL_GPL(mlx4_xrcd_alloc); + +void mlx4_xrcd_free(struct mlx4_dev *dev, u32 xrcdn) +{ + mlx4_bitmap_free(&mlx4_priv(dev)->xrcd_bitmap, xrcdn); +} +EXPORT_SYMBOL_GPL(mlx4_xrcd_free); + +int __devinit mlx4_init_xrcd_table(struct mlx4_dev *dev) +{ + struct mlx4_priv *priv = mlx4_priv(dev); + + return mlx4_bitmap_init(&priv->xrcd_bitmap, (1 << 16), + (1 << 16) - 1, dev->caps.reserved_xrcds + 1); +} + +void mlx4_cleanup_xrcd_table(struct mlx4_dev *dev) +{ + mlx4_bitmap_cleanup(&mlx4_priv(dev)->xrcd_bitmap); +} + + Index: infiniband/drivers/net/mlx4/mlx4.h =================================================================== --- infiniband.orig/drivers/net/mlx4/mlx4.h 2008-01-27 10:44:25.000000000 +0200 +++ infiniband/drivers/net/mlx4/mlx4.h 2008-01-28 12:12:55.000000000 +0200 @@ -260,6 +260,7 @@ struct mlx4_priv { struct mlx4_cmd cmd; struct mlx4_bitmap pd_bitmap; + struct mlx4_bitmap xrcd_bitmap; struct mlx4_uar_table uar_table; struct mlx4_mr_table mr_table; struct mlx4_cq_table cq_table; @@ -289,6 +290,7 @@ void mlx4_bitmap_cleanup(struct mlx4_bit int mlx4_reset(struct mlx4_dev *dev); int mlx4_init_pd_table(struct mlx4_dev *dev); +int mlx4_init_xrcd_table(struct mlx4_dev *dev); int mlx4_init_uar_table(struct mlx4_dev *dev); int mlx4_init_mr_table(struct mlx4_dev *dev); int mlx4_init_eq_table(struct mlx4_dev *dev); @@ -305,6 +307,7 @@ void mlx4_cleanup_cq_table(struct mlx4_d void mlx4_cleanup_qp_table(struct mlx4_dev *dev); void mlx4_cleanup_srq_table(struct mlx4_dev *dev); void mlx4_cleanup_mcg_table(struct mlx4_dev *dev); +void mlx4_cleanup_xrcd_table(struct mlx4_dev *dev); void mlx4_start_catas_poll(struct mlx4_dev *dev); void mlx4_stop_catas_poll(struct mlx4_dev *dev); Index: 
infiniband/drivers/net/mlx4/main.c =================================================================== --- infiniband.orig/drivers/net/mlx4/main.c 2008-01-27 10:44:25.000000000 +0200 +++ infiniband/drivers/net/mlx4/main.c 2008-01-28 10:56:59.000000000 +0200 @@ -159,6 +159,10 @@ static int mlx4_dev_cap(struct mlx4_dev dev->caps.page_size_cap = ~(u32) (dev_cap->min_page_sz - 1); dev->caps.flags = dev_cap->flags; dev->caps.stat_rate_support = dev_cap->stat_rate_support; + dev->caps.reserved_xrcds = (dev->caps.flags & MLX4_DEV_CAP_FLAG_XRC) ? + dev_cap->reserved_xrcds : 0; + dev->caps.max_xrcds = (dev->caps.flags & MLX4_DEV_CAP_FLAG_XRC) ? + dev_cap->max_xrcds : 0; return 0; } @@ -586,11 +590,18 @@ static int mlx4_setup_hca(struct mlx4_de goto err_kar_unmap; } + err = mlx4_init_xrcd_table(dev); + if (err) { + mlx4_err(dev, "Failed to initialize " + "extended reliably connected domain table, aborting.\n"); + goto err_pd_table_free; + } + err = mlx4_init_mr_table(dev); if (err) { mlx4_err(dev, "Failed to initialize " "memory region table, aborting.\n"); - goto err_pd_table_free; + goto err_xrcd_table_free; } err = mlx4_init_eq_table(dev); @@ -674,6 +685,9 @@ err_eq_table_free: err_mr_table_free: mlx4_cleanup_mr_table(dev); +err_xrcd_table_free: + mlx4_cleanup_xrcd_table(dev); + err_pd_table_free: mlx4_cleanup_pd_table(dev); @@ -847,6 +861,7 @@ err_cleanup: mlx4_cmd_use_polling(dev); mlx4_cleanup_eq_table(dev); mlx4_cleanup_mr_table(dev); + mlx4_cleanup_xrcd_table(dev); mlx4_cleanup_pd_table(dev); mlx4_cleanup_uar_table(dev); @@ -906,6 +921,7 @@ static void mlx4_remove_one(struct pci_d mlx4_cmd_use_polling(dev); mlx4_cleanup_eq_table(dev); mlx4_cleanup_mr_table(dev); + mlx4_cleanup_xrcd_table(dev); mlx4_cleanup_pd_table(dev); iounmap(priv->kar); Index: infiniband/drivers/net/mlx4/srq.c =================================================================== --- infiniband.orig/drivers/net/mlx4/srq.c 2008-01-27 10:44:25.000000000 +0200 +++ infiniband/drivers/net/mlx4/srq.c 
2008-01-28 12:12:55.000000000 +0200 @@ -40,20 +40,20 @@ struct mlx4_srq_context { __be32 state_logsize_srqn; u8 logstride; - u8 reserved1[3]; - u8 pg_offset; - u8 reserved2[3]; - u32 reserved3; + u8 reserved1; + __be16 xrc_domain; + __be32 pg_offset_cqn; + u32 reserved2; u8 log_page_size; - u8 reserved4[2]; + u8 reserved3[2]; u8 mtt_base_addr_h; __be32 mtt_base_addr_l; __be32 pd; __be16 limit_watermark; __be16 wqe_cnt; - u16 reserved5; + u16 reserved4; __be16 wqe_counter; - u32 reserved6; + u32 reserved5; __be64 db_rec_addr; }; @@ -109,8 +109,8 @@ static int mlx4_QUERY_SRQ(struct mlx4_de MLX4_CMD_TIME_CLASS_A); } -int mlx4_srq_alloc(struct mlx4_dev *dev, u32 pdn, struct mlx4_mtt *mtt, - u64 db_rec, struct mlx4_srq *srq) +int mlx4_srq_alloc(struct mlx4_dev *dev, u32 pdn, u32 cqn, u16 xrcd, + struct mlx4_mtt *mtt, u64 db_rec, struct mlx4_srq *srq) { struct mlx4_srq_table *srq_table = &mlx4_priv(dev)->srq_table; struct mlx4_cmd_mailbox *mailbox; @@ -148,6 +148,8 @@ int mlx4_srq_alloc(struct mlx4_dev *dev, srq_context->state_logsize_srqn = cpu_to_be32((ilog2(srq->max) << 24) | srq->srqn); srq_context->logstride = srq->wqe_shift - 4; + srq_context->xrc_domain = cpu_to_be16(xrcd); + srq_context->pg_offset_cqn = cpu_to_be32(cqn & 0xffffff); srq_context->log_page_size = mtt->page_shift - MLX4_ICM_PAGE_SHIFT; mtt_addr = mlx4_mtt_addr(dev, mtt); Index: infiniband/drivers/net/mlx4/fw.c =================================================================== --- infiniband.orig/drivers/net/mlx4/fw.c 2008-01-27 10:44:25.000000000 +0200 +++ infiniband/drivers/net/mlx4/fw.c 2008-01-28 10:56:59.000000000 +0200 @@ -159,6 +159,8 @@ int mlx4_QUERY_DEV_CAP(struct mlx4_dev * #define QUERY_DEV_CAP_MAX_MCG_OFFSET 0x63 #define QUERY_DEV_CAP_RSVD_PD_OFFSET 0x64 #define QUERY_DEV_CAP_MAX_PD_OFFSET 0x65 +#define QUERY_DEV_CAP_RSVD_XRC_OFFSET 0x66 +#define QUERY_DEV_CAP_MAX_XRC_OFFSET 0x67 #define QUERY_DEV_CAP_RDMARC_ENTRY_SZ_OFFSET 0x80 #define QUERY_DEV_CAP_QPC_ENTRY_SZ_OFFSET 0x82 #define 
QUERY_DEV_CAP_AUX_ENTRY_SZ_OFFSET 0x84 @@ -262,6 +264,11 @@ int mlx4_QUERY_DEV_CAP(struct mlx4_dev * MLX4_GET(field, outbox, QUERY_DEV_CAP_MAX_PD_OFFSET); dev_cap->max_pds = 1 << (field & 0x3f); + MLX4_GET(field, outbox, QUERY_DEV_CAP_RSVD_XRC_OFFSET); + dev_cap->reserved_xrcds = field >> 4; + MLX4_GET(field, outbox, QUERY_DEV_CAP_MAX_XRC_OFFSET); + dev_cap->max_xrcds = 1 << (field & 0x1f); + MLX4_GET(size, outbox, QUERY_DEV_CAP_RDMARC_ENTRY_SZ_OFFSET); dev_cap->rdmarc_entry_sz = size; MLX4_GET(size, outbox, QUERY_DEV_CAP_QPC_ENTRY_SZ_OFFSET); Index: infiniband/drivers/net/mlx4/fw.h =================================================================== --- infiniband.orig/drivers/net/mlx4/fw.h 2008-01-27 10:44:25.000000000 +0200 +++ infiniband/drivers/net/mlx4/fw.h 2008-01-28 10:56:59.000000000 +0200 @@ -82,6 +82,8 @@ struct mlx4_dev_cap { int max_mcgs; int reserved_pds; int max_pds; + int reserved_xrcds; + int max_xrcds; int qpc_entry_sz; int rdmarc_entry_sz; int altc_entry_sz; Index: infiniband/drivers/infiniband/hw/mlx4/qp.c =================================================================== --- infiniband.orig/drivers/infiniband/hw/mlx4/qp.c 2008-01-28 10:56:29.000000000 +0200 +++ infiniband/drivers/infiniband/hw/mlx4/qp.c 2008-01-28 12:20:46.000000000 +0200 @@ -54,6 +54,12 @@ enum { MLX4_IB_UD_HEADER_SIZE = 72 }; + +struct mlx4_ib_xrc_reg_entry { + struct list_head list; + void *context; +}; + struct mlx4_ib_sqp { struct mlx4_ib_qp qp; int pkey_index; @@ -130,14 +136,15 @@ static void stamp_send_wqe(struct mlx4_i static void mlx4_ib_qp_event(struct mlx4_qp *qp, enum mlx4_event type) { struct ib_event event; - struct ib_qp *ibqp = &to_mibqp(qp)->ibqp; + struct mlx4_ib_qp *mqp = to_mibqp(qp); + struct ib_qp *ibqp = &mqp->ibqp; + struct mlx4_ib_xrc_reg_entry *ctx_entry; if (type == MLX4_EVENT_TYPE_PATH_MIG) to_mibqp(qp)->port = to_mibqp(qp)->alt_port; if (ibqp->event_handler) { event.device = ibqp->device; - event.element.qp = ibqp; switch (type) { case 
MLX4_EVENT_TYPE_PATH_MIG: event.event = IB_EVENT_PATH_MIG; @@ -169,7 +176,16 @@ static void mlx4_ib_qp_event(struct mlx4 return; } - ibqp->event_handler(&event, ibqp->qp_context); + if (!(ibqp->qp_type == IB_QPT_XRC && + mqp->create_flags & XRC_RCV_QP)) { + event.element.qp = ibqp; + ibqp->event_handler(&event, ibqp->qp_context); + } else { + event.event |= IB_XRC_QP_EVENT_FLAG; + event.element.xrc_qp_num = ibqp->qp_num; + list_for_each_entry(ctx_entry, &mqp->xrc_reg_list, list) + ibqp->event_handler(&event, ctx_entry->context); + } } } @@ -209,14 +225,14 @@ static int send_wqe_overhead(enum ib_qp_ } static int set_rq_size(struct mlx4_ib_dev *dev, struct ib_qp_cap *cap, - int is_user, int has_srq, struct mlx4_ib_qp *qp) + int is_user, int has_srq_or_is_xrc, struct mlx4_ib_qp *qp) { /* Sanity check RQ size before proceeding */ if (cap->max_recv_wr > dev->dev->caps.max_wqes || cap->max_recv_sge > dev->dev->caps.max_rq_sg) return -EINVAL; - if (has_srq) { + if (has_srq_or_is_xrc) { /* QPs attached to an SRQ should have no RQ */ if (cap->max_recv_wr) return -EINVAL; @@ -328,7 +344,8 @@ static int create_qp_common(struct mlx4_ qp->sq.head = 0; qp->sq.tail = 0; - err = set_rq_size(dev, &init_attr->cap, !!pd->uobject, !!init_attr->srq, qp); + err = set_rq_size(dev, &init_attr->cap, !!pd->uobject, + !!init_attr->srq || !!init_attr->xrc_domain , qp); if (err) goto err; @@ -362,7 +379,7 @@ static int create_qp_common(struct mlx4_ if (err) goto err_mtt; - if (!init_attr->srq) { + if (!init_attr->srq && init_attr->qp_type != IB_QPT_XRC) { err = mlx4_ib_db_map_user(to_mucontext(pd->uobject->context), ucmd.db_addr, &qp->db); if (err) @@ -375,7 +392,7 @@ static int create_qp_common(struct mlx4_ if (err) goto err; - if (!init_attr->srq) { + if (!init_attr->srq && init_attr->qp_type != IB_QPT_XRC) { err = mlx4_ib_db_alloc(dev, &qp->db, 0); if (err) goto err; @@ -410,6 +427,9 @@ static int create_qp_common(struct mlx4_ if (err) goto err_wrid; + if (init_attr->qp_type == IB_QPT_XRC) 
+ qp->mqp.qpn |= (1 << 23); + /* * Hardware wants QPN written in big-endian order (after * shifting) for send doorbell. Precompute this value to save @@ -428,7 +448,7 @@ static int create_qp_common(struct mlx4_ err_wrid: if (pd->uobject) { - if (!init_attr->srq) + if (!init_attr->srq && init_attr->qp_type != IB_QPT_XRC) mlx4_ib_db_unmap_user(to_mucontext(pd->uobject->context), &qp->db); } else { @@ -446,7 +466,7 @@ err_buf: mlx4_buf_free(dev->dev, qp->buf_size, &qp->buf); err_db: - if (!pd->uobject && !init_attr->srq) + if (!pd->uobject && !init_attr->srq && init_attr->qp_type != IB_QPT_XRC) mlx4_ib_db_free(dev, &qp->db); err: @@ -524,7 +544,7 @@ static void destroy_qp_common(struct mlx mlx4_mtt_cleanup(dev->dev, &qp->mtt); if (is_user) { - if (!qp->ibqp.srq) + if (!qp->ibqp.srq && qp->ibqp.qp_type != IB_QPT_XRC) mlx4_ib_db_unmap_user(to_mucontext(qp->ibqp.uobject->context), &qp->db); ib_umem_release(qp->umem); @@ -532,7 +552,7 @@ static void destroy_qp_common(struct mlx kfree(qp->sq.wrid); kfree(qp->rq.wrid); mlx4_buf_free(dev->dev, qp->buf_size, &qp->buf); - if (!qp->ibqp.srq) + if (!qp->ibqp.srq && qp->ibqp.qp_type != IB_QPT_XRC) mlx4_ib_db_free(dev, &qp->db); } } @@ -547,6 +567,9 @@ struct ib_qp *mlx4_ib_create_qp(struct i int err; switch (init_attr->qp_type) { + case IB_QPT_XRC: + if (!(dev->dev->caps.flags & MLX4_DEV_CAP_FLAG_XRC)) + return ERR_PTR(-ENOSYS); case IB_QPT_RC: case IB_QPT_UC: case IB_QPT_UD: @@ -555,12 +578,20 @@ struct ib_qp *mlx4_ib_create_qp(struct i if (!qp) return ERR_PTR(-ENOMEM); + memset(qp, 0, sizeof *qp); + INIT_LIST_HEAD(&qp->xrc_reg_list); + qp->create_flags = init_attr->create_flags; err = create_qp_common(dev, pd, init_attr, udata, 0, qp); if (err) { kfree(qp); return ERR_PTR(err); } + if (init_attr->qp_type == IB_QPT_XRC) + qp->xrcdn = to_mxrcd(init_attr->xrc_domain)->xrcdn; + else + qp->xrcdn = 0; + qp->ibqp.qp_num = qp->mqp.qpn; break; @@ -625,6 +656,7 @@ static int to_mlx4_st(enum ib_qp_type ty case IB_QPT_RC: return 
MLX4_QP_ST_RC; case IB_QPT_UC: return MLX4_QP_ST_UC; case IB_QPT_UD: return MLX4_QP_ST_UD; + case IB_QPT_XRC: return MLX4_QP_ST_XRC; case IB_QPT_SMI: case IB_QPT_GSI: return MLX4_QP_ST_MLX; default: return -1; @@ -769,8 +801,11 @@ static int __mlx4_ib_modify_qp(struct ib context->sq_size_stride = ilog2(qp->sq.wqe_cnt) << 3; context->sq_size_stride |= qp->sq.wqe_shift - 4; - if (cur_state == IB_QPS_RESET && new_state == IB_QPS_INIT) + if (cur_state == IB_QPS_RESET && new_state == IB_QPS_INIT) { context->sq_size_stride |= !!qp->sq_no_prefetch << 7; + if (ibqp->qp_type == IB_QPT_XRC) + context->xrcd = cpu_to_be32((u32) qp->xrcdn); + } if (qp->ibqp.uobject) context->usr_page = cpu_to_be32(to_mucontext(ibqp->uobject->context)->uar.index); @@ -882,7 +917,8 @@ static int __mlx4_ib_modify_qp(struct ib if (ibqp->srq) context->srqn = cpu_to_be32(1 << 24 | to_msrq(ibqp->srq)->msrq.srqn); - if (!ibqp->srq && cur_state == IB_QPS_RESET && new_state == IB_QPS_INIT) + if (!ibqp->srq && ibqp->qp_type != IB_QPT_XRC && + cur_state == IB_QPS_RESET && new_state == IB_QPS_INIT) context->db_rec_addr = cpu_to_be64(qp->db.dma); if (cur_state == IB_QPS_INIT && @@ -969,7 +1005,7 @@ static int __mlx4_ib_modify_qp(struct ib qp->rq.tail = 0; qp->sq.head = 0; qp->sq.tail = 0; - if (!ibqp->srq) + if (!ibqp->srq && ibqp->qp_type != IB_QPT_XRC) *qp->db.db = 0; } @@ -1662,3 +1698,260 @@ done: return 0; } +int mlx4_ib_create_xrc_rcv_qp(struct ib_qp_init_attr *init_attr, + u32 *qp_num) +{ + struct mlx4_ib_dev *dev = to_mdev(init_attr->xrc_domain->device); + struct mlx4_ib_xrcd *xrcd = to_mxrcd(init_attr->xrc_domain); + struct ib_qp_init_attr ia = *init_attr; + struct mlx4_ib_qp *qp; + struct ib_qp *ibqp; + struct mlx4_ib_xrc_reg_entry *ctx_entry; + + if (!(dev->dev->caps.flags & MLX4_DEV_CAP_FLAG_XRC)) + return -ENOSYS; + + ctx_entry = kmalloc(sizeof *ctx_entry, GFP_KERNEL); + if (!ctx_entry) + return -ENOMEM; + + ia.qp_type = IB_QPT_XRC; + ia.create_flags = XRC_RCV_QP; + ia.recv_cq = ia.send_cq = 
xrcd->cq; + + ibqp = mlx4_ib_create_qp(xrcd->pd, &ia, NULL); + if (IS_ERR(ibqp)) { + kfree(ctx_entry); + return PTR_ERR(ibqp); + } + + /* set the ibqp attributes which will be used by the mlx4 module */ + ibqp->device = init_attr->xrc_domain->device; + ibqp->pd = xrcd->pd; + ibqp->send_cq = ibqp->recv_cq = xrcd->cq; + ibqp->event_handler = init_attr->event_handler; + ibqp->qp_context = init_attr->qp_context; + ibqp->qp_type = init_attr->qp_type; + ibqp->xrcd = init_attr->xrc_domain; + + qp = to_mqp(ibqp); + + mutex_lock(&qp->mutex); + ctx_entry->context = init_attr->qp_context; + list_add_tail(&ctx_entry->list, &qp->xrc_reg_list); + mutex_unlock(&qp->mutex); + *qp_num = qp->mqp.qpn; + return 0; +} + +int mlx4_ib_modify_xrc_rcv_qp(struct ib_xrcd *ibxrcd, u32 qp_num, + struct ib_qp_attr *attr, int attr_mask) +{ + struct mlx4_ib_dev *dev = to_mdev(ibxrcd->device); + struct mlx4_ib_xrcd *xrcd = to_mxrcd(ibxrcd); + struct mlx4_qp *mqp; + int err; + + if (!(dev->dev->caps.flags & MLX4_DEV_CAP_FLAG_XRC)) + return -ENOSYS; + + mqp = __mlx4_qp_lookup(dev->dev, qp_num); + if (unlikely(!mqp)) { + printk(KERN_WARNING "mlx4_ib_modify_xrc_rcv_qp: unknown QPN %06x\n", + qp_num); + return -EINVAL; + } + + if (xrcd->xrcdn != to_mxrcd(to_mibqp(mqp)->ibqp.xrcd)->xrcdn) + return -EINVAL; + + err = mlx4_ib_modify_qp(&(to_mibqp(mqp)->ibqp), attr, attr_mask, NULL); + return err; +} + +int mlx4_ib_query_xrc_rcv_qp(struct ib_xrcd *ibxrcd, u32 qp_num, + struct ib_qp_attr *qp_attr, int qp_attr_mask, + struct ib_qp_init_attr *qp_init_attr) +{ + struct mlx4_ib_dev *dev = to_mdev(ibxrcd->device); + struct mlx4_ib_xrcd *xrcd = to_mxrcd(ibxrcd); + struct mlx4_ib_qp *qp; + struct mlx4_qp *mqp; + struct mlx4_qp_context context; + int mlx4_state; + int err; + + if (!(dev->dev->caps.flags & MLX4_DEV_CAP_FLAG_XRC)) + return -ENOSYS; + + mqp = __mlx4_qp_lookup(dev->dev, qp_num); + if (unlikely(!mqp)) { + printk(KERN_WARNING "mlx4_ib_query_xrc_rcv_qp: unknown QPN %06x\n", + qp_num); + return -EINVAL; + } + +
qp = to_mibqp(mqp); + if (xrcd->xrcdn != to_mxrcd(qp->ibqp.xrcd)->xrcdn) + return -EINVAL; + + if (qp->state == IB_QPS_RESET) { + qp_attr->qp_state = IB_QPS_RESET; + goto done; + } + + err = mlx4_qp_query(dev->dev, mqp, &context); + if (err) + return -EINVAL; + + mlx4_state = be32_to_cpu(context.flags) >> 28; + + qp_attr->qp_state = to_ib_qp_state(mlx4_state); + qp_attr->path_mtu = context.mtu_msgmax >> 5; + qp_attr->path_mig_state = + to_ib_mig_state((be32_to_cpu(context.flags) >> 11) & 0x3); + qp_attr->qkey = be32_to_cpu(context.qkey); + qp_attr->rq_psn = be32_to_cpu(context.rnr_nextrecvpsn) & 0xffffff; + qp_attr->sq_psn = be32_to_cpu(context.next_send_psn) & 0xffffff; + qp_attr->dest_qp_num = be32_to_cpu(context.remote_qpn) & 0xffffff; + qp_attr->qp_access_flags = + to_ib_qp_access_flags(be32_to_cpu(context.params2)); + + if (qp->ibqp.qp_type == IB_QPT_RC || qp->ibqp.qp_type == IB_QPT_UC || + qp->ibqp.qp_type == IB_QPT_XRC) { + to_ib_ah_attr(dev->dev, &qp_attr->ah_attr, &context.pri_path); + to_ib_ah_attr(dev->dev, &qp_attr->alt_ah_attr, &context.alt_path); + qp_attr->alt_pkey_index = context.alt_path.pkey_index & 0x7f; + qp_attr->alt_port_num = qp_attr->alt_ah_attr.port_num; + } + + qp_attr->pkey_index = context.pri_path.pkey_index & 0x7f; + if (qp_attr->qp_state == IB_QPS_INIT) + qp_attr->port_num = qp->port; + else + qp_attr->port_num = context.pri_path.sched_queue & 0x40 ? 
2 : 1; + + /* qp_attr->en_sqd_async_notify is only applicable in modify qp */ + qp_attr->sq_draining = mlx4_state == MLX4_QP_STATE_SQ_DRAINING; + + qp_attr->max_rd_atomic = 1 << ((be32_to_cpu(context.params1) >> 21) & 0x7); + + qp_attr->max_dest_rd_atomic = + 1 << ((be32_to_cpu(context.params2) >> 21) & 0x7); + qp_attr->min_rnr_timer = + (be32_to_cpu(context.rnr_nextrecvpsn) >> 24) & 0x1f; + qp_attr->timeout = context.pri_path.ackto >> 3; + qp_attr->retry_cnt = (be32_to_cpu(context.params1) >> 16) & 0x7; + qp_attr->rnr_retry = (be32_to_cpu(context.params1) >> 13) & 0x7; + qp_attr->alt_timeout = context.alt_path.ackto >> 3; + +done: + qp_attr->cur_qp_state = qp_attr->qp_state; + qp_attr->cap.max_recv_wr = 0; + qp_attr->cap.max_recv_sge = 0; + qp_attr->cap.max_send_wr = 0; + qp_attr->cap.max_send_sge = 0; + qp_attr->cap.max_inline_data = 0; + qp_init_attr->cap = qp_attr->cap; + + return 0; +} + +int mlx4_ib_reg_xrc_rcv_qp(struct ib_xrcd *xrcd, void *context, u32 qp_num) +{ + + struct mlx4_ib_xrcd *mxrcd = to_mxrcd(xrcd); + + struct mlx4_qp *mqp; + struct mlx4_ib_qp *mibqp; + struct mlx4_ib_xrc_reg_entry *ctx_entry, *tmp; + int err = -EINVAL; + + mutex_lock(&to_mdev(xrcd->device)->xrc_reg_mutex); + mqp = __mlx4_qp_lookup(to_mdev(xrcd->device)->dev, qp_num); + if (unlikely(!mqp)) { + printk(KERN_WARNING "mlx4_ib_reg_xrc_rcv_qp: unknown QPN %06x\n", + qp_num); + goto err_out; + } + + mibqp = to_mibqp(mqp); + + if (mxrcd->xrcdn != to_mxrcd(mibqp->ibqp.xrcd)->xrcdn) + goto err_out; + + ctx_entry = kmalloc(sizeof *ctx_entry, GFP_KERNEL); + if (!ctx_entry) { + err = -ENOMEM; + goto err_out; + } + + mutex_lock(&mibqp->mutex); + list_for_each_entry(tmp, &mibqp->xrc_reg_list, list) + if (tmp->context == context) { + mutex_unlock(&mibqp->mutex); + kfree(ctx_entry); + mutex_unlock(&to_mdev(xrcd->device)->xrc_reg_mutex); + return 0; + } + + ctx_entry->context = context; + list_add_tail(&ctx_entry->list, &mibqp->xrc_reg_list); + mutex_unlock(&mibqp->mutex); + 
mutex_unlock(&to_mdev(xrcd->device)->xrc_reg_mutex); + return 0; + +err_out: + mutex_unlock(&to_mdev(xrcd->device)->xrc_reg_mutex); + return err; +} + +int mlx4_ib_unreg_xrc_rcv_qp(struct ib_xrcd *xrcd, void *context, u32 qp_num) +{ + + struct mlx4_ib_xrcd *mxrcd = to_mxrcd(xrcd); + + struct mlx4_qp *mqp; + struct mlx4_ib_qp *mibqp; + struct mlx4_ib_xrc_reg_entry *ctx_entry, *tmp; + int found = 0; + int err = -EINVAL; + + mutex_lock(&to_mdev(xrcd->device)->xrc_reg_mutex); + mqp = __mlx4_qp_lookup(to_mdev(xrcd->device)->dev, qp_num); + if (unlikely(!mqp)) { + printk(KERN_WARNING "mlx4_ib_unreg_xrc_rcv_qp: unknown QPN %06x\n", + qp_num); + goto err_out; + } + + mibqp = to_mibqp(mqp); + + if (mxrcd->xrcdn != (mibqp->xrcdn & 0xffff)) + goto err_out; + + mutex_lock(&mibqp->mutex); + list_for_each_entry_safe(ctx_entry, tmp, &mibqp->xrc_reg_list, list) + if (ctx_entry->context == context) { + found = 1; + list_del(&ctx_entry->list); + kfree(ctx_entry); + break; + } + + mutex_unlock(&mibqp->mutex); + if (!found) + goto err_out; + + /* destroy the QP if the registration list is empty */ + if (list_empty(&mibqp->xrc_reg_list)) + mlx4_ib_destroy_qp(&mibqp->ibqp); + + mutex_unlock(&to_mdev(xrcd->device)->xrc_reg_mutex); + return 0; + +err_out: + mutex_unlock(&to_mdev(xrcd->device)->xrc_reg_mutex); + return err; +} + Index: infiniband/drivers/infiniband/hw/mlx4/srq.c =================================================================== --- infiniband.orig/drivers/infiniband/hw/mlx4/srq.c 2008-01-27 10:44:25.000000000 +0200 +++ infiniband/drivers/infiniband/hw/mlx4/srq.c 2008-01-28 12:12:55.000000000 +0200 @@ -72,13 +72,17 @@ static void mlx4_ib_srq_event(struct mlx } } -struct ib_srq *mlx4_ib_create_srq(struct ib_pd *pd, - struct ib_srq_init_attr *init_attr, - struct ib_udata *udata) +struct ib_srq *mlx4_ib_create_xrc_srq(struct ib_pd *pd, + struct ib_cq *xrc_cq, + struct ib_xrcd *xrcd, + struct ib_srq_init_attr *init_attr, + struct ib_udata *udata) { struct mlx4_ib_dev *dev = 
to_mdev(pd->device); struct mlx4_ib_srq *srq; struct mlx4_wqe_srq_next_seg *next; + u32 cqn; + u16 xrcdn; int desc_size; int buf_size; int err; @@ -172,7 +176,11 @@ struct ib_srq *mlx4_ib_create_srq(struct } } - err = mlx4_srq_alloc(dev->dev, to_mpd(pd)->pdn, &srq->mtt, + cqn = xrc_cq ? (u32) (to_mcq(xrc_cq)->mcq.cqn) : 0; + xrcdn = xrcd ? (u16) (to_mxrcd(xrcd)->xrcdn) : + (u16) dev->dev->caps.reserved_xrcds; + + err = mlx4_srq_alloc(dev->dev, to_mpd(pd)->pdn, cqn, xrcdn, &srq->mtt, srq->db.dma, &srq->msrq); if (err) goto err_wrid; @@ -240,6 +248,13 @@ int mlx4_ib_modify_srq(struct ib_srq *ib return 0; } +struct ib_srq *mlx4_ib_create_srq(struct ib_pd *pd, + struct ib_srq_init_attr *init_attr, + struct ib_udata *udata) +{ + return mlx4_ib_create_xrc_srq(pd, NULL, NULL, init_attr, udata); +} + int mlx4_ib_query_srq(struct ib_srq *ibsrq, struct ib_srq_attr *srq_attr) { struct mlx4_ib_dev *dev = to_mdev(ibsrq->device); Index: infiniband/include/linux/mlx4/qp.h =================================================================== --- infiniband.orig/include/linux/mlx4/qp.h 2008-01-28 10:56:29.000000000 +0200 +++ infiniband/include/linux/mlx4/qp.h 2008-01-28 10:56:59.000000000 +0200 @@ -74,6 +74,7 @@ enum { MLX4_QP_ST_UC = 0x1, MLX4_QP_ST_RD = 0x2, MLX4_QP_ST_UD = 0x3, + MLX4_QP_ST_XRC = 0x6, MLX4_QP_ST_MLX = 0x7 }; @@ -136,7 +137,7 @@ struct mlx4_qp_context { __be32 ssn; __be32 params2; __be32 rnr_nextrecvpsn; - __be32 srcd; + __be32 xrcd; __be32 cqn_recv; __be64 db_rec_addr; __be32 qkey; Index: infiniband/drivers/net/mlx4/Makefile =================================================================== --- infiniband.orig/drivers/net/mlx4/Makefile 2008-01-27 10:44:25.000000000 +0200 +++ infiniband/drivers/net/mlx4/Makefile 2008-01-28 10:56:59.000000000 +0200 @@ -1,4 +1,4 @@ obj-$(CONFIG_MLX4_CORE) += mlx4_core.o mlx4_core-y := alloc.o catas.o cmd.o cq.o eq.o fw.o icm.o intf.o main.o mcg.o \ - mr.o pd.o profile.o qp.o reset.o srq.o + mr.o pd.o profile.o qp.o reset.o srq.o 
xrcd.o Index: infiniband/drivers/net/mlx4/qp.c =================================================================== --- infiniband.orig/drivers/net/mlx4/qp.c 2008-01-27 10:44:25.000000000 +0200 +++ infiniband/drivers/net/mlx4/qp.c 2008-01-28 10:56:59.000000000 +0200 @@ -263,10 +263,12 @@ int mlx4_init_qp_table(struct mlx4_dev * * We reserve 2 extra QPs per port for the special QPs. The * block of special QPs must be aligned to a multiple of 8, so * round up. + * We also reserve the MSB of the 24-bit QP number to indicate + * an XRC qp. */ dev->caps.sqp_start = ALIGN(dev->caps.reserved_qps, 8); err = mlx4_bitmap_init(&qp_table->bitmap, dev->caps.num_qps, - (1 << 24) - 1, dev->caps.sqp_start + 8); + (1 << 23) - 1, dev->caps.sqp_start + 8); if (err) return err; Index: infiniband/drivers/infiniband/hw/mlx4/cq.c =================================================================== --- infiniband.orig/drivers/infiniband/hw/mlx4/cq.c 2008-01-28 10:56:29.000000000 +0200 +++ infiniband/drivers/infiniband/hw/mlx4/cq.c 2008-01-28 12:12:55.000000000 +0200 @@ -108,6 +108,7 @@ struct ib_cq *mlx4_ib_create_cq(struct i if (!cq) return ERR_PTR(-ENOMEM); + memset(cq, 0, sizeof *cq); entries = roundup_pow_of_two(entries + 1); cq->ibcq.cqe = entries - 1; buf_size = entries * sizeof (struct mlx4_cqe); From vlad at lists.openfabrics.org Mon Jan 28 03:20:12 2008 From: vlad at lists.openfabrics.org (Vladimir Sokolovsky Mellanox) Date: Mon, 28 Jan 2008 03:20:12 -0800 (PST) Subject: [ofa-general] ofa_1_3_kernel 20080128-0200 daily build status Message-ID: <20080128112012.D2BB2E60177@openfabrics.org> This email was generated automatically, please do not reply git_url: git://git.openfabrics.org/ofed_1_3/linux-2.6.git git_branch: ofed_kernel Common build parameters: --with-ipoib-mod --with-sdp-mod --with-srp-mod --with-user_mad-mod --with-user_access-mod --with-mthca-mod --with-mlx4-mod --with-core-mod --with-addr_trans-mod --with-rds-mod --with-cxgb3-mod --with-nes-mod Passed: Passed on i686 
with 2.6.15-23-server Passed on i686 with linux-2.6.21.1 Passed on i686 with linux-2.6.22 Passed on i686 with linux-2.6.19 Passed on i686 with linux-2.6.13 Passed on i686 with linux-2.6.18 Passed on i686 with linux-2.6.17 Passed on i686 with linux-2.6.16 Passed on i686 with linux-2.6.14 Passed on i686 with linux-2.6.15 Passed on i686 with linux-2.6.12 Passed on ia64 with linux-2.6.19 Passed on powerpc with linux-2.6.13 Passed on ia64 with linux-2.6.18 Passed on powerpc with linux-2.6.12 Passed on powerpc with linux-2.6.14 Passed on ia64 with linux-2.6.15 Passed on ppc64 with linux-2.6.12 Passed on ppc64 with linux-2.6.14 Passed on ia64 with linux-2.6.13 Passed on x86_64 with linux-2.6.18 Passed on x86_64 with linux-2.6.19 Passed on ia64 with linux-2.6.12 Passed on x86_64 with linux-2.6.17 Passed on ppc64 with linux-2.6.18 Passed on ppc64 with linux-2.6.17 Passed on x86_64 with linux-2.6.9-42.ELsmp Passed on powerpc with linux-2.6.15 Passed on ppc64 with linux-2.6.15 Passed on ppc64 with linux-2.6.19 Passed on ia64 with linux-2.6.14 Passed on x86_64 with linux-2.6.16 Passed on ia64 with linux-2.6.17 Passed on x86_64 with linux-2.6.20 Passed on x86_64 with linux-2.6.12 Passed on x86_64 with linux-2.6.13 Passed on ppc64 with linux-2.6.16 Passed on ia64 with linux-2.6.16 Passed on x86_64 with linux-2.6.14 Passed on x86_64 with linux-2.6.16.43-0.3-smp Passed on ia64 with linux-2.6.16.21-0.8-default Passed on x86_64 with linux-2.6.21.1 Passed on ppc64 with linux-2.6.13 Passed on x86_64 with linux-2.6.15 Passed on x86_64 with linux-2.6.9-55.ELsmp Passed on x86_64 with linux-2.6.16.21-0.8-smp Passed on ppc64 with linux-2.6.18-8.el5 Passed on ia64 with linux-2.6.21.1 Passed on ia64 with linux-2.6.23 Passed on ia64 with linux-2.6.22 Passed on x86_64 with linux-2.6.22.5-31-default Passed on x86_64 with linux-2.6.18-8.el5 Passed on x86_64 with linux-2.6.22 Passed on x86_64 with linux-2.6.18-53.el5 Passed on x86_64 with linux-2.6.18-1.2798.fc6 Failed: From bart.vanassche at 
gmail.com Mon Jan 28 03:24:52 2008 From: bart.vanassche at gmail.com (Bart Van Assche) Date: Mon, 28 Jan 2008 12:24:52 +0100 Subject: [ofa-general] OFED 1.3 RC2 release is available In-Reply-To: <6C2C79E72C305246B504CBA17B5500C9031E655B@mtlexch01.mtl.com> References: <6C2C79E72C305246B504CBA17B5500C9031E655B@mtlexch01.mtl.com> Message-ID: On Jan 16, 2008 5:22 PM, Tziporet Koren wrote: > > Hi, > OFED 1.3 RC2 release is available on > http://www.openfabrics.org/builds/ofed-1.3/release/OFED-1.3-rc2.tgz > > To get BUILD_ID run ofed_info > > Please report any issues in bugzilla https://bugs.openfabrics.org/ > The RC3 release is expected on January 30 Apparently OFED 1.3 includes SRP target support ? Although I consider SRP target support as a very valuable contribution, it should not be included in the OFED distribution but in the SCST distribution. The reason is that the SRP target relies on SCST interfaces that can potentially change with each new SCST release. Consider e.g. the scsi_tgt.h header file, which defines the interface between SCST core and SCST mid-level modules. The version of this file included with git://git.openfabrics.org/~vu/ofed_1_3.git (0.9.6-pre3) is incompatible with the latest scsi_tgt.h file from the SCST project (0.9.6-rc1). This may cause kernel crashes for OFED 1.3 SRP target users who combine OFED 1.3 with the latest SCST version. Sorry for this late notice. Bart.
From kliteyn at dev.mellanox.co.il Mon Jan 28 03:47:06 2008 From: kliteyn at dev.mellanox.co.il (Yevgeny Kliteynik) Date: Mon, 28 Jan 2008 13:47:06 +0200 Subject: [ofa-general] [PATCH] opensm: osm_version.h not found Message-ID: <479DC0BA.5020009@dev.mellanox.co.il> Hi Sasha, When building OpenSM I get the following error messages: cat: /../include/opensm/osm_version.h: No such file or directory /bin/sh: line 0: [: 3.1.8: unary operator expected This error is not affecting the build itself - everything works fine. The following patch fixes this error. I'm not sure if this applicable to ofed_1_3, but if it is, please apply both to master and to ofed_1_3. Signed-off-by: Yevgeny Kliteynik --- opensm/opensm/Makefile.am | 2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/opensm/opensm/Makefile.am b/opensm/opensm/Makefile.am index d617994..50fdae1 100644 --- a/opensm/opensm/Makefile.am +++ b/opensm/opensm/Makefile.am @@ -105,7 +105,7 @@ opensminclude_HEADERS = $(srcdir)/../include/opensm/osm_base.h \ BUILT_SOURCES = osm_version osm_version: if [ -x $(top_srcdir)/../gen_ver.sh ] ; then \ - ver_file=$(srcdir)/../include/opensm/osm_version.h ; \ + ver_file=$(top_builddir)/include/opensm/osm_version.h ; \ osm_ver=`cat $$ver_file | sed -ne '/#define OSM_VERSION /s/^.*\"OpenSM \(.*\)\"$$/\1/p'` ; \ ver=`$(top_srcdir)/../gen_ver.sh $(PACKAGE)` ; \ if [ $$ver != $$osm_ver ] ; then \ -- 1.5.1.4 From bart.vanassche at gmail.com Mon Jan 28 04:02:29 2008 From: bart.vanassche at gmail.com (Bart Van Assche) Date: Mon, 28 Jan 2008 13:02:29 +0100 Subject: [Scst-devel] [ofa-general] OFED 1.3 RC2 release is available In-Reply-To: <479DC0B4.9040902@vlnb.net> References: <6C2C79E72C305246B504CBA17B5500C9031E655B@mtlexch01.mtl.com>
<479DC0B4.9040902@vlnb.net> Message-ID: On Jan 28, 2008 12:47 PM, Vladislav Bolkhovitin wrote: > > Bart Van Assche wrote: > > Apparently OFED 1.3 includes SRP target support ? Although I consider > > SRP target support as a very valuable contribution, it should not be > > included in the OFED distribution but in the SCST distribution. The > > reason is that the SRP target relies on SCST interfaces that can > > potentially change with each new SCST release. Consider e.g. the > > scsi_tgt.h header file, which defines the interface between SCST core > > and SCST mid-level modules. The version of this file included with > > git://git.openfabrics.org/~vu/ofed_1_3.git (0.9.6-pre3) is > > incompatible with the latest scsi_tgt.h file from the SCST project > > (0.9.6-rc1). This may cause kernel crashes for OFED 1.3 SRP target > > users who combine OFED 1.3 with the latest SCST version. > > No it won't crash, it will refuse to run. I've recently added in SCST > protection against attempts running mixed SCST and target driver versions. > > BTW, there is a > > **************************************************************** > *!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!* > *!! !!* > *!! BIG FAT WARNING ABOUT MIXED VERSIONS PROBLEM !!* > *!! !!* > *!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!* > **************************************************************** Hello Vladislav, I did not test the above scenario -- what I wrote was the result of source reading. It is very good that interface versions are checked inside SCST before mid-level drivers are used. Even with interface version checking in place, my opinion is that the SRP target code should be included in the SCST project and not in the OFED project. Bart.
From sashak at voltaire.com Mon Jan 28 04:31:46 2008 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Mon, 28 Jan 2008 12:31:46 +0000 Subject: [ofa-general] Re: [PATCH] opensm: osm_version.h not found In-Reply-To: <479DC0BA.5020009@dev.mellanox.co.il> References: <479DC0BA.5020009@dev.mellanox.co.il> Message-ID: <20080128123146.GW24344@sashak.voltaire.com> Hi Yevegeny, On 13:47 Mon 28 Jan , Yevgeny Kliteynik wrote: > > When building OpenSM I get the following error messages: > > cat: /../include/opensm/osm_version.h: No such file or directory > /bin/sh: line 0: [: 3.1.8: unary operator expected Interesting, how your build is activated (I'm not able to reproduce this in any know conditions)? > This error is not affecting the build itself - everything works fine. > The following patch fixes this error. > > I'm not sure if this applicable to ofed_1_3, but if it is, please > apply both to master and to ofed_1_3. > > Signed-off-by: Yevgeny Kliteynik > --- > opensm/opensm/Makefile.am | 2 +- > 1 files changed, 1 insertions(+), 1 deletions(-) > > diff --git a/opensm/opensm/Makefile.am b/opensm/opensm/Makefile.am > index d617994..50fdae1 100644 > --- a/opensm/opensm/Makefile.am > +++ b/opensm/opensm/Makefile.am > @@ -105,7 +105,7 @@ opensminclude_HEADERS = $(srcdir)/../include/opensm/osm_base.h \ > BUILT_SOURCES = osm_version > osm_version: > if [ -x $(top_srcdir)/../gen_ver.sh ] ; then \ > - ver_file=$(srcdir)/../include/opensm/osm_version.h ; \ > + ver_file=$(top_builddir)/include/opensm/osm_version.h ; \ Isn't 'top_srcdir' would be better here?
Sasha > osm_ver=`cat $$ver_file | sed -ne '/#define OSM_VERSION /s/^.*\"OpenSM \(.*\)\"$$/\1/p'` ; \ > ver=`$(top_srcdir)/../gen_ver.sh $(PACKAGE)` ; \ > if [ $$ver != $$osm_ver ] ; then \ > -- > 1.5.1.4 > > From bart.vanassche at gmail.com Mon Jan 28 04:31:39 2008 From: bart.vanassche at gmail.com (Bart Van Assche) Date: Mon, 28 Jan 2008 13:31:39 +0100 Subject: [Scst-devel] [ofa-general] OFED 1.3 RC2 release is available In-Reply-To: <479DC928.4020402@vlnb.net> References: <6C2C79E72C305246B504CBA17B5500C9031E655B@mtlexch01.mtl.com> <479DC0B4.9040902@vlnb.net> <479DC928.4020402@vlnb.net> Message-ID: On Jan 28, 2008 1:23 PM, Vladislav Bolkhovitin wrote: > But that won't change anything. The problem will be simply inverted: > there will be a possibility to run a SRPT driver compiled for a wrong > OFED version. The SRP target driver indeed relies on several OFED kernel headers, but these kernel headers are included in the mainstream Linux kernel since some time. When I need OFED kernel modules, I use the modules included with the mainstream Linux kernel and not those included with the OFED distribution.
With regard to distribution of kernel code that is newer than the most recent mainstream Linux kernel I prefer the model followed by the realtime community: do not distribute the whole source tree but publish an up-to-date patch every time a new kernel version is released. See also http://www.kernel.org/pub/linux/kernel/projects/rt/. Keeping such a patch up to date is more work but is a lot easier to review than having to compare source trees. Bart. From sashak at voltaire.com Mon Jan 28 04:42:50 2008 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Mon, 28 Jan 2008 12:42:50 +0000 Subject: [ofa-general] Re: [PATCH v2] opensm/osm_ucast_ftree.c: ignore port 0 and loopbacks on swithces In-Reply-To: <479D0E9D.7020005@dev.mellanox.co.il> References: <479D0E9D.7020005@dev.mellanox.co.il> Message-ID: <20080128124250.GX24344@sashak.voltaire.com> On 01:07 Mon 28 Jan , Yevgeny Kliteynik wrote: > Hi Sasha, > > Fat-tree routing should ignore port 0 and loopback > connections on switches when populating its db. > > Please apply to ofed_1_3 and master. > > Signed-off-by: Yevgeny Kliteynik Applied. Thanks. Sasha From sashak at voltaire.com Mon Jan 28 04:46:03 2008 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Mon, 28 Jan 2008 12:46:03 +0000 Subject: [ofa-general] Re: [PATCH] opensm: osm_version.h not found In-Reply-To: <20080128123146.GW24344@sashak.voltaire.com> References: <479DC0BA.5020009@dev.mellanox.co.il> <20080128123146.GW24344@sashak.voltaire.com> Message-ID: <20080128124603.GY24344@sashak.voltaire.com> On 12:31 Mon 28 Jan , Sasha Khapyorsky wrote: > > > > When building OpenSM I get the following error messages: > > > > cat: /../include/opensm/osm_version.h: No such file or directory > > /bin/sh: line 0: [: 3.1.8: unary operator expected > > Interesting, how your build is activated (I'm not able to reproduce this > in any know conditions)? Actually I know where we can got it. It is when build tree is configured separately from the source tree. 
The patch (with top_builddir) makes sense then. Sasha From sashak at voltaire.com Mon Jan 28 04:48:58 2008 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Mon, 28 Jan 2008 12:48:58 +0000 Subject: [ofa-general] Re: [PATCH] opensm: osm_version.h not found In-Reply-To: <479DC0BA.5020009@dev.mellanox.co.il> References: <479DC0BA.5020009@dev.mellanox.co.il> Message-ID: <20080128124858.GZ24344@sashak.voltaire.com> On 13:47 Mon 28 Jan , Yevgeny Kliteynik wrote: > Hi Sasha, > > When building OpenSM I get the following error messages: > > cat: /../include/opensm/osm_version.h: No such file or directory > /bin/sh: line 0: [: 3.1.8: unary operator expected > > This error is not affecting the build itself - everything works fine. > The following patch fixes this error. > > I'm not sure if this applicable to ofed_1_3, but if it is, please > apply both to master and to ofed_1_3. > > Signed-off-by: Yevgeny Kliteynik Applied. Thanks. Sasha From kliteyn at dev.mellanox.co.il Mon Jan 28 04:47:13 2008 From: kliteyn at dev.mellanox.co.il (Yevgeny Kliteynik) Date: Mon, 28 Jan 2008 14:47:13 +0200 Subject: [ofa-general] Re: [PATCH] opensm: osm_version.h not found In-Reply-To: <20080128124603.GY24344@sashak.voltaire.com> References: <479DC0BA.5020009@dev.mellanox.co.il> <20080128123146.GW24344@sashak.voltaire.com> <20080128124603.GY24344@sashak.voltaire.com> Message-ID: <479DCED1.9020706@dev.mellanox.co.il> Sasha Khapyorsky wrote: > On 12:31 Mon 28 Jan , Sasha Khapyorsky wrote: >>> When building OpenSM I get the following error messages: >>> >>> cat: /../include/opensm/osm_version.h: No such file or directory >>> /bin/sh: line 0: [: 3.1.8: unary operator expected >> Interesting, how your build is activated (I'm not able to reproduce this >> in any know conditions)? > > Actually I know where we can got it. It is when build tree is > configured separately from the source tree. The patch (with top_builddir) > makes sense then. Exactly. 
I keep the sources in the path that is accessible from all the machines, and build it separately on each machine. -- Yevgeny > Sasha > From ogerlitz at voltaire.com Mon Jan 28 04:48:00 2008 From: ogerlitz at voltaire.com (Or Gerlitz) Date: Mon, 28 Jan 2008 14:48:00 +0200 (IST) Subject: [ofa-general] Re: [PATCH] ib/ipoib: handle Gratuitous ARP & bonding failover race also for connected mode neighbours In-Reply-To: References: Message-ID: On Thu, 17 Jan 2008, Or Gerlitz wrote: > move a little up the code that checks for a situation where the remote GID stored in the ipoib_neigh is > different than the one present in the neighbour (handle Gratuitous ARP) or that a bonding fail over has > happened but the neighbour still has a pointer to an ipoib_neigh created not by the current slave. This > will cause the driver to apply the check also for connected mode neighbours. OK, Roland, I am now confident that this patch is needed; see the reasoning below. Please apply to 2.6.25; later I will also send it to -stable. Here goes: Basically, ipoib-cm is not totally broken wrt bonding AND connected mode --without-- this patch being applied, but OTOH it does not function as it should. My setup has a client node transmitting UDP unicast to a server node, where the server node is bonded (ib0 and ib1 are enslaved by bond0). I tried three types of fail-over, each of which causes the bonding at the server node to send a gratuitous ARP; without this patch, ipoib at the client side takes no action: A) using "primary slave up" (*) B) taking an interface down C) taking a port down In the "primary slave up" fail-over case, since the non-active slave interface is up and running, the traffic keeps going through it, so the client side forever has a neighbour pointing to GID X while the traffic goes to (the QP associated with) GID Y. In the interface down fail-over case, the going-down code closes the RX QP; since the connected mode (cm) is implemented over RC (...)
this causes a send completion with an IB_WC_RETRY_EXC_ERR error to be generated by the HCA; ipoib_cm_handle_tx_wc calls ipoib_neigh_free, and when the next xmit is called from the stack, ipoib creates a new ipoib_neigh, this time against the correct GID. In the port going down case, again the RC implementation causes the retry exceeded error to take place, and from here it's the same as in the previous case. Other than all the above, gratuitous ARP is used in other HA schemes such as a floating IP address between I/O targets; since the connected mode ignores it, this scheme will not work without the patch. Or (*) the bonding HA mode enables you to select a primary slave which, once up, is moved to be the active slave. So to cause this failover, I take the primary (e.g. ib0) down, and then fail-over happens to the second slave (e.g. ib1); now I take the primary up and a second fail-over happens. From jackm at dev.mellanox.co.il Mon Jan 28 05:14:39 2008 From: jackm at dev.mellanox.co.il (Jack Morgenstein) Date: Mon, 28 Jan 2008 15:14:39 +0200 Subject: [ofa-general] Zero byte rdma read causes REM_OP_ERROR In-Reply-To: <479C81C8.6060106@dev.mellanox.co.il> References: <3307cdf90801242007y3ace39ccrb72d5f35c3a937e4@mail.gmail.com> <3307cdf90801251254p5983b62x687549bb793db39d@mail.gmail.com> <479C81C8.6060106@dev.mellanox.co.il> Message-ID: <200801281514.40097.jackm@dev.mellanox.co.il> On Sunday 27 January 2008 15:06, Dotan Barak wrote: > Rajouri Jammu wrote: > > I'm using rdma_cm and I don't set the qp_access_flags explicitly. > > > > I presume they are set correctly since non-zero length rdma reads > > complete successfully. I have also verified the data.
> > > > the only place I set the privileges is when registering the memory > > region and I have them set at > > IBV_ACCESS_LOCAL_WRITE, _REMOTE_READ and _REMOTE_WRITE To send perform a zero-byte RDMA-read/write, you should assemble a WQE with no scatter/gather entries (see IBSPEC 1.2, volume 1, section 11.4.1.1, table 94 -- Work Request Modifier Matrix, footnote b). A s/g entry with its length field = 0 is interpreted as requesting 2 gigabytes. Is this the problem?
- Jack From vst at vlnb.net Mon Jan 28 03:47:00 2008 From: vst at vlnb.net (Vladislav Bolkhovitin) Date: Mon, 28 Jan 2008 14:47:00 +0300 Subject: ***SPAM*** Re: [Scst-devel] [ofa-general] OFED 1.3 RC2 release is available In-Reply-To: References: <6C2C79E72C305246B504CBA17B5500C9031E655B@mtlexch01.mtl.com> Message-ID: <479DC0B4.9040902@vlnb.net> Bart Van Assche wrote: > On Jan 16, 2008 5:22 PM, Tziporet Koren wrote: > >>Hi, >>OFED 1.3 RC2 release is available on >>http://www.openfabrics.org/builds/ofed-1.3/release/OFED-1.3-rc2.tgz >> >>To get BUILD_ID run ofed_info >> >>Please report any issues in bugzilla https://bugs.openfabrics.org/ >>The RC3 release is expected on January 30 > > > Apparently OFED 1.3 includes SRP target support ? Although I consider > SRP target support as a very valuable contribution, it should not be > included in the OFED distribution but in the SCST distribution. The > reason is that the SRP target relies on SCST interfaces that can > potentially change with each new SCST release. Consider e.g. the > scsi_tgt.h header file, which defines the interface between SCST core > and SCST mid-level modules. The version of this file included with > git://git.openfabrics.org/~vu/ofed_1_3.git (0.9.6-pre3) is > incompatible with the latest scsi_tgt.h file from the SCST project > (0.9.6-rc1). This may cause kernel crashes for OFED 1.3 SRP target > users who combine OFED 1.3 with the latest SCST version. No, it won't crash; it will refuse to run. I've recently added protection in SCST against attempts to run mixed SCST and target driver versions. BTW, there is a **************************************************************** *!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!* *!! !!* *!! BIG FAT WARNING ABOUT MIXED VERSIONS PROBLEM !!* *!! !!* *!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!* **************************************************************** during SCST installation. Did you ignore it?
Is it not BIG FAT enough? > Sorry for this late notice. > > Bart. > > ------------------------------------------------------------------------- > This SF.net email is sponsored by: Microsoft > Defy all challenges. Microsoft(R) Visual Studio 2008. > http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/ > _______________________________________________ > Scst-devel mailing list > Scst-devel at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/scst-devel > From vst at vlnb.net Mon Jan 28 04:23:04 2008 From: vst at vlnb.net (Vladislav Bolkhovitin) Date: Mon, 28 Jan 2008 15:23:04 +0300 Subject: ***SPAM*** Re: [Scst-devel] [ofa-general] OFED 1.3 RC2 release is available In-Reply-To: References: <6C2C79E72C305246B504CBA17B5500C9031E655B@mtlexch01.mtl.com> <479DC0B4.9040902@vlnb.net> Message-ID: <479DC928.4020402@vlnb.net> Bart Van Assche wrote: > On Jan 28, 2008 12:47 PM, Vladislav Bolkhovitin wrote: > >>Bart Van Assche wrote: >> >>>Apparently OFED 1.3 includes SRP target support ? Although I consider >>>SRP target support as a very valuable contribution, it should not be >>>included in the OFED distribution but in the SCST distribution. The >>>reason is that the SRP target relies on SCST interfaces that can >>>potentially change with each new SCST release. Consider e.g. the >>>scsi_tgt.h header file, which defines the interface between SCST core >>>and SCST mid-level modules. The version of this file included with >>>git://git.openfabrics.org/~vu/ofed_1_3.git (0.9.6-pre3) is >>>incompatible with the latest scsi_tgt.h file from the SCST project >>>(0.9.6-rc1). This may cause kernel crashes for OFED 1.3 SRP target >>>users who combine OFED 1.3 with the latest SCST version. >> >>No it won't crash, it will refuse to run. I've recently added in SCST >>protection against attempts running mixed SCST and target driver versions. 
>> >>BTW, there is a >> >>**************************************************************** >>*!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!* >>*!! !!* >>*!! BIG FAT WARNING ABOUT MIXED VERSIONS PROBLEM !!* >>*!! !!* >>*!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!* >>**************************************************************** > > > Hello Vladislav, > > I did not test the above scenario -- what I wrote was the result of > source reading. It is very good that interface versions are checked > inside SCST before mid-level drivers are used. Even with interface > version checking in place, my opinion is that the SRP target code > should be included in the SCST project and not in the OFED project. But that won't change anything. The problem will be simply inverted: there will be a possibility to run a SRPT driver compiled for a wrong OFED version. > Bart. > > ------------------------------------------------------------------------- > This SF.net email is sponsored by: Microsoft > Defy all challenges. Microsoft(R) Visual Studio 2008. > http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/ > _______________________________________________ > Scst-devel mailing list > Scst-devel at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/scst-devel > From jlentini at netapp.com Mon Jan 28 07:14:22 2008 From: jlentini at netapp.com (James Lentini) Date: Mon, 28 Jan 2008 10:14:22 -0500 (EST) Subject: [ofa-general] Status of NFS-RDMA ? In-Reply-To: <20080126193035.GA21209@cefeid.wcss.wroc.pl> References: <20080123194728.GA10437@cefeid.wcss.wroc.pl> <4797AD59.2000206@mellanox.co.il> <20080126193035.GA21209@cefeid.wcss.wroc.pl> Message-ID: On Sat, 26 Jan 2008, Pawel Dziekonski wrote: > I'm still writing to this list because > nfs-rdma-devel at lists.sourceforge.net seems to be dead... :( We'll investigate that. > I pulled Tom's tree from new url and build a kernel. 
If you enabled support for INFINIBAND drivers (IB and iWARP support) and NFS client/server support, the kernel should be ready to go (run "grep RDMA /your_kernel_sources/.config" to confirm that CONFIG_SUNRPC_XPRT_RDMA is either m or y). NFS/RDMA doesn't require OFED be installed. OFED is a release of the Linux kernel sources and some userspace libraries/tools. If you are using InfiniBand adapters, you'll need a subnet manager. OFED contains an sm called OpenSM. You can get OpenSM via OFED or you can download OpenSM separately. It is up to you. > then I downloaded OFED from > http://www.mellanox.com/downloads/NFSoRDMA/OFED-1.2-NFS-RDMA.gz, I don't know what the above URL contains. The latest code is in Tom Tucker's tree (and now NFS server maintainer Bruce Fields tree). It is being merged in 2.6.25. You'll be able to get the server code from a kernel.org rc for 2.6.25. In the meantime, you'll have to use the development repos to get the server. > however ofa-kernel fails to build. whatever I do I always got in > ofa-kernel: > > make[1]: Entering directory `/usr/src/ib/xprt-switch-2.6' > test -e include/linux/autoconf.h -a -e include/config/auto.conf || ( \ > echo; \ > echo " ERROR: Kernel configuration is invalid."; \ > echo " include/linux/autoconf.h or include/config/auto.conf are missing."; \ > echo " Run 'make oldconfig && make prepare' on kernel src to fix it."; \ > echo; \ > /bin/false) > > obviously, doing 'make oldconfig && make prepare' does not help. 
> anyway, above mentioned files do exist: > > # ls -la /usr/src/ib/xprt-switch-2.6/{include/linux/autoconf.h,include/config/auto.conf} > -rw-r--r-- 1 root root 10156 Jan 25 17:42 /usr/src/ib/xprt-switch-2.6/include/config/auto.conf > -rw-r--r-- 1 root root 14733 Jan 25 17:42 /usr/src/ib/xprt-switch-2.6/include/linux/autoconf.h > > despite of above, compilation continues but fails with: > > gcc -Wp,-MD,/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2/drivers/infiniband/core/.mad.o.d -nostdinc -isystem /usr/lib/gcc/x86_64-redhat-linux/3.4.6/include -D__KERNEL__ -I/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2/include -I/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2/drivers/infiniband/include -Iinclude -include include/linux/autoconf.h -include /var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2/include/linux/autoconf.h -Wall -Wundef -Wstrict-prototypes -Wno-trigraphs -fno-strict-aliasing -fno-common -Werror-implicit-function-declaration -Os -m64 -mno-red-zone -mcmodel=kernel -pipe -Wno-sign-compare -fno-asynchronous-unwind-tables -funit-at-a-time -mno-sse -mno-mmx -mno-sse2 -mno-3dnow -maccumulate-outgoing-args -DCONFIG_AS_CFI=1 -DCONFIG_AS_CFI_SIGNAL_FRAME=1 -fomit-frame-pointer -Wdeclaration-after-statement -DMODULE -D"KBUILD_STR(s)=#s" -D"KBUILD_BASENAME=KBUILD_STR(mad)" -D"KBUILD_MODNAME=KBUILD_STR(ib_mad)" -c -o /var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2/drivers/infiniband/core/.tmp_mad.o /var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2/drivers/infiniband/core/mad.c > /var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2/drivers/infiniband/core/mad.c: In function `ib_mad_init_module': > /var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2/drivers/infiniband/core/mad.c:2966: error: too many arguments to function `kmem_cache_create' > make[4]: *** [/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2/drivers/infiniband/core/mad.o] Error 1 > make[3]: *** [/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2/drivers/infiniband/core] Error 2 > make[2]: *** [/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2/drivers/infiniband] Error 2 > make[1]: *** [_module_/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2] 
Error 2 > make[1]: Leaving directory `/usr/src/ib/xprt-switch-2.6' > make: *** [kernel] Error 2 > error: Bad exit status from /var/tmp/rpm-tmp.3877 (%install) > > full log: > https://cefeid.wcss.wroc.pl/d/tmp/OFED.build.32122.log > > thanks in advance, P > > -- > Pawel Dziekonski > Wroclaw Centre for Networking & Supercomputing, HPC Department > Politechnika Wr., pl. Grunwaldzki 9, bud. D2/101, 50-377 Wroclaw, POLAND > phone: +48 71 3202043, fax: +48 71 3225797, http://www.wcss.wroc.pl > From swise at opengridcomputing.com Mon Jan 28 07:26:47 2008 From: swise at opengridcomputing.com (Steve Wise) Date: Mon, 28 Jan 2008 09:26:47 -0600 Subject: [ofa-general] [PATCH 2/3] RDMA/cxgb3: fix page shift calculation in build_phys_page_list() In-Reply-To: References: Message-ID: <479DF437.4000309@opengridcomputing.com> Roland Dreier wrote: > > I think the patch below should be a fix for the problem, although I've > only compile tested it. The idea is to stop increasing shift once it > reaches a bit position where the first buffer and the iova differ. > What do you think? If this works for you, I will merge it for 2.6.25. 
> > diff --git a/drivers/infiniband/hw/mthca/mthca_provider.c b/drivers/infiniband/hw/mthca/mthca_provider.c > index 6bcde1c..1a15129 100644 > --- a/drivers/infiniband/hw/mthca/mthca_provider.c > +++ b/drivers/infiniband/hw/mthca/mthca_provider.c > @@ -948,7 +948,9 @@ static struct ib_mr *mthca_reg_phys_mr(struct ib_pd *pd, > return ERR_PTR(-EINVAL); > > /* Find largest page shift we can use to cover buffers */ > - for (shift = PAGE_SHIFT; shift < 31; ++shift) > + for (shift = PAGE_SHIFT; shift < 31; ++shift) { > + if ((buffer_list[0].addr ^ *iova_start) & (1ULL << shift)) > + break; > if (num_phys_buf > 1) { > if ((1ULL << shift) & mask) > break; > @@ -958,6 +960,7 @@ static struct ib_mr *mthca_reg_phys_mr(struct ib_pd *pd, > (buffer_list[0].addr & ((1ULL << shift) - 1))) > break; > } > + } > > buffer_list[0].size += buffer_list[0].addr & ((1ULL << shift) - 1); > buffer_list[0].addr &= ~0ull << shift; I wonder if we should make this functionality common and put it in infiniband/drivers/core somewhere? I think mthca and cxgb3 need the same service. Probably ehca and nes too... Steve. From tziporet at dev.mellanox.co.il Mon Jan 28 07:49:30 2008 From: tziporet at dev.mellanox.co.il (Tziporet Koren) Date: Mon, 28 Jan 2008 17:49:30 +0200 Subject: [ofa-general] OFED meeting agenda for today (Jan 28) meeting Message-ID: <479DF98A.5080907@mellanox.co.il> This is the agenda for the meeting today: 1. OFED 1.3 readiness toward RC3 this week - all 2. Decide regarding IBM request to change IPoIB to support CM without SRQ (see http://lists.openfabrics.org/pipermail/ewg/2008-January/005507.html) 3. 
Review high priority bugs:
846  critical  RHEL 5   jim at mellanox.com              SDP crash on RHEL5 ppc64 running netserver
859  critical  SLES 10  monis at voltaire.com            Bonding configuration on Sles10 sp1 is not loaded consistently
863  critical  RHEL 4   monis at voltaire.com            ib-bonding won't compile for RHEL4 U6
874  critical  Other    rjwalsh at pathscale.com         Intel MPI (IMB test) hangs intermittently on the qlogic HCA
760  major     All      eli at mellanox.co.il            UDP performance on Rx is lower than Tx
761  major     Other    eli at mellanox.co.il            Poor and jittery UDP performance at small messages
869  major     SLES 10  orenk at dev.mellanox.co.il      mstflint won't build on SLES10 x86
736  major     Other    rolandd at cisco.com             IBV_WC_RETRY_EXC_ERR errors with local rdma_reads
767  major     Other    swise at opengridcomputing.com  Non backport Kernels that don't build in genalloc cause compile errors for cxgb3
Tziporet From stijn.desmet at intec.ugent.be Mon Jan 28 07:52:56 2008 From: stijn.desmet at intec.ugent.be (Stijn De Smet) Date: Mon, 28 Jan 2008 16:52:56 +0100 Subject: [ofa-general] Bonding and hw_csum Message-ID: <479DFA58.7050800@intec.ugent.be> Hello, I'm trying to get IPOIB bonding to work with hw_csum enabled. I'm using OFED-1.3-RC2 with Mellanox MT25208 Infinihost IIIEx. If I enable bonding together with hw_csum, iperf works on some hosts in both directions, and both machines can act as iperf server or client. On another machine (all the same hardware (IBM x3655) and software), it doesn't work, but sometimes restarting openibd (i.e. unloading and reloading the modules) helps. Sometimes the host works only as iperf server or only as client, but not as both. If I start pulling and replugging cables from the machines, in 90% of the cases the bond fails (the first failover works, the failback does not). When I disable hw_csum, I can start iperfs, pull and replug all cables, and the iperfs run uninterrupted.
Regards, Stijn From rdreier at cisco.com Mon Jan 28 07:41:16 2008 From: rdreier at cisco.com (Roland Dreier) Date: Mon, 28 Jan 2008 07:41:16 -0800 Subject: [ofa-general] [PATCH 2/3] RDMA/cxgb3: fix page shift calculation in build_phys_page_list() In-Reply-To: <479DF437.4000309@opengridcomputing.com> (Steve Wise's message of "Mon, 28 Jan 2008 09:26:47 -0600") References: <479DF437.4000309@opengridcomputing.com> Message-ID: > I wonder if we should make this functionality common and put it in > infiniband/drivers/core somewhere? I think mthca and cxgb3 need the > same service. Probably ehca and nes too... Yeah, if it can be factored out cleanly then that would definitely be a nice thing to do. From jlentini at netapp.com Mon Jan 28 07:54:36 2008 From: jlentini at netapp.com (James Lentini) Date: Mon, 28 Jan 2008 10:54:36 -0500 (EST) Subject: [ofa-general] Status of NFS-RDMA ? In-Reply-To: References: <20080123194728.GA10437@cefeid.wcss.wroc.pl> <4797AD59.2000206@mellanox.co.il> <20080126193035.GA21209@cefeid.wcss.wroc.pl> Message-ID: On Mon, 28 Jan 2008, James Lentini wrote: > > On Sat, 26 Jan 2008, Pawel Dziekonski wrote: > > > I'm still writing to this list because > > nfs-rdma-devel at lists.sourceforge.net seems to be dead... :( > > We'll investigate that. I successfully sent a test message to the list. Let me know if you have problems in the future. From rf at q-leap.de Mon Jan 28 08:09:42 2008 From: rf at q-leap.de (Roland Fehrenbacher) Date: Mon, 28 Jan 2008 17:09:42 +0100 Subject: [ofa-general] Problem with ConnectX HBA Message-ID: <18333.65094.565387.209505@gargle.gargle.HOWL> Hi, when running MPI codes, we have the following error messages coming from some of our servers running 2.6.22.16 with kernel modules from ofa_kernel-1.2.5.4: mlx4_core 0000:08:00.0: SW2HW_MPT failed (-16) The communication on the corresponding machines is completely blocked, and ibstat is just hanging. Any idea what could be wrong? 
Just for additional info: When running the kernel with the original 2.6.22 drivers, I had these kind of error messages at a much higher rate. Thanks, Roland From rosnbrg at us.ibm.com Mon Jan 28 08:55:00 2008 From: rosnbrg at us.ibm.com (Bryan S Rosenburg) Date: Mon, 28 Jan 2008 11:55:00 -0500 Subject: [ofa-general] [PATCH 2/3] RDMA/cxgb3: fix page shift calculation in build_phys_page_list() In-Reply-To: Message-ID: Roland Dreier wrote on 01/27/2008 04:33:33 PM: > got it... I was tricking myself that the check for alignment at the > start of the function was sufficient, but it's not once we start using > a bigger value of shift. > > I think the patch below should be a fix for the problem, although I've > only compile tested it. The idea is to stop increasing shift once it > reaches a bit position where the first buffer and the iova differ. > What do you think? If this works for you, I will merge it for 2.6.25. I've tested your fix with a two-element buffer list and various combinations of virtual and physical alignments.
It works as intended, allowing a larger page size (and corresponding expansion of the physical region) if and only if the virtual and physical addresses are sufficiently compatible. I think you can simplify the code somewhat if you incorporate the (virtual ^ physical) alignment characterization into the mask. Here's an alternative patch. The initial alignment check gets subsumed into the later mask alignment check. ================================================================================ diff --git a/drivers/infiniband/hw/mthca/mthca_provider.c b/drivers/infiniband/hw/mthca/mthca_provider.c index 6bcde1c..4e04b4f 100644 --- a/drivers/infiniband/hw/mthca/mthca_provider.c +++ b/drivers/infiniband/hw/mthca/mthca_provider.c @@ -929,11 +929,7 @@ static struct ib_mr *mthca_reg_phys_mr(struct ib_pd *pd, int err; int i, j, n; - /* First check that we have enough alignment */ - if ((*iova_start & ~PAGE_MASK) != (buffer_list[0].addr & ~PAGE_MASK)) - return ERR_PTR(-EINVAL); - - mask = 0; + mask = buffer_list[0].addr ^ *iova_start; total_size = 0; for (i = 0; i < num_phys_buf; ++i) { if (i != 0) @@ -948,16 +944,16 @@ static struct ib_mr *mthca_reg_phys_mr(struct ib_pd *pd, return ERR_PTR(-EINVAL); /* Find largest page shift we can use to cover buffers */ - for (shift = PAGE_SHIFT; shift < 31; ++shift) - if (num_phys_buf > 1) { - if ((1ULL << shift) & mask) - break; - } else { + for (shift = PAGE_SHIFT; shift < 31; ++shift) { + if ((1ULL << shift) & mask) + break; + if (num_phys_buf == 1) { if (1ULL << shift >= buffer_list[0].size + (buffer_list[0].addr & ((1ULL << shift) - 1))) break; } + } buffer_list[0].size += buffer_list[0].addr & ((1ULL << shift) - 1); buffer_list[0].addr &= ~0ull << shift; ================================================================================ I'm still annoyed by the (num_phys_buf == 1) special case. I'm wondering if it's still needed. 
If you leave out that if-statement entirely, you may end up using a page size that is larger (maybe much larger) than necessary, but I think things will still work, given that the virtual-to-physical alignment constraints are respected. If you remove the special case, you can replace the whole loop with an ffs() call. Anyway, your patch works fine. Use my suggestion only if you like it. - Bryan -------------- next part -------------- An HTML attachment was scrubbed... URL: From swise at opengridcomputing.com Mon Jan 28 08:55:00 2008 From: swise at opengridcomputing.com (Steve Wise) Date: Mon, 28 Jan 2008 10:55:00 -0600 Subject: [ofa-general] [PATCH 2/3] RDMA/cxgb3: fix page shift calculation in build_phys_page_list() In-Reply-To: References: Message-ID: <479E08E4.4010707@opengridcomputing.com> Bryan S Rosenburg wrote: > > Roland Dreier wrote on 01/27/2008 04:33:33 PM: > > got it... I was tricking myself that the check for alignment at the > > start of the function was sufficient, but it's not once we start using > > a bigger value of shift. > > > > I think the patch below should be a fix for the problem, although I've > > only compile tested it. The idea is to stop increasing shift once it > > reaches a bit position where the first buffer and the iova differ. > > What do you think? If this works for you, I will merge it for 2.6.25. > > I've tested your fix with a two-element buffer list and various > combinations of virtual and physical alignments. It works as intended, > allowing a larger page size (and corresponding expansion of the physical > region) if and only if the virtual and physical addresses are > sufficiently compatible. > > I think you can simplify the code somewhat if you incorporate the > (virtual ^ physical) alignment characterization into the mask. Here's > an alternative patch. The initial alignment check gets subsumed into > the later mask alignment check. 
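The mask-based page shift selection discussed in this thread can be sketched in userspace C as follows. This is only an illustration under stated assumptions, not the mthca code: struct phys_buf, build_mask(), find_page_shift() and find_page_shift_lowest_bit() are hypothetical names, the 4 KiB base page size is assumed, and the part of the mask that covers interior buffer boundaries is assumed from context because the quoted diff elides it. The second function shows the "drop the special case and take the lowest set bit" idea using a plain loop rather than a real ffs() call.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumptions for this sketch only: 4 KiB base pages, and the same upper
 * bound of 31 that the loop in the quoted patch uses. */
#define SKETCH_PAGE_SHIFT 12
#define SKETCH_MAX_SHIFT  31

/* Hypothetical stand-in for the kernel's physical buffer descriptor. */
struct phys_buf {
	uint64_t addr;
	uint64_t size;
};

/*
 * Collect every address bit that limits the usable page size:
 *  - bits where the first physical address and the I/O virtual address differ,
 *  - the start of every buffer except the first,
 *  - the end of every buffer except the last.
 * The interior-boundary terms are an assumption; the quoted diff skips them.
 * The caller must pass n >= 1.
 */
static uint64_t build_mask(const struct phys_buf *buf, size_t n, uint64_t iova)
{
	uint64_t mask = buf[0].addr ^ iova;
	size_t i;

	for (i = 0; i < n; ++i) {
		if (i != 0)
			mask |= buf[i].addr;
		if (i != n - 1)
			mask |= buf[i].addr + buf[i].size;
	}
	return mask;
}

/* Loop form that keeps the single-buffer special case.
 * Returns the page shift, or -1 if the addresses disagree below page size. */
static int find_page_shift(const struct phys_buf *buf, size_t n, uint64_t iova)
{
	uint64_t mask = build_mask(buf, n, iova);
	int shift;

	if (mask & ((1ULL << SKETCH_PAGE_SHIFT) - 1))
		return -1;

	for (shift = SKETCH_PAGE_SHIFT; shift < SKETCH_MAX_SHIFT; ++shift) {
		/* Stop at the first bit that any constraint has set. */
		if ((1ULL << shift) & mask)
			break;
		/* One buffer: stop once a single page covers it. */
		if (n == 1 &&
		    (1ULL << shift) >= buf[0].size +
				       (buf[0].addr & ((1ULL << shift) - 1)))
			break;
	}
	return shift;
}

/* Variant without the special case: lowest set bit of the mask, capped at 31. */
static int find_page_shift_lowest_bit(const struct phys_buf *buf, size_t n,
				      uint64_t iova)
{
	uint64_t mask = build_mask(buf, n, iova);
	int shift = SKETCH_PAGE_SHIFT;

	if (mask & ((1ULL << SKETCH_PAGE_SHIFT) - 1))
		return -1;

	mask |= 1ULL << SKETCH_MAX_SHIFT;	/* sentinel so the scan always stops */
	while (!((1ULL << shift) & mask))
		++shift;
	return shift;
}
```

For a single buffer whose physical and virtual addresses agree in every bit, the second variant runs all the way to the cap, which is the "larger (maybe much larger) than necessary" page size mentioned in the thread.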
> > ================================================================================ > > > diff --git a/drivers/infiniband/hw/mthca/mthca_provider.c > b/drivers/infiniband/hw/mthca/mthca_provider.c > index 6bcde1c..4e04b4f 100644 > --- a/drivers/infiniband/hw/mthca/mthca_provider.c > +++ b/drivers/infiniband/hw/mthca/mthca_provider.c > @@ -929,11 +929,7 @@ static struct ib_mr *mthca_reg_phys_mr(struct ib_pd > *pd, > int err; > int i, j, n; > > - /* First check that we have enough alignment */ > - if ((*iova_start & ~PAGE_MASK) != (buffer_list[0].addr & > ~PAGE_MASK)) > - return ERR_PTR(-EINVAL); > - > - mask = 0; > + mask = buffer_list[0].addr ^ *iova_start; > total_size = 0; > for (i = 0; i < num_phys_buf; ++i) { > if (i != 0) > @@ -948,16 +944,16 @@ static struct ib_mr *mthca_reg_phys_mr(struct > ib_pd *pd, > return ERR_PTR(-EINVAL); > > /* Find largest page shift we can use to cover buffers */ > - for (shift = PAGE_SHIFT; shift < 31; ++shift) > - if (num_phys_buf > 1) { > - if ((1ULL << shift) & mask) > - break; > - } else { > + for (shift = PAGE_SHIFT; shift < 31; ++shift) { > + if ((1ULL << shift) & mask) > + break; > + if (num_phys_buf == 1) { > if (1ULL << shift >= > buffer_list[0].size + > (buffer_list[0].addr & ((1ULL << shift) - 1))) > break; > } > + } > > buffer_list[0].size += buffer_list[0].addr & ((1ULL << shift) - 1); > buffer_list[0].addr &= ~0ull << shift; > ================================================================================ > > > I'm still annoyed by the (num_phys_buf == 1) special case. I'm > wondering if it's still needed. If you leave out that if-statement > entirely, you may end up using a page size that is larger (maybe much > larger) than necessary, but I think things will still work, given that > the virtual-to-physical alignment constraints are respected. If you > remove the special case, you can replace the whole loop with an ffs() call. > > Anyway, your patch works fine. Use my suggestion only if you like it. 
> > - Bryan > So is cxgb3 still busted? IE I still need that additional patch you sent? Perhaps I can align cxgb3 and mthca to the same logic. Maybe create a core helper function... Steve. From huongvp at yahoo.com Mon Jan 28 09:07:27 2008 From: huongvp at yahoo.com (Vu Pham) Date: Mon, 28 Jan 2008 09:07:27 -0800 Subject: [Scst-devel] [ofa-general] OFED 1.3 RC2 release is available In-Reply-To: References: <6C2C79E72C305246B504CBA17B5500C9031E655B@mtlexch01.mtl.com> <479DC0B4.9040902@vlnb.net> Message-ID: <479E0BCF.3010108@yahoo.com> Bart Van Assche wrote: > On Jan 28, 2008 12:47 PM, Vladislav Bolkhovitin wrote: >> Bart Van Assche wrote: >>> Apparently OFED 1.3 includes SRP target support ? Although I consider >>> SRP target support as a very valuable contribution, it should not be >>> included in the OFED distribution but in the SCST distribution. The >>> reason is that the SRP target relies on SCST interfaces that can >>> potentially change with each new SCST release. Consider e.g. the >>> scsi_tgt.h header file, which defines the interface between SCST core >>> and SCST mid-level modules. The version of this file included with >>> git://git.openfabrics.org/~vu/ofed_1_3.git (0.9.6-pre3) is >>> incompatible with the latest scsi_tgt.h file from the SCST project >>> (0.9.6-rc1). This may cause kernel crashes for OFED 1.3 SRP target >>> users who combine OFED 1.3 with the latest SCST version. >> No it won't crash, it will refuse to run. I've recently added in SCST >> protection against attempts running mixed SCST and target driver versions.
>> >> BTW, there is a >> >> **************************************************************** >> *!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!* >> *!! !!* >> *!! BIG FAT WARNING ABOUT MIXED VERSIONS PROBLEM !!* >> *!! !!* >> *!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!* >> **************************************************************** > > Hello Vladislav, > > I did not test the above scenario -- what I wrote was the result of > source reading. It is very good that interface versions are checked > inside SCST before mid-level drivers are used. Even with interface > version checking in place, my opinion is that the SRP target code > should be included in the SCST project and not in the OFED project. > > Bart. Hi Bart, On srpt readme file, the prerequisite is install SCST BEFORE ofed-1.3 or like Vlad warning "recompiling ofed" if you install scst after install ofed. here is one of the reason srpt is part of ofed not scst: SCST is GPL ofed + srpt is GPL or BSD -vu From hrosenstock at xsigo.com Mon Jan 28 10:13:28 2008 From: hrosenstock at xsigo.com (Hal Rosenstock) Date: Mon, 28 Jan 2008 10:13:28 -0800 Subject: [ofa-general] [PATCH] [TRIVIAL] opensm/osm_subnet.c: Better clarity in opensm.opts file for perfmgr_redir Message-ID: <1201544008.19045.23.camel@hrosenstock-ws.xsigo.com> opensm/osm_subnet.c: Better clarity in opensm.opts file for perfmgr_redir Signed-off-by: Hal Rosenstock diff --git a/opensm/opensm/osm_subnet.c b/opensm/opensm/osm_subnet.c index 01a2a7a..8ae4333 100644 --- a/opensm/opensm/osm_subnet.c +++ b/opensm/opensm/osm_subnet.c @@ -1629,7 +1629,7 @@ ib_api_status_t osm_subn_write_conf_file(IN osm_subn_opt_t * const p_opts) "#\n# Performance Manager Options\n#\n" "# perfmgr enable\n" "perfmgr %s\n\n" - "# perfmgr_redir enable\n" + "# perfmgr redirection enable\n" "perfmgr_redir %s\n\n" "# sweep time in seconds\n" "perfmgr_sweep_time_s %u\n\n" From hrosenstock at xsigo.com Mon Jan 28 10:13:35 2008 From: hrosenstock at 
xsigo.com (Hal Rosenstock) Date: Mon, 28 Jan 2008 10:13:35 -0800 Subject: [ofa-general] [PATCH][TRIVIAL] opensm/osm_perfmgr.c: Fix duplicated error code Message-ID: <1201544015.19045.24.camel@hrosenstock-ws.xsigo.com> opensm/osm_perfmgr.c: Fix duplicated error code Signed-off-by: Hal Rosenstock diff --git a/opensm/opensm/osm_perfmgr.c b/opensm/opensm/osm_perfmgr.c index 091b46a..77d8f33 100644 --- a/opensm/opensm/osm_perfmgr.c +++ b/opensm/opensm/osm_perfmgr.c @@ -210,7 +210,7 @@ osm_perfmgr_mad_send_err_callback(void *bind_context, osm_madw_t * p_madw) if ((p_node = cl_qmap_get(&(pm->monitored_map), node_guid)) == cl_qmap_end(&(pm->monitored_map))) { osm_log(pm->log, OSM_LOG_ERROR, - "osm_pc_rcv_process: ERR 4C12: GUID 0x%016" + "osm_pc_rcv_process: ERR 4C15: GUID 0x%016" PRIx64 " not found in monitored map\n", node_guid); goto Exit; From hrosenstock at xsigo.com Mon Jan 28 10:13:42 2008 From: hrosenstock at xsigo.com (Hal Rosenstock) Date: Mon, 28 Jan 2008 10:13:42 -0800 Subject: [ofa-general] [PATCH] opensm/osm_perfmgr.c: If redirection requested but disabled, don't rerequest Message-ID: <1201544022.19045.25.camel@hrosenstock-ws.xsigo.com> opensm/osm_perfmgr.c: If redirection requested but disabled, don't rerequest Signed-off-by: Hal Rosenstock diff --git a/opensm/opensm/osm_perfmgr.c b/opensm/opensm/osm_perfmgr.c index 091b46a..674d8a5 100644 --- a/opensm/opensm/osm_perfmgr.c +++ b/opensm/opensm/osm_perfmgr.c @@ -1166,8 +1166,11 @@ static void osm_pc_rcv_process(void *context, void *data) goto Exit; } - if (!pm->subn->opt.perfmgr_redir) - goto ReIssue; + if (!pm->subn->opt.perfmgr_redir) { + osm_log(pm->log, OSM_LOG_ERROR, + "osm_pc_rcv_process: ERR 4C16: redirection requested but disabled\n"); + goto Exit; + } /* LID redirection support (easier than GID redirection) */ cl_plock_acquire(pm->lock);
From arlin.r.davis at intel.com Mon Jan 28 10:38:26 2008 From: arlin.r.davis at intel.com (Arlin Davis) Date: Mon, 28 Jan 2008 10:38:26 -0800 Subject: [ofa-general] RE: [ewg] OFED meeting agenda for today (Jan 28) meeting In-Reply-To: <479DF98A.5080907@mellanox.co.il> References: <479DF98A.5080907@mellanox.co.il> Message-ID: <000001c861dc$f9f96670$a5e0180a@amr.corp.intel.com> Vlad, Why is dapl v1 and v2 libraries not setup for --build32 option? (see bug 824) Are there some issues needing resolved? -arlin _____ From: ewg-bounces at lists.openfabrics.org [mailto:ewg-bounces at lists.openfabrics.org] On Behalf Of Tziporet Koren Sent: Monday, January 28, 2008 7:50 AM To: EWG Cc: OpenFabrics General Subject: [ewg] OFED meeting agenda for today (Jan 28) meeting This is the agenda for the meeting today: 1. OFED 1.3 readiness toward RC3 this week - all 2.
Decide regarding IBM request to change IPoIB to support CM without SRQ (see http://lists.openfabrics.org/pipermail/ewg/2008-January/005507.html)
3. Review high priority bugs:
846  critical  RHEL 5   jim at mellanox.com  SDP crash on RHEL5 ppc64 running netserver
859  critical  SLES 10  monis at voltaire.com  Bonding configuration on Sles10 sp1 is not loaded consistently
863  critical  RHEL 4   monis at voltaire.com  ib-bonding won't compile for RHEL4 U6
874  critical  Other    rjwalsh at pathscale.com  Intel MPI (IMB test) hangs intermittently on the qlogic HCA
760  major     All      eli at mellanox.co.il  UDP performance on Rx is lower than Tx
761  major     Other    eli at mellanox.co.il  Poor and jittery UDP performance at small messages
869  major     SLES 10  orenk at dev.mellanox.co.il  mstflint won't build on SLES10 x86
736  major     Other    rolandd at cisco.com  IBV_WC_RETRY_EXC_ERR errors with local rdma_reads
767  major     Other    swise at opengridcomputing.com  Non backport Kernels that don't build in genalloc cause compile errors for cxgb3
Tziporet
From ralph.campbell at qlogic.com Mon Jan 28 11:14:41 2008 From: ralph.campbell at qlogic.com (Ralph Campbell) Date: Mon, 28 Jan 2008 11:14:41 -0800 Subject: [ofa-general] [PATCH 13/16 v2] IB/ipoib: Add support for modify CQ In-Reply-To: <1201193855.6755.66.camel@mtls03> References: <1201193855.6755.66.camel@mtls03> Message-ID: <1201547683.27464.1.camel@brick.pathscale.com> Can you send out a specification for what ib_modify_cq() is supposed to do? It isn't part of the IBTA verbs spec. so I don't know how other HCAs are supposed to implement it. On Thu, 2008-01-24 at 18:57 +0200, Eli Cohen wrote: > Add support for modify CQ > > Add support for modifying CQ parameters for controlling > event generation moderation.
> > Signed-off-by: Eli Cohen > --- > > changes: > Fix spelling mistakes > Fix function documentation > > drivers/infiniband/core/verbs.c | 7 +++++++ > include/rdma/ib_verbs.h | 11 +++++++++++ > 2 files changed, 18 insertions(+), 0 deletions(-) > > diff --git a/drivers/infiniband/core/verbs.c b/drivers/infiniband/core/verbs.c > index 86ed8af..84709ed 100644 > --- a/drivers/infiniband/core/verbs.c > +++ b/drivers/infiniband/core/verbs.c > @@ -628,6 +628,13 @@ struct ib_cq *ib_create_cq(struct ib_device *device, > } > EXPORT_SYMBOL(ib_create_cq); > > +int ib_modify_cq(struct ib_cq *cq, u16 cq_count, u16 cq_period) > +{ > + return cq->device->modify_cq ? > + cq->device->modify_cq(cq, cq_count, cq_period) : -ENOSYS; > +} > +EXPORT_SYMBOL(ib_modify_cq); > + > int ib_destroy_cq(struct ib_cq *cq) > { > if (atomic_read(&cq->usecnt)) > diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h > index 6ef1729..a8f94a9 100644 > --- a/include/rdma/ib_verbs.h > +++ b/include/rdma/ib_verbs.h > @@ -984,6 +984,8 @@ struct ib_device { > int comp_vector, > struct ib_ucontext *context, > struct ib_udata *udata); > + int (*modify_cq)(struct ib_cq *cq, u16 cq_count, > + u16 cq_period); > int (*destroy_cq)(struct ib_cq *cq); > int (*resize_cq)(struct ib_cq *cq, int cqe, > struct ib_udata *udata); > @@ -1389,6 +1391,15 @@ struct ib_cq *ib_create_cq(struct ib_device *device, > int ib_resize_cq(struct ib_cq *cq, int cqe); > > /** > + * ib_modify_cq - Modifies moderation params of the CQ > + * @cq: The CQ to modify. > + * @cq_count: number of CQEs that will trigger an event > + * @cq_period: max period of time in usec before triggering an event > + * > + */ > +int ib_modify_cq(struct ib_cq *cq, u16 cq_count, u16 cq_period); > + > +/** > * ib_destroy_cq - Destroys the specified CQ. > * @cq: The CQ to destroy. 
> */ From Jeffrey.C.Becker at nasa.gov Mon Jan 28 11:40:21 2008 From: Jeffrey.C.Becker at nasa.gov (Jeff Becker) Date: Mon, 28 Jan 2008 11:40:21 -0800 Subject: [ofa-general] OFA server rebooted Message-ID: <479E2FA5.6070903@nasa.gov> Hi. After all the e-mails, I started investigating the disk space issue, and found some unkillable processes hanging on our NFS backup partition. After 479 days of uptime :-) , I figured it was a good time to reboot the server. It's back up now and the web page, wiki and bugzilla all seem to be OK. Please let me know if you find any problems. Thanks. -jeff From rosnbrg at us.ibm.com Mon Jan 28 11:23:55 2008 From: rosnbrg at us.ibm.com (Bryan S Rosenburg) Date: Mon, 28 Jan 2008 14:23:55 -0500 Subject: [ofa-general] [PATCH 2/3] RDMA/cxgb3: fix page shift calculation in build_phys_page_list() In-Reply-To: <479E08E4.4010707@opengridcomputing.com> Message-ID: Steve Wise wrote on 01/28/2008 11:55:00 AM: > So is cxgb3 still busted? IE I still need that additional patch you > sent? Perhaps I can align cxgb3 and mthca to the same logic. Maybe > create a core helper function... As far as I can tell, cxgb3 is still busted for single-element buffer lists. But you have an easy fix, which is what the first patch was about. However, with the approach you've taken, you've given up some potential uses of larger page sizes. To pick up those cases, you'd have to make the virtual address available to the page shift calculation in cxgb3:build_phys_page_list(). A core helper function shared with mthca and perhaps others sounds like a great idea to me, but I'm new to this community. I don't know how much agreement you'd have to get. - Bryan -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From sashak at voltaire.com Mon Jan 28 12:04:45 2008 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Mon, 28 Jan 2008 20:04:45 +0000 Subject: [ofa-general] Re: [PATCH] [TRIVIAL] opensm/osm_subnet.c: Better clarity in opensm.opts file for perfmgr_redir In-Reply-To: <1201544008.19045.23.camel@hrosenstock-ws.xsigo.com> References: <1201544008.19045.23.camel@hrosenstock-ws.xsigo.com> Message-ID: <20080128200444.GO11277@sashak.voltaire.com> On 10:13 Mon 28 Jan , Hal Rosenstock wrote: > opensm/osm_subnet.c: Better clarity in opensm.opts file for > perfmgr_redir > > Signed-off-by: Hal Rosenstock Applied. Thanks. Sasha From sashak at voltaire.com Mon Jan 28 12:05:08 2008 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Mon, 28 Jan 2008 20:05:08 +0000 Subject: [ofa-general] Re: [PATCH][TRIVIAL] opensm/osm_perfmgr.c: Fix duplicated error code In-Reply-To: <1201544015.19045.24.camel@hrosenstock-ws.xsigo.com> References: <1201544015.19045.24.camel@hrosenstock-ws.xsigo.com> Message-ID: <20080128200508.GP11277@sashak.voltaire.com> On 10:13 Mon 28 Jan , Hal Rosenstock wrote: > opensm/osm_perfmgr.c: Fix duplicated error code > > Signed-off-by: Hal Rosenstock Applied. Thanks. Sasha From ardavis at ichips.intel.com Mon Jan 28 12:01:09 2008 From: ardavis at ichips.intel.com (Arlin Davis) Date: Mon, 28 Jan 2008 12:01:09 -0800 Subject: [ofa-general] RE: [ewg] OFED meeting agenda for today (Jan 28) meeting In-Reply-To: <000001c861dc$f9f96670$a5e0180a@amr.corp.intel.com> References: <479DF98A.5080907@mellanox.co.il> <000001c861dc$f9f96670$a5e0180a@amr.corp.intel.com> Message-ID: <479E3485.6020204@ichips.intel.com> Arlin Davis wrote: > Vlad, > > Why is dapl v1 and v2 libraries not setup for --build32 option? (see bug > 824) Are there some issues needing resolved? I enabled the 32-bit build for uDAPL and the build works fine but we have some issues with the RPM installation due to ordering. 
Does the install.pl script handle ofa_pre_inst settings when both 32bit and 64bit libraries are being installed?
Install dapl-v1 RPM:
Running rpm -iv /tmp/OFED-1.3-rc2/RPMS/redhat-release-5Server-5.1.0.2/dapl-1.2.3-0.84.ofed20071128.x86_64.rpm
Install dapl-v2 RPM:
Running rpm -iv /tmp/OFED-1.3-rc2/RPMS/redhat-release-5Server-5.1.0.2/dapl-2.0.3-0.217.ofed20080115.x86_64.rpm
Install dapl-devel RPM:
Running rpm -iv /tmp/OFED-1.3-rc2/RPMS/redhat-release-5Server-5.1.0.2/dapl-devel-2.0.3-0.217.ofed20080115.x86_64.rpm
Running rpm -iv /tmp/OFED-1.3-rc2/RPMS/redhat-release-5Server-5.1.0.2/dapl-devel-2.0.3-0.217.ofed20080115.i686.rpm
Failed to install dapl-devel RPM
See /tmp/OFED.8978.logs/dapl-devel.rpminstall.log
error: Failed dependencies:
libdaplofa.so.2 is needed by dapl-devel-2.0.3-0.217.ofed20080115.i686
libdat.so.2 is needed by dapl-devel-2.0.3-0.217.ofed20080115.i686
I also noticed that dapl-devel is not part of HPC selection. Please add. Thanks, -arlin From sweitzen at cisco.com Mon Jan 28 10:57:11 2008 From: sweitzen at cisco.com (Scott Weitzenkamp (sweitzen)) Date: Mon, 28 Jan 2008 10:57:11 -0800 Subject: [ofa-general] is OF bugzilla down? Message-ID: I can't seem to get to it.... Scott Weitzenkamp SQA and Release Manager Server Virtualization Business Unit Cisco Systems From sweitzen at cisco.com Mon Jan 28 12:05:26 2008 From: sweitzen at cisco.com (Scott Weitzenkamp (sweitzen)) Date: Mon, 28 Jan 2008 12:05:26 -0800 Subject: [ofa-general] is OF bugzilla down? In-Reply-To: References: Message-ID: It's back up now. ________________________________ From: general-bounces at lists.openfabrics.org [mailto:general-bounces at lists.openfabrics.org] On Behalf Of Scott Weitzenkamp (sweitzen) Sent: Monday, January 28, 2008 10:57 AM To: ewg at lists.openfabrics.org; general at lists.openfabrics.org Subject: [ofa-general] is OF bugzilla down? I can't seem to get to it....
Scott Weitzenkamp SQA and Release Manager Server Virtualization Business Unit Cisco Systems From sashak at voltaire.com Mon Jan 28 12:18:58 2008 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Mon, 28 Jan 2008 20:18:58 +0000 Subject: [ofa-general] Re: [PATCH] opensm/osm_perfmgr.c: If redirection requested but disabled, don't rerequest In-Reply-To: <1201544022.19045.25.camel@hrosenstock-ws.xsigo.com> References: <1201544022.19045.25.camel@hrosenstock-ws.xsigo.com> Message-ID: <20080128201858.GQ11277@sashak.voltaire.com> On 10:13 Mon 28 Jan , Hal Rosenstock wrote: > opensm/osm_perfmgr.c: If redirection requested but disabled, don't > rerequest > > Signed-off-by: Hal Rosenstock Applied. Thanks. Sasha From eli at dev.mellanox.co.il Mon Jan 28 13:04:54 2008 From: eli at dev.mellanox.co.il (Eli Cohen) Date: Mon, 28 Jan 2008 23:04:54 +0200 Subject: [ofa-general] [PATCH 13/16 v2] IB/ipoib: Add support for modify CQ In-Reply-To: <1201547683.27464.1.camel@brick.pathscale.com> References: <1201193855.6755.66.camel@mtls03> <1201547683.27464.1.camel@brick.pathscale.com> Message-ID: <20080128210450.GA26001@mtls03> On Mon, Jan 28, 2008 at 11:14:41AM -0800, Ralph Campbell wrote: > Can you send out a specification for what ib_modify_cq() > is supposed to do? It isn't part of the IBTA verbs spec. > so I don't know how other HCAs are supposed to implement it. > Many NICs provide a way to control the rate at which interrupts are generated. Controlling that rate helps reduce the overhead of interrupt handling. In IB, a CQ may generate notifications, which are asynchronous from the software's perspective and may in turn generate interrupts. The following interface allows controlling this on devices that support it. A notification is generated when either of two conditions becomes true: 1. cq_count - this number of CQEs has been pushed into the CQ since the last notification. 2. cq_period - this number of microseconds has elapsed since the last notification. After a notification is generated, the conditions are reset.
> > + * ib_modify_cq - Modifies moderation params of the CQ > > + * @cq: The CQ to modify. > > + * @cq_count: number of CQEs that will trigger an event > > + * @cq_period: max period of time in usec before triggering an event > > + * > > + */ > > +int ib_modify_cq(struct ib_cq *cq, u16 cq_count, u16 cq_period); > > + > From tziporet at dev.mellanox.co.il Mon Jan 28 13:32:22 2008 From: tziporet at dev.mellanox.co.il (Tziporet Koren) Date: Mon, 28 Jan 2008 23:32:22 +0200 Subject: [ofa-general] Problem with ConnectX HBA In-Reply-To: <18333.65094.565387.209505@gargle.gargle.HOWL> References: <18333.65094.565387.209505@gargle.gargle.HOWL> Message-ID: <479E49E6.6000100@mellanox.co.il> Roland Fehrenbacher wrote: > Hi, > > when running MPI codes, we have the following error messages coming > from some of our servers running 2.6.22.16 with kernel modules from > ofa_kernel-1.2.5.4: > > mlx4_core 0000:08:00.0: SW2HW_MPT failed (-16) > > The communication on the corresponding machines is completely blocked, > and ibstat is just hanging. > > Any idea what could be wrong? Just for additional info: When running > the kernel with the original 2.6.22 drivers, I had these kind of error > messages at a much higher rate. > > > What is the FW version you use? What is the type of machine used? Can you send us description how to reproduce? Tziporet From perkinjo at cse.ohio-state.edu Mon Jan 28 13:38:06 2008 From: perkinjo at cse.ohio-state.edu (Jonathan L. Perkins) Date: Mon, 28 Jan 2008 16:38:06 -0500 Subject: [ofa-general] MVAPICH2 1.0.2 SRPM Available Message-ID: <479E4B3E.6030502@cse.ohio-state.edu> I've uploaded a new SRPM for MVAPICH2 to the openfabrics server. This is located in ~perkinjo/ofed_1_3/ and is identified by the latest.txt file.
This SRPM includes a fix that should solve the problem reported by Scott Weitzenkamp as well as some other improvements. -- Jonathan Perkins http://www.cse.ohio-state.edu/~perkinjo From tziporet at dev.mellanox.co.il Mon Jan 28 13:46:41 2008 From: tziporet at dev.mellanox.co.il (Tziporet Koren) Date: Mon, 28 Jan 2008 23:46:41 +0200 Subject: [ofa-general] [GIT PULL] First InfiniBand/RDMA merge In-Reply-To: References: Message-ID: <479E4D41.6090201@mellanox.co.il> Roland Dreier wrote: > Linus, if you haven't headed off to the airport, please pull from > > master.kernel.org:/pub/scm/linux/kernel/git/roland/infiniband.git for-linus > > This tree is also available from kernel.org mirrors at: > > git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband.git for-linus > Roland, Can you update if you plan to take these patches: 1. IB/ipoib stateless offloads 2. IB/mlx4: shrinking WQE Thanks, Tziporet From rdreier at cisco.com Mon Jan 28 13:49:16 2008 From: rdreier at cisco.com (Roland Dreier) Date: Mon, 28 Jan 2008 13:49:16 -0800 Subject: [ofa-general] [GIT PULL] First InfiniBand/RDMA merge In-Reply-To: <479E4D41.6090201@mellanox.co.il> (Tziporet Koren's message of "Mon, 28 Jan 2008 23:46:41 +0200") References: <479E4D41.6090201@mellanox.co.il> Message-ID: > 1. IB/ipoib stateless offloads Maybe, Or still seems to have some issues. I hope to get at least part of the series. > 2. IB/mlx4: shrinking WQE Probably. 
From rf at q-leap.de Mon Jan 28 14:09:01 2008 From: rf at q-leap.de (Roland Fehrenbacher) Date: Mon, 28 Jan 2008 23:09:01 +0100 Subject: [ofa-general] Problem with ConnectX HBA In-Reply-To: <479E49E6.6000100@mellanox.co.il> References: <18333.65094.565387.209505@gargle.gargle.HOWL> <479E49E6.6000100@mellanox.co.il> Message-ID: <18334.21117.537859.232550@gargle.gargle.HOWL> >>>>> "Tziporet" == Tziporet Koren writes: Tziporet> Roland Fehrenbacher wrote: >> Hi, >> >> when running MPI codes, we have the following error messages >> coming from some of our servers running 2.6.22.16 with kernel >> modules from ofa_kernel-1.2.5.4: >> >> mlx4_core 0000:08:00.0: SW2HW_MPT failed (-16) >> >> The communication on the corresponding machines is completely >> blocked, and ibstat is just hanging. >> >> Any idea what could be wrong? Just for additional info: When >> running the kernel with the original 2.6.22 drivers, I had >> these kind of error messages at a much higher rate. >> >> >> Tziporet> What is the FW version you use?
# ibstat
CA 'mlx4_0'
    CA type: MT25418
    Number of ports: 2
    Firmware version: 2.3.0
    Hardware version: 0
    Node GUID: 0x0002c9020025a69c
    System image GUID: 0x0002c9020025a69f
    Port 1:
        State: Active
        Physical state: LinkUp
        Rate: 20
        Base lid: 199
        LMC: 0
        SM lid: 1
        Capability mask: 0x02510868
        Port GUID: 0x0002c9020025a69d
Tziporet> What is the type of machine used? It is a dual Xeon (Quad core) on a 5000P chipset board. Tziporet> Can you send us description how to reproduce? I started a 100 node / 8 core = 800 processes mvapich job (linpack). The issue occurred after about 1 hour of runtime. A 50 node / 8 core = 400 processes mvapich job ran fine several times for more than 36 hours (including the node on which this issue has now occurred).
Roland From Sudhir.Dachepalli at lsi.com Mon Jan 28 14:52:15 2008 From: Sudhir.Dachepalli at lsi.com (Dachepalli, Sudhir) Date: Mon, 28 Jan 2008 15:52:15 -0700 Subject: [ofa-general] subscribe Message-ID: subscribe -------------- next part -------------- An HTML attachment was scrubbed... URL: From arlin.r.davis at intel.com Mon Jan 28 15:49:45 2008 From: arlin.r.davis at intel.com (Davis, Arlin R) Date: Mon, 28 Jan 2008 15:49:45 -0800 Subject: [ofa-general] [ANNOUNCE] dapl-1.2.4 and dapl-2.0.4 release Message-ID: There are new releases for dapl 1.2 and 2.0 available on the OFA download page and in my git tree. md5sum: eeb8ff274f3c297ec88c551177ca902f dapl-1.2.3.tar.gz md5sum: 6336689ef9d764d144b9c9cd9219987a dapl-2.0.2.tar.gz Vlad, please pull both releases into OFED 1.3 RC3 and install the following packages: dapl-1.2.4-1 dapl-2.0.4-1 dapl-utils-2.0.4-1 dapl-devel-2.0.4-1 dapl-debuginfo-2.0.4-1 See http://www.openfabrics.org/downloads/dapl/README.html for details. -arlin -------------- next part -------------- An HTML attachment was scrubbed... URL: From pawel.dziekonski at pwr.wroc.pl Mon Jan 28 16:37:31 2008 From: pawel.dziekonski at pwr.wroc.pl (Pawel Dziekonski) Date: Tue, 29 Jan 2008 01:37:31 +0100 Subject: [ofa-general] Status of NFS-RDMA ? In-Reply-To: References: <20080123194728.GA10437@cefeid.wcss.wroc.pl> <4797AD59.2000206@mellanox.co.il> <20080126193035.GA21209@cefeid.wcss.wroc.pl> Message-ID: <20080129003731.GA30262@cefeid.wcss.wroc.pl> On Mon, 28 Jan 2008 at 10:14:22AM -0500, James Lentini wrote: > > > On Sat, 26 Jan 2008, Pawel Dziekonski wrote: > > > I pulled Tom's tree from new url and build a kernel. > > If you enabled support for INFINIBAND drivers (IB and iWARP support) > and NFS client/server support, the kernel should be ready to go (run > "grep RDMA /your_kernel_sources/.config" to confirm that > CONFIG_SUNRPC_XPRT_RDMA is either m or y). > > NFS/RDMA doesn't require OFED be installed. 
OFED is a release of the > Linux kernel sources and some userspace libraries/tools. If you are > > then I downloaded OFED from > > http://www.mellanox.com/downloads/NFSoRDMA/OFED-1.2-NFS-RDMA.gz, > > I don't know what the above URL contains. The latest code is in Tom > Tucker's tree (and now NFS server maintainer Bruce Fields tree). It is Hi, back to the subject, on the proper mailing list. I have more than 3 years of experience with Mellanox hardware and IBGold, so I basically know what OFED is all about. Up to now I was only using IBGold, since IB drivers appeared in the kernel pretty recently. Currently I have new hardware. I'm running Tom's kernel and have already done some MPI tests. SDP is not working, probably because the SDP kernel modules were not built. ;) I understand that those modules are only available from ofa-kernel; please correct me if I'm wrong. The system is Scientific Linux 4.5, which is supposed to be a fully compatible RH4 clone. The hardware is Supermicro motherboards with Mellanox MT25204 and a Flextronics switch. Error log from the ofa-kernel build: > > make[1]: Entering directory `/usr/src/ib/xprt-switch-2.6' > > test -e include/linux/autoconf.h -a -e include/config/auto.conf || ( \ > > echo; \ > > echo " ERROR: Kernel configuration is invalid."; \ > > echo " include/linux/autoconf.h or include/config/auto.conf are missing."; \ > > echo " Run 'make oldconfig && make prepare' on kernel src to fix it."; \ > > echo; \ > > /bin/false) > > > > obviously, doing 'make oldconfig && make prepare' does not help.
> > anyway, above mentioned files do exist: > > > > # ls -la /usr/src/ib/xprt-switch-2.6/{include/linux/autoconf.h,include/config/auto.conf} > > -rw-r--r-- 1 root root 10156 Jan 25 17:42 /usr/src/ib/xprt-switch-2.6/include/config/auto.conf > > -rw-r--r-- 1 root root 14733 Jan 25 17:42 /usr/src/ib/xprt-switch-2.6/include/linux/autoconf.h > > > > despite of above, compilation continues but fails with: > > > > gcc -Wp,-MD,/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2/drivers/infiniband/core/.mad.o.d -nostdinc -isystem /usr/lib/gcc/x86_64-redhat-linux/3.4.6/include -D__KERNEL__ -I/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2/include -I/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2/drivers/infiniband/include -Iinclude -include include/linux/autoconf.h -include /var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2/include/linux/autoconf.h -Wall -Wundef -Wstrict-prototypes -Wno-trigraphs -fno-strict-aliasing -fno-common -Werror-implicit-function-declaration -Os -m64 -mno-red-zone -mcmodel=kernel -pipe -Wno-sign-compare -fno-asynchronous-unwind-tables -funit-at-a-time -mno-sse -mno-mmx -mno-sse2 -mno-3dnow -maccumulate-outgoing-args -DCONFIG_AS_CFI=1 -DCONFIG_AS_CFI_SIGNAL_FRAME=1 -fomit-frame-pointer -Wdeclaration-after-statement -DMODULE -D"KBUILD_STR(s)=#s" -D"KBUILD_BASENAME=KBUILD_STR(mad)" -D"KBUILD_MODNAME=KBUILD_STR(ib_mad)" -c -o /var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2/drivers/infiniband/core/.tmp_mad.o /var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2/drivers/infiniband/core/mad.c > > /var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2/drivers/infiniband/core/mad.c: In function `ib_mad_init_module': > > /var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2/drivers/infiniband/core/mad.c:2966: error: too many arguments to function `kmem_cache_create' > > make[4]: *** [/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2/drivers/infiniband/core/mad.o] Error 1 > > make[3]: *** [/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2/drivers/infiniband/core] Error 2 > > make[2]: *** [/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2/drivers/infiniband] Error 2 > > make[1]: *** 
[_module_/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2] Error 2 > > make[1]: Leaving directory `/usr/src/ib/xprt-switch-2.6' > > make: *** [kernel] Error 2 > > error: Bad exit status from /var/tmp/rpm-tmp.3877 (%install) > > full log: > > https://cefeid.wcss.wroc.pl/d/tmp/OFED.build.32122.log thanks in advance for any help, P -- Pawel Dziekonski Wroclaw Centre for Networking & Supercomputing, HPC Department Politechnika Wr., pl. Grunwaldzki 9, bud. D2/101, 50-377 Wroclaw, POLAND phone: +48 71 3202043, fax: +48 71 3225797, http://www.wcss.wroc.pl
From bart.vanassche at gmail.com Tue Jan 29 00:04:04 2008 From: bart.vanassche at gmail.com (Bart Van Assche) Date: Tue, 29 Jan 2008 09:04:04 +0100 Subject: [ofa-general] Re: Distributing the SRP target source code Message-ID: On Jan 28, 2008 6:07 PM, Vu Pham wrote: > > On srpt readme file, the prerequisite is install SCST BEFORE > ofed-1.3 or like Vlad warning "recompiling ofed" if you > install scst after install ofed. This is what will happen if someone installs Linux kernel headers + SCST + OFED in this order: 1. Linux kernel headers matching the running kernel are installed in /usr/src/linux-.../include or equivalent, and a symbolic link to the kernel headers is created in /lib/modules/$(uname -r)/build/include. 2. By building and installing SCST, SCST modules are installed in /lib/modules/$(uname -r)/extra and SCST kernel headers are installed in /usr/local/include, a.o. SCST's scsi_tgt.h header file, the interface between SCST and mid-level SCSI drivers. 3. Next, OFED kernel modules are being built. During this process the SRP target module is compiled with the header file drivers/infiniband/ulp/srpt/scsi_tgt.h. The version of this file distributed with OFED 1.3 is incompatible with the one distributed with the latest version of SCST.
Or: the kernel will probably crash as soon as one starts using the SRP target module, even if he or she followed the above outlined "official" build procedure. Including /usr/local/include/scsi_tgt.h in the SRP target module is not an option -- kernel modules must not include userspace headers, except for the well known exceptions like . All this trouble can be avoided by distributing the SRP target code with SCST instead of with OFED. Furthermore, all kernel headers that define inter-module interfaces should reside in /include//... The SRP target breaks this convention by having a private copy of an inter-module interface in a local directory (drivers/infiniband/ulp/srpt/scsi_tgt.h). > here is one of the reason srpt is part of ofed not scst: > > SCST is GPL > ofed + srpt is GPL or BSD This is not an issue -- if you have a look at the Linux kernel, you will see that all source files are licensed under at least the GPLv2 and some source files are licensed under GPLv2 + one or more other licenses, e.g. BSD. Bart. From vuhuong at mellanox.com Tue Jan 29 00:20:54 2008 From: vuhuong at mellanox.com (Vu Pham) Date: Tue, 29 Jan 2008 00:20:54 -0800 Subject: [ofa-general] Re: Distributing the SRP target source code In-Reply-To: References: Message-ID: <479EE1E6.1010101@mellanox.com> Bart Van Assche wrote: > On Jan 28, 2008 6:07 PM, Vu Pham wrote: >> On srpt readme file, the prerequisite is install SCST BEFORE >> ofed-1.3 or like Vlad warning "recompiling ofed" if you >> install scst after install ofed. > > This is what will happen if someone installs Linux kernel headers + > SCST + OFED in this order: > 1. Linux kernel headers matching the running kernel are installed in > /usr/src/linux-.../include or equivalent, and a symbolic link to the > kernel headers is created in /lib/modules/$(uname -r)/build/include. > 2. 
By building and installing SCST, SCST modules are installed in > /lib/modules/$(uname -r)/extra and SCST kernel headers are installed > in /usr/local/include, a.o. SCST's scsi_tgt.h header file, the > interface between SCST and mid-level SCSI drivers. > 3. Next, OFED kernel modules are being built. During this process the > SRP target module is compiled with the header file > drivers/infiniband/ulp/srpt/scsi_tgt.h. The version of this file > distributed with OFED 1.3 is incompatible with the one distributed > with the latest version of SCST. Or: the kernel will probably crash as > soon as one starts using the SRP target module, even if he or she > followed the above outlined "official" build procedure. Including > /usr/local/include/scsi_tgt.h in the SRP target module is not an > option -- kernel modules must not include userspace headers, except > for the well known exceptions like . > There are two include paths. The first one is /usr/local/include/scst and the second one are drivers/infiniband/ulp/srpt. Therefore, building srpt in ofed will always use the /usr/local/include/scst path first and if you already install scst then there won't be any problem As you already know /usr/local/include/scst/scsi_tgt.h is not userspace header. SCST is not part of kernel yet; srpt is also not part of kernel > All this trouble can be avoided by distributing the SRP target code > with SCST instead of with OFED. The same problem would appear if someone use different ofed versions > > Furthermore, all kernel headers that define inter-module interfaces > should reside in /include//... The > SRP target breaks this convention by having a private copy of an > inter-module interface in a local directory > (drivers/infiniband/ulp/srpt/scsi_tgt.h). Once again srpt is not part of kernel; therefore, it breaks certain kernel rule. 
We'll fix it if scst is official part of kernel > > >> here is one of the reason srpt is part of ofed not scst: >> >> SCST is GPL >> ofed + srpt is GPL or BSD > > This is not an issue -- if you have a look at the Linux kernel, you > will see that all source files are licensed under at least the GPLv2 > and some source files are licensed under GPLv2 + one or more other > licenses, e.g. BSD. > I know that; however, I don't know if SCST has ok with double license or not -vu
From bart.vanassche at gmail.com Tue Jan 29 01:57:23 2008 From: bart.vanassche at gmail.com (Bart Van Assche) Date: Tue, 29 Jan 2008 10:57:23 +0100 Subject: [ofa-general] Re: Distributing the SRP target source code In-Reply-To: <479EE1E6.1010101@mellanox.com> References: <479EE1E6.1010101@mellanox.com> Message-ID: On Jan 29, 2008 9:20 AM, Vu Pham wrote: > There are two include paths. The first one is > /usr/local/include/scst and the second one are > drivers/infiniband/ulp/srpt. Therefore, building srpt in > ofed will always use the /usr/local/include/scst path first > and if you already install scst then there won't be any problem > > As you already know /usr/local/include/scst/scsi_tgt.h is > not userspace header. SCST is not part of kernel yet; srpt > is also not part of kernel Please remove drivers/infiniband/ulp/srpt/scsi_tgt.h and scst_const.h from the OFED distribution. It's better that the SRP target doesn't build if SCST was not yet installed instead of having to experience a kernel crash when OFED was built before SCST. > > All this trouble can be avoided by distributing the SRP target code > > with SCST instead of with OFED. > > The same problem would appear if someone use different ofed > versions Personally I never use OFED kernel modules built from the OFED source distribution but instead I use the InfiniBand kernel modules included with the Linux distribution in use. This guarantees consistence between the kernel core and the InfiniBand kernel modules. And whenever I use the SRP target code, I copy it to the kernel source tree and build it from there instead of relying on the OFED kernel build process. Bart.
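[Editor's sketch] The disagreement in this thread turns on which copy of scsi_tgt.h the srpt build picks up. The C fragment below is only a toy model of the behaviour Vu Pham describes (two include directories tried in order, first hit wins); it is not code from OFED or SCST. The struct `inc_dir`, its `has_header` flag and the function `resolve_header` are hypothetical names introduced for illustration; the two directory paths are the ones quoted in the messages above.

```c
#include <stddef.h>

/* Hypothetical model of one entry in the compiler's include search path. */
struct inc_dir {
    const char *path;   /* directory as named in the thread */
    int has_header;     /* 1 if a copy of scsi_tgt.h is present there */
};

/*
 * Return the directory whose copy of the header would be used, assuming
 * directories are tried in order and the first match wins. Returns NULL
 * when no directory holds the header, i.e. the build would fail.
 */
static const char *resolve_header(const struct inc_dir *dirs, size_t n)
{
    for (size_t i = 0; i < n; ++i)
        if (dirs[i].has_header)
            return dirs[i].path;
    return NULL;
}
```

Under this model, with SCST installed the /usr/local/include/scst copy is chosen; with SCST absent the search falls through silently to the private copy in drivers/infiniband/ulp/srpt, which is the case Bart objects to. If the private copy is removed, the lookup returns NULL, which corresponds to the build failure he prefers over a later crash.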
From vuhuong at mellanox.com Tue Jan 29 02:09:08 2008 From: vuhuong at mellanox.com (Vu Pham) Date: Tue, 29 Jan 2008 02:09:08 -0800 Subject: [ofa-general] Re: Distributing the SRP target source code In-Reply-To: References: <479EE1E6.1010101@mellanox.com> Message-ID: <479EFB44.7050000@mellanox.com> Bart Van Assche wrote: > On Jan 29, 2008 9:20 AM, Vu Pham wrote: >> There are two include paths. The first one is >> /usr/local/include/scst and the second one are >> drivers/infiniband/ulp/srpt. Therefore, building srpt in >> ofed will always use the /usr/local/include/scst path first >> and if you already install scst then there won't be any problem >> >> As you already know /usr/local/include/scst/scsi_tgt.h is >> not userspace header. SCST is not part of kernel yet; srpt >> is also not part of kernel > > Please remove drivers/infiniband/ulp/srpt/scsi_tgt.h and scst_const.h > from the OFED distribution. It's better that the SRP target doesn't > build if SCST was not yet installed instead of having to experience a > kernel crash when OFED was built before SCST. It's clear from both ofed/srpt readme and Vlad's SCST bit fat warning You either build scst before ofed or rebuild ofed > >>> All this trouble can be avoided by distributing the SRP target code >>> with SCST instead of with OFED. >> The same problem would appear if someone use different ofed >> versions > > Personally I never use OFED kernel modules built from the OFED source > distribution but instead I use the InfiniBand kernel modules included > with the Linux distribution in use. This guarantees consistence > between the kernel core and the InfiniBand kernel modules. And > whenever I use the SRP target code, I copy it to the kernel source > tree and build it from there instead of relying on the OFED kernel > build process. > And if you have never build ofed and only use IB drivers/modules in kernel tree then you should not use the srpt source in ofed distribution. 
You should srpt driver from this git tree git://git.openfabrics.org/~vu/srpt.git This srpt git tree does not have scsi_tgt.h in it From vlad at lists.openfabrics.org Tue Jan 29 03:12:20 2008 From: vlad at lists.openfabrics.org (Vladimir Sokolovsky Mellanox) Date: Tue, 29 Jan 2008 03:12:20 -0800 (PST) Subject: [ofa-general] ofa_1_3_kernel 20080129-0200 daily build status Message-ID: <20080129111220.57948E6096B@openfabrics.org> This email was generated automatically, please do not reply git_url: git://git.openfabrics.org/ofed_1_3/linux-2.6.git git_branch: ofed_kernel Common build parameters: --with-ipoib-mod --with-sdp-mod --with-srp-mod --with-user_mad-mod --with-user_access-mod --with-mthca-mod --with-mlx4-mod --with-core-mod --with-addr_trans-mod --with-rds-mod --with-cxgb3-mod --with-nes-mod Passed: Passed on i686 with 2.6.15-23-server Passed on i686 with linux-2.6.22 Passed on i686 with linux-2.6.21.1 Passed on i686 with linux-2.6.19 Passed on i686 with linux-2.6.18 Passed on i686 with linux-2.6.16 Passed on i686 with linux-2.6.17 Passed on i686 with linux-2.6.14 Passed on i686 with linux-2.6.13 Passed on i686 with linux-2.6.15 Passed on i686 with linux-2.6.12 Passed on x86_64 with linux-2.6.20 Passed on x86_64 with linux-2.6.16 Passed on ia64 with linux-2.6.12 Passed on x86_64 with linux-2.6.13 Passed on ia64 with linux-2.6.18 Passed on ppc64 with linux-2.6.16 Passed on ia64 with linux-2.6.19 Passed on ppc64 with linux-2.6.12 Passed on powerpc with linux-2.6.14 Passed on powerpc with linux-2.6.12 Passed on x86_64 with linux-2.6.17 Passed on ia64 with linux-2.6.17 Passed on ppc64 with linux-2.6.15 Passed on x86_64 with linux-2.6.19 Passed on ppc64 with linux-2.6.14 Passed on ia64 with linux-2.6.14 Passed on ia64 with linux-2.6.13 Passed on powerpc with linux-2.6.13 Passed on x86_64 with linux-2.6.18-8.el5 Passed on x86_64 with linux-2.6.12 Passed on ia64 with linux-2.6.15 Passed on ia64 with linux-2.6.16 Passed on x86_64 with linux-2.6.18 Passed on x86_64 with 
linux-2.6.21.1 Passed on ppc64 with linux-2.6.18 Passed on ppc64 with linux-2.6.17 Passed on x86_64 with linux-2.6.9-42.ELsmp Passed on ppc64 with linux-2.6.19 Passed on powerpc with linux-2.6.15 Passed on x86_64 with linux-2.6.14 Passed on ppc64 with linux-2.6.13 Passed on x86_64 with linux-2.6.15 Passed on x86_64 with linux-2.6.22.5-31-default Passed on ia64 with linux-2.6.21.1 Passed on x86_64 with linux-2.6.9-55.ELsmp Passed on x86_64 with linux-2.6.16.21-0.8-smp Passed on x86_64 with linux-2.6.22 Passed on x86_64 with linux-2.6.16.43-0.3-smp Passed on ia64 with linux-2.6.16.21-0.8-default Passed on ia64 with linux-2.6.23 Passed on x86_64 with linux-2.6.18-1.2798.fc6 Passed on ia64 with linux-2.6.22 Passed on ppc64 with linux-2.6.18-8.el5 Passed on x86_64 with linux-2.6.18-53.el5 Failed: From vlad at dev.mellanox.co.il Tue Jan 29 02:24:21 2008 From: vlad at dev.mellanox.co.il (Vladimir Sokolovsky) Date: Tue, 29 Jan 2008 12:24:21 +0200 Subject: [ofa-general] OFA server rebooted In-Reply-To: <479E2FA5.6070903@nasa.gov> References: <479E2FA5.6070903@nasa.gov> Message-ID: <479EFED5.6050406@dev.mellanox.co.il> Jeff Becker wrote: > Hi. After all the e-mails, I started investigating the disk space issue, > and found some unkillable processes hanging on our NFS backup partition. > After 479 days of uptime :-) , I figured it was a good time to reboot > the server. It's back up now and the web page, wiki and bugzilla all > seem to be OK. Please let me know if you find any problems. Thanks. > > -jeff > _______________________________________________ > general mailing list > general at lists.openfabrics.org > http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general > > To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general > Hi Jeff, Please check if the time on the OFA server is correct. 
Thanks, Vladimir From monis at voltaire.com Tue Jan 29 02:44:32 2008 From: monis at voltaire.com (Moni Shoua) Date: Tue, 29 Jan 2008 12:44:32 +0200 Subject: [ofa-general] [PATCH] IB/IPoIB Check if grat. ARP changed had arrived when working in connected mode Message-ID: <479F0390.8020102@voltaire.com> move a little up the code that checks for a situation where the remote GID stored in the ipoib_neigh is different than the one present in the neighbour (handle Gratuitous ARP) or that a bonding fail over has happened but the neighbour still has a pointer to an ipoib_neigh created not by the current slave. This will cause the driver to apply the check also for connected mode neighbours. This patch was tested against upstream kernel and ofed_kernel.
Signed-off-by: Or Gerlitz Signed-off-by: Moni Shoua diff --git a/kernel_patches/fixes/ipoib_0120_check_grat_arp_with_cm.patch b/kernel_patches/fixes/ipoib_0120_check_grat_arp_with_cm.patch new file mode 100644 index 0000000..8b2c32e --- /dev/null +++ b/kernel_patches/fixes/ipoib_0120_check_grat_arp_with_cm.patch @@ -0,0 +1,34 @@ +Index: ofa_kernel-1.3/drivers/infiniband/ulp/ipoib/ipoib_main.c +=================================================================== +--- ofa_kernel-1.3.orig/drivers/infiniband/ulp/ipoib/ipoib_main.c 2008-01-29 08:55:33.000000000 -0500 ++++ ofa_kernel-1.3/drivers/infiniband/ulp/ipoib/ipoib_main.c 2008-01-29 09:17:30.000000000 -0500 +@@ -716,12 +716,7 @@ + + neigh = *to_ipoib_neigh(skb->dst->neighbour); + +- if (ipoib_cm_get(neigh)) { +- if (ipoib_cm_up(neigh)) { +- ipoib_cm_send(dev, skb, ipoib_cm_get(neigh)); +- goto out; +- } +- } else if (neigh->ah) { ++ if (neigh->ah) + if (unlikely((memcmp(&neigh->dgid.raw, + skb->dst->neighbour->ha + 4, + sizeof(union ib_gid))) || +@@ -742,9 +737,14 @@ + goto out; + } + ++ if (ipoib_cm_get(neigh)) { ++ if (ipoib_cm_up(neigh)) { ++ ipoib_cm_send(dev, skb, ipoib_cm_get(neigh)); ++ goto out; ++ } ++ } else if (neigh->ah) { + ipoib_send(dev, skb, neigh->ah, + IPOIB_QPN(skb->dst->neighbour->ha)); +- + goto out; + } + From bart.vanassche at gmail.com Tue Jan 29 02:51:34 2008 From: bart.vanassche at gmail.com (Bart Van Assche) Date: Tue, 29 Jan 2008 11:51:34 +0100 Subject: [ofa-general] Re: Distributing the SRP target source code In-Reply-To: <479EFB44.7050000@mellanox.com> References: <479EE1E6.1010101@mellanox.com> <479EFB44.7050000@mellanox.com> Message-ID: On Jan 29, 2008 11:09 AM, Vu Pham wrote: > Bart Van Assche wrote: > > Please remove drivers/infiniband/ulp/srpt/scsi_tgt.h and scst_const.h > > from the OFED distribution. It's better that the SRP target doesn't > > build if SCST was not yet installed instead of having to experience a > > kernel crash when OFED was built before SCST. 
> > It's clear from both ofed/srpt readme and Vlad's SCST bit > fat warning. You either build scst before ofed or > rebuild ofed After having installed SCST and OFED 1.3 on a system there will be two incompatible versions present on that system of SCST's header file scsi_tgt.h. This is confusing and questionable. Furthermore, the SRP target will only build correctly if /usr/local/include is in the include path before . (current directory). Relying on the order of directories in the include path is a very questionable practice too. Bart. From ogerlitz at voltaire.com Tue Jan 29 02:56:18 2008 From: ogerlitz at voltaire.com (Or Gerlitz) Date: Tue, 29 Jan 2008 12:56:18 +0200 (IST) Subject: [ofa-general] [PATCH] ib/fmr_pool: allocage page list only when caching enabled Message-ID: Allocate memory for the page_list field of struct ib_pool_fmr only when caching is enabled, since it is not used otherwise Signed-off-by: Or Gerlitz Index: linux-2.6.24-rc8/drivers/infiniband/core/fmr_pool.c =================================================================== --- linux-2.6.24-rc8.orig/drivers/infiniband/core/fmr_pool.c 2008-01-28 16:45:41.000000000 +0200 +++ linux-2.6.24-rc8/drivers/infiniband/core/fmr_pool.c 2008-01-29 09:17:56.000000000 +0200 @@ -308,10 +308,13 @@ struct ib_fmr_pool *ib_create_fmr_pool(s .max_maps = pool->max_remaps, .page_shift = params->page_shift }; + int bytes_per_fmr = sizeof *fmr; + + if (pool->cache_bucket) + bytes_per_fmr += params->max_pages_per_fmr * sizeof (u64); for (i = 0; i < params->pool_size; ++i) { - fmr = kmalloc(sizeof *fmr + params->max_pages_per_fmr * sizeof (u64), - GFP_KERNEL); + fmr = kmalloc(bytes_per_fmr, GFP_KERNEL); if (!fmr) { printk(KERN_WARNING PFX "failed to allocate fmr " "struct for FMR %d\n", i); From ogerlitz at voltaire.com Tue Jan 29 02:57:56 2008 From: ogerlitz at voltaire.com (Or Gerlitz) Date: Tue, 29 Jan 2008 12:57:56 +0200 (IST) Subject: [ofa-general] [PATCH] ib/ipoib: remove a misleading debug print Message-ID: 
commit 732a2170f499ce7cf5f0bdd4f9e0b0c8337b67e1 "IB/ipoib: Bound the net device to the ipoib_neigh structue" left a misleading (n->dev would be a bond device only if boding is used) debug print, clean it up. Signed-off-by: Or Gerlitz Index: linux-2.6.24-rc8/drivers/infiniband/ulp/ipoib/ipoib_main.c =================================================================== --- linux-2.6.24-rc8.orig/drivers/infiniband/ulp/ipoib/ipoib_main.c 2008-01-29 09:20:32.000000000 +0200 +++ linux-2.6.24-rc8/drivers/infiniband/ulp/ipoib/ipoib_main.c 2008-01-29 09:22:30.000000000 +0200 @@ -821,11 +821,9 @@ static void ipoib_neigh_cleanup(struct n struct ipoib_ah *ah = NULL; neigh = *to_ipoib_neigh(n); - if (neigh) { + if (neigh) priv = netdev_priv(neigh->dev); - ipoib_dbg(priv, "neigh_destructor for bonding device: %s\n", - n->dev->name); - } else + else return; ipoib_dbg(priv, "neigh_cleanup for %06x " IPOIB_GID_FMT "\n",
From tziporet at mellanox.co.il Tue Jan 29 04:21:40 2008 From: tziporet at mellanox.co.il (Tziporet Koren) Date: Tue, 29 Jan 2008 14:21:40 +0200 Subject: [ofa-general] OFED Jan 28 meeting summary on RC3 readiness Message-ID: <6C2C79E72C305246B504CBA17B5500C903368531@mtlexch01.mtl.com>

OFED Jan 28 meeting summary on RC3 readiness:
=====================================

1. OFED 1.3 readiness toward RC3 this week
   * RC3 is based on the official 2.6.24 release
   * RC3 is expected on Wed
   * RC4 is planned for Feb 13

2. All companies update:
   * IBM - ready for RC3
   * Voltaire - ready for RC3
   * Qlogic - ready for RC3; will work on bug 874
   * Intel - things look good. Need some uDAPL update from Arlin
   * Chelsio - ready for RC3
   * NetEffect - ready for RC3
   * Cisco - reported all issues in bugzilla
   * Mellanox - ready for RC3
   * MPI - all packages are ready

3. Request to change IPoIB to support CM without SRQ and 4K MTU
   Decided that we cannot insert such enhancements at this stage (RC3 built today) without delaying the release, since IPoIB is a critical ULP used by all customers.
   Since we do not want to delay the release and we wish to have a solution for the new IPoIB enhancements, we plan to have a 1.3.1 release.
   AIs:
   Tziporet: define the 1.3.1 release (scope of changes, schedule etc.)
   Vlad: open 1_3_1 branch so people will have a place to commit changes. We will not start any daily build before the 1.3 release.

4. Review high priority bugs:
   846  critical  jim at mellanox.com             SDP crash on RHEL5 ppc64 running netserver - will be debugged
   859  critical  monis at voltaire.com           Bonding configuration on Sles10 sp1 is not loaded consistently - fixed
   863  critical  monis at voltaire.com           ib-bonding won't compile for RHEL4 U6 - fixed
   874  critical  rjwalsh at pathscale.com        Intel MPI (IMB test) hangs intermittently on the qlogic HCA - will be debugged by Qlogic
   760  major     eli at mellanox.co.il           UDP performance on Rx is lower than Tx - for 1.3.1
   761  major     eli at mellanox.co.il           Poor and jittery UDP performance at small messages - for 1.3.1
   869  major     orenk at dev.mellanox.co.il     mstflint won't build on SLES10 x86 - fixed
   736  major     rolandd at cisco.com            IBV_WC_RETRY_EXC_ERR errors with local rdma_reads - seems a FW issue (Mellanox to debug)
   767  major     swise at opengridcomputing.com  Non backport Kernels that don't build in genalloc cause compile errors for cxgb3 - no fix (document)

Tziporet
-------------- next part -------------- An HTML attachment was scrubbed...
URL: From Zhen.Liang at Sun.COM Tue Jan 29 04:37:24 2008 From: Zhen.Liang at Sun.COM (Liang Zhen) Date: Tue, 29 Jan 2008 20:37:24 +0800 Subject: [ofa-general] page mask calculation in mthca mthca_reg_phys_mr Message-ID: <479F1E04.3000204@sun.com> Hi, I think there is a similar bug in mthca as in cxgb3 (http://lists.openfabrics.org/pipermail/general/2008-January/045246.html), in mthca_reg_phys_mr(), mask is built like this way: for (i = 0; i < num_phys_buf; ++i) { if (i != 0) mask |= buffer_list[i].addr; if (i != num_phys_buf - 1) mask |= buffer_list[i].addr + buffer_list[i].size; total_size += buffer_list[i].size; } It can get wrong mask as start address of the first fragment and end address of the last fragment are ignored, here is some example of this: INPUT: ipb[0].addr = 0x10002000; ipb[0].size = 0x4000; ipb[1].addr = 0x10006000; ipb[1].size = 0x1000; OUTPUT: shift 13 npages 3 page_list[0] = 10002000 page_list[1] = 10004000 page_list[2] = 10006000 2) -------------------------------- INPUT: ipb[0].addr = 0x10001000; ipb[0].size = 0x4000; ipb[1].addr = 0x10006000; ipb[1].size = 0x1000; OUTPUT: shift 12 npages 5 page_list[0] = 10001000 page_list[1] = 10002000 page_list[2] = 10003000 page_list[3] = 10004000 page_list[4] = 10006000 Regards Liang
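[Editor's sketch] The mask loop quoted in Liang Zhen's message can be replayed in userspace to see where the reported "shift" and "npages" figures come from. The fragment below is a minimal sketch, not the mthca driver source: `struct phys_buf`, `EX_PAGE_SHIFT`, `build_mask`, `pick_shift` and `count_pages` are hypothetical names, a 4 KiB base page is assumed, and the shift search and page count are an assumption about how the driver consumes the mask, chosen because they reproduce the figures in the message. Only the body of `build_mask` is taken from the quoted loop.

```c
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-in for one physical buffer (address plus length). */
struct phys_buf {
    uint64_t addr;
    uint64_t size;
};

#define EX_PAGE_SHIFT 12 /* assumption: 4 KiB base page size */

/* Same accumulation as the loop quoted from mthca_reg_phys_mr():
 * the first buffer's start and the last buffer's end are skipped. */
static uint64_t build_mask(const struct phys_buf *b, size_t n)
{
    uint64_t mask = 0;

    for (size_t i = 0; i < n; ++i) {
        if (i != 0)
            mask |= b[i].addr;
        if (i != n - 1)
            mask |= b[i].addr + b[i].size;
    }
    return mask;
}

/* Assumed multi-buffer shift search: the lowest bit at or above
 * EX_PAGE_SHIFT that is set in the mask bounds the usable page size. */
static int pick_shift(uint64_t mask)
{
    int shift;

    for (shift = EX_PAGE_SHIFT; shift < 31; ++shift)
        if ((UINT64_C(1) << shift) & mask)
            break;
    return shift;
}

/* Assumed page count: the first buffer is extended down to a page
 * boundary, then every buffer is rounded up to whole pages. */
static uint64_t count_pages(const struct phys_buf *b, size_t n, int shift)
{
    uint64_t page = UINT64_C(1) << shift;
    uint64_t npages = 0;

    for (size_t i = 0; i < n; ++i) {
        uint64_t size = b[i].size;

        if (i == 0)
            size += b[i].addr & (page - 1);
        npages += (size + page - 1) >> shift;
    }
    return npages;
}
```

Fed the two inputs from the message, this model gives a mask of 0x10006000 for the first case (shift 13, 3 pages) and 0x10007000 for the second (shift 12, 5 pages), matching the OUTPUT lines quoted above. Whether ignoring the first start address and the last end address is actually a defect is the question the message raises; the sketch does not settle it.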
From eli at dev.mellanox.co.il Tue Jan 29 07:13:36 2008 From: eli at dev.mellanox.co.il (Eli Cohen) Date: Tue, 29 Jan 2008 17:13:36 +0200 Subject: [ofa-general] [PATCH] IB/IPoIB Check if grat. ARP changed had arrived when working in connected mode In-Reply-To: <479F0390.8020102@voltaire.com> References: <479F0390.8020102@voltaire.com> Message-ID: <1201619616.7074.3.camel@mtls03> Now you may call ipoib_put_ah(neigh->ah) for a CM neighbor and this could cause de-reference of a NULL pointer. On Tue, 2008-01-29 at 12:44 +0200, Moni Shoua wrote: > move a little up the code that checks for a situation where the remote GID stored in the ipoib_neigh is > different than the one present in the neighbour (handle Gratuitous ARP) or that a bonding fail over has > happened but the neighbour still has a pointer to an ipoib_neigh created not by the current slave. This > will cause the driver to apply the check also for connected mode neighbours. > This patch was tested against upstream kernel and ofed_kernel.
> > Signed-off-by: Or Gerlitz > Signed-off-by: Moni Shoua > > diff --git a/kernel_patches/fixes/ipoib_0120_check_grat_arp_with_cm.patch b/kernel_patches/fixes/ipoib_0120_check_grat_arp_with_cm.patch > new file mode 100644 > index 0000000..8b2c32e > --- /dev/null > +++ b/kernel_patches/fixes/ipoib_0120_check_grat_arp_with_cm.patch > @@ -0,0 +1,34 @@ > +Index: ofa_kernel-1.3/drivers/infiniband/ulp/ipoib/ipoib_main.c > +=================================================================== > +--- ofa_kernel-1.3.orig/drivers/infiniband/ulp/ipoib/ipoib_main.c 2008-01-29 08:55:33.000000000 -0500 > ++++ ofa_kernel-1.3/drivers/infiniband/ulp/ipoib/ipoib_main.c 2008-01-29 09:17:30.000000000 -0500 > +@@ -716,12 +716,7 @@ > + > + neigh = *to_ipoib_neigh(skb->dst->neighbour); > + > +- if (ipoib_cm_get(neigh)) { > +- if (ipoib_cm_up(neigh)) { > +- ipoib_cm_send(dev, skb, ipoib_cm_get(neigh)); > +- goto out; > +- } > +- } else if (neigh->ah) { > ++ if (neigh->ah) > + if (unlikely((memcmp(&neigh->dgid.raw, > + skb->dst->neighbour->ha + 4, > + sizeof(union ib_gid))) || > +@@ -742,9 +737,14 @@ > + goto out; > + } > + > ++ if (ipoib_cm_get(neigh)) { > ++ if (ipoib_cm_up(neigh)) { > ++ ipoib_cm_send(dev, skb, ipoib_cm_get(neigh)); > ++ goto out; > ++ } > ++ } else if (neigh->ah) { > + ipoib_send(dev, skb, neigh->ah, > + IPOIB_QPN(skb->dst->neighbour->ha)); > +- > + goto out; > + } > + > > _______________________________________________ > general mailing list > general at lists.openfabrics.org > http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general > > To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general From civilize9 at busa.de Tue Jan 29 05:39:55 2008 From: civilize9 at busa.de (Marva Gipson) Date: Tue, 29 Jan 2008 16:39:55 +0300 Subject: [ofa-general] feel like a major player with our new EDSET Message-ID: <993900440.96446017561461@busa.de> It’s OK to buy V gu ia au gr sly a from In sat te oy rnet ph sxg ar gtr ma yr ci sb es Our onli ks ne ph dbl ar 
From ogerlitz at voltaire.com Tue Jan 29 05:52:31 2008 From: ogerlitz at voltaire.com (Or Gerlitz) Date: Tue, 29 Jan 2008 15:52:31 +0200 Subject: [ofa-general] Re: [PATCH] IB/IPoIB Check if grat. ARP changed had arrived when working in connected mode In-Reply-To: <1201619616.7074.3.camel@mtls03> References: <479F0390.8020102@voltaire.com> <1201619616.7074.3.camel@mtls03> Message-ID: <479F2F9F.8040200@voltaire.com> Eli Cohen wrote: > Now you may call ipoib_put_ah(neigh->ah) for a CM neighbor and this > could cause de-reference of a NULL pointer. Hi Eli, thanks for looking on this, however, I don't follow on your comment, do you say that for connected mode neighbours with this patch ipoib_put_ah would be called twice? why? Or. From eli at dev.mellanox.co.il Tue Jan 29 07:57:19 2008 From: eli at dev.mellanox.co.il (Eli Cohen) Date: Tue, 29 Jan 2008 17:57:19 +0200 Subject: [ofa-general] Re: [PATCH] IB/IPoIB Check if grat. ARP changed had arrived when working in connected mode In-Reply-To: <479F2F9F.8040200@voltaire.com> References: <479F0390.8020102@voltaire.com> <1201619616.7074.3.camel@mtls03> <479F2F9F.8040200@voltaire.com> Message-ID: <1201622239.7074.8.camel@mtls03> On Tue, 2008-01-29 at 15:52 +0200, Or Gerlitz wrote: > Eli Cohen wrote: > > Now you may call ipoib_put_ah(neigh->ah) for a CM neighbor and this > > could cause de-reference of a NULL pointer.
> > Hi Eli, > > thanks for looking on this, however, I don't follow on your comment, do > you say that for connected mode neighbours with this patch ipoib_put_ah > would be called twice? why? I am not saying it we be called twice. I am saying that with this change, it might be called for CM neighbors which I think should not happen at all. From monisonlists at gmail.com Tue Jan 29 06:17:35 2008 From: monisonlists at gmail.com (Moni Shoua) Date: Tue, 29 Jan 2008 16:17:35 +0200 Subject: [ewg] Re: [ofa-general] [PATCH] IB/IPoIB Check if grat. ARP changed had arrived when working in connected mode In-Reply-To: <1201619616.7074.3.camel@mtls03> References: <479F0390.8020102@voltaire.com> <1201619616.7074.3.camel@mtls03> Message-ID: <479F357F.5070808@gmail.com> Eli Cohen wrote: > Now you may call ipoib_put_ah(neigh->ah) for a CM neighbor and this > could cause de-reference of a NULL pointer. > If I understand you right, I don't see how this can happen. The code block that calls ipoib_put_ah(neigh->ah) starts with if (neigh->ah)... Am I right? From eli at dev.mellanox.co.il Tue Jan 29 06:27:11 2008 From: eli at dev.mellanox.co.il (Eli Cohen) Date: Tue, 29 Jan 2008 16:27:11 +0200 Subject: [ewg] Re: [ofa-general] [PATCH] IB/IPoIB Check if grat.
ARP changed had arrived when working in connected mode In-Reply-To: <479F357F.5070808@gmail.com> References: <479F0390.8020102@voltaire.com> <1201619616.7074.3.camel@mtls03> <479F357F.5070808@gmail.com> Message-ID: <1201616831.28794.1.camel@mtls03> On Tue, 2008-01-29 at 16:17 +0200, Moni Shoua wrote: > Eli Cohen wrote: > > Now you may call ipoib_put_ah(neigh->ah) for a CM neighbor and this > > could cause de-reference of a NULL pointer. > > > If I understand you right, I don't see how this can happen. > The code block that calls ipoib_put_ah(neigh->ah) starts with if (neigh->ah)... > > Am I right? > Oh I see. I missed that. From monisonlists at gmail.com Tue Jan 29 06:33:17 2008 From: monisonlists at gmail.com (Moni Shoua) Date: Tue, 29 Jan 2008 16:33:17 +0200 Subject: [ofa-general] Re: [ewg] [PATCH] IB/IPoIB Check if grat. ARP changed had arrived when working in connected mode In-Reply-To: <479F0390.8020102@voltaire.com> References: <479F0390.8020102@voltaire.com> Message-ID: <479F392D.8040109@gmail.com> Moni Shoua wrote: > move a little up the code that checks for a situation where the remote GID stored in the ipoib_neigh is > different than the one present in the neighbour (handle Gratuitous ARP) or that a bonding fail over has > happened but the neighbour still has a pointer to an ipoib_neigh created not by the current slave. This > will cause the driver to apply the check also for connected mode neighbours. > This patch was tested against upstream kernel and ofed_kernel. > > Signed-off-by: Or Gerlitz > Signed-off-by: Moni Shoua > This patch resolves a critical bug: https://bugs.openfabrics.org/show_bug.cgi?id=878 From jlentini at netapp.com Tue Jan 29 06:53:46 2008 From: jlentini at netapp.com (James Lentini) Date: Tue, 29 Jan 2008 09:53:46 -0500 (EST) Subject: [nfs-rdma-devel] [ofa-general] Status of NFS-RDMA ? 
In-Reply-To: <20080129003731.GA30262@cefeid.wcss.wroc.pl> References: <20080123194728.GA10437@cefeid.wcss.wroc.pl> <4797AD59.2000206@mellanox.co.il> <20080126193035.GA21209@cefeid.wcss.wroc.pl> <20080129003731.GA30262@cefeid.wcss.wroc.pl> Message-ID: On Tue, 29 Jan 2008, Pawel Dziekonski wrote: > > On Mon, 28 Jan 2008 at 10:14:22AM -0500, James Lentini wrote: > > > > > > On Sat, 26 Jan 2008, Pawel Dziekonski wrote: > > > > > I pulled Tom's tree from new url and build a kernel. > > > > If you enabled support for INFINIBAND drivers (IB and iWARP support) > > and NFS client/server support, the kernel should be ready to go (run > > "grep RDMA /your_kernel_sources/.config" to confirm that > > CONFIG_SUNRPC_XPRT_RDMA is either m or y). > > > > NFS/RDMA doesn't require OFED be installed. OFED is a release of the > > Linux kernel sources and some userspace libraries/tools. If you are > > > > then I downloaded OFED from > > > http://www.mellanox.com/downloads/NFSoRDMA/OFED-1.2-NFS-RDMA.gz, > > > > I don't know what the above URL contains. The latest code is in Tom > > Tucker's tree (and now NFS server maintainer Bruce Fields tree). It is > > > hi, > > back to subject on a proper mailing list. > > I have a >3 year experience with mellanox hardware and IBGold so I > basically know what OFED is all about. up to now i was only using > IBGold since IB drivers appeared in kernel pretty recently. You'll want to use the mainline kernel's IB drivers for NFS/RDMA. We've been developing the NFS/RDMA software on the OpenFabrics (aka OpenIB) code since it was merged into 2.6.10 in Dec 2004. > currently I have new hardware. I'm running Tom's kernel and already > did some MPI tests. SDP is not working, probably because sdp kernel > modules where not build. ;) I understand that those modules are only > available from ofa-kernel. please correct me if i'm wrong. Correct. SDP has never been submitted to mainline Linux. 
> system is Scientic Linux 4.5, which is supposed to be a fully > compatible RH4 clone. hardware is Supermicro mobos with Mellanox > MT25204 and Flextronisc switch. > > error log from ofa-kernel build: Is your goal to build a kernel with an NFS/RDMA server? If so, the kernel sources from Tom Tucker's git tree are the ones you want, not the old OFED 1.2-based packages which are out of date. Did you try setting up the NFS/RDMA server on the kernel used for your MPI tests above? > > > make[1]: Entering directory `/usr/src/ib/xprt-switch-2.6' > > > test -e include/linux/autoconf.h -a -e include/config/auto.conf || ( \ > > > echo; \ > > > echo " ERROR: Kernel configuration is invalid."; \ > > > echo " include/linux/autoconf.h or include/config/auto.conf are missing."; \ > > > echo " Run 'make oldconfig && make prepare' on kernel src to fix it."; \ > > > echo; \ > > > /bin/false) > > > > > > obviously, doing 'make oldconfig && make prepare' does not help. > > > anyway, above mentioned files do exist: > > > > > > # ls -la /usr/src/ib/xprt-switch-2.6/{include/linux/autoconf.h,include/config/auto.conf} > > > -rw-r--r-- 1 root root 10156 Jan 25 17:42 /usr/src/ib/xprt-switch-2.6/include/config/auto.conf > > > -rw-r--r-- 1 root root 14733 Jan 25 17:42 /usr/src/ib/xprt-switch-2.6/include/linux/autoconf.h > > > > > > despite of above, compilation continues but fails with: > > > > > > gcc -Wp,-MD,/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2/drivers/infiniband/core/.mad.o.d -nostdinc -isystem /usr/lib/gcc/x86_64-redhat-linux/3.4.6/include -D__KERNEL__ -I/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2/include -I/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2/drivers/infiniband/include -Iinclude -include include/linux/autoconf.h -include /var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2/include/linux/autoconf.h -Wall -Wundef -Wstrict-prototypes -Wno-trigraphs -fno-strict-aliasing -fno-common -Werror-implicit-function-declaration -Os -m64 -mno-red-zone -mcmodel=kernel -pipe -Wno-sign-compare 
-fno-asynchronous-unwind-tables -funit-at-a-time -mno-sse -mno-mmx -mno-sse2 -mno-3dnow -maccumulate-outgoing-args -DCONFIG_AS_CFI=1 -DCONFIG_AS_CFI_SIGNAL_FRAME=1 -fomit-frame-pointer -Wdeclaration-after-statement -DMODULE -D"KBUILD_STR(s)=#s" -D"KBUILD_BASENAME=KBUILD_STR(mad)" -D"KBUILD_MODNAME=KBUILD_STR(ib_mad)" -c -o /var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2/drivers/infiniband/core/.! tmp > _mad.o /var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2/drivers/infiniband/core/mad.c > > > /var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2/drivers/infiniband/core/mad.c: In function `ib_mad_init_module': > > > /var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2/drivers/infiniband/core/mad.c:2966: error: too many arguments to function `kmem_cache_create' > > > make[4]: *** [/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2/drivers/infiniband/core/mad.o] Error 1 > > > make[3]: *** [/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2/drivers/infiniband/core] Error 2 > > > make[2]: *** [/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2/drivers/infiniband] Error 2 > > > make[1]: *** [_module_/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2] Error 2 > > > make[1]: Leaving directory `/usr/src/ib/xprt-switch-2.6' > > > make: *** [kernel] Error 2 > > > error: Bad exit status from /var/tmp/rpm-tmp.3877 (%install) > > > > full log: > > > https://cefeid.wcss.wroc.pl/d/tmp/OFED.build.32122.log > > thanks in advance for any help, P > > > -- > Pawel Dziekonski > Wroclaw Centre for Networking & Supercomputing, HPC Department > Politechnika Wr., pl. Grunwaldzki 9, bud. D2/101, 50-377 Wroclaw, POLAND > phone: +48 71 3202043, fax: +48 71 3225797, http://www.wcss.wroc.pl > > ------------------------------------------------------------------------- > This SF.net email is sponsored by: Microsoft > Defy all challenges. Microsoft(R) Visual Studio 2008. 
> http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/ > _______________________________________________ > nfs-rdma-devel mailing list > nfs-rdma-devel at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/nfs-rdma-devel > From eli at dev.mellanox.co.il Tue Jan 29 07:15:52 2008 From: eli at dev.mellanox.co.il (Eli Cohen) Date: Tue, 29 Jan 2008 17:15:52 +0200 Subject: [ofa-general] Re: [PATCH 4/16 v2] IB/ipoib: Add checksum offload support for ipoib In-Reply-To: <479C7E5B.8040104@voltaire.com> References: <1201193831.6755.63.camel@mtls03> <479C7E5B.8040104@voltaire.com> Message-ID: <1201619752.28794.16.camel@mtls03> On Sun, 2008-01-27 at 14:51 +0200, Or Gerlitz wrote: > Eli Cohen wrote: > > --- a/drivers/infiniband/ulp/ipoib/ipoib_cm.c > > +++ b/drivers/infiniband/ulp/ipoib/ipoib_cm.c > > @@ -1234,6 +1234,11 @@ static ssize_t set_mode(struct device *d, struct device_attribute *attr, > > set_bit(IPOIB_FLAG_ADMIN_CM, &priv->flags); > > ipoib_warn(priv, "enabling connected mode " > > "will cause multicast packet drops\n"); > > + > > + dev->features &= ~NETIF_F_IP_CSUM; > > if adding NETIF_F_IP_CSUM brings in NETIF_F_SG, why not ANDing here with > ~NETIF_F_SG as well? > You're right - I will fix that. 
> > + > > + priv->tx_wr.send_flags &= ~IB_SEND_IP_CSUM; > > _______________________________________________ > general mailing list > general at lists.openfabrics.org > http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general > > To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general From tziporet at mellanox.co.il Tue Jan 29 07:26:18 2008 From: tziporet at mellanox.co.il (Tziporet Koren) Date: Tue, 29 Jan 2008 17:26:18 +0200 Subject: [ofa-general] OFED 1.2.5.5 is ready on the ofa server Message-ID: <6C2C79E72C305246B504CBA17B5500C90336872F@mtlexch01.mtl.com> OFED-1.2.5.5 is ready: http://www.openfabrics.org/downloads/OFED/ofed-1.2.5/OFED-1.2.5.5.tgz Supported Platforms and Operating Systems ========================================= o CPU architectures: - x86_64 - x86 - ppc64 - ia64 o Linux Operating Systems: - RedHat EL4 up3: 2.6.9-34.ELsmp - RedHat EL4 up4: 2.6.9-42.ELsmp - RedHat EL4 up5: 2.6.9-55.ELsmp - RedHat EL4 up6: 2.6.9-67.ELsmp * - RedHat EL5: 2.6.18-8.el5 - RedHat EL5 up1: 2.6.18-53.el5 - Fedora C6: 2.6.18-8.fc6 * - SLES9 SP3: 2.6.5-7.244-smp * - SLES9 SP4: 2.6.5-7.305-smp * - SLES10: 2.6.16.21-0.8-smp - SLES10 SP1: 2.6.16.46-0.12-smp - SLES10 SP1 up1: 2.6.16.46-0.12-smp - SUSE Pro 10.0: 2.6.13-16-smp * - kernel.org: 2.6.20.x and 2.6.22.x * OSes that are partially tested Fixed Bugs and Enhancements Since OFED 1.2.5.4 ============================================== - OSes support: - added support for RHEL5 up1 - Added support for RHEL4 up6 - Added support for SLES9 SP4 on PPC64 - Low level drivers update: - cxgb3: - Flush the receive queue when closing - Fix page shift calculation in build_phys_page_list() - Mark QP as privileged based on user capabilities - Fix the T3A workaround checks - Pull in latest fixes. - mlx4: - For write-combining copies, a tight "for" loop instead of memcpy. - Fix the value of the pkey_index in the completion to get a valid value for GSI QPs. 
- Changed mlx4 driver default to be MSI-X - IPoIB: - Fix issue when RC QP is closed (due to RNR NAK) - CMA: - Bug fix on connection tear-down (enables RDS to download when removing low level driver) - RDS: - Fixed a bug of uninitialized variable in RDS-tools. - OpenSM: - Fixed coredump that might occur when switch ports are disconnecting and reconnecting quickly. Fixed Bugs and Enhancements OFED 1.2.5: ======================================= - OSes support: - Added support for SLES9 SP4 - Low level drivers update: - cxgb3: Pull in latest fixes. - ipath: Pull in latest fixes. - RDS: - Performance enhancements - Relax the header consistency check on fragment reassembly - GA for Oracle 11 - IPoIB: - Use NAPI by default - For small received packets, allocate a new, smaller SKB to relief accounting on the socket. - mlx4: - Enable changing default max HCA resource limits using module options. - Support opening of more resources then the default by increasing command timeout for INIT_HCA to 10 seconds - PPC64 support: - Fixed compilation problems on SLES10 SP1 Tziporet & Vlad -------------- next part -------------- An HTML attachment was scrubbed... URL: From eli at dev.mellanox.co.il Tue Jan 29 08:17:35 2008 From: eli at dev.mellanox.co.il (Eli Cohen) Date: Tue, 29 Jan 2008 18:17:35 +0200 Subject: [ofa-general] [PATCH 14/16 v3] IB/ipoib: Support modifying IPOIB CQ moderation params Message-ID: <1201623455.28794.27.camel@mtls03> Support modifying IPOIB CQ moderation params This can be used to tune at run time the paramters controlling the event (interrupt) generation rate and thus reduce the overhead incurred by handling interrupts resulting in better throughput. Since IPOIB uses a single CQ for both rx and tx, rx is chosen to dictate configuration for both rx and tx. Signed-off-by: Eli Cohen --- changes: Fix documentation remove test on tx params in ipoib_set_coalesce(). 
drivers/infiniband/ulp/ipoib/ipoib.h | 6 ++++ drivers/infiniband/ulp/ipoib/ipoib_etool.c | 46 ++++++++++++++++++++++++++++ 2 files changed, 52 insertions(+), 0 deletions(-) diff --git a/drivers/infiniband/ulp/ipoib/ipoib.h b/drivers/infiniband/ulp/ipoib/ipoib.h index ee7807f..3e8dceb 100644 --- a/drivers/infiniband/ulp/ipoib/ipoib.h +++ b/drivers/infiniband/ulp/ipoib/ipoib.h @@ -309,6 +309,11 @@ struct ipoib_cm_dev_priv { int num_frags; }; +struct ipoib_ethtool_st { + u16 coalesce_usecs; + u16 max_coalesced_frames; +}; + /* * Device private locking: tx_lock protects members used in TX fast * path (and we use LLTX so upper layers don't do extra locking). @@ -386,6 +391,7 @@ struct ipoib_dev_priv { struct dentry *mcg_dentry; struct dentry *path_dentry; #endif + struct ipoib_ethtool_st etool; }; struct ipoib_ah { diff --git a/drivers/infiniband/ulp/ipoib/ipoib_etool.c b/drivers/infiniband/ulp/ipoib/ipoib_etool.c index 913aea0..a3ac4cf 100644 --- a/drivers/infiniband/ulp/ipoib/ipoib_etool.c +++ b/drivers/infiniband/ulp/ipoib/ipoib_etool.c @@ -44,9 +44,55 @@ static void ipoib_get_drvinfo(struct net_device *netdev, strncpy(drvinfo->driver, "ipoib", sizeof(drvinfo->driver) - 1); } +static int ipoib_get_coalesce(struct net_device *dev, + struct ethtool_coalesce *coal) +{ + struct ipoib_dev_priv *priv = netdev_priv(dev); + + coal->rx_coalesce_usecs = priv->etool.coalesce_usecs; + coal->tx_coalesce_usecs = priv->etool.coalesce_usecs; + coal->rx_max_coalesced_frames = priv->etool.max_coalesced_frames; + coal->tx_max_coalesced_frames = priv->etool.max_coalesced_frames; + + return 0; +} + +static int ipoib_set_coalesce(struct net_device *dev, + struct ethtool_coalesce *coal) +{ + struct ipoib_dev_priv *priv = netdev_priv(dev); + int ret; + + /* + * since ipoib uses a single CQ for both rx and tx, + * we assume that rx params dictate the configuration. 
+ * These values are saved in the private data and returned + * when ipoib_get_coalesce is called + */ + if (coal->rx_coalesce_usecs > 0xffff || + coal->rx_max_coalesced_frames > 0xffff) + return -EINVAL; + + ret = ib_modify_cq(priv->cq, coal->rx_max_coalesced_frames, + coal->rx_coalesce_usecs); + if (ret) { + ipoib_dbg(priv, "failed modifying CQ\n"); + return ret; + } + + coal->tx_coalesce_usecs = coal->rx_coalesce_usecs; + priv->etool.coalesce_usecs = coal->rx_coalesce_usecs; + coal->tx_max_coalesced_frames = coal->rx_max_coalesced_frames; + priv->etool.max_coalesced_frames = coal->rx_max_coalesced_frames; + + return 0; +} + static const struct ethtool_ops ipoib_ethtool_ops = { .get_drvinfo = ipoib_get_drvinfo, .get_tso = ethtool_op_get_tso, + .get_coalesce = ipoib_get_coalesce, + .set_coalesce = ipoib_set_coalesce, }; void ipoib_set_ethtool_ops(struct net_device *dev) -- 1.5.3.8 From eli at dev.mellanox.co.il Tue Jan 29 08:17:40 2008 From: eli at dev.mellanox.co.il (Eli Cohen) Date: Tue, 29 Jan 2008 18:17:40 +0200 Subject: [ofa-general] [PATCH 4/16 v3] IB/ipoib: Add checksum offload support for ipoib Message-ID: <1201623460.28794.28.camel@mtls03> Add checksum offload support for ipoib Signed-off-by: Eli Cohen --- changes: clear NETIF_F_SG when change to CM mode remove the check in UD receive flow which ensures there are no IP options. The semantics of the CQE is modified to ensure that and the HW (mlx4) too. 
drivers/infiniband/ulp/ipoib/ipoib.h | 1 + drivers/infiniband/ulp/ipoib/ipoib_cm.c | 7 +++++++ drivers/infiniband/ulp/ipoib/ipoib_ib.c | 12 ++++++++++++ drivers/infiniband/ulp/ipoib/ipoib_main.c | 15 +++++++++++++++ 4 files changed, 35 insertions(+), 0 deletions(-) diff --git a/drivers/infiniband/ulp/ipoib/ipoib.h b/drivers/infiniband/ulp/ipoib/ipoib.h index 7c9edc6..d13e481 100644 --- a/drivers/infiniband/ulp/ipoib/ipoib.h +++ b/drivers/infiniband/ulp/ipoib/ipoib.h @@ -87,6 +87,7 @@ enum { IPOIB_MCAST_STARTED = 8, IPOIB_FLAG_ADMIN_CM = 9, IPOIB_FLAG_UMCAST = 10, + IPOIB_FLAG_CSUM = 11, IPOIB_MAX_BACKOFF_SECONDS = 16, diff --git a/drivers/infiniband/ulp/ipoib/ipoib_cm.c b/drivers/infiniband/ulp/ipoib/ipoib_cm.c index 7dd2ec4..e94ec0a 100644 --- a/drivers/infiniband/ulp/ipoib/ipoib_cm.c +++ b/drivers/infiniband/ulp/ipoib/ipoib_cm.c @@ -1378,6 +1378,9 @@ static ssize_t set_mode(struct device *d, struct device_attribute *attr, set_bit(IPOIB_FLAG_ADMIN_CM, &priv->flags); ipoib_warn(priv, "enabling connected mode " "will cause multicast packet drops\n"); + + dev->features &= ~(NETIF_F_IP_CSUM | NETIF_F_SG); + ipoib_flush_paths(dev); return count; } @@ -1386,6 +1389,10 @@ static ssize_t set_mode(struct device *d, struct device_attribute *attr, clear_bit(IPOIB_FLAG_ADMIN_CM, &priv->flags); dev->mtu = min(priv->mcast_mtu, dev->mtu); ipoib_flush_paths(dev); + + if (priv->ca->flags & IB_DEVICE_IP_CSUM) + dev->features |= NETIF_F_IP_CSUM | NETIF_F_SG; + return count; } diff --git a/drivers/infiniband/ulp/ipoib/ipoib_ib.c b/drivers/infiniband/ulp/ipoib/ipoib_ib.c index 680c27f..0f616f6 100644 --- a/drivers/infiniband/ulp/ipoib/ipoib_ib.c +++ b/drivers/infiniband/ulp/ipoib/ipoib_ib.c @@ -37,6 +37,7 @@ #include #include +#include #include @@ -231,6 +232,11 @@ static void ipoib_ib_handle_rx_wc(struct net_device *dev, struct ib_wc *wc) skb->dev = dev; /* XXX get correct PACKET_ type here */ skb->pkt_type = PACKET_HOST; + + /* check rx csum */ + if (test_bit(IPOIB_FLAG_CSUM, 
&priv->flags) && likely(wc->csum_ok)) + skb->ip_summed = CHECKSUM_UNNECESSARY; + netif_receive_skb(skb); repost: @@ -394,6 +400,12 @@ void ipoib_send(struct net_device *dev, struct sk_buff *skb, return; } + if (dev->flags & NETIF_F_IP_CSUM && + skb->ip_summed == CHECKSUM_PARTIAL) + priv->tx_wr.send_flags |= IB_SEND_IP_CSUM; + else + priv->tx_wr.send_flags &= ~IB_SEND_IP_CSUM; + if (unlikely(post_send(priv, priv->tx_head & (ipoib_sendq_size - 1), address->ah, qpn, tx_req->mapping, skb_headlen(skb), diff --git a/drivers/infiniband/ulp/ipoib/ipoib_main.c b/drivers/infiniband/ulp/ipoib/ipoib_main.c index 8dda67e..83f8b85 100644 --- a/drivers/infiniband/ulp/ipoib/ipoib_main.c +++ b/drivers/infiniband/ulp/ipoib/ipoib_main.c @@ -1099,6 +1099,20 @@ int ipoib_add_pkey_attr(struct net_device *dev) return device_create_file(&dev->dev, &dev_attr_pkey); } +static void set_csum(struct net_device *dev, struct ib_device *hca) +{ + struct ipoib_dev_priv *priv = netdev_priv(dev); + + if (test_bit(IPOIB_FLAG_ADMIN_CM, &priv->flags)) + return; + + if (!(hca->flags & IB_DEVICE_IP_CSUM)) + return; + + dev->features |= NETIF_F_SG | NETIF_F_IP_CSUM; + set_bit(IPOIB_FLAG_CSUM, &priv->flags); +} + static struct net_device *ipoib_add_port(const char *format, struct ib_device *hca, u8 port) { @@ -1137,6 +1151,7 @@ static struct net_device *ipoib_add_port(const char *format, } else memcpy(priv->dev->dev_addr + 4, priv->local_gid.raw, sizeof (union ib_gid)); + set_csum(priv->dev, hca); result = ipoib_dev_init(priv->dev, hca, port); if (result < 0) { -- 1.5.3.8 From eli at dev.mellanox.co.il Tue Jan 29 08:17:43 2008 From: eli at dev.mellanox.co.il (Eli Cohen) Date: Tue, 29 Jan 2008 18:17:43 +0200 Subject: [ofa-general] [PATCH 9/16 v3] IB/ipoib: Add LSO support to ipoib Message-ID: <1201623463.28794.29.camel@mtls03> Add LSO support to ipoib Signed-off-by: Eli Cohen --- changes: create flags require TSO only if device supports that. 
drivers/infiniband/ulp/ipoib/ipoib.h | 54 ++++++++++++------- drivers/infiniband/ulp/ipoib/ipoib_cm.c | 7 ++- drivers/infiniband/ulp/ipoib/ipoib_ib.c | 80 +++++++++++++++++++++------ drivers/infiniband/ulp/ipoib/ipoib_main.c | 8 +++- drivers/infiniband/ulp/ipoib/ipoib_verbs.c | 5 ++- 5 files changed, 113 insertions(+), 41 deletions(-) diff --git a/drivers/infiniband/ulp/ipoib/ipoib.h b/drivers/infiniband/ulp/ipoib/ipoib.h index d13e481..70f8b5c 100644 --- a/drivers/infiniband/ulp/ipoib/ipoib.h +++ b/drivers/infiniband/ulp/ipoib/ipoib.h @@ -152,31 +152,40 @@ static inline int ipoib_dma_map_tx(struct ib_device *ca, { struct sk_buff *skb = tx_req->skb; u64 *mapping = tx_req->mapping; - int frags; int i; - - mapping[0] = ib_dma_map_single(ca, skb->data, skb_headlen(skb), - DMA_TO_DEVICE); - if (unlikely(ib_dma_mapping_error(ca, mapping[0]))) - return -EIO; - - frags = skb_shinfo(skb)->nr_frags; - for (i = 0; i < frags; ++i) { + int nfrags; + int off; + + if (skb_headlen(skb)) { + mapping[0] = ib_dma_map_single(ca, skb->data, skb_headlen(skb), + DMA_TO_DEVICE); + if (unlikely(ib_dma_mapping_error(ca, mapping[0]))) + return -EIO; + off = 1; + } else + off = 0; + + nfrags = skb_shinfo(skb)->nr_frags; + for (i = 0; i < nfrags; ++i) { skb_frag_t *frag = &skb_shinfo(skb)->frags[i]; - mapping[i + 1] = ib_dma_map_page(ca, frag->page, - frag->page_offset, frag->size, - DMA_TO_DEVICE); - if (unlikely(ib_dma_mapping_error(ca, mapping[i + 1]))) + mapping[i + off] = ib_dma_map_page(ca, frag->page, frag->page_offset, + frag->size, DMA_TO_DEVICE); + if (unlikely(ib_dma_mapping_error(ca, mapping[i + off]))) goto partial_error; } return 0; partial_error: - ib_dma_unmap_single(ca, mapping[0], skb_headlen(skb), DMA_TO_DEVICE); + if (skb_headlen(skb)) { + ib_dma_unmap_single(ca, mapping[0], skb_headlen(skb), DMA_TO_DEVICE); + off = 0; + } else + off = 1; for (; i > 0; --i) { skb_frag_t *frag = &skb_shinfo(skb)->frags[i - 1]; - ib_dma_unmap_page(ca, mapping[i], frag->size, DMA_TO_DEVICE); 
+ ib_dma_unmap_page(ca, mapping[i - off], frag->size, + DMA_TO_DEVICE); } return -EIO; } @@ -186,15 +195,20 @@ static inline void ipoib_dma_unmap_tx(struct ib_device *ca, { struct sk_buff *skb = tx_req->skb; u64 *mapping = tx_req->mapping; - int frags; int i; + int nfrags; + int off; - ib_dma_unmap_single(ca, mapping[0], skb_headlen(skb), DMA_TO_DEVICE); + if (skb_headlen(skb)) { + ib_dma_unmap_single(ca, mapping[0], skb_headlen(skb), DMA_TO_DEVICE); + off = 1; + } else + off = 0; - frags = skb_shinfo(skb)->nr_frags; - for (i = 0; i < frags; ++i) { + nfrags = skb_shinfo(skb)->nr_frags; + for (i = 0; i < nfrags; ++i) { skb_frag_t *frag = &skb_shinfo(skb)->frags[i]; - ib_dma_unmap_page(ca, mapping[i + 1], frag->size, + ib_dma_unmap_page(ca, mapping[i + off], frag->size, DMA_TO_DEVICE); } } diff --git a/drivers/infiniband/ulp/ipoib/ipoib_cm.c b/drivers/infiniband/ulp/ipoib/ipoib_cm.c index e94ec0a..4f5604d 100644 --- a/drivers/infiniband/ulp/ipoib/ipoib_cm.c +++ b/drivers/infiniband/ulp/ipoib/ipoib_cm.c @@ -1379,7 +1379,7 @@ static ssize_t set_mode(struct device *d, struct device_attribute *attr, ipoib_warn(priv, "enabling connected mode " "will cause multicast packet drops\n"); - dev->features &= ~(NETIF_F_IP_CSUM | NETIF_F_SG); + dev->features &= ~(NETIF_F_IP_CSUM | NETIF_F_SG | NETIF_F_TSO); ipoib_flush_paths(dev); return count; @@ -1393,6 +1393,11 @@ static ssize_t set_mode(struct device *d, struct device_attribute *attr, if (priv->ca->flags & IB_DEVICE_IP_CSUM) dev->features |= NETIF_F_IP_CSUM | NETIF_F_SG; + + if (priv->dev->features & NETIF_F_SG && + priv->ca->flags & IB_DEVICE_TCP_TSO) + priv->dev->features |= NETIF_F_TSO; + return count; } diff --git a/drivers/infiniband/ulp/ipoib/ipoib_ib.c b/drivers/infiniband/ulp/ipoib/ipoib_ib.c index 0f616f6..c3af51b 100644 --- a/drivers/infiniband/ulp/ipoib/ipoib_ib.c +++ b/drivers/infiniband/ulp/ipoib/ipoib_ib.c @@ -38,6 +38,7 @@ #include #include #include +#include #include @@ -346,24 +347,40 @@ void 
ipoib_ib_completion(struct ib_cq *cq, void *dev_ptr) static inline int post_send(struct ipoib_dev_priv *priv, unsigned int wr_id, struct ib_ah *address, u32 qpn, - u64 *mapping, int headlen, - skb_frag_t *frags, - int nr_frags) + struct ipoib_tx_buf *tx_req, + void *head, int hlen) { struct ib_send_wr *bad_wr; - int i; + int i, off; + struct sk_buff *skb = tx_req->skb; + skb_frag_t *frags = skb_shinfo(skb)->frags; + int nr_frags = skb_shinfo(skb)->nr_frags; + u64 *mapping = tx_req->mapping; + + if (skb_headlen(skb)) { + priv->tx_sge[0].addr = mapping[0]; + priv->tx_sge[0].length = skb_headlen(skb); + off = 1; + } else + off = 0; - priv->tx_sge[0].addr = mapping[0]; - priv->tx_sge[0].length = headlen; for (i = 0; i < nr_frags; ++i) { - priv->tx_sge[i + 1].addr = mapping[i + 1]; - priv->tx_sge[i + 1].length = frags[i].size; + priv->tx_sge[i + off].addr = mapping[i + off]; + priv->tx_sge[i + off].length = frags[i].size; } - priv->tx_wr.num_sge = nr_frags + 1; + priv->tx_wr.num_sge = nr_frags + off; priv->tx_wr.wr_id = wr_id; priv->tx_wr.wr.ud.remote_qpn = qpn; priv->tx_wr.wr.ud.ah = address; + if (head) { + priv->tx_wr.wr.ud.mss = skb_shinfo(skb)->gso_size; + priv->tx_wr.wr.ud.header = head; + priv->tx_wr.wr.ud.hlen = hlen; + priv->tx_wr.opcode = IB_WR_LSO; + } else + priv->tx_wr.opcode = IB_WR_SEND; + return ib_post_send(priv->qp, &priv->tx_wr, &bad_wr); } @@ -372,14 +389,36 @@ void ipoib_send(struct net_device *dev, struct sk_buff *skb, { struct ipoib_dev_priv *priv = netdev_priv(dev); struct ipoib_tx_buf *tx_req; + int hlen; + void *phead; + + if (!skb_is_gso(skb)) { + if (unlikely(skb->len > priv->mcast_mtu + IPOIB_ENCAP_LEN)) { + ipoib_warn(priv, "packet len %d (> %d) too long to send, dropping\n", + skb->len, priv->mcast_mtu + IPOIB_ENCAP_LEN); + ++dev->stats.tx_dropped; + ++dev->stats.tx_errors; + ipoib_cm_skb_too_long(dev, skb, priv->mcast_mtu); + return; + } + phead = 0; + hlen = 0; + } else { + /* + * LSO header is limited to max 60 bytes + */ + if 
(unlikely((ip_hdr(skb)->ihl + tcp_hdr(skb)->doff) > 15)) { + ipoib_warn(priv, "ip(%d) and tcp(%d) headers too long, dropping skb\n", + ip_hdr(skb)->ihl << 2, tcp_hdr(skb)->doff << 2); + goto drop; + } - if (unlikely(skb->len > priv->mcast_mtu + IPOIB_ENCAP_LEN)) { - ipoib_warn(priv, "packet len %d (> %d) too long to send, dropping\n", - skb->len, priv->mcast_mtu + IPOIB_ENCAP_LEN); - ++dev->stats.tx_dropped; - ++dev->stats.tx_errors; - ipoib_cm_skb_too_long(dev, skb, priv->mcast_mtu); - return; + hlen = ((ip_hdr(skb)->ihl + tcp_hdr(skb)->doff) << 2) + IPOIB_ENCAP_LEN; + phead = skb->data; + if (unlikely(!skb_pull(skb, hlen))) { + ipoib_warn(priv, "linear data too small\n"); + goto drop; + } } ipoib_dbg_data(priv, "sending packet, length=%d address=%p qpn=0x%06x\n", @@ -408,8 +447,7 @@ void ipoib_send(struct net_device *dev, struct sk_buff *skb, if (unlikely(post_send(priv, priv->tx_head & (ipoib_sendq_size - 1), address->ah, qpn, - tx_req->mapping, skb_headlen(skb), - skb_shinfo(skb)->frags, skb_shinfo(skb)->nr_frags))) { + tx_req, phead, hlen))) { ipoib_warn(priv, "post_send failed\n"); ++dev->stats.tx_errors; ipoib_dma_unmap_tx(priv->ca, tx_req); @@ -425,6 +463,12 @@ void ipoib_send(struct net_device *dev, struct sk_buff *skb, netif_stop_queue(dev); } } + return; + +drop: + ++dev->stats.tx_errors; + dev_kfree_skb_any(skb); + return; } static void __ipoib_reap_ah(struct net_device *dev) diff --git a/drivers/infiniband/ulp/ipoib/ipoib_main.c b/drivers/infiniband/ulp/ipoib/ipoib_main.c index 83f8b85..9063f28 100644 --- a/drivers/infiniband/ulp/ipoib/ipoib_main.c +++ b/drivers/infiniband/ulp/ipoib/ipoib_main.c @@ -706,7 +706,9 @@ static int ipoib_start_xmit(struct sk_buff *skb, struct net_device *dev) goto out; } - ipoib_send(dev, skb, neigh->ah, IPOIB_QPN(skb->dst->neighbour->ha)); + ipoib_send(dev, skb, neigh->ah, + IPOIB_QPN(skb->dst->neighbour->ha)); + goto out; } @@ -1170,6 +1172,10 @@ static struct net_device *ipoib_add_port(const char *format, goto 
event_failed; } + if (priv->dev->features & NETIF_F_SG && priv->ca->flags & IB_DEVICE_TCP_TSO) + priv->dev->features |= NETIF_F_TSO; + + result = register_netdev(priv->dev); if (result) { printk(KERN_WARNING "%s: couldn't register ipoib port %d; error %d\n", diff --git a/drivers/infiniband/ulp/ipoib/ipoib_verbs.c b/drivers/infiniband/ulp/ipoib/ipoib_verbs.c index 5e392e0..e20f2af 100644 --- a/drivers/infiniband/ulp/ipoib/ipoib_verbs.c +++ b/drivers/infiniband/ulp/ipoib/ipoib_verbs.c @@ -153,7 +153,7 @@ int ipoib_transport_dev_init(struct net_device *dev, struct ib_device *ca) .max_recv_sge = 1 }, .sq_sig_type = IB_SIGNAL_ALL_WR, - .qp_type = IB_QPT_UD + .qp_type = IB_QPT_UD, }; int i, ret, size; @@ -191,6 +191,9 @@ int ipoib_transport_dev_init(struct net_device *dev, struct ib_device *ca) init_attr.send_cq = priv->cq; init_attr.recv_cq = priv->cq; + if (ca->flags & IB_DEVICE_TCP_TSO) + init_attr.create_flags = QP_CREATE_LSO; + priv->qp = ib_create_qp(priv->pd, &init_attr); if (IS_ERR(priv->qp)) { printk(KERN_WARNING "%s: failed to create QP\n", ca->name); -- 1.5.3.8
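A note on the header-length arithmetic in the 9/16 LSO patch above: ihl and doff are both counted in 32-bit words, so the test "(ihl + doff) > 15" is what enforces the 60-byte LSO header limit, and the shift by 2 converts words to bytes before the IPoIB encapsulation header is added. The following is only a standalone sketch of that calculation, not code from the patch. The function name lso_header_len and both macros are hypothetical, and the encapsulation length of 4 bytes is an assumption about IPOIB_ENCAP_LEN.

```c
/* Hypothetical constants, for illustration only. */
#define ENCAP_LEN 4           /* assumed size of the IPoIB encapsulation header, in bytes */
#define MAX_LSO_HDR_WORDS 15  /* 15 words * 4 bytes = the 60-byte limit named in the patch */

/*
 * Return the number of header bytes (encapsulation + IP + TCP) that would be
 * passed to the HCA as the LSO header, or -1 if the IP and TCP headers
 * together exceed the 60-byte limit.
 *
 * ihl and doff are given in 32-bit words, exactly as they appear in the
 * IP header length and TCP data offset fields.
 */
static int lso_header_len(unsigned int ihl, unsigned int doff)
{
    if (ihl + doff > MAX_LSO_HDR_WORDS)
        return -1;                       /* the patch drops the skb in this case */

    return (int)((ihl + doff) << 2) + ENCAP_LEN;
}
```

With option-free headers (ihl = 5, doff = 5) this gives 40 bytes of IP and TCP header plus the encapsulation header. A TCP header carrying its full 40 bytes of options (doff = 15) can never pass the check, because the minimum IP header already takes 5 words.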
From vlad at dev.mellanox.co.il Tue Jan 29 08:46:45 2008 From: vlad at dev.mellanox.co.il (Vladimir Sokolovsky) Date: Tue, 29 Jan 2008 18:46:45 +0200 Subject: [ofa-general] Re: [ewg] [PATCH] IB/IPoIB Check if grat. ARP changed had arrived when working in connected mode In-Reply-To: <479F0390.8020102@voltaire.com> References: <479F0390.8020102@voltaire.com> Message-ID: <479F5875.4070405@dev.mellanox.co.il> Moni Shoua wrote: > move a little up the code that checks for a situation where the remote GID stored in the ipoib_neigh is > different than the one present in the neighbour (handle Gratuitous ARP) or that a bonding fail over has > happened but the neighbour still has a pointer to an ipoib_neigh created not by the current slave. This > will cause the driver to apply the check also for connected mode neighbours. > This patch was tested against upstream kernel and ofed_kernel. > > Signed-off-by: Or Gerlitz > Signed-off-by: Moni Shoua > Applied, Regards, Vladimir From dledford at redhat.com Tue Jan 29 09:42:02 2008 From: dledford at redhat.com (Doug Ledford) Date: Tue, 29 Jan 2008 12:42:02 -0500 Subject: [ofa-general] Dapl 2 question/issue Message-ID: <1201628522.28486.36.camel@firewall.xsintricity.com> OK, I've been working on integrating the latest dapl stuff into our RHEL5.2 product and I've come across what I think is an issue. The dapl-2 code is not compatible with dapl-1 code, and there is a (albeit small, but still it exists) amount of work to forward port code. However, you maintained the same library name (aka, libdat) for both dapl-1 and dapl-2. That means that, if code were to #include and then link against -ldat, they would get the old dapl-1 headers and the new dapl-2 library (assuming the dapl-1 headers are installed, which realistically they need to be until all dependent code has been forward ported to dapl-2). 
In order for dapl-1 and dapl-2 libraries and devel environments to be installed simultaneously, which is what you need for a seamless migration from version 1 to 2, you need different names on the libs. Is there any chance we can get an updated dapl-2 that actually changes the lib name to libdat2.so instead of just libdat.so? -- Doug Ledford GPG KeyID: CFBFF194 http://people.redhat.com/dledford Infiniband specific RPMs available at http://people.redhat.com/dledford/Infiniband -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 189 bytes Desc: This is a digitally signed message part URL: From ahubbe at iol.unh.edu Tue Jan 29 09:58:25 2008 From: ahubbe at iol.unh.edu (Allen Hubbe) Date: Tue, 29 Jan 2008 12:58:25 -0500 (EST) Subject: [ofa-general] Dapl 2 question/issue In-Reply-To: <1201628522.28486.36.camel@firewall.xsintricity.com> References: <1201628522.28486.36.camel@firewall.xsintricity.com> Message-ID: I wonder: Could this be related to the problem I am having? https://bugs.openfabrics.org/show_bug.cgi?id=867 Allen On Tue, 29 Jan 2008, Doug Ledford wrote: > OK, I've been working on integrating the latest dapl stuff into our > RHEL5.2 product and I've come across what I think is an issue. > > The dapl-2 code is not compatible with dapl-1 code, and there is a > (albeit small, but still it exists) amount of work to forward port code. > However, you maintained the same library name (aka, libdat) for both > dapl-1 and dapl-2. That means that, if code were to #include > and then link against -ldat, they would get the old dapl-1 > headers and the new dapl-2 library (assuming the dapl-1 headers are > installed, which realistically they need to be until all dependent code > has been forward ported to dapl-2). 
In order for dapl-1 and dapl-2 > libraries and devel environments to be installed simultaneously, which > is what you need for a seamless migration from version 1 to 2, you need > different names on the libs. Is there any chance we can get an updated > dapl-2 that actually changes the lib name to libdat2.so instead of just > libdat.so? > > -- > Doug Ledford > GPG KeyID: CFBFF194 > http://people.redhat.com/dledford > > Infiniband specific RPMs available at > http://people.redhat.com/dledford/Infiniband > From dledford at redhat.com Tue Jan 29 10:08:01 2008 From: dledford at redhat.com (Doug Ledford) Date: Tue, 29 Jan 2008 13:08:01 -0500 Subject: [ofa-general] Dapl 2 question/issue In-Reply-To: References: <1201628522.28486.36.camel@firewall.xsintricity.com> Message-ID: <1201630081.28486.37.camel@firewall.xsintricity.com> On Tue, 2008-01-29 at 12:58 -0500, Allen Hubbe wrote: > I wonder: Could this be related to the problem I am having?
> > https://bugs.openfabrics.org/show_bug.cgi?id=867 I wouldn't be surprised if this is related. > Allen > > On Tue, 29 Jan 2008, Doug Ledford wrote: > > > OK, I've been working on integrating the latest dapl stuff into our > > RHEL5.2 product and I've come across what I think is an issue. > > > > The dapl-2 code is not compatible with dapl-1 code, and there is a > > (albeit small, but still it exists) amount of work to forward port code. > > However, you maintained the same library name (aka, libdat) for both > > dapl-1 and dapl-2. That means that, if code were to #include > > and then link against -ldat, they would get the old dapl-1 > > headers and the new dapl-2 library (assuming the dapl-1 headers are > > installed, which realistically they need to be until all dependent code > > has been forward ported to dapl-2). In order for dapl-1 and dapl-2 > > libraries and devel environments to be installed simultaneously, which > > is what you need for a seamless migration from version 1 to 2, you need > > different names on the libs. Is there any chance we can get an updated > > dapl-2 that actually changes the lib name to libdat2.so instead of just > > libdat.so? > > > > -- > > Doug Ledford > > GPG KeyID: CFBFF194 > > http://people.redhat.com/dledford > > > > Infiniband specific RPMs available at > > http://people.redhat.com/dledford/Infiniband > > -- Doug Ledford GPG KeyID: CFBFF194 http://people.redhat.com/dledford Infiniband specific RPMs available at http://people.redhat.com/dledford/Infiniband -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 189 bytes Desc: This is a digitally signed message part URL: From ardavis at ichips.intel.com Tue Jan 29 10:09:49 2008 From: ardavis at ichips.intel.com (Arlin Davis) Date: Tue, 29 Jan 2008 10:09:49 -0800 Subject: [ofa-general] Dapl 2 question/issue In-Reply-To: References: <1201628522.28486.36.camel@firewall.xsintricity.com> Message-ID: <479F6BED.6060809@ichips.intel.com> Allen Hubbe wrote: > I wonder: Could this be related to the problem I am having? > > https://bugs.openfabrics.org/show_bug.cgi?id=867 That is a different issue: dapltest did not include a declaration for inet_ntoa. The symbol was resolved at load time, but with the implicit default return type of int instead of char*, which caused a segfault. The fix is in the latest packages released yesterday. -arlin From ardavis at ichips.intel.com Tue Jan 29 10:44:09 2008 From: ardavis at ichips.intel.com (Arlin Davis) Date: Tue, 29 Jan 2008 10:44:09 -0800 Subject: [ofa-general] Re: Dapl 2 question/issue In-Reply-To: <1201628522.28486.36.camel@firewall.xsintricity.com> References: <1201628522.28486.36.camel@firewall.xsintricity.com> Message-ID: <479F73F9.9050704@ichips.intel.com> Doug Ledford wrote: > OK, I've been working on integrating the latest dapl stuff into our > RHEL5.2 product and I've come across what I think is an issue. > > The dapl-2 code is not compatible with dapl-1 code, and there is a > (albeit small, but still it exists) amount of work to forward port code. > However, you maintained the same library name (aka, libdat) for both > dapl-1 and dapl-2. Yes, I wanted to stay away from renaming on every major release if possible. But I can see your point if someone wants v1 and v2 development packages on the same system. The intention was to provide only the latest development environment with older versions supported in libraries only.
That means that, if code were to #include > and then link against -ldat, they would get the old dapl-1 > headers and the new dapl-2 library (assuming the dapl-1 headers are > installed, which realistically they need to be until all dependent code > has been forward ported to dapl-2). In order for dapl-1 and dapl-2 > libraries and devel environments to be installed simultaneously, which > is what you need for a seamless migration from version 1 to 2, you need > different names on the libs. Is there any chance we can get an updated > dapl-2 that actually changes the lib name to libdat2.so instead of just > libdat.so? I have no objections to libdat2.so. James, do you see any issues? Anyone else? -arlin
From dledford at redhat.com Tue Jan 29 10:53:15 2008 From: dledford at redhat.com (Doug Ledford) Date: Tue, 29 Jan 2008 13:53:15 -0500 Subject: [ofa-general] Re: Dapl 2 question/issue In-Reply-To: <479F73F9.9050704@ichips.intel.com> References: <1201628522.28486.36.camel@firewall.xsintricity.com> <479F73F9.9050704@ichips.intel.com> Message-ID: <1201632795.28486.51.camel@firewall.xsintricity.com> On Tue, 2008-01-29 at 10:44 -0800, Arlin Davis wrote: > Doug Ledford wrote: > > OK, I've been working on integrating the latest dapl stuff into our > > RHEL5.2 product and I've come across what I think is an issue. > > > > The dapl-2 code is not compatible with dapl-1 code, and there is a > > (albeit small, but still it exists) amount of work to forward port code. > > However, you maintained the same library name (aka, libdat) for both > > dapl-1 and dapl-2. > > Yes, I wanted to stay away from renaming on every major release if > possible. But I can see your point if someone wants v1 and v2 > development packages on the same system. The intention was to provide > only the latest development environment with older versions supported in > libraries only. Well, let me say why I bring this up. Although I don't personally know anyone that uses it this way, our openmpi package is built with udapl provider support. So, here I am getting ready to build openmpi-1.2.5, and I would have to turn off the udapl provider if I didn't include the dat-1 devel environment because openmpi has not been forward ported (to my knowledge, but the fact that it still uses #include instead of #include is a strong indicator). If I turn off the udapl provider in openmpi, it probably isn't any big loss since most people in their right mind just use the openib provider instead. But, technically, it would be a regression since it was enabled previously. So, this is the sort of thing I ran into. I figure if it affects me, it just might bite some user out there too. Hence this conversation.
> That means that, if code were to #include > > and then link against -ldat, they would get the old dapl-1 > > headers and the new dapl-2 library (assuming the dapl-1 headers are > > installed, which realistically they need to be until all dependent code > > has been forward ported to dapl-2). In order for dapl-1 and dapl-2 > > libraries and devel environments to be installed simultaneously, which > > is what you need for a seamless migration from version 1 to 2, you need > > different names on the libs. Is there any chance we can get an updated > > dapl-2 that actually changes the lib name to libdat2.so instead of just > > libdat.so? > > I have no objections to libdat2.so. > > James, do you see any issues? Anyone else? > > -arlin > -- Doug Ledford GPG KeyID: CFBFF194 http://people.redhat.com/dledford Infiniband specific RPMs available at http://people.redhat.com/dledford/Infiniband -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 189 bytes Desc: This is a digitally signed message part URL: From changquing.tang at hp.com Tue Jan 29 11:05:25 2008 From: changquing.tang at hp.com (Tang, Changqing) Date: Tue, 29 Jan 2008 19:05:25 +0000 Subject: [ofa-general] Re: Dapl 2 question/issue In-Reply-To: <479F73F9.9050704@ichips.intel.com> References: <1201628522.28486.36.camel@firewall.xsintricity.com> <479F73F9.9050704@ichips.intel.com> Message-ID: > -----Original Message----- > From: general-bounces at lists.openfabrics.org > [mailto:general-bounces at lists.openfabrics.org] On Behalf Of > Arlin Davis > Sent: Tuesday, January 29, 2008 12:44 PM > To: Doug Ledford > Cc: general > Subject: [ofa-general] Re: Dapl 2 question/issue > > Doug Ledford wrote: > > OK, I've been working on integrating the latest dapl stuff into our > > RHEL5.2 product and I've come across what I think is an issue. 
> > > > The dapl-2 code is not compatible with dapl-1 code, and there is a > > (albeit small, but still it exists) amount of work to > forward port code. > > However, you maintained the same library name (aka, libdat) for both > > dapl-1 and dapl-2. > > Yes, I wanted to stay away from renaming on every major > release if possible. But I can see your point if someone > wants v1 and v2 development packages on the same system. The > intention was to provide only the latest development > environment with older versions supported in libraries only. > > That means that, if code were to #include > > and then link against -ldat, they would get the > old dapl-1 > > headers and the new dapl-2 library (assuming the dapl-1 headers are > > installed, which realistically they need to be until all dependent > > code has been forward ported to dapl-2). In order for dapl-1 and > > dapl-2 libraries and devel environments to be installed > > simultaneously, which is what you need for a seamless > migration from > > version 1 to 2, you need different names on the libs. Is there any > > chance we can get an updated > > dapl-2 that actually changes the lib name to libdat2.so instead of > > just libdat.so? > > I have no objections to libdat2.so. > > James, do you see any issues? Anyone else? If it is done this way, then all previously linked code must be relinked, even if it only uses basic uDAPL features. Also, HP-MPI calls dlopen(libdat.so) and then dat_registry_list_providers() to find the version number; this function always returns the runtime uDAPL version.
--CQ > > -arlin > > > _______________________________________________ > general mailing list > general at lists.openfabrics.org > http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general > > To unsubscribe, please visit > http://openib.org/mailman/listinfo/openib-general > From dledford at redhat.com Tue Jan 29 11:26:03 2008 From: dledford at redhat.com (Doug Ledford) Date: Tue, 29 Jan 2008 14:26:03 -0500 Subject: [ofa-general] Re: Dapl 2 question/issue In-Reply-To: References: <1201628522.28486.36.camel@firewall.xsintricity.com> <479F73F9.9050704@ichips.intel.com> Message-ID: <1201634763.28486.63.camel@firewall.xsintricity.com> On Tue, 2008-01-29 at 19:05 +0000, Tang, Changqing wrote: > > > -----Original Message----- > > From: general-bounces at lists.openfabrics.org > > [mailto:general-bounces at lists.openfabrics.org] On Behalf Of > > Arlin Davis > > Sent: Tuesday, January 29, 2008 12:44 PM > > To: Doug Ledford > > Cc: general > > Subject: [ofa-general] Re: Dapl 2 question/issue > > > > Doug Ledford wrote: > > > OK, I've been working on integrating the latest dapl stuff into our > > > RHEL5.2 product and I've come across what I think is an issue. > > > > > > The dapl-2 code is not compatible with dapl-1 code, and there is a > > > (albeit small, but still it exists) amount of work to > > forward port code. > > > However, you maintained the same library name (aka, libdat) for both > > > dapl-1 and dapl-2. > > > > Yes, I wanted to stay away from renaming on every major > > release if possible. But I can see your point if someone > > wants v1 and v2 development packages on the same system. The > > intention was to provide only the latest development > > environment with older versions supported in libraries only. 
> > > > That means that, if code were to #include > > > and then link against -ldat, they would get the > > old dapl-1 > > > headers and the new dapl-2 library (assuming the dapl-1 headers are > > > installed, which realistically they need to be until all dependent > > > code has been forward ported to dapl-2). In order for dapl-1 and > > > dapl-2 libraries and devel environments to be installed > > > simultaneously, which is what you need for a seamless > > migration from > > > version 1 to 2, you need different names on the libs. Is there any > > > chance we can get an updated > > > dapl-2 that actually changes the lib name to libdat2.so instead of > > > just libdat.so? > > > > I have no objections to libdat2.so. > > > > James, do you see any issues? Anyone else? > > If it is this way, then all previous linked code must re-link again, even they > are only using basic uDAPL features. That's true anyway. If you compiled a basic udapl application against libdat.so back when it was a link to libdat.so.1, then that's what got stored in the app. You would have to relink the app against libdat.so.2 for it to ever use the new library. > And HP-MPI dlopen(libdat.so), and call dat_registry_list_providers() to find the version > number, this function always returns the runtime udapl version. A runtime dlopen is different than being linked against a library, and yes in this case you would now have to dlopen(libdat2.so) and if that fails fall back to (libdat.so). However, given that according to the dapl-1 to dapl-2 porting guide there are several structures that have changed their layout between dat-1 and dat-2, I would think that HP-MPI would have to jump through hoops (either by knowing about both layouts, or by knowing what structures to avoid because they change) in order to support working with either dat-1 or dat-2 at runtime. If anything, that would make me think this is a good example of why they *should* be different library names. 
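The runtime fallback described above (try the new library name first, then the old one) can be sketched in C roughly as follows. This is only an illustration under stated assumptions: libdat2.so is the name proposed in this thread, and open_first_available() and dat_candidates are hypothetical helper names, not part of uDAPL. Loading succeeds or fails by name only; as noted above, the caller still has to know which structure layouts go with the library it actually got.

```c
#include <dlfcn.h>
#include <stddef.h>

/*
 * Try each candidate library name in order and return the first handle
 * that dlopen() can load, or NULL if none of them loads. On return,
 * *chosen (if non-NULL) points at the name that was loaded, or is NULL
 * when nothing could be loaded.
 */
static void *open_first_available(const char *const names[], size_t count,
                                  const char **chosen)
{
    size_t i;

    if (chosen)
        *chosen = NULL;

    for (i = 0; i < count; i++) {
        void *handle = dlopen(names[i], RTLD_NOW | RTLD_LOCAL);

        if (handle) {
            if (chosen)
                *chosen = names[i];
            return handle;
        }
    }
    return NULL;
}

/* Assumed preference order: the proposed v2 name first, then the v1 name. */
static const char *const dat_candidates[] = { "libdat2.so", "libdat.so" };
```

A caller would pass dat_candidates with a count of 2 and then use the returned name to decide whether to take its dat-1 or dat-2 code path before calling dlsym() on the handle. On older glibc systems the program also needs to be linked with -ldl.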
> --CQ > > > > > > > -arlin > > > > > > _______________________________________________ > > general mailing list > > general at lists.openfabrics.org > > http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general > > > > To unsubscribe, please visit > > http://openib.org/mailman/listinfo/openib-general > > -- Doug Ledford GPG KeyID: CFBFF194 http://people.redhat.com/dledford Infiniband specific RPMs available at http://people.redhat.com/dledford/Infiniband -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 189 bytes Desc: This is a digitally signed message part URL: From changquing.tang at hp.com Tue Jan 29 11:39:20 2008 From: changquing.tang at hp.com (Tang, Changqing) Date: Tue, 29 Jan 2008 19:39:20 +0000 Subject: [ofa-general] Re: Dapl 2 question/issue In-Reply-To: <1201634763.28486.63.camel@firewall.xsintricity.com> References: <1201628522.28486.36.camel@firewall.xsintricity.com> <479F73F9.9050704@ichips.intel.com> <1201634763.28486.63.camel@firewall.xsintricity.com> Message-ID: > A runtime dlopen is different than being linked against a > library, and yes in this case you would now have to > dlopen(libdat2.so) and if that fails fall back to > (libdat.so). However, given that according to the > dapl-1 to dapl-2 porting guide there are several structures > that have changed their layout between dat-1 and dat-2, I > would think that HP-MPI would have to jump through hoops > (either by knowing about both layouts, or by knowing what > structures to avoid because they change) in order to support > working with either dat-1 or dat-2 at runtime. If anything, > that would make me think this is a good example of why they > *should* be different library names. If there is a way to seamlessly work on either version available on the system without asking the user to choose, I am OK to change the library name.
According to what you said, we can dlopen(libdat2.so) first, then fall back to dlopen(libdat.so). But we have to claim that HP-MPI xxx version and older does not work with uDAPL 2.0. Currently HP-MPI works seamlessly on uDAPL 1.1 or uDAPL 1.2 systems. --CQ > > --CQ > > > > > > > > > > > > -arlin > > > > > > > > > _______________________________________________ > > > general mailing list > > > general at lists.openfabrics.org > > > http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general > > > > > > To unsubscribe, please visit > > > http://openib.org/mailman/listinfo/openib-general > > > > -- > Doug Ledford > GPG KeyID: CFBFF194 > http://people.redhat.com/dledford > > Infiniband specific RPMs available at > http://people.redhat.com/dledford/Infiniband > From Jeffrey.C.Becker at nasa.gov Tue Jan 29 11:47:36 2008 From: Jeffrey.C.Becker at nasa.gov (Jeff Becker) Date: Tue, 29 Jan 2008 11:47:36 -0800 Subject: [ofa-general] OFA server rebooted In-Reply-To: <479EFED5.6050406@dev.mellanox.co.il> References: <479E2FA5.6070903@nasa.gov> <479EFED5.6050406@dev.mellanox.co.il> Message-ID: <479F82D8.8090106@nasa.gov> Hi Vlad Vladimir Sokolovsky wrote: > Jeff Becker wrote: >> Hi. After all the e-mails, I started investigating the disk space issue, >> and found some unkillable processes hanging on our NFS backup partition. >> After 479 days of uptime :-) , I figured it was a good time to reboot >> the server.
It's back up now and the web page, wiki and bugzilla all >> seem to be OK. Please let me know if you find any problems. Thanks. >> >> -jeff >> _______________________________________________ >> general mailing list >> general at lists.openfabrics.org >> http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general >> >> To unsubscribe, please visit >> http://openib.org/mailman/listinfo/openib-general >> > Hi Jeff, > Please check if the time on the OFA server is correct. Good catch - we were off by an hour - fixed via ntp. Thanks. -jeff > > Thanks, > Vladimir From dledford at redhat.com Tue Jan 29 11:58:16 2008 From: dledford at redhat.com (Doug Ledford) Date: Tue, 29 Jan 2008 14:58:16 -0500 Subject: [ofa-general] Re: Dapl 2 question/issue In-Reply-To: References: <1201628522.28486.36.camel@firewall.xsintricity.com> <479F73F9.9050704@ichips.intel.com> <1201634763.28486.63.camel@firewall.xsintricity.com> Message-ID: <1201636696.28486.75.camel@firewall.xsintricity.com> On Tue, 2008-01-29 at 19:39 +0000, Tang, Changqing wrote: > > > A runtime dlopen is different than being linked against a > > library, and yes in this case you would now have to > > dlopen(libdat2.so) and if that fails fall back to > > (libdat.so). However, given that according to the > > dapl-1 to dapl-2 porting guide there are several structures > > that have changed their layout between dat-1 and dat-2, I > > would think that HP-MPI would have to jump through hoops > > (either by knowing about both layouts, or by knowing what > > structures to avoid because they change) in order to support > > working with either dat-1 or dat-2 at runtime. If anything, > > that would make me think this is a good example of why they > > *should* be different library names. > > If there is a way to seamlessly work on either version available on the system > without asking the user to choose, I am OK to change the library name. I don't think there is. Dapl-1 and dapl-2 simply are not API compatible.
You have to port to dapl-2 (or so the docs say, I haven't written any code that uses dapl, so I can't speak from experience). > According to what you said, we can dlopen(libdat2.so) first, then fall back to > dlopen(libdat.so). > > But we have to claim that HP-MPI xxx version and older does not work with > uDAPL 2.0. You *should* be claiming that regardless. The API between dapl-1 and dapl-2 changed, some of which is fixed simply by recompiling and some of which requires actual code changes. Just because the library name is the same doesn't mean that code built and compiled against dapl-1 could or should attempt to run against dapl-2 (unless I'm wrong here, Arlin should really speak to this issue...if the dapl-2 libraries provide backward compatible symbols via so symbol versions, then it might be possible for a dapl-1 program to run against the dapl-2 library, but then that would beg the question of why we are still distributing dapl-1 libraries in a separate package, so I'm guessing there is not a backward compatible layer in the dapl-2 library). > Currently HP-MPI works seamlessly on uDAPL 1.1 or uDAPL 1.2 systems. Yes, and that's typical. Between minor point releases it's common that things "just work". Between major releases, it isn't. Between major releases it's common that at a minimum a recompile and relink is needed, but in this case you not only need a recompile and relink, but you need a few logical changes as well. It isn't plug-n-play for the dapl-1 to dapl-2 update.
> > --CQ > > > > > --CQ > > > > > > > > > > > > > > > > > -arlin > > > > > > > > > > > > _______________________________________________ > > > > general mailing list > > > > general at lists.openfabrics.org > > > > http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general > > > > > > > > To unsubscribe, please visit > > > > http://openib.org/mailman/listinfo/openib-general > > > > > > -- > > Doug Ledford > > GPG KeyID: CFBFF194 > > http://people.redhat.com/dledford > > > > Infiniband specific RPMs available at > > http://people.redhat.com/dledford/Infiniband > > -- Doug Ledford GPG KeyID: CFBFF194 http://people.redhat.com/dledford Infiniband specific RPMs available at http://people.redhat.com/dledford/Infiniband -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 189 bytes Desc: This is a digitally signed message part URL: From jlentini at netapp.com Tue Jan 29 12:37:43 2008 From: jlentini at netapp.com (James Lentini) Date: Tue, 29 Jan 2008 15:37:43 -0500 (EST) Subject: [ofa-general] Re: Dapl 2 question/issue In-Reply-To: <479F73F9.9050704@ichips.intel.com> References: <1201628522.28486.36.camel@firewall.xsintricity.com> <479F73F9.9050704@ichips.intel.com> Message-ID: On Tue, 29 Jan 2008, Arlin Davis wrote: > Doug Ledford wrote: > > OK, I've been working on integrating the latest dapl stuff into our > > RHEL5.2 product and I've come across what I think is an issue. > > > > The dapl-2 code is not compatible with dapl-1 code, and there is a > > (albeit small, but still it exists) amount of work to forward port code. > > However, you maintained the same library name (aka, libdat) for both > > dapl-1 and dapl-2. > > Yes, I wanted to stay away from renaming on every major release if possible. > But I can see your point if someone wants v1 and v2 development packages on > the same system. 
The intention was to provide only the latest development > environment with older versions supported in libraries only. > > That means that, if code were to #include > > and then link against -ldat, they would get the old dapl-1 > > headers and the new dapl-2 library (assuming the dapl-1 headers are > > installed, which realistically they need to be until all dependent code > > has been forward ported to dapl-2). In order for dapl-1 and dapl-2 > > libraries and devel environments to be installed simultaneously, which > > is what you need for a seamless migration from version 1 to 2, you need > > different names on the libs. Is there any chance we can get an updated > > dapl-2 that actually changes the lib name to libdat2.so instead of just > > libdat.so? > > I have no objections to libdat2.so. > > James, do you see any issues? Anyone else? I don't have any objections either. Arkady, are there any spec. issues? From dledford at redhat.com Tue Jan 29 12:41:25 2008 From: dledford at redhat.com (Doug Ledford) Date: Tue, 29 Jan 2008 15:41:25 -0500 Subject: [ofa-general] OFED Jan 28 meeting summary on RC3 readiness In-Reply-To: <6C2C79E72C305246B504CBA17B5500C903368531@mtlexch01.mtl.com> References: <6C2C79E72C305246B504CBA17B5500C903368531@mtlexch01.mtl.com> Message-ID: <1201639285.28486.101.camel@firewall.xsintricity.com> On Tue, 2008-01-29 at 14:21 +0200, Tziporet Koren wrote: > OFED Jan 28 meeting summary on RC3 readiness: > ===================================== > > 1. OFED 1.3 readiness toward RC3 this week > > * RC3 is based on the official 2.6.24 release > * RC3 is expected on Wed > * RC4 is planned for Feb 13 > > > 2. All companies update: > > * IBM - ready for RC3 > * Voltaire - ready for RC3 > * Qlogic - ready for RC3; will work on bug 874 > * Intel - things looks good. 
Need some uDAPL update from > Arlin > * Chelsio - ready for RC3 > * NetEffect - ready for RC3 > * Cisco - reported all issues in bugzilla > * Mellanox - ready for RC3 > * MPI - all packages are ready > > > 3. Request to change IPoIB to support CM without SRQ and 4K MTU > > Decided that we cannot insert such enhancements at this stage > (RC3 built today) without delaying the release since IPoIB is > a critical ULP used by all customers. > > Since we do not want to delay the release and we wish to have > a solution for the new IPoIB enhancements we plan to have > 1.3.1 release Hmmm...I'd like to put my $.02 in here. I don't have any visibility into what drives the OFED schedule, so I have no clue as to why people don't want to slip the schedule for this change. I'm sure you guys have your reasons. However, I also happen to be a consumer of this code, and I know for a fact that no one has gotten my input on this issue. So, the deal is that I'm currently integrating OFED 1.3 into what will be RHEL5.2. The RHEL5.2 freeze date has already passed, but in order to keep what finally goes out from being too stale, I'm being allowed to submit the OFED-1.3-rc1 code prior to freeze, and then update to OFED-1.3 final during our beta test process. What this means, is that anything you punt from 1.3 to 1.3.1, you are also punting out of RHEL5.2 and RHEL4.7. So, that being said, there's a whole trickle down effect with various groups that would really like to be able to use 5.2 out of the box that may prefer a slip in 1.3 so that this can be part of it instead of punting to 1.3.1. I'm not saying this will change your mind, but I'm sure it wasn't part of the decision process before, so I'm bringing it up. > AIs: > Tziporet to define the 1.3.1 release (scope of changes, > schedule etc.) > Vlad: open 1_3_1 branch so people will have a place to commit > changes. We will not start any daily build before 1.3 release > > > 3. 
Review high priority bugs: > 846 critical jim at mellanox.com SDP crash on RHEL5 > ppc64 running netserver - will be debugged > > 859 critical monis at voltaire.com Bonding configuration > on Sles10 sp1 is not loaded consistently - fixed > 863 critical monis at voltaire.com ib-bonding won't > compile for RHEL4 U6 - fixed > 874 critical rjwalsh at pathscale.com Intel MPI (IMB test) > hangs intermittently on the qlogic HCA - will be debugged by > Qlogic > > 760 major eli at mellanox.co.il UDP performance on Rx is lower > than Tx - for 1.3.1 > 761 major eli at mellanox.co.il Poor and jittery UDP > performance at small messages - for 1.3.1 Ditto for requesting these two be in 1.3. We've already had customers bring up the UDP performance issue in our previous releases. > 869 major orenk at dev.mellanox.co.il mstflint won't build > on SLES10 x86 - fixed > 736 major rolandd at cisco.com IBV_WC_RETRY_EXC_ERR errors > with local rdma_reads - seems a FW issue (Mellanox to > debug) > > 767 major swise at opengridcomputing.com Non backport Kernels > that don't build in genalloc cause compile errors for cxgb3 - no fix > (document) And we still need to get actual downloads for a number of the srpms in OFED-1.3. The various spec files list fictitious tarballs that aren't actually available on the download server. While that works for the rcs, they really need to have a tarball up there for final. -- Doug Ledford GPG KeyID: CFBFF194 http://people.redhat.com/dledford Infiniband specific RPMs available at http://people.redhat.com/dledford/Infiniband -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 189 bytes Desc: This is a digitally signed message part URL: From Arkady.Kanevsky at netapp.com Tue Jan 29 13:10:46 2008 From: Arkady.Kanevsky at netapp.com (Kanevsky, Arkady) Date: Tue, 29 Jan 2008 16:10:46 -0500 Subject: [ofa-general] RE: Dapl 2 question/issue In-Reply-To: References: <1201628522.28486.36.camel@firewall.xsintricity.com> <479F73F9.9050704@ichips.intel.com> Message-ID: No spec issues. Just use "right" library names in dat.conf file. Thanks, Arkady Kanevsky email: arkady at netapp.com Network Appliance Inc. phone: 781-768-5395 1601 Trapelo Rd. - Suite 16. Fax: 781-895-1195 Waltham, MA 02451 central phone: 781-768-5300 > -----Original Message----- > From: James Lentini [mailto:jlentini at netapp.com] > Sent: Tuesday, January 29, 2008 3:38 PM > To: Arlin Davis; Kanevsky, Arkady > Cc: Doug Ledford; general > Subject: Re: Dapl 2 question/issue > > > > On Tue, 29 Jan 2008, Arlin Davis wrote: > > > Doug Ledford wrote: > > > OK, I've been working on integrating the latest dapl > stuff into our > > > RHEL5.2 product and I've come across what I think is an issue. > > > > > > The dapl-2 code is not compatible with dapl-1 code, and > there is a > > > (albeit small, but still it exists) amount of work to > forward port code. > > > However, you maintained the same library name (aka, > libdat) for both > > > dapl-1 and dapl-2. > > > > Yes, I wanted to stay away from renaming on every major > release if possible. > > But I can see your point if someone wants v1 and v2 development > > packages on the same system. The intention was to provide only the > > latest development environment with older versions > supported in libraries only. 
> > > > That means that, if code were to #include > > > and then link against -ldat, they would get the old > > > dapl-1 headers and the new dapl-2 library (assuming the dapl-1 > > > headers are installed, which realistically they need to > be until all > > > dependent code has been forward ported to dapl-2). In order for > > > dapl-1 and dapl-2 libraries and devel environments to be > installed > > > simultaneously, which is what you need for a seamless > migration from > > > version 1 to 2, you need different names on the libs. Is > there any > > > chance we can get an updated > > > dapl-2 that actually changes the lib name to libdat2.so > instead of > > > just libdat.so? > > > > I have no objections to libdat2.so. > > > > James, do you see any issues? Anyone else? > > I don't have any objections either. > > Arkady, are there any spec. issues? > From ardavis at ichips.intel.com Tue Jan 29 13:46:13 2008 From: ardavis at ichips.intel.com (Arlin Davis) Date: Tue, 29 Jan 2008 13:46:13 -0800 Subject: [ofa-general] Re: Dapl 2 question/issue In-Reply-To: <1201636696.28486.75.camel@firewall.xsintricity.com> References: <1201628522.28486.36.camel@firewall.xsintricity.com> <479F73F9.9050704@ichips.intel.com> <1201634763.28486.63.camel@firewall.xsintricity.com> <1201636696.28486.75.camel@firewall.xsintricity.com> Message-ID: <479F9EA5.4020407@ichips.intel.com> Doug Ledford wrote: > > You *should* be claiming that regardless. The API between dapl-1 and > dapl-2 changed, some of which is fixed simply be recompiling and some of > which requires actual code changes. 
Just because the library name is > the same doesn't mean that code built and compiled against dapl-1 could > or should attempt to run against dapl-2 (unless I'm wrong here, Arlin > should really speak to this issue...if the dapl-2 libraries provide > backward compatible symbols via so symbol versions, then it might be > possible for a dapl-1 program to run against the dapl-2 library, but > then that would beg the question of why we are still distributing dapl-1 > libraries in a separate package, so I'm guessing there is not a back > compatible layer in the dapl-2 library). You are correct. dapl-1 programs cannot run against dapl-2. That is why both v1 and v2 packages are provided. We have to support existing v1 applications while providing a transition path to v2. -arlin From changquing.tang at hp.com Tue Jan 29 13:57:05 2008 From: changquing.tang at hp.com (Tang, Changqing) Date: Tue, 29 Jan 2008 21:57:05 +0000 Subject: [ofa-general] Re: Dapl 2 question/issue In-Reply-To: <479F9EA5.4020407@ichips.intel.com> References: <1201628522.28486.36.camel@firewall.xsintricity.com> <479F73F9.9050704@ichips.intel.com> <1201634763.28486.63.camel@firewall.xsintricity.com> <1201636696.28486.75.camel@firewall.xsintricity.com> <479F9EA5.4020407@ichips.intel.com> Message-ID: So can we install both v1 and v2 packages on a system ? At the install time, we don't know the app will run on v1 or v2. --CQ > -----Original Message----- > From: Arlin Davis [mailto:ardavis at ichips.intel.com] > Sent: Tuesday, January 29, 2008 3:46 PM > To: Doug Ledford > Cc: Tang, Changqing; general > Subject: Re: [ofa-general] Re: Dapl 2 question/issue > > Doug Ledford wrote: > > > > You *should* be claiming that regardless. The API between > dapl-1 and > > dapl-2 changed, some of which is fixed simply be > recompiling and some > > of which requires actual code changes. 
Just because the > library name > > is the same doesn't mean that code built and compiled > against dapl-1 > > could or should attempt to run against dapl-2 (unless I'm > wrong here, > > Arlin should really speak to this issue...if the dapl-2 libraries > > provide backward compatible symbols via so symbol versions, then it > > might be possible for a dapl-1 program to run against the dapl-2 > > library, but then that would beg the question of why we are still > > distributing dapl-1 libraries in a separate package, so I'm > guessing > > there is not a back compatible layer in the dapl-2 library). > > You are correct. dapl-1 programs cannot run against dapl-2. > That is why both v1 and v2 packages are provided. We have to > support existing v1 applications while providing a transition > path to v2. > > -arlin > > From dledford at redhat.com Tue Jan 29 14:03:24 2008 From: dledford at redhat.com (Doug Ledford) Date: Tue, 29 Jan 2008 17:03:24 -0500 Subject: [ofa-general] Re: Dapl 2 question/issue In-Reply-To: References: <1201628522.28486.36.camel@firewall.xsintricity.com> <479F73F9.9050704@ichips.intel.com> <1201634763.28486.63.camel@firewall.xsintricity.com> <1201636696.28486.75.camel@firewall.xsintricity.com> <479F9EA5.4020407@ichips.intel.com> Message-ID: <1201644204.28486.116.camel@firewall.xsintricity.com> On Tue, 2008-01-29 at 21:57 +0000, Tang, Changqing wrote: > So can we install both v1 and v2 packages on a system ? At the install time, we don't know the app will run on v1 or v2. Currently, you can install both runtimes, but not both devel environments. > --CQ > > > -----Original Message----- > > From: Arlin Davis [mailto:ardavis at ichips.intel.com] > > Sent: Tuesday, January 29, 2008 3:46 PM > > To: Doug Ledford > > Cc: Tang, Changqing; general > > Subject: Re: [ofa-general] Re: Dapl 2 question/issue > > > > Doug Ledford wrote: > > > > > > You *should* be claiming that regardless. 
The API between > > dapl-1 and > > > dapl-2 changed, some of which is fixed simply be > > recompiling and some > > > of which requires actual code changes. Just because the > > library name > > > is the same doesn't mean that code built and compiled > > against dapl-1 > > > could or should attempt to run against dapl-2 (unless I'm > > wrong here, > > > Arlin should really speak to this issue...if the dapl-2 libraries > > > provide backward compatible symbols via so symbol versions, then it > > > might be possible for a dapl-1 program to run against the dapl-2 > > > library, but then that would beg the question of why we are still > > > distributing dapl-1 libraries in a separate package, so I'm > > guessing > > > there is not a back compatible layer in the dapl-2 library). > > > > You are correct. dapl-1 programs cannot run against dapl-2. > > That is why both v1 and v2 packages are provided. We have to > > support existing v1 applications while providing a transition > > path to v2. > > > > -arlin > > > > -- Doug Ledford GPG KeyID: CFBFF194 http://people.redhat.com/dledford Infiniband specific RPMs available at http://people.redhat.com/dledford/Infiniband -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 189 bytes Desc: This is a digitally signed message part URL: From changquing.tang at hp.com Tue Jan 29 14:24:50 2008 From: changquing.tang at hp.com (Tang, Changqing) Date: Tue, 29 Jan 2008 22:24:50 +0000 Subject: [ofa-general] Re: Dapl 2 question/issue In-Reply-To: <1201644204.28486.116.camel@firewall.xsintricity.com> References: <1201628522.28486.36.camel@firewall.xsintricity.com> <479F73F9.9050704@ichips.intel.com> <1201634763.28486.63.camel@firewall.xsintricity.com> <1201636696.28486.75.camel@firewall.xsintricity.com> <479F9EA5.4020407@ichips.intel.com> <1201644204.28486.116.camel@firewall.xsintricity.com> Message-ID: devel environments are just runtimes plus header files, so the libraries won't overwrite each other, but header files will ? --CQ > -----Original Message----- > From: Doug Ledford [mailto:dledford at redhat.com] > Sent: Tuesday, January 29, 2008 4:03 PM > To: Tang, Changqing > Cc: Arlin Davis; general > Subject: RE: [ofa-general] Re: Dapl 2 question/issue > > > On Tue, 2008-01-29 at 21:57 +0000, Tang, Changqing wrote: > > So can we install both v1 and v2 packages on a system ? At > the install time, we don't know the app will run on v1 or v2. > > Currently, you can install both runtimes, but not both devel > environments. > > > --CQ > > > > > -----Original Message----- > > > From: Arlin Davis [mailto:ardavis at ichips.intel.com] > > > Sent: Tuesday, January 29, 2008 3:46 PM > > > To: Doug Ledford > > > Cc: Tang, Changqing; general > > > Subject: Re: [ofa-general] Re: Dapl 2 question/issue > > > > > > Doug Ledford wrote: > > > > > > > > You *should* be claiming that regardless. The API between > > > dapl-1 and > > > > dapl-2 changed, some of which is fixed simply be > > > recompiling and some > > > > of which requires actual code changes. 
Just because the > > > library name > > > > is the same doesn't mean that code built and compiled > > > against dapl-1 > > > > could or should attempt to run against dapl-2 (unless I'm > > > wrong here, > > > > Arlin should really speak to this issue...if the dapl-2 > libraries > > > > provide backward compatible symbols via so symbol > versions, then > > > > it might be possible for a dapl-1 program to run against the > > > > dapl-2 library, but then that would beg the question of > why we are > > > > still distributing dapl-1 libraries in a separate > package, so I'm > > > guessing > > > > there is not a back compatible layer in the dapl-2 library). > > > > > > You are correct. dapl-1 programs cannot run against dapl-2. > > > That is why both v1 and v2 packages are provided. We have > to support > > > existing v1 applications while providing a transition path to v2. > > > > > > -arlin > > > > > > > -- > Doug Ledford > GPG KeyID: CFBFF194 > http://people.redhat.com/dledford > > Infiniband specific RPMs available at > http://people.redhat.com/dledford/Infiniband > From ardavis at ichips.intel.com Tue Jan 29 14:27:40 2008 From: ardavis at ichips.intel.com (Arlin Davis) Date: Tue, 29 Jan 2008 14:27:40 -0800 Subject: [ofa-general] Re: Dapl 2 question/issue In-Reply-To: References: <1201628522.28486.36.camel@firewall.xsintricity.com> <479F73F9.9050704@ichips.intel.com> <1201634763.28486.63.camel@firewall.xsintricity.com> Message-ID: <479FA85C.7030501@ichips.intel.com> Tang, Changqing wrote: > > If there is a way to seamlessl work on either version available on the system > without asking user for choice, I am OK to change the library name. If you support 1.1 and 1.2 seamlessly then you have already dealt with some of the issues that will come up with 2.0 changes. You will have to build the proper wrappers around structure/api changes and provide the correct major/minor versions during your open call. 
The big difference is now you have multiple versions of libdat.so to manage along with multiple versions of providers. > > According to what you said, we can dlopen(libdat2.so) first, then fall back to > dlopen(libdat.so). Yes, if we change the name and you add support for v2. If you fall back to v1 libdat.so then you should be running in v1 mode and only be linking to v1 providers. Look at the latest OFED 1.3 dat.conf and you can see options already exist to support both versions. MPI implementations using uDAPL and expecting v1 libraries work just fine, while dtest/dapltest built for v2 also work on the same system. > > But we have to claim that HP-MPI xxx version and older does not work with > uDAPL 2.0. Currently HP-MPI works seamlessly on uDAPL 1.1 or uDAPL 1.2 system. Yes, and you also have to claim that you support both 1.1 and 1.2 in today's movie where we have providers/devices running at different versions. FYI: I will continue to support both 1.2 and 2.0 OFA providers going forward so you can move to 2.0 at your own pace. -arlin From ardavis at ichips.intel.com Tue Jan 29 14:31:24 2008 From: ardavis at ichips.intel.com (Arlin Davis) Date: Tue, 29 Jan 2008 14:31:24 -0800 Subject: [ofa-general] Re: Dapl 2 question/issue In-Reply-To: References: <1201628522.28486.36.camel@firewall.xsintricity.com> <479F73F9.9050704@ichips.intel.com> <1201634763.28486.63.camel@firewall.xsintricity.com> <1201636696.28486.75.camel@firewall.xsintricity.com> <479F9EA5.4020407@ichips.intel.com> <1201644204.28486.116.camel@firewall.xsintricity.com> Message-ID: <479FA93C.6050305@ichips.intel.com> Tang, Changqing wrote: > devel environments are just runtimes plus header files, so the libraries won't overwrite each other, but header files will ?
v2 header files are include/dat2, v1 headers are include/dat From dledford at redhat.com Tue Jan 29 14:43:31 2008 From: dledford at redhat.com (Doug Ledford) Date: Tue, 29 Jan 2008 17:43:31 -0500 Subject: [ofa-general] Re: Dapl 2 question/issue In-Reply-To: <479F9EA5.4020407@ichips.intel.com> References: <1201628522.28486.36.camel@firewall.xsintricity.com> <479F73F9.9050704@ichips.intel.com> <1201634763.28486.63.camel@firewall.xsintricity.com> <1201636696.28486.75.camel@firewall.xsintricity.com> <479F9EA5.4020407@ichips.intel.com> Message-ID: <1201646611.28486.126.camel@firewall.xsintricity.com> On Tue, 2008-01-29 at 13:46 -0800, Arlin Davis wrote: > Doug Ledford wrote: > > > > You *should* be claiming that regardless. The API between dapl-1 and > > dapl-2 changed, some of which is fixed simply be recompiling and some of > > which requires actual code changes. Just because the library name is > > the same doesn't mean that code built and compiled against dapl-1 could > > or should attempt to run against dapl-2 (unless I'm wrong here, Arlin > > should really speak to this issue...if the dapl-2 libraries provide > > backward compatible symbols via so symbol versions, then it might be > > possible for a dapl-1 program to run against the dapl-2 library, but > > then that would beg the question of why we are still distributing dapl-1 > > libraries in a separate package, so I'm guessing there is not a back > > compatible layer in the dapl-2 library). > > You are correct. dapl-1 programs cannot run against dapl-2. That is why > both v1 and v2 packages are provided. We have to support existing v1 > applications while providing a transition path to v2. OK, then that brings us back around to what I said earlier, which is that in order to be able to recompile any application that uses udapl prior to that application being ported to dapl2, you have to be able to install both devel environments. 
After talking with some other engineers inside Red Hat, this is what I'm going to be doing for our distribution. I'll be building a devel environment for both dapl-1.2 and dapl-2.0. I won't need to change the library name for dapl-2.0, but I am going to use a different directory for its .so link. Specifically, since source code must be ported to dapl-2.0 (in the form of changing #include to dat2/dat.h in the source code), it would seem reasonable to me that the code can also be ported to link to a library in a different location (in this case, it will be %{_libdir}/dat/libdat.so, so the LDFLAGS will need to be updated with -L%{_libdir}/dat -ldat in order to get the new library). So, in our setup, code that wants to compile against the later dapl-2.0 library will need two changes (in addition to the port itself) in order to compile. All unported applications will still compile and link against the dapl-1.2 libraries. I'd be more than happy to send you guys the spec file I'm using to accomplish this if you wish. -- Doug Ledford GPG KeyID: CFBFF194 http://people.redhat.com/dledford Infiniband specific RPMs available at http://people.redhat.com/dledford/Infiniband -------------- next part -------------- A non-text attachment was scrubbed...
Name: signature.asc Type: application/pgp-signature Size: 189 bytes Desc: This is a digitally signed message part URL: From dledford at redhat.com Tue Jan 29 14:47:43 2008 From: dledford at redhat.com (Doug Ledford) Date: Tue, 29 Jan 2008 17:47:43 -0500 Subject: [ofa-general] Re: Dapl 2 question/issue In-Reply-To: References: <1201628522.28486.36.camel@firewall.xsintricity.com> <479F73F9.9050704@ichips.intel.com> <1201634763.28486.63.camel@firewall.xsintricity.com> <1201636696.28486.75.camel@firewall.xsintricity.com> <479F9EA5.4020407@ichips.intel.com> <1201644204.28486.116.camel@firewall.xsintricity.com> Message-ID: <1201646863.28486.129.camel@firewall.xsintricity.com> On Tue, 2008-01-29 at 22:24 +0000, Tang, Changqing wrote: > devel environments are just runtimes plus header files, Actually, the runtime is in the base package. The devel package is headers plus a bare .so link (aka, libdat.so -> libdat.so.2). It's the bare .so links that overwrite each other. The last thing you want is to have an application that has #include then link against -ldat and have that bare .so link point to the dapl-2 library because you will then be using the wrong header files with the shared object. Hence why I'm solving the issue the way I am in our release. > so the libraries won't overwrite each other, but header files will ? > > --CQ > > > -----Original Message----- > > From: Doug Ledford [mailto:dledford at redhat.com] > > Sent: Tuesday, January 29, 2008 4:03 PM > > To: Tang, Changqing > > Cc: Arlin Davis; general > > Subject: RE: [ofa-general] Re: Dapl 2 question/issue > > > > > > On Tue, 2008-01-29 at 21:57 +0000, Tang, Changqing wrote: > > > So can we install both v1 and v2 packages on a system ? At > > the install time, we don't know the app will run on v1 or v2. > > > > Currently, you can install both runtimes, but not both devel > > environments. 
> > > > > --CQ > > > > > > > -----Original Message----- > > > > From: Arlin Davis [mailto:ardavis at ichips.intel.com] > > > > Sent: Tuesday, January 29, 2008 3:46 PM > > > > To: Doug Ledford > > > > Cc: Tang, Changqing; general > > > > Subject: Re: [ofa-general] Re: Dapl 2 question/issue > > > > > > > > Doug Ledford wrote: > > > > > > > > > > You *should* be claiming that regardless. The API between > > > > dapl-1 and > > > > > dapl-2 changed, some of which is fixed simply be > > > > recompiling and some > > > > > of which requires actual code changes. Just because the > > > > library name > > > > > is the same doesn't mean that code built and compiled > > > > against dapl-1 > > > > > could or should attempt to run against dapl-2 (unless I'm > > > > wrong here, > > > > > Arlin should really speak to this issue...if the dapl-2 > > libraries > > > > > provide backward compatible symbols via so symbol > > versions, then > > > > > it might be possible for a dapl-1 program to run against the > > > > > dapl-2 library, but then that would beg the question of > > why we are > > > > > still distributing dapl-1 libraries in a separate > > package, so I'm > > > > guessing > > > > > there is not a back compatible layer in the dapl-2 library). > > > > > > > > You are correct. dapl-1 programs cannot run against dapl-2. > > > > That is why both v1 and v2 packages are provided. We have > > to support > > > > existing v1 applications while providing a transition path to v2. > > > > > > > > -arlin > > > > > > > > > > -- > > Doug Ledford > > GPG KeyID: CFBFF194 > > http://people.redhat.com/dledford > > > > Infiniband specific RPMs available at > > http://people.redhat.com/dledford/Infiniband > > -- Doug Ledford GPG KeyID: CFBFF194 http://people.redhat.com/dledford Infiniband specific RPMs available at http://people.redhat.com/dledford/Infiniband -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 189 bytes Desc: This is a digitally signed message part URL: From ardavis at ichips.intel.com Tue Jan 29 15:11:57 2008 From: ardavis at ichips.intel.com (Arlin Davis) Date: Tue, 29 Jan 2008 15:11:57 -0800 Subject: [ofa-general] Re: Dapl 2 question/issue In-Reply-To: <1201646611.28486.126.camel@firewall.xsintricity.com> References: <1201628522.28486.36.camel@firewall.xsintricity.com> <479F73F9.9050704@ichips.intel.com> <1201634763.28486.63.camel@firewall.xsintricity.com> <1201636696.28486.75.camel@firewall.xsintricity.com> <479F9EA5.4020407@ichips.intel.com> <1201646611.28486.126.camel@firewall.xsintricity.com> Message-ID: <479FB2BD.3010906@ichips.intel.com> Doug Ledford wrote: > > OK, then that brings us back around to what I said earlier, which is > that in order to be able to recompile any application that uses udapl > prior to that application being ported to dapl2, you have to be able to > install both devel environments. > > After talking with some other engineers inside Red Hat, this is what I'm > going to be doing for our distribution. > > I'll be building a devel environment for both dapl-1.2 and dapl-2.0. I > won't need to change the library name for dapl-2.0, but I am going to > use a different directory for its .so link. I would be more than happy to roll up the libdat2.so changes in time for OFED 1.3 if you prefer. I can probably have them ready later tonight in time for RC3. -arlin From ssufficool at rov.sbcounty.gov Tue Jan 29 16:04:09 2008 From: ssufficool at rov.sbcounty.gov (Sufficool, Stanley) Date: Tue, 29 Jan 2008 16:04:09 -0800 Subject: [ofa-general] Re: [PATCH] drivers/infiniband/ulp/srpt: Fix targetdata corruption In-Reply-To: <479A15B4.7030309@mellanox.com> Message-ID: When did ib_srpt start registering sessions with target node GUID? I thought it used the port GUID on target side.
> -----Original Message----- > From: general-bounces at lists.openfabrics.org > [mailto:general-bounces at lists.openfabrics.org] On Behalf Of Vu Pham > Sent: Friday, January 25, 2008 9:01 AM > To: davem at systemfabricworks.com > Cc: general at lists.openfabrics.org > Subject: [ofa-general] Re: [PATCH] > drivers/infiniband/ulp/srpt: Fix targetdata corruption > > > davem at systemfabricworks.com wrote: > > Change the local buffer allocator to use a spin-lock > protected linked > > list instead of an array of atomic_t used/free > variables. The atomic_t > > code was open to a multi-thread race between test and > set. This has > > been observed with the result that the same data buffer > was used for > > more than one SCSI operation, either writing the wrong > data to the disk > > or sending the wrong data to the initiator. > > > > Signed-off-by: Robert Pearson > > Signed-off-by: David A. McMillen > > > Applied. Thanks > _______________________________________________ > general mailing list > general at lists.openfabrics.org > http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general > > To unsubscribe, please visit > http://openib.org/mailman/listinfo/openib-general > From ssufficool at rov.sbcounty.gov Tue Jan 29 16:24:14 2008 From: ssufficool at rov.sbcounty.gov (Sufficool, Stanley) Date: Tue, 29 Jan 2008 16:24:14 -0800 Subject: [ofa-general] RE: [PATCH] drivers/infiniband/ulp/srpt: Fixtargetdata corruption In-Reply-To: Message-ID: > > davem at systemfabricworks.com wrote: > > > Change the local buffer allocator to use a spin-lock > > protected linked > > > list instead of an array of atomic_t used/free > > variables. The atomic_t > > > code was open to a multi-thread race between test and > > set. This has > > > been observed with the result that the same data buffer > > was used for > > > more than one SCSI operation, either writing the wrong > > data to the disk > > > or sending the wrong data to the initiator. 
> > > > > > Signed-off-by: Robert Pearson > > > Signed-off-by: David A. McMillen > > > > > Applied. Thanks _______________________________________________ Tested and passed with Windows 2003 SR2 WinOF x64 / Kernel 2.6.23 x64. Without using "ib_srpt memelements=0" From vuhuong at mellanox.com Tue Jan 29 16:23:36 2008 From: vuhuong at mellanox.com (Vu Pham) Date: Tue, 29 Jan 2008 16:23:36 -0800 Subject: [ofa-general] Re: [PATCH] drivers/infiniband/ulp/srpt: Fix targetdata corruption In-Reply-To: References: Message-ID: <479FC388.6090907@mellanox.com> Sufficool, Stanley wrote: > When did ib_srpt start registering sessions with target node GUID? I > thought it used the port GUID on target side. >
ib_srpt used to use {target port GUID + initiator port GUID} as session name.
ib_srpt recently changed to use full initiator_port_ID (128-bit) as session name to register session.
You can look at commit 15f6176b9bcc7b8122eb799b38deb09ece83bbab

 init_completion(&ch->scst_sess_done);
 sprintf(ch->sess_name, "0x%016llx%016llx",
- (unsigned long long)be64_to_cpu(*(u64 *)&sdev->port[param->port - 1].gid.raw[8]),
- (unsigned long long)be64_to_cpu(*(u64 *) (ch->i_port_id + 8)));
+ (unsigned long long)be64_to_cpu(*(u64 *)ch->i_port_id),
+ (unsigned long long)be64_to_cpu(*(u64 *)(ch->i_port_id + 8)));
 ch->scst_sess = scst_register_session(sdev->scst_tgt, 1, ch->sess_name, ch, srpt_register_channel_done);

> >> -----Original Message----- >> From: general-bounces at lists.openfabrics.org >> [mailto:general-bounces at lists.openfabrics.org] On Behalf Of Vu Pham >> Sent: Friday, January 25, 2008 9:01 AM >> To: davem at systemfabricworks.com >> Cc: general at lists.openfabrics.org >> Subject: [ofa-general] Re: [PATCH] >> drivers/infiniband/ulp/srpt: Fix targetdata corruption >> >> >> davem at systemfabricworks.com wrote: >>> Change the local buffer allocator to use a spin-lock >> protected linked >>> list instead of an array of atomic_t used/free >> variables.
The atomic_t >>> code was open to a multi-thread race between test and >> set. This has >>> been observed with the result that the same data buffer >> was used for >>> more than one SCSI operation, either writing the wrong >> data to the disk >>> or sending the wrong data to the initiator. >>> >>> Signed-off-by: Robert Pearson >>> Signed-off-by: David A. McMillen >>> >> Applied. Thanks >> _______________________________________________ >> general mailing list >> general at lists.openfabrics.org >> http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general >> >> To unsubscribe, please visit >> http://openib.org/mailman/listinfo/openib-general >> From pradeeps at linux.vnet.ibm.com Tue Jan 29 16:34:56 2008 From: pradeeps at linux.vnet.ibm.com (Pradeep Satyanarayana) Date: Tue, 29 Jan 2008 16:34:56 -0800 Subject: [ofa-general] Re: CM Enable SRQ for less than 16 s/g - bug In-Reply-To: <479D10ED.6060107@linux.vnet.ibm.com> References: <1201439328.9219.15.camel@mtls03> <479CC810.7060100@linux.vnet.ibm.com> <6C2C79E72C305246B504CBA17B5500C903321E4C@mtlexch01.mtl.com> <479D10ED.6060107@linux.vnet.ibm.com> Message-ID: <479FC630.10506@linux.vnet.ibm.com> Pradeep Satyanarayana wrote: > Eli Cohen wrote: >> I meant that the test hangs but not the system. You can still ping hosts on the ipoib interface, it is just that the test never ends. You can press Ctrl C and restart the test again. >> > > If one can Ctrl-C that means it is not hung in the kernel. Several things strike > me: > a) Is this a new version of the test? > b) Was the system left in an "unclean" state from the previous test in > the regression suite? > c) Can this test hang be reproduce by just running this test on a freshly > booted system? > > Right now I do not have access to the machines to run a test. I will try and > do it next week. 
> Hello Eli, I did run ttcp between two Mellanox HCAs (we have InfiniBand: Mellanox Technologies MT23108 InfiniHost (rev a1)) on ppc64 systems (on a Sles10sp2 beta distro) using the same command line options that were provided. I could not reproduce the hang in several iterations. I saw that you were using ttcpv; is this an enhancement to ttcp? Pradeep From arlin.r.davis at intel.com Tue Jan 29 16:58:28 2008 From: arlin.r.davis at intel.com (Arlin Davis) Date: Tue, 29 Jan 2008 16:58:28 -0800 Subject: [ofa-general] [ANNOUCE] dapl 2.0.5 release Message-ID: <000501c862db$3b9d17d0$a5e0180a@amr.corp.intel.com> There is a new release for dapl 2.0 available on the OFA download page and in my git tree. It includes changes to allow both v1 and v2 development packages to be installed on the same system. v2 libdat.so has been renamed to libdat2.so. md5sum: 010459e421a5c194438d58b1ccf1c6d0 dapl-2.0.5.tar.gz Vlad, please pull new v2 release into OFED 1.3 RC3 and install the following packages: Note: please make sure dapl-1.2.4-devel is added to list. dapl-1.2.4-1 dapl-devel-1.2.4-1 dapl-2.0.5-1 dapl-utils-2.0.5-1 dapl-devel-2.0.5-1 dapl-debuginfo-2.0.5-1 See http://www.openfabrics.org/downloads/dapl/README.html for details. -arlin -------------- next part -------------- An HTML attachment was scrubbed...
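
[Editorial sketch] The thread above discusses an application probing for the renamed v2 library first and falling back to the v1 library. A minimal sketch of that probe order follows, assuming the application loads the DAT library at run time with dlopen(). The function names open_first_available and open_dat_library are hypothetical, and the file names "libdat2.so" and "libdat.so" are taken from the discussion; the names actually installed on a given system (for example versioned sonames) may differ.

```c
#include <dlfcn.h>
#include <stddef.h>

/*
 * Try each candidate library name in order and return the handle of the
 * first one that dlopen() accepts. *chosen is set to the name that was
 * opened, or to NULL if none of the candidates could be loaded.
 */
void *open_first_available(const char *const *names, size_t count,
                           const char **chosen)
{
    *chosen = NULL;
    for (size_t i = 0; i < count; i++) {
        void *handle = dlopen(names[i], RTLD_NOW | RTLD_LOCAL);
        if (handle != NULL) {
            *chosen = names[i];
            return handle;
        }
    }
    return NULL;
}

/*
 * Hypothetical wrapper: prefer the v2 library, fall back to v1.
 * The two file names are assumptions based on the thread.
 */
void *open_dat_library(const char **chosen)
{
    static const char *const candidates[] = { "libdat2.so", "libdat.so" };
    return open_first_available(candidates,
                                sizeof(candidates) / sizeof(candidates[0]),
                                chosen);
}
```

If the fallback name is the one that opens, the caller would still have to run in v1 mode and use only v1 providers, as noted earlier in the thread; the probe only selects which library is loaded. On older glibc systems the program may also need to be linked with -ldl.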
URL: From rajouri.jammu at gmail.com Tue Jan 29 17:24:50 2008 From: rajouri.jammu at gmail.com (Rajouri Jammu) Date: Tue, 29 Jan 2008 17:24:50 -0800 Subject: [ofa-general] Zero byte rdma read causes REM_OP_ERROR In-Reply-To: <200801281514.40097.jackm@dev.mellanox.co.il> References: <3307cdf90801242007y3ace39ccrb72d5f35c3a937e4@mail.gmail.com> <3307cdf90801251254p5983b62x687549bb793db39d@mail.gmail.com> <479C81C8.6060106@dev.mellanox.co.il> <200801281514.40097.jackm@dev.mellanox.co.il> Message-ID: <3307cdf90801291724h85960btf23715ef733c0de7@mail.gmail.com> I suspect that's the problem as I'm simply setting sge.length to rdma read size. Thanks for catching it! On Jan 28, 2008 5:14 AM, Jack Morgenstein wrote: > On Sunday 27 January 2008 15:06, Dotan Barak wrote: > > Rajouri Jammu wrote: > > > I'm using rdma_cm and I don't set the qp_access_flags explicitly. > > > > > > I presume they are set correctly since non-zero length rdma reads > > > complete successfully. I have also verified the data. > > > > > > the only place I set the privileges is when registering the memory > > > region and I have them set at > > > IBV_ACCESS_LOCAL_WRITE, _REMOTE_READ and _REMOTE_WRITE > > To send perform a zero-byte RDMA-read/write, you should assemble a WQE > with > no scatter/gather entries (see IBSPEC 1.2, volume 1, section 11.4.1.1, > table 94 -- Work Request Modifier Matrix, footnote b). A s/g entry with > its length > field = 0 is interpreted as requesting 2 gigabytes. > > Is this the problem? > > - Jack > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From kliteyn at mellanox.co.il Tue Jan 29 17:35:24 2008 From: kliteyn at mellanox.co.il (kliteyn at mellanox.co.il) Date: 30 Jan 2008 03:35:24 +0200 Subject: [ofa-general] nightly osm_sim report 2008-01-30:normal completion Message-ID: OSM Simulation Regression Summary [Generated mail - please do NOT reply] OpenSM binary date = 2008-01-29 OpenSM git rev = Tue_Jan_29_09:24:40_2008 [63c04327bbdcd47cc37cb0cbfb366de16ae0ccb6] ibutils git rev = Mon_Dec_24_10:42:01_2007 [675bec82306d6920555dd0b5e2f664983e27e60f] Total=400 Pass=400 Fail=0 Pass: 30 Stability IS1-16.topo 30 Pkey IS1-16.topo 30 OsmTest IS1-16.topo 30 OsmStress IS1-16.topo 30 Multicast IS1-16.topo 30 LidMgr IS1-16.topo 10 Stability IS3-loop.topo 10 Stability IS3-128.topo 10 Pkey IS3-128.topo 10 OsmTest IS3-loop.topo 10 OsmTest IS3-128.topo 10 OsmStress IS3-128.topo 10 Multicast IS3-loop.topo 10 Multicast IS3-128.topo 10 LidMgr IS3-128.topo 10 FatTree merge-roots-4-ary-2-tree.topo 10 FatTree merge-root-4-ary-3-tree.topo 10 FatTree gnu-stallion-64.topo 10 FatTree blend-4-ary-2-tree.topo 10 FatTree RhinoDDR.topo 10 FatTree FullGnu.topo 10 FatTree 4-ary-2-tree.topo 10 FatTree 2-ary-4-tree.topo 10 FatTree 12-node-spaced.topo 10 FTreeFail 4-ary-2-tree-missing-sw-link.topo 10 FTreeFail 4-ary-2-tree-links-at-same-rank-2.topo 10 FTreeFail 4-ary-2-tree-links-at-same-rank-1.topo 10 FTreeFail 4-ary-2-tree-diff-num-pgroups.topo Failures:
From changquing.tang at hp.com Tue Jan 29 18:06:40 2008 From: changquing.tang at hp.com (Tang, Changqing) Date: Wed, 30 Jan 2008 02:06:40 +0000 Subject: [ofa-general] [ANNOUCE] dapl 2.0.5 release In-Reply-To: <000501c862db$3b9d17d0$a5e0180a@amr.corp.intel.com> References: <000501c862db$3b9d17d0$a5e0180a@amr.corp.intel.com> Message-ID: Arlin: I have not had a chance to look at uDAPL 2.0, can you give a brief summary the changes from 1.2 to 2.0, I am interested from the applications perspective, don't care the internal details. Thanks. --CQ ________________________________ From: general-bounces at lists.openfabrics.org [mailto:general-bounces at lists.openfabrics.org] On Behalf Of Arlin Davis Sent: Tuesday, January 29, 2008 6:58 PM To: OpenFabrics General; EWG Cc: Lentini, James; 'Vladimir Sokolovsky' Subject: [ofa-general] [ANNOUCE] dapl 2.0.5 release There is new release for dapl 2.0 available on the OFA download page and in my git tree. Changes to allow both v1 and v2 development packages to be installed on the same system. v2 libdat.so has been renamed to libdat2.so. md5sum: 010459e421a5c194438d58b1ccf1c6d0 dapl-2.0.5.tar.gz Vlad, please pull new v2 release into OFED 1.3 RC3 and install the following packages: Note: please make sure dapl-1.2.4-devel is added to list. dapl-1.2.4-1 dapl-devel-1.2.4-1 dapl-2.0.5-1 dapl-utils-2.0.5-1 dapl-devel-2.0.5-1 dapl-debuginfo-2.0.5-1 See http://www.openfabrics.org/downloads/dapl/README.html for details. -arlin -------------- next part -------------- An HTML attachment was scrubbed...
URL: From dledford at redhat.com Tue Jan 29 18:55:37 2008 From: dledford at redhat.com (Doug Ledford) Date: Tue, 29 Jan 2008 21:55:37 -0500 Subject: [ofa-general] [ANNOUCE] dapl 2.0.5 release In-Reply-To: References: <000501c862db$3b9d17d0$a5e0180a@amr.corp.intel.com> Message-ID: <1201661737.28486.154.camel@firewall.xsintricity.com> On Wed, 2008-01-30 at 02:06 +0000, Tang, Changqing wrote: > Arlin: > I have not had a chance to look at uDAPL 2.0, can you give a brief > summary the changes from 1.2 to 2.0, I am interested from the > applications perspective, don't care the internal details. > > Thanks. http://www.openfabrics.org/downloads/dapl/documentation/transition_to_dat20.pdf provides a nice concise overview of the changes. > --CQ > > > ______________________________________________________________ > From: general-bounces at lists.openfabrics.org > [mailto:general-bounces at lists.openfabrics.org] On Behalf Of > Arlin Davis > Sent: Tuesday, January 29, 2008 6:58 PM > To: OpenFabrics General; EWG > Cc: Lentini, James; 'Vladimir Sokolovsky' > Subject: [ofa-general] [ANNOUCE] dapl 2.0.5 release > > > > There is new release for dapl 2.0 available on the OFA > download page and in my git tree. > > Changes to allow both v1 and v2 development packages to be > installed on the same system. > v2 libdat.so has been renamed to libdat2.so. > > md5sum: 010459e421a5c194438d58b1ccf1c6d0 dapl-2.0.5.tar.gz > > Vlad, please pull new v2 release into OFED 1.3 RC3 and install > the following packages: > > Note: please make sure dapl-1.2.4-devel is added to list. > > dapl-1.2.4-1 > dapl-devel-1.2.4-1 > dapl-2.0.5-1 > dapl-utils-2.0.5-1 > dapl-devel-2.0.5-1 > dapl-debuginfo-2.0.5-1 > > See http://www.openfabrics.org/downloads/dapl/README.html for > details. 
> > -arlin > _______________________________________________ > general mailing list > general at lists.openfabrics.org > http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general > > To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general -- Doug Ledford GPG KeyID: CFBFF194 http://people.redhat.com/dledford Infiniband specific RPMs available at http://people.redhat.com/dledford/Infiniband -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 189 bytes Desc: This is a digitally signed message part URL: From ardavis at ichips.intel.com Tue Jan 29 22:23:10 2008 From: ardavis at ichips.intel.com (Arlin Davis) Date: Tue, 29 Jan 2008 22:23:10 -0800 Subject: [ofa-general] [ANNOUCE] dapl 2.0.5 release In-Reply-To: <1201661737.28486.154.camel@firewall.xsintricity.com> References: <000501c862db$3b9d17d0$a5e0180a@amr.corp.intel.com> <1201661737.28486.154.camel@firewall.xsintricity.com> Message-ID: <47A017CE.2080001@ichips.intel.com> Doug Ledford wrote: > On Wed, 2008-01-30 at 02:06 +0000, Tang, Changqing wrote: >> Arlin: >> I have not had a chance to look at uDAPL 2.0, can you give a brief >> summary the changes from 1.2 to 2.0, I am interested from the >> applications perspective, don't care the internal details. >> >> Thanks. > > http://www.openfabrics.org/downloads/dapl/documentation/transition_to_dat20.pdf provides a nice concise overview of the changes.
> v2 also provides extension support for transport specific operations. IB extensions are built into the v2 package. See the following documents for IB and iWARP extensions: http://www.openfabrics.org/downloads/dapl/documentation/DAT_IB_Extensions.pdf http://www.openfabrics.org/downloads/dapl/documentation/DAT_IW_Extensions.pdf -arlin From krkumar2 at in.ibm.com Wed Jan 30 00:33:14 2008 From: krkumar2 at in.ibm.com (Krishna Kumar2) Date: Wed, 30 Jan 2008 14:03:14 +0530 Subject: [ofa-general] Re: Status of NFS-RDMA ? In-Reply-To: Message-ID: Hi James, Since you had mentioned in an earlier email that NFS-RDMA server side will be present in OFED1.4, do you know if any port of the server code to OFED1.3 (when it comes out) will happen? Is there any effort for that, any work ongoing, any help required, etc? I couldn't find the release time lines for OFED1.4, is there any link on openfabrics homepage? Thanks, - KK general-bounces at lists.openfabrics.org wrote on 01/29/2008 08:23:46 PM: > > > On Tue, 29 Jan 2008, Pawel Dziekonski wrote: > > > > > On Mon, 28 Jan 2008 at 10:14:22AM -0500, James Lentini wrote: > > > > > > > > > On Sat, 26 Jan 2008, Pawel Dziekonski wrote: > > > > > > > I pulled Tom's tree from new url and build a kernel. > > > > > > If you enabled support for INFINIBAND drivers (IB and iWARP support) > > > and NFS client/server support, the kernel should be ready to go (run > > > "grep RDMA /your_kernel_sources/.config" to confirm that > > > CONFIG_SUNRPC_XPRT_RDMA is either m or y). > > > > > > NFS/RDMA doesn't require OFED be installed. OFED is a release of the > > > Linux kernel sources and some userspace libraries/tools. If you are > > > > > > then I downloaded OFED from > > > > http://www.mellanox.com/downloads/NFSoRDMA/OFED-1.2-NFS-RDMA.gz, > > > > > > I don't know what the above URL contains. The latest code is in Tom > > > Tucker's tree (and now NFS server maintainer Bruce Fields tree). 
It is > > > > > > hi, > > > > back to subject on a proper mailing list. > > > > I have a >3 year experience with mellanox hardware and IBGold so I > > basically know what OFED is all about. up to now i was only using > > IBGold since IB drivers appeared in kernel pretty recently. > > You'll want to use the mainline kernel's IB drivers for NFS/RDMA. > We've been developing the NFS/RDMA software on the OpenFabrics (aka > OpenIB) code since it was merged into 2.6.10 in Dec 2004. > > > currently I have new hardware. I'm running Tom's kernel and already > > did some MPI tests. SDP is not working, probably because sdp kernel > > modules where not build. ;) I understand that those modules are only > > available from ofa-kernel. please correct me if i'm wrong. > > Correct. SDP has never been submitted to mainline Linux. > > > system is Scientic Linux 4.5, which is supposed to be a fully > > compatible RH4 clone. hardware is Supermicro mobos with Mellanox > > MT25204 and Flextronisc switch. > > > > error log from ofa-kernel build: > > Is your goal to build a kernel with an NFS/RDMA server? If so, the > kernel sources from Tom Tucker's git tree are the ones you want, not > the old OFED 1.2-based packages which are out of date. > > Did you try setting up the NFS/RDMA server on the kernel used for your > MPI tests above? > > > > > make[1]: Entering directory `/usr/src/ib/xprt-switch-2.6' > > > > test -e include/linux/autoconf.h -a -e include/config/auto.conf || ( \ > > > > echo; \ > > > > echo " ERROR: Kernel configuration is invalid."; \ > > > > echo " include/linux/autoconf.h or include/config/auto.conf are > missing."; \ > > > > echo " Run 'make oldconfig && make prepare' on kernel src to fix it."; \ > > > > echo; \ > > > > /bin/false) > > > > > > > > obviously, doing 'make oldconfig && make prepare' does not help. 
> > > > anyway, above mentioned files do exist: > > > > > > > > # ls -la /usr/src/ib/xprt-switch-2.6/{include/linux/autoconf.h, > include/config/auto.conf} > > > > -rw-r--r-- 1 root root 10156 Jan 25 17:42 /usr/src/ib/xprt-switch-2. > 6/include/config/auto.conf > > > > -rw-r--r-- 1 root root 14733 Jan 25 17:42 /usr/src/ib/xprt-switch-2. > 6/include/linux/autoconf.h > > > > > > > > despite of above, compilation continues but fails with: > > > > > > > > gcc -Wp,-MD,/var/tmp/OFEDRPM/BUILD/ofa_kernel-1. > 2/drivers/infiniband/core/.mad.o.d -nostdinc -isystem /usr/lib/gcc/x86_64- > redhat-linux/3.4.6/include -D__KERNEL__ -I/var/tmp/OFEDRPM/BUILD/ofa_kernel-1. > 2/include -I/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2 /drivers/infiniband/include > -Iinclude -include include/linux/autoconf.h -include > /var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2/include/linux/autoconf.h -Wall -Wundef > -Wstrict-prototypes -Wno-trigraphs -fno-strict-aliasing -fno-common -Werror- > implicit-function-declaration -Os -m64 -mno-red-zone -mcmodel=kernel -pipe - > Wno-sign-compare -fno-asynchronous-unwind-tables -funit-at-a-time -mno-sse - > mno-mmx -mno-sse2 -mno-3dnow -maccumulate-outgoing-args -DCONFIG_AS_CFI=1 - > DCONFIG_AS_CFI_SIGNAL_FRAME=1 -fomit-frame-pointer -Wdeclaration-after- > statement -DMODULE -D"KBUILD_STR(s)=#s" - > D"KBUILD_BASENAME=KBUILD_STR(mad)" -D"KBUILD_MODNAME=KBUILD_STR(ib_mad)" -c - > o /var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2/drivers/infiniband/core/.! > tmp > > _mad.o /var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2 /drivers/infiniband/core/mad.c > > > > /var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2 /drivers/infiniband/core/mad.c: In > function `ib_mad_init_module': > > > > /var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2 /drivers/infiniband/core/mad.c: > 2966: error: too many arguments to function `kmem_cache_create' > > > > make[4]: *** [/var/tmp/OFEDRPM/BUILD/ofa_kernel-1. > 2/drivers/infiniband/core/mad.o] Error 1 > > > > make[3]: *** [/var/tmp/OFEDRPM/BUILD/ofa_kernel-1. 
> 2/drivers/infiniband/core] Error 2 > > > > make[2]: *** [/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2 /drivers/infiniband] Error 2 > > > > make[1]: *** [_module_/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2] Error 2 > > > > make[1]: Leaving directory `/usr/src/ib/xprt-switch-2.6' > > > > make: *** [kernel] Error 2 > > > > error: Bad exit status from /var/tmp/rpm-tmp.3877 (%install) > > > > > > full log: > > > > https://cefeid.wcss.wroc.pl/d/tmp/OFED.build.32122.log > > > > thanks in advance for any help, P > > > > > > -- > > Pawel Dziekonski > > Wroclaw Centre for Networking & Supercomputing, HPC Department > > Politechnika Wr., pl. Grunwaldzki 9, bud. D2/101, 50-377 Wroclaw, POLAND > > phone: +48 71 3202043, fax: +48 71 3225797, http://www.wcss.wroc.pl > > > > ------------------------------------------------------------------------- > > This SF.net email is sponsored by: Microsoft > > Defy all challenges. Microsoft(R) Visual Studio 2008. > > http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/ > > _______________________________________________ > > nfs-rdma-devel mailing list > > nfs-rdma-devel at lists.sourceforge.net > > https://lists.sourceforge.net/lists/listinfo/nfs-rdma-devel > > > _______________________________________________ > general mailing list > general at lists.openfabrics.org > http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general > > To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
From vlad at lists.openfabrics.org Wed Jan 30 03:12:40 2008 From: vlad at lists.openfabrics.org (Vladimir Sokolovsky Mellanox) Date: Wed, 30 Jan 2008 03:12:40 -0800 (PST) Subject: [ofa-general] ofa_1_3_kernel 20080130-0200 daily build status Message-ID: <20080130111240.98F4FE601B2@openfabrics.org> This email was generated automatically, please do not reply git_url: git://git.openfabrics.org/ofed_1_3/linux-2.6.git git_branch: ofed_kernel Common build parameters: --with-ipoib-mod --with-sdp-mod --with-srp-mod --with-user_mad-mod --with-user_access-mod --with-mthca-mod --with-mlx4-mod --with-core-mod --with-addr_trans-mod --with-rds-mod --with-cxgb3-mod --with-nes-mod Passed: Passed on i686 with 2.6.15-23-server Passed on i686 with linux-2.6.22 Passed on i686 with linux-2.6.21.1 Passed on i686 with linux-2.6.18 Passed on i686 with linux-2.6.17 Passed on i686 with linux-2.6.16 Passed on i686 with linux-2.6.19 Passed on i686 with linux-2.6.13 Passed on i686 with linux-2.6.15 Passed on i686 with linux-2.6.14 Passed on i686 with linux-2.6.12 Passed on ppc64 with linux-2.6.12 Passed on ia64 with linux-2.6.18 Passed on ppc64 with linux-2.6.15 Passed on ppc64 with linux-2.6.14 Passed on ia64 with linux-2.6.19 Passed on x86_64 with linux-2.6.16 Passed on x86_64 with linux-2.6.18-8.el5 Passed on ppc64 with linux-2.6.16 Passed on x86_64 with linux-2.6.13 Passed on x86_64 with linux-2.6.16.43-0.3-smp Passed on ppc64 with linux-2.6.18 Passed on x86_64 with linux-2.6.17 Passed on powerpc with linux-2.6.15 Passed on ia64 with linux-2.6.21.1 Passed on ia64 with linux-2.6.13 Passed on powerpc with linux-2.6.12 Passed on x86_64 with linux-2.6.18 Passed on ia64 with linux-2.6.17 Passed on ia64 with linux-2.6.14 Passed on ppc64 with linux-2.6.19 Passed on powerpc with linux-2.6.13
Passed on x86_64 with linux-2.6.19 Passed on x86_64 with linux-2.6.20 Passed on ia64 with linux-2.6.12 Passed on powerpc with linux-2.6.14 Passed on ia64 with linux-2.6.15 Passed on ia64 with linux-2.6.16 Passed on x86_64 with linux-2.6.14 Passed on x86_64 with linux-2.6.12 Passed on ppc64 with linux-2.6.17 Passed on ppc64 with linux-2.6.13 Passed on x86_64 with linux-2.6.15 Passed on x86_64 with linux-2.6.9-42.ELsmp Passed on x86_64 with linux-2.6.16.21-0.8-smp Passed on x86_64 with linux-2.6.9-55.ELsmp Passed on x86_64 with linux-2.6.21.1 Passed on ia64 with linux-2.6.23 Passed on x86_64 with linux-2.6.18-1.2798.fc6 Passed on x86_64 with linux-2.6.22 Passed on ia64 with linux-2.6.22 Passed on x86_64 with linux-2.6.22.5-31-default Passed on ia64 with linux-2.6.16.21-0.8-default Passed on ppc64 with linux-2.6.18-8.el5 Passed on x86_64 with linux-2.6.18-53.el5 Failed: From eli at mellanox.co.il Wed Jan 30 03:16:22 2008 From: eli at mellanox.co.il (Eli Cohen) Date: Wed, 30 Jan 2008 13:16:22 +0200 Subject: [ofa-general] Re: CM Enable SRQ for less than 16 s/g - bug In-Reply-To: <479FC630.10506@linux.vnet.ibm.com> References: <1201439328.9219.15.camel@mtls03> <479CC810.7060100@linux.vnet.ibm.com> <6C2C79E72C305246B504CBA17B5500C903321E4C@mtlexch01.mtl.com> <479D10ED.6060107@linux.vnet.ibm.com> <479FC630.10506@linux.vnet.ibm.com> Message-ID: <1201691782.28794.78.camel@mtls03> > Hello Eli, > > I did run ttcp between two two Mellanox HCAs (we have InfiniBand: Mellanox Technologies > MT23108 InfiniHost (rev a1)) on ppc64 systems (on a Sles10sp2 beta distro) using the > same command line options that was provided. I could not reproduce the hang in several > iterations. > > I saw that you were using ttcpv -is this an enhancement to ttcp? > > Pradeep > Hi Pradeep, I found the problem in connected mode where I did not clear the NETIF_F_SG flag when moving to connected mode. This was pointed out by Or Gerlitz from Voltaire. Thanks for your help. 
From ogerlitz at voltaire.com Wed Jan 30 04:33:40 2008 From: ogerlitz at voltaire.com (Or Gerlitz) Date: Wed, 30 Jan 2008 14:33:40 +0200 Subject: [ofa-general] rdma_create_qp fails with -12 In-Reply-To: References: <1200936237.23538.15.camel@obelisk.thedillows.org> <4795D149.3010405@voltaire.com> Message-ID: <47A06EA4.5020308@voltaire.com> Shipman, Galen M. wrote: > Up to 256 of these are chained together using the next pointer so that a > single call to ib_post_send is made for up to a 1MB xfer. > The number of work requests allocated for the QP is controlled by number > of concurrent sends * 256. > At 16 concurrent sends there is no problem. > At 64 there is (once we allocate recv work requests as well). the most problematic aspect in this approach is the memory consumption for the --QP-- even when you just get to the 128K limitation of the current implementation, when a non-SRQed Lustre server works with 1K clients, it consumes 128M of kernel memory for its QPs, its bad. > It sounds like this can be alleviated by using FMR. indeed Or. From ogerlitz at voltaire.com Wed Jan 30 05:34:01 2008 From: ogerlitz at voltaire.com (Or Gerlitz) Date: Wed, 30 Jan 2008 15:34:01 +0200 Subject: [ofa-general] Bonding and hw_csum In-Reply-To: <479DFA58.7050800@intec.ugent.be> References: <479DFA58.7050800@intec.ugent.be> Message-ID: <47A07CC9.8030005@voltaire.com> Stijn De Smet wrote: > I'm trying to get IPOIB bonding to work with the hw_csum enabled. ... > When I disable hw_csums, I can start iperf's, pull and replug all cables > and the iperf's run uninterrupted. This is interesting report, however, since currently the hw checksum patch in not being submitted to the mainline kernel and it is also about to be removed from ofed 1.3 (Tziporet, can you update on that?), I am not going to look into that. Or. 
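The figures quoted in the rdma_create_qp thread above (256 chained work requests per send for up to a 1MB transfer, work requests sized as concurrent sends * 256, and 128M of kernel memory for 1K clients) can be checked with a little arithmetic. The sketch below is only that, a sanity check: it assumes "128K" means bytes of kernel memory per QP with one QP per client, and the `demo_*` function names are hypothetical, not taken from Lustre or the IB stack.

```c
#include <stdint.h>

/* Bytes carried by each work request when `wrs` WRs cover one transfer. */
static uint32_t demo_bytes_per_wr(uint32_t xfer_bytes, uint32_t wrs)
{
    return xfer_bytes / wrs;
}

/* Send work requests a QP needs: each concurrent send chains up to 256 WRs. */
static uint32_t demo_send_wrs(uint32_t concurrent_sends)
{
    return concurrent_sends * 256u;
}

/* Total kernel memory for QPs on a non-SRQ server, one QP per client. */
static uint64_t demo_total_qp_bytes(uint64_t per_qp_bytes, uint64_t clients)
{
    return per_qp_bytes * clients;
}
```

Under these assumptions, 1 MiB over 256 work requests is 4 KiB each, 16 concurrent sends need 4096 send work requests while 64 need 16384, and 128 KiB per QP across 1024 clients comes to the 128 MiB mentioned in the reply.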
From ogerlitz at voltaire.com Wed Jan 30 05:36:50 2008 From: ogerlitz at voltaire.com (Or Gerlitz) Date: Wed, 30 Jan 2008 15:36:50 +0200 Subject: [ofa-general] [PATCH] ib/ipoib: handle Gratuitous ARP & bonding failover race also for connected mode neighbours In-Reply-To: References: Message-ID: <47A07D72.7090704@voltaire.com> Or Gerlitz wrote: > move a little up the code that checks for a situation where the remote GID stored in the ipoib_neigh is > different than the one present in the neighbour (handle Gratuitous ARP) or that a bonding fail over has > happened but the neighbour still has a pointer to an ipoib_neigh created not by the current slave. This > will cause the driver to apply the check also for connected mode neighbours. > Signed-off-by: Or Gerlitz Hi Roland, Do you need from me any more clarification to merge this into 2.6.25 ? Or From koen.segers at vrt.be Wed Jan 30 05:56:54 2008 From: koen.segers at vrt.be (Koen Segers) Date: Wed, 30 Jan 2008 14:56:54 +0100 Subject: [ofa-general] Bonding and hw_csum In-Reply-To: <47A07CC9.8030005@voltaire.com> References: <479DFA58.7050800@intec.ugent.be> <47A07CC9.8030005@voltaire.com> Message-ID: <1201701414.11838.8.camel@koenVRT> On Wed, 2008-01-30 at 15:34 +0200, Or Gerlitz wrote: > Stijn De Smet wrote: > > I'm trying to get IPOIB bonding to work with the hw_csum enabled. > ... > > When I disable hw_csums, I can start iperf's, pull and replug all cables > > and the iperf's run uninterrupted. > > This is interesting report, however, since currently the hw checksum > patch in not being submitted to the mainline kernel and it is also about > to be removed from ofed 1.3 (Tziporet, can you update on that?), I am > not going to look into that. Do you mean that bonding with hw_csum enabled will never work? Why is hw_checksum not submitted to the mainline kernel (and thus also removed from ofed)? We definitely want to enable hw_checksum as it gives an enormous bandwidth boost with ipoib. Koen. > > Or. 
> > _______________________________________________ > general mailing list > general at lists.openfabrics.org > http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general > > To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general *** Disclaimer *** Vlaamse Radio- en Televisieomroep Auguste Reyerslaan 52, 1043 Brussel nv van publiek recht BTW BE 0244.142.664 RPR Brussel http://www.vrt.be/disclaimer From kliteyn at dev.mellanox.co.il Wed Jan 30 06:22:20 2008 From: kliteyn at dev.mellanox.co.il (Yevgeny Kliteynik) Date: Wed, 30 Jan 2008 16:22:20 +0200 Subject: [ofa-general] [PATCH] qperf: adding SL option for RDMA tests Message-ID: <47A0881C.9020509@dev.mellanox.co.il> Hi Johann, I'd like to add to qperf a command line parameter to for changing the SL of the QP/AH (for RDMA tests). This is being used mainly in order to check the QoS feature and performance on different VLs. I'm not sure I figured out the option thing in qperf right, but in any case, the following patch does the job for me. Please review and let me know what you think. -- Yevgeny Signed-off-by: Yevgeny Kliteynik --- src/help.txt | 4 ++++ src/qperf.c | 11 ++++++++++- src/qperf.h | 3 +++ src/rdma.c | 6 ++++-- 4 files changed, 21 insertions(+), 3 deletions(-) diff --git a/src/help.txt b/src/help.txt index 59195cb..ec00356 100644 --- a/src/help.txt +++ b/src/help.txt @@ -77,6 +77,7 @@ Opts --timeout Time (-T) Set timeout --loc_timeout Time (-lT) Set local timeout --rem_timeout Time (-rT) Set remote timeout + --service_level SL (-sl) Set Service Level to SL for RDMA tests --unify_nodes (-un) Unify nodes --unify_units (-uu) Unify units --use_bits_per_sec (-ub) Use bits/sec rather than bytes/sec @@ -184,6 +185,9 @@ Options Set local timeout to Time. --rem_timeout Time (-rT) Set local timeout to Time. + --service_level SL (-sl) + Set Service Level to SL. This is the SL used for RDMA tests only. + The default SL is 0. --unify_nodes (-un) Unify the nodes. 
Describe them in terms of local and remote rather than send and receive. diff --git a/src/qperf.c b/src/qperf.c index fd4c24f..06ff2d5 100644 --- a/src/qperf.c +++ b/src/qperf.c @@ -280,6 +280,7 @@ PAR_NAME ParName[] ={ { "sock_buf_size", L_SOCK_BUF_SIZE, R_SOCK_BUF_SIZE }, { "time", L_TIME, R_TIME }, { "timeout", L_TIMEOUT, R_TIMEOUT }, + { "service_level", L_SL, R_SL }, }; @@ -317,6 +318,8 @@ PAR_INFO ParInfo[P_N] ={ { R_TIME, 't', &RReq.time }, { L_TIMEOUT, 't', &Req.timeout }, { R_TIMEOUT, 't', &RReq.timeout }, + { L_SL, 'q', &Req.sl }, + { R_SL, 'q', &RReq.sl }, }; @@ -392,6 +395,8 @@ OPTION Options[] ={ { "-rT", 0, &opt_time, R_TIMEOUT }, { "--server_timeout", 0, &opt_misc, 's', 't' }, { "-st", 0, &opt_misc, 's', 't' }, + { "--service_level", 0, &opt_long, L_SL, R_SL }, + { "-sl", 0, &opt_long, L_SL, R_SL }, { "--unify_nodes", 0, &opt_misc, 'u', 'n' }, { "-un", 0, &opt_misc, 'u', 'n' }, { "--unify_units", 0, &opt_misc, 'u', 'u' }, @@ -1217,6 +1222,8 @@ client(TEST *test) par_use(R_AFFINITY); par_use(L_TIME); par_use(R_TIME); + par_use(L_SL); + par_use(R_SL); set_affinity(); RReq.ver_maj = VER_MAJ; @@ -1848,7 +1855,7 @@ show_rest(void) uint64_t lr = LStat.r.no_bytes; uint64_t rs = RStat.s.no_bytes; uint64_t rr = RStat.r.no_bytes; - + if (ls && !rs && rr && !lr) { srmode = 1; resnS = &Res.l; @@ -2385,6 +2392,7 @@ enc_req(REQ *host) enc_int(host->no_msgs, sizeof(host->no_msgs)); enc_int(host->sock_buf_size, sizeof(host->sock_buf_size)); enc_int(host->time, sizeof(host->time)); + enc_int(host->sl, sizeof(host->sl)); enc_str(host->id, sizeof(host->id)); } @@ -2411,6 +2419,7 @@ dec_req(REQ *host) host->no_msgs = dec_int(sizeof(host->no_msgs)); host->sock_buf_size = dec_int(sizeof(host->sock_buf_size)); host->time = dec_int(sizeof(host->time)); + host->sl = dec_int(sizeof(host->sl)); dec_str(host->id, sizeof(host->id)); } diff --git a/src/qperf.h b/src/qperf.h index 0c42361..2539f61 100644 --- a/src/qperf.h +++ b/src/qperf.h @@ -119,6 +119,8 @@ typedef enum { 
R_TIME, L_TIMEOUT, R_TIMEOUT, + L_SL, + R_SL, P_N } PAR_INDEX; @@ -156,6 +158,7 @@ typedef struct REQ { uint32_t no_msgs; /* Number of messages */ uint32_t sock_buf_size; /* Socket buffer size */ uint32_t time; /* Duration in seconds */ + uint8_t sl; /* Service Level */ char id[STRSIZE]; /* Identifier */ char rate[STRSIZE]; /* Rate */ } REQ; diff --git a/src/rdma.c b/src/rdma.c index b0cd067..5208d4d 100644 --- a/src/rdma.c +++ b/src/rdma.c @@ -1596,7 +1596,8 @@ ib_prepare(IBDEV *ibdev) .ah_attr = { .dlid = ibdev->rcon.lid, .port_num = ibdev->port, - .static_rate = ibdev->rate + .static_rate = ibdev->rate, + .sl = Req.sl } }; struct ibv_qp_attr rts_attr ={ @@ -1610,7 +1611,8 @@ ib_prepare(IBDEV *ibdev) struct ibv_ah_attr ah_attr ={ .dlid = ibdev->rcon.lid, .port_num = ibdev->port, - .static_rate = ibdev->rate + .static_rate = ibdev->rate, + .sl = Req.sl }; if (ibdev->trans == IBV_QPT_UD) { -- 1.5.1.4 From ogerlitz at voltaire.com Wed Jan 30 06:26:06 2008 From: ogerlitz at voltaire.com (Or Gerlitz) Date: Wed, 30 Jan 2008 16:26:06 +0200 Subject: [ofa-general] Bonding and hw_csum In-Reply-To: <1201701414.11838.8.camel@koenVRT> References: <479DFA58.7050800@intec.ugent.be> <47A07CC9.8030005@voltaire.com> <1201701414.11838.8.camel@koenVRT> Message-ID: <47A088FE.7070506@voltaire.com> Koen Segers wrote: > Do you mean that bonding with hw_csum enabled will never work? no, I meant to say that I am not enough into the details and mechanics of the hw_csum approach/patch and since I understand it is going to be removed, I will not look now on going into this report. > Why is hw_checksum not submitted to the mainline kernel (and thus also > removed from ofed)? We definitely want to enable hw_checksum as it gives > an enormous bandwidth boost with ipoib. 
you should ask that the individual/s that are signed on the patch Or From jlentini at netapp.com Wed Jan 30 06:27:21 2008 From: jlentini at netapp.com (James Lentini) Date: Wed, 30 Jan 2008 09:27:21 -0500 (EST) Subject: [ofa-general] Re: Status of NFS-RDMA ? In-Reply-To: References: Message-ID: On Wed, 30 Jan 2008, Krishna Kumar2 wrote: > Hi James, > > Since you had mentioned in an earlier email that NFS-RDMA server > side will be present in OFED1.4, Actually, that was Tziporet. > do you know if any port of the server code to OFED1.3 (when it comes > out) will happen? Is there any effort for that, any work ongoing, > any help required, etc? Jeff Becker had looked into this. We would definitely appreciate the help. The NFS framework has changed significantly in several areas in recent kernel releases. This has made backporting the NFS/RDMA code to older kernels challenging. If you are interested in working on OFED1.3 support, let us know. > I couldn't find the release time lines for OFED1.4, is there any > link on openfabrics homepage? I'm not involved with the OFED1.4 planning. Tziporet, is there information on this? > Thanks, > > - KK > > general-bounces at lists.openfabrics.org wrote on 01/29/2008 08:23:46 PM: > > > > > > > On Tue, 29 Jan 2008, Pawel Dziekonski wrote: > > > > > > > > On Mon, 28 Jan 2008 at 10:14:22AM -0500, James Lentini wrote: > > > > > > > > > > > > On Sat, 26 Jan 2008, Pawel Dziekonski wrote: > > > > > > > > > I pulled Tom's tree from new url and build a kernel. > > > > > > > > If you enabled support for INFINIBAND drivers (IB and iWARP support) > > > > and NFS client/server support, the kernel should be ready to go (run > > > > "grep RDMA /your_kernel_sources/.config" to confirm that > > > > CONFIG_SUNRPC_XPRT_RDMA is either m or y). > > > > > > > > NFS/RDMA doesn't require OFED be installed. OFED is a release of the > > > > Linux kernel sources and some userspace libraries/tools. 
If you are > > > > > > > > then I downloaded OFED from > > > > > http://www.mellanox.com/downloads/NFSoRDMA/OFED-1.2-NFS-RDMA.gz, > > > > > > > > I don't know what the above URL contains. The latest code is in Tom > > > > Tucker's tree (and now NFS server maintainer Bruce Fields tree). It > is > > > > > > > > > hi, > > > > > > back to subject on a proper mailing list. > > > > > > I have a >3 year experience with mellanox hardware and IBGold so I > > > basically know what OFED is all about. up to now i was only using > > > IBGold since IB drivers appeared in kernel pretty recently. > > > > You'll want to use the mainline kernel's IB drivers for NFS/RDMA. > > We've been developing the NFS/RDMA software on the OpenFabrics (aka > > OpenIB) code since it was merged into 2.6.10 in Dec 2004. > > > > > currently I have new hardware. I'm running Tom's kernel and already > > > did some MPI tests. SDP is not working, probably because sdp kernel > > > modules where not build. ;) I understand that those modules are only > > > available from ofa-kernel. please correct me if i'm wrong. > > > > Correct. SDP has never been submitted to mainline Linux. > > > > > system is Scientic Linux 4.5, which is supposed to be a fully > > > compatible RH4 clone. hardware is Supermicro mobos with Mellanox > > > MT25204 and Flextronisc switch. > > > > > > error log from ofa-kernel build: > > > > Is your goal to build a kernel with an NFS/RDMA server? If so, the > > kernel sources from Tom Tucker's git tree are the ones you want, not > > the old OFED 1.2-based packages which are out of date. > > > > Did you try setting up the NFS/RDMA server on the kernel used for your > > MPI tests above? 
> > > > > > > make[1]: Entering directory `/usr/src/ib/xprt-switch-2.6' > > > > > test -e include/linux/autoconf.h -a -e include/config/auto.conf || > ( \ > > > > > echo; \ > > > > > echo " ERROR: Kernel configuration is invalid."; \ > > > > > echo " include/linux/autoconf.h or include/config/auto.conf > are > > missing."; \ > > > > > echo " Run 'make oldconfig && make prepare' on kernel src > to fix it."; \ > > > > > echo; \ > > > > > /bin/false) > > > > > > > > > > obviously, doing 'make oldconfig && make prepare' does not help. > > > > > anyway, above mentioned files do exist: > > > > > > > > > > # ls -la /usr/src/ib/xprt-switch-2.6/{include/linux/autoconf.h, > > include/config/auto.conf} > > > > > -rw-r--r-- 1 root root 10156 Jan 25 17:42 > /usr/src/ib/xprt-switch-2. > > 6/include/config/auto.conf > > > > > -rw-r--r-- 1 root root 14733 Jan 25 17:42 > /usr/src/ib/xprt-switch-2. > > 6/include/linux/autoconf.h > > > > > > > > > > despite of above, compilation continues but fails with: > > > > > > > > > > gcc -Wp,-MD,/var/tmp/OFEDRPM/BUILD/ofa_kernel-1. > > 2/drivers/infiniband/core/.mad.o.d -nostdinc -isystem > /usr/lib/gcc/x86_64- > > redhat-linux/3.4.6/include -D__KERNEL__ > -I/var/tmp/OFEDRPM/BUILD/ofa_kernel-1. 
> > 2/include -I/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2 > /drivers/infiniband/include > > -Iinclude -include include/linux/autoconf.h -include > > /var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2/include/linux/autoconf.h -Wall > -Wundef > > -Wstrict-prototypes -Wno-trigraphs -fno-strict-aliasing -fno-common > -Werror- > > implicit-function-declaration -Os -m64 -mno-red-zone -mcmodel=kernel > -pipe - > > Wno-sign-compare -fno-asynchronous-unwind-tables -funit-at-a-time > -mno-sse - > > mno-mmx -mno-sse2 -mno-3dnow -maccumulate-outgoing-args -DCONFIG_AS_CFI=1 > - > > DCONFIG_AS_CFI_SIGNAL_FRAME=1 -fomit-frame-pointer -Wdeclaration-after- > > statement -DMODULE -D"KBUILD_STR(s)=#s" - > > D"KBUILD_BASENAME=KBUILD_STR(mad)" -D"KBUILD_MODNAME=KBUILD_STR(ib_mad)" > -c - > > o /var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2/drivers/infiniband/core/.! > > tmp > > > _mad.o /var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2 > /drivers/infiniband/core/mad.c > > > > > /var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2 > /drivers/infiniband/core/mad.c: In > > function `ib_mad_init_module': > > > > > /var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2 > /drivers/infiniband/core/mad.c: > > 2966: error: too many arguments to function `kmem_cache_create' > > > > > make[4]: *** [/var/tmp/OFEDRPM/BUILD/ofa_kernel-1. > > 2/drivers/infiniband/core/mad.o] Error 1 > > > > > make[3]: *** [/var/tmp/OFEDRPM/BUILD/ofa_kernel-1. 
> > 2/drivers/infiniband/core] Error 2 > > > > > make[2]: *** [/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2 > /drivers/infiniband] Error 2 > > > > > make[1]: *** [_module_/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2] Error > 2 > > > > > make[1]: Leaving directory `/usr/src/ib/xprt-switch-2.6' > > > > > make: *** [kernel] Error 2 > > > > > error: Bad exit status from /var/tmp/rpm-tmp.3877 (%install) > > > > > > > > full log: > > > > > https://cefeid.wcss.wroc.pl/d/tmp/OFED.build.32122.log > > > > > > thanks in advance for any help, P > > > > > > > > > -- > > > Pawel Dziekonski > > > Wroclaw Centre for Networking & Supercomputing, HPC Department > > > Politechnika Wr., pl. Grunwaldzki 9, bud. D2/101, 50-377 Wroclaw, > POLAND > > > phone: +48 71 3202043, fax: +48 71 3225797, http://www.wcss.wroc.pl From changquing.tang at hp.com Wed Jan 30 07:38:51 2008 From: changquing.tang at hp.com (Tang, Changqing) Date: Wed, 30 Jan 2008 15:38:51 +0000 Subject: [ofa-general] [ANNOUCE] dapl 2.0.5 release In-Reply-To: <47A017CE.2080001@ichips.intel.com> References: <000501c862db$3b9d17d0$a5e0180a@amr.corp.intel.com> <1201661737.28486.154.camel@firewall.xsintricity.com> <47A017CE.2080001@ichips.intel.com> Message-ID: Thanks for all the links.
--CQ > -----Original Message----- > From: Arlin Davis [mailto:ardavis at ichips.intel.com] > Sent: Wednesday, January 30, 2008 12:23 AM > To: Doug Ledford > Cc: Tang, Changqing; OpenFabrics General; Arlin Davis > Subject: Re: [ofa-general] [ANNOUCE] dapl 2.0.5 release > > Doug Ledford wrote: > > On Wed, 2008-01-30 at 02:06 +0000, Tang, Changqing wrote: > >> Arlin: > >> I have not had a chance to look at uDAPL 2.0, can you give a > >> brief summary the changes from 1.2 to 2.0, I am interested > from the > >> applications perspective, don't care the internal details. > >> > >> Thanks. > > > > > http://www.openfabrics.org/downloads/dapl/documentation/transi > tion_to_dat20.pdf provides a nice concise overview of the changes. > > > > v2 also provides extension support for transport specific > operations. IB extensions are built into the v2 package. > See the following documents for IB and iWARP extensions: > > http://www.openfabrics.org/downloads/dapl/documentation/DAT_IB > _Extensions.pdf > http://www.openfabrics.org/downloads/dapl/documentation/DAT_IW > _Extensions.pdf > > -arlin > > >
From koen.segers at vrt.be Wed Jan 30 07:55:23 2008 From: koen.segers at vrt.be (Koen Segers) Date: Wed, 30 Jan 2008 16:55:23 +0100 Subject: [ofa-general] Bonding and hw_csum In-Reply-To: <47A088FE.7070506@voltaire.com> References: <479DFA58.7050800@intec.ugent.be> <47A07CC9.8030005@voltaire.com> <1201701414.11838.8.camel@koenVRT> <47A088FE.7070506@voltaire.com> Message-ID: <1201708523.11838.25.camel@koenVRT> On Wed, 2008-01-30 at 16:26 +0200, Or Gerlitz wrote: > > Why is hw_checksum not submitted to the mainline kernel (and thus > also > > removed from ofed)? We definitely want to enable hw_checksum as it > gives > > an enormous bandwidth boost with ipoib. > > you should ask that the individual/s that are signed on the patch Is this Michael S. Tsirkin? I don't know where else to find this information. Regards, Koen From pawel.dziekonski at pwr.wroc.pl Wed Jan 30 08:19:24 2008 From: pawel.dziekonski at pwr.wroc.pl (Pawel Dziekonski) Date: Wed, 30 Jan 2008 17:19:24 +0100 Subject: [nfs-rdma-devel] [ofa-general] Status of NFS-RDMA ? In-Reply-To: References: <20080123194728.GA10437@cefeid.wcss.wroc.pl> <4797AD59.2000206@mellanox.co.il> <20080126193035.GA21209@cefeid.wcss.wroc.pl> <20080129003731.GA30262@cefeid.wcss.wroc.pl> Message-ID: <20080130161924.GA31154@cefeid.wcss.wroc.pl> On Tue, 29 Jan 2008 at 09:53:46AM -0500, James Lentini wrote: > > Is your goal to build a kernel with an NFS/RDMA server? If so, the > kernel sources from Tom Tucker's git tree are the ones you want, not > the old OFED 1.2-based packages which are out of date. > > Did you try setting up the NFS/RDMA server on the kernel used for your > MPI tests above? my goal is to have everything running on my fabric- NFS/RDMA, MPI, IPoverIB, SDP.
my current status is:

- Tom Tucker's git tree compiled and running
- compiled OFED from http://www.mellanox.com/downloads/NFSoRDMA/OFED-1.2-NFS-RDMA.gz
  (whatever it is - who knows?)
- MPI is working, SDP not.
- nfs-utils 1.1.1 compiled and:

nfs server start:

#!/bin/sh
/etc/rc.d/init.d/portmap restart
modprobe nfs
umount /proc/fs/nfsd
mount -t nfsd /proc/fs/nfsd
exportfs -av
rpc.mountd
rpc.statd --no-notify
rpc.nfsd
sm-notify

# cat /etc/exports
/scratch 10.2.2.2(no_subtree_check,insecure,rw,async,no_root_squash)

nfs client start:

#!/bin/sh
/etc/rc.d/init.d/portmap restart
modprobe nfs
sm-notify

# mount.rnfs -o rdma=10.2.2.1 10.2.2.1:/scratch /mnt
Doing nfs/rdma mount to 10.2.2.1, mount protocol to 10.2.2.1
nfsmount: Invalid argument

:(

--
Pawel Dziekonski
Wroclaw Centre for Networking & Supercomputing, HPC Department
Politechnika Wr., pl. Grunwaldzki 9, bud. D2/101, 50-377 Wroclaw, POLAND
phone: +48 71 3202043, fax: +48 71 3225797, http://www.wcss.wroc.pl

From eli at mellanox.co.il Wed Jan 30 08:30:38 2008
From: eli at mellanox.co.il (Eli Cohen)
Date: Wed, 30 Jan 2008 18:30:38 +0200
Subject: [ofa-general] [PATCH 0/16] ipoib offload patches
Message-ID: <1201710638.28794.166.camel@mtls03>

Hi,

following this email is a list of patches aimed at improving ipoib UD mode performance by using various stateless offload facilities. I am resending the full list again and would like to see it get into 2.6.25. I denote them all as v4.
This is the list of the patches:

0001-IB-ipoib-Add-high-dma-support.patch
0002-IB-ipoib-Add-s-g-support.patch
0003-IB-core-Add-checksum-offload-support.patch
0004-IB-ipoib-Add-checksum-offload-support.patch
0005-IB-mlx4-Add-checksum-offload-support.patch
0006-IB-mthca-Add-checksum-offload-support.patch
0007-IB-core-Add-creation-flags-to-QPs.patch
0008-IB-core-Add-support-for-LSO.patch
0009-IB-ipoib-Add-LSO-support.patch
0010-IB-mlx4-Add-creation-flags-to-mlx4-QPs.patch
0011-IB-mlx4-Add-LSO-support-to-mlx4.patch
0012-IB-ipoib-Add-ethtool-support-to-IPOIB.patch
0013-IB-core-Add-support-for-modify-CQ.patch
0014-IB-ipoib-Support-modifying-IPOIB-CQ-moderation-para.patch
0015-IB-mlx4-Add-support-for-modifying-CQ-parameters.patch
0016-IB-ipoib-Set-default-CQ-moderation-parameters.patch

From eli at mellanox.co.il Wed Jan 30 08:30:46 2008
From: eli at mellanox.co.il (Eli Cohen)
Date: Wed, 30 Jan 2008 18:30:46 +0200
Subject: [ofa-general] [PATCH 1/16 v4] IB/ipoib: Add high dma support
Message-ID: <1201710646.28794.167.camel@mtls03>

IB/ipoib: Add high dma support

This patch assumes all IB devices support dma-ing from high memory.
Signed-off-by: Eli Cohen
---
 drivers/infiniband/ulp/ipoib/ipoib_main.c |    2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/drivers/infiniband/ulp/ipoib/ipoib_main.c b/drivers/infiniband/ulp/ipoib/ipoib_main.c
index a082466..8dda67e 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib_main.c
+++ b/drivers/infiniband/ulp/ipoib/ipoib_main.c
@@ -1111,6 +1111,8 @@ static struct net_device *ipoib_add_port(const char *format,
 
 	SET_NETDEV_DEV(priv->dev, hca->dma_device);
 
+	priv->dev->features |= NETIF_F_HIGHDMA;
+
 	result = ib_query_pkey(hca, port, 0, &priv->pkey);
 	if (result) {
 		printk(KERN_WARNING "%s: ib_query_pkey port %d failed (ret = %d)\n",
--
1.5.3.8

From eli at mellanox.co.il Wed Jan 30 08:30:53 2008
From: eli at mellanox.co.il (Eli Cohen)
Date: Wed, 30 Jan 2008 18:30:53 +0200
Subject: [ofa-general] [PATCH 2/16 v4] IB/ipoib: Add s/g support
Message-ID: <1201710653.28794.168.camel@mtls03>

IB/ipoib: Add s/g support

This patch acts as preparation for using checksum offload for IB devices capable of inserting/verifying checksum in IP packets. The patch does not actually turn on NETIF_F_SG but rather defers that role to the patches adding checksum offload capabilities. Support is added only for datagram mode since Mellanox HW does not support checksum offload on connected QPs.

Signed-off-by: Michael S.
Tsirkin Signed-off-by: Eli Cohen --- drivers/infiniband/ulp/ipoib/ipoib.h | 60 ++++++++++++++++++++++++++-- drivers/infiniband/ulp/ipoib/ipoib_cm.c | 10 ++-- drivers/infiniband/ulp/ipoib/ipoib_ib.c | 41 ++++++++++--------- drivers/infiniband/ulp/ipoib/ipoib_verbs.c | 14 +++--- 4 files changed, 89 insertions(+), 36 deletions(-) diff --git a/drivers/infiniband/ulp/ipoib/ipoib.h b/drivers/infiniband/ulp/ipoib/ipoib.h index fe250c6..7c9edc6 100644 --- a/drivers/infiniband/ulp/ipoib/ipoib.h +++ b/drivers/infiniband/ulp/ipoib/ipoib.h @@ -143,9 +143,61 @@ struct ipoib_rx_buf { struct ipoib_tx_buf { struct sk_buff *skb; - u64 mapping; + u64 mapping[MAX_SKB_FRAGS + 1]; }; +static inline int ipoib_dma_map_tx(struct ib_device *ca, + struct ipoib_tx_buf *tx_req) +{ + struct sk_buff *skb = tx_req->skb; + u64 *mapping = tx_req->mapping; + int frags; + int i; + + mapping[0] = ib_dma_map_single(ca, skb->data, skb_headlen(skb), + DMA_TO_DEVICE); + if (unlikely(ib_dma_mapping_error(ca, mapping[0]))) + return -EIO; + + frags = skb_shinfo(skb)->nr_frags; + for (i = 0; i < frags; ++i) { + skb_frag_t *frag = &skb_shinfo(skb)->frags[i]; + mapping[i + 1] = ib_dma_map_page(ca, frag->page, + frag->page_offset, frag->size, + DMA_TO_DEVICE); + if (unlikely(ib_dma_mapping_error(ca, mapping[i + 1]))) + goto partial_error; + } + return 0; + +partial_error: + ib_dma_unmap_single(ca, mapping[0], skb_headlen(skb), DMA_TO_DEVICE); + + for (; i > 0; --i) { + skb_frag_t *frag = &skb_shinfo(skb)->frags[i - 1]; + ib_dma_unmap_page(ca, mapping[i], frag->size, DMA_TO_DEVICE); + } + return -EIO; +} + +static inline void ipoib_dma_unmap_tx(struct ib_device *ca, + struct ipoib_tx_buf *tx_req) +{ + struct sk_buff *skb = tx_req->skb; + u64 *mapping = tx_req->mapping; + int frags; + int i; + + ib_dma_unmap_single(ca, mapping[0], skb_headlen(skb), DMA_TO_DEVICE); + + frags = skb_shinfo(skb)->nr_frags; + for (i = 0; i < frags; ++i) { + skb_frag_t *frag = &skb_shinfo(skb)->frags[i]; + ib_dma_unmap_page(ca, 
mapping[i + 1], frag->size, + DMA_TO_DEVICE); + } +} + struct ib_cm_id; struct ipoib_cm_data { @@ -294,9 +346,9 @@ struct ipoib_dev_priv { spinlock_t tx_lock; struct ipoib_tx_buf *tx_ring; - unsigned tx_head; - unsigned tx_tail; - struct ib_sge tx_sge; + unsigned tx_head; + unsigned tx_tail; + struct ib_sge tx_sge[MAX_SKB_FRAGS + 1]; struct ib_send_wr tx_wr; unsigned tx_outstanding; diff --git a/drivers/infiniband/ulp/ipoib/ipoib_cm.c b/drivers/infiniband/ulp/ipoib/ipoib_cm.c index 1818f95..7dd2ec4 100644 --- a/drivers/infiniband/ulp/ipoib/ipoib_cm.c +++ b/drivers/infiniband/ulp/ipoib/ipoib_cm.c @@ -634,8 +634,8 @@ static inline int post_send(struct ipoib_dev_priv *priv, { struct ib_send_wr *bad_wr; - priv->tx_sge.addr = addr; - priv->tx_sge.length = len; + priv->tx_sge[0].addr = addr; + priv->tx_sge[0].length = len; priv->tx_wr.wr_id = wr_id | IPOIB_OP_CM; @@ -676,7 +676,7 @@ void ipoib_cm_send(struct net_device *dev, struct sk_buff *skb, struct ipoib_cm_ return; } - tx_req->mapping = addr; + tx_req->mapping[0] = addr; if (unlikely(post_send(priv, tx, tx->tx_head & (ipoib_sendq_size - 1), addr, skb->len))) { @@ -715,7 +715,7 @@ void ipoib_cm_handle_tx_wc(struct net_device *dev, struct ib_wc *wc) tx_req = &tx->tx_ring[wr_id]; - ib_dma_unmap_single(priv->ca, tx_req->mapping, tx_req->skb->len, DMA_TO_DEVICE); + ib_dma_unmap_single(priv->ca, tx_req->mapping[0], tx_req->skb->len, DMA_TO_DEVICE); /* FIXME: is this right? Shouldn't we only increment on success? 
*/ ++dev->stats.tx_packets; @@ -1110,7 +1110,7 @@ timeout: while ((int) p->tx_tail - (int) p->tx_head < 0) { tx_req = &p->tx_ring[p->tx_tail & (ipoib_sendq_size - 1)]; - ib_dma_unmap_single(priv->ca, tx_req->mapping, tx_req->skb->len, + ib_dma_unmap_single(priv->ca, tx_req->mapping[0], tx_req->skb->len, DMA_TO_DEVICE); dev_kfree_skb_any(tx_req->skb); ++p->tx_tail; diff --git a/drivers/infiniband/ulp/ipoib/ipoib_ib.c b/drivers/infiniband/ulp/ipoib/ipoib_ib.c index 52bc2bd..680c27f 100644 --- a/drivers/infiniband/ulp/ipoib/ipoib_ib.c +++ b/drivers/infiniband/ulp/ipoib/ipoib_ib.c @@ -257,8 +257,7 @@ static void ipoib_ib_handle_tx_wc(struct net_device *dev, struct ib_wc *wc) tx_req = &priv->tx_ring[wr_id]; - ib_dma_unmap_single(priv->ca, tx_req->mapping, - tx_req->skb->len, DMA_TO_DEVICE); + ipoib_dma_unmap_tx(priv->ca, tx_req); ++dev->stats.tx_packets; dev->stats.tx_bytes += tx_req->skb->len; @@ -341,16 +340,23 @@ void ipoib_ib_completion(struct ib_cq *cq, void *dev_ptr) static inline int post_send(struct ipoib_dev_priv *priv, unsigned int wr_id, struct ib_ah *address, u32 qpn, - u64 addr, int len) + u64 *mapping, int headlen, + skb_frag_t *frags, + int nr_frags) { struct ib_send_wr *bad_wr; + int i; - priv->tx_sge.addr = addr; - priv->tx_sge.length = len; - - priv->tx_wr.wr_id = wr_id; - priv->tx_wr.wr.ud.remote_qpn = qpn; - priv->tx_wr.wr.ud.ah = address; + priv->tx_sge[0].addr = mapping[0]; + priv->tx_sge[0].length = headlen; + for (i = 0; i < nr_frags; ++i) { + priv->tx_sge[i + 1].addr = mapping[i + 1]; + priv->tx_sge[i + 1].length = frags[i].size; + } + priv->tx_wr.num_sge = nr_frags + 1; + priv->tx_wr.wr_id = wr_id; + priv->tx_wr.wr.ud.remote_qpn = qpn; + priv->tx_wr.wr.ud.ah = address; return ib_post_send(priv->qp, &priv->tx_wr, &bad_wr); } @@ -360,7 +366,6 @@ void ipoib_send(struct net_device *dev, struct sk_buff *skb, { struct ipoib_dev_priv *priv = netdev_priv(dev); struct ipoib_tx_buf *tx_req; - u64 addr; if (unlikely(skb->len > priv->mcast_mtu + 
IPOIB_ENCAP_LEN)) { ipoib_warn(priv, "packet len %d (> %d) too long to send, dropping\n", @@ -383,20 +388,19 @@ void ipoib_send(struct net_device *dev, struct sk_buff *skb, */ tx_req = &priv->tx_ring[priv->tx_head & (ipoib_sendq_size - 1)]; tx_req->skb = skb; - addr = ib_dma_map_single(priv->ca, skb->data, skb->len, - DMA_TO_DEVICE); - if (unlikely(ib_dma_mapping_error(priv->ca, addr))) { + if (unlikely(ipoib_dma_map_tx(priv->ca, tx_req))) { ++dev->stats.tx_errors; dev_kfree_skb_any(skb); return; } - tx_req->mapping = addr; if (unlikely(post_send(priv, priv->tx_head & (ipoib_sendq_size - 1), - address->ah, qpn, addr, skb->len))) { + address->ah, qpn, + tx_req->mapping, skb_headlen(skb), + skb_shinfo(skb)->frags, skb_shinfo(skb)->nr_frags))) { ipoib_warn(priv, "post_send failed\n"); ++dev->stats.tx_errors; - ib_dma_unmap_single(priv->ca, addr, skb->len, DMA_TO_DEVICE); + ipoib_dma_unmap_tx(priv->ca, tx_req); dev_kfree_skb_any(skb); } else { dev->trans_start = jiffies; @@ -615,10 +619,7 @@ int ipoib_ib_dev_stop(struct net_device *dev, int flush) while ((int) priv->tx_tail - (int) priv->tx_head < 0) { tx_req = &priv->tx_ring[priv->tx_tail & (ipoib_sendq_size - 1)]; - ib_dma_unmap_single(priv->ca, - tx_req->mapping, - tx_req->skb->len, - DMA_TO_DEVICE); + ipoib_dma_unmap_tx(priv->ca, tx_req); dev_kfree_skb_any(tx_req->skb); ++priv->tx_tail; --priv->tx_outstanding; diff --git a/drivers/infiniband/ulp/ipoib/ipoib_verbs.c b/drivers/infiniband/ulp/ipoib/ipoib_verbs.c index 433e99a..5e392e0 100644 --- a/drivers/infiniband/ulp/ipoib/ipoib_verbs.c +++ b/drivers/infiniband/ulp/ipoib/ipoib_verbs.c @@ -149,14 +149,14 @@ int ipoib_transport_dev_init(struct net_device *dev, struct ib_device *ca) .cap = { .max_send_wr = ipoib_sendq_size, .max_recv_wr = ipoib_recvq_size, - .max_send_sge = 1, + .max_send_sge = dev->features & NETIF_F_SG ? 
MAX_SKB_FRAGS + 1 : 1, .max_recv_sge = 1 }, .sq_sig_type = IB_SIGNAL_ALL_WR, .qp_type = IB_QPT_UD }; - int ret, size; + int i, ret, size; priv->pd = ib_alloc_pd(priv->ca); if (IS_ERR(priv->pd)) { @@ -201,12 +201,12 @@ int ipoib_transport_dev_init(struct net_device *dev, struct ib_device *ca) priv->dev->dev_addr[2] = (priv->qp->qp_num >> 8) & 0xff; priv->dev->dev_addr[3] = (priv->qp->qp_num ) & 0xff; - priv->tx_sge.lkey = priv->mr->lkey; + for (i = 0; i < MAX_SKB_FRAGS + 1; ++i) + priv->tx_sge[i].lkey = priv->mr->lkey; - priv->tx_wr.opcode = IB_WR_SEND; - priv->tx_wr.sg_list = &priv->tx_sge; - priv->tx_wr.num_sge = 1; - priv->tx_wr.send_flags = IB_SEND_SIGNALED; + priv->tx_wr.opcode = IB_WR_SEND; + priv->tx_wr.sg_list = priv->tx_sge; + priv->tx_wr.send_flags = IB_SEND_SIGNALED; return 0; -- 1.5.3.8 From eli at mellanox.co.il Wed Jan 30 08:30:57 2008 From: eli at mellanox.co.il (Eli Cohen) Date: Wed, 30 Jan 2008 18:30:57 +0200 Subject: [ofa-general] [PATCH 3/16 v4] IB/core: Add checksum offload support Message-ID: <1201710657.28794.169.camel@mtls03> IB/core: Add checksum offload support Signed-off-by: Eli Cohen --- include/rdma/ib_verbs.h | 13 +++++++++++-- 1 files changed, 11 insertions(+), 2 deletions(-) diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h index cfbd38f..85f2cda 100644 --- a/include/rdma/ib_verbs.h +++ b/include/rdma/ib_verbs.h @@ -95,7 +95,14 @@ enum ib_device_cap_flags { IB_DEVICE_N_NOTIFY_CQ = (1<<14), IB_DEVICE_ZERO_STAG = (1<<15), IB_DEVICE_SEND_W_INV = (1<<16), - IB_DEVICE_MEM_WINDOW = (1<<17) + IB_DEVICE_MEM_WINDOW = (1<<17), + /* + * devices which publish this capability must support insertion of UDP + * and TCP checksum on outgoing packets and can verify the validity of + * checksum for incoming packets. Setting this flag implies the driver + * may set NETIF_F_IP_CSUM. 
+ */ + IB_DEVICE_IP_CSUM = (1<<18), }; enum ib_atomic_cap { @@ -431,6 +438,7 @@ struct ib_wc { u8 sl; u8 dlid_path_bits; u8 port_num; /* valid only for DR SMPs on switches */ + int csum_ok; }; enum ib_cq_notify_flags { @@ -615,7 +623,8 @@ enum ib_send_flags { IB_SEND_FENCE = 1, IB_SEND_SIGNALED = (1<<1), IB_SEND_SOLICITED = (1<<2), - IB_SEND_INLINE = (1<<3) + IB_SEND_INLINE = (1<<3), + IB_SEND_IP_CSUM = (1<<4) }; struct ib_sge { -- 1.5.3.8 From eli at mellanox.co.il Wed Jan 30 08:31:00 2008 From: eli at mellanox.co.il (Eli Cohen) Date: Wed, 30 Jan 2008 18:31:00 +0200 Subject: [ofa-general] [PATCH 4/16 v4] IB/ipoib: Add checksum offload support Message-ID: <1201710660.28794.170.camel@mtls03> IB/ipoib: Add checksum offload support Signed-off-by: Eli Cohen --- drivers/infiniband/ulp/ipoib/ipoib.h | 1 + drivers/infiniband/ulp/ipoib/ipoib_cm.c | 7 +++++++ drivers/infiniband/ulp/ipoib/ipoib_ib.c | 12 ++++++++++++ drivers/infiniband/ulp/ipoib/ipoib_main.c | 15 +++++++++++++++ 4 files changed, 35 insertions(+), 0 deletions(-) diff --git a/drivers/infiniband/ulp/ipoib/ipoib.h b/drivers/infiniband/ulp/ipoib/ipoib.h index 7c9edc6..d13e481 100644 --- a/drivers/infiniband/ulp/ipoib/ipoib.h +++ b/drivers/infiniband/ulp/ipoib/ipoib.h @@ -87,6 +87,7 @@ enum { IPOIB_MCAST_STARTED = 8, IPOIB_FLAG_ADMIN_CM = 9, IPOIB_FLAG_UMCAST = 10, + IPOIB_FLAG_CSUM = 11, IPOIB_MAX_BACKOFF_SECONDS = 16, diff --git a/drivers/infiniband/ulp/ipoib/ipoib_cm.c b/drivers/infiniband/ulp/ipoib/ipoib_cm.c index 7dd2ec4..e94ec0a 100644 --- a/drivers/infiniband/ulp/ipoib/ipoib_cm.c +++ b/drivers/infiniband/ulp/ipoib/ipoib_cm.c @@ -1378,6 +1378,9 @@ static ssize_t set_mode(struct device *d, struct device_attribute *attr, set_bit(IPOIB_FLAG_ADMIN_CM, &priv->flags); ipoib_warn(priv, "enabling connected mode " "will cause multicast packet drops\n"); + + dev->features &= ~(NETIF_F_IP_CSUM | NETIF_F_SG); + ipoib_flush_paths(dev); return count; } @@ -1386,6 +1389,10 @@ static ssize_t set_mode(struct device *d, 
struct device_attribute *attr, clear_bit(IPOIB_FLAG_ADMIN_CM, &priv->flags); dev->mtu = min(priv->mcast_mtu, dev->mtu); ipoib_flush_paths(dev); + + if (priv->ca->flags & IB_DEVICE_IP_CSUM) + dev->features |= NETIF_F_IP_CSUM | NETIF_F_SG; + return count; } diff --git a/drivers/infiniband/ulp/ipoib/ipoib_ib.c b/drivers/infiniband/ulp/ipoib/ipoib_ib.c index 680c27f..0f616f6 100644 --- a/drivers/infiniband/ulp/ipoib/ipoib_ib.c +++ b/drivers/infiniband/ulp/ipoib/ipoib_ib.c @@ -37,6 +37,7 @@ #include #include +#include #include @@ -231,6 +232,11 @@ static void ipoib_ib_handle_rx_wc(struct net_device *dev, struct ib_wc *wc) skb->dev = dev; /* XXX get correct PACKET_ type here */ skb->pkt_type = PACKET_HOST; + + /* check rx csum */ + if (test_bit(IPOIB_FLAG_CSUM, &priv->flags) && likely(wc->csum_ok)) + skb->ip_summed = CHECKSUM_UNNECESSARY; + netif_receive_skb(skb); repost: @@ -394,6 +400,12 @@ void ipoib_send(struct net_device *dev, struct sk_buff *skb, return; } + if (dev->features & NETIF_F_IP_CSUM && + skb->ip_summed == CHECKSUM_PARTIAL) + priv->tx_wr.send_flags |= IB_SEND_IP_CSUM; + else + priv->tx_wr.send_flags &= ~IB_SEND_IP_CSUM; + if (unlikely(post_send(priv, priv->tx_head & (ipoib_sendq_size - 1), address->ah, qpn, tx_req->mapping, skb_headlen(skb), diff --git a/drivers/infiniband/ulp/ipoib/ipoib_main.c b/drivers/infiniband/ulp/ipoib/ipoib_main.c index 8dda67e..83f8b85 100644 --- a/drivers/infiniband/ulp/ipoib/ipoib_main.c +++ b/drivers/infiniband/ulp/ipoib/ipoib_main.c @@ -1099,6 +1099,20 @@ int ipoib_add_pkey_attr(struct net_device *dev) return device_create_file(&dev->dev, &dev_attr_pkey); } +static void set_csum(struct net_device *dev, struct ib_device *hca) +{ + struct ipoib_dev_priv *priv = netdev_priv(dev); + + if (test_bit(IPOIB_FLAG_ADMIN_CM, &priv->flags)) + return; + + if (!(hca->flags & IB_DEVICE_IP_CSUM)) + return; + + dev->features |= NETIF_F_SG | NETIF_F_IP_CSUM; + set_bit(IPOIB_FLAG_CSUM, &priv->flags); +} + static struct net_device
*ipoib_add_port(const char *format, struct ib_device *hca, u8 port) { @@ -1137,6 +1151,7 @@ static struct net_device *ipoib_add_port(const char *format, } else memcpy(priv->dev->dev_addr + 4, priv->local_gid.raw, sizeof (union ib_gid)); + set_csum(priv->dev, hca); result = ipoib_dev_init(priv->dev, hca, port); if (result < 0) { -- 1.5.3.8 From eli at mellanox.co.il Wed Jan 30 08:31:06 2008 From: eli at mellanox.co.il (Eli Cohen) Date: Wed, 30 Jan 2008 18:31:06 +0200 Subject: [ofa-general] [PATCH 5/16 v4] IB/mlx4: Add checksum offload support Message-ID: <1201710666.28794.171.camel@mtls03> [PATCH] IB/mlx4: Add checksum offload support Signed-off-by: Eli Cohen Signed-off-by: Ali Ayub --- drivers/infiniband/hw/mlx4/cq.c | 9 +++++++++ drivers/infiniband/hw/mlx4/main.c | 5 +++++ drivers/infiniband/hw/mlx4/qp.c | 3 +++ drivers/net/mlx4/fw.c | 3 +++ include/linux/mlx4/cq.h | 4 ++-- include/linux/mlx4/qp.h | 2 ++ 6 files changed, 24 insertions(+), 2 deletions(-) diff --git a/drivers/infiniband/hw/mlx4/cq.c b/drivers/infiniband/hw/mlx4/cq.c index 7950aa6..539c69c 100644 --- a/drivers/infiniband/hw/mlx4/cq.c +++ b/drivers/infiniband/hw/mlx4/cq.c @@ -315,6 +315,11 @@ static int mlx4_ib_poll_one(struct mlx4_ib_cq *cq, int is_error; u32 g_mlpath_rqpn; u16 wqe_ctr; + __be32 status; + +#define CSUM_MASK_BITS cpu_to_be32(0x13c00000) +#define CSUM_VAL_BITS cpu_to_be32(0x10400000) +#define CSUM_MASK2_BITS cpu_to_be32(0x0c000000) cqe = next_cqe_sw(cq); if (!cqe) @@ -432,6 +437,10 @@ static int mlx4_ib_poll_one(struct mlx4_ib_cq *cq, wc->dlid_path_bits = (g_mlpath_rqpn >> 24) & 0x7f; wc->wc_flags |= g_mlpath_rqpn & 0x80000000 ? 
IB_WC_GRH : 0; wc->pkey_index = be32_to_cpu(cqe->immed_rss_invalid) & 0x7f; + status = cqe->ipoib_status; + wc->csum_ok = (status & CSUM_MASK_BITS) == CSUM_VAL_BITS && + (status & CSUM_MASK2_BITS) && + cqe->checksum == 0xffff; } return 0; diff --git a/drivers/infiniband/hw/mlx4/main.c b/drivers/infiniband/hw/mlx4/main.c index d8287d9..8ce94a1 100644 --- a/drivers/infiniband/hw/mlx4/main.c +++ b/drivers/infiniband/hw/mlx4/main.c @@ -99,6 +99,8 @@ static int mlx4_ib_query_device(struct ib_device *ibdev, props->device_cap_flags |= IB_DEVICE_AUTO_PATH_MIG; if (dev->dev->caps.flags & MLX4_DEV_CAP_FLAG_UD_AV_PORT) props->device_cap_flags |= IB_DEVICE_UD_AV_PORT_ENFORCE; + if (dev->dev->caps.flags & MLX4_DEV_CAP_FLAG_IPOIB_CSUM) + props->device_cap_flags |= IB_DEVICE_IP_CSUM; props->vendor_id = be32_to_cpup((__be32 *) (out_mad->data + 36)) & 0xffffff; @@ -612,6 +614,9 @@ static void *mlx4_ib_add(struct mlx4_dev *dev) ibdev->ib_dev.unmap_fmr = mlx4_ib_unmap_fmr; ibdev->ib_dev.dealloc_fmr = mlx4_ib_fmr_dealloc; + if (ibdev->dev->caps.flags & MLX4_DEV_CAP_FLAG_IPOIB_CSUM) + ibdev->ib_dev.flags |= IB_DEVICE_IP_CSUM; + if (init_node_data(ibdev)) goto err_map; diff --git a/drivers/infiniband/hw/mlx4/qp.c b/drivers/infiniband/hw/mlx4/qp.c index 8cba9c5..ca7cd04 100644 --- a/drivers/infiniband/hw/mlx4/qp.c +++ b/drivers/infiniband/hw/mlx4/qp.c @@ -1307,6 +1307,9 @@ int mlx4_ib_post_send(struct ib_qp *ibqp, struct ib_send_wr *wr, cpu_to_be32(MLX4_WQE_CTRL_CQ_UPDATE) : 0) | (wr->send_flags & IB_SEND_SOLICITED ? cpu_to_be32(MLX4_WQE_CTRL_SOLICITED) : 0) | + ((wr->send_flags & IB_SEND_IP_CSUM) ? 
+ cpu_to_be32(MLX4_WQE_CTRL_IP_CSUM | + MLX4_WQE_CTRL_TCP_UDP_CSUM) : 0) | qp->sq_signal_bits; if (wr->opcode == IB_WR_SEND_WITH_IMM || diff --git a/drivers/net/mlx4/fw.c b/drivers/net/mlx4/fw.c index 535a446..736942f 100644 --- a/drivers/net/mlx4/fw.c +++ b/drivers/net/mlx4/fw.c @@ -736,6 +736,9 @@ int mlx4_INIT_HCA(struct mlx4_dev *dev, struct mlx4_init_hca_param *param) MLX4_PUT(inbox, (u8) (PAGE_SHIFT - 12), INIT_HCA_UAR_PAGE_SZ_OFFSET); MLX4_PUT(inbox, param->log_uar_sz, INIT_HCA_LOG_UAR_SZ_OFFSET); + if (dev->caps.flags & MLX4_DEV_CAP_FLAG_IPOIB_CSUM) + *(inbox + INIT_HCA_FLAGS_OFFSET / 4) |= cpu_to_be32(1 << 3); + err = mlx4_cmd(dev, mailbox->dma, 0, 0, MLX4_CMD_INIT_HCA, 10000); if (err) diff --git a/include/linux/mlx4/cq.h b/include/linux/mlx4/cq.h index 0181e0a..5fdc859 100644 --- a/include/linux/mlx4/cq.h +++ b/include/linux/mlx4/cq.h @@ -45,11 +45,11 @@ struct mlx4_cqe { u8 sl; u8 reserved1; __be16 rlid; - u32 reserved2; + __be32 ipoib_status; __be32 byte_cnt; __be16 wqe_index; __be16 checksum; - u8 reserved3[3]; + u8 reserved2[3]; u8 owner_sr_opcode; }; diff --git a/include/linux/mlx4/qp.h b/include/linux/mlx4/qp.h index 3968b94..b4eb921 100644 --- a/include/linux/mlx4/qp.h +++ b/include/linux/mlx4/qp.h @@ -158,6 +158,8 @@ enum { MLX4_WQE_CTRL_FENCE = 1 << 6, MLX4_WQE_CTRL_CQ_UPDATE = 3 << 2, MLX4_WQE_CTRL_SOLICITED = 1 << 1, + MLX4_WQE_CTRL_IP_CSUM = 1 << 4, + MLX4_WQE_CTRL_TCP_UDP_CSUM = 1 << 5, }; struct mlx4_wqe_ctrl_seg { -- 1.5.3.8 From eli at mellanox.co.il Wed Jan 30 08:31:16 2008 From: eli at mellanox.co.il (Eli Cohen) Date: Wed, 30 Jan 2008 18:31:16 +0200 Subject: [ofa-general] [PATCH 6/16 v4] IB/mthca: Add checksum offload support Message-ID: <1201710676.28794.172.camel@mtls03> IB/mthca: Add checksum offload support Signed-off-by: Eli Cohen --- drivers/infiniband/hw/mthca/mthca_cmd.c | 3 +++ drivers/infiniband/hw/mthca/mthca_cmd.h | 1 + drivers/infiniband/hw/mthca/mthca_cq.c | 14 +++++++++----- drivers/infiniband/hw/mthca/mthca_main.c | 6 
++++++ drivers/infiniband/hw/mthca/mthca_qp.c | 2 ++ drivers/infiniband/hw/mthca/mthca_wqe.h | 17 +++++++++-------- 6 files changed, 30 insertions(+), 13 deletions(-) diff --git a/drivers/infiniband/hw/mthca/mthca_cmd.c b/drivers/infiniband/hw/mthca/mthca_cmd.c index 6966f94..2a38926 100644 --- a/drivers/infiniband/hw/mthca/mthca_cmd.c +++ b/drivers/infiniband/hw/mthca/mthca_cmd.c @@ -1383,6 +1383,9 @@ int mthca_INIT_HCA(struct mthca_dev *dev, MTHCA_PUT(inbox, param->uarc_base, INIT_HCA_UAR_CTX_BASE_OFFSET); } + if (dev->device_cap_flags & IB_DEVICE_IP_CSUM) + *(inbox + INIT_HCA_FLAGS2_OFFSET / 4) |= cpu_to_be32(7 << 3); + err = mthca_cmd(dev, mailbox->dma, 0, 0, CMD_INIT_HCA, HZ, status); mthca_free_mailbox(dev, mailbox); diff --git a/drivers/infiniband/hw/mthca/mthca_cmd.h b/drivers/infiniband/hw/mthca/mthca_cmd.h index 2f976f2..8928ca4 100644 --- a/drivers/infiniband/hw/mthca/mthca_cmd.h +++ b/drivers/infiniband/hw/mthca/mthca_cmd.h @@ -103,6 +103,7 @@ enum { DEV_LIM_FLAG_RAW_IPV6 = 1 << 4, DEV_LIM_FLAG_RAW_ETHER = 1 << 5, DEV_LIM_FLAG_SRQ = 1 << 6, + DEV_LIM_FLAG_IPOIB_CSUM = 1 << 7, DEV_LIM_FLAG_BAD_PKEY_CNTR = 1 << 8, DEV_LIM_FLAG_BAD_QKEY_CNTR = 1 << 9, DEV_LIM_FLAG_MW = 1 << 16, diff --git a/drivers/infiniband/hw/mthca/mthca_cq.c b/drivers/infiniband/hw/mthca/mthca_cq.c index 6bd9f13..4e6c75c 100644 --- a/drivers/infiniband/hw/mthca/mthca_cq.c +++ b/drivers/infiniband/hw/mthca/mthca_cq.c @@ -119,7 +119,8 @@ struct mthca_cqe { __be32 my_qpn; __be32 my_ee; __be32 rqpn; - __be16 sl_g_mlpath; + u8 sl_ipok; + u8 g_mlpath; __be16 rlid; __be32 imm_etype_pkey_eec; __be32 byte_cnt; @@ -493,6 +494,7 @@ static inline int mthca_poll_one(struct mthca_dev *dev, int is_send; int free_cqe = 1; int err = 0; + u16 checksum; cqe = next_cqe_sw(cq); if (!cqe) @@ -635,12 +637,14 @@ static inline int mthca_poll_one(struct mthca_dev *dev, break; } entry->slid = be16_to_cpu(cqe->rlid); - entry->sl = be16_to_cpu(cqe->sl_g_mlpath) >> 12; + entry->sl = cqe->sl_ipok >> 4; entry->src_qp 
= be32_to_cpu(cqe->rqpn) & 0xffffff; - entry->dlid_path_bits = be16_to_cpu(cqe->sl_g_mlpath) & 0x7f; + entry->dlid_path_bits = cqe->g_mlpath & 0x7f; entry->pkey_index = be32_to_cpu(cqe->imm_etype_pkey_eec) >> 16; - entry->wc_flags |= be16_to_cpu(cqe->sl_g_mlpath) & 0x80 ? - IB_WC_GRH : 0; + entry->wc_flags |= cqe->g_mlpath & 0x80 ? IB_WC_GRH : 0; + checksum = (be32_to_cpu(cqe->rqpn) >> 24) | + ((be32_to_cpu(cqe->my_ee) >> 16) & 0xff00); + entry->csum_ok = (cqe->sl_ipok & 1 && checksum == 0xffff); } entry->status = IB_WC_SUCCESS; diff --git a/drivers/infiniband/hw/mthca/mthca_main.c b/drivers/infiniband/hw/mthca/mthca_main.c index 5cf8250..881ddc8 100644 --- a/drivers/infiniband/hw/mthca/mthca_main.c +++ b/drivers/infiniband/hw/mthca/mthca_main.c @@ -267,6 +267,10 @@ static int mthca_dev_lim(struct mthca_dev *mdev, struct mthca_dev_lim *dev_lim) if (dev_lim->flags & DEV_LIM_FLAG_SRQ) mdev->mthca_flags |= MTHCA_FLAG_SRQ; + if (mthca_is_memfree(mdev)) + if (dev_lim->flags & DEV_LIM_FLAG_IPOIB_CSUM) + mdev->device_cap_flags |= IB_DEVICE_IP_CSUM; + return 0; } @@ -1109,6 +1113,8 @@ static int __mthca_init_one(struct pci_dev *pdev, int hca_type) if (err) goto err_cmd; + mdev->ib_dev.flags = mdev->device_cap_flags; + if (mdev->fw_ver < mthca_hca_table[hca_type].latest_fw) { mthca_warn(mdev, "HCA FW version %d.%d.%03d is old (%d.%d.%03d is current).\n", (int) (mdev->fw_ver >> 32), (int) (mdev->fw_ver >> 16) & 0xffff, diff --git a/drivers/infiniband/hw/mthca/mthca_qp.c b/drivers/infiniband/hw/mthca/mthca_qp.c index 0e5461c..86aa732 100644 --- a/drivers/infiniband/hw/mthca/mthca_qp.c +++ b/drivers/infiniband/hw/mthca/mthca_qp.c @@ -2012,6 +2012,8 @@ int mthca_arbel_post_send(struct ib_qp *ibqp, struct ib_send_wr *wr, cpu_to_be32(MTHCA_NEXT_CQ_UPDATE) : 0) | ((wr->send_flags & IB_SEND_SOLICITED) ? cpu_to_be32(MTHCA_NEXT_SOLICIT) : 0) | + ((wr->send_flags & IB_SEND_IP_CSUM) ? 
+ cpu_to_be32(MTHCA_NEXT_IP_CSUM | MTHCA_NEXT_TCP_UDP_CSUM) : 0) | cpu_to_be32(1); if (wr->opcode == IB_WR_SEND_WITH_IMM || wr->opcode == IB_WR_RDMA_WRITE_WITH_IMM) diff --git a/drivers/infiniband/hw/mthca/mthca_wqe.h b/drivers/infiniband/hw/mthca/mthca_wqe.h index f6a66fe..0e3a0e4 100644 --- a/drivers/infiniband/hw/mthca/mthca_wqe.h +++ b/drivers/infiniband/hw/mthca/mthca_wqe.h @@ -38,14 +38,15 @@ #include enum { - MTHCA_NEXT_DBD = 1 << 7, - MTHCA_NEXT_FENCE = 1 << 6, - MTHCA_NEXT_CQ_UPDATE = 1 << 3, - MTHCA_NEXT_EVENT_GEN = 1 << 2, - MTHCA_NEXT_SOLICIT = 1 << 1, - - MTHCA_MLX_VL15 = 1 << 17, - MTHCA_MLX_SLR = 1 << 16 + MTHCA_NEXT_DBD = 1 << 7, + MTHCA_NEXT_FENCE = 1 << 6, + MTHCA_NEXT_CQ_UPDATE = 1 << 3, + MTHCA_NEXT_EVENT_GEN = 1 << 2, + MTHCA_NEXT_SOLICIT = 1 << 1, + MTHCA_NEXT_IP_CSUM = 1 << 4, + MTHCA_NEXT_TCP_UDP_CSUM = 1 << 5, + MTHCA_MLX_VL15 = 1 << 17, + MTHCA_MLX_SLR = 1 << 16 }; enum { -- 1.5.3.8 From eli at mellanox.co.il Wed Jan 30 08:31:20 2008 From: eli at mellanox.co.il (Eli Cohen) Date: Wed, 30 Jan 2008 18:31:20 +0200 Subject: [ofa-general] [PATCH 7/16 v4] IB/core: Add creation flags to QPs Message-ID: <1201710680.28794.173.camel@mtls03> IB/core: Add creation flags to QPs This will allow a kernel verbs consumer to create a QP and pass special flags to the hw layer. This patch also defines one such flag for LSO support. Signed-off-by: Eli Cohen --- drivers/infiniband/core/uverbs_cmd.c | 1 + include/rdma/ib_verbs.h | 5 +++++ 2 files changed, 6 insertions(+), 0 deletions(-) diff --git a/drivers/infiniband/core/uverbs_cmd.c b/drivers/infiniband/core/uverbs_cmd.c index 495c803..9e98cec 100644 --- a/drivers/infiniband/core/uverbs_cmd.c +++ b/drivers/infiniband/core/uverbs_cmd.c @@ -1065,6 +1065,7 @@ ssize_t ib_uverbs_create_qp(struct ib_uverbs_file *file, attr.srq = srq; attr.sq_sig_type = cmd.sq_sig_all ? 
IB_SIGNAL_ALL_WR : IB_SIGNAL_REQ_WR; attr.qp_type = cmd.qp_type; + attr.create_flags = 0; attr.cap.max_send_wr = cmd.max_send_wr; attr.cap.max_recv_wr = cmd.max_recv_wr; diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h index 85f2cda..030f868 100644 --- a/include/rdma/ib_verbs.h +++ b/include/rdma/ib_verbs.h @@ -494,6 +494,10 @@ enum ib_qp_type { IB_QPT_RAW_ETY }; +enum qp_create_flags { + QP_CREATE_LSO = 1 << 0, +}; + struct ib_qp_init_attr { void (*event_handler)(struct ib_event *, void *); void *qp_context; @@ -504,6 +508,7 @@ struct ib_qp_init_attr { enum ib_sig_type sq_sig_type; enum ib_qp_type qp_type; u8 port_num; /* special QP types only */ + enum qp_create_flags create_flags; }; enum ib_rnr_timeout { -- 1.5.3.8 From eli at mellanox.co.il Wed Jan 30 08:31:24 2008 From: eli at mellanox.co.il (Eli Cohen) Date: Wed, 30 Jan 2008 18:31:24 +0200 Subject: [ofa-general] [PATCH 8/16 v4] IB/core: Add support for LSO Message-ID: <1201710684.28794.174.camel@mtls03> IB/core: Add support for LSO LSO allows SKBs with data larger than the MTU to be passed to the network driver, letting the HW fragment the packet into mss-sized quantities. Signed-off-by: Eli Cohen --- include/rdma/ib_verbs.h | 11 +++++++++-- 1 files changed, 9 insertions(+), 2 deletions(-) diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h index 030f868..44a5713 100644 --- a/include/rdma/ib_verbs.h +++ b/include/rdma/ib_verbs.h @@ -103,6 +103,7 @@ enum ib_device_cap_flags { * may set NETIF_F_IP_CSUM. */ IB_DEVICE_IP_CSUM = (1<<18), + IB_DEVICE_TCP_TSO = (1<<19), }; enum ib_atomic_cap { @@ -410,6 +411,7 @@ enum ib_wc_opcode { IB_WC_COMP_SWAP, IB_WC_FETCH_ADD, IB_WC_BIND_MW, + IB_WC_LSO, /* * Set value of IB_WC_RECV so consumers can test if a completion is a * receive by testing (opcode & IB_WC_RECV). 
@@ -621,7 +623,8 @@ enum ib_wr_opcode { IB_WR_SEND_WITH_IMM, IB_WR_RDMA_READ, IB_WR_ATOMIC_CMP_AND_SWP, - IB_WR_ATOMIC_FETCH_AND_ADD + IB_WR_ATOMIC_FETCH_AND_ADD, + IB_WR_LSO }; enum ib_send_flags { @@ -629,7 +632,8 @@ enum ib_send_flags { IB_SEND_SIGNALED = (1<<1), IB_SEND_SOLICITED = (1<<2), IB_SEND_INLINE = (1<<3), - IB_SEND_IP_CSUM = (1<<4) + IB_SEND_IP_CSUM = (1<<4), + IB_SEND_UDP_LSO = (1<<5) }; struct ib_sge { @@ -659,6 +663,9 @@ struct ib_send_wr { } atomic; struct { struct ib_ah *ah; + void *header; + int hlen; + int mss; u32 remote_qpn; u32 remote_qkey; u16 pkey_index; /* valid for GSI only */ -- 1.5.3.8 From eli at mellanox.co.il Wed Jan 30 08:31:28 2008 From: eli at mellanox.co.il (Eli Cohen) Date: Wed, 30 Jan 2008 18:31:28 +0200 Subject: [ofa-general] [PATCH 9/16 v4] IB/ipoib: Add LSO support Message-ID: <1201710688.28794.175.camel@mtls03> IB/ipoib: Add LSO support Signed-off-by: Eli Cohen --- drivers/infiniband/ulp/ipoib/ipoib.h | 54 ++++++++++++------- drivers/infiniband/ulp/ipoib/ipoib_cm.c | 7 ++- drivers/infiniband/ulp/ipoib/ipoib_ib.c | 80 +++++++++++++++++++++------ drivers/infiniband/ulp/ipoib/ipoib_main.c | 8 +++- drivers/infiniband/ulp/ipoib/ipoib_verbs.c | 5 ++- 5 files changed, 113 insertions(+), 41 deletions(-) diff --git a/drivers/infiniband/ulp/ipoib/ipoib.h b/drivers/infiniband/ulp/ipoib/ipoib.h index d13e481..70f8b5c 100644 --- a/drivers/infiniband/ulp/ipoib/ipoib.h +++ b/drivers/infiniband/ulp/ipoib/ipoib.h @@ -152,31 +152,40 @@ static inline int ipoib_dma_map_tx(struct ib_device *ca, { struct sk_buff *skb = tx_req->skb; u64 *mapping = tx_req->mapping; - int frags; int i; - - mapping[0] = ib_dma_map_single(ca, skb->data, skb_headlen(skb), - DMA_TO_DEVICE); - if (unlikely(ib_dma_mapping_error(ca, mapping[0]))) - return -EIO; - - frags = skb_shinfo(skb)->nr_frags; - for (i = 0; i < frags; ++i) { + int nfrags; + int off; + + if (skb_headlen(skb)) { + mapping[0] = ib_dma_map_single(ca, skb->data, skb_headlen(skb), + DMA_TO_DEVICE); + if 
(unlikely(ib_dma_mapping_error(ca, mapping[0]))) + return -EIO; + off = 1; + } else + off = 0; + + nfrags = skb_shinfo(skb)->nr_frags; + for (i = 0; i < nfrags; ++i) { skb_frag_t *frag = &skb_shinfo(skb)->frags[i]; - mapping[i + 1] = ib_dma_map_page(ca, frag->page, - frag->page_offset, frag->size, - DMA_TO_DEVICE); - if (unlikely(ib_dma_mapping_error(ca, mapping[i + 1]))) + mapping[i + off] = ib_dma_map_page(ca, frag->page, frag->page_offset, + frag->size, DMA_TO_DEVICE); + if (unlikely(ib_dma_mapping_error(ca, mapping[i + off]))) goto partial_error; } return 0; partial_error: - ib_dma_unmap_single(ca, mapping[0], skb_headlen(skb), DMA_TO_DEVICE); + if (skb_headlen(skb)) { + ib_dma_unmap_single(ca, mapping[0], skb_headlen(skb), DMA_TO_DEVICE); + off = 0; + } else + off = 1; for (; i > 0; --i) { skb_frag_t *frag = &skb_shinfo(skb)->frags[i - 1]; - ib_dma_unmap_page(ca, mapping[i], frag->size, DMA_TO_DEVICE); + ib_dma_unmap_page(ca, mapping[i - off], frag->size, + DMA_TO_DEVICE); } return -EIO; } @@ -186,15 +195,20 @@ static inline void ipoib_dma_unmap_tx(struct ib_device *ca, { struct sk_buff *skb = tx_req->skb; u64 *mapping = tx_req->mapping; - int frags; int i; + int nfrags; + int off; - ib_dma_unmap_single(ca, mapping[0], skb_headlen(skb), DMA_TO_DEVICE); + if (skb_headlen(skb)) { + ib_dma_unmap_single(ca, mapping[0], skb_headlen(skb), DMA_TO_DEVICE); + off = 1; + } else + off = 0; - frags = skb_shinfo(skb)->nr_frags; - for (i = 0; i < frags; ++i) { + nfrags = skb_shinfo(skb)->nr_frags; + for (i = 0; i < nfrags; ++i) { skb_frag_t *frag = &skb_shinfo(skb)->frags[i]; - ib_dma_unmap_page(ca, mapping[i + 1], frag->size, + ib_dma_unmap_page(ca, mapping[i + off], frag->size, DMA_TO_DEVICE); } } diff --git a/drivers/infiniband/ulp/ipoib/ipoib_cm.c b/drivers/infiniband/ulp/ipoib/ipoib_cm.c index e94ec0a..4f5604d 100644 --- a/drivers/infiniband/ulp/ipoib/ipoib_cm.c +++ b/drivers/infiniband/ulp/ipoib/ipoib_cm.c @@ -1379,7 +1379,7 @@ static ssize_t set_mode(struct device 
*d, struct device_attribute *attr, ipoib_warn(priv, "enabling connected mode " "will cause multicast packet drops\n"); - dev->features &= ~(NETIF_F_IP_CSUM | NETIF_F_SG); + dev->features &= ~(NETIF_F_IP_CSUM | NETIF_F_SG | NETIF_F_TSO); ipoib_flush_paths(dev); return count; @@ -1393,6 +1393,11 @@ static ssize_t set_mode(struct device *d, struct device_attribute *attr, if (priv->ca->flags & IB_DEVICE_IP_CSUM) dev->features |= NETIF_F_IP_CSUM | NETIF_F_SG; + + if (priv->dev->features & NETIF_F_SG && + priv->ca->flags & IB_DEVICE_TCP_TSO) + priv->dev->features |= NETIF_F_TSO; + return count; } diff --git a/drivers/infiniband/ulp/ipoib/ipoib_ib.c b/drivers/infiniband/ulp/ipoib/ipoib_ib.c index 0f616f6..c3af51b 100644 --- a/drivers/infiniband/ulp/ipoib/ipoib_ib.c +++ b/drivers/infiniband/ulp/ipoib/ipoib_ib.c @@ -38,6 +38,7 @@ #include #include #include +#include #include @@ -346,24 +347,40 @@ void ipoib_ib_completion(struct ib_cq *cq, void *dev_ptr) static inline int post_send(struct ipoib_dev_priv *priv, unsigned int wr_id, struct ib_ah *address, u32 qpn, - u64 *mapping, int headlen, - skb_frag_t *frags, - int nr_frags) + struct ipoib_tx_buf *tx_req, + void *head, int hlen) { struct ib_send_wr *bad_wr; - int i; + int i, off; + struct sk_buff *skb = tx_req->skb; + skb_frag_t *frags = skb_shinfo(skb)->frags; + int nr_frags = skb_shinfo(skb)->nr_frags; + u64 *mapping = tx_req->mapping; + + if (skb_headlen(skb)) { + priv->tx_sge[0].addr = mapping[0]; + priv->tx_sge[0].length = skb_headlen(skb); + off = 1; + } else + off = 0; - priv->tx_sge[0].addr = mapping[0]; - priv->tx_sge[0].length = headlen; for (i = 0; i < nr_frags; ++i) { - priv->tx_sge[i + 1].addr = mapping[i + 1]; - priv->tx_sge[i + 1].length = frags[i].size; + priv->tx_sge[i + off].addr = mapping[i + off]; + priv->tx_sge[i + off].length = frags[i].size; } - priv->tx_wr.num_sge = nr_frags + 1; + priv->tx_wr.num_sge = nr_frags + off; priv->tx_wr.wr_id = wr_id; priv->tx_wr.wr.ud.remote_qpn = qpn; 
priv->tx_wr.wr.ud.ah = address; + if (head) { + priv->tx_wr.wr.ud.mss = skb_shinfo(skb)->gso_size; + priv->tx_wr.wr.ud.header = head; + priv->tx_wr.wr.ud.hlen = hlen; + priv->tx_wr.opcode = IB_WR_LSO; + } else + priv->tx_wr.opcode = IB_WR_SEND; + return ib_post_send(priv->qp, &priv->tx_wr, &bad_wr); } @@ -372,14 +389,36 @@ void ipoib_send(struct net_device *dev, struct sk_buff *skb, { struct ipoib_dev_priv *priv = netdev_priv(dev); struct ipoib_tx_buf *tx_req; + int hlen; + void *phead; + + if (!skb_is_gso(skb)) { + if (unlikely(skb->len > priv->mcast_mtu + IPOIB_ENCAP_LEN)) { + ipoib_warn(priv, "packet len %d (> %d) too long to send, dropping\n", + skb->len, priv->mcast_mtu + IPOIB_ENCAP_LEN); + ++dev->stats.tx_dropped; + ++dev->stats.tx_errors; + ipoib_cm_skb_too_long(dev, skb, priv->mcast_mtu); + return; + } + phead = 0; + hlen = 0; + } else { + /* + * LSO header is limited to max 60 bytes + */ + if (unlikely((ip_hdr(skb)->ihl + tcp_hdr(skb)->doff) > 15)) { + ipoib_warn(priv, "ip(%d) and tcp(%d) headers too long, dropping skb\n", + ip_hdr(skb)->ihl << 2, tcp_hdr(skb)->doff << 2); + goto drop; + } - if (unlikely(skb->len > priv->mcast_mtu + IPOIB_ENCAP_LEN)) { - ipoib_warn(priv, "packet len %d (> %d) too long to send, dropping\n", - skb->len, priv->mcast_mtu + IPOIB_ENCAP_LEN); - ++dev->stats.tx_dropped; - ++dev->stats.tx_errors; - ipoib_cm_skb_too_long(dev, skb, priv->mcast_mtu); - return; + hlen = ((ip_hdr(skb)->ihl + tcp_hdr(skb)->doff) << 2) + IPOIB_ENCAP_LEN; + phead = skb->data; + if (unlikely(!skb_pull(skb, hlen))) { + ipoib_warn(priv, "linear data too small\n"); + goto drop; + } } ipoib_dbg_data(priv, "sending packet, length=%d address=%p qpn=0x%06x\n", @@ -408,8 +447,7 @@ void ipoib_send(struct net_device *dev, struct sk_buff *skb, if (unlikely(post_send(priv, priv->tx_head & (ipoib_sendq_size - 1), address->ah, qpn, - tx_req->mapping, skb_headlen(skb), - skb_shinfo(skb)->frags, skb_shinfo(skb)->nr_frags))) { + tx_req, phead, hlen))) { ipoib_warn(priv, 
"post_send failed\n"); ++dev->stats.tx_errors; ipoib_dma_unmap_tx(priv->ca, tx_req); @@ -425,6 +463,12 @@ void ipoib_send(struct net_device *dev, struct sk_buff *skb, netif_stop_queue(dev); } } + return; + +drop: + ++dev->stats.tx_errors; + dev_kfree_skb_any(skb); + return; } static void __ipoib_reap_ah(struct net_device *dev) diff --git a/drivers/infiniband/ulp/ipoib/ipoib_main.c b/drivers/infiniband/ulp/ipoib/ipoib_main.c index 83f8b85..9063f28 100644 --- a/drivers/infiniband/ulp/ipoib/ipoib_main.c +++ b/drivers/infiniband/ulp/ipoib/ipoib_main.c @@ -706,7 +706,9 @@ static int ipoib_start_xmit(struct sk_buff *skb, struct net_device *dev) goto out; } - ipoib_send(dev, skb, neigh->ah, IPOIB_QPN(skb->dst->neighbour->ha)); + ipoib_send(dev, skb, neigh->ah, + IPOIB_QPN(skb->dst->neighbour->ha)); + goto out; } @@ -1170,6 +1172,10 @@ static struct net_device *ipoib_add_port(const char *format, goto event_failed; } + if (priv->dev->features & NETIF_F_SG && priv->ca->flags & IB_DEVICE_TCP_TSO) + priv->dev->features |= NETIF_F_TSO; + + result = register_netdev(priv->dev); if (result) { printk(KERN_WARNING "%s: couldn't register ipoib port %d; error %d\n", diff --git a/drivers/infiniband/ulp/ipoib/ipoib_verbs.c b/drivers/infiniband/ulp/ipoib/ipoib_verbs.c index 5e392e0..e20f2af 100644 --- a/drivers/infiniband/ulp/ipoib/ipoib_verbs.c +++ b/drivers/infiniband/ulp/ipoib/ipoib_verbs.c @@ -153,7 +153,7 @@ int ipoib_transport_dev_init(struct net_device *dev, struct ib_device *ca) .max_recv_sge = 1 }, .sq_sig_type = IB_SIGNAL_ALL_WR, - .qp_type = IB_QPT_UD + .qp_type = IB_QPT_UD, }; int i, ret, size; @@ -191,6 +191,9 @@ int ipoib_transport_dev_init(struct net_device *dev, struct ib_device *ca) init_attr.send_cq = priv->cq; init_attr.recv_cq = priv->cq; + if (ca->flags & IB_DEVICE_TCP_TSO) + init_attr.create_flags = QP_CREATE_LSO; + priv->qp = ib_create_qp(priv->pd, &init_attr); if (IS_ERR(priv->qp)) { printk(KERN_WARNING "%s: failed to create QP\n", ca->name); -- 1.5.3.8 From eli 
at mellanox.co.il Wed Jan 30 08:31:31 2008 From: eli at mellanox.co.il (Eli Cohen) Date: Wed, 30 Jan 2008 18:31:31 +0200 Subject: [ofa-general] [PATCH 10/16 v4] IB/mlx4: Add creation flags to mlx4 QPs Message-ID: <1201710691.28794.176.camel@mtls03> IB/mlx4: Add creation flags to mlx4 QPs The core passes creation flags and mlx4 saves them for later reference. Signed-off-by: Eli Cohen --- drivers/infiniband/hw/mlx4/mlx4_ib.h | 5 +++++ drivers/infiniband/hw/mlx4/qp.c | 12 +++++++++--- 2 files changed, 14 insertions(+), 3 deletions(-) diff --git a/drivers/infiniband/hw/mlx4/mlx4_ib.h b/drivers/infiniband/hw/mlx4/mlx4_ib.h index 2869765..39bc060 100644 --- a/drivers/infiniband/hw/mlx4/mlx4_ib.h +++ b/drivers/infiniband/hw/mlx4/mlx4_ib.h @@ -110,6 +110,10 @@ struct mlx4_ib_wq { unsigned tail; }; +enum qp_flags { + MLX4_QP_LSO = 1 << 0 +}; + struct mlx4_ib_qp { struct ib_qp ibqp; struct mlx4_qp mqp; @@ -133,6 +137,7 @@ struct mlx4_ib_qp { u8 resp_depth; u8 sq_no_prefetch; u8 state; + u32 flags; }; struct mlx4_ib_srq { diff --git a/drivers/infiniband/hw/mlx4/qp.c b/drivers/infiniband/hw/mlx4/qp.c index ca7cd04..a04e931 100644 --- a/drivers/infiniband/hw/mlx4/qp.c +++ b/drivers/infiniband/hw/mlx4/qp.c @@ -238,9 +238,12 @@ static int set_rq_size(struct mlx4_ib_dev *dev, struct ib_qp_cap *cap, return 0; } -static int set_kernel_sq_size(struct mlx4_ib_dev *dev, struct ib_qp_cap *cap, - enum ib_qp_type type, struct mlx4_ib_qp *qp) +static int set_kernel_sq_size(struct mlx4_ib_dev *dev, struct ib_qp_init_attr *init_attr, + struct mlx4_ib_qp *qp) { + struct ib_qp_cap *cap = &init_attr->cap; + enum ib_qp_type type = init_attr->qp_type; + /* Sanity check SQ size before proceeding */ if (cap->max_send_wr > dev->dev->caps.max_wqes || cap->max_send_sge > dev->dev->caps.max_sq_sg || @@ -256,6 +259,9 @@ static int set_kernel_sq_size(struct mlx4_ib_dev *dev, struct ib_qp_cap *cap, cap->max_send_sge + 2 > dev->dev->caps.max_sq_sg) return -EINVAL; + if (init_attr->create_flags & 
QP_CREATE_LSO) + qp->flags |= MLX4_QP_LSO; + qp->sq.wqe_shift = ilog2(roundup_pow_of_two(max(cap->max_send_sge * sizeof (struct mlx4_wqe_data_seg), cap->max_inline_data + @@ -371,7 +377,7 @@ static int create_qp_common(struct mlx4_ib_dev *dev, struct ib_pd *pd, } else { qp->sq_no_prefetch = 0; - err = set_kernel_sq_size(dev, &init_attr->cap, init_attr->qp_type, qp); + err = set_kernel_sq_size(dev, init_attr, qp); if (err) goto err; -- 1.5.3.8 From eli at mellanox.co.il Wed Jan 30 08:31:35 2008 From: eli at mellanox.co.il (Eli Cohen) Date: Wed, 30 Jan 2008 18:31:35 +0200 Subject: [ofa-general] [PATCH 11/16 v4] IB/mlx4: Add LSO support to mlx4 Message-ID: <1201710695.28794.177.camel@mtls03> IB/mlx4: Add LSO support to mlx4 Signed-off-by: Eli Cohen --- drivers/infiniband/hw/mlx4/cq.c | 3 ++ drivers/infiniband/hw/mlx4/main.c | 4 +++ drivers/infiniband/hw/mlx4/qp.c | 52 +++++++++++++++++++++++++++++++++--- drivers/net/mlx4/fw.c | 9 ++++++ drivers/net/mlx4/fw.h | 1 + drivers/net/mlx4/main.c | 1 + include/linux/mlx4/device.h | 1 + include/linux/mlx4/qp.h | 5 +++ 8 files changed, 71 insertions(+), 5 deletions(-) diff --git a/drivers/infiniband/hw/mlx4/cq.c b/drivers/infiniband/hw/mlx4/cq.c index 539c69c..75fc2b3 100644 --- a/drivers/infiniband/hw/mlx4/cq.c +++ b/drivers/infiniband/hw/mlx4/cq.c @@ -409,6 +409,9 @@ static int mlx4_ib_poll_one(struct mlx4_ib_cq *cq, case MLX4_OPCODE_BIND_MW: wc->opcode = IB_WC_BIND_MW; break; + case MLX4_OPCODE_LSO: + wc->opcode = IB_WC_LSO; + break; } } else { wc->byte_len = be32_to_cpu(cqe->byte_cnt); diff --git a/drivers/infiniband/hw/mlx4/main.c b/drivers/infiniband/hw/mlx4/main.c index 8ce94a1..2dd0de3 100644 --- a/drivers/infiniband/hw/mlx4/main.c +++ b/drivers/infiniband/hw/mlx4/main.c @@ -101,6 +101,8 @@ static int mlx4_ib_query_device(struct ib_device *ibdev, props->device_cap_flags |= IB_DEVICE_UD_AV_PORT_ENFORCE; if (dev->dev->caps.flags & MLX4_DEV_CAP_FLAG_IPOIB_CSUM) props->device_cap_flags |= IB_DEVICE_IP_CSUM; + if 
(dev->dev->caps.max_gso_sz) + props->device_cap_flags |= IB_DEVICE_TCP_TSO; props->vendor_id = be32_to_cpup((__be32 *) (out_mad->data + 36)) & 0xffffff; @@ -616,6 +618,8 @@ static void *mlx4_ib_add(struct mlx4_dev *dev) if (ibdev->dev->caps.flags & MLX4_DEV_CAP_FLAG_IPOIB_CSUM) ibdev->ib_dev.flags |= IB_DEVICE_IP_CSUM; + if (ibdev->dev->caps.max_gso_sz) + ibdev->ib_dev.flags |= IB_DEVICE_TCP_TSO; if (init_node_data(ibdev)) goto err_map; diff --git a/drivers/infiniband/hw/mlx4/qp.c b/drivers/infiniband/hw/mlx4/qp.c index a04e931..fc4811c 100644 --- a/drivers/infiniband/hw/mlx4/qp.c +++ b/drivers/infiniband/hw/mlx4/qp.c @@ -69,6 +69,7 @@ enum { static const __be32 mlx4_ib_opcode[] = { [IB_WR_SEND] = __constant_cpu_to_be32(MLX4_OPCODE_SEND), + [IB_WR_LSO] = __constant_cpu_to_be32(MLX4_OPCODE_LSO), [IB_WR_SEND_WITH_IMM] = __constant_cpu_to_be32(MLX4_OPCODE_SEND_IMM), [IB_WR_RDMA_WRITE] = __constant_cpu_to_be32(MLX4_OPCODE_RDMA_WRITE), [IB_WR_RDMA_WRITE_WITH_IMM] = __constant_cpu_to_be32(MLX4_OPCODE_RDMA_WRITE_IMM), @@ -243,6 +244,7 @@ static int set_kernel_sq_size(struct mlx4_ib_dev *dev, struct ib_qp_init_attr *i { struct ib_qp_cap *cap = &init_attr->cap; enum ib_qp_type type = init_attr->qp_type; + int reserve = 0; /* Sanity check SQ size before proceeding */ if (cap->max_send_wr > dev->dev->caps.max_wqes || @@ -259,15 +261,18 @@ static int set_kernel_sq_size(struct mlx4_ib_dev *dev, struct ib_qp_init_attr *i cap->max_send_sge + 2 > dev->dev->caps.max_sq_sg) return -EINVAL; - if (init_attr->create_flags & QP_CREATE_LSO) + if (init_attr->create_flags & QP_CREATE_LSO) { qp->flags |= MLX4_QP_LSO; + reserve = 64; + } qp->sq.wqe_shift = ilog2(roundup_pow_of_two(max(cap->max_send_sge * - sizeof (struct mlx4_wqe_data_seg), + sizeof (struct mlx4_wqe_data_seg) + + reserve, cap->max_inline_data + sizeof (struct mlx4_wqe_inline_seg)) + send_wqe_overhead(type))); - qp->sq.max_gs = ((1 << qp->sq.wqe_shift) - send_wqe_overhead(type)) / + qp->sq.max_gs = ((1 << qp->sq.wqe_shift) - 
reserve - send_wqe_overhead(type)) / sizeof (struct mlx4_wqe_data_seg); /* @@ -755,9 +760,11 @@ static int __mlx4_ib_modify_qp(struct ib_qp *ibqp, } } - if (ibqp->qp_type == IB_QPT_GSI || ibqp->qp_type == IB_QPT_SMI || - ibqp->qp_type == IB_QPT_UD) + if (ibqp->qp_type == IB_QPT_GSI || ibqp->qp_type == IB_QPT_SMI) context->mtu_msgmax = (IB_MTU_4096 << 5) | 11; + else if (ibqp->qp_type == IB_QPT_UD) + context->mtu_msgmax = (IB_MTU_4096 << 5) | + ilog2(dev->dev->caps.max_gso_sz); else if (attr_mask & IB_QP_PATH_MTU) { if (attr->path_mtu < IB_MTU_256 || attr->path_mtu > IB_MTU_4096) { printk(KERN_ERR "path MTU (%u) is invalid\n", @@ -1274,6 +1281,28 @@ static void __set_data_seg(struct mlx4_wqe_data_seg *dseg, struct ib_sge *sg) dseg->addr = cpu_to_be64(sg->addr); } +static int build_lso_seg(struct mlx4_lso_seg *wqe, struct ib_send_wr *wr, + struct mlx4_ib_qp *qp, int *lso_seg_len) +{ + int halign; + + halign = ALIGN(wr->wr.ud.hlen, 16); + if (unlikely(!(qp->flags & MLX4_QP_LSO) && wr->num_sge > qp->sq.max_gs - (halign >> 4))) + return -EINVAL; + + memcpy(wqe->header, wr->wr.ud.header, wr->wr.ud.hlen); + + /* make sure LSO header is written before + overwriting stamping */ + wmb(); + + wqe->mss_hdr_size = cpu_to_be32(((wr->wr.ud.mss - wr->wr.ud.hlen) + << 16) | wr->wr.ud.hlen); + + *lso_seg_len = halign; + return 0; +} + int mlx4_ib_post_send(struct ib_qp *ibqp, struct ib_send_wr *wr, struct ib_send_wr **bad_wr) { @@ -1364,6 +1393,19 @@ int mlx4_ib_post_send(struct ib_qp *ibqp, struct ib_send_wr *wr, set_datagram_seg(wqe, wr); wqe += sizeof (struct mlx4_wqe_datagram_seg); size += sizeof (struct mlx4_wqe_datagram_seg) / 16; + + if (wr->opcode == IB_WR_LSO) { + int hlen; + + err = build_lso_seg(wqe, wr, qp, &hlen); + if (err) { + *bad_wr = wr; + goto out; + } + wqe += hlen; + size += hlen >> 4; + } + break; case IB_QPT_SMI: diff --git a/drivers/net/mlx4/fw.c b/drivers/net/mlx4/fw.c index 736942f..7b426ff 100644 --- a/drivers/net/mlx4/fw.c +++ b/drivers/net/mlx4/fw.c @@ 
-133,6 +133,7 @@ int mlx4_QUERY_DEV_CAP(struct mlx4_dev *dev, struct mlx4_dev_cap *dev_cap) #define QUERY_DEV_CAP_MAX_AV_OFFSET 0x27 #define QUERY_DEV_CAP_MAX_REQ_QP_OFFSET 0x29 #define QUERY_DEV_CAP_MAX_RES_QP_OFFSET 0x2b +#define QUERY_DEV_CAP_MAX_GSO_OFFSET 0x2d #define QUERY_DEV_CAP_MAX_RDMA_OFFSET 0x2f #define QUERY_DEV_CAP_RSZ_SRQ_OFFSET 0x33 #define QUERY_DEV_CAP_ACK_DELAY_OFFSET 0x35 @@ -215,6 +216,13 @@ int mlx4_QUERY_DEV_CAP(struct mlx4_dev *dev, struct mlx4_dev_cap *dev_cap) dev_cap->max_requester_per_qp = 1 << (field & 0x3f); MLX4_GET(field, outbox, QUERY_DEV_CAP_MAX_RES_QP_OFFSET); dev_cap->max_responder_per_qp = 1 << (field & 0x3f); + MLX4_GET(field, outbox, QUERY_DEV_CAP_MAX_GSO_OFFSET); + field &= 0x1f; + if (!field) + dev_cap->max_gso_sz = 0; + else + dev_cap->max_gso_sz = 1 << field; + MLX4_GET(field, outbox, QUERY_DEV_CAP_MAX_RDMA_OFFSET); dev_cap->max_rdma_global = 1 << (field & 0x3f); MLX4_GET(field, outbox, QUERY_DEV_CAP_ACK_DELAY_OFFSET); @@ -377,6 +385,7 @@ int mlx4_QUERY_DEV_CAP(struct mlx4_dev *dev, struct mlx4_dev_cap *dev_cap) dev_cap->max_sq_desc_sz, dev_cap->max_sq_sg); mlx4_dbg(dev, "Max RQ desc size: %d, max RQ S/G: %d\n", dev_cap->max_rq_desc_sz, dev_cap->max_rq_sg); + mlx4_dbg(dev, "Max GSO size: %d\n", dev_cap->max_gso_sz); dump_dev_cap_flags(dev, dev_cap->flags); diff --git a/drivers/net/mlx4/fw.h b/drivers/net/mlx4/fw.h index 7e1dd9e..ad5abf3 100644 --- a/drivers/net/mlx4/fw.h +++ b/drivers/net/mlx4/fw.h @@ -96,6 +96,7 @@ struct mlx4_dev_cap { u8 bmme_flags; u32 reserved_lkey; u64 max_icm_sz; + int max_gso_sz; }; struct mlx4_adapter { diff --git a/drivers/net/mlx4/main.c b/drivers/net/mlx4/main.c index 89b3f0b..ed2c648 100644 --- a/drivers/net/mlx4/main.c +++ b/drivers/net/mlx4/main.c @@ -159,6 +159,7 @@ static int mlx4_dev_cap(struct mlx4_dev *dev, struct mlx4_dev_cap *dev_cap) dev->caps.page_size_cap = ~(u32) (dev_cap->min_page_sz - 1); dev->caps.flags = dev_cap->flags; dev->caps.stat_rate_support = dev_cap->stat_rate_support; 
+ dev->caps.max_gso_sz = dev_cap->max_gso_sz; return 0; } diff --git a/include/linux/mlx4/device.h b/include/linux/mlx4/device.h index 222815d..856570f 100644 --- a/include/linux/mlx4/device.h +++ b/include/linux/mlx4/device.h @@ -181,6 +181,7 @@ struct mlx4_caps { u32 flags; u16 stat_rate_support; u8 port_width_cap[MLX4_MAX_PORTS + 1]; + int max_gso_sz; }; struct mlx4_buf_list { diff --git a/include/linux/mlx4/qp.h b/include/linux/mlx4/qp.h index b4eb921..0bac8e8 100644 --- a/include/linux/mlx4/qp.h +++ b/include/linux/mlx4/qp.h @@ -215,6 +215,11 @@ struct mlx4_wqe_datagram_seg { __be32 reservd[2]; }; +struct mlx4_lso_seg { + __be32 mss_hdr_size; + __be32 header[0]; +}; + struct mlx4_wqe_bind_seg { __be32 flags1; __be32 flags2; -- 1.5.3.8 From eli at mellanox.co.il Wed Jan 30 08:31:39 2008 From: eli at mellanox.co.il (Eli Cohen) Date: Wed, 30 Jan 2008 18:31:39 +0200 Subject: [ofa-general] [PATCH 12/16 v4] IB/ipoib: Add ethtool support to IPOIB Message-ID: <1201710699.28794.178.camel@mtls03> IB/ipoib: Add ethtool support to IPOIB Just add the infrastructure to add functionality later. 
Signed-off-by: Eli Cohen --- drivers/infiniband/ulp/ipoib/Makefile | 3 +- drivers/infiniband/ulp/ipoib/ipoib.h | 2 + drivers/infiniband/ulp/ipoib/ipoib_etool.c | 55 ++++++++++++++++++++++++++++ drivers/infiniband/ulp/ipoib/ipoib_main.c | 2 + 4 files changed, 61 insertions(+), 1 deletions(-) create mode 100644 drivers/infiniband/ulp/ipoib/ipoib_etool.c diff --git a/drivers/infiniband/ulp/ipoib/Makefile b/drivers/infiniband/ulp/ipoib/Makefile index 98ee38e..83488ee 100644 --- a/drivers/infiniband/ulp/ipoib/Makefile +++ b/drivers/infiniband/ulp/ipoib/Makefile @@ -4,7 +4,8 @@ ib_ipoib-y := ipoib_main.o \ ipoib_ib.o \ ipoib_multicast.o \ ipoib_verbs.o \ - ipoib_vlan.o + ipoib_vlan.o \ + ipoib_etool.o ib_ipoib-$(CONFIG_INFINIBAND_IPOIB_CM) += ipoib_cm.o ib_ipoib-$(CONFIG_INFINIBAND_IPOIB_DEBUG) += ipoib_fs.o diff --git a/drivers/infiniband/ulp/ipoib/ipoib.h b/drivers/infiniband/ulp/ipoib/ipoib.h index 70f8b5c..ee7807f 100644 --- a/drivers/infiniband/ulp/ipoib/ipoib.h +++ b/drivers/infiniband/ulp/ipoib/ipoib.h @@ -526,6 +526,8 @@ void ipoib_pkey_poll(struct work_struct *work); int ipoib_pkey_dev_delay_open(struct net_device *dev); void ipoib_drain_cq(struct net_device *dev); +void ipoib_set_ethtool_ops(struct net_device *dev); + #ifdef CONFIG_INFINIBAND_IPOIB_CM #define IPOIB_FLAGS_RC 0x80 diff --git a/drivers/infiniband/ulp/ipoib/ipoib_etool.c b/drivers/infiniband/ulp/ipoib/ipoib_etool.c new file mode 100644 index 0000000..913aea0 --- /dev/null +++ b/drivers/infiniband/ulp/ipoib/ipoib_etool.c @@ -0,0 +1,55 @@ +/* + * Copyright (c) 2007 Mellanox Technologies. All rights reserved. + * + * This software is available to you under a choice of one of two + * licenses. 
You may choose to be licensed under the terms of the GNU + * General Public License (GPL) Version 2, available from the file + * COPYING in the main directory of this source tree, or the + * OpenIB.org BSD license below: + * + * Redistribution and use in source and binary forms, with or + * without modification, are permitted provided that the following + * conditions are met: + * + * - Redistributions of source code must retain the above + * copyright notice, this list of conditions and the following + * disclaimer. + * + * - Redistributions in binary form must reproduce the above + * copyright notice, this list of conditions and the following + * disclaimer in the documentation and/or other materials + * provided with the distribution. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, + * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF + * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND + * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS + * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN + * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN + * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE + * SOFTWARE. 
+ * + * $Id: ipoib_etool.c $ + */ + +#include +#include +#include + +#include "ipoib.h" + +static void ipoib_get_drvinfo(struct net_device *netdev, + struct ethtool_drvinfo *drvinfo) +{ + strncpy(drvinfo->driver, "ipoib", sizeof(drvinfo->driver) - 1); +} + +static const struct ethtool_ops ipoib_ethtool_ops = { + .get_drvinfo = ipoib_get_drvinfo, + .get_tso = ethtool_op_get_tso, +}; + +void ipoib_set_ethtool_ops(struct net_device *dev) +{ + SET_ETHTOOL_OPS(dev, &ipoib_ethtool_ops); +} diff --git a/drivers/infiniband/ulp/ipoib/ipoib_main.c b/drivers/infiniband/ulp/ipoib/ipoib_main.c index 9063f28..c842a07 100644 --- a/drivers/infiniband/ulp/ipoib/ipoib_main.c +++ b/drivers/infiniband/ulp/ipoib/ipoib_main.c @@ -954,6 +954,8 @@ static void ipoib_setup(struct net_device *dev) dev->set_multicast_list = ipoib_set_mcast_list; dev->neigh_setup = ipoib_neigh_setup_dev; + ipoib_set_ethtool_ops(dev); + netif_napi_add(dev, &priv->napi, ipoib_poll, 100); dev->watchdog_timeo = HZ; -- 1.5.3.8 From eli at mellanox.co.il Wed Jan 30 08:31:45 2008 From: eli at mellanox.co.il (Eli Cohen) Date: Wed, 30 Jan 2008 18:31:45 +0200 Subject: [ofa-general] [PATCH 13/16 v4] IB/core: Add support for modify CQ Message-ID: <1201710705.28794.179.camel@mtls03> IB/core: Add support for modify CQ Add support for modifying CQ parameters for controlling event generation moderation. Signed-off-by: Eli Cohen --- drivers/infiniband/core/verbs.c | 7 +++++++ include/rdma/ib_verbs.h | 11 +++++++++++ 2 files changed, 18 insertions(+), 0 deletions(-) diff --git a/drivers/infiniband/core/verbs.c b/drivers/infiniband/core/verbs.c index 86ed8af..84709ed 100644 --- a/drivers/infiniband/core/verbs.c +++ b/drivers/infiniband/core/verbs.c @@ -628,6 +628,13 @@ struct ib_cq *ib_create_cq(struct ib_device *device, } EXPORT_SYMBOL(ib_create_cq); +int ib_modify_cq(struct ib_cq *cq, u16 cq_count, u16 cq_period) +{ + return cq->device->modify_cq ? 
+ cq->device->modify_cq(cq, cq_count, cq_period) : -ENOSYS; +} +EXPORT_SYMBOL(ib_modify_cq); + int ib_destroy_cq(struct ib_cq *cq) { if (atomic_read(&cq->usecnt)) diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h index 44a5713..f6a8247 100644 --- a/include/rdma/ib_verbs.h +++ b/include/rdma/ib_verbs.h @@ -984,6 +984,8 @@ struct ib_device { int comp_vector, struct ib_ucontext *context, struct ib_udata *udata); + int (*modify_cq)(struct ib_cq *cq, u16 cq_count, + u16 cq_period); int (*destroy_cq)(struct ib_cq *cq); int (*resize_cq)(struct ib_cq *cq, int cqe, struct ib_udata *udata); @@ -1389,6 +1391,15 @@ struct ib_cq *ib_create_cq(struct ib_device *device, int ib_resize_cq(struct ib_cq *cq, int cqe); /** + * ib_modify_cq - Modifies moderation params of the CQ + * @cq: The CQ to modify. + * @cq_count: number of CQEs that will trigger an event + * @cq_period: max period of time in usec before triggering an event + * + */ +int ib_modify_cq(struct ib_cq *cq, u16 cq_count, u16 cq_period); + +/** * ib_destroy_cq - Destroys the specified CQ. * @cq: The CQ to destroy. */ -- 1.5.3.8 From eli at mellanox.co.il Wed Jan 30 08:31:49 2008 From: eli at mellanox.co.il (Eli Cohen) Date: Wed, 30 Jan 2008 18:31:49 +0200 Subject: [ofa-general] [PATCH 14/16 v4] IB/ipoib: Support modifying IPOIB CQ moderation params Message-ID: <1201710709.28794.180.camel@mtls03> IB/ipoib: Support modifying IPOIB CQ moderation params This can be used to tune, at run time, the parameters controlling the event (interrupt) generation rate, and thus reduce the overhead incurred by handling interrupts, resulting in better throughput. Since IPOIB uses a single CQ for both rx and tx, the rx values are chosen to dictate the configuration for both rx and tx.
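
A rough user-space sketch of that rx-dictates-both rule, for readers skimming the thread. This is not the driver code: struct coalesce_req, struct cq_moderation and apply_coalesce() are hypothetical, simplified stand-ins for the ethtool and CQ structures, and the 16-bit limit is assumed from the u16 arguments of ib_modify_cq().

```c
#include <stdint.h>

/* Hypothetical stand-in for the ethtool coalesce request (not the kernel struct). */
struct coalesce_req {
	uint32_t rx_usecs;
	uint32_t tx_usecs;
	uint32_t rx_max_frames;
	uint32_t tx_max_frames;
};

/* Hypothetical stand-in for the moderation state of the single shared CQ. */
struct cq_moderation {
	uint16_t period_usecs;
	uint16_t max_count;
};

/*
 * Apply a request to the shared CQ. Only the rx values are honoured;
 * the tx fields are overwritten so the caller sees what is in effect.
 * Returns 0 on success, -1 if an rx value does not fit in 16 bits
 * (in which case neither the request nor the CQ state is changed).
 */
static int apply_coalesce(struct coalesce_req *req, struct cq_moderation *cq)
{
	if (req->rx_usecs > 0xffff || req->rx_max_frames > 0xffff)
		return -1;

	cq->period_usecs = (uint16_t)req->rx_usecs;
	cq->max_count = (uint16_t)req->rx_max_frames;

	/* tx mirrors rx because both directions share one CQ */
	req->tx_usecs = req->rx_usecs;
	req->tx_max_frames = req->rx_max_frames;
	return 0;
}
```

With this shape, a request of rx 10 usec / 16 frames and any tx values ends up reporting 10 / 16 for tx as well, which is the behaviour the description above asks for.
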
Signed-off-by: Eli Cohen --- drivers/infiniband/ulp/ipoib/ipoib.h | 6 ++++ drivers/infiniband/ulp/ipoib/ipoib_etool.c | 46 ++++++++++++++++++++++++++++ 2 files changed, 52 insertions(+), 0 deletions(-) diff --git a/drivers/infiniband/ulp/ipoib/ipoib.h b/drivers/infiniband/ulp/ipoib/ipoib.h index ee7807f..3e8dceb 100644 --- a/drivers/infiniband/ulp/ipoib/ipoib.h +++ b/drivers/infiniband/ulp/ipoib/ipoib.h @@ -309,6 +309,11 @@ struct ipoib_cm_dev_priv { int num_frags; }; +struct ipoib_ethtool_st { + u16 coalesce_usecs; + u16 max_coalesced_frames; +}; + /* * Device private locking: tx_lock protects members used in TX fast * path (and we use LLTX so upper layers don't do extra locking). @@ -386,6 +391,7 @@ struct ipoib_dev_priv { struct dentry *mcg_dentry; struct dentry *path_dentry; #endif + struct ipoib_ethtool_st etool; }; struct ipoib_ah { diff --git a/drivers/infiniband/ulp/ipoib/ipoib_etool.c b/drivers/infiniband/ulp/ipoib/ipoib_etool.c index 913aea0..a3ac4cf 100644 --- a/drivers/infiniband/ulp/ipoib/ipoib_etool.c +++ b/drivers/infiniband/ulp/ipoib/ipoib_etool.c @@ -44,9 +44,55 @@ static void ipoib_get_drvinfo(struct net_device *netdev, strncpy(drvinfo->driver, "ipoib", sizeof(drvinfo->driver) - 1); } +static int ipoib_get_coalesce(struct net_device *dev, + struct ethtool_coalesce *coal) +{ + struct ipoib_dev_priv *priv = netdev_priv(dev); + + coal->rx_coalesce_usecs = priv->etool.coalesce_usecs; + coal->tx_coalesce_usecs = priv->etool.coalesce_usecs; + coal->rx_max_coalesced_frames = priv->etool.max_coalesced_frames; + coal->tx_max_coalesced_frames = priv->etool.max_coalesced_frames; + + return 0; +} + +static int ipoib_set_coalesce(struct net_device *dev, + struct ethtool_coalesce *coal) +{ + struct ipoib_dev_priv *priv = netdev_priv(dev); + int ret; + + /* + * since ipoib uses a single CQ for both rx and tx, + * we assume that rx params dictate the configuration. 
+ * These values are saved in the private data and returned + * when ipoib_get_coalesce is called + */ + if (coal->rx_coalesce_usecs > 0xffff || + coal->rx_max_coalesced_frames > 0xffff) + return -EINVAL; + + ret = ib_modify_cq(priv->cq, coal->rx_max_coalesced_frames, + coal->rx_coalesce_usecs); + if (ret) { + ipoib_dbg(priv, "failed modifying CQ\n"); + return ret; + } + + coal->tx_coalesce_usecs = coal->rx_coalesce_usecs; + priv->etool.coalesce_usecs = coal->rx_coalesce_usecs; + coal->tx_max_coalesced_frames = coal->rx_max_coalesced_frames; + priv->etool.max_coalesced_frames = coal->rx_max_coalesced_frames; + + return 0; +} + static const struct ethtool_ops ipoib_ethtool_ops = { .get_drvinfo = ipoib_get_drvinfo, .get_tso = ethtool_op_get_tso, + .get_coalesce = ipoib_get_coalesce, + .set_coalesce = ipoib_set_coalesce, }; void ipoib_set_ethtool_ops(struct net_device *dev) -- 1.5.3.8 From eli at mellanox.co.il Wed Jan 30 08:31:53 2008 From: eli at mellanox.co.il (Eli Cohen) Date: Wed, 30 Jan 2008 18:31:53 +0200 Subject: [ofa-general] [PATCH 15/16 v4] IB/mlx4: Add support for modifying CQ parameters Message-ID: <1201710714.28794.181.camel@mtls03> IB/mlx4: Add support for modifying CQ parameters Signed-off-by: Eli Cohen --- drivers/infiniband/hw/mlx4/cq.c | 19 +++++++++++++ drivers/infiniband/hw/mlx4/main.c | 1 + drivers/infiniband/hw/mlx4/mlx4_ib.h | 1 + drivers/net/mlx4/cq.c | 49 ++++++++++++++++++---------------- include/linux/mlx4/cmd.h | 2 +- include/linux/mlx4/cq.h | 25 +++++++++++++++++ 6 files changed, 73 insertions(+), 24 deletions(-) diff --git a/drivers/infiniband/hw/mlx4/cq.c b/drivers/infiniband/hw/mlx4/cq.c index 75fc2b3..66c0d6c 100644 --- a/drivers/infiniband/hw/mlx4/cq.c +++ b/drivers/infiniband/hw/mlx4/cq.c @@ -91,6 +91,25 @@ static struct mlx4_cqe *next_cqe_sw(struct mlx4_ib_cq *cq) return get_sw_cqe(cq, cq->mcq.cons_index); } +int mlx4_ib_modify_cq(struct ib_cq *cq, u16 cq_count, u16 cq_period) +{ + struct mlx4_ib_cq *mcq = to_mcq(cq); + struct 
mlx4_ib_dev *dev = to_mdev(cq->device); + struct mlx4_cq_context *context; + int err; + + context = kzalloc(sizeof *context, GFP_KERNEL); + if (!context) + return -ENOMEM; + + context->cq_period = cpu_to_be16(cq_period); + context->cq_max_count = cpu_to_be16(cq_count); + err = mlx4_cq_modify(dev->dev, &mcq->mcq, context, 1); + + kfree(context); + return err; +} + struct ib_cq *mlx4_ib_create_cq(struct ib_device *ibdev, int entries, int vector, struct ib_ucontext *context, struct ib_udata *udata) diff --git a/drivers/infiniband/hw/mlx4/main.c b/drivers/infiniband/hw/mlx4/main.c index 2dd0de3..6b00a81 100644 --- a/drivers/infiniband/hw/mlx4/main.c +++ b/drivers/infiniband/hw/mlx4/main.c @@ -601,6 +601,7 @@ static void *mlx4_ib_add(struct mlx4_dev *dev) ibdev->ib_dev.post_send = mlx4_ib_post_send; ibdev->ib_dev.post_recv = mlx4_ib_post_recv; ibdev->ib_dev.create_cq = mlx4_ib_create_cq; + ibdev->ib_dev.modify_cq = mlx4_ib_modify_cq; ibdev->ib_dev.destroy_cq = mlx4_ib_destroy_cq; ibdev->ib_dev.poll_cq = mlx4_ib_poll_cq; ibdev->ib_dev.req_notify_cq = mlx4_ib_arm_cq; diff --git a/drivers/infiniband/hw/mlx4/mlx4_ib.h b/drivers/infiniband/hw/mlx4/mlx4_ib.h index 39bc060..211eb56 100644 --- a/drivers/infiniband/hw/mlx4/mlx4_ib.h +++ b/drivers/infiniband/hw/mlx4/mlx4_ib.h @@ -252,6 +252,7 @@ struct ib_mr *mlx4_ib_reg_user_mr(struct ib_pd *pd, u64 start, u64 length, struct ib_udata *udata); int mlx4_ib_dereg_mr(struct ib_mr *mr); +int mlx4_ib_modify_cq(struct ib_cq *cq, u16 cq_count, u16 cq_period); struct ib_cq *mlx4_ib_create_cq(struct ib_device *ibdev, int entries, int vector, struct ib_ucontext *context, struct ib_udata *udata); diff --git a/drivers/net/mlx4/cq.c b/drivers/net/mlx4/cq.c index d4441fe..39004c5 100644 --- a/drivers/net/mlx4/cq.c +++ b/drivers/net/mlx4/cq.c @@ -38,33 +38,11 @@ #include #include +#include #include "mlx4.h" #include "icm.h" -struct mlx4_cq_context { - __be32 flags; - u16 reserved1[3]; - __be16 page_offset; - __be32 logsize_usrpage; - u8 
reserved2; - u8 cq_period; - u8 reserved3; - u8 cq_max_count; - u8 reserved4[3]; - u8 comp_eqn; - u8 log_page_size; - u8 reserved5[2]; - u8 mtt_base_addr_h; - __be32 mtt_base_addr_l; - __be32 last_notified_index; - __be32 solicit_producer_index; - __be32 consumer_index; - __be32 producer_index; - u32 reserved6[2]; - __be64 db_rec_addr; -}; - #define MLX4_CQ_STATUS_OK ( 0 << 28) #define MLX4_CQ_STATUS_OVERFLOW ( 9 << 28) #define MLX4_CQ_STATUS_WRITE_FAIL (10 << 28) @@ -121,6 +99,13 @@ static int mlx4_SW2HW_CQ(struct mlx4_dev *dev, struct mlx4_cmd_mailbox *mailbox, MLX4_CMD_TIME_CLASS_A); } +static int mlx4_MODIFY_CQ(struct mlx4_dev *dev, struct mlx4_cmd_mailbox *mailbox, + int cq_num, u32 opmod) +{ + return mlx4_cmd(dev, mailbox->dma, cq_num, opmod, MLX4_CMD_MODIFY_CQ, + MLX4_CMD_TIME_CLASS_A); +} + static int mlx4_HW2SW_CQ(struct mlx4_dev *dev, struct mlx4_cmd_mailbox *mailbox, int cq_num) { @@ -206,6 +191,24 @@ err_out: } EXPORT_SYMBOL_GPL(mlx4_cq_alloc); +int mlx4_cq_modify(struct mlx4_dev *dev, struct mlx4_cq *cq, + struct mlx4_cq_context *context, int modify) +{ + struct mlx4_cmd_mailbox *mailbox; + int err; + + mailbox = mlx4_alloc_cmd_mailbox(dev); + if (IS_ERR(mailbox)) + return PTR_ERR(mailbox); + + memcpy(mailbox->buf, context, sizeof *context); + err = mlx4_MODIFY_CQ(dev, mailbox, cq->cqn, modify); + + mlx4_free_cmd_mailbox(dev, mailbox); + return err; +} +EXPORT_SYMBOL_GPL(mlx4_cq_modify); + void mlx4_cq_free(struct mlx4_dev *dev, struct mlx4_cq *cq) { struct mlx4_priv *priv = mlx4_priv(dev); diff --git a/include/linux/mlx4/cmd.h b/include/linux/mlx4/cmd.h index 7d1eaa9..77323a7 100644 --- a/include/linux/mlx4/cmd.h +++ b/include/linux/mlx4/cmd.h @@ -81,7 +81,7 @@ enum { MLX4_CMD_SW2HW_CQ = 0x16, MLX4_CMD_HW2SW_CQ = 0x17, MLX4_CMD_QUERY_CQ = 0x18, - MLX4_CMD_RESIZE_CQ = 0x2c, + MLX4_CMD_MODIFY_CQ = 0x2c, /* SRQ commands */ MLX4_CMD_SW2HW_SRQ = 0x35, diff --git a/include/linux/mlx4/cq.h b/include/linux/mlx4/cq.h index 5fdc859..5d8625e 100644 --- 
a/include/linux/mlx4/cq.h +++ b/include/linux/mlx4/cq.h @@ -38,6 +38,27 @@ #include #include +struct mlx4_cq_context { + __be32 flags; + u16 reserved1[3]; + __be16 page_offset; + __be32 logsize_usrpage; + u16 cq_period; + u16 cq_max_count; + u8 reserved4[3]; + u8 comp_eqn; + u8 log_page_size; + u8 reserved5[2]; + u8 mtt_base_addr_h; + __be32 mtt_base_addr_l; + __be32 last_notified_index; + __be32 solicit_producer_index; + __be32 consumer_index; + __be32 producer_index; + u32 reserved6[2]; + __be64 db_rec_addr; +}; + struct mlx4_cqe { __be32 my_qpn; __be32 immed_rss_invalid; @@ -120,4 +141,8 @@ enum { MLX4_CQ_DB_REQ_NOT = 2 << 24 }; + +int mlx4_cq_modify(struct mlx4_dev *dev, struct mlx4_cq *cq, + struct mlx4_cq_context *context, int resize); + #endif /* MLX4_CQ_H */ -- 1.5.3.8 From eli at mellanox.co.il Wed Jan 30 08:31:57 2008 From: eli at mellanox.co.il (Eli Cohen) Date: Wed, 30 Jan 2008 18:31:57 +0200 Subject: [ofa-general] [PATCH 16/16 v4] IB/ipoib: Set default CQ moderation parameters Message-ID: <1201710717.28794.182.camel@mtls03> IB/ipoib: Set default CQ moderation parameters Set the default params to make sure they are applied. These params give better performance. Signed-off-by: Eli Cohen --- Below are two runs of netperf, the first with moderation params disabled and the second with moderation params enabled.
command: netperf -H 14.4.3.175 -t UDP_STREAM -- -m 128 1) moderation disabled UDP UNIDIRECTIONAL SEND TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 14.4.3.175 (14.4.3.175) port 0 AF_INET Socket Message Elapsed Messages Size Size Time Okay Errors Throughput bytes bytes secs # # 10^6bits/sec 114688 128 10.01 3138731 0 321.23 114688 10.01 3138722 321.23 2) moderation set to 10 usec, 16 frames 114688 128 10.00 3747251 0 383.59 114688 10.00 3739630 382.81 drivers/infiniband/ulp/ipoib/ipoib_verbs.c | 12 ++++++++++++ 1 files changed, 12 insertions(+), 0 deletions(-) diff --git a/drivers/infiniband/ulp/ipoib/ipoib_verbs.c b/drivers/infiniband/ulp/ipoib/ipoib_verbs.c index e20f2af..ba21ea6 100644 --- a/drivers/infiniband/ulp/ipoib/ipoib_verbs.c +++ b/drivers/infiniband/ulp/ipoib/ipoib_verbs.c @@ -34,6 +34,7 @@ */ #include "ipoib.h" +#include int ipoib_mcast_attach(struct net_device *dev, u16 mlid, union ib_gid *mgid) { @@ -157,6 +158,7 @@ int ipoib_transport_dev_init(struct net_device *dev, struct ib_device *ca) }; int i, ret, size; + struct ethtool_coalesce *coal; priv->pd = ib_alloc_pd(priv->ca); if (IS_ERR(priv->pd)) { @@ -185,6 +187,16 @@ int ipoib_transport_dev_init(struct net_device *dev, struct ib_device *ca) goto out_free_mr; } + coal = kzalloc(sizeof *coal, GFP_KERNEL); + if (coal) { + coal->rx_coalesce_usecs = 10; + coal->tx_coalesce_usecs = 10; + coal->rx_max_coalesced_frames = 16; + coal->tx_max_coalesced_frames = 16; + dev->ethtool_ops->set_coalesce(dev, coal); + kfree(coal); + } + if (ib_req_notify_cq(priv->cq, IB_CQ_NEXT_COMP)) goto out_free_cq; -- 1.5.3.8 From tziporet at dev.mellanox.co.il Wed Jan 30 08:40:10 2008 From: tziporet at dev.mellanox.co.il (Tziporet Koren) Date: Wed, 30 Jan 2008 18:40:10 +0200 Subject: [ewg] Re: [ofa-general] OFED Jan 28 meeting summary on RC3 readiness In-Reply-To: <1201639285.28486.101.camel@firewall.xsintricity.com> References: <6C2C79E72C305246B504CBA17B5500C903368531@mtlexch01.mtl.com> 
<1201639285.28486.101.camel@firewall.xsintricity.com> Message-ID: <47A0A86A.5060003@mellanox.co.il> Doug Ledford wrote: > > Hmmm...I'd like to put my $.02 in here. I don't have any visibility > into what drives the OFED schedule, so I have no clue as to why people > don't want to slip the schedule for this change. I'm sure you guys have > your reasons. However, I also happen to be a consumer of this code, and > I know for a fact that no one has gotten my input on this issue. So, > the deal is that I'm currently integrating OFED 1.3 into what will be > RHEL5.2. The RHEL5.2 freeze date has already passed, but in order to > keep what finally goes out from being too stale, I'm being allowed to > submit the OFED-1.3-rc1 code prior to freeze, and then update to > OFED-1.3 final during our beta test process. What this means, is that > anything you punt from 1.3 to 1.3.1, you are also punting out of RHEL5.2 > and RHEL4.7. So, that being said, there's a whole trickle down effect > with various groups that would really like to be able to use 5.2 out of > the box that may prefer a slip in 1.3 so that this can be part of it > instead of punting to 1.3.1. I'm not saying this will change your mind, > but I'm sure it wasn't part of the decision process before, so I'm > bringing it up. > Thanks for the input (BTW you are welcome to join our weekly meetings and give us feedback online) I think it is important to make sure RH new versions will include best OFED release This my suggestion is: * Delay 1.3 release in a week * Do RC4 next week - Feb 6 * Add RC5 on Feb 18 - this will be the GOLD version * GA release on Feb 25 All - please reply if this is acceptable > > > 760 major eli at mellanox.co.il UDP performance on Rx is lower > than Tx - for 1.3.1 > 761 major eli at mellanox.co.il Poor and jittery UDP > performance at small messages - for 1.3.1 > > > Ditto for requesting these two be in 1.3. 
We've already had customers > bring up the UDP performance issue in our previous releases. > > We will push some fixes of these to RC4 if the above plan is accepted Tziporet From tziporet at dev.mellanox.co.il Wed Jan 30 08:42:51 2008 From: tziporet at dev.mellanox.co.il (Tziporet Koren) Date: Wed, 30 Jan 2008 18:42:51 +0200 Subject: [ofa-general] Bonding and hw_csum In-Reply-To: <47A07CC9.8030005@voltaire.com> References: <479DFA58.7050800@intec.ugent.be> <47A07CC9.8030005@voltaire.com> Message-ID: <47A0A90B.40506@mellanox.co.il> Or Gerlitz wrote: > > This is interesting report, however, since currently the hw checksum > patch in not being submitted to the mainline kernel and it is also > about to be removed from ofed 1.3 (Tziporet, can you update on that?), > I am not going to look into that. > > Or. > the hw checksum patch was removed from OFED 1.3 Tziporet From rdreier at cisco.com Wed Jan 30 08:47:55 2008 From: rdreier at cisco.com (Roland Dreier) Date: Wed, 30 Jan 2008 08:47:55 -0800 Subject: [ofa-general] [PATCH 2/3] RDMA/cxgb3: fix page shift calculation in build_phys_page_list() In-Reply-To: (Bryan S. Rosenburg's message of "Mon, 28 Jan 2008 11:55:00 -0500") References: Message-ID: Thanks... > - /* First check that we have enough alignment */ > - if ((*iova_start & ~PAGE_MASK) != (buffer_list[0].addr & ~PAGE_MASK)) > - return ERR_PTR(-EINVAL); I don't think we want to remove this check. It prevents someone from trying to register a region with a virtual address that the hardware can't handle. > I'm still annoyed by the (num_phys_buf == 1) special case. I'm wondering > if it's still needed. If you leave out that if-statement entirely, you > may end up using a page size that is larger (maybe much larger) than > necessary, but I think things will still work, given that the > virtual-to-physical alignment constraints are respected. If you remove > the special case, you can replace the whole loop with an ffs() call. Makes sense... 
let me post a patch for discussion. From rosnbrg at us.ibm.com Wed Jan 30 09:00:21 2008 From: rosnbrg at us.ibm.com (Bryan S Rosenburg) Date: Wed, 30 Jan 2008 12:00:21 -0500 Subject: [ofa-general] [PATCH 2/3] RDMA/cxgb3: fix page shift calculation in build_phys_page_list() In-Reply-To: Message-ID: Roland Dreier wrote on 01/30/2008 11:47:55 AM: > > > - /* First check that we have enough alignment */ > > - if ((*iova_start & ~PAGE_MASK) != (buffer_list[0].addr & ~PAGE_MASK)) > > - return ERR_PTR(-EINVAL); > > I don't think we want to remove this check. It prevents someone from > trying to register a region with a virtual address that the hardware > can't handle. This initial check is redundant, given: mask = buffer_list[0].addr ^ *iova_start; and subsequent: if (mask & ~PAGE_MASK) return ERR_PTR(-EINVAL); - Bryan From rdreier at cisco.com Wed Jan 30 09:01:14 2008 From: rdreier at cisco.com (Roland Dreier) Date: Wed, 30 Jan 2008 09:01:14 -0800 Subject: [ofa-general] [PATCH 2/3] RDMA/cxgb3: fix page shift calculation in build_phys_page_list() In-Reply-To: (Bryan S. Rosenburg's message of "Wed, 30 Jan 2008 12:00:21 -0500") References: Message-ID: > This initial check is redundant, given: > > mask = buffer_list[0].addr ^ *iova_start; > > and subsequent: > > if (mask & ~PAGE_MASK) > return ERR_PTR(-EINVAL); Yes, good point. Thanks! From Jeffrey.C.Becker at nasa.gov Wed Jan 30 09:32:09 2008 From: Jeffrey.C.Becker at nasa.gov (Jeff Becker) Date: Wed, 30 Jan 2008 09:32:09 -0800 Subject: [ofa-general] Re: Status of NFS-RDMA ? In-Reply-To: References: Message-ID: <47A0B499.5030208@nasa.gov> Hi all. James Lentini wrote: > On Wed, 30 Jan 2008, Krishna Kumar2 wrote: > > >> Hi James, >> >> Since you had mentioned in an earlier email that NFS-RDMA server >> side will be present in OFED1.4, >> > > Actually, that was Tziporet.
> > >> do you know if any port of the server code to OFED1.3 (when it comes >> out) will happen? Is there any effort for that, any work ongoing, >> any help required, etc? >> > > Jeff Becker had looked into this. We would definitely appreciate the > help. > I have set up a git tree for NFSoRDMA and succesfully merged it with, and built it on OFED 1.3-rcx. I'm currently doing the backports (SLES 10 SP1 first). All this is in preparation for OFED 1.4, as that is when NFSoRDMA will be included in OFED. I think I have this patching/backporting stuff under control. However, my testing resources are limited. Thus depending on your platform, I might be able to point you at OFED 1.3 based bits for testing if/when they are ready. Thanks. -jeff > The NFS framework has changed significantly in several areas in recent > kernel releases. This has made backporting the NFS/RDMA code to older > kernels challenging. > > If you are interested in working on OFED1.3 support, let us know. > > >> I couldn't find the release time lines for OFED1.4, is there any >> link on openfabrics homepage? >> > > I'm not involved with the OFED1.4 planning. Tziporet, is there > information on this? > > >> Thanks, >> >> - KK >> >> general-bounces at lists.openfabrics.org wrote on 01/29/2008 08:23:46 PM: >> >> >>> On Tue, 29 Jan 2008, Pawel Dziekonski wrote: >>> >>> >>>> On Mon, 28 Jan 2008 at 10:14:22AM -0500, James Lentini wrote: >>>> >>>>> On Sat, 26 Jan 2008, Pawel Dziekonski wrote: >>>>> >>>>> >>>>>> I pulled Tom's tree from new url and build a kernel. >>>>>> >>>>> If you enabled support for INFINIBAND drivers (IB and iWARP support) >>>>> and NFS client/server support, the kernel should be ready to go (run >>>>> "grep RDMA /your_kernel_sources/.config" to confirm that >>>>> CONFIG_SUNRPC_XPRT_RDMA is either m or y). >>>>> >>>>> NFS/RDMA doesn't require OFED be installed. OFED is a release of the >>>>> Linux kernel sources and some userspace libraries/tools. 
If you are >>>>> >>>>>> then I downloaded OFED from >>>>>> http://www.mellanox.com/downloads/NFSoRDMA/OFED-1.2-NFS-RDMA.gz, >>>>>> >>>>> I don't know what the above URL contains. The latest code is in Tom >>>>> Tucker's tree (and now NFS server maintainer Bruce Fields tree). It >>>>> >> is >> >>>> hi, >>>> >>>> back to subject on a proper mailing list. >>>> >>>> I have a >3 year experience with mellanox hardware and IBGold so I >>>> basically know what OFED is all about. up to now i was only using >>>> IBGold since IB drivers appeared in kernel pretty recently. >>>> >>> You'll want to use the mainline kernel's IB drivers for NFS/RDMA. >>> We've been developing the NFS/RDMA software on the OpenFabrics (aka >>> OpenIB) code since it was merged into 2.6.10 in Dec 2004. >>> >>> >>>> currently I have new hardware. I'm running Tom's kernel and already >>>> did some MPI tests. SDP is not working, probably because sdp kernel >>>> modules where not build. ;) I understand that those modules are only >>>> available from ofa-kernel. please correct me if i'm wrong. >>>> >>> Correct. SDP has never been submitted to mainline Linux. >>> >>> >>>> system is Scientic Linux 4.5, which is supposed to be a fully >>>> compatible RH4 clone. hardware is Supermicro mobos with Mellanox >>>> MT25204 and Flextronisc switch. >>>> >>>> error log from ofa-kernel build: >>>> >>> Is your goal to build a kernel with an NFS/RDMA server? If so, the >>> kernel sources from Tom Tucker's git tree are the ones you want, not >>> the old OFED 1.2-based packages which are out of date. >>> >>> Did you try setting up the NFS/RDMA server on the kernel used for your >>> MPI tests above? 
>>> >>> >>>>>> make[1]: Entering directory `/usr/src/ib/xprt-switch-2.6' >>>>>> test -e include/linux/autoconf.h -a -e include/config/auto.conf || >>>>>> >> ( \ >> >>>>>> echo; \ >>>>>> echo " ERROR: Kernel configuration is invalid."; \ >>>>>> echo " include/linux/autoconf.h or include/config/auto.conf >>>>>> >> are >> >>> missing."; \ >>> >>>>>> echo " Run 'make oldconfig && make prepare' on kernel src >>>>>> >> to fix it."; \ >> >>>>>> echo; \ >>>>>> /bin/false) >>>>>> >>>>>> obviously, doing 'make oldconfig && make prepare' does not help. >>>>>> anyway, above mentioned files do exist: >>>>>> >>>>>> # ls -la /usr/src/ib/xprt-switch-2.6/{include/linux/autoconf.h, >>>>>> >>> include/config/auto.conf} >>> >>>>>> -rw-r--r-- 1 root root 10156 Jan 25 17:42 >>>>>> >> /usr/src/ib/xprt-switch-2. >> >>> 6/include/config/auto.conf >>> >>>>>> -rw-r--r-- 1 root root 14733 Jan 25 17:42 >>>>>> >> /usr/src/ib/xprt-switch-2. >> >>> 6/include/linux/autoconf.h >>> >>>>>> despite of above, compilation continues but fails with: >>>>>> >>>>>> gcc -Wp,-MD,/var/tmp/OFEDRPM/BUILD/ofa_kernel-1. >>>>>> >>> 2/drivers/infiniband/core/.mad.o.d -nostdinc -isystem >>> >> /usr/lib/gcc/x86_64- >> >>> redhat-linux/3.4.6/include -D__KERNEL__ >>> >> -I/var/tmp/OFEDRPM/BUILD/ofa_kernel-1. 
>> >>> 2/include -I/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2 >>> >> /drivers/infiniband/include >> >>> -Iinclude -include include/linux/autoconf.h -include >>> /var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2/include/linux/autoconf.h -Wall >>> >> -Wundef >> >>> -Wstrict-prototypes -Wno-trigraphs -fno-strict-aliasing -fno-common >>> >> -Werror- >> >>> implicit-function-declaration -Os -m64 -mno-red-zone -mcmodel=kernel >>> >> -pipe - >> >>> Wno-sign-compare -fno-asynchronous-unwind-tables -funit-at-a-time >>> >> -mno-sse - >> >>> mno-mmx -mno-sse2 -mno-3dnow -maccumulate-outgoing-args -DCONFIG_AS_CFI=1 >>> >> - >> >>> DCONFIG_AS_CFI_SIGNAL_FRAME=1 -fomit-frame-pointer -Wdeclaration-after- >>> statement -DMODULE -D"KBUILD_STR(s)=#s" - >>> D"KBUILD_BASENAME=KBUILD_STR(mad)" -D"KBUILD_MODNAME=KBUILD_STR(ib_mad)" >>> >> -c - >> >>> o /var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2/drivers/infiniband/core/.! >>> tmp >>> >>>> _mad.o /var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2 >>>> >> /drivers/infiniband/core/mad.c >> >>>>>> /var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2 >>>>>> >> /drivers/infiniband/core/mad.c: In >> >>> function `ib_mad_init_module': >>> >>>>>> /var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2 >>>>>> >> /drivers/infiniband/core/mad.c: >> >>> 2966: error: too many arguments to function `kmem_cache_create' >>> >>>>>> make[4]: *** [/var/tmp/OFEDRPM/BUILD/ofa_kernel-1. >>>>>> >>> 2/drivers/infiniband/core/mad.o] Error 1 >>> >>>>>> make[3]: *** [/var/tmp/OFEDRPM/BUILD/ofa_kernel-1. 
>>>>>> >>> 2/drivers/infiniband/core] Error 2 >>> >>>>>> make[2]: *** [/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2 >>>>>> >> /drivers/infiniband] Error 2 >> >>>>>> make[1]: *** [_module_/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2] Error >>>>>> >> 2 >> >>>>>> make[1]: Leaving directory `/usr/src/ib/xprt-switch-2.6' >>>>>> make: *** [kernel] Error 2 >>>>>> error: Bad exit status from /var/tmp/rpm-tmp.3877 (%install) >>>>>> >>>>>> full log: >>>>>> https://cefeid.wcss.wroc.pl/d/tmp/OFED.build.32122.log >>>>>> >>>> thanks in advance for any help, P >>>> >>>> >>>> -- >>>> Pawel Dziekonski >>>> Wroclaw Centre for Networking & Supercomputing, HPC Department >>>> Politechnika Wr., pl. Grunwaldzki 9, bud. D2/101, 50-377 Wroclaw, >>>> >> POLAND >> >>>> phone: +48 71 3202043, fax: +48 71 3225797, http://www.wcss.wroc.pl >>>> >>>> >>>> >> ------------------------------------------------------------------------- >> >>>> This SF.net email is sponsored by: Microsoft >>>> Defy all challenges. Microsoft(R) Visual Studio 2008. >>>> http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/ >>>> _______________________________________________ >>>> nfs-rdma-devel mailing list >>>> nfs-rdma-devel at lists.sourceforge.net >>>> https://lists.sourceforge.net/lists/listinfo/nfs-rdma-devel >>>> >>>> >>> _______________________________________________ >>> general mailing list >>> general at lists.openfabrics.org >>> http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general >>> >>> To unsubscribe, please visit >>> >> http://openib.org/mailman/listinfo/openib-general >> >> From landman at scalableinformatics.com Wed Jan 30 09:43:40 2008 From: landman at scalableinformatics.com (Joe Landman) Date: Wed, 30 Jan 2008 12:43:40 -0500 Subject: [ofa-general] Re: Status of NFS-RDMA ? 
In-Reply-To: <47A0B499.5030208@nasa.gov> References: <47A0B499.5030208@nasa.gov> Message-ID: <47A0B74C.50708@scalableinformatics.com> Jeff Becker wrote: > I have set up a git tree for NFSoRDMA and succesfully merged it with, > and built it on OFED 1.3-rcx. I'm currently doing the backports (SLES 10 > SP1 first). All this is in preparation for OFED 1.4, as that is when > NFSoRDMA will be included in OFED. I think I have this > patching/backporting stuff under control. However, my testing resources > are limited. Thus depending on your platform, I might be able to point > you at OFED 1.3 based bits for testing if/when they are ready. Thanks. > > -jeff Hi Jeff: We would be interested in helping to test out those bits. Let me know if you need this. Joe -- Joseph Landman, Ph.D Founder and CEO Scalable Informatics LLC, email: landman at scalableinformatics.com web : http://www.scalableinformatics.com http://jackrabbit.scalableinformatics.com phone: +1 734 786 8423 fax : +1 866 888 3112 cell : +1 734 612 4615 From changquing.tang at hp.com Wed Jan 30 10:05:53 2008 From: changquing.tang at hp.com (Tang, Changqing) Date: Wed, 30 Jan 2008 18:05:53 +0000 Subject: [ewg] Re: [ofa-general] OFED Jan 28 meeting summary on RC3 readiness In-Reply-To: <47A0A86A.5060003@mellanox.co.il> References: <6C2C79E72C305246B504CBA17B5500C903368531@mtlexch01.mtl.com> <1201639285.28486.101.camel@firewall.xsintricity.com> <47A0A86A.5060003@mellanox.co.il> Message-ID: When do you pack the official RC3 ? Thanks. --CQ > -----Original Message----- > From: general-bounces at lists.openfabrics.org > [mailto:general-bounces at lists.openfabrics.org] On Behalf Of > Tziporet Koren > Sent: Wednesday, January 30, 2008 10:40 AM > To: Doug Ledford > Cc: ewg at lists.openfabrics.org; general at lists.openfabrics.org > Subject: Re: [ewg] Re: [ofa-general] OFED Jan 28 meeting > summary on RC3 readiness > > Doug Ledford wrote: > > > > Hmmm...I'd like to put my $.02 in here. 
I don't have any > visibility > > into what drives the OFED schedule, so I have no clue as to > why people > > don't want to slip the schedule for this change. I'm sure you guys > > have your reasons. However, I also happen to be a consumer of this > > code, and I know for a fact that no one has gotten my input on this > > issue. So, the deal is that I'm currently integrating OFED > 1.3 into > > what will be RHEL5.2. The RHEL5.2 freeze date has already > passed, but > > in order to keep what finally goes out from being too > stale, I'm being > > allowed to submit the OFED-1.3-rc1 code prior to freeze, and then > > update to > > OFED-1.3 final during our beta test process. What this > means, is that > > anything you punt from 1.3 to 1.3.1, you are also punting out of > > RHEL5.2 and RHEL4.7. So, that being said, there's a whole trickle > > down effect with various groups that would really like to > be able to > > use 5.2 out of the box that may prefer a slip in 1.3 so > that this can > > be part of it instead of punting to 1.3.1. I'm not saying > this will > > change your mind, but I'm sure it wasn't part of the > decision process > > before, so I'm bringing it up. > > > Thanks for the input (BTW you are welcome to join our weekly > meetings and give us feedback online) I think it is important > to make sure RH new versions will include best OFED release > > This my suggestion is: > > * Delay 1.3 release in a week > * Do RC4 next week - Feb 6 > * Add RC5 on Feb 18 - this will be the GOLD version > * GA release on Feb 25 > > > All - please reply if this is acceptable > > > > > > 760 major eli at mellanox.co.il UDP performance on > Rx is lower > > than Tx - for 1.3.1 > > 761 major eli at mellanox.co.il Poor and jittery UDP > > performance at small messages - for 1.3.1 > > > > > > Ditto for requesting these two be in 1.3. We've already > had customers > > bring up the UDP performance issue in our previous releases. 
> > > > > We will push some fixes of these to RC4 if the above plan is accepted > > Tziporet > _______________________________________________ > general mailing list > general at lists.openfabrics.org > http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general > > To unsubscribe, please visit > http://openib.org/mailman/listinfo/openib-general > From ardavis at ichips.intel.com Wed Jan 30 10:37:20 2008 From: ardavis at ichips.intel.com (Arlin Davis) Date: Wed, 30 Jan 2008 10:37:20 -0800 Subject: [ofa-general] [ANNOUCE] dapl 2.0.5 release In-Reply-To: <000501c862db$3b9d17d0$a5e0180a@amr.corp.intel.com> References: <000501c862db$3b9d17d0$a5e0180a@amr.corp.intel.com> Message-ID: <47A0C3E0.1000605@ichips.intel.com> Arlin Davis wrote: > There is new release for dapl 2.0 available on the OFA download page and > in my git tree. > > Changes to allow both v1 and v2 development packages to be installed on > the same system. > v2 libdat.so has been renamed to libdat2.so. > > md5sum: 010459e421a5c194438d58b1ccf1c6d0 dapl-2.0.5.tar.gz > > Vlad, please pull new v2 release into OFED 1.3 RC3 and install the > following packages: > > Note: please make sure dapl-1.2.4-devel is added to list. > Vlad, I noticed that the daily build added dapl-2.0.5 but did not add dapl-1.2.4-devel to install script. Please add for next build. A missing libdat.so will break most v1 uDAPL consumers. Thanks, -arlin From sashak at voltaire.com Wed Jan 30 10:54:41 2008 From: sashak at voltaire.com (Sasha Khapyorsky) Date: Wed, 30 Jan 2008 18:54:41 +0000 Subject: [ofa-general] [PATCH 1/2] opensm: wait_for_pending_transaction() generalization Message-ID: <20080130185441.GU11277@sashak.voltaire.com> Function wait_for_pending_transaction() is global now and moved from PerfMgr to StateMgr, all related objects are generalized. 
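
For readers unfamiliar with the pattern being generalized here, a minimal pthread sketch of "wait until no transactions are pending" follows. It is an illustration only and not the OpenSM implementation: struct pending_stats and the pending_*() helpers are hypothetical names, and the forced-exit flag is an assumption about how a shutdown path might wake the waiter.

```c
#include <pthread.h>

/* Hypothetical shared state; a stand-in, not OpenSM's osm_stats_t. */
struct pending_stats {
	pthread_mutex_t mutex;
	pthread_cond_t cond;
	unsigned outstanding;	/* requests sent but not yet answered */
	int exit_flag;		/* set to make waiters give up */
};

static void pending_init(struct pending_stats *s)
{
	pthread_mutex_init(&s->mutex, NULL);
	pthread_cond_init(&s->cond, NULL);
	s->outstanding = 0;
	s->exit_flag = 0;
}

/* Account for one request put on the wire. */
static void pending_sent(struct pending_stats *s)
{
	pthread_mutex_lock(&s->mutex);
	s->outstanding++;
	pthread_mutex_unlock(&s->mutex);
}

/* Account for one response (or timeout); wake waiters when drained. */
static void pending_done(struct pending_stats *s)
{
	pthread_mutex_lock(&s->mutex);
	if (s->outstanding && --s->outstanding == 0)
		pthread_cond_broadcast(&s->cond);
	pthread_mutex_unlock(&s->mutex);
}

/* Shutdown path: wake waiters even though requests are still out. */
static void pending_force_exit(struct pending_stats *s)
{
	pthread_mutex_lock(&s->mutex);
	s->exit_flag = 1;
	pthread_cond_broadcast(&s->cond);
	pthread_mutex_unlock(&s->mutex);
}

/* Block until nothing is outstanding. Returns 0 if drained, -1 if forced out. */
static int pending_wait(struct pending_stats *s)
{
	int ret;

	pthread_mutex_lock(&s->mutex);
	while (s->outstanding && !s->exit_flag)
		pthread_cond_wait(&s->cond, &s->mutex);
	ret = s->outstanding ? -1 : 0;
	pthread_mutex_unlock(&s->mutex);
	return ret;
}
```

Keeping the mutex and condition variable next to the counters, rather than inside one manager, is what lets any component wait on the same state, which appears to be the point of moving these objects out of the PerfMgr-only build.
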
Signed-off-by: Sasha Khapyorsky
---
 opensm/include/opensm/osm_stats.h |    4 ----
 opensm/opensm/main.c              |    2 --
 opensm/opensm/osm_opensm.c        |   16 +++++++++++++++-
 opensm/opensm/osm_perfmgr.c       |   35 ++---------------------------------
 opensm/opensm/osm_sm_mad_ctrl.c   |    2 --
 opensm/opensm/osm_state_mgr.c     |   18 ++++++++++++++++++
 6 files changed, 35 insertions(+), 42 deletions(-)

diff --git a/opensm/include/opensm/osm_stats.h b/opensm/include/opensm/osm_stats.h
index b5100f2..ecd752b 100644
--- a/opensm/include/opensm/osm_stats.h
+++ b/opensm/include/opensm/osm_stats.h
@@ -48,13 +48,11 @@
 #ifndef _OSM_STATS_H_
 #define _OSM_STATS_H_
-#ifdef ENABLE_OSM_PERF_MGR
 #ifdef HAVE_LIBPTHREAD
 #include
 #else
 #include
 #endif
-#endif
 #include
 #include
@@ -100,14 +98,12 @@ typedef struct _osm_stats {
 	atomic32_t sa_mads_sent;
 	atomic32_t sa_mads_rcvd_unknown;
 	atomic32_t sa_mads_ignored;
-#ifdef ENABLE_OSM_PERF_MGR
 #ifdef HAVE_LIBPTHREAD
 	pthread_mutex_t mutex;
 	pthread_cond_t cond;
 #else
 	cl_event_t event;
 #endif
-#endif
 } osm_stats_t;
 /*
 * FIELDS
diff --git a/opensm/opensm/main.c b/opensm/opensm/main.c
index 7c435a0..703a242 100644
--- a/opensm/opensm/main.c
+++ b/opensm/opensm/main.c
@@ -1044,13 +1044,11 @@ int main(int argc, char *argv[])
 		fprintf(stdout,
 			"There are still %u MADs out. Forcing the exit of the OpenSM application...\n",
 			osm.mad_pool.mads_out);
-#ifdef ENABLE_OSM_PERF_MGR
 #ifdef HAVE_LIBPTHREAD
 		pthread_cond_signal(&osm.stats.cond);
 #else
 		cl_event_signal(&osm.stats.event);
 #endif
-#endif
 	}
 Exit:
diff --git a/opensm/opensm/osm_opensm.c b/opensm/opensm/osm_opensm.c
index a78307c..fa517d0 100644
--- a/opensm/opensm/osm_opensm.c
+++ b/opensm/opensm/osm_opensm.c
@@ -236,7 +236,12 @@ void osm_opensm_destroy(IN osm_opensm_t * const p_osm)
 	osm_vendor_delete(&p_osm->p_vendor);
 	osm_subn_destroy(&p_osm->subn);
 	cl_disp_destroy(&p_osm->disp);
-
+#ifdef HAVE_LIBPTHREAD
+	pthread_cond_destroy(&p_osm->stats.cond);
+	pthread_mutex_destroy(&p_osm->stats.mutex);
+#else
+	cl_event_destroy(&p_osm->stats.event);
+#endif
 	close_node_name_map(p_osm->node_name_map);
 	cl_plock_destroy(&p_osm->lock);
@@ -277,6 +282,15 @@ osm_opensm_init(IN osm_opensm_t * const p_osm,
 	if (status != IB_SUCCESS)
 		goto Exit;
+#ifdef HAVE_LIBPTHREAD
+	pthread_mutex_init(&p_osm->stats.mutex, NULL);
+	pthread_cond_init(&p_osm->stats.cond, NULL);
+#else
+	status = cl_event_init(&p_osm->stats.event, FALSE);
+	if (status != IB_SUCCESS)
+		goto Exit;
+#endif
+
 	if (p_opt->single_thread) {
 		osm_log(&p_osm->log, OSM_LOG_INFO,
 			"osm_opensm_init: Forcing single threaded dispatcher\n");
diff --git a/opensm/opensm/osm_perfmgr.c b/opensm/opensm/osm_perfmgr.c
index 6c0c6cd..dd6e662 100644
--- a/opensm/opensm/osm_perfmgr.c
+++ b/opensm/opensm/osm_perfmgr.c
@@ -117,6 +117,8 @@ static inline void diff_time(struct timeval *before,
 #endif
+extern int wait_for_pending_transactions(osm_stats_t * stats);
+
 /**********************************************************************
  * Internal helper functions.
  **********************************************************************/
@@ -717,24 +719,6 @@ static int sweep_hop_0(osm_sm_t * const sm)
 	return (status);
 }
-static int wait_for_pending_transactions(osm_stats_t * stats)
-{
-#ifdef HAVE_LIBPTHREAD
-	pthread_mutex_lock(&stats->mutex);
-	while (stats->qp0_mads_outstanding && !osm_exit_flag)
-		pthread_cond_wait(&stats->cond, &stats->mutex);
-	pthread_mutex_unlock(&stats->mutex);
-#else
-	while (1) {
-		unsigned count = stats->qp0_mads_outstanding;
-		if (!count || osm_exit_flag)
-			break;
-		cl_event_wait_on(&stats->event, EVENT_NO_TIMEOUT, TRUE);
-	}
-#endif
-	return osm_exit_flag;
-}
-
 static void reset_node_count(cl_map_item_t * const p_map_item, void *cxt)
 {
 	osm_node_t *p_node = (osm_node_t *) p_map_item;
@@ -898,12 +882,6 @@ void osm_perfmgr_destroy(osm_perfmgr_t * const pm)
 	free(pm->event_db_dump_file);
 	perfmgr_db_destroy(pm->db);
 	cl_timer_destroy(&pm->sweep_timer);
-#ifdef HAVE_LIBPTHREAD
-	pthread_cond_destroy(&pm->subn->p_osm->stats.cond);
-	pthread_mutex_destroy(&pm->subn->p_osm->stats.mutex);
-#else
-	cl_event_destroy(&pm->subn->p_osm->stats.event);
-#endif
 	OSM_LOG_EXIT(pm->log);
 }
@@ -1300,15 +1278,6 @@ osm_perfmgr_init(osm_perfmgr_t * const pm,
 	pm->max_outstanding_queries = p_opt->perfmgr_max_outstanding_queries;
 	pm->event_plugin = event_plugin;
-#ifdef HAVE_LIBPTHREAD
-	pthread_mutex_init(&subn->p_osm->stats.mutex, NULL);
-	pthread_cond_init(&subn->p_osm->stats.cond, NULL);
-#else
-	status = cl_event_init(&subn->p_osm->stats.event, FALSE);
-	if (status != IB_SUCCESS)
-		goto Exit;
-#endif
-
 	status = cl_timer_init(&pm->sweep_timer, perfmgr_sweep, pm);
 	if (status != IB_SUCCESS)
 		goto Exit;
diff --git a/opensm/opensm/osm_sm_mad_ctrl.c b/opensm/opensm/osm_sm_mad_ctrl.c
index 2638357..c6624a1 100644
--- a/opensm/opensm/osm_sm_mad_ctrl.c
+++ b/opensm/opensm/osm_sm_mad_ctrl.c
@@ -108,13 +108,11 @@ __osm_sm_mad_ctrl_retire_trans_mad(IN osm_sm_mad_ctrl_t * const p_ctrl,
 		osm_sm_signal(&p_ctrl->p_subn->p_osm->sm,
 			      OSM_SIGNAL_NO_PENDING_TRANSACTIONS);
-#ifdef ENABLE_OSM_PERF_MGR
 #ifdef HAVE_LIBPTHREAD
 		pthread_cond_signal(&p_ctrl->p_stats->cond);
 #else
 		cl_event_signal(&p_ctrl->p_stats->event);
 #endif
-#endif
 	}
 	OSM_LOG_EXIT(p_ctrl->p_log);
diff --git a/opensm/opensm/osm_state_mgr.c b/opensm/opensm/osm_state_mgr.c
index 93fd880..3746389 100644
--- a/opensm/opensm/osm_state_mgr.c
+++ b/opensm/opensm/osm_state_mgr.c
@@ -1314,6 +1314,24 @@ static void __osm_state_mgr_check_tbl_consistency(IN osm_state_mgr_t *
 /**********************************************************************
  **********************************************************************/
+int wait_for_pending_transactions(osm_stats_t * stats)
+{
+#ifdef HAVE_LIBPTHREAD
+	pthread_mutex_lock(&stats->mutex);
+	while (stats->qp0_mads_outstanding && !osm_exit_flag)
+		pthread_cond_wait(&stats->cond, &stats->mutex);
+	pthread_mutex_unlock(&stats->mutex);
+#else
+	while (1) {
+		unsigned count = stats->qp0_mads_outstanding;
+		if (!count || osm_exit_flag)
+			break;
+		cl_event_wait_on(&stats->event, EVENT_NO_TIMEOUT, TRUE);
+	}
+#endif
+	return osm_exit_flag;
+}
+
 void osm_state_mgr_process(IN osm_state_mgr_t * const p_mgr,
 			   IN osm_signal_t signal)
 {
-- 
1.5.4.rc2.60.gb2e62

From sashak at voltaire.com Wed Jan 30 10:56:06 2008
From: sashak at voltaire.com (Sasha Khapyorsky)
Date: Wed, 30 Jan 2008 18:56:06 +0000
Subject: [ofa-general] [PATCH 2/2] opensm: OpenSM state machine rework
In-Reply-To: <20080130185441.GU11277@sashak.voltaire.com>
References: <20080130185441.GU11277@sashak.voltaire.com>
Message-ID: <20080130185606.GV11277@sashak.voltaire.com>

Instead of a tricky state machine, this patch implements a plain-flow
do_sweep() function, which uses the wait_for_pending_transactions() blocker.
One of the goals of this patch is to preserve the original OpenSM behavior.
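
To illustrate the plain-flow idea described above, here is a minimal
sketch: each stage runs in order, and after every stage the code blocks
until outstanding requests drain, instead of recording a state and
waiting for a signal to re-enter a switch. It is not the patch itself;
struct sweep_ctx, run_sweep(), stage_ok() and stage_fail() are
hypothetical names, and the blocker here is only a stub that reports a
shutdown request.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical sweep context; the real one would carry the subnet objects. */
struct sweep_ctx {
	int exit_requested;	/* set when the daemon is shutting down */
	int stages_done;	/* how many stages ran to completion */
};

typedef int (*stage_fn)(struct sweep_ctx *ctx);

/*
 * Stub blocker: a real one would sleep until all outstanding requests
 * are answered. Returns non-zero when the sweep should be abandoned.
 */
static int wait_for_pending(struct sweep_ctx *ctx)
{
	return ctx->exit_requested;
}

/*
 * Run the stages in order. A stage failure or a shutdown request ends
 * the sweep early; there is no state variable to resume from later.
 */
static int run_sweep(struct sweep_ctx *ctx, const stage_fn *stages, size_t n)
{
	size_t i;

	assert(ctx != NULL && stages != NULL);
	for (i = 0; i < n; i++) {
		if (stages[i](ctx) != 0)
			return -1;
		if (wait_for_pending(ctx))
			return -1;
		ctx->stages_done++;
	}
	return 0;
}

/* Placeholder stages standing in for discovery, LID assignment and so on. */
static int stage_ok(struct sweep_ctx *ctx)
{
	(void)ctx;
	return 0;
}

static int stage_fail(struct sweep_ctx *ctx)
{
	(void)ctx;
	return 1;
}
```

The trade-off in such a design is that the sweeping thread is blocked for
the duration of a sweep, in exchange for control flow that reads top to
bottom.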
Signed-off-by: Sasha Khapyorsky
---
 opensm/include/opensm/osm_base.h    |   51 +--
 opensm/include/opensm/osm_sm.h      |    1 +
 opensm/opensm/osm_console.c         |   64 --
 opensm/opensm/osm_helper.c          |   51 +--
 opensm/opensm/osm_node_info_rcv.c   |   11 +-
 opensm/opensm/osm_perfmgr.c         |   12 +-
 opensm/opensm/osm_port_info_rcv.c   |    3 +-
 opensm/opensm/osm_sm.c              |    3 +
 opensm/opensm/osm_sm_mad_ctrl.c     |    6 +-
 opensm/opensm/osm_sminfo_rcv.c      |   17 +-
 opensm/opensm/osm_state_mgr.c       | 1176 ++++++++---------------------------
 opensm/opensm/osm_sw_info_rcv.c     |    3 +-
 opensm/opensm/osm_sweep_fail_ctrl.c |    2 +-
 13 files changed, 286 insertions(+), 1114 deletions(-)

diff --git a/opensm/include/opensm/osm_base.h b/opensm/include/opensm/osm_base.h
index aaf9930..6f784ca 100644
--- a/opensm/include/opensm/osm_base.h
+++ b/opensm/include/opensm/osm_base.h
@@ -753,39 +753,7 @@ typedef enum _osm_sm_state {
 	OSM_SM_STATE_NO_STATE = 0,
 	OSM_SM_STATE_INIT,
 	OSM_SM_STATE_IDLE,
-	OSM_SM_STATE_SWEEP_LIGHT,
-	OSM_SM_STATE_SWEEP_LIGHT_WAIT,
-	OSM_SM_STATE_SWEEP_HEAVY_SELF,
-	OSM_SM_STATE_SWEEP_HEAVY_SUBNET,
-	OSM_SM_STATE_SET_SM_UCAST_LID,
-	OSM_SM_STATE_SET_SM_UCAST_LID_WAIT,
-	OSM_SM_STATE_SET_SM_UCAST_LID_DONE,
-	OSM_SM_STATE_SET_SUBNET_UCAST_LIDS,
-	OSM_SM_STATE_SET_SUBNET_UCAST_LIDS_WAIT,
-	OSM_SM_STATE_SET_SUBNET_UCAST_LIDS_DONE,
-	OSM_SM_STATE_SET_UCAST_TABLES,
-	OSM_SM_STATE_SET_UCAST_TABLES_WAIT,
-	OSM_SM_STATE_SET_UCAST_TABLES_DONE,
-	OSM_SM_STATE_SET_MCAST_TABLES,
-	OSM_SM_STATE_SET_MCAST_TABLES_WAIT,
-	OSM_SM_STATE_SET_MCAST_TABLES_DONE,
-	OSM_SM_STATE_SET_LINK_PORTS,
-	OSM_SM_STATE_SET_LINK_PORTS_WAIT,
-	OSM_SM_STATE_SET_LINK_PORTS_DONE,
-	OSM_SM_STATE_SET_ARMED,
-	OSM_SM_STATE_SET_ARMED_WAIT,
-	OSM_SM_STATE_SET_ARMED_DONE,
-	OSM_SM_STATE_SET_ACTIVE,
-	OSM_SM_STATE_SET_ACTIVE_WAIT,
 	OSM_SM_STATE_STANDBY,
-	OSM_SM_STATE_SUBNET_UP,
-	OSM_SM_STATE_PROCESS_REQUEST,
-	OSM_SM_STATE_PROCESS_REQUEST_WAIT,
-	OSM_SM_STATE_PROCESS_REQUEST_DONE,
-	OSM_SM_STATE_MASTER_OR_HIGHER_SM_DETECTED,
-	OSM_SM_STATE_SET_PKEY,
-	OSM_SM_STATE_SET_PKEY_WAIT,
-	OSM_SM_STATE_SET_PKEY_DONE,
 	OSM_SM_STATE_MAX
 } osm_sm_state_t;
 /***********/
@@ -804,17 +772,14 @@ typedef enum _osm_sm_state {
 */
 #define OSM_SIGNAL_NONE 0
 #define OSM_SIGNAL_SWEEP 1
-#define OSM_SIGNAL_CHANGE_DETECTED 2
-#define OSM_SIGNAL_NO_PENDING_TRANSACTIONS 3
-#define OSM_SIGNAL_DONE 4
-#define OSM_SIGNAL_DONE_PENDING 5
-#define OSM_SIGNAL_LIGHT_SWEEP_FAIL 6
-#define OSM_SIGNAL_IDLE_TIME_PROCESS 7
-#define OSM_SIGNAL_IDLE_TIME_PROCESS_REQUEST 8
-#define OSM_SIGNAL_MASTER_OR_HIGHER_SM_DETECTED 9
-#define OSM_SIGNAL_EXIT_STBY 10
-#define OSM_SIGNAL_PERFMGR_SWEEP 11
-#define OSM_SIGNAL_MAX 12
+#define OSM_SIGNAL_IDLE_TIME_PROCESS_REQUEST 2
+#define OSM_SIGNAL_EXIT_STBY 3
+#define OSM_SIGNAL_PERFMGR_SWEEP 4
+#define OSM_SIGNAL_MAX 4
+
+/* status values for sweep managers - can be removed later */
+#define OSM_SIGNAL_DONE 16
+#define OSM_SIGNAL_DONE_PENDING 17
 typedef uintn_t osm_signal_t;
 /***********/
diff --git a/opensm/include/opensm/osm_sm.h b/opensm/include/opensm/osm_sm.h
index e0b3d01..2cdbdd0 100644
--- a/opensm/include/opensm/osm_sm.h
+++ b/opensm/include/opensm/osm_sm.h
@@ -120,6 +120,7 @@ typedef struct osm_sm {
 	cl_timer_t sweep_timer;
 	cl_event_wheel_t trap_aging_tracker;
 	cl_thread_t sweeper;
+	unsigned master_sm_found;
 	osm_subn_t *p_subn;
 	osm_db_t *p_db;
 	osm_vendor_t *p_vendor;
diff --git a/opensm/opensm/osm_console.c b/opensm/opensm/osm_console.c
index d0a632f..bb2a68d 100644
--- a/opensm/opensm/osm_console.c
+++ b/opensm/opensm/osm_console.c
@@ -300,72 +300,8 @@ static char *sm_state_mgr_str(osm_sm_state_t state)
 		return ("Init");
 	case OSM_SM_STATE_IDLE:
 		return ("Idle");
-	case OSM_SM_STATE_SWEEP_LIGHT:
-		return ("Sweep Light");
-	case OSM_SM_STATE_SWEEP_LIGHT_WAIT:
-		return ("Sweep Light Wait");
-	case OSM_SM_STATE_SWEEP_HEAVY_SELF:
-		return ("Sweep Heavy Self");
-	case OSM_SM_STATE_SWEEP_HEAVY_SUBNET:
-		return ("Sweep Heavy Subnet");
-	case OSM_SM_STATE_SET_SM_UCAST_LID:
-		return ("Set SM UCAST LID");
-	case OSM_SM_STATE_SET_SM_UCAST_LID_WAIT:
-		return ("Set SM UCAST LID Wait");
-	case OSM_SM_STATE_SET_SM_UCAST_LID_DONE:
-		return ("Set SM UCAST LID Done");
-	case OSM_SM_STATE_SET_SUBNET_UCAST_LIDS:
-		return ("Set Subnet UCAST LIDS");
-	case OSM_SM_STATE_SET_SUBNET_UCAST_LIDS_WAIT:
-		return ("Set Subnet UCAST LIDS Wait");
-	case OSM_SM_STATE_SET_SUBNET_UCAST_LIDS_DONE:
-		return ("Set Subnet UCAST LIDS Done");
-	case OSM_SM_STATE_SET_UCAST_TABLES:
-		return ("Set UCAST Tables");
-	case OSM_SM_STATE_SET_UCAST_TABLES_WAIT:
-		return ("Set UCAST Tables Wait");
-	case OSM_SM_STATE_SET_UCAST_TABLES_DONE:
-		return ("Set UCAST Tables Done");
-	case OSM_SM_STATE_SET_MCAST_TABLES:
-		return ("Set MCAST Tables");
-	case OSM_SM_STATE_SET_MCAST_TABLES_WAIT:
-		return ("Set MCAST Tables Wait");
-	case OSM_SM_STATE_SET_MCAST_TABLES_DONE:
-		return ("Set MCAST Tables Done");
-	case OSM_SM_STATE_SET_LINK_PORTS:
-		return ("Set Link Ports");
-	case OSM_SM_STATE_SET_LINK_PORTS_WAIT:
-		return ("Set Link Ports Wait");
-	case OSM_SM_STATE_SET_LINK_PORTS_DONE:
-		return ("Set Link Ports Done");
-	case OSM_SM_STATE_SET_ARMED:
-		return ("Set Armed");
-	case OSM_SM_STATE_SET_ARMED_WAIT:
-		return ("Set Armed Wait");
-	case OSM_SM_STATE_SET_ARMED_DONE:
-		return ("Set Armed Done");
-	case OSM_SM_STATE_SET_ACTIVE:
-		return ("Set Active");
-	case OSM_SM_STATE_SET_ACTIVE_WAIT:
-		return ("Set Active Wait");
 	case OSM_SM_STATE_STANDBY:
 		return ("Standby");
-	case OSM_SM_STATE_SUBNET_UP:
-		return ("Subnet Up");
-	case OSM_SM_STATE_PROCESS_REQUEST:
-		return ("Process Request");
-	case OSM_SM_STATE_PROCESS_REQUEST_WAIT:
-		return ("Process Request Wait");
-	case OSM_SM_STATE_PROCESS_REQUEST_DONE:
-		return ("Process Request Done");
-	case OSM_SM_STATE_MASTER_OR_HIGHER_SM_DETECTED:
-		return ("Master or Higher SM Detected");
-	case OSM_SM_STATE_SET_PKEY:
-		return ("Set PKey");
-	case OSM_SM_STATE_SET_PKEY_WAIT:
-		return ("Set PKey Wait");
-	case OSM_SM_STATE_SET_PKEY_DONE:
-		return ("Set PKey Done");
 	default:
 		return ("Unknown State");
 	}
diff --git a/opensm/opensm/osm_helper.c b/opensm/opensm/osm_helper.c
index 1ea86b9..bd345bc 100644
--- a/opensm/opensm/osm_helper.c
+++ b/opensm/opensm/osm_helper.c
@@ -2062,56 +2062,17 @@ const char *const __osm_sm_state_str[] = {
 	"OSM_SM_STATE_NO_STATE",	/* 0 */
 	"OSM_SM_STATE_INIT",	/* 1 */
 	"OSM_SM_STATE_IDLE",	/* 2 */
-	"OSM_SM_STATE_SWEEP_LIGHT",	/* 3 */
-	"OSM_SM_STATE_SWEEP_LIGHT_WAIT",	/* 4 */
-	"OSM_SM_STATE_SWEEP_HEAVY_SELF",	/* 5 */
-	"OSM_SM_STATE_SWEEP_HEAVY_SUBNET",	/* 6 */
-	"OSM_SM_STATE_SET_SM_UCAST_LID",	/* 7 */
-	"OSM_SM_STATE_SET_SM_UCAST_LID_WAIT",	/* 8 */
-	"OSM_SM_STATE_SET_SM_UCAST_LID_DONE",	/* 9 */
-	"OSM_SM_STATE_SET_SUBNET_UCAST_LIDS",	/* 10 */
-	"OSM_SM_STATE_SET_SUBNET_UCAST_LIDS_WAIT",	/* 11 */
-	"OSM_SM_STATE_SET_SUBNET_UCAST_LIDS_DONE",	/* 12 */
-	"OSM_SM_STATE_SET_UCAST_TABLES",	/* 13 */
-	"OSM_SM_STATE_SET_UCAST_TABLES_WAIT",	/* 14 */
-	"OSM_SM_STATE_SET_UCAST_TABLES_DONE",	/* 15 */
-	"OSM_SM_STATE_SET_MCAST_TABLES",	/* 16 */
-	"OSM_SM_STATE_SET_MCAST_TABLES_WAIT",	/* 17 */
-	"OSM_SM_STATE_SET_MCAST_TABLES_DONE",	/* 18 */
-	"OSM_SM_STATE_SET_LINK_PORTS",	/* 19 */
-	"OSM_SM_STATE_SET_LINK_PORTS_WAIT",	/* 20 */
-	"OSM_SM_STATE_SET_LINK_PORTS_DONE",	/* 21 */
-	"OSM_SM_STATE_SET_ARMED",	/* 22 */
-	"OSM_SM_STATE_SET_ARMED_WAIT",	/* 23 */
-	"OSM_SM_STATE_SET_ARMED_DONE",	/* 24 */
-	"OSM_SM_STATE_SET_ACTIVE",	/* 25 */
-	"OSM_SM_STATE_SET_ACTIVE_WAIT",	/* 26 */
-	"OSM_SM_STATE_STANDBY",	/* 27 */
-	"OSM_SM_STATE_SUBNET_UP",	/* 28 */
-	"OSM_SM_STATE_PROCESS_REQUEST",	/* 29 */
-	"OSM_SM_STATE_PROCESS_REQUEST_WAIT",	/* 30 */
-	"OSM_SM_STATE_PROCESS_REQUEST_DONE",	/* 31 */
-	"OSM_SM_STATE_MASTER_OR_HIGHER_SM_DETECTED",	/* 32 */
-	"OSM_SM_STATE_SET_PKEY",	/* 33 */
-	"OSM_SM_STATE_SET_PKEY_WAIT",	/* 34 */
-	"OSM_SM_STATE_SET_PKEY_DONE",	/* 35 */
-	"UNKNOWN STATE!!"	/* 36 */
+	"OSM_SM_STATE_STANDBY",	/* 3 */
+	"UNKNOWN STATE!!"	/* 4 */
 };
 const char *const __osm_sm_signal_str[] = {
 	"OSM_SIGNAL_NONE",	/* 0 */
 	"OSM_SIGNAL_SWEEP",	/* 1 */
-	"OSM_SIGNAL_CHANGE_DETECTED",	/* 2 */
-	"OSM_SIGNAL_NO_PENDING_TRANSACTIONS",	/* 3 */
-	"OSM_SIGNAL_DONE",	/* 4 */
-	"OSM_SIGNAL_DONE_PENDING",	/* 5 */
-	"OSM_SIGNAL_LIGHT_SWEEP_FAIL",	/* 6 */
-	"OSM_SIGNAL_IDLE_TIME_PROCESS",	/* 7 */
-	"OSM_SIGNAL_IDLE_TIME_PROCESS_REQUEST",	/* 8 */
-	"OSM_SIGNAL_MASTER_OR_HIGHER_SM_DETECTED",	/* 9 */
-	"OSM_SIGNAL_EXIT_STBY",	/* 10 */
-	"OSM_SIGNAL_PERFMGR_SWEEP",	/* 11 */
-	"UNKNOWN SIGNAL!!"	/* 12 */
+	"OSM_SIGNAL_IDLE_TIME_PROCESS_REQUEST",	/* 2 */
+	"OSM_SIGNAL_EXIT_STBY",	/* 3 */
+	"OSM_SIGNAL_PERFMGR_SWEEP",	/* 4 */
+	"UNKNOWN SIGNAL!!"	/* 5 */
 };
 /**********************************************************************
diff --git a/opensm/opensm/osm_node_info_rcv.c b/opensm/opensm/osm_node_info_rcv.c
index 3ac8d1f..2106aa2 100644
--- a/opensm/opensm/osm_node_info_rcv.c
+++ b/opensm/opensm/osm_node_info_rcv.c
@@ -814,7 +814,6 @@ void osm_ni_rcv_process(IN void *context, IN void *data)
 	ib_node_info_t *p_ni;
 	ib_smp_t *p_smp;
 	osm_node_t *p_node;
-	boolean_t process_new_flag = FALSE;
 	CL_ASSERT(sm);
@@ -856,20 +855,12 @@ void osm_ni_rcv_process(IN void *context, IN void *data)
 	if (!p_node) {
 		__osm_ni_rcv_process_new(sm, p_madw);
-		process_new_flag = TRUE;
+		sm->p_subn->force_heavy_sweep = 1;
 	} else
 		__osm_ni_rcv_process_existing(sm, p_node, p_madw);
 	CL_PLOCK_RELEASE(sm->p_lock);
-	/*
-	 * If we processed a new node - need to signal to the SM that
-	 * change detected.
-	 */
-	if (process_new_flag)
-		osm_sm_signal(&sm->p_subn->p_osm->sm,
-			      OSM_SIGNAL_CHANGE_DETECTED);
-
 Exit:
 	OSM_LOG_EXIT(sm->p_log);
 }
diff --git a/opensm/opensm/osm_perfmgr.c b/opensm/opensm/osm_perfmgr.c
index dd6e662..9480ad7 100644
--- a/opensm/opensm/osm_perfmgr.c
+++ b/opensm/opensm/osm_perfmgr.c
@@ -740,7 +740,6 @@ static void reset_switch_count(cl_map_item_t * const p_map_item, void *cxt)
 static int perfmgr_discovery(osm_opensm_t * osm)
 {
-	unsigned signals = osm->sm.signal_mask;
 	int ret;
 	CL_PLOCK_ACQUIRE(&osm->lock);
@@ -772,17 +771,10 @@ static int perfmgr_discovery(osm_opensm_t * osm)
 	if (wait_for_pending_transactions(&osm->stats))
 		goto _exit;
-      _drop:
+_drop:
 	osm_drop_mgr_process(&osm->sm.drop_mgr);
-      _exit:
-	/* dirty hack: cleanup signal mask -
-	 * this will not be needed later with both discoveries merged */
-	cl_spinlock_acquire(&osm->sm.signal_lock);
-	osm->sm.signal_mask &= ~(OSM_SIGNAL_NO_PENDING_TRANSACTIONS |
-				 OSM_SIGNAL_CHANGE_DETECTED);
-	osm->sm.signal_mask |= signals;
-	cl_spinlock_release(&osm->sm.signal_lock);
+_exit:
 	return ret;
 }
diff --git a/opensm/opensm/osm_port_info_rcv.c b/opensm/opensm/osm_port_info_rcv.c
index e56ba51..356cd56 100644
--- a/opensm/opensm/osm_port_info_rcv.c
+++ b/opensm/opensm/osm_port_info_rcv.c
@@ -578,8 +578,7 @@ void osm_pi_rcv_process(IN void *context, IN void *data)
 			"GUID 0x%" PRIx64 " port 0x%016" PRIx64
 			", Commencing heavy sweep\n",
 			cl_ntoh64(node_guid),
 			cl_ntoh64(port_guid));
-		osm_sm_signal(&sm->p_subn->p_osm->sm,
-			      OSM_SIGNAL_CHANGE_DETECTED);
+		sm->p_subn->force_heavy_sweep = 1;
 		goto Exit;
 	}
diff --git a/opensm/opensm/osm_sm.c b/opensm/opensm/osm_sm.c
index f2d259d..019fa51 100644
--- a/opensm/opensm/osm_sm.c
+++ b/opensm/opensm/osm_sm.c
@@ -121,6 +121,9 @@ static void __osm_sm_sweeper(IN void *p_ptr)
 			continue;
 		}
+		if (osm_exit_flag)
+			break;
+
 		cl_spinlock_acquire(&p_sm->signal_lock);
 		signals = p_sm->signal_mask;
 		p_sm->signal_mask = 0;
diff --git a/opensm/opensm/osm_sm_mad_ctrl.c b/opensm/opensm/osm_sm_mad_ctrl.c
index c6624a1..efbe97a 100644
--- a/opensm/opensm/osm_sm_mad_ctrl.c
+++ b/opensm/opensm/osm_sm_mad_ctrl.c
@@ -103,11 +103,7 @@ __osm_sm_mad_ctrl_retire_trans_mad(IN osm_sm_mad_ctrl_t * const p_ctrl,
 		   Signal the subnet manager.
 		 */
 		osm_log(p_ctrl->p_log, OSM_LOG_DEBUG,
-			"__osm_sm_mad_ctrl_retire_trans_mad: "
-			"signal OSM_SIGNAL_NO_PENDING_TRANSACTIONS\n");
-
-		osm_sm_signal(&p_ctrl->p_subn->p_osm->sm,
-			      OSM_SIGNAL_NO_PENDING_TRANSACTIONS);
+			"__osm_sm_mad_ctrl_retire_trans_mad: wire is clean.\n");
 #ifdef HAVE_LIBPTHREAD
 		pthread_cond_signal(&p_ctrl->p_stats->cond);
 #else
diff --git a/opensm/opensm/osm_sminfo_rcv.c b/opensm/opensm/osm_sminfo_rcv.c
index 63cc393..4b4c2b1 100644
--- a/opensm/opensm/osm_sminfo_rcv.c
+++ b/opensm/opensm/osm_sminfo_rcv.c
@@ -346,7 +346,6 @@ __osm_sminfo_rcv_process_get_sm(IN osm_sm_t * sm,
 				IN const osm_remote_sm_t * const p_sm)
 {
 	const ib_sm_info_t *p_smi;
-	osm_signal_t ret_val = OSM_SIGNAL_NONE;
 	OSM_LOG_ENTER(sm->p_log, __osm_sminfo_rcv_process_get_sm);
@@ -370,7 +369,7 @@ __osm_sminfo_rcv_process_get_sm(IN osm_sm_t * sm,
 	case IB_SMINFO_STATE_NOTACTIVE:
 		break;
 	case IB_SMINFO_STATE_MASTER:
-		ret_val = OSM_SIGNAL_MASTER_OR_HIGHER_SM_DETECTED;
+		sm->master_sm_found = 1;
 		/* save on the p_sm_state_mgr the guid of the current master. */
 		osm_log(sm->p_log, OSM_LOG_VERBOSE,
 			"__osm_sminfo_rcv_process_get_sm: "
@@ -383,8 +382,7 @@ __osm_sminfo_rcv_process_get_sm(IN osm_sm_t * sm,
 		if (__osm_sminfo_rcv_remote_sm_is_higher(sm, p_smi) == TRUE) {
 			/* the remote is a higher sm - need to stop sweeping */
-			ret_val =
-			    OSM_SIGNAL_MASTER_OR_HIGHER_SM_DETECTED;
+			sm->master_sm_found = 1;
 			/* save on the sm_state_mgr the guid of the higher SM we found - */
 			/* we will poll it - as long as it lives - we should be in Standby. */
 			osm_log(sm->p_log, OSM_LOG_VERBOSE,
@@ -456,7 +454,7 @@ __osm_sminfo_rcv_process_get_sm(IN osm_sm_t * sm,
 	}
 	OSM_LOG_EXIT(sm->p_log);
-	return ret_val;
+	return 0;
 }
 /**********************************************************************
@@ -471,7 +469,6 @@ __osm_sminfo_rcv_process_get_response(IN osm_sm_t * sm,
 	osm_port_t *p_port;
 	ib_net64_t port_guid;
 	osm_remote_sm_t *p_sm;
-	osm_signal_t process_get_sm_ret_val = OSM_SIGNAL_NONE;
 	OSM_LOG_ENTER(sm->p_log, __osm_sminfo_rcv_process_get_response);
@@ -558,17 +555,11 @@ __osm_sminfo_rcv_process_get_response(IN osm_sm_t * sm,
 	 */
 	p_sm->smi = *p_smi;
-	process_get_sm_ret_val = __osm_sminfo_rcv_process_get_sm(sm, p_sm);
+	__osm_sminfo_rcv_process_get_sm(sm, p_sm);
 _unlock_and_exit:
 	CL_PLOCK_RELEASE(sm->p_lock);
-	/* If process_get_sm_ret_val != OSM_SIGNAL_NONE then we have to signal
-	 * to the SM with that signal. */
-	if (process_get_sm_ret_val != OSM_SIGNAL_NONE)
-		osm_sm_signal(&sm->p_subn->p_osm->sm,
-			      process_get_sm_ret_val);
-
 Exit:
 	OSM_LOG_EXIT(sm->p_log);
 }
diff --git a/opensm/opensm/osm_state_mgr.c b/opensm/opensm/osm_state_mgr.c
index 3746389..4dcb584 100644
--- a/opensm/opensm/osm_state_mgr.c
+++ b/opensm/opensm/osm_state_mgr.c
@@ -1332,998 +1332,336 @@ int wait_for_pending_transactions(osm_stats_t * stats)
 	return osm_exit_flag;
 }
-void osm_state_mgr_process(IN osm_state_mgr_t * const p_mgr,
-			   IN osm_signal_t signal)
+static void do_sweep(osm_sm_t * sm)
 {
 	ib_api_status_t status;
 	osm_remote_sm_t *p_remote_sm;
-	osm_signal_t tmp_signal;
-	CL_ASSERT(p_mgr);
-
-	OSM_LOG_ENTER(p_mgr->p_log, osm_state_mgr_process);
+	sm->master_sm_found = 0;
-	/* if we are exiting do nothing */
-	if (osm_exit_flag)
-		signal = OSM_SIGNAL_NONE;
-
-	while (signal != OSM_SIGNAL_NONE) {
-		if (osm_log_is_active(p_mgr->p_log, OSM_LOG_DEBUG)) {
-			osm_log(p_mgr->p_log, OSM_LOG_DEBUG,
-				"osm_state_mgr_process: "
-				"Received signal %s in state %s\n",
-				osm_get_sm_signal_str(signal),
-				osm_get_sm_state_str(p_mgr->state));
+	/*
+	 * If we already have switches, then try a light sweep.
+	 * Otherwise, this is probably our first discovery pass
+	 * or we are connected in loopback. In both cases do a
+	 * heavy sweep.
+	 * Note: If we are connected in loopback we want a heavy
+	 * sweep, since we will not be getting any traps if there is
+	 * a lost connection.
+	 */
+	/* if we are in DISCOVERING state - this means it is either in
+	 * initializing or wake up from STANDBY - run the heavy sweep */
+	if (cl_qmap_count(&sm->p_subn->sw_guid_tbl)
+	    && sm->p_subn->sm_state != IB_SMINFO_STATE_DISCOVERING
+	    && sm->p_subn->opt.force_heavy_sweep == FALSE
+	    && sm->p_subn->force_heavy_sweep == FALSE
+	    && sm->p_subn->subnet_initialization_error == FALSE
+	    && (__osm_state_mgr_light_sweep_start(&sm->state_mgr) == IB_SUCCESS)) {
+		if (wait_for_pending_transactions(&sm->p_subn->p_osm->stats))
+			return;
+		if (!sm->p_subn->force_heavy_sweep) {
+			__osm_state_mgr_light_sweep_done_msg(&sm->state_mgr);
+			return;
 		}
+	}
-		/*
-		 * If we're already sweeping and we get the signal to sweep,
-		 * just ignore it harmlessly.
-		 */
-		if ((p_mgr->state != OSM_SM_STATE_IDLE)
-		    && (p_mgr->state != OSM_SM_STATE_STANDBY)
-		    && (signal == OSM_SIGNAL_SWEEP)) {
-			break;
-		}
+	/* go to heavy sweep */
+_repeat_discovery:
-		switch (p_mgr->state) {
-		case OSM_SM_STATE_IDLE:
-			switch (signal) {
-			case OSM_SIGNAL_SWEEP:
-				/*
-				 * If the osm_sm_state_mgr is in NOT-ACTIVE state -
-				 * stay in IDLE
-				 */
-				if (p_mgr->p_subn->sm_state == IB_SMINFO_STATE_NOTACTIVE) {
-					osm_vendor_set_sm(p_mgr->p_mad_ctrl->h_bind, FALSE);
-					goto Idle;
-				}
+	/* First of all - unset all flags */
+	sm->p_subn->force_heavy_sweep = FALSE;
+	sm->p_subn->subnet_initialization_error = FALSE;
-				/*
-				 * If the osm_sm_state_mgr is in INIT state - signal
-				 * it with a INIT signal to move it to DISCOVERY state.
-				 */
-				if (p_mgr->p_subn->sm_state == IB_SMINFO_STATE_INIT)
-					osm_sm_state_mgr_process(p_mgr->
-								 p_sm_state_mgr,
-								 OSM_SM_SIGNAL_INIT);
-
-				/*
-				 * If we already have switches, then try a light sweep.
-				 * Otherwise, this is probably our first discovery pass
-				 * or we are connected in loopback. In both cases do a
-				 * heavy sweep.
-				 * Note: If we are connected in loopback we want a heavy
-				 * sweep, since we will not be getting any traps if there is
-				 * a lost connection.
-				 */
-				/* if we are in DISCOVERING state - this means it is either in
-				 * initializing or wake up from STANDBY - run the heavy sweep */
-				if (cl_qmap_count(&p_mgr->p_subn->sw_guid_tbl)
-				    && p_mgr->p_subn->sm_state !=
-				    IB_SMINFO_STATE_DISCOVERING
-				    && p_mgr->p_subn->opt.force_heavy_sweep ==
-				    FALSE
-				    && p_mgr->p_subn->force_heavy_sweep == FALSE
-				    && p_mgr->p_subn->subnet_initialization_error == FALSE) {
-					if (__osm_state_mgr_light_sweep_start(p_mgr) == IB_SUCCESS) {
-						p_mgr->state = OSM_SM_STATE_SWEEP_LIGHT;
-					}
-				} else {
-					/* First of all - if force_heavy_sweep is TRUE then
-					 * need to unset it */
-					p_mgr->p_subn->force_heavy_sweep = FALSE;
-					/* If subnet_initialization_error is TRUE then
-					 * need to unset it. */
-					p_mgr->p_subn->subnet_initialization_error = FALSE;
-
-					/* rescan configuration updates */
-					status = osm_subn_rescan_conf_files(p_mgr->p_subn);
-					if (status != IB_SUCCESS) {
-						osm_log(p_mgr->p_log,
-							OSM_LOG_ERROR,
-							"osm_state_mgr_process: ERR 331A: "
-							"osm_subn_rescan_conf_file failed\n");
-					}
-
-					if (p_mgr->p_subn->sm_state != IB_SMINFO_STATE_MASTER)
-						p_mgr->p_subn->need_update = 1;
-
-					status = __osm_state_mgr_sweep_hop_0(p_mgr);
-					if (status == IB_SUCCESS) {
-						p_mgr->state = OSM_SM_STATE_SWEEP_HEAVY_SELF;
-					}
-				}
-			Idle:
-				signal = OSM_SIGNAL_NONE;
-				break;
+	/* rescan configuration updates */
+	status = osm_subn_rescan_conf_files(sm->p_subn);
+	if (status != IB_SUCCESS)
+		osm_log(sm->p_log, OSM_LOG_ERROR,
+			"osm_state_mgr_process: ERR 331A: "
+			"osm_subn_rescan_conf_file failed\n");
-			case OSM_SIGNAL_IDLE_TIME_PROCESS_REQUEST:
-				p_mgr->state = OSM_SM_STATE_PROCESS_REQUEST;
-				signal = OSM_SIGNAL_IDLE_TIME_PROCESS;
-				break;
+	if (sm->p_subn->sm_state != IB_SMINFO_STATE_MASTER)
+		sm->p_subn->need_update = 1;
-			default:
-				__osm_state_mgr_signal_error(p_mgr, signal);
-				signal = OSM_SIGNAL_NONE;
-				break;
-			}
-			break;
+	status = __osm_state_mgr_sweep_hop_0(&sm->state_mgr);
+	if (status != IB_SUCCESS ||
+	    wait_for_pending_transactions(&sm->p_subn->p_osm->stats))
+		return;
-		case OSM_SM_STATE_PROCESS_REQUEST:
-			switch (signal) {
-			case OSM_SIGNAL_IDLE_TIME_PROCESS:
-				signal = osm_mcast_mgr_process_mgroups(p_mgr->p_mcast_mgr);
-				switch (signal) {
-				case OSM_SIGNAL_NONE:
-					p_mgr->state = OSM_SM_STATE_IDLE;
-					break;
-
-				case OSM_SIGNAL_DONE_PENDING:
-					p_mgr->state = OSM_SM_STATE_PROCESS_REQUEST_WAIT;
-					signal = OSM_SIGNAL_NONE;
-					break;
-
-				case OSM_SIGNAL_DONE:
-					p_mgr->state = OSM_SM_STATE_PROCESS_REQUEST_DONE;
-					break;
-
-				default:
-					__osm_state_mgr_signal_error(p_mgr, signal);
-					signal = OSM_SIGNAL_NONE;
-					break;
-				}
-				break;
+	if (__osm_state_mgr_is_sm_port_down(&sm->state_mgr) == TRUE) {
+		__osm_state_mgr_sm_port_down_msg(&sm->state_mgr);
-			default:
-				__osm_state_mgr_signal_error(p_mgr, signal);
-				signal = OSM_SIGNAL_NONE;
-				break;
-			}
-			break;
-
-		case OSM_SM_STATE_PROCESS_REQUEST_WAIT:
-			switch (signal) {
-			case OSM_SIGNAL_NO_PENDING_TRANSACTIONS:
-				p_mgr->state = OSM_SM_STATE_PROCESS_REQUEST_DONE;
-				break;
+		/* Run the drop manager - we want to clear all records */
+		osm_drop_mgr_process(&sm->drop_mgr);
-			default:
-				__osm_state_mgr_signal_error(p_mgr, signal);
-				signal = OSM_SIGNAL_NONE;
-				break;
-			}
-			break;
-
-		case OSM_SM_STATE_PROCESS_REQUEST_DONE:
-			switch (signal) {
-			case OSM_SIGNAL_NO_PENDING_TRANSACTIONS:
-			case OSM_SIGNAL_DONE:
-				if (p_mgr->p_subn->force_heavy_sweep) {
-					/*
-					 * Do not read next item from the idle queue.
-					 * Immediate heavy sweep is requested, so it's
-					 * more important.
-					 * Besides, there is a chance that after the
-					 * heavy sweep complition, idle queue processing
-					 * that SM would have performed here will be obsolete.
-					 */
-					if (osm_log_is_active(p_mgr->p_log, OSM_LOG_DEBUG))
-						osm_log(p_mgr->p_log, OSM_LOG_DEBUG,
-							"osm_state_mgr_process: "
-							"interrupting idle time queue processing - heavy sweep requested\n");
-					signal = OSM_SIGNAL_NONE;
-					p_mgr->state = OSM_SM_STATE_IDLE;
-					break;
-				}
-				signal = OSM_SIGNAL_IDLE_TIME_PROCESS;
-				p_mgr->state = OSM_SM_STATE_PROCESS_REQUEST;
-				break;
-
-			default:
-				__osm_state_mgr_signal_error(p_mgr, signal);
-				signal = OSM_SIGNAL_NONE;
-				break;
-			}
-			break;
-
-		case OSM_SM_STATE_SWEEP_LIGHT:
-			switch (signal) {
-			case OSM_SIGNAL_LIGHT_SWEEP_FAIL:
-			case OSM_SIGNAL_CHANGE_DETECTED:
-				/*
-				 * Nothing else to do yet except change state.
-				 */
-				p_mgr->state = OSM_SM_STATE_SWEEP_LIGHT_WAIT;
-				signal = OSM_SIGNAL_NONE;
-				break;
+		/* Move to DISCOVERING state */
+		osm_sm_state_mgr_process(&sm->sm_state_mgr,
+					 OSM_SM_SIGNAL_DISCOVER);
+		return;
+	}
-			case OSM_SIGNAL_NO_PENDING_TRANSACTIONS:
-				/*
-				 * No change was detected on the subnet.
-				 * We can return to the idle state.
-				 */
-				__osm_state_mgr_light_sweep_done_msg(p_mgr);
-				p_mgr->state = OSM_SM_STATE_PROCESS_REQUEST;
-				signal = OSM_SIGNAL_IDLE_TIME_PROCESS;
-				break;
+	status = __osm_state_mgr_sweep_hop_1(&sm->state_mgr);
+	if (status != IB_SUCCESS ||
+	    wait_for_pending_transactions(&sm->p_subn->p_osm->stats))
+		return;
-			default:
-				__osm_state_mgr_signal_error(p_mgr, signal);
-				signal = OSM_SIGNAL_NONE;
-				break;
-			}
-			break;
+	/* discovery completed - check other sm presense */
+	if (sm->master_sm_found) {
+		sm->state_mgr.state = OSM_SM_STATE_STANDBY;
+		/*
+		 * Call the sm_state_mgr with signal
+		 * MASTER_OR_HIGHER_SM_DETECTED_DONE
+		 */
+		osm_sm_state_mgr_process(&sm->sm_state_mgr,
+					 OSM_SM_SIGNAL_MASTER_OR_HIGHER_SM_DETECTED_DONE);
+		__osm_state_mgr_standby_msg(&sm->state_mgr);
+		return;
+	}
-		case OSM_SM_STATE_SWEEP_LIGHT_WAIT:
-			switch (signal) {
-			case OSM_SIGNAL_LIGHT_SWEEP_FAIL:
-			case OSM_SIGNAL_CHANGE_DETECTED:
-				/*
-				 * Nothing to do here. One subnet change typcially
-				 * begets another.... But need to wait for all transactions to
-				 * complete
-				 */
-				break;
+	/* if new sweep requested - don't bother with the rest */
+	if (sm->p_subn->force_heavy_sweep)
+		goto _repeat_discovery;
-			case OSM_SIGNAL_NO_PENDING_TRANSACTIONS:
-				/*
-				 * A change was detected on the subnet.
-				 * Initiate a heavy sweep.
-				 */
-				if (__osm_state_mgr_sweep_hop_0(p_mgr) == IB_SUCCESS) {
-					p_mgr->state = OSM_SM_STATE_SWEEP_HEAVY_SELF;
-				}
-				break;
+	__osm_state_mgr_sweep_heavy_done_msg(&sm->state_mgr);
-			default:
-				__osm_state_mgr_signal_error(p_mgr, signal);
-				break;
+	/* If we are MASTER - get the highest remote_sm, and
+	 * see if it is higher than our local sm.
+	 */
+	if (sm->p_subn->sm_state == IB_SMINFO_STATE_MASTER) {
+		p_remote_sm = __osm_state_mgr_get_highest_sm(&sm->state_mgr);
+		if (p_remote_sm != NULL) {
+			/* report new ports (trap 64) before leaving MASTER */
+			__osm_state_mgr_report_new_ports(&sm->state_mgr);
+
+			/* need to handover the mastership
+			 * to the remote sm, and move to standby */
+			__osm_state_mgr_send_handover(&sm->state_mgr, p_remote_sm);
+			osm_sm_state_mgr_process(&sm->sm_state_mgr,
+						 OSM_SM_SIGNAL_HANDOVER_SENT);
+			sm->state_mgr.state = OSM_SM_STATE_STANDBY;
+			return;
+		} else {
+			/* We are the highest sm - check to see if there is
+			 * a remote SM that is in master state. */
+			p_remote_sm =
+			    __osm_state_mgr_exists_other_master_sm(&sm->state_mgr);
+			if (p_remote_sm != NULL) {
+				/* There is a remote SM that is master.
+				 * need to wait for that SM to relinquish control
+				 * of its portion of the subnet. C14-60.2.1.
+				 * Also - need to start polling on that SM. */
+				sm->sm_state_mgr.p_polling_sm = p_remote_sm;
+				osm_sm_state_mgr_process(&sm->sm_state_mgr,
+							 OSM_SM_SIGNAL_WAIT_FOR_HANDOVER);
+				return;
 			}
-			signal = OSM_SIGNAL_NONE;
-			break;
+		}
+	}
-		case OSM_SM_STATE_SWEEP_HEAVY_SELF:
-			switch (signal) {
-			case OSM_SIGNAL_CHANGE_DETECTED:
-				/*
-				 * Nothing to do here. One subnet change typcially
-				 * begets another.... But need to wait for all transactions
-				 */
-				signal = OSM_SIGNAL_NONE;
-				break;
+	/* Need to continue with lid assignment */
+	osm_drop_mgr_process(&sm->drop_mgr);
-			case OSM_SIGNAL_NO_PENDING_TRANSACTIONS:
-				if (__osm_state_mgr_is_sm_port_down(p_mgr) == TRUE) {
-					__osm_state_mgr_sm_port_down_msg(p_mgr);
-
-					/* Run the drop manager - we want to clear all records */
-					osm_drop_mgr_process(p_mgr->p_drop_mgr);
-
-					/* Move to DISCOVERING state */
-					osm_sm_state_mgr_process(p_mgr->
-								 p_sm_state_mgr,
-								 OSM_SM_SIGNAL_DISCOVER);
-
-					p_mgr->state = OSM_SM_STATE_PROCESS_REQUEST;
-					signal = OSM_SIGNAL_IDLE_TIME_PROCESS;
-				} else {
-					if (__osm_state_mgr_sweep_hop_1(p_mgr)
-					    == IB_SUCCESS) {
-						p_mgr->state = OSM_SM_STATE_SWEEP_HEAVY_SUBNET;
-					}
-					signal = OSM_SIGNAL_NONE;
-				}
-				break;
+	/*
+	 * If we are not MASTER already - this means that we are
+	 * in discovery state. call osm_sm_state_mgr with signal
+	 * DISCOVERY_COMPLETED
+	 */
+	if (sm->p_subn->sm_state == IB_SMINFO_STATE_DISCOVERING)
+		osm_sm_state_mgr_process(&sm->sm_state_mgr,
+					 OSM_SM_SIGNAL_DISCOVERY_COMPLETED);
-			default:
-				__osm_state_mgr_signal_error(p_mgr, signal);
-				signal = OSM_SIGNAL_NONE;
-				break;
-			}
-			break;
+	osm_pkey_mgr_process(sm->p_subn->p_osm);
-			/*
-			 * There is no 'OSM_SM_STATE_SWEEP_HEAVY_WAIT' state since we
-			 * know that there are outstanding transactions on the wire already...
-			 */
-		case OSM_SM_STATE_SWEEP_HEAVY_SUBNET:
-			switch (signal) {
-			case OSM_SIGNAL_CHANGE_DETECTED:
-				/*
-				 * Nothing to do here. One subnet change typically
-				 * begets another....
-				 */
-				signal = OSM_SIGNAL_NONE;
-				break;
+	osm_qos_setup(sm->p_subn->p_osm);
-			case OSM_SIGNAL_MASTER_OR_HIGHER_SM_DETECTED:
-				p_mgr->state = OSM_SM_STATE_MASTER_OR_HIGHER_SM_DETECTED;
-				break;
+	/* try to restore SA DB (this should be before lid_mgr
+	   because we may want to disable clients reregistration
+	   when SA DB is restored) */
+	osm_sa_db_file_load(sm->p_subn->p_osm);
-			case OSM_SIGNAL_NO_PENDING_TRANSACTIONS:
-				/* if new sweep requiested - don't bother with the rest */
-				if (p_mgr->p_subn->force_heavy_sweep) {
-					p_mgr->state = OSM_SM_STATE_IDLE;
-					signal = OSM_SIGNAL_SWEEP;
-					break;
-				}
+	if (wait_for_pending_transactions(&sm->p_subn->p_osm->stats))
+		return;
-				__osm_state_mgr_sweep_heavy_done_msg(p_mgr);
-
-				/* If we are MASTER - get the highest remote_sm, and
-				 * see if it is higher than our local sm. If
-				 */
-				if (p_mgr->p_subn->sm_state == IB_SMINFO_STATE_MASTER) {
-					p_remote_sm = __osm_state_mgr_get_highest_sm(p_mgr);
-					if (p_remote_sm != NULL) {
-						/* report new ports (trap 64) before leaving MASTER */
-						__osm_state_mgr_report_new_ports(p_mgr);
-
-						/* need to handover the mastership
-						 * to the remote sm, and move to standby */
-						__osm_state_mgr_send_handover(p_mgr, p_remote_sm);
-						osm_sm_state_mgr_process(p_mgr->
-									 p_sm_state_mgr,
-									 OSM_SM_SIGNAL_HANDOVER_SENT);
-						p_mgr->state = OSM_SM_STATE_STANDBY;
-						signal = OSM_SIGNAL_NONE;
-						break;
-					} else {
-						/* We are the highest sm - check to see if there is
-						 * a remote SM that is in master state. */
-						p_remote_sm =
-						    __osm_state_mgr_exists_other_master_sm(p_mgr);
-						if (p_remote_sm != NULL) {
-							/* There is a remote SM that is master.
-							 * need to wait for that SM to relinquish control
-							 * of its portion of the subnet. C14-60.2.1.
-							 * Also - need to start polling on that SM. */
-							p_mgr->p_sm_state_mgr->
-							    p_polling_sm = p_remote_sm;
-							osm_sm_state_mgr_process
-							    (p_mgr->
-							     p_sm_state_mgr,
-							     OSM_SM_SIGNAL_WAIT_FOR_HANDOVER);
-							p_mgr->state = OSM_SM_STATE_PROCESS_REQUEST;
-							signal = OSM_SIGNAL_IDLE_TIME_PROCESS;
-							break;
-						}
-					}
-				}
+	osm_lid_mgr_process_sm(&sm->lid_mgr);
+	if (wait_for_pending_transactions(&sm->p_subn->p_osm->stats))
+		return;
-				/* Need to continue with lid assignment */
-				osm_drop_mgr_process(p_mgr->p_drop_mgr);
+	__osm_state_mgr_set_sm_lid_done_msg(&sm->state_mgr);
+	__osm_state_mgr_notify_lid_change(&sm->state_mgr);
-				p_mgr->state = OSM_SM_STATE_SET_PKEY;
+	osm_lid_mgr_process_subnet(&sm->lid_mgr);
+	if (wait_for_pending_transactions(&sm->p_subn->p_osm->stats))
+		return;
-				/*
-				 * If we are not MASTER already - this means that we are
-				 * in discovery state. call osm_sm_state_mgr with signal
-				 * DISCOVERY_COMPLETED
-				 */
-				if (p_mgr->p_subn->sm_state == IB_SMINFO_STATE_DISCOVERING)
-					osm_sm_state_mgr_process(p_mgr->
-								 p_sm_state_mgr,
-								 OSM_SM_SIGNAL_DISCOVERY_COMPLETED);
+	/* At this point we need to check the consistency of
+	 * the port_lid_tbl under the subnet. There might be
+	 * errors in it if PortInfo Set reqeusts didn't reach
+	 * their destination. */
+	__osm_state_mgr_check_tbl_consistency(&sm->state_mgr);
-				/* the returned signal might be DONE or DONE_PENDING */
-				signal = osm_pkey_mgr_process(p_mgr->p_subn->p_osm);
+	__osm_state_mgr_lid_assign_msg(&sm->state_mgr);
-				/* the returned signal is always DONE */
-				tmp_signal = osm_qos_setup(p_mgr->p_subn->p_osm);
+	/*
+	 * Proceed with unicast forwarding table configuration.
+ * First - send trap 64 on newly discovered endports + */ + __osm_state_mgr_report_new_ports(&sm->state_mgr); - if (tmp_signal == OSM_SIGNAL_DONE_PENDING) - signal = OSM_SIGNAL_DONE_PENDING; + osm_ucast_mgr_process(&sm->ucast_mgr); + if (wait_for_pending_transactions(&sm->p_subn->p_osm->stats)) + return; - /* try to restore SA DB (this should be before lid_mgr - because we may want to disable clients reregistration - when SA DB is restored) */ - osm_sa_db_file_load(p_mgr->p_subn->p_osm); + /* We are done setting all LFTs so clear the ignore existing. + * From now on, as long as we are still master, we want to + * take into account these lfts. */ + sm->p_subn->ignore_existing_lfts = FALSE; - break; + __osm_state_mgr_switch_config_msg(&sm->state_mgr); - default: - __osm_state_mgr_signal_error(p_mgr, signal); - signal = OSM_SIGNAL_NONE; - break; - } - break; + if (!sm->p_subn->opt.disable_multicast) { + osm_mcast_mgr_process(&sm->mcast_mgr); + if (wait_for_pending_transactions(&sm->p_subn->p_osm->stats)) + return; + __osm_state_mgr_multicast_config_msg(&sm->state_mgr); + } - case OSM_SM_STATE_SET_PKEY: - switch (signal) { - case OSM_SIGNAL_DONE: - p_mgr->state = OSM_SM_STATE_SET_PKEY_DONE; - break; + /* + * The LINK_PORTS state is required since we can not count on + * the port state change MADs to succeed. This is an artifact + * of the spec defining state change from state X to state X + * as an error. The hardware then is not required to process + * other parameters provided by the Set(PortInfo) Packet. + */ - case OSM_SIGNAL_DONE_PENDING: - /* - * There are outstanding transactions, so we - * must wait for the wire to clear. 
- */ - p_mgr->state = OSM_SM_STATE_SET_PKEY_WAIT; - signal = OSM_SIGNAL_NONE; - break; + osm_link_mgr_process(&sm->link_mgr, IB_LINK_NO_CHANGE); + if (wait_for_pending_transactions(&sm->p_subn->p_osm->stats)) + return; - default: - __osm_state_mgr_signal_error(p_mgr, signal); - signal = OSM_SIGNAL_NONE; - break; - } - break; + __osm_state_mgr_links_ports_msg(&sm->state_mgr); - case OSM_SM_STATE_SET_PKEY_WAIT: - switch (signal) { - case OSM_SIGNAL_NO_PENDING_TRANSACTIONS: - p_mgr->state = OSM_SM_STATE_SET_PKEY_DONE; - break; + osm_link_mgr_process(&sm->link_mgr, IB_LINK_ARMED); + if (wait_for_pending_transactions(&sm->p_subn->p_osm->stats)) + return; - default: - __osm_state_mgr_signal_error(p_mgr, signal); - signal = OSM_SIGNAL_NONE; - break; - } - break; + __osm_state_mgr_links_armed_msg(&sm->state_mgr); - case OSM_SM_STATE_SET_PKEY_DONE: - switch (signal) { - case OSM_SIGNAL_NO_PENDING_TRANSACTIONS: - case OSM_SIGNAL_DONE: - p_mgr->state = OSM_SM_STATE_SET_SM_UCAST_LID; - signal = osm_lid_mgr_process_sm(p_mgr->p_lid_mgr); - break; + osm_link_mgr_process(&sm->link_mgr, IB_LINK_ACTIVE); + if (wait_for_pending_transactions(&sm->p_subn->p_osm->stats)) + return; - default: - __osm_state_mgr_signal_error(p_mgr, signal); - signal = OSM_SIGNAL_NONE; - break; - } - break; + /* + * The sweep completed! + */ - case OSM_SM_STATE_SET_SM_UCAST_LID: - switch (signal) { - case OSM_SIGNAL_DONE: - p_mgr->state = OSM_SM_STATE_SET_SM_UCAST_LID_DONE; - break; + /* in any case we zero this flag */ + sm->p_subn->coming_out_of_standby = FALSE; - case OSM_SIGNAL_DONE_PENDING: - /* - * There are outstanding transactions, so we - * must wait for the wire to clear. 
- */ - p_mgr->state = OSM_SM_STATE_SET_SM_UCAST_LID_WAIT; - signal = OSM_SIGNAL_NONE; - break; + /* If there were errors - then the subnet is not really up */ + if (sm->p_subn->subnet_initialization_error == TRUE) + __osm_state_mgr_init_errors_msg(&sm->state_mgr); + else { + /* The subnet is up correctly - set the first_time_master_sweep + * flag (if it is on) to FALSE. */ + if (sm->p_subn->first_time_master_sweep == TRUE) + sm->p_subn->first_time_master_sweep = FALSE; + sm->p_subn->need_update = 0; - default: - __osm_state_mgr_signal_error(p_mgr, signal); - signal = OSM_SIGNAL_NONE; - break; - } - break; + osm_dump_all(sm->p_subn->p_osm); + __osm_state_mgr_up_msg(&sm->state_mgr); - case OSM_SM_STATE_SET_SM_UCAST_LID_WAIT: - switch (signal) { - case OSM_SIGNAL_NO_PENDING_TRANSACTIONS: - p_mgr->state = OSM_SM_STATE_SET_SM_UCAST_LID_DONE; - break; + if (osm_log_is_active(sm->p_log, OSM_LOG_VERBOSE)) + osm_sa_db_file_dump(sm->p_subn->p_osm); + } - default: - __osm_state_mgr_signal_error(p_mgr, signal); - signal = OSM_SIGNAL_NONE; - break; - } - break; + /* + * Finally signal the subnet up event + */ + cl_event_signal(&sm->subnet_up_event); - case OSM_SM_STATE_SET_SM_UCAST_LID_DONE: - switch (signal) { - case OSM_SIGNAL_NO_PENDING_TRANSACTIONS: - case OSM_SIGNAL_DONE: - __osm_state_mgr_set_sm_lid_done_msg(p_mgr); - __osm_state_mgr_notify_lid_change(p_mgr); - p_mgr->state = OSM_SM_STATE_SET_SUBNET_UCAST_LIDS; - signal = osm_lid_mgr_process_subnet(p_mgr->p_lid_mgr); - break; + /* if we got a signal to force heavy sweep or errors + * in the middle of the sweep - try another sweep. 
*/ + if (sm->p_subn->force_heavy_sweep + || sm->p_subn->subnet_initialization_error) + osm_sm_signal(sm, OSM_SIGNAL_SWEEP); +} - default: - __osm_state_mgr_signal_error(p_mgr, signal); - signal = OSM_SIGNAL_NONE; - break; - } - break; +static void do_process_mgrp_queue(osm_sm_t * sm) +{ + osm_mcast_mgr_process_mgroups(&sm->mcast_mgr); + wait_for_pending_transactions(&sm->p_subn->p_osm->stats); +} - case OSM_SM_STATE_SET_SUBNET_UCAST_LIDS: - switch (signal) { - case OSM_SIGNAL_DONE: - /* - * The LID Manager is done processing. - * There are no outstanding transactions, so we - * can move on to configuring the forwarding tables. - */ - p_mgr->state = OSM_SM_STATE_SET_SUBNET_UCAST_LIDS_DONE; - break; +void osm_state_mgr_process(IN osm_state_mgr_t * const p_mgr, + IN osm_signal_t signal) +{ + CL_ASSERT(p_mgr); - case OSM_SIGNAL_DONE_PENDING: - /* - * The LID Manager is done processing. - * There are outstanding transactions, so we - * must wait for the wire to clear. - */ - p_mgr->state = OSM_SM_STATE_SET_SUBNET_UCAST_LIDS_WAIT; - signal = OSM_SIGNAL_NONE; - break; + OSM_LOG_ENTER(p_mgr->p_log, osm_state_mgr_process); - default: - __osm_state_mgr_signal_error(p_mgr, signal); - signal = OSM_SIGNAL_NONE; - break; - } - break; + if (osm_log_is_active(p_mgr->p_log, OSM_LOG_DEBUG)) + osm_log(p_mgr->p_log, OSM_LOG_DEBUG, + "osm_state_mgr_process: " + "Received signal %s in state %s\n", + osm_get_sm_signal_str(signal), + osm_get_sm_state_str(p_mgr->state)); + switch (p_mgr->state) { + case OSM_SM_STATE_IDLE: + switch (signal) { + case OSM_SIGNAL_SWEEP: /* - * In this state, the Unicast Manager has completed processing, - * but there are still transactions on the wire. Therefore, - * wait here until the wire clears. + * If the osm_sm_state_mgr is in NOT-ACTIVE state - + * stay in IDLE */ - case OSM_SM_STATE_SET_SUBNET_UCAST_LIDS_WAIT: - switch (signal) { - case OSM_SIGNAL_NO_PENDING_TRANSACTIONS: - /* - * The LID Manager is done processing. 
- * There are no outstanding transactions, so we - * can move on to configuring the forwarding tables. - */ - p_mgr->state = OSM_SM_STATE_SET_SUBNET_UCAST_LIDS_DONE; - break; - - default: - __osm_state_mgr_signal_error(p_mgr, signal); - signal = OSM_SIGNAL_NONE; - break; - } - break; - - case OSM_SM_STATE_SET_SUBNET_UCAST_LIDS_DONE: - switch (signal) { - case OSM_SIGNAL_DONE: - case OSM_SIGNAL_NO_PENDING_TRANSACTIONS: - /* At this point we need to check the consistency of - * the port_lid_tbl under the subnet. There might be - * errors in it if PortInfo Set reqeusts didn't reach - * their destination. */ - __osm_state_mgr_check_tbl_consistency(p_mgr); - - __osm_state_mgr_lid_assign_msg(p_mgr); - - /* - * OK, the wire is clear, so proceed with - * unicast forwarding table configuration. - * First - send trap 64 on newly discovered endports - */ - __osm_state_mgr_report_new_ports(p_mgr); - - p_mgr->state = OSM_SM_STATE_SET_UCAST_TABLES; - signal = osm_ucast_mgr_process(p_mgr->p_ucast_mgr); - - break; - - default: - __osm_state_mgr_signal_error(p_mgr, signal); - signal = OSM_SIGNAL_NONE; - break; - } - break; - - case OSM_SM_STATE_SET_UCAST_TABLES: - switch (signal) { - case OSM_SIGNAL_DONE: - p_mgr->state = OSM_SM_STATE_SET_UCAST_TABLES_DONE; - break; - - case OSM_SIGNAL_DONE_PENDING: - /* - * The Unicast Manager is done processing. - * There are outstanding transactions, so we - * must wait for the wire to clear. 
- */ - p_mgr->state = OSM_SM_STATE_SET_UCAST_TABLES_WAIT; - signal = OSM_SIGNAL_NONE; - break; - - default: - __osm_state_mgr_signal_error(p_mgr, signal); - signal = OSM_SIGNAL_NONE; - break; - } - break; - - case OSM_SM_STATE_SET_UCAST_TABLES_WAIT: - switch (signal) { - case OSM_SIGNAL_NO_PENDING_TRANSACTIONS: - p_mgr->state = OSM_SM_STATE_SET_UCAST_TABLES_DONE; - break; - - default: - __osm_state_mgr_signal_error(p_mgr, signal); - signal = OSM_SIGNAL_NONE; + if (p_mgr->p_subn->sm_state == IB_SMINFO_STATE_NOTACTIVE) { + osm_vendor_set_sm(p_mgr->p_mad_ctrl->h_bind, FALSE); break; } - break; - - case OSM_SM_STATE_SET_UCAST_TABLES_DONE: - switch (signal) { - case OSM_SIGNAL_NO_PENDING_TRANSACTIONS: - case OSM_SIGNAL_DONE: - /* We are done setting all LFTs so clear the ignore existing. - * From now on, as long as we are still master, we want to - * take into account these lfts. */ - p_mgr->p_subn->ignore_existing_lfts = FALSE; - - __osm_state_mgr_switch_config_msg(p_mgr); - - if (!p_mgr->p_subn->opt.disable_multicast) { - p_mgr->state = OSM_SM_STATE_SET_MCAST_TABLES; - signal = osm_mcast_mgr_process(p_mgr->p_mcast_mgr); - } else { - p_mgr->state = OSM_SM_STATE_SET_LINK_PORTS; - signal = - osm_link_mgr_process(p_mgr-> - p_link_mgr, IB_LINK_NO_CHANGE); - } - break; - - default: - __osm_state_mgr_signal_error(p_mgr, signal); - signal = OSM_SIGNAL_NONE; - break; - } - break; - - case OSM_SM_STATE_SET_MCAST_TABLES: - switch (signal) { - case OSM_SIGNAL_DONE: - p_mgr->state = OSM_SM_STATE_SET_MCAST_TABLES_DONE; - break; - - case OSM_SIGNAL_DONE_PENDING: - /* - * The Multicast Manager is done processing. - * There are outstanding transactions, so we - * must wait for the wire to clear. 
- */ - p_mgr->state = OSM_SM_STATE_SET_MCAST_TABLES_WAIT; - signal = OSM_SIGNAL_NONE; - break; - - default: - __osm_state_mgr_signal_error(p_mgr, signal); - signal = OSM_SIGNAL_NONE; - break; - } - break; - - case OSM_SM_STATE_SET_MCAST_TABLES_WAIT: - switch (signal) { - case OSM_SIGNAL_NO_PENDING_TRANSACTIONS: - p_mgr->state = OSM_SM_STATE_SET_MCAST_TABLES_DONE; - break; - - default: - __osm_state_mgr_signal_error(p_mgr, signal); - signal = OSM_SIGNAL_NONE; - break; - } - break; - - case OSM_SM_STATE_SET_MCAST_TABLES_DONE: - switch (signal) { - case OSM_SIGNAL_NO_PENDING_TRANSACTIONS: - case OSM_SIGNAL_DONE: - __osm_state_mgr_multicast_config_msg(p_mgr); - - p_mgr->state = OSM_SM_STATE_SET_LINK_PORTS; - signal = osm_link_mgr_process(p_mgr->p_link_mgr, IB_LINK_NO_CHANGE); - break; - - default: - __osm_state_mgr_signal_error(p_mgr, signal); - signal = OSM_SIGNAL_NONE; - break; - } - break; /* - * The LINK_PORTS state is required since we can not count on - * the port state change MADs to succeed. This is an artifact - * of the spec defining state change from state X to state X - * as an error. The hardware then is not required to process - * other parameters provided by the Set(PortInfo) Packet. + * If the osm_sm_state_mgr is in INIT state - signal + * it with a INIT signal to move it to DISCOVERY state. */ - case OSM_SM_STATE_SET_LINK_PORTS: - switch (signal) { - case OSM_SIGNAL_DONE: - p_mgr->state = OSM_SM_STATE_SET_LINK_PORTS_DONE; - break; - - case OSM_SIGNAL_DONE_PENDING: - /* - * The Link Manager is done processing. - * There are outstanding transactions, so we - * must wait for the wire to clear. 
- */ - p_mgr->state = OSM_SM_STATE_SET_LINK_PORTS_WAIT; - signal = OSM_SIGNAL_NONE; - break; - - default: - __osm_state_mgr_signal_error(p_mgr, signal); - signal = OSM_SIGNAL_NONE; - break; - } - break; - - case OSM_SM_STATE_SET_LINK_PORTS_WAIT: - switch (signal) { - case OSM_SIGNAL_NO_PENDING_TRANSACTIONS: - p_mgr->state = OSM_SM_STATE_SET_LINK_PORTS_DONE; - break; - - default: - __osm_state_mgr_signal_error(p_mgr, signal); - signal = OSM_SIGNAL_NONE; - break; - } - break; - - case OSM_SM_STATE_SET_LINK_PORTS_DONE: - switch (signal) { - case OSM_SIGNAL_NO_PENDING_TRANSACTIONS: - case OSM_SIGNAL_DONE: - - __osm_state_mgr_links_ports_msg(p_mgr); - - p_mgr->state = OSM_SM_STATE_SET_ARMED; - signal = osm_link_mgr_process(p_mgr->p_link_mgr, IB_LINK_ARMED); - break; - - default: - __osm_state_mgr_signal_error(p_mgr, signal); - signal = OSM_SIGNAL_NONE; - break; - } - break; - - case OSM_SM_STATE_SET_ARMED: - switch (signal) { - case OSM_SIGNAL_DONE: - p_mgr->state = OSM_SM_STATE_SET_ARMED_DONE; - break; - - case OSM_SIGNAL_DONE_PENDING: - /* - * The Link Manager is done processing. - * There are outstanding transactions, so we - * must wait for the wire to clear. 
- */ - p_mgr->state = OSM_SM_STATE_SET_ARMED_WAIT; - signal = OSM_SIGNAL_NONE; - break; - - default: - __osm_state_mgr_signal_error(p_mgr, signal); - signal = OSM_SIGNAL_NONE; - break; - } - break; - - case OSM_SM_STATE_SET_ARMED_WAIT: - switch (signal) { - case OSM_SIGNAL_NO_PENDING_TRANSACTIONS: - p_mgr->state = OSM_SM_STATE_SET_ARMED_DONE; - break; - - default: - __osm_state_mgr_signal_error(p_mgr, signal); - signal = OSM_SIGNAL_NONE; - break; - } - break; - - case OSM_SM_STATE_SET_ARMED_DONE: - switch (signal) { - case OSM_SIGNAL_NO_PENDING_TRANSACTIONS: - case OSM_SIGNAL_DONE: - - __osm_state_mgr_links_armed_msg(p_mgr); - - p_mgr->state = OSM_SM_STATE_SET_ACTIVE; - signal = osm_link_mgr_process(p_mgr->p_link_mgr, IB_LINK_ACTIVE); - break; - - default: - __osm_state_mgr_signal_error(p_mgr, signal); - signal = OSM_SIGNAL_NONE; - break; - } - break; - - case OSM_SM_STATE_SET_ACTIVE: - switch (signal) { - case OSM_SIGNAL_DONE: - /* - * Don't change the signal, just the state. - */ - p_mgr->state = OSM_SM_STATE_SUBNET_UP; - break; - - case OSM_SIGNAL_DONE_PENDING: - /* - * The Link Manager is done processing. - * There are outstanding transactions, so we - * must wait for the wire to clear. - */ - p_mgr->state = OSM_SM_STATE_SET_ACTIVE_WAIT; - signal = OSM_SIGNAL_NONE; - break; - - default: - __osm_state_mgr_signal_error(p_mgr, signal); - signal = OSM_SIGNAL_NONE; - break; - } - break; - - case OSM_SM_STATE_SET_ACTIVE_WAIT: - switch (signal) { - case OSM_SIGNAL_NO_PENDING_TRANSACTIONS: - /* - * Don't change the signal, just the state. - */ - p_mgr->state = OSM_SM_STATE_SUBNET_UP; - break; - - default: - __osm_state_mgr_signal_error(p_mgr, signal); - signal = OSM_SIGNAL_NONE; - break; - } - break; - - case OSM_SM_STATE_SUBNET_UP: - switch (signal) { - case OSM_SIGNAL_NO_PENDING_TRANSACTIONS: - case OSM_SIGNAL_DONE: - /* - * The sweep completed! 
- */ - - /* in any case we zero this flag */ - p_mgr->p_subn->coming_out_of_standby = FALSE; - - /* If there were errors - then the subnet is not really up */ - if (p_mgr->p_subn->subnet_initialization_error == TRUE) { - __osm_state_mgr_init_errors_msg(p_mgr); - } else { - /* The subnet is up correctly - set the first_time_master_sweep flag - * (if it is on) to FALSE. */ - if (p_mgr->p_subn->first_time_master_sweep == TRUE) { - p_mgr->p_subn->first_time_master_sweep = FALSE; - } - p_mgr->p_subn->need_update = 0; - - osm_dump_all(p_mgr->p_subn->p_osm); - __osm_state_mgr_up_msg(p_mgr); - - if (osm_log_is_active(p_mgr->p_log, OSM_LOG_VERBOSE)) - osm_sa_db_file_dump(p_mgr->p_subn->p_osm); - } - p_mgr->state = OSM_SM_STATE_PROCESS_REQUEST; - signal = OSM_SIGNAL_IDLE_TIME_PROCESS; - - /* - * Finally signal the subnet up event - */ - status = - cl_event_signal(p_mgr->p_subnet_up_event); - if (status != IB_SUCCESS) { - osm_log(p_mgr->p_log, OSM_LOG_ERROR, - "osm_state_mgr_process: ERR 3319: " - "Invalid SM state %u\n", - p_mgr->state); - } - break; - - default: - __osm_state_mgr_signal_error(p_mgr, signal); - signal = OSM_SIGNAL_NONE; - break; - } - break; - - case OSM_SM_STATE_MASTER_OR_HIGHER_SM_DETECTED: - switch (signal) { - case OSM_SIGNAL_CHANGE_DETECTED: - /* - * Nothing to do here. One subnet change typically - * begets another.... - */ - break; - - case OSM_SIGNAL_MASTER_OR_HIGHER_SM_DETECTED: - /* - * If we lost once, we might lose again. Nothing to do. 
- */ - break; - - case OSM_SIGNAL_NO_PENDING_TRANSACTIONS: - p_mgr->state = OSM_SM_STATE_STANDBY; - /* - * Call the sm_state_mgr with signal - * MASTER_OR_HIGHER_SM_DETECTED_DONE - */ + if (p_mgr->p_subn->sm_state == IB_SMINFO_STATE_INIT) osm_sm_state_mgr_process(p_mgr->p_sm_state_mgr, - OSM_SM_SIGNAL_MASTER_OR_HIGHER_SM_DETECTED_DONE); - __osm_state_mgr_standby_msg(p_mgr); - break; + OSM_SM_SIGNAL_INIT); - default: - __osm_state_mgr_signal_error(p_mgr, signal); - break; - } - signal = OSM_SIGNAL_NONE; + do_sweep(p_mgr->sm); break; - case OSM_SM_STATE_STANDBY: - switch (signal) { - case OSM_SIGNAL_EXIT_STBY: - /* - * Need to force re-write of sm_base_lid to all ports - * to do that we want all the ports to be considered - * foriegn - */ - signal = OSM_SIGNAL_SWEEP; - __osm_state_mgr_clean_known_lids(p_mgr); - p_mgr->state = OSM_SM_STATE_IDLE; - break; - - case OSM_SIGNAL_NO_PENDING_TRANSACTIONS: - /* - * Nothing to do here - need to stay at this state - */ - signal = OSM_SIGNAL_NONE; - break; - - default: - __osm_state_mgr_signal_error(p_mgr, signal); - signal = OSM_SIGNAL_NONE; - break; - } - /* stay with the same signal - so we can start the sweep */ + case OSM_SIGNAL_IDLE_TIME_PROCESS_REQUEST: + do_process_mgrp_queue(p_mgr->sm); break; default: - CL_ASSERT(FALSE); - osm_log(p_mgr->p_log, OSM_LOG_ERROR, - "osm_state_mgr_process: ERR 3320: " - "Invalid SM state %u\n", p_mgr->state); - p_mgr->state = OSM_SM_STATE_IDLE; - signal = OSM_SIGNAL_NONE; + __osm_state_mgr_signal_error(p_mgr, signal); break; } + break; - /* if we got a signal to force immediate heavy sweep in the middle of the sweep - - * try another sweep. */ - if ((p_mgr->p_subn->force_heavy_sweep) && - (p_mgr->state == OSM_SM_STATE_IDLE)) { - signal = OSM_SIGNAL_SWEEP; - } - /* if we got errors during the initialization in the middle of the sweep - - * try another sweep. 
*/ - if ((p_mgr->p_subn->subnet_initialization_error) && - (p_mgr->state == OSM_SM_STATE_IDLE)) { - signal = OSM_SIGNAL_SWEEP; + case OSM_SM_STATE_STANDBY: + switch (signal) { + case OSM_SIGNAL_EXIT_STBY: + /* + * Need to force re-write of sm_base_lid to all ports + * to do that we want all the ports to be considered + * foriegn + */ + __osm_state_mgr_clean_known_lids(p_mgr); + p_mgr->state = OSM_SM_STATE_IDLE; + osm_sm_signal(p_mgr->sm, OSM_SIGNAL_SWEEP); + break; + default: + __osm_state_mgr_signal_error(p_mgr, signal); + break; } + /* stay with the same signal - so we can start the sweep */ + break; + default: + CL_ASSERT(FALSE); + osm_log(p_mgr->p_log, OSM_LOG_ERROR, + "osm_state_mgr_process: ERR 3320: " + "Invalid SM state %u\n", p_mgr->state); + break; } OSM_LOG_EXIT(p_mgr->p_log); diff --git a/opensm/opensm/osm_sw_info_rcv.c b/opensm/opensm/osm_sw_info_rcv.c index dbf8b8c..2cc887a 100644 --- a/opensm/opensm/osm_sw_info_rcv.c +++ b/opensm/opensm/osm_sw_info_rcv.c @@ -562,8 +562,7 @@ void osm_si_rcv_process(IN void *context, IN void *data) if (__osm_si_rcv_process_existing (sm, p_node, p_madw)) { CL_PLOCK_RELEASE(sm->p_lock); - osm_sm_signal(&sm->p_subn->p_osm->sm, - OSM_SIGNAL_CHANGE_DETECTED); + sm->p_subn->force_heavy_sweep = 1; goto Exit; } } diff --git a/opensm/opensm/osm_sweep_fail_ctrl.c b/opensm/opensm/osm_sweep_fail_ctrl.c index 92b3165..3a5190f 100644 --- a/opensm/opensm/osm_sweep_fail_ctrl.c +++ b/opensm/opensm/osm_sweep_fail_ctrl.c @@ -65,7 +65,7 @@ static void __osm_sweep_fail_ctrl_disp_callback(IN void *context, /* Notify the state manager that we had a light sweep failure. 
*/ - osm_sm_signal(p_ctrl->sm, OSM_SIGNAL_LIGHT_SWEEP_FAIL); + p_ctrl->sm->p_subn->force_heavy_sweep = 1; OSM_LOG_EXIT(p_ctrl->sm->p_log); } -- 1.5.4.rc2.60.gb2e62 From bob.kossey at hp.com Wed Jan 30 10:54:23 2008 From: bob.kossey at hp.com (Kossey, Robert) Date: Wed, 30 Jan 2008 18:54:23 +0000 Subject: [ofa-general] OFED Jan 28 meeting summary on RC3 readiness Message-ID: <93703EDD1F125544A59BD25384B175621C86870E96@G1W0485.americas.hpqcorp.net> I would prefer not to see OFED 1.3 delayed for this. There will always be another bug, so you have to close the release and ship at some point. In the case of these particular bugs, IIRC, the first involved an older HCA that may not be widely used. The other UDP performance bugs do not have any ready fixes that I'm aware of. A more general question I would like to ask the group is how many people use OFED from the RH or SUSE distros as is, as compared with using OFED releases from other sources like the IB vendors, or building their own from openfabrics.org? We use RH distros, but to this point, the OFED support provided in RH distros has lagged behind the latest releases available from openfabrics.org. This is not to fault Red Hat, but OFED is still changing too rapidly, with minor point releases and bug fixes, for a distro to keep up. I think many of us hope that someday that will not be the case, but appears to be true for the foreseeable future. Right now, our mode of operation is to remove whatever IB support comes in the distro and replace it, so it does not help us to delay OFED 1.3 to get a particular bug fix in a distro. Bob > -----Original Message----- > ... > Doug Ledford wrote: > > > > Hmmm...I'd like to put my $.02 in here. I don't have any > visibility > > into what drives the OFED schedule, so I have no clue as to > why people > > don't want to slip the schedule for this change. I'm sure you guys > > have your reasons. 
However, I also happen to be a consumer of this > > code, and I know for a fact that no one has gotten my input on this > > issue. So, the deal is that I'm currently integrating OFED > 1.3 into > > what will be RHEL5.2. The RHEL5.2 freeze date has already > passed, but > > in order to keep what finally goes out from being too > stale, I'm being > > allowed to submit the OFED-1.3-rc1 code prior to freeze, and then > > update to > > OFED-1.3 final during our beta test process. What this > means, is that > > anything you punt from 1.3 to 1.3.1, you are also punting out of > > RHEL5.2 and RHEL4.7. So, that being said, there's a whole trickle > > down effect with various groups that would really like to > be able to > > use 5.2 out of the box that may prefer a slip in 1.3 so > that this can > > be part of it instead of punting to 1.3.1. I'm not saying > this will > > change your mind, but I'm sure it wasn't part of the > decision process > > before, so I'm bringing it up. > > > Thanks for the input (BTW you are welcome to join our weekly > meetings and give us feedback online) I think it is important > to make sure RH new versions will include best OFED release > > This my suggestion is: > > * Delay 1.3 release in a week > * Do RC4 next week - Feb 6 > * Add RC5 on Feb 18 - this will be the GOLD version > * GA release on Feb 25 > > > All - please reply if this is acceptable > > > > > > 760 major eli at mellanox.co.il UDP performance on > Rx is lower > > than Tx - for 1.3.1 > > 761 major eli at mellanox.co.il Poor and jittery UDP > > performance at small messages - for 1.3.1 > > > > > > Ditto for requesting these two be in 1.3. We've already > had customers > > bring up the UDP performance issue in our previous releases. 
> > > > > We will push some fixes of these to RC4 if the above plan is accepted > > Tziporet From jlentini at netapp.com Wed Jan 30 11:04:17 2008 From: jlentini at netapp.com (James Lentini) Date: Wed, 30 Jan 2008 14:04:17 -0500 (EST) Subject: [nfs-rdma-devel] [ofa-general] Status of NFS-RDMA ? In-Reply-To: <20080130161924.GA31154@cefeid.wcss.wroc.pl> References: <20080123194728.GA10437@cefeid.wcss.wroc.pl> <4797AD59.2000206@mellanox.co.il> <20080126193035.GA21209@cefeid.wcss.wroc.pl> <20080129003731.GA30262@cefeid.wcss.wroc.pl> <20080130161924.GA31154@cefeid.wcss.wroc.pl> Message-ID: On Wed, 30 Jan 2008, Pawel Dziekonski wrote: > On Tue, 29 Jan 2008 at 09:53:46AM -0500, James Lentini wrote: > > > > Is your goal to build a kernel with an NFS/RDMA server? If so, the > > kernel sources from Tom Tucker's git tree are the ones you want, not > > the old OFED 1.2-based packages which are out of date. > > > > Did you try setting up the NFS/RDMA server on the kernel used for your > > MPI tests above? > > my goal is to have everything running on my fabric- NFS/RDMA, MPI, > IPoverIB, SDP. > > my current status is: > > - Tom Tucker's git tree compiled and running > - compiled OFED from > http://www.mellanox.com/downloads/NFSoRDMA/OFED-1.2-NFS-RDMA.gz > (whatever it is - who knows?) - MPI is working, SDP not. > > - nfs-utils 1.1.1 compiled and: > > nfs server start: > #!/bin/sh > /etc/rc.d/init.d/portmap restart > modprobe nfs > umount /proc/fs/nfsd > mount -t nfsd /proc/fs/nfsd > exportfs -av > rpc.mountd > rpc.statd --no-notify > rpc.nfsd > sm-notify > > # cat /etc/exports > /scratch 10.2.2.2(no_subtree_check,insecure,rw,async,no_root_squash) > > nfs client start: > #!/bin/sh > /etc/rc.d/init.d/portmap restart > modprobe nfs > sm-notify > > # mount.rnfs -o rdma=10.2.2.1 10.2.2.1:/scratch /mnt > Doing nfs/rdma mount to 10.2.2.1, mount protocol to 10.2.2.1 > nfsmount: Invalid argument Are you using the mount.nfs command you built from nfs-utils-1.1.1? 
If you installed nfs-utils, you should be doing something like this (mount will redirect to /sbin/mount.nfs if it is present): /sbin/mount :/ /mnt -i -o rdma,port=2050 There is more info here: http://nfs-rdma.sourceforge.net/Documents/README > :( > > -- > Pawel Dziekonski > Wroclaw Centre for Networking & Supercomputing, HPC Department > Politechnika Wr., pl. Grunwaldzki 9, bud. D2/101, 50-377 Wroclaw, POLAND > phone: +48 71 3202043, fax: +48 71 3225797, http://www.wcss.wroc.pl > From xma at us.ibm.com Wed Jan 30 11:08:23 2008 From: xma at us.ibm.com (Shirley Ma) Date: Wed, 30 Jan 2008 11:08:23 -0800 Subject: [ewg] Re: [ofa-general] OFED Jan 28 meeting summary on RC3 readiness In-Reply-To: <47A0A86A.5060003@mellanox.co.il> Message-ID: general-bounces at lists.openfabrics.org wrote on 01/30/2008 08:40:10 AM: > Doug Ledford wrote: > > > > Hmmm...I'd like to put my $.02 in here. I don't have any visibility > > into what drives the OFED schedule, so I have no clue as to why people > > don't want to slip the schedule for this change. I'm sure you guys have > > your reasons. However, I also happen to be a consumer of this code, and > > I know for a fact that no one has gotten my input on this issue. So, > > the deal is that I'm currently integrating OFED 1.3 into what will be > > RHEL5.2. The RHEL5.2 freeze date has already passed, but in order to > > keep what finally goes out from being too stale, I'm being allowed to > > submit the OFED-1.3-rc1 code prior to freeze, and then update to > > OFED-1.3 final during our beta test process. What this means, is that > > anything you punt from 1.3 to 1.3.1, you are also punting out of RHEL5.2 > > and RHEL4.7. So, that being said, there's a whole trickle down effect > > with various groups that would really like to be able to use 5.2 out of > > the box that may prefer a slip in 1.3 so that this can be part of it > > instead of punting to 1.3.1. 
I'm not saying this will change your mind, > > but I'm sure it wasn't part of the decision process before, so I'm > > bringing it up. > > > Thanks for the input (BTW you are welcome to join our weekly meetings > and give us feedback online) > I think it is important to make sure RH new versions will include best > OFED release > > This my suggestion is: > > * Delay 1.3 release in a week > * Do RC4 next week - Feb 6 > * Add RC5 on Feb 18 - this will be the GOLD version > * GA release on Feb 25 > > > All - please reply if this is acceptable > > > > > > 760 major eli at mellanox.co.il UDP performance on Rx is lower > > than Tx - for 1.3.1 > > 761 major eli at mellanox.co.il Poor and jittery UDP > > performance at small messages - for 1.3.1 > > > > > > Ditto for requesting these two be in 1.3. We've already had customers > > bring up the UDP performance issue in our previous releases. > > > > > We will push some fixes of these to RC4 if the above plan is accepted > > Tziporet Is it also possible to include some delayed features that are planned for a later release? For example IPoIB noSRQ and 4K MTU; we already have some customer requests for them. IPoIB noSRQ is already upstream, but it's not in 2.6.24; it will be in 2.6.25. The 4K MTU patch is under review and has passed our tests. I will post a new version against RC3, and split the patch into several pieces for 2.6.25 upstream submission. Thanks, Shirley From jlentini at netapp.com Wed Jan 30 11:11:00 2008 From: jlentini at netapp.com (James Lentini) Date: Wed, 30 Jan 2008 14:11:00 -0500 (EST) Subject: [nfs-rdma-devel] [ofa-general] Status of NFS-RDMA ?
In-Reply-To: References: <20080123194728.GA10437@cefeid.wcss.wroc.pl> <4797AD59.2000206@mellanox.co.il> <20080126193035.GA21209@cefeid.wcss.wroc.pl> <20080129003731.GA30262@cefeid.wcss.wroc.pl> <20080130161924.GA31154@cefeid.wcss.wroc.pl> Message-ID: On Wed, 30 Jan 2008, James Lentini wrote: > > > On Wed, 30 Jan 2008, Pawel Dziekonski wrote: > > > On Tue, 29 Jan 2008 at 09:53:46AM -0500, James Lentini wrote: > > > > > > Is your goal to build a kernel with an NFS/RDMA server? If so, the > > > kernel sources from Tom Tucker's git tree are the ones you want, not > > > the old OFED 1.2-based packages which are out of date. > > > > > > Did you try setting up the NFS/RDMA server on the kernel used for your > > > MPI tests above? > > > > my goal is to have everything running on my fabric- NFS/RDMA, MPI, > > IPoverIB, SDP. > > > > my current status is: > > > > - Tom Tucker's git tree compiled and running > > - compiled OFED from > > http://www.mellanox.com/downloads/NFSoRDMA/OFED-1.2-NFS-RDMA.gz > > (whatever it is - who knows?) - MPI is working, SDP not. > > > > - nfs-utils 1.1.1 compiled and: > > > > nfs server start: > > #!/bin/sh > > /etc/rc.d/init.d/portmap restart > > modprobe nfs > > umount /proc/fs/nfsd > > mount -t nfsd /proc/fs/nfsd > > exportfs -av > > rpc.mountd > > rpc.statd --no-notify > > rpc.nfsd > > sm-notify > > > > # cat /etc/exports > > /scratch 10.2.2.2(no_subtree_check,insecure,rw,async,no_root_squash) > > > > nfs client start: > > #!/bin/sh > > /etc/rc.d/init.d/portmap restart > > modprobe nfs > > sm-notify > > > > # mount.rnfs -o rdma=10.2.2.1 10.2.2.1:/scratch /mnt > > Doing nfs/rdma mount to 10.2.2.1, mount protocol to 10.2.2.1 > > nfsmount: Invalid argument > > Are you using the mount.nfs command you built from nfs-utils-1.1.1? 
If > you installed nfs-utils, you should be doing something like this > (mount will redirect to /sbin/mount.nfs if it is present): > > /sbin/mount :/ /mnt -i -o rdma,port=2050 or /bin/mount if that is where your mount command lives. > There is more info here: > > http://nfs-rdma.sourceforge.net/Documents/README > > > :( > > > > -- > > Pawel Dziekonski > > Wroclaw Centre for Networking & Supercomputing, HPC Department > > Politechnika Wr., pl. Grunwaldzki 9, bud. D2/101, 50-377 Wroclaw, POLAND > > phone: +48 71 3202043, fax: +48 71 3225797, http://www.wcss.wroc.pl > > > > ------------------------------------------------------------------------- > This SF.net email is sponsored by: Microsoft > Defy all challenges. Microsoft(R) Visual Studio 2008. > http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/ > _______________________________________________ > nfs-rdma-devel mailing list > nfs-rdma-devel at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/nfs-rdma-devel > From rosnbrg at us.ibm.com Wed Jan 30 11:36:31 2008 From: rosnbrg at us.ibm.com (Bryan S Rosenburg) Date: Wed, 30 Jan 2008 14:36:31 -0500 Subject: [ofa-general] [PATCH 2/3] RDMA/cxgb3: fix page shift calculation in build_phys_page_list() In-Reply-To: Message-ID: Roland Dreier wrote on 01/30/2008 11:47:55 AM: > > > I'm still annoyed by the (num_phys_buf == 1) special case. I'm wondering > > if it's still needed. If you leave out that if-statement entirely, you > > may end up using a page size that is larger (maybe much larger) than > > necessary, but I think things will still work, given that the > > virtual-to-physical alignment constraints are respected. If you remove > > the special case, you can replace the whole loop with an ffs() call. > > Makes sense... let me post a patch for discussion. For what it's worth, I removed the single-buffer special case and replaced the whole second loop with an __ffs() call. 
As expected, it uses a very large page size to cover even a small single buffer if the alignment constraints allow it, and it works. Here's what I've been playing with:

================================================================================
diff --git a/drivers/infiniband/hw/mthca/mthca_provider.c b/drivers/infiniband/hw/mthca/mthca_provider.c
index 6bcde1c..b00ab71 100644
--- a/drivers/infiniband/hw/mthca/mthca_provider.c
+++ b/drivers/infiniband/hw/mthca/mthca_provider.c
@@ -929,11 +929,7 @@ static struct ib_mr *mthca_reg_phys_mr(struct ib_pd *pd,
 	int err;
 	int i, j, n;
 
-	/* First check that we have enough alignment */
-	if ((*iova_start & ~PAGE_MASK) != (buffer_list[0].addr & ~PAGE_MASK))
-		return ERR_PTR(-EINVAL);
-
-	mask = 0;
+	mask = buffer_list[0].addr ^ *iova_start;
 	total_size = 0;
 	for (i = 0; i < num_phys_buf; ++i) {
 		if (i != 0)
@@ -948,16 +944,7 @@ static struct ib_mr *mthca_reg_phys_mr(struct ib_pd *pd,
 		return ERR_PTR(-EINVAL);
 
 	/* Find largest page shift we can use to cover buffers */
-	for (shift = PAGE_SHIFT; shift < 31; ++shift)
-		if (num_phys_buf > 1) {
-			if ((1ULL << shift) & mask)
-				break;
-		} else {
-			if (1ULL << shift >=
-			    buffer_list[0].size +
-			    (buffer_list[0].addr & ((1ULL << shift) - 1)))
-				break;
-		}
+	shift = __ffs((unsigned long) ((mask & ((1ull<<31)-1)) | (1ull<<31)));
 
 	buffer_list[0].size += buffer_list[0].addr & ((1ULL << shift) - 1);
 	buffer_list[0].addr &= ~0ull << shift;
================================================================================
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From rdreier at cisco.com Wed Jan 30 12:05:04 2008
From: rdreier at cisco.com (Roland Dreier)
Date: Wed, 30 Jan 2008 12:05:04 -0800
Subject: [ofa-general] [PATCH 2/3] RDMA/cxgb3: fix page shift calculation in build_phys_page_list()
In-Reply-To: (Bryan S.
Rosenburg's message of "Wed, 30 Jan 2008 14:36:31 -0500")
References:
Message-ID:

Looks almost identical to what I came up with (below), except:

> +	shift = __ffs((unsigned long) ((mask & ((1ull<<31)-1)) | (1ull<<31)));

I think there's no reason to mask off the top bit of mask if we're going to set it immediately; I just did:

+	shift = __ffs(mask | 1 << 31);

and here's the whole patch I'll queue up, if it looks good to you:

diff --git a/drivers/infiniband/hw/mthca/mthca_provider.c b/drivers/infiniband/hw/mthca/mthca_provider.c
index 6bcde1c..19b7f61 100644
--- a/drivers/infiniband/hw/mthca/mthca_provider.c
+++ b/drivers/infiniband/hw/mthca/mthca_provider.c
@@ -923,17 +923,13 @@ static struct ib_mr *mthca_reg_phys_mr(struct ib_pd *pd,
 	struct mthca_mr *mr;
 	u64 *page_list;
 	u64 total_size;
-	u64 mask;
+	unsigned long mask;
 	int shift;
 	int npages;
 	int err;
 	int i, j, n;
 
-	/* First check that we have enough alignment */
-	if ((*iova_start & ~PAGE_MASK) != (buffer_list[0].addr & ~PAGE_MASK))
-		return ERR_PTR(-EINVAL);
-
-	mask = 0;
+	mask = buffer_list[0].addr ^ *iova_start;
 	total_size = 0;
 	for (i = 0; i < num_phys_buf; ++i) {
 		if (i != 0)
@@ -947,17 +943,7 @@ static struct ib_mr *mthca_reg_phys_mr(struct ib_pd *pd,
 	if (mask & ~PAGE_MASK)
 		return ERR_PTR(-EINVAL);
 
-	/* Find largest page shift we can use to cover buffers */
-	for (shift = PAGE_SHIFT; shift < 31; ++shift)
-		if (num_phys_buf > 1) {
-			if ((1ULL << shift) & mask)
-				break;
-		} else {
-			if (1ULL << shift >=
-			    buffer_list[0].size +
-			    (buffer_list[0].addr & ((1ULL << shift) - 1)))
-				break;
-		}
+	shift = __ffs(mask | 1 << 31);
 
 	buffer_list[0].size += buffer_list[0].addr & ((1ULL << shift) - 1);
 	buffer_list[0].addr &= ~0ull << shift;

From rosnbrg at us.ibm.com Wed Jan 30 12:41:33 2008
From: rosnbrg at us.ibm.com (Bryan S Rosenburg)
Date: Wed, 30 Jan 2008 15:41:33 -0500
Subject: [ofa-general] [PATCH 2/3] RDMA/cxgb3: fix page shift calculation in build_phys_page_list()
In-Reply-To:
Message-ID:

Roland Dreier wrote on
01/30/2008 03:05:04 PM: > Looks almost identical to what I came up with (below), except: > > > + shift = __ffs((unsigned long) ((mask & ((1ull<<31)-1)) | (1ull<<31))); > > I think there's no reason to mask off the top bit of mask if we're > going to set it immediately; I was actually masking off the top 33 bits before setting bit 31. I agree that it's unnecessary, but I wanted it to be clear that the possible truncation from 64 to 32 bits (on 32-bit target machines) is okay. I'd have left off the masking and the cast if the kernel provided something like __ffsll() or __ffs64() that would take 64-bit arguments even on 32-bit machines, but it doesn't as far as I can tell. You've changed mask itself into an unsigned long: > - u64 mask; > + unsigned long mask; so the truncation (on 32-bit machines) is now happening as the mask is constructed. In this case the truncation is okay, because ultimately you only care about the low-order bits anyway, but implicit truncations always raise a red flag for me. In any case, I think your patch does the job and does it efficiently. - Bryan -------------- next part -------------- An HTML attachment was scrubbed...
URL:

From tziporet at dev.mellanox.co.il Wed Jan 30 13:01:34 2008
From: tziporet at dev.mellanox.co.il (Tziporet Koren)
Date: Wed, 30 Jan 2008 23:01:34 +0200
Subject: [ewg] Re: [ofa-general] OFED Jan 28 meeting summary on RC3 readiness
In-Reply-To:
References: <6C2C79E72C305246B504CBA17B5500C903368531@mtlexch01.mtl.com> <1201639285.28486.101.camel@firewall.xsintricity.com> <47A0A86A.5060003@mellanox.co.il>
Message-ID: <47A0E5AE.9020607@mellanox.co.il>

Tang, Changqing wrote:
> When do you pack the official RC3 ? Thanks.
>
>
Already packed - mail will go out soon

Tziporet

From tziporet at mellanox.co.il Wed Jan 30 13:27:31 2008
From: tziporet at mellanox.co.il (Tziporet Koren)
Date: Wed, 30 Jan 2008 23:27:31 +0200
Subject: [ofa-general] OFED 1.3 RC3 release is available
Message-ID: <6C2C79E72C305246B504CBA17B5500C903369030@mtlexch01.mtl.com>

Hi,

OFED 1.3 RC3 release is available on
http://www.openfabrics.org/builds/ofed-1.3/release/OFED-1.3-rc3.tgz

To get BUILD_ID run ofed_info

Please report any issues in bugzilla https://bugs.openfabrics.org/

The RC4 release is expected on February 13

Tziporet & Vlad

========================================================================

Release information:
--------------------

Linux Operating Systems:
  - RedHat EL4 up4: 2.6.9-42.ELsmp
  - RedHat EL4 up5: 2.6.9-55.ELsmp
  - RedHat EL4 up6: 2.6.9-67.ELsmp *
  - RedHat EL5: 2.6.18-8.el5
  - RedHat EL5 up1: 2.6.18-53.el5
  - Fedora C6: 2.6.18-8.fc6 *
  - SLES10: 2.6.16.21-0.8-smp
  - SLES10 SP1: 2.6.16.46-0.12-smp
  - SLES10 SP1 up1: 2.6.16.53-0.16-smp
  - OpenSuSE 10.3: 2.6.22-*-* *
  - kernel.org: 2.6.23 and 2.6.24

  * OSes that are partially tested

Systems:
  * x86_64
  * x86
  * ia64
  * ppc64

Main Changes from OFED 1.3-RC2
===============================
* Fixed 14 Bugs (see attachment)
* Kernel is based on 2.6.24
* MPI packages update:
    * mvapich-1.0.0-1920.src.rpm
    * mvapich2-1.0.2-1.src.rpm
* Dapl: new release
* Bonding: new module
* XRC enhanced API (see note below)
* RDS: many bug fixes
* IPoIB: Removed the HW checksum patch

Tasks that should be completed for RC4:
=======================================
1. Fix bugs
2. Decide whether we take the new IPoIB patches from IBM (4K MTU and Non-SRQ CM)

XRC note:
ibv_modify_xrc_rcv_qp/ibv_reg_xrc_rcv_qp/ibv_unreg_xrc_rcv_qp all have a bug that the kernel layer returns 0 on success, instead of returning the number of bytes received in the userspace verb request. As a result, the userspace verbs erroneously report failure on their return. This bug has already been fixed in the daily build (the next OFED daily build will contain the fix.) The original XRC verbs (ibv_create_xrc_srq, ibv_open_xrc_domain, ibv_close_xrc_domain) work properly.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: rc3-fixed-bugs.csv
Type: application/octet-stream
Size: 1630 bytes
Desc: rc3-fixed-bugs.csv
URL:

From tziporet at dev.mellanox.co.il Wed Jan 30 13:35:50 2008
From: tziporet at dev.mellanox.co.il (Tziporet Koren)
Date: Wed, 30 Jan 2008 23:35:50 +0200
Subject: [ewg] Re: [ofa-general] OFED Jan 28 meeting summary on RC3 readiness
In-Reply-To: <93703EDD1F125544A59BD25384B175621C86870E96@G1W0485.americas.hpqcorp.net>
References: <93703EDD1F125544A59BD25384B175621C86870E96@G1W0485.americas.hpqcorp.net>
Message-ID: <47A0EDB6.5050804@mellanox.co.il>

Kossey, Robert wrote:
> I would prefer not to see OFED 1.3 delayed for this. There will always be another bug, so you have to close the release and ship at some point. In the case of these particular bugs, IIRC, the first involved an older HCA that may not be widely used. The other UDP performance bugs do not have any ready fixes that I'm aware of.
>
> A more general question I would like to ask the group is how many people use OFED from the RH or SUSE distros as is, as compared with using OFED releases from other sources like the IB vendors, or building their own from openfabrics.org?
We use RH distros, but to this point, the OFED support provided in RH distros has lagged behind the latest releases available from openfabrics.org. This is not to fault Red Hat, but OFED is still changing too rapidly, with minor point releases and bug fixes, for a distro to keep up. I think many of us hope that someday that will not be the case, but appears to be true for the foreseeable future. Right now, our mode of operation is to remove whatever IB support comes in the distro and replace it, so it does not help us to delay OFED 1.3 to get a particular bug fix in a distro. > > The main reason is not the bugs but the features supported by IBM - CM support for non SRQ and 4K MTU I see that these are important for IBM (see other mails) Another thing we can do in order not to delay the release is insert the changes tomorrow (immediately after RC3 is out) and do RC4 next week (instead of 2 weeks between every RC), and RC5 the week after. In this way we will have enough time for testing and if we find some bug we can fix then in RC5 Is this better? Tziporet From rdreier at cisco.com Wed Jan 30 13:47:16 2008 From: rdreier at cisco.com (Roland Dreier) Date: Wed, 30 Jan 2008 13:47:16 -0800 Subject: [ofa-general] page mask calculation in mthca mthca_reg_phys_mr In-Reply-To: <479F1E04.3000204@sun.com> (Liang Zhen's message of "Tue, 29 Jan 2008 20:37:24 +0800") References: <479F1E04.3000204@sun.com> Message-ID: > I think there is a similar bug in mthca as in cxgb3 > (http://lists.openfabrics.org/pipermail/general/2008-January/045246.html), > It can get wrong mask as start address of the first fragment and end > address of the last fragment are ignored, here is some example of this: We had a fairly long discussion about this just now. It turns out there is a bug, but not the one you are worried about. 
It's fine to ignore the start address of the first fragment and the end address of the last fragment, since the starting virtual address and length of the region will take care of that. The problem is if the virtual address has the wrong alignment.

I am planning on merging the patch below, which simplifies things and also fixes the bug:

diff --git a/drivers/infiniband/hw/mthca/mthca_provider.c b/drivers/infiniband/hw/mthca/mthca_provider.c
index 6bcde1c..19b7f61 100644
--- a/drivers/infiniband/hw/mthca/mthca_provider.c
+++ b/drivers/infiniband/hw/mthca/mthca_provider.c
@@ -923,17 +923,13 @@ static struct ib_mr *mthca_reg_phys_mr(struct ib_pd *pd,
 	struct mthca_mr *mr;
 	u64 *page_list;
 	u64 total_size;
-	u64 mask;
+	unsigned long mask;
 	int shift;
 	int npages;
 	int err;
 	int i, j, n;
 
-	/* First check that we have enough alignment */
-	if ((*iova_start & ~PAGE_MASK) != (buffer_list[0].addr & ~PAGE_MASK))
-		return ERR_PTR(-EINVAL);
-
-	mask = 0;
+	mask = buffer_list[0].addr ^ *iova_start;
 	total_size = 0;
 	for (i = 0; i < num_phys_buf; ++i) {
 		if (i != 0)
@@ -947,17 +943,7 @@ static struct ib_mr *mthca_reg_phys_mr(struct ib_pd *pd,
 	if (mask & ~PAGE_MASK)
 		return ERR_PTR(-EINVAL);
 
-	/* Find largest page shift we can use to cover buffers */
-	for (shift = PAGE_SHIFT; shift < 31; ++shift)
-		if (num_phys_buf > 1) {
-			if ((1ULL << shift) & mask)
-				break;
-		} else {
-			if (1ULL << shift >=
-			    buffer_list[0].size +
-			    (buffer_list[0].addr & ((1ULL << shift) - 1)))
-				break;
-		}
+	shift = __ffs(mask | 1 << 31);
 
 	buffer_list[0].size += buffer_list[0].addr & ((1ULL << shift) - 1);
 	buffer_list[0].addr &= ~0ull << shift;

From sean.hefty at intel.com Wed Jan 30 14:03:58 2008
From: sean.hefty at intel.com (Sean Hefty)
Date: Wed, 30 Jan 2008 14:03:58 -0800
Subject: [ewg] Re: [ofa-general] OFED Jan 28 meeting summary on RC3readiness
In-Reply-To: <47A0EDB6.5050804@mellanox.co.il>
References: <93703EDD1F125544A59BD25384B175621C86870E96@G1W0485.americas.hpqcorp.net> <47A0EDB6.5050804@mellanox.co.il>
Message-ID: <000601c8638c$03fc8070$0dfd070a@amr.corp.intel.com> >The main reason is not the bugs but the features supported by IBM - CM >support for non SRQ and 4K MTU These are entirely my opinions, but... OFED isn't even at RC1 if it's not at feature freeze... OFED has moved well beyond trying to provide an enterprise distribution to simply providing an experimental code base more concerned with including the latest and greatest features. It's become the staging area for getting the code into shape for merging upstream, which wasn't what I thought was the purpose of OFED. - Sean From meier3 at llnl.gov Wed Jan 30 14:19:24 2008 From: meier3 at llnl.gov (Timothy A. Meier) Date: Wed, 30 Jan 2008 14:19:24 -0800 Subject: [ofa-general] [PATCH] opensm: diags better error checking for DR option Message-ID: <47A0F7EC.3040004@llnl.gov> Sasha, this patch contains a bug fix for the -D option for the two perl utils iblinkinfo.pl and ibqueryerrors.pl. Since I touched these files, I also ran them through 'perltidy' with your suggestion options. This of course caused the patch to become large. Do you want me to back out the 'tidy' changes and resubmit the patch without formatting changes? Or is it okay the way it is? Due to its size I decided not to include the patch in the body of this email. -- Timothy A. Meier Computer Scientist ICCD/High Performance Computing 925.422.3341 meier3 at llnl.gov -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... 
Name: 0001-opensm-diags-better-error-checking-for-DR-option.patch URL: From betsy.zeller at qlogic.com Wed Jan 30 14:42:16 2008 From: betsy.zeller at qlogic.com (Betsy Zeller) Date: Wed, 30 Jan 2008 14:42:16 -0800 Subject: [ewg] Re: [ofa-general] OFED Jan 28 meeting summary on RC3 readiness In-Reply-To: <47A0A86A.5060003@mellanox.co.il> References: <6C2C79E72C305246B504CBA17B5500C903368531@mtlexch01.mtl.com> <1201639285.28486.101.camel@firewall.xsintricity.com> <47A0A86A.5060003@mellanox.co.il> Message-ID: <1201732936.7324.64.camel@sarium.pathscale.com> One of the things driving the OFED 1.3 date is that OFED 1.3 has to be released before the Plugfest, which starts on March 10. I can deal with slipping OFED GA date to Feb 25, but I really don't think we should let it slip into March. How confident are the developers that, if they get the extra week, there won't be further slippage? Doug - thanks very much for letting us know the plans for RHEL5 U2 - it's great news that OFED 1.3 (final release) will be included. - Betsy On Wed, 2008-01-30 at 18:40 +0200, Tziporet Koren wrote: > Doug Ledford wrote: > > > > Hmmm...I'd like to put my $.02 in here. I don't have any visibility > > into what drives the OFED schedule, so I have no clue as to why people > > don't want to slip the schedule for this change. I'm sure you guys have > > your reasons. However, I also happen to be a consumer of this code, and > > I know for a fact that no one has gotten my input on this issue. So, > > the deal is that I'm currently integrating OFED 1.3 into what will be > > RHEL5.2. The RHEL5.2 freeze date has already passed, but in order to > > keep what finally goes out from being too stale, I'm being allowed to > > submit the OFED-1.3-rc1 code prior to freeze, and then update to > > OFED-1.3 final during our beta test process. What this means, is that > > anything you punt from 1.3 to 1.3.1, you are also punting out of RHEL5.2 > > and RHEL4.7. 
So, that being said, there's a whole trickle down effect > > with various groups that would really like to be able to use 5.2 out of > > the box that may prefer a slip in 1.3 so that this can be part of it > > instead of punting to 1.3.1. I'm not saying this will change your mind, > > but I'm sure it wasn't part of the decision process before, so I'm > > bringing it up. > > > Thanks for the input (BTW you are welcome to join our weekly meetings > and give us feedback online) > I think it is important to make sure RH new versions will include best > OFED release > > This my suggestion is: > > * Delay 1.3 release in a week > * Do RC4 next week - Feb 6 > * Add RC5 on Feb 18 - this will be the GOLD version > * GA release on Feb 25 > > > All - please reply if this is acceptable > > > > > > 760 major eli at mellanox.co.il UDP performance on Rx is lower > > than Tx - for 1.3.1 > > 761 major eli at mellanox.co.il Poor and jittery UDP > > performance at small messages - for 1.3.1 > > > > > > Ditto for requesting these two be in 1.3. We've already had customers > > bring up the UDP performance issue in our previous releases. 
> > > > > We will push some fixes of these to RC4 if the above plan is accepted > > Tziporet > _______________________________________________ > ewg mailing list > ewg at lists.openfabrics.org > http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ewg > From dledford at redhat.com Wed Jan 30 14:55:06 2008 From: dledford at redhat.com (Doug Ledford) Date: Wed, 30 Jan 2008 17:55:06 -0500 Subject: [ewg] Re: [ofa-general] OFED Jan 28 meeting summary on RC3readiness In-Reply-To: <000601c8638c$03fc8070$0dfd070a@amr.corp.intel.com> References: <93703EDD1F125544A59BD25384B175621C86870E96@G1W0485.americas.hpqcorp.net> <47A0EDB6.5050804@mellanox.co.il> <000601c8638c$03fc8070$0dfd070a@amr.corp.intel.com> Message-ID: <1201733706.28486.218.camel@firewall.xsintricity.com> On Wed, 2008-01-30 at 14:03 -0800, Sean Hefty wrote: > >The main reason is not the bugs but the features supported by IBM - CM > >support for non SRQ and 4K MTU > > These are entirely my opinions, but... > > OFED isn't even at RC1 if it's not at feature freeze... > > OFED has moved well beyond trying to provide an enterprise distribution to > simply providing an experimental code base more concerned with including the > latest and greatest features. It's become the staging area for getting the code > into shape for merging upstream, which wasn't what I thought was the purpose of > OFED. Well, that's not really a fair thing to say given that the CM support for non SRQ patch *is* upstream, it just isn't in OFED. As far as OFED not even being at RC1 if it isn't at feature freeze, that all depends on what's classified as a feature. I know the two patches above were called features by Tziporet, but if this were an internal Red Hat project, those would have been more correctly classified as blockers. 
Once we've passed our feature freeze deadline and started our testing and validation, if a bug or shortcoming is found in some new code we submitted, then that is classified as a blocker (unless it's actually unimportant enough that we can leave it, but there are very few of this sort of thing ever found). For us anyway, this will be our first release where we are turning on CM support in IPoIB. It would be a legitimate bug that the code as submitted doesn't work across all the hardware. So, that would be a blocker bug, with the fix being the non-SRQ support. Anyway, I got the impression that the real sentiment of your mail was less about those two bugs/features and more that OFED seems to be more of an experimental source repo than an enterprise distribution. In all fairness, the kernel portion of all of this, and the process of getting things into Linus' kernel, has *always* been a case of staging things in Roland's tree and then merging upstream. So, at least for the kernel, that's mostly true as OFED is pretty close to Roland's tree generally speaking. As for the user space packages though, you guys *are* the upstream. There's no one to merge upstream to and very little oversight by anyone. So, it's entirely up to all of you just how much your package seems to be a feature of the day change-athon versus a solid, stable program. -- Doug Ledford GPG KeyID: CFBFF194 http://people.redhat.com/dledford Infiniband specific RPMs available at http://people.redhat.com/dledford/Infiniband -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 189 bytes Desc: This is a digitally signed message part URL: From or.gerlitz at gmail.com Wed Jan 30 15:30:08 2008 From: or.gerlitz at gmail.com (Or Gerlitz) Date: Thu, 31 Jan 2008 01:30:08 +0200 Subject: [ofa-general] [RFC] IPoIB UD 4K MTU support In-Reply-To: <1201025321.756.33.camel@localhost.localdomain> References: <1201025321.756.33.camel@localhost.localdomain> Message-ID: <15ddcffd0801301530r16dbb7abs9fc805342347a056@mail.gmail.com> On 1/22/08, Shirley Ma wrote: > 4K MTU is still under testing. Can you share any findings from this testing? specifically, did it hurt small packet throughput? Are the switches used for this testing based on the Mellanox chip (AnafaII)? Or From mashirle at us.ibm.com Wed Jan 30 06:03:57 2008 From: mashirle at us.ibm.com (Shirley Ma) Date: Wed, 30 Jan 2008 06:03:57 -0800 Subject: [ofa-general] [RFC] IPoIB UD 4K MTU support In-Reply-To: <15ddcffd0801301530r16dbb7abs9fc805342347a056@mail.gmail.com> References: <1201025321.756.33.camel@localhost.localdomain> <15ddcffd0801301530r16dbb7abs9fc805342347a056@mail.gmail.com> Message-ID: <1201701837.6850.7.camel@localhost.localdomain> On Thu, 2008-01-31 at 01:30 +0200, Or Gerlitz wrote: > On 1/22/08, Shirley Ma wrote: > > 4K MTU is still under testing. > > Can you share any findings from this testing? specifically, did it > hurt small packet throughput? > Are the switches used for this testing based on the Mellanox chip (AnafaII)? > > Or Hello Or, The patch didn't change 2K MTU performance, it only needs one buff allocation for 2K MTU for any kernel PAGE_SIZE. It only impacts 4K MTU size memory allocation for kernel PAGE_SIZE less or equal to 4K. I didn't see any performance worse for 2K. I can run any message sizes with any streams for more test if you have any special request right away. The switch we used for our test is Qlogic Silverstorm. I believe it's Mellanox chip (AnafaII). I will double check and let you know the results. 
The patch here is needed to be updated. I will submit it tonight against RC3. You can verify what I have found in your test environment. Any problems, I am here to support.

Thanks
Shirley

From meier3 at llnl.gov Wed Jan 30 16:05:50 2008
From: meier3 at llnl.gov (Timothy A. Meier)
Date: Wed, 30 Jan 2008 16:05:50 -0800
Subject: [ofa-general] [PATCH] opensm: diags bug fix in iblinkinfo.pl
Message-ID: <47A110DE.6090103@llnl.gov>

Sasha,

Sorry, I missed this in my previous patch.

From b9bd2d2e5be0121c148fe7087ca7e6cce357a55e Mon Sep 17 00:00:00 2001
From: Tim Meier
Date: Wed, 30 Jan 2008 16:00:50 -0800
Subject: [PATCH] opensm: diags bug fix in iblinkinfo.pl

Fixes -D bug determining guid from dr

Signed-off-by: Tim Meier
---
 infiniband-diags/scripts/iblinkinfo.pl | 2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/infiniband-diags/scripts/iblinkinfo.pl b/infiniband-diags/scripts/iblinkinfo.pl
index 5f268dd..195c8cf 100755
--- a/infiniband-diags/scripts/iblinkinfo.pl
+++ b/infiniband-diags/scripts/iblinkinfo.pl
@@ -98,7 +98,7 @@ sub main
 	get_link_ends($regenerate_map, $ca_name, $ca_port);
 	if (defined($direct_route)) {
 		# convert DR to guid, then use original single_switch option
-		$single_switch = $IBswcountlimits::convert_dr_to_guid{$direct_route};
+		$single_switch = convert_dr_to_guid($direct_route);
 		if (!defined($single_switch) || !is_switch($single_switch)) {
 			printf("The direct route (%s) does not map to a switch.\n",
 				$direct_route);
--
1.5.1

--
Timothy A.
Meier Computer Scientist ICCD/High Performance Computing 925.422.3341 meier3 at llnl.gov From mashirle at us.ibm.com Wed Jan 30 06:13:24 2008 From: mashirle at us.ibm.com (Shirley Ma) Date: Wed, 30 Jan 2008 06:13:24 -0800 Subject: [ewg] Re: [ofa-general] OFED Jan 28 meeting summary on RC3readiness In-Reply-To: <000601c8638c$03fc8070$0dfd070a@amr.corp.intel.com> References: <93703EDD1F125544A59BD25384B175621C86870E96@G1W0485.americas.hpqcorp.net> <47A0EDB6.5050804@mellanox.co.il> <000601c8638c$03fc8070$0dfd070a@amr.corp.intel.com> Message-ID: <1201702404.6850.13.camel@localhost.localdomain> On Wed, 2008-01-30 at 14:03 -0800, Sean Hefty wrote: > >The main reason is not the bugs but the features supported by IBM - CM > >support for non SRQ and 4K MTU > > These are entirely my opinions, but... > > OFED isn't even at RC1 if it's not at feature freeze... > > OFED has moved well beyond trying to provide an enterprise distribution to > simply providing an experimental code base more concerned with including the > latest and greatest features. It's become the staging area for getting the code > into shape for merging upstream, which wasn't what I thought was the purpose of > OFED. > > - Sean Hello Sean, Only 4K MTU patch hasn't been upper stream. nonSRQ is upper stream already. The IPoIB implementation is hard coded and limit the packet size to 2048, which prevents IBM eHCA2 4K MTU from performing. I would say it's a bug fix from that prospective. And the majority of the code is making IPoIB-CM S/G code to be more generic so IPoIB-UD S/G can be reused it. It doesn't hurt 2K MTU much. 
Thanks
Shirley

From robert.j.woodruff at intel.com Wed Jan 30 17:10:38 2008
From: robert.j.woodruff at intel.com (Woodruff, Robert J)
Date: Wed, 30 Jan 2008 17:10:38 -0800
Subject: [ewg] Re: [ofa-general] OFED Jan 28 meeting summary on RC3readiness
In-Reply-To: <47A0A86A.5060003@mellanox.co.il>
References: <6C2C79E72C305246B504CBA17B5500C903368531@mtlexch01.mtl.com><1201639285.28486.101.camel@firewall.xsintricity.com> <47A0A86A.5060003@mellanox.co.il>
Message-ID:

Tziporet wrote,

> * Delay 1.3 release in a week
> * Do RC4 next week - Feb 6
> * Add RC5 on Feb 18 - this will be the GOLD version
> * GA release on Feb 25

>All - please reply if this is acceptable

I hate to keep slipping this, but I think it is important to get what RedHat needs into OFED 1.3, so I am not opposed to this.

I think however that perhaps after 1.3, we should discuss our process a bit to try to get a little better at making our original release dates. I think we are getting hit with feature creep, allowing some pretty major changes after the feature freeze date, late in the release cycle.

I also think that we do need to be a little more careful and selective about what features go into OFED, as it is supposed to be an enterprise release rather than an experimental code release.

For the kernel code, I think that this means keeping things a little closer to the kernel.org kernel features and if something is not upstream, then press for getting it upstream (or at least queued for upstream) rather than allowing big patches into OFED that have not had a good review. The way we are working now, if it is getting into OFED, people are less aggressive at getting things upstream.

Perhaps we can have a discussion about this at the Sonoma workshop.
my 2 cents,

woody

From kliteyn at mellanox.co.il Wed Jan 30 17:31:04 2008
From: kliteyn at mellanox.co.il (kliteyn at mellanox.co.il)
Date: 31 Jan 2008 03:31:04 +0200
Subject: [ofa-general] nightly osm_sim report 2008-01-31:normal completion
Message-ID:

OSM Simulation Regression Summary

[Generated mail - please do NOT reply]

OpenSM binary date = 2008-01-30
OpenSM git rev = Tue_Jan_29_09:24:40_2008 [63c04327bbdcd47cc37cb0cbfb366de16ae0ccb6]
ibutils git rev = Mon_Dec_24_10:42:01_2007 [675bec82306d6920555dd0b5e2f664983e27e60f]

Total=400 Pass=400 Fail=0

Pass:
30 Stability IS1-16.topo
30 Pkey IS1-16.topo
30 OsmTest IS1-16.topo
30 OsmStress IS1-16.topo
30 Multicast IS1-16.topo
30 LidMgr IS1-16.topo
10 Stability IS3-loop.topo
10 Stability IS3-128.topo
10 Pkey IS3-128.topo
10 OsmTest IS3-loop.topo
10 OsmTest IS3-128.topo
10 OsmStress IS3-128.topo
10 Multicast IS3-loop.topo
10 Multicast IS3-128.topo
10 LidMgr IS3-128.topo
10 FatTree merge-roots-4-ary-2-tree.topo
10 FatTree merge-root-4-ary-3-tree.topo
10 FatTree gnu-stallion-64.topo
10 FatTree blend-4-ary-2-tree.topo
10 FatTree RhinoDDR.topo
10 FatTree FullGnu.topo
10 FatTree 4-ary-2-tree.topo
10 FatTree 2-ary-4-tree.topo
10 FatTree 12-node-spaced.topo
10 FTreeFail 4-ary-2-tree-missing-sw-link.topo
10 FTreeFail 4-ary-2-tree-links-at-same-rank-2.topo
10 FTreeFail 4-ary-2-tree-links-at-same-rank-1.topo
10 FTreeFail 4-ary-2-tree-diff-num-pgroups.topo

Failures:

From mashirle at us.ibm.com Wed Jan 30 08:00:15 2008
From: mashirle at us.ibm.com (Shirley Ma)
Date: Wed, 30 Jan 2008 08:00:15 -0800
Subject: [ewg] Re: [ofa-general] OFED Jan 28 meeting summary on RC3readiness
In-Reply-To:
References: <6C2C79E72C305246B504CBA17B5500C903368531@mtlexch01.mtl.com> <1201639285.28486.101.camel@firewall.xsintricity.com> <47A0A86A.5060003@mellanox.co.il>
Message-ID: <1201708815.6850.22.camel@localhost.localdomain>

On Wed, 2008-01-30 at 17:10 -0800, Woodruff, Robert J wrote:
> Tziporet wrote,
> > * Delay 1.3 release in a week
> > * Do RC4 next
week - Feb 6 > > * Add RC5 on Feb 18 - this will be the GOLD version > > * GA release on Feb 25 > > > >All - please reply if this is acceptable > > I hate to keep slipping this, but I think it is important to get > what RedHat needs into OFED 1.3, so I am not opposed to this. > > I think however that perhaps after 1.3, we should discuss our process > a bit to try to get a little better at making our original > release dates. I think we are getting hit with feature creep, allowing > some pretty major changes after the feature freeze date, late in the > release cycle. > > I also think that we do need to be a little more careful > and selective about what features go into OFED, as it is supposed to be > an enterprise release rather than an experimental code release. > > For the kernel code, I think that this means keeping things a little > closer to the kernel.org kernel features and if something is not > upstream, then > press for getting it upstream (or at least queued for upstream) > rather than allowing big patches into OFED that have not had a good > review. > The way we are working now, if it is getting into OFED, people are less > aggressive at getting things upstream. > > Perhaps we can have a discussion about this at the Sonoma workshop. In addition, we should talk about how to integrate patches being queued in upper stream but not in OFED, like IPoIB noSRQ. There is always a window between OFED release and kernel release, a window between Distro release and OFED release. Some customers are targeted OFED release, some customers are targeted OFED release. Then how to handle these windows to meet different customers' requirements could be something to be discussed at Sonoma workshop as well. 
Thanks Shirley From mashirle at us.ibm.com Wed Jan 30 08:05:06 2008 From: mashirle at us.ibm.com (Shirley Ma) Date: Wed, 30 Jan 2008 08:05:06 -0800 Subject: [ewg] Re: [ofa-general] OFED Jan 28 meeting summary on RC3readiness In-Reply-To: <1201708815.6850.22.camel@localhost.localdomain> References: <6C2C79E72C305246B504CBA17B5500C903368531@mtlexch01.mtl.com> <1201639285.28486.101.camel@firewall.xsintricity.com> <47A0A86A.5060003@mellanox.co.il> <1201708815.6850.22.camel@localhost.localdomain> Message-ID: <1201709106.6850.26.camel@localhost.localdomain> > In addition, we should talk about how to integrate patches being queued > in upper stream but not in OFED, like IPoIB noSRQ. There is always a > window between OFED release and kernel release, a window between Distro > release and OFED release. Some customers are targeted OFED release, some > customers are targeted OFED release. Then how to handle these windows to > meet different customers' requirements could be something to be > discussed at Sonoma workshop as well. Oops, a typo, I meant some customers are targeted Distro releases. From a customer support point of view, it's always better to have OFED releases in Distros. Thanks Shirley From xma at us.ibm.com Wed Jan 30 19:35:27 2008 From: xma at us.ibm.com (Shirley Ma) Date: Wed, 30 Jan 2008 19:35:27 -0800 Subject: [ofa-general] Bonding and hw_csum In-Reply-To: <47A0A90B.40506@mellanox.co.il> Message-ID: Hello Tziporet, > the hw checksum patch was removed from OFED 1.3 > > Tziporet Could you please specify which patch has been removed? I can still see a list of patches under RC3. 
here they are: ipoib_0010_Add-high-dma-support-to-ipoib.patch ipoib_0020_Add-s-g-support-for-IPOIB.patch ipoib_0030_hw_csum.patch ipoib_0040_checksum-offload.patch ipoib_0050_Add-LSO-support.patch ipoib_0060_ethtool-support.patch ipoib_0070_modiy_cq_params.patch ipoib_0080_broadcast_null.patch ipoib_0110_set_default_cq_patams.patch thanks Shirley -------------- next part -------------- An HTML attachment was scrubbed... URL: From rdreier at cisco.com Wed Jan 30 19:36:39 2008 From: rdreier at cisco.com (Roland Dreier) Date: Wed, 30 Jan 2008 19:36:39 -0800 Subject: [ofa-general] Re: [PATCH] IB/CM: add support for routed paths In-Reply-To: <000d01c83b87$dacc6070$9c98070a@amr.corp.intel.com> (Sean Hefty's message of "Mon, 10 Dec 2007 15:53:25 -0800") References: <20071210203544.GI30090@obsidianresearch.com> <000d01c83b87$dacc6070$9c98070a@amr.corp.intel.com> Message-ID: Thanks, applied From rdreier at cisco.com Wed Jan 30 20:06:53 2008 From: rdreier at cisco.com (Roland Dreier) Date: Wed, 30 Jan 2008 20:06:53 -0800 Subject: [ofa-general] Re: [PATCH] IB/ehca: Prevent sending UD packets to QP0 In-Reply-To: <200801241759.09065.fenkes@de.ibm.com> (Joachim Fenkes's message of "Thu, 24 Jan 2008 17:59:08 +0100") References: <200801241759.09065.fenkes@de.ibm.com> Message-ID: thanks, applied From rdreier at cisco.com Wed Jan 30 20:09:05 2008 From: rdreier at cisco.com (Roland Dreier) Date: Wed, 30 Jan 2008 20:09:05 -0800 Subject: [ofa-general] Re: [PATCH 2/2] IB/ehca: Add PMA support In-Reply-To: <200801252118.28122.fenkes@de.ibm.com> (Joachim Fenkes's message of "Fri, 25 Jan 2008 21:18:27 +0100") References: <200801252111.11915.fenkes@de.ibm.com> <200801252118.28122.fenkes@de.ibm.com> Message-ID: thanks, applied 1-2 From chetm at us.ibm.com Wed Jan 30 20:16:17 2008 From: chetm at us.ibm.com (Chet Mehta) Date: Wed, 30 Jan 2008 22:16:17 -0600 Subject: [ewg] Re: [ofa-general] OFED Jan 28 meeting summary on RC3 readiness In-Reply-To: 
<93703EDD1F125544A59BD25384B175621C86870E96@G1W0485.americas.hpqcorp.net> Message-ID: Robert, In response to your question...... > A more general question I would like to ask the group is how many people use OFED from the RH or SUSE distros as > is, as compared with using OFED releases from other sources like the IB vendors, or building their own from > openfabrics.org? We use RH distros, but to this point, the OFED support provided in RH distros has lagged > behind the latest releases available from openfabrics.org. This is not to fault Red Hat, but OFED is still > changing too rapidly, with minor point releases and bug fixes, for a distro to keep up. I think many of us hope > that someday that will not be the case, but appears to be true for the foreseeable future. Right now, our mode > of operation is to remove whatever IB support comes in the distro and replace it, so it does not help us to > delay OFED 1.3 to get a particular bug fix in a distro. I believe the question that should be asked is "How many IB customers would like to use the OFED distribution if provided by the distro?" The answer, at least for the customer set we deal with, is pretty much unanimous. The fact is that customers are already dealing with a distro for their base OS, so obtaining the interconnect code & support from the same sources is highly desirable. When OpenIB was new (in 2004/5) and "common" IB code was still in its infancy, the customer set was tolerant of 'build your own' or vendor provided distribution mechanisms. However, if IB is to become a ubiquitous interconnect, we in OFA have to strive to tailor our deliverables to meet distro requirements. Until we do that, IB will have difficulty gaining broader market acceptance. Just my perspective. :Chet. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From krkumar2 at in.ibm.com Wed Jan 30 20:16:10 2008 From: krkumar2 at in.ibm.com (Krishna Kumar2) Date: Thu, 31 Jan 2008 09:46:10 +0530 Subject: [ofa-general] Re: Status of NFS-RDMA ? In-Reply-To: <47A0B499.5030208@nasa.gov> Message-ID: Hi Jeff & James, Great. If you let me know when the bits are ready (I don't always read the mailing list), I should be able to get some testing done. Thanks, - KK Jeff Becker wrote on 01/30/2008 11:02:09 PM: > Hi all. > > James Lentini wrote: > > On Wed, 30 Jan 2008, Krishna Kumar2 wrote: > > > > > >> Hi James, > >> > >> Since you had mentioned in an earlier email that NFS-RDMA server > >> side will be present in OFED1.4, > >> > > > > Actually, that was Tziporet. > > > > > >> do you know if any port of the server code to OFED1.3 (when it comes > >> out) will happen? Is there any effort for that, any work ongoing, > >> any help required, etc? > >> > > > > Jeff Becker had looked into this. We would definitely appreciate the > > help. > > > I have set up a git tree for NFSoRDMA and succesfully merged it with, > and built it on OFED 1.3-rcx. I'm currently doing the backports (SLES 10 > SP1 first). All this is in preparation for OFED 1.4, as that is when > NFSoRDMA will be included in OFED. I think I have this > patching/backporting stuff under control. However, my testing resources > are limited. Thus depending on your platform, I might be able to point > you at OFED 1.3 based bits for testing if/when they are ready. Thanks. > > -jeff > > > The NFS framework has changed significantly in several areas in recent > > kernel releases. This has made backporting the NFS/RDMA code to older > > kernels challenging. > > > > If you are interested in working on OFED1.3 support, let us know. > > > > > >> I couldn't find the release time lines for OFED1.4, is there any > >> link on openfabrics homepage? > >> > > > > I'm not involved with the OFED1.4 planning. Tziporet, is there > > information on this? 
> > > > > >> Thanks, > >> > >> - KK > >> > >> general-bounces at lists.openfabrics.org wrote on 01/29/2008 08:23:46 PM: > >> > >> > >>> On Tue, 29 Jan 2008, Pawel Dziekonski wrote: > >>> > >>> > >>>> On Mon, 28 Jan 2008 at 10:14:22AM -0500, James Lentini wrote: > >>>> > >>>>> On Sat, 26 Jan 2008, Pawel Dziekonski wrote: > >>>>> > >>>>> > >>>>>> I pulled Tom's tree from new url and build a kernel. > >>>>>> > >>>>> If you enabled support for INFINIBAND drivers (IB and iWARP support) > >>>>> and NFS client/server support, the kernel should be ready to go (run > >>>>> "grep RDMA /your_kernel_sources/.config" to confirm that > >>>>> CONFIG_SUNRPC_XPRT_RDMA is either m or y). > >>>>> > >>>>> NFS/RDMA doesn't require OFED be installed. OFED is a release of the > >>>>> Linux kernel sources and some userspace libraries/tools. If you are > >>>>> > >>>>>> then I downloaded OFED from > >>>>>> http://www.mellanox.com/downloads/NFSoRDMA/OFED-1.2-NFS-RDMA.gz, > >>>>>> > >>>>> I don't know what the above URL contains. The latest code is in Tom > >>>>> Tucker's tree (and now NFS server maintainer Bruce Fields tree). It > >>>>> > >> is > >> > >>>> hi, > >>>> > >>>> back to subject on a proper mailing list. > >>>> > >>>> I have a >3 year experience with mellanox hardware and IBGold so I > >>>> basically know what OFED is all about. up to now i was only using > >>>> IBGold since IB drivers appeared in kernel pretty recently. > >>>> > >>> You'll want to use the mainline kernel's IB drivers for NFS/RDMA. > >>> We've been developing the NFS/RDMA software on the OpenFabrics (aka > >>> OpenIB) code since it was merged into 2.6.10 in Dec 2004. > >>> > >>> > >>>> currently I have new hardware. I'm running Tom's kernel and already > >>>> did some MPI tests. SDP is not working, probably because sdp kernel > >>>> modules where not build. ;) I understand that those modules are only > >>>> available from ofa-kernel. please correct me if i'm wrong. > >>>> > >>> Correct. 
SDP has never been submitted to mainline Linux. > >>> > >>> > >>>> system is Scientic Linux 4.5, which is supposed to be a fully > >>>> compatible RH4 clone. hardware is Supermicro mobos with Mellanox > >>>> MT25204 and Flextronisc switch. > >>>> > >>>> error log from ofa-kernel build: > >>>> > >>> Is your goal to build a kernel with an NFS/RDMA server? If so, the > >>> kernel sources from Tom Tucker's git tree are the ones you want, not > >>> the old OFED 1.2-based packages which are out of date. > >>> > >>> Did you try setting up the NFS/RDMA server on the kernel used for your > >>> MPI tests above? > >>> > >>> > >>>>>> make[1]: Entering directory `/usr/src/ib/xprt-switch-2.6' > >>>>>> test -e include/linux/autoconf.h -a -e include/config/auto.conf || > >>>>>> > >> ( \ > >> > >>>>>> echo; \ > >>>>>> echo " ERROR: Kernel configuration is invalid."; \ > >>>>>> echo " include/linux/autoconf.h or include/config/auto.conf > >>>>>> > >> are > >> > >>> missing."; \ > >>> > >>>>>> echo " Run 'make oldconfig && make prepare' on kernel src > >>>>>> > >> to fix it."; \ > >> > >>>>>> echo; \ > >>>>>> /bin/false) > >>>>>> > >>>>>> obviously, doing 'make oldconfig && make prepare' does not help. > >>>>>> anyway, above mentioned files do exist: > >>>>>> > >>>>>> # ls -la /usr/src/ib/xprt-switch-2.6/{include/linux/autoconf.h, > >>>>>> > >>> include/config/auto.conf} > >>> > >>>>>> -rw-r--r-- 1 root root 10156 Jan 25 17:42 > >>>>>> > >> /usr/src/ib/xprt-switch-2. > >> > >>> 6/include/config/auto.conf > >>> > >>>>>> -rw-r--r-- 1 root root 14733 Jan 25 17:42 > >>>>>> > >> /usr/src/ib/xprt-switch-2. > >> > >>> 6/include/linux/autoconf.h > >>> > >>>>>> despite of above, compilation continues but fails with: > >>>>>> > >>>>>> gcc -Wp,-MD,/var/tmp/OFEDRPM/BUILD/ofa_kernel-1. > >>>>>> > >>> 2/drivers/infiniband/core/.mad.o.d -nostdinc -isystem > >>> > >> /usr/lib/gcc/x86_64- > >> > >>> redhat-linux/3.4.6/include -D__KERNEL__ > >>> > >> -I/var/tmp/OFEDRPM/BUILD/ofa_kernel-1. 
> >> > >>> 2/include -I/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2 > >>> > >> /drivers/infiniband/include > >> > >>> -Iinclude -include include/linux/autoconf.h -include > >>> /var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2/include/linux/autoconf.h -Wall > >>> > >> -Wundef > >> > >>> -Wstrict-prototypes -Wno-trigraphs -fno-strict-aliasing -fno-common > >>> > >> -Werror- > >> > >>> implicit-function-declaration -Os -m64 -mno-red-zone -mcmodel=kernel > >>> > >> -pipe - > >> > >>> Wno-sign-compare -fno-asynchronous-unwind-tables -funit-at-a-time > >>> > >> -mno-sse - > >> > >>> mno-mmx -mno-sse2 -mno-3dnow -maccumulate-outgoing-args -DCONFIG_AS_CFI=1 > >>> > >> - > >> > >>> DCONFIG_AS_CFI_SIGNAL_FRAME=1 -fomit-frame-pointer -Wdeclaration-after- > >>> statement -DMODULE -D"KBUILD_STR(s)=#s" - > >>> D"KBUILD_BASENAME=KBUILD_STR(mad)" -D"KBUILD_MODNAME=KBUILD_STR(ib_mad)" > >>> > >> -c - > >> > >>> o /var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2/drivers/infiniband/core/.! > >>> tmp > >>> > >>>> _mad.o /var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2 > >>>> > >> /drivers/infiniband/core/mad.c > >> > >>>>>> /var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2 > >>>>>> > >> /drivers/infiniband/core/mad.c: In > >> > >>> function `ib_mad_init_module': > >>> > >>>>>> /var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2 > >>>>>> > >> /drivers/infiniband/core/mad.c: > >> > >>> 2966: error: too many arguments to function `kmem_cache_create' > >>> > >>>>>> make[4]: *** [/var/tmp/OFEDRPM/BUILD/ofa_kernel-1. > >>>>>> > >>> 2/drivers/infiniband/core/mad.o] Error 1 > >>> > >>>>>> make[3]: *** [/var/tmp/OFEDRPM/BUILD/ofa_kernel-1. 
> >>>>>> > >>> 2/drivers/infiniband/core] Error 2 > >>> > >>>>>> make[2]: *** [/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2 > >>>>>> > >> /drivers/infiniband] Error 2 > >> > >>>>>> make[1]: *** [_module_/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2] Error > >>>>>> > >> 2 > >> > >>>>>> make[1]: Leaving directory `/usr/src/ib/xprt-switch-2.6' > >>>>>> make: *** [kernel] Error 2 > >>>>>> error: Bad exit status from /var/tmp/rpm-tmp.3877 (%install) > >>>>>> > >>>>>> full log: > >>>>>> https://cefeid.wcss.wroc.pl/d/tmp/OFED.build.32122.log > >>>>>> > >>>> thanks in advance for any help, P > >>>> > >>>> > >>>> -- > >>>> Pawel Dziekonski > >>>> Wroclaw Centre for Networking & Supercomputing, HPC Department > >>>> Politechnika Wr., pl. Grunwaldzki 9, bud. D2/101, 50-377 Wroclaw, > >>>> > >> POLAND > >> > >>>> phone: +48 71 3202043, fax: +48 71 3225797, http://www.wcss.wroc.pl > >>>> > >>>> > >>>> > >> ------------------------------------------------------------------------- > >> > >>>> This SF.net email is sponsored by: Microsoft > >>>> Defy all challenges. Microsoft(R) Visual Studio 2008. 
> >>>> http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/ > >>>> _______________________________________________ > >>>> nfs-rdma-devel mailing list > >>>> nfs-rdma-devel at lists.sourceforge.net > >>>> https://lists.sourceforge.net/lists/listinfo/nfs-rdma-devel > >>>> > >>>> > >>> _______________________________________________ > >>> general mailing list > >>> general at lists.openfabrics.org > >>> http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general > >>> > >>> To unsubscribe, please visit > >>> > >> http://openib.org/mailman/listinfo/openib-general > >> > >> > From rdreier at cisco.com Wed Jan 30 20:24:39 2008 From: rdreier at cisco.com (Roland Dreier) Date: Wed, 30 Jan 2008 20:24:39 -0800 Subject: [ofa-general] Re: [PATCH] ib/ipoib: handle Gratuitous ARP & bonding failover race also for connected mode neighbours In-Reply-To: (Or Gerlitz's message of "Thu, 17 Jan 2008 17:03:45 +0200 (IST)") References: Message-ID: thanks, applied From rdreier at cisco.com Wed Jan 30 20:26:50 2008 From: rdreier at cisco.com (Roland Dreier) Date: Wed, 30 Jan 2008 20:26:50 -0800 Subject: [ofa-general] [PATCH] ib/ipoib: remove a misleading debug print In-Reply-To: (Or Gerlitz's message of "Tue, 29 Jan 2008 12:57:56 +0200 (IST)") References: Message-ID: thanks, applied. From mashirle at us.ibm.com Wed Jan 30 10:42:20 2008 From: mashirle at us.ibm.com (Shirley Ma) Date: Wed, 30 Jan 2008 10:42:20 -0800 Subject: [ofa-general] [PATCH 0/3] ib/ipoib: Enable IPoIB-UD 4K MTU support In-Reply-To: References: Message-ID: <1201718540.6850.41.camel@localhost.localdomain> The current IPoIB-UD implementation is limited IPoIB payload size to 2048 through hard coding IPOIB_PACKET_SIZE. The implementation is designed for kernel PAGE_SIZE equals or greater than 4K. If the kernel PAGE_SIZE is equals to 2K, memory buffer allocation will be failure when lack of large buffer of memory. However most of the Distros does support PAGE_SIZE >= 4K. 
So this implementation has no problem with a 2048-byte payload. It is simple, but it prevents HCA devices that do support a 4096-byte payload, like IBM eHCA2, from making use of it. This patch allows an IPoIB-UD MTU of up to 4092 (4K - IPOIB_ENCAP_LEN) when the HCA can support a 4K MTU. In this patch, the APIs for S/G buffer allocation in IPoIB-CM mode have been made generic so that IPoIB-UD and IPoIB-CM can share the S/G code. When PAGE_SIZE is equal to or greater than IPOIB_UD_BUF_SIZE plus the padding bytes that align the IP header, only one buffer is needed for a 4K MTU buffer allocation; otherwise, two buffers must be allocated in S/G. The node's IPoIB link MTU is the minimum of the admin-configurable MTU (set through ifconfig) and the IPoIB default broadcast group MTU. When the Subnet Manager enables the default broadcast group during start up, the subnet's IPoIB link MTU will be the default broadcast group MTU. A node whose IB MTU is smaller than this value can't join this IPoIB subnet. A node whose IB MTU is greater than this value will join this IPoIB subnet, and this value will be set as its IPoIB link MTU. If the Subnet Manager disables the default broadcast group during start up, the first node brought up in this subnet will create the default IPoIB broadcast group based on negotiation with the Subnet Manager; the default is currently set to 2K according to the IPoIB RFC. The patch will be split into two patches: 1. Make IPoIB-CM RX S/G APIs generic 2. Enable IPoIB-UD RX S/G I am trying to make these two patches more independent so they are easy to test. ipoib_cm_alloc_rx_skb() will be renamed in the second patch. Please review these patches as soon as possible so we can include this in OFED-1.3-RC4. I appreciate your timely help. 
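The MTU and buffer arithmetic described above can be checked with a small stand-alone sketch. This is only an illustration, not code from the patch series: the `sketch_` helper names are hypothetical, and the constants assume a 4-byte IPoIB encapsulation header, a 40-byte IB GRH, and the 4 bytes of IP-header alignment padding used by the IPOIB_UD_BUF_SIZE macro in patch 2/3.

```c
#include <assert.h>

/* Assumed sizes: 4-byte IPoIB encapsulation header, 40-byte IB Global
 * Route Header, 4 bytes of padding to align the IP header (UD mode). */
enum {
	SKETCH_ENCAP_LEN = 4,
	SKETCH_GRH_BYTES = 40,
	SKETCH_UD_PAD    = 4
};

/* Hypothetical helper: largest IPoIB-UD MTU for a given IB MTU. */
static unsigned int sketch_ud_mtu(unsigned int ib_mtu)
{
	assert(ib_mtu > (unsigned int)SKETCH_ENCAP_LEN);
	return ib_mtu - SKETCH_ENCAP_LEN;
}

/* Hypothetical helper: bytes a UD receive buffer must hold. */
static unsigned int sketch_ud_buf_size(unsigned int ib_mtu)
{
	return ib_mtu + SKETCH_GRH_BYTES + SKETCH_UD_PAD;
}

/* Hypothetical helper: number of page-sized S/G entries needed,
 * i.e. the buffer size rounded up to whole pages. */
static unsigned int sketch_ud_rx_sg(unsigned int ib_mtu, unsigned int page_size)
{
	assert(page_size > 0);
	return (sketch_ud_buf_size(ib_mtu) + page_size - 1) / page_size;
}

/* Hypothetical helper: link MTU is the smaller of the admin-configured
 * MTU and the broadcast group MTU. */
static unsigned int sketch_link_mtu(unsigned int admin_mtu, unsigned int mcast_mtu)
{
	return admin_mtu < mcast_mtu ? admin_mtu : mcast_mtu;
}
```

Under these assumptions, a 4096-byte IB MTU gives a 4092-byte IPoIB-UD MTU and a 4140-byte receive buffer, which needs two S/G entries with 4K pages and one with 64K pages. A 2048-byte IB MTU still fits in a single 4K page.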
Thanks Shirley From mashirle at us.ibm.com Wed Jan 30 10:45:16 2008 From: mashirle at us.ibm.com (Shirley Ma) Date: Wed, 30 Jan 2008 10:45:16 -0800 Subject: [ofa-general] [PATCH 1/3] ib/ipoib: Make IPoIB-CM RX S/G APIs more generic In-Reply-To: <1201718540.6850.41.camel@localhost.localdomain> References: <1201718540.6850.41.camel@localhost.localdomain> Message-ID: <1201718716.6850.46.camel@localhost.localdomain> Please review below patch while I am testing so I can integrate your comments in my test immediately. Thanks Shirley Signed-off-by:Shirley Ma --- drivers/infiniband/ulp/ipoib/ipoib.h | 25 ++++-- drivers/infiniband/ulp/ipoib/ipoib_cm.c | 139 ++++++------------------------ drivers/infiniband/ulp/ipoib/ipoib_ib.c | 85 +++++++++++++++++++ 3 files changed, 131 insertions(+), 118 deletions(-) diff --git a/drivers/infiniband/ulp/ipoib/ipoib.h b/drivers/infiniband/ulp/ipoib/ipoib.h index fe250c6..138f1a3 100644 --- a/drivers/infiniband/ulp/ipoib/ipoib.h +++ b/drivers/infiniband/ulp/ipoib/ipoib.h @@ -138,7 +138,7 @@ struct ipoib_mcast { struct ipoib_rx_buf { struct sk_buff *skb; - u64 mapping; + u64 mapping[IPOIB_CM_RX_SG]; }; struct ipoib_tx_buf { @@ -189,7 +189,7 @@ enum ipoib_cm_state { struct ipoib_cm_rx { struct ib_cm_id *id; struct ib_qp *qp; - struct ipoib_cm_rx_buf *rx_ring; + struct ipoib_rx_buf *rx_ring; struct list_head list; struct net_device *dev; unsigned long jiffies; @@ -212,11 +212,6 @@ struct ipoib_cm_tx { struct ib_wc ibwc[IPOIB_NUM_WC]; }; -struct ipoib_cm_rx_buf { - struct sk_buff *skb; - u64 mapping[IPOIB_CM_RX_SG]; -}; - struct ipoib_cm_dev_priv { struct ib_srq *srq; struct ipoib_cm_rx_buf *srq_ring; @@ -458,6 +453,22 @@ int ipoib_vlan_delete(struct net_device *pdev, unsigned short pkey); void ipoib_pkey_poll(struct work_struct *work); int ipoib_pkey_dev_delay_open(struct net_device *dev); void ipoib_drain_cq(struct net_device *dev); +void skb_put_frags(struct sk_buff *skb, unsigned int hdr_space, + unsigned int length, struct sk_buff *toskb); 
+struct sk_buff *ipoib_cm_alloc_rx_skb(struct net_device *dev, + int id, int frags, int head_size, + int pad, u64 *mapping); +void inline ipoib_dma_unmap_rx(struct ipoib_dev_priv *priv, int frags, + int head_size, u64 *mapping) +{ + int i; + ib_dma_unmap_single(priv->ca, mapping[0], head_size, DMA_FROM_DEVICE); + for (i = 0; i < frags; i++) + ib_dma_unmap_single(priv->ca, mapping[i + 1], PAGE_SIZE, + DMA_FROM_DEVICE); + +} + #ifdef CONFIG_INFINIBAND_IPOIB_CM diff --git a/drivers/infiniband/ulp/ipoib/ipoib_cm.c b/drivers/infiniband/ulp/ipoib/ipoib_cm.c index 1818f95..c7d42ea 100644 --- a/drivers/infiniband/ulp/ipoib/ipoib_cm.c +++ b/drivers/infiniband/ulp/ipoib/ipoib_cm.c @@ -77,17 +77,6 @@ static struct ib_send_wr ipoib_cm_rx_drain_wr = { static int ipoib_cm_tx_handler(struct ib_cm_id *cm_id, struct ib_cm_event *event); -static void ipoib_cm_dma_unmap_rx(struct ipoib_dev_priv *priv, int frags, - u64 mapping[IPOIB_CM_RX_SG]) -{ - int i; - - ib_dma_unmap_single(priv->ca, mapping[0], IPOIB_CM_HEAD_SIZE, DMA_FROM_DEVICE); - - for (i = 0; i < frags; ++i) - ib_dma_unmap_single(priv->ca, mapping[i + 1], PAGE_SIZE, DMA_FROM_DEVICE); -} - static int ipoib_cm_post_receive_srq(struct net_device *dev, int id) { struct ipoib_dev_priv *priv = netdev_priv(dev); @@ -102,8 +91,9 @@ static int ipoib_cm_post_receive_srq(struct net_device *dev, int id) ret = ib_post_srq_recv(priv->cm.srq, &priv->cm.rx_wr, &bad_wr); if (unlikely(ret)) { ipoib_warn(priv, "post srq failed for buf %d (%d)\n", id, ret); - ipoib_cm_dma_unmap_rx(priv, priv->cm.num_frags - 1, - priv->cm.srq_ring[id].mapping); + ipoib_dma_unmap_rx(priv, priv->cm.num_frags - 1, + IPOIB_CM_HEAD_SIZE, + priv->cm.srq_ring[id].mapping); dev_kfree_skb_any(priv->cm.srq_ring[id].skb); priv->cm.srq_ring[id].skb = NULL; } @@ -126,8 +116,8 @@ static int ipoib_cm_post_receive_nonsrq(struct net_device *dev, ret = ib_post_recv(rx->qp, &priv->cm.rx_wr, &bad_wr); if (unlikely(ret)) { ipoib_warn(priv, "post recv failed for buf %d (%d)\n", id, 
ret); - ipoib_cm_dma_unmap_rx(priv, IPOIB_CM_RX_SG - 1, - rx->rx_ring[id].mapping); + ipoib_dma_unmap_rx(priv, IPOIB_CM_RX_SG - 1, IPOIB_CM_HEAD_SIZE, + rx->rx_ring[id].mapping); dev_kfree_skb_any(rx->rx_ring[id].skb); rx->rx_ring[id].skb = NULL; } @@ -135,69 +125,17 @@ static int ipoib_cm_post_receive_nonsrq(struct net_device *dev, return ret; } -static struct sk_buff *ipoib_cm_alloc_rx_skb(struct net_device *dev, - struct ipoib_cm_rx_buf *rx_ring, - int id, int frags, - u64 mapping[IPOIB_CM_RX_SG]) -{ - struct ipoib_dev_priv *priv = netdev_priv(dev); - struct sk_buff *skb; - int i; - - skb = dev_alloc_skb(IPOIB_CM_HEAD_SIZE + 12); - if (unlikely(!skb)) - return NULL; - - /* - * IPoIB adds a 4 byte header. So we need 12 more bytes to align the - * IP header to a multiple of 16. - */ - skb_reserve(skb, 12); - - mapping[0] = ib_dma_map_single(priv->ca, skb->data, IPOIB_CM_HEAD_SIZE, - DMA_FROM_DEVICE); - if (unlikely(ib_dma_mapping_error(priv->ca, mapping[0]))) { - dev_kfree_skb_any(skb); - return NULL; - } - - for (i = 0; i < frags; i++) { - struct page *page = alloc_page(GFP_ATOMIC); - - if (!page) - goto partial_error; - skb_fill_page_desc(skb, i, page, 0, PAGE_SIZE); - - mapping[i + 1] = ib_dma_map_page(priv->ca, skb_shinfo(skb)->frags[i].page, - 0, PAGE_SIZE, DMA_FROM_DEVICE); - if (unlikely(ib_dma_mapping_error(priv->ca, mapping[i + 1]))) - goto partial_error; - } - - rx_ring[id].skb = skb; - return skb; - -partial_error: - - ib_dma_unmap_single(priv->ca, mapping[0], IPOIB_CM_HEAD_SIZE, DMA_FROM_DEVICE); - - for (; i > 0; --i) - ib_dma_unmap_single(priv->ca, mapping[i], PAGE_SIZE, DMA_FROM_DEVICE); - - dev_kfree_skb_any(skb); - return NULL; -} - static void ipoib_cm_free_rx_ring(struct net_device *dev, - struct ipoib_cm_rx_buf *rx_ring) + struct ipoib_rx_buf *rx_ring) { struct ipoib_dev_priv *priv = netdev_priv(dev); int i; for (i = 0; i < ipoib_recvq_size; ++i) if (rx_ring[i].skb) { - ipoib_cm_dma_unmap_rx(priv, IPOIB_CM_RX_SG - 1, - rx_ring[i].mapping); + 
ipoib_dma_unmap_rx(priv, IPOIB_CM_RX_SG - 1, + IPOIB_CM_HEAD_SIZE, + rx_ring[i].mapping); dev_kfree_skb_any(rx_ring[i].skb); } @@ -345,8 +283,12 @@ static int ipoib_cm_nonsrq_init_rx(struct net_device *dev, struct ib_cm_id *cm_i spin_unlock_irq(&priv->lock); for (i = 0; i < ipoib_recvq_size; ++i) { - if (!ipoib_cm_alloc_rx_skb(dev, rx->rx_ring, i, IPOIB_CM_RX_SG - 1, - rx->rx_ring[i].mapping)) { + rx->rx_ring[i].skb = ipoib_cm_alloc_rx_skb(dev, i, + IPOIB_CM_RX_SG - 1, + IPOIB_CM_HEAD_SIZE, + 12, + rx->rx_ring[i].mapping); + if (!rx->rx_ring[i].skb) { ipoib_warn(priv, "failed to allocate receive buffer %d\n", i); ret = -ENOMEM; goto err_count; @@ -480,43 +422,11 @@ static int ipoib_cm_rx_handler(struct ib_cm_id *cm_id, return 0; } } -/* Adjust length of skb with fragments to match received data */ -static void skb_put_frags(struct sk_buff *skb, unsigned int hdr_space, - unsigned int length, struct sk_buff *toskb) -{ - int i, num_frags; - unsigned int size; - - /* put header into skb */ - size = min(length, hdr_space); - skb->tail += size; - skb->len += size; - length -= size; - - num_frags = skb_shinfo(skb)->nr_frags; - for (i = 0; i < num_frags; i++) { - skb_frag_t *frag = &skb_shinfo(skb)->frags[i]; - - if (length == 0) { - /* don't need this page */ - skb_fill_page_desc(toskb, i, frag->page, 0, PAGE_SIZE); - --skb_shinfo(skb)->nr_frags; - } else { - size = min(length, (unsigned) PAGE_SIZE); - - frag->size = size; - skb->data_len += size; - skb->truesize += size; - skb->len += size; - length -= size; - } - } -} void ipoib_cm_handle_rx_wc(struct net_device *dev, struct ib_wc *wc) { struct ipoib_dev_priv *priv = netdev_priv(dev); - struct ipoib_cm_rx_buf *rx_ring; + struct ipoib_rx_buf *rx_ring; unsigned int wr_id = wc->wr_id & ~(IPOIB_OP_CM | IPOIB_OP_RECV); struct sk_buff *skb, *newskb; struct ipoib_cm_rx *p; @@ -581,7 +491,8 @@ void ipoib_cm_handle_rx_wc(struct net_device *dev, struct ib_wc *wc) frags = PAGE_ALIGN(wc->byte_len - min(wc->byte_len, 
(unsigned)IPOIB_CM_HEAD_SIZE)) / PAGE_SIZE; - newskb = ipoib_cm_alloc_rx_skb(dev, rx_ring, wr_id, frags, mapping); + newskb = ipoib_cm_alloc_rx_skb(dev, wr_id, frags, IPOIB_CM_HEAD_SIZE, + 12, mapping); if (unlikely(!newskb)) { /* * If we can't allocate a new RX buffer, dump @@ -592,7 +503,10 @@ void ipoib_cm_handle_rx_wc(struct net_device *dev, struct ib_wc *wc) goto repost; } - ipoib_cm_dma_unmap_rx(priv, frags, rx_ring[wr_id].mapping); + rx_ring[wr_id].skb = newskb; + + ipoib_dma_unmap_rx(priv, frags, IPOIB_CM_HEAD_SIZE, + rx_ring[wr_id].mapping); memcpy(rx_ring[wr_id].mapping, mapping, (frags + 1) * sizeof *mapping); ipoib_dbg_data(priv, "received %d bytes, SLID 0x%04x\n", @@ -1481,9 +1395,12 @@ int ipoib_cm_dev_init(struct net_device *dev) if (ipoib_cm_has_srq(dev)) { for (i = 0; i < ipoib_recvq_size; ++i) { - if (!ipoib_cm_alloc_rx_skb(dev, priv->cm.srq_ring, i, - priv->cm.num_frags - 1, - priv->cm.srq_ring[i].mapping)) { + priv->cm.srq_ring[i].skb = + ipoib_cm_alloc_rx_skb(dev, i, + priv->cm.num_frags - 1, + IPOIB_CM_HEAD_SIZE, 12, + priv->cm.srq_ring[i].mapping); + if (!priv->cm.srq_ring[i].skb) { ipoib_warn(priv, "failed to allocate " "receive buffer %d\n", i); ipoib_cm_dev_cleanup(dev); diff --git a/drivers/infiniband/ulp/ipoib/ipoib_ib.c b/drivers/infiniband/ulp/ipoib/ipoib_ib.c index 52bc2bd..138c758 100644 --- a/drivers/infiniband/ulp/ipoib/ipoib_ib.c +++ b/drivers/infiniband/ulp/ipoib/ipoib_ib.c @@ -116,6 +116,91 @@ static int ipoib_ib_post_receive(struct net_device *dev, int id) return ret; } +static struct sk_buff *ipoib_cm_alloc_rx_skb(struct net_device *dev, + int id, int frags, int head_size, + int pad, u64 mapping) +{ + struct ipoib_dev_priv *priv = netdev_priv(dev); + struct sk_buff *skb; + int i; + + skb = dev_alloc_skb(head_size + pad); + if (unlikely(!skb)) + return NULL; + + /* + * IPoIB adds a 4 byte header. So we need pad more bytes to align the + * IP header to a multiple of 16. For CM mode, you add pad 12, + * for UD mode, we add pad 4. 
+ */ + skb_reserve(skb, pad); + + mapping[0] = ib_dma_map_single(priv->ca, skb->data, head_size, + DMA_FROM_DEVICE); + if (unlikely(ib_dma_mapping_error(priv->ca, mapping[0]))) { + dev_kfree_skb_any(skb); + return NULL; + } + + for (i = 0; i < frags; i++) { + struct page *page = alloc_page(GFP_ATOMIC); + + if (!page) + goto partial_error; + skb_fill_page_desc(skb, i, page, 0, PAGE_SIZE); + + mapping[i + 1] = ib_dma_map_page(priv->ca, skb_shinfo(skb)->frags[i].page, + 0, PAGE_SIZE, DMA_FROM_DEVICE); + if (unlikely(ib_dma_mapping_error(priv->ca, mapping[i + 1]))) + goto partial_error; + } + + return skb; + +partial_error: + + ib_dma_unmap_single(priv->ca, mapping[0], head_size, DMA_FROM_DEVICE); + + for (; i > 0; --i) + ib_dma_unmap_single(priv->ca, mapping[i], PAGE_SIZE, DMA_FROM_DEVICE); + + dev_kfree_skb_any(skb); + return NULL; +} + +/* Adjust length of skb with fragments to match received data */ +static void skb_put_frags(struct sk_buff *skb, unsigned int hdr_space, + unsigned int length, struct sk_buff *toskb) +{ + int i, num_frags; + unsigned int size; + + /* put header into skb */ + size = min(length, hdr_space); + skb->tail += size; + skb->len += size; + length -= size; + + num_frags = skb_shinfo(skb)->nr_frags; + for (i = 0; i < num_frags; i++) { + skb_frag_t *frag = &skb_shinfo(skb)->frags[i]; + + if (length == 0) { + /* don't need this page */ + skb_fill_page_desc(toskb, i, frag->page, 0, PAGE_SIZE); + --skb_shinfo(skb)->nr_frags; + } else { + size = min(length, (unsigned) PAGE_SIZE); + + frag->size = size; + skb->data_len += size; + skb->truesize += size; + skb->len += size; + length -= size; + } + } +} + static int ipoib_alloc_rx_skb(struct net_device *dev, int id) { struct ipoib_dev_priv *priv = netdev_priv(dev); From mashirle at us.ibm.com Wed Jan 30 11:33:31 2008 From: mashirle at us.ibm.com (Shirley Ma) Date: Wed, 30 Jan 2008 11:33:31 -0800 Subject: [ofa-general] [PATCH 2/3] ib/ipoib: set IPoIB-UD RX S/G parameters In-Reply-To: 
<1201718540.6850.41.camel@localhost.localdomain> References: <1201718540.6850.41.camel@localhost.localdomain> Message-ID: <1201721611.6850.48.camel@localhost.localdomain> Signed-off-by: Shirley Ma --- diff --git a/drivers/infiniband/ulp/ipoib/ipoib.h b/drivers/infiniband/ulp/ipoib/ipoib.h index 138f1a3..65b1159 100644 --- a/drivers/infiniband/ulp/ipoib/ipoib.h +++ b/drivers/infiniband/ulp/ipoib/ipoib.h @@ -56,11 +56,11 @@ /* constants */ enum { - IPOIB_PACKET_SIZE = 2048, - IPOIB_BUF_SIZE = IPOIB_PACKET_SIZE + IB_GRH_BYTES, - IPOIB_ENCAP_LEN = 4, + IPOIB_MAX_IB_MTU = 4096, /* max ib device payload is 4096 */ + IPOIB_UD_MAX_RX_SG = ALIGN(IPOIB_MAX_IB_MTU + IB_GRH_BYTES + 4, PAGE_SIZE) / PAGE_SIZE, /* padding to align IP header */ + IPOIB_CM_MTU = 0x10000 - 0x10, /* padding to align header to 16 */ IPOIB_CM_BUF_SIZE = IPOIB_CM_MTU + IPOIB_ENCAP_LEN, IPOIB_CM_HEAD_SIZE = IPOIB_CM_BUF_SIZE % PAGE_SIZE, @@ -314,6 +314,9 @@ struct ipoib_dev_priv { struct dentry *mcg_dentry; struct dentry *path_dentry; #endif + int max_ib_mtu; + struct ib_sge rx_sge[IPOIB_UD_MAX_RX_SG]; + struct ib_recv_wr rx_wr; }; struct ipoib_ah { @@ -354,6 +357,11 @@ struct ipoib_neigh { struct list_head list; }; +#define IPOIB_UD_MTU(ib_mtu) (ib_mtu - IPOIB_ENCAP_LEN) +#define IPOIB_UD_BUF_SIZE(ib_mtu) (ib_mtu + IB_GRH_BYTES + 4) /* padding to align IP header */ +#define IPOIB_UD_HEAD_SIZE(ib_mtu) (IPOIB_UD_BUF_SIZE(ib_mtu)) % PAGE_SIZE +#define IPOIB_UD_RX_SG(ib_mtu) ALIGN(IPOIB_UD_BUF_SIZE(ib_mtu), PAGE_SIZE) / PAGE_SIZE + /* * We stash a pointer to our private neighbour information after our * hardware address in neigh->ha. 
The ALIGN() expression here makes diff --git a/drivers/infiniband/ulp/ipoib/ipoib_main.c b/drivers/infiniband/ulp/ipoib/ipoib_main.c index a082466..646aeb2 100644 --- a/drivers/infiniband/ulp/ipoib/ipoib_main.c +++ b/drivers/infiniband/ulp/ipoib/ipoib_main.c @@ -194,7 +194,7 @@ static int ipoib_change_mtu(struct net_device *dev, int new_mtu) return 0; } - if (new_mtu > IPOIB_PACKET_SIZE - IPOIB_ENCAP_LEN) + if (new_mtu > IPOIB_UD_MTU(priv->max_ib_mtu)) return -EINVAL; priv->admin_mtu = new_mtu; @@ -968,10 +968,6 @@ static void ipoib_setup(struct net_device *dev) dev->tx_queue_len = ipoib_sendq_size * 2; dev->features = NETIF_F_VLAN_CHALLENGED | NETIF_F_LLTX; - /* MTU will be reset when mcast join happens */ - dev->mtu = IPOIB_PACKET_SIZE - IPOIB_ENCAP_LEN; - priv->mcast_mtu = priv->admin_mtu = dev->mtu; - memcpy(dev->broadcast, ipv4_bcast_addr, INFINIBAND_ALEN); netif_carrier_off(dev); @@ -1103,6 +1099,7 @@ static struct net_device *ipoib_add_port(const char *format, struct ib_device *hca, u8 port) { struct ipoib_dev_priv *priv; + struct ib_port_attr attr; int result = -ENOMEM; priv = ipoib_intf_alloc(format); @@ -1111,6 +1108,18 @@ static struct net_device *ipoib_add_port(const char *format, SET_NETDEV_DEV(priv->dev, hca->dma_device); + if (!ib_query_port(hca, port, &attr)) + priv->max_ib_mtu = ib_mtu_enum_to_int(attr.max_mtu); + else { + printk(KERN_WARNING "%s: ib_query_port %d failed\n", + hca->name, port); + goto device_init_failed; + } + + /* MTU will be reset when mcast join happens */ + priv->dev->mtu = IPOIB_UD_MTU(priv->max_ib_mtu); + priv->mcast_mtu = priv->admin_mtu = priv->dev->mtu; + result = ib_query_pkey(hca, port, 0, &priv->pkey); if (result) { printk(KERN_WARNING "%s: ib_query_pkey port %d failed (ret = %d)\n", diff --git a/drivers/infiniband/ulp/ipoib/ipoib_multicast.c b/drivers/infiniband/ulp/ipoib/ipoib_multicast.c index 2628339..630b429 100644 --- a/drivers/infiniband/ulp/ipoib/ipoib_multicast.c +++ 
b/drivers/infiniband/ulp/ipoib/ipoib_multicast.c @@ -567,8 +567,7 @@ void ipoib_mcast_join_task(struct work_struct *work) return; } - priv->mcast_mtu = ib_mtu_enum_to_int(priv->broadcast->mcmember.mtu) - - IPOIB_ENCAP_LEN; + priv->mcast_mtu = IPOIB_UD_MTU(ib_mtu_enum_to_int(priv->broadcast->mcmember.mtu)); if (!ipoib_cm_admin_enabled(dev)) dev->mtu = min(priv->mcast_mtu, priv->admin_mtu); diff --git a/drivers/infiniband/ulp/ipoib/ipoib_verbs.c b/drivers/infiniband/ulp/ipoib/ipoib_verbs.c index 433e99a..eefdb6a 100644 --- a/drivers/infiniband/ulp/ipoib/ipoib_verbs.c +++ b/drivers/infiniband/ulp/ipoib/ipoib_verbs.c @@ -150,7 +150,7 @@ int ipoib_transport_dev_init(struct net_device *dev, struct ib_device *ca) .max_send_wr = ipoib_sendq_size, .max_recv_wr = ipoib_recvq_size, .max_send_sge = 1, - .max_recv_sge = 1 + .max_recv_sge = IPOIB_UD_RX_SG(priv->max_ib_mtu) }, .sq_sig_type = IB_SIGNAL_ALL_WR, .qp_type = IB_QPT_UD @@ -208,6 +208,16 @@ int ipoib_transport_dev_init(struct net_device *dev, struct ib_device *ca) priv->tx_wr.num_sge = 1; priv->tx_wr.send_flags = IB_SEND_SIGNALED; + priv->rx_sge[0].length = IPOIB_UD_HEAD_SIZE(priv->max_ib_mtu); + for (i = 0; i < IPOIB_UD_RX_SG(priv->max_ib_mtu); ++i) { + priv->rx_sge[i].lkey = priv->mr->lkey; + priv->rx_sge[i+1].length = PAGE_SIZE; + } + priv->rx_sge[i+1].length = PAGE_SIZE; + priv->rx_wr.num_sge = IPOIB_UD_RX_SG(priv->max_ib_mtu); + priv->rx_wr.next = NULL; + priv->rx_wr.sg_list = priv->rx_sge; + return 0; out_free_cq: From mashirle at us.ibm.com Wed Jan 30 12:30:08 2008 From: mashirle at us.ibm.com (Shirley Ma) Date: Wed, 30 Jan 2008 12:30:08 -0800 Subject: [ofa-general] [PATCH 3/3] ib/ipoib: IPoIB-UD RX S/G support In-Reply-To: <1201718540.6850.41.camel@localhost.localdomain> References: <1201718540.6850.41.camel@localhost.localdomain> Message-ID: <1201725009.6850.54.camel@localhost.localdomain> Signed-off-by: Shirley Ma --- diff --git a/drivers/infiniband/ulp/ipoib/ipoib.h b/drivers/infiniband/ulp/ipoib/ipoib.h 
index 65b1159..969955e 100644 --- a/drivers/infiniband/ulp/ipoib/ipoib.h +++ b/drivers/infiniband/ulp/ipoib/ipoib.h @@ -463,9 +463,9 @@ int ipoib_pkey_dev_delay_open(struct net_device *dev); void ipoib_drain_cq(struct net_device *dev); void skb_put_frags(struct sk_buff *skb, unsigned int hdr_space, unsigned int length, struct sk_buff *toskb); -struct sk_buff *ipoib_cm_alloc_rx_skb(struct net_device *dev, - int id, int frags, int head_size, - int pad, u64 *mapping); +struct sk_buff *ipoib_alloc_rx_skb(struct net_device *dev, + int id, int frags, int head_size, + int pad, u64 *mapping); void inline ipoib_dma_unmap_rx(struct ipoib_dev_priv *priv, int frags, int head_size, u64 *mapping) { diff --git a/drivers/infiniband/ulp/ipoib/ipoib_cm.c b/drivers/infiniband/ulp/ipoib/ipoib_cm.c index c7d42ea..a9af796 100644 --- a/drivers/infiniband/ulp/ipoib/ipoib_cm.c +++ b/drivers/infiniband/ulp/ipoib/ipoib_cm.c @@ -283,11 +283,10 @@ static int ipoib_cm_nonsrq_init_rx(struct net_device *dev, struct ib_cm_id *cm_i spin_unlock_irq(&priv->lock); for (i = 0; i < ipoib_recvq_size; ++i) { - rx->rx_ring[i].skb = ipoib_cm_alloc_rx_skb(dev, i, - IPOIB_CM_RX_SG - 1, - IPOIB_CM_HEAD_SIZE, - 12, - rx->rx_ring[i].mapping); + rx->rx_ring[i].skb = ipoib_alloc_rx_skb(dev, i, + IPOIB_CM_RX_SG - 1, + IPOIB_CM_HEAD_SIZE, 12, + rx->rx_ring[i].mapping); if (!rx->rx_ring[i].skb) { ipoib_warn(priv, "failed to allocate receive buffer %d\n", i); ret = -ENOMEM; @@ -491,8 +490,8 @@ void ipoib_cm_handle_rx_wc(struct net_device *dev, struct ib_wc *wc) frags = PAGE_ALIGN(wc->byte_len - min(wc->byte_len, (unsigned)IPOIB_CM_HEAD_SIZE)) / PAGE_SIZE; - newskb = ipoib_cm_alloc_rx_skb(dev, wr_id, frags, IPOIB_CM_HEAD_SIZE, - 12, mapping); + newskb = ipoib_alloc_rx_skb(dev, wr_id, frags, IPOIB_CM_HEAD_SIZE, + 12, mapping); if (unlikely(!newskb)) { /* * If we can't allocate a new RX buffer, dump @@ -1396,10 +1395,10 @@ int ipoib_cm_dev_init(struct net_device *dev) if (ipoib_cm_has_srq(dev)) { for (i = 0; i < 
ipoib_recvq_size; ++i) { priv->cm.srq_ring[i].skb = - ipoib_cm_alloc_rx_skb(dev, i, - priv->cm.num_frags - 1, - IPOIB_CM_HEAD_SIZE, 12, - priv->cm.srq_ring[i].mapping); + ipoib_alloc_rx_skb(dev, i, + priv->cm.num_frags - 1, + IPOIB_CM_HEAD_SIZE, 12, + priv->cm.srq_ring[i].mapping); if (!priv->cm.srq_ring[i].skb) { ipoib_warn(priv, "failed to allocate " "receive buffer %d\n", i); diff --git a/drivers/infiniband/ulp/ipoib/ipoib_ib.c b/drivers/infiniband/ulp/ipoib/ipoib_ib.c index 138c758..d6967ab 100644 --- a/drivers/infiniband/ulp/ipoib/ipoib_ib.c +++ b/drivers/infiniband/ulp/ipoib/ipoib_ib.c @@ -90,25 +90,18 @@ void ipoib_free_ah(struct kref *kref) static int ipoib_ib_post_receive(struct net_device *dev, int id) { struct ipoib_dev_priv *priv = netdev_priv(dev); - struct ib_sge list; - struct ib_recv_wr param; struct ib_recv_wr *bad_wr; int ret; - list.addr = priv->rx_ring[id].mapping; - list.length = IPOIB_BUF_SIZE; - list.lkey = priv->mr->lkey; - - param.next = NULL; - param.wr_id = id | IPOIB_OP_RECV; - param.sg_list = &list; - param.num_sge = 1; - - ret = ib_post_recv(priv->qp, &param, &bad_wr); + priv->rx_wr.wr_id = id | IPOIB_OP_RECV; + for (i = 0; i < IPOIB_UD_RX_SG(priv->max_ib_mtu); ++i) + priv->rx_sge[i].addr = priv->rx_ring[id].mapping[i]; + ret = ib_post_recv(priv->qp, &priv->rx_wr, &bad_wr); if (unlikely(ret)) { ipoib_warn(priv, "receive failed for buf %d (%d)\n", id, ret); - ib_dma_unmap_single(priv->ca, priv->rx_ring[id].mapping, - IPOIB_BUF_SIZE, DMA_FROM_DEVICE); + ipoib_dma_unmap_rx(priv, IPOIB_UD_RX_SG(priv->max_ib_mtu) - 1, + IPOIB_UD_HEAD_SIZE(priv->max_ib_mtu), + priv->rx_ring[id].mapping); dev_kfree_skb_any(priv->rx_ring[id].skb); priv->rx_ring[id].skb = NULL; } @@ -116,9 +109,9 @@ static int ipoib_ib_post_receive(struct net_device *dev, int id) return ret; } -static struct sk_buff *ipoib_cm_alloc_rx_skb(struct net_device *dev, - int id, int frags, int head_size, - int pad, u64 mapping) +struct sk_buff *ipoib_alloc_rx_skb(struct net_device *dev, + 
int id, int frags, int head_size, + int pad, u64 mapping) { struct ipoib_dev_priv *priv = netdev_priv(dev); struct sk_buff *skb; @@ -201,43 +194,17 @@ static void skb_put_frags(struct sk_buff *skb, unsigned int hdr_space, } } -static int ipoib_alloc_rx_skb(struct net_device *dev, int id) -{ - struct ipoib_dev_priv *priv = netdev_priv(dev); - struct sk_buff *skb; - u64 addr; - - skb = dev_alloc_skb(IPOIB_BUF_SIZE + 4); - if (!skb) - return -ENOMEM; - - /* - * IB will leave a 40 byte gap for a GRH and IPoIB adds a 4 byte - * header. So we need 4 more bytes to get to 48 and align the - * IP header to a multiple of 16. - */ - skb_reserve(skb, 4); - - addr = ib_dma_map_single(priv->ca, skb->data, IPOIB_BUF_SIZE, - DMA_FROM_DEVICE); - if (unlikely(ib_dma_mapping_error(priv->ca, addr))) { - dev_kfree_skb_any(skb); - return -EIO; - } - - priv->rx_ring[id].skb = skb; - priv->rx_ring[id].mapping = addr; - - return 0; -} - static int ipoib_ib_post_receives(struct net_device *dev) { struct ipoib_dev_priv *priv = netdev_priv(dev); int i; for (i = 0; i < ipoib_recvq_size; ++i) { - if (ipoib_alloc_rx_skb(dev, i)) { + priv->rx_ring[i].skb = ipoib_alloc_rx_skb(dev, i, + IPOIB_UD_RX_SG(priv->max_ib_mtu) - 1, + IPOIB_UD_HEAD_SIZE(priv->max_ib_mtu), 4, + priv->rx_ring[i].mapping); + if (!priv->rx_ring[i].skb) { ipoib_warn(priv, "failed to allocate receive buffer %d\n", i); return -ENOMEM; } @@ -254,8 +221,9 @@ static void ipoib_ib_handle_rx_wc(struct net_device *dev, struct ib_wc *wc) { struct ipoib_dev_priv *priv = netdev_priv(dev); unsigned int wr_id = wc->wr_id & ~IPOIB_OP_RECV; - struct sk_buff *skb; + struct sk_buff *skb, *newskb; u64 addr; + int frags; ipoib_dbg_data(priv, "recv completion: id %d, status: %d\n", wr_id, wc->status); @@ -267,15 +235,15 @@ static void ipoib_ib_handle_rx_wc(struct net_device *dev, struct ib_wc *wc) } skb = priv->rx_ring[wr_id].skb; - addr = priv->rx_ring[wr_id].mapping; if (unlikely(wc->status != IB_WC_SUCCESS)) { if (wc->status != 
IB_WC_WR_FLUSH_ERR) ipoib_warn(priv, "failed recv event " "(status=%d, wrid=%d vend_err %x)\n", wc->status, wr_id, wc->vendor_err); - ib_dma_unmap_single(priv->ca, addr, - IPOIB_BUF_SIZE, DMA_FROM_DEVICE); + ipoib_dma_unmap_rx(priv, IPOIB_UD_RX_SG(priv->max_ib_mtu) - 1, + IPOIB_UD_HEAD_SIZE(priv->max_ib_mtu), + priv->rx_ring[wr_id].mapping); dev_kfree_skb_any(skb); priv->rx_ring[wr_id].skb = NULL; return; @@ -288,11 +256,18 @@ static void ipoib_ib_handle_rx_wc(struct net_device *dev, struct ib_wc *wc) if (wc->slid == priv->local_lid && wc->src_qp == priv->qp->qp_num) goto repost; + frags = PAGE_ALIGN(wc->byte_len - min(wc->byte_len, + (unsigned)IPOIB_UD_HEAD_SIZE(priv->max_ib_mtu))) / PAGE_SIZE; + newskb = ipoib_alloc_rx_skb(dev, wr_id, frags, + IPOIB_UD_HEAD_SIZE(priv->max_ib_mtu), + 4, mapping); + /* * If we can't allocate a new RX buffer, dump * this packet and reuse the old buffer. */ - if (unlikely(ipoib_alloc_rx_skb(dev, wr_id))) { + if (unlikely(!newskb)) { + ipoib_dbg(priv, "failed to allocate receive buffer %d\n", wr_id); ++dev->stats.rx_dropped; goto repost; } @@ -300,9 +275,12 @@ static void ipoib_ib_handle_rx_wc(struct net_device *dev, struct ib_wc *wc) ipoib_dbg_data(priv, "received %d bytes, SLID 0x%04x\n", wc->byte_len, wc->slid); - ib_dma_unmap_single(priv->ca, addr, IPOIB_BUF_SIZE, DMA_FROM_DEVICE); + ipoib_dma_unmap_rx(priv, frags, IPOIB_UD_HEAD_SIZE(priv->max_ib_mtu), + priv->rx_ring[wr_id].mapping); + memcpy(priv->rx_ring[wr_id].mapping, mapping, + (frags + 1) * sizeof *mapping); - skb_put(skb, wc->byte_len); + skb_put_frags(skb, IPOIB_UD_HEAD_SIZE(priv->max_ib_mtu), wc->byte_len, newskb); skb_pull(skb, IB_GRH_BYTES); skb->protocol = ((struct ipoib_header *) skb->data)->proto; @@ -715,10 +693,10 @@ int ipoib_ib_dev_stop(struct net_device *dev, int flush) rx_req = &priv->rx_ring[i]; if (!rx_req->skb) continue; - ib_dma_unmap_single(priv->ca, - rx_req->mapping, - IPOIB_BUF_SIZE, - DMA_FROM_DEVICE); + ipoib_dma_unmap_rx(priv, + 
IPOIB_UD_RX_SG(priv->max_ib_mtu) - 1, + IPOIB_UD_HEAD_SIZE(priv->max_ib_mtu), + priv->rx_ring[i].mapping); dev_kfree_skb_any(rx_req->skb); rx_req->skb = NULL; } From mashirle at us.ibm.com Wed Jan 30 12:43:02 2008 From: mashirle at us.ibm.com (Shirley Ma) Date: Wed, 30 Jan 2008 12:43:02 -0800 Subject: [ofa-general] [PATCH 2/3] ib/ipoib: set IPoIB-UD RX S/G parameters In-Reply-To: <1201721611.6850.48.camel@localhost.localdomain> References: <1201718540.6850.41.camel@localhost.localdomain> <1201721611.6850.48.camel@localhost.localdomain> Message-ID: <1201725783.6850.60.camel@localhost.localdomain> Found a problem in patch generation file ipoib_verbs.c, I will fix it tomorrow, it should be: --- a/drivers/infiniband/ulp/ipoib/ipoib_verbs.c +++ b/drivers/infiniband/ulp/ipoib/ipoib_verbs.c @@ -150,7 +150,7 @@ int ipoib_transport_dev_init(struct net_device *dev, struct ib_device *ca) .max_send_wr = ipoib_sendq_size, .max_recv_wr = ipoib_recvq_size, .max_send_sge = 1, - .max_recv_sge = 1 + .max_recv_sge = IPOIB_UD_RX_SG(priv->max_ib_mtu) }, .sq_sig_type = IB_SIGNAL_ALL_WR, .qp_type = IB_QPT_UD @@ -208,6 +208,16 @@ int ipoib_transport_dev_init(struct net_device *dev, struct ib_device *ca) priv->tx_wr.num_sge = 1; priv->tx_wr.send_flags = IB_SEND_SIGNALED; + priv->rx_sge[0].length = IPOIB_UD_HEAD_SIZE(priv->max_ib_mtu); + for (i = 0; i < IPOIB_UD_RX_SG(priv->max_ib_mtu) - 1; ++i) { + priv->rx_sge[i].lkey = priv->mr->lkey; + priv->rx_sge[i + 1].length = PAGE_SIZE; + } + priv->rx_sge[i + 1].lkey = priv->mr->lkey; + priv->rx_wr.num_sge = IPOIB_UD_RX_SG(priv->max_ib_mtu); + priv->rx_wr.next = NULL; + priv->rx_wr.sg_list = priv->rx_sge; + return 0; out_free_cq: From ogerlitz at voltaire.com Wed Jan 30 22:48:00 2008 From: ogerlitz at voltaire.com (Or Gerlitz) Date: Thu, 31 Jan 2008 08:48:00 +0200 Subject: [ofa-general] Re: IB/IPoIB Check if grat. 
AR changed had arrived when working in connected mode In-Reply-To: <1201616831.28794.1.camel@mtls03> References: <479F0390.8020102@voltaire.com> <1201619616.7074.3.camel@mtls03> <479F357F.5070808@gmail.com> <1201616831.28794.1.camel@mtls03> Message-ID: <47A16F20.9000507@voltaire.com> Eli Cohen wrote: >> Eli Cohen wrote: >>> Now you may call ipoib_put_ah(neigh->ah) for a CM neighbor and this >>> could cause de-reference of a NULL pointer. Eli, Not that your assumption changes anything regarding the patch's correctness (and if it does, please suggest what problem is introduced by the patch), but you seem to be under the misconception that for a connected mode neighbour neigh->ah is NULL. This is wrong, see path_rec_completion, where an address handle is first created and then assigned to the neighbour whether it is a connected mode or a datagram mode one. One might say that there's a resource waste here, since for a connected mode neighbour the driver consumes two HCA resources (TX QP and AH) whereas for a datagram mode neighbour it consumes only one. This does not look like an easy cleanup, and may be best left for future generations... Or. From ogerlitz at voltaire.com Wed Jan 30 22:51:53 2008 From: ogerlitz at voltaire.com (Or Gerlitz) Date: Thu, 31 Jan 2008 08:51:53 +0200 Subject: [ofa-general] Re: [PATCH 0/3] ib/ipoib: Enable IPoIB-UD 4K MTU support In-Reply-To: <1201718540.6850.41.camel@localhost.localdomain> References: <1201718540.6850.41.camel@localhost.localdomain> Message-ID: <47A17009.9050407@voltaire.com> Shirley Ma wrote: > This patch allows IPoIB-UD MTU up to 4092 (4K - IPOIB_ENCAP_LEN) when > HCA can support 4K MTU. > The patch will be splitted into two patches: > > 1. Make IPoIB-CM RX S/G APIs generic > 2. Enable IPoIB-UD RX S/G Just to make sure, this patch is a candidate for upstream inclusion (which you also want to be present in ofed 1.3) and hence is based against Roland's tree, correct? 
Or From mashirle at us.ibm.com Wed Jan 30 12:58:09 2008 From: mashirle at us.ibm.com (Shirley Ma) Date: Wed, 30 Jan 2008 12:58:09 -0800 Subject: [ofa-general] Re: [PATCH 0/3] ib/ipoib: Enable IPoIB-UD 4K MTU support In-Reply-To: <47A17009.9050407@voltaire.com> References: <1201718540.6850.41.camel@localhost.localdomain> <47A17009.9050407@voltaire.com> Message-ID: <1201726689.19565.0.camel@localhost.localdomain> Hello Or, > Just to make sure, this patch is a candidate for upstream inclusion > (which you want also to be present in ofed 1.3) and hence is based > against Roland's tree, correct? Yes. I forgot to mention these patches are created against Roland's 2.6.25 tree. Thanks Shirley From ogerlitz at voltaire.com Wed Jan 30 23:04:44 2008 From: ogerlitz at voltaire.com (Or Gerlitz) Date: Thu, 31 Jan 2008 09:04:44 +0200 Subject: [ofa-general] Re: [PATCH 0/3] ib/ipoib: Enable IPoIB-UD 4K MTU support In-Reply-To: <1201726689.19565.0.camel@localhost.localdomain> References: <1201718540.6850.41.camel@localhost.localdomain> <47A17009.9050407@voltaire.com> <1201726689.19565.0.camel@localhost.localdomain> Message-ID: <47A1730C.3090204@voltaire.com> Shirley Ma wrote: > Yes. I forgot to mention these patches are created against Roland's > 2.6.25 tree. I see, but I want to make sure whether these patches are the ones you want to merge into the kernel, or whether it's more of a work in progress which you want to be included in this experimental testbed called ofed. If it's a candidate for upstream inclusion, I find it hard to review since there is no per-patch change-log. Or. 
From mashirle at us.ibm.com Wed Jan 30 13:17:42 2008 From: mashirle at us.ibm.com (Shirley Ma) Date: Wed, 30 Jan 2008 13:17:42 -0800 Subject: [ofa-general] Re: [PATCH 0/3] ib/ipoib: Enable IPoIB-UD 4K MTU support In-Reply-To: <47A1730C.3090204@voltaire.com> References: <1201718540.6850.41.camel@localhost.localdomain> <47A17009.9050407@voltaire.com> <1201726689.19565.0.camel@localhost.localdomain> <47A1730C.3090204@voltaire.com> Message-ID: <1201727862.19565.7.camel@localhost.localdomain> Hello Or, On Thu, 2008-01-31 at 09:04 +0200, Or Gerlitz wrote: > Shirley Ma wrote: > > Yes. I forgot to mention these patches are created against Roland's > > 2.6.25 tree. > > I see, but I want to make sure these patches are the one you want to > merge into the kernel or its more of a work in progress which you want > to be included in this experimental testbed called ofed. I will create a patch for OFED-1.3-RC3 separately. I wouldn't call it experimental code, since these APIs have already been tested along with IPoIB-CM. They are pretty stable. > If its candidate for upstream inclusion, I find it hard to review since > there is no per patch change-log. Thanks for the advice, I thought one change log was enough. If not, I will resubmit these patches along with a one-line change-log. 
Thanks Shirley From ogerlitz at voltaire.com Wed Jan 30 23:27:19 2008 From: ogerlitz at voltaire.com (Or Gerlitz) Date: Thu, 31 Jan 2008 09:27:19 +0200 Subject: [ofa-general] Re: [PATCH 0/3] ib/ipoib: Enable IPoIB-UD 4K MTU support In-Reply-To: <1201727862.19565.7.camel@localhost.localdomain> References: <1201718540.6850.41.camel@localhost.localdomain> <47A17009.9050407@voltaire.com> <1201726689.19565.0.camel@localhost.localdomain> <47A1730C.3090204@voltaire.com> <1201727862.19565.7.camel@localhost.localdomain> Message-ID: <47A17857.8000000@voltaire.com> Shirley Ma wrote: > On Thu, 2008-01-31 at 09:04 +0200, Or Gerlitz wrote: >> I see, but I want to make sure these patches are the one you want to >> merge into the kernel or its more of a work in progress which you want >> to be included in this experimental testbed called ofed. > I will create patch for OFED-1.3-RC3 separately. I wouldn't call it's > experimental code since these APIs have been tested along with IPoIB-CM > already. They are pretty stable. I meant to say that ofed is an experimental testbed; this is becoming more and more clear to more and more people. I did not address your patchset specifically. >> If its candidate for upstream inclusion, I find it hard to review since >> there is no per patch change-log. > Thanks for the advice, I thought one change log was enough. If not, I > will resubmit these patches along with one ling change-log. If you think that a one-line change log for each patch is enough for a reviewer, let it be, but if I were you, I would validate this assumption again. The thing is that when you send an RFC, many times most of the documentation is in the virtual 0/N patch, but remember that this documentation does not go into the git change-log, so in your case, since you want this to be merged, you have to work harder and document both in the 0/N and also in the 1/N, 2/N ... N/N postings. 
Or From ogerlitz at voltaire.com Wed Jan 30 23:32:28 2008 From: ogerlitz at voltaire.com (Or Gerlitz) Date: Thu, 31 Jan 2008 09:32:28 +0200 Subject: [ofa-general] Re: [PATCH 0/3] ib/ipoib: Enable IPoIB-UD 4K MTU support In-Reply-To: <1201718540.6850.41.camel@localhost.localdomain> References: <1201718540.6850.41.camel@localhost.localdomain> Message-ID: <47A1798C.8050202@voltaire.com> Shirley Ma wrote: > The node IPoIB link MTU size is the minimum value of admin configurable > MTU through ifconfig and IPoIB default broadcast group MTU size. When > Subnet Manager enables default broadcast group during start up, this > subnet IPoIB link MTU will be the value of default broadcast group MTU > size. For any node IB MTU smaller than this value, the node can't join > this IPoIB subnet. For any node IB MTU is greater than this value, the > node will join this IPoIB subnet and this value will be set as its IPOIB > link MTU. If Subnet Manager disables default broadcast group during > start up, the first bring up node in this subnet will create the default > IPoIB broadcast group based on the negotiation with the Subnet Manager, > the default is currently set as 2K according to IPoIB RFC. Hi Shirley, Just to make sure, can you confirm that this patch set is not dependent on the below patch, which is part of ofed but was never submitted to the upstream ipoib driver for inclusion? Also, can you share which SM you checked this with, and whether you had to patch it or run it with non-default parameters? And what was the configuration: specifically, which switch was used, and what changes did you make to the switch FW? Thanks. Or > IB/ipoib: user appropriate mtu selector for path queries > > IPoIB must set mtu selector in path record query according to dev->mtu: > if we wildcard it, SM can select a path with lower MTU. > This breaks IPoIB on networks with SM Tavor quirk activates. 
> > We can always require this, since IPoIB spec includes the following statement: > The value (for IB MTU) assigned to the broadcast-GID must not > be greater than any physical link MTU spanned by the IPoIB > subnet. > > Signed-off-by: Michael S. Tsirkin > > --- > > Note the following uses IB_SA_GT so it should be applied on top of SA > enum rename. > > Index: ofed_1_1/drivers/infiniband/ulp/ipoib/ipoib_main.c > =================================================================== > --- ofed_1_1.orig/drivers/infiniband/ulp/ipoib/ipoib_main.c > +++ ofed_1_1/drivers/infiniband/ulp/ipoib/ipoib_main.c > @@ -182,6 +182,8 @@ static int ipoib_change_mtu(struct net_d > > dev->mtu = min(priv->mcast_mtu, priv->admin_mtu); > > + queue_work(ipoib_workqueue, &priv->flush_task); > + > return 0; > } > > @@ -452,15 +454,39 @@ static int path_rec_start(struct net_dev > struct ipoib_path *path) > { > struct ipoib_dev_priv *priv = netdev_priv(dev); > + ib_sa_comp_mask comp_mask = IB_SA_PATH_REC_MTU_SELECTOR | IB_SA_PATH_REC_MTU; > + > + path->pathrec.mtu_selector = IB_SA_GT; > > - ipoib_dbg(priv, "Start path record lookup for " IPOIB_GID_FMT "\n", > - IPOIB_GID_ARG(path->pathrec.dgid)); > + switch (roundup_pow_of_two(dev->mtu + IPOIB_ENCAP_LEN)) { > + case 512: > + path->pathrec.mtu = IB_MTU_256; > + break; > + case 1024: > + path->pathrec.mtu = IB_MTU_512; > + break; > + case 2048: > + path->pathrec.mtu = IB_MTU_1024; > + break; > + case 4096: > + path->pathrec.mtu = IB_MTU_2048; > + break; > + default: > + /* Wildcard everything */ > + comp_mask = 0; > + path->pathrec.mtu = 0; > + path->pathrec.mtu_selector = 0; > + } > + > + ipoib_dbg(priv, "Start path record lookup for " IPOIB_GID_FMT " MTU > %d\n", > + IPOIB_GID_ARG(path->pathrec.dgid), > + comp_mask ? 
ib_mtu_enum_to_int(path->pathrec.mtu) : 0); > > init_completion(&path->done); > > path->query_id = > ib_sa_path_rec_get(&ipoib_sa_client, priv->ca, priv->port, > - &path->pathrec, > + &path->pathrec, comp_mask | > IB_SA_PATH_REC_DGID | > IB_SA_PATH_REC_SGID | > IB_SA_PATH_REC_NUMB_PATH | From mashirle at us.ibm.com Wed Jan 30 13:53:31 2008 From: mashirle at us.ibm.com (Shirley Ma) Date: Wed, 30 Jan 2008 13:53:31 -0800 Subject: [ofa-general] Re: [PATCH 0/3] ib/ipoib: Enable IPoIB-UD 4K MTU support In-Reply-To: <47A17857.8000000@voltaire.com> References: <1201718540.6850.41.camel@localhost.localdomain> <47A17009.9050407@voltaire.com> <1201726689.19565.0.camel@localhost.localdomain> <47A1730C.3090204@voltaire.com> <1201727862.19565.7.camel@localhost.localdomain> <47A17857.8000000@voltaire.com> Message-ID: <1201730011.19565.11.camel@localhost.localdomain> > If you think that for each of the patch one line change log is enough > for a reviewer, let it be, but if I were you, I would validate again > this assumption. The things is that when you send an RFC, many times > most of the documentation is in the virtual 0/N patch, but remember that > this documentation does not go into the git change-log, so in your > case since you want this to be merged, you have to work harder and > document both in the 0/N and also in the 1/N, 2/N ... N/N postings. That's a good suggestion. It seems ipoib_cm.c has been changed in the past few hours. I am having trouble to apply them. I am cleaning my local tree and redo all patches with change-log. thanks Shirley From eli at mellanox.co.il Thu Jan 31 00:03:30 2008 From: eli at mellanox.co.il (Eli Cohen) Date: Thu, 31 Jan 2008 10:03:30 +0200 Subject: [ofa-general] Re: IB/IPoIB Check if grat. 
AR changed had arrived when working in connected mode In-Reply-To: <47A16F20.9000507@voltaire.com> References: <479F0390.8020102@voltaire.com> <1201619616.7074.3.camel@mtls03> <479F357F.5070808@gmail.com> <1201616831.28794.1.camel@mtls03> <47A16F20.9000507@voltaire.com> Message-ID: <1201766610.27803.6.camel@mtls03> On Thu, 2008-01-31 at 08:48 +0200, Or Gerlitz wrote: > Eli Cohen wrote: > >> Eli Cohen wrote: > >>> Now you may call ipoib_put_ah(neigh->ah) for a CM neighbor and this > >>> could cause de-reference of a NULL pointer. > > Eli, > > Not that your assumption changes anything regarding the patch > correctness (and if it does, please suggest what problem is introduced > by the patch), but, you might have some misconception here that for a > connected mode neighbour neigh->ah is NULL. > > This is wrong, see patch_rec_completion where first an address handle is > created and then assigned to the neighbour no matter if its a connected > mode or datagram mode one. Yes, thanks for pointing this out. > > One might say that there's a resource waste here, since for connected > mode neighbour the driver consumes two HCA resources (TX QP and AH) > where for datagram mode neigh it consumes only one. This seems as not > too easy cleanup, which maybe best left for the future generations... > I think the waste of resources is not that significant and we have the benefit that it allows switching smoothly from CM to UD mode. From eli at dev.mellanox.co.il Thu Jan 31 00:14:26 2008 From: eli at dev.mellanox.co.il (Eli Cohen) Date: Thu, 31 Jan 2008 10:14:26 +0200 Subject: [ofa-general] Bonding and hw_csum In-Reply-To: References: Message-ID: <1201767266.27803.8.camel@mtls03> On Wed, 2008-01-30 at 19:35 -0800, Shirley Ma wrote: > Hello Tziporet, > > > the hw checksum patch was removed from OFED 1.3 > > > > Tziporet > > Could youp please specify which patch has been removed? I still can > see a list of patches under RC3. 
here they are: > > ipoib_0010_Add-high-dma-support-to-ipoib.patch > ipoib_0020_Add-s-g-support-for-IPOIB.patch > ipoib_0030_hw_csum.patch > ipoib_0040_checksum-offload.patch > ipoib_0050_Add-LSO-support.patch > ipoib_0060_ethtool-support.patch > ipoib_0070_modiy_cq_params.patch > ipoib_0080_broadcast_null.patch > ipoib_0110_set_default_cq_patams.patch > ipoib_0030_hw_csum.patch has been removed From mashirle at us.ibm.com Wed Jan 30 14:18:09 2008 From: mashirle at us.ibm.com (Shirley Ma) Date: Wed, 30 Jan 2008 14:18:09 -0800 Subject: [ofa-general] Re: [PATCH 0/3] ib/ipoib: Enable IPoIB-UD 4K MTU support In-Reply-To: <47A1798C.8050202@voltaire.com> References: <1201718540.6850.41.camel@localhost.localdomain> <47A1798C.8050202@voltaire.com> Message-ID: <1201731489.19565.37.camel@localhost.localdomain> On Thu, 2008-01-31 at 09:32 +0200, Or Gerlitz wrote: > Hi Shirley, > > Just to make sure, can you confirm that this patch set is not > dependent > on the below patch which is part of ofed but was never submitted to > the > upstream ipoib driver for inclusion? No, this patchset is not dependent on any OFED patches. It's a pure patch set for the 2.6.25 kernel. I have another version of this patchset which is built against OFED-1.3-RC2. I will update it to OFED-1.3-RC3. I hope to get a quick ack from the maintainers that they agree with this approach. I see around 1.5-2 times better performance when using a 4K MTU for IPoIB-UD. I will resubmit this patchset tomorrow. You should wait for the new patchset, since I found some minor problems when I split these patches. > Also, can you share with what SM have you checked this, did you had > to > patch or run it with non-default param, more, what was the > configuration, specifically what switch was used and any > instrumentation > you have made to the switch FW, thanks. One of the reasons this patchset could not be submitted earlier was the SW support. 
I couldn't do a full test without a SW that supports 4K MTU. The SW firmware needs to be updated so that the IPoIB broadcast group can be created with a 4096 MTU size. There are two requirements on the switch from the SW perspective: 1. The SW ports can be configured to a 4096 MTU size. 2. The SW default IPoIB broadcast group can be configured to a 4096 MTU size. The default IPoIB broadcast group MTU can't exceed the SW ports' MTU size. The way to enable IPoIB 4K MTU is: 1. Set the SW ports to 4K MTU. 2. Set the SM default IPoIB broadcast group MTU size to 4K. You can disable or enable the IPoIB broadcast group when starting the SM. If you don't enable the IPoIB default broadcast group when starting the SM, the first node in the subnet to come up will create a broadcast group with 2K MTU for this subnet. That makes sense, since the node doesn't know the link MTU size of the whole subnet, so it's better to create a default 2K MTU. If you enable the IPoIB default broadcast group when starting the SM and its MTU size is 2K, then all nodes in the cluster can join the subnet and the IPoIB subnet link MTU size will be set to 2K. If the broadcast group MTU size is 4K, then only nodes with 4K MTU can join this IPoIB subnet. I am not sure that's what you are looking for. Let me know if anything is unclear. thanks Shirley From mashirle at us.ibm.com Wed Jan 30 14:21:39 2008 From: mashirle at us.ibm.com (Shirley Ma) Date: Wed, 30 Jan 2008 14:21:39 -0800 Subject: [ewg] Re: [ofa-general] Bonding and hw_csum In-Reply-To: <1201767266.27803.8.camel@mtls03> References: <1201767266.27803.8.camel@mtls03> Message-ID: <1201731699.19565.41.camel@localhost.localdomain> Hello Eli, > ipoib_0030_hw_csum.patch has been removed Would removing this patch cause any errors on applying the rest of patches? If not, I will remove it for our testing as well. 
Thanks Shirley From eli at dev.mellanox.co.il Thu Jan 31 00:29:56 2008 From: eli at dev.mellanox.co.il (Eli Cohen) Date: Thu, 31 Jan 2008 10:29:56 +0200 Subject: [ewg] Re: [ofa-general] Bonding and hw_csum In-Reply-To: <1201731699.19565.41.camel@localhost.localdomain> References: <1201767266.27803.8.camel@mtls03> <1201731699.19565.41.camel@localhost.localdomain> Message-ID: <1201768196.27803.11.camel@mtls03> On Wed, 2008-01-30 at 14:21 -0800, Shirley Ma wrote: > Hello Eli, > > > ipoib_0030_hw_csum.patch has been removed > > Would removing this patch cause any errors on applying the rest of > patches? If not, I will remove it for our testing as well. > If you're using an ofed tree in which this patch applies, then just removing it will cause quite a few conflicts on subsequent patches. I would suggest you to re-create your patches against the current ofed git tree. From mashirle at us.ibm.com Wed Jan 30 14:40:19 2008 From: mashirle at us.ibm.com (Shirley Ma) Date: Wed, 30 Jan 2008 14:40:19 -0800 Subject: [ewg] Re: [ofa-general] Bonding and hw_csum In-Reply-To: <1201768196.27803.11.camel@mtls03> References: <1201767266.27803.8.camel@mtls03> <1201731699.19565.41.camel@localhost.localdomain> <1201768196.27803.11.camel@mtls03> Message-ID: <1201732819.19565.43.camel@localhost.localdomain> On Thu, 2008-01-31 at 10:29 +0200, Eli Cohen wrote: > If you're using an ofed tree in which this patch applies, then just > removing it will cause quite a few conflicts on subsequent patches. I > would suggest you to re-create your patches against the current ofed > git tree. Thanks, will do. 
Shirley From eli at mellanox.co.il Thu Jan 31 01:28:23 2008 From: eli at mellanox.co.il (Eli Cohen) Date: Thu, 31 Jan 2008 11:28:23 +0200 Subject: [ofa-general] Re: [PATCH 0/3] ib/ipoib: Enable IPoIB-UD 4K MTU support In-Reply-To: <47A1798C.8050202@voltaire.com> References: <1201718540.6850.41.camel@localhost.localdomain> <47A1798C.8050202@voltaire.com> Message-ID: <1201771703.27803.18.camel@mtls03> Hi shirley, my comments are: 1. The first patch (1/3) is malformed. I suggest you try to apply it before sending. 2. Make sure they compile before submitting - patch 1/3 for example changes ipoib_rx_buf struct ipoib_rx_buf { struct sk_buff *skb; u64 mapping; + u64 mapping[IPOIB_CM_RX_SG]; }; but does not change code in the UD flow to align with these changes. 3. Please put an explanation in the changelog. From eli at dev.mellanox.co.il Thu Jan 31 02:11:43 2008 From: eli at dev.mellanox.co.il (Eli Cohen) Date: Thu, 31 Jan 2008 12:11:43 +0200 Subject: [ofa-general] [PATCH] IPoIB UD 4K MTU support In-Reply-To: <1201340657.10918.8.camel@dyn9047018117.beaverton.ibm.com> References: <1201025321.756.33.camel@localhost.localdomain> <1201102346.6925.215.camel@mtls03> <1201068498.756.59.camel@localhost.localdomain> <4e6a6b3c0801231354v24baa6d8q5034a1cdaf3c704f@mail.gmail.com> <1201162492.9739.21.camel@localhost.localdomain> <1201181656.9739.42.camel@localhost.localdomain> <1201340657.10918.8.camel@dyn9047018117.beaverton.ibm.com> Message-ID: <1201774303.27803.26.camel@mtls03> On Sat, 2008-01-26 at 01:44 -0800, Shirley Ma wrote: > > However I came up with a tricky approach that might work well. We > > would use two-element scatter lists for the receives, and post a > > 40-byte dummy buffer first and then a 4096 byte buffer for the actual > > packet. 
Since the only thing we do with the first 40 bytes is throw > > them away, we wouldn't even have to make the 40 bytes part of the skb; > > in fact we could have one buffer that every receive uses and never > > even touch the first entry of the scatter list after initialization. > > It would even save the skb_pull(skb, IB_GRH_BYTES); we currently do > > after receiving messages. > > > > What do you think? > > I thought the same thing before for one buffer allocation, I had a > little bit concern about whether IB_GRH could be used later. I have done > scatter-gather list patch already. It's based on the PAGE_SIZE whether > to use one buffer or two buffers, similar as IPoIB-CM S/G code. It's > under testing. The only thing I haven't finished is making S/G code more > generic and merge IPoIB-CM S/G and IPoIB-UD S/G buffer allocation > together. Since IBM eHCA does support 4K MTU and we would like our > customer to use this feature in OFED-1.3 release. If I merge the > IPoIB-CM S/G code and IPoIB-UD S/G code, it would take much longer for > testing. I wonder whether it's OK to push IPoIB-UD S/G first then merge > IPoIB-UD and IPoIB-CM later. > I don't think it's a good idea to make the code more generic and use the same rx buffer scheme for both UD and CM. In the case of UD we would need 2 scatter entries - one for the dummy GRH bytes which can be initialized once to point at the same buffer, and the second to point to the real data buffer. I prefer to modify the UD code to work with 4K MTU. The reasoning is that CM scatters can reach an overall size of 17 which would require more memory and would consume more CPU cycles to handle (e.g. when a packet is received). This can be crucial for the cases where small UDP packets performance is needed.
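For readers following the thread, here is a minimal user-space sketch of the two-entry receive scatter list described above: entry 0 points at one 40-byte GRH sink shared by every receive, and entry 1 points at a per-receive 4096-byte data buffer. This is only a model under stated assumptions, not the IPoIB driver code: `sge_model`, `rx_slot`, `grh_sink`, `rx_slot_init` and `rx_slot_refill` are hypothetical names, plain `malloc` stands in for skb allocation and DMA mapping, and the old buffer is simply freed where a driver would hand it up the network stack.

```c
#include <stdint.h>
#include <stdlib.h>

#define GRH_BYTES   40    /* GRH size, the 40 bytes mentioned in the thread */
#define UD_BUF_SIZE 4096  /* payload buffer for a 4K MTU */

/* Hypothetical stand-in for one scatter/gather entry (address + length). */
struct sge_model {
    uintptr_t addr;
    uint32_t  length;
};

/* One receive slot: sg[0] is the shared GRH sink, sg[1] the packet data. */
struct rx_slot {
    struct sge_model sg[2];
    unsigned char   *data;   /* per-slot payload buffer owned by the slot */
};

/* A single dummy buffer that every receive uses for the GRH bytes. */
static unsigned char grh_sink[GRH_BYTES];

/* Initialize a slot; returns 0 on success, -1 if allocation fails. */
static int rx_slot_init(struct rx_slot *slot)
{
    slot->data = malloc(UD_BUF_SIZE);
    if (!slot->data)
        return -1;

    /* Entry 0 is written once here and never touched again. */
    slot->sg[0].addr   = (uintptr_t)grh_sink;
    slot->sg[0].length = GRH_BYTES;

    slot->sg[1].addr   = (uintptr_t)slot->data;
    slot->sg[1].length = UD_BUF_SIZE;
    return 0;
}

/* Replace the payload buffer after a receive; only entry 1 changes. */
static int rx_slot_refill(struct rx_slot *slot)
{
    unsigned char *fresh = malloc(UD_BUF_SIZE);
    if (!fresh)
        return -1;

    free(slot->data);        /* a driver would pass this buffer upward instead */
    slot->data = fresh;
    slot->sg[1].addr = (uintptr_t)fresh;
    return 0;
}
```

Because the GRH lands in the shared sink rather than in the payload buffer, the data buffer in this model starts directly at the packet, which is the saving of the skb_pull(skb, IB_GRH_BYTES) step that the quoted proposal mentions.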
From ogerlitz at voltaire.com Thu Jan 31 02:31:16 2008 From: ogerlitz at voltaire.com (Or Gerlitz) Date: Thu, 31 Jan 2008 12:31:16 +0200 (IST) Subject: [ofa-general] back to the max_qp_wr attribute Message-ID: Hi, Doing HCA query (using ibv_devinfo on a system with two HCAs, memfull Arbel and connectx, see details below), I have noticed that the value of the max_qp_wr attribute is different, 64K on Arbel and 16K on connectx. I thought that its possible that on the mlx4 case some filter function is applied on the values returned by the FW, but I could not find an evidence for that in the code - Roland, am I right and you return the FW values? Jack, if this is not the case, can you confirm that the connectx supported value is indeed 16K, is it FW depedent? Or hca_id: mthca0 fw_ver: 4.7.600 node_guid: 0008:f104:0398:311c sys_image_guid: 0008:f104:0398:311f vendor_id: 0x08f1 vendor_part_id: 25208 hw_ver: 0xA0 board_id: VLT0040010001 phys_port_cnt: 2 max_mr_size: 0xffffffffffffffff page_size_cap: 0xfffff000 max_qp: 64512 max_qp_wr: 65535 device_cap_flags: 0x00001c76 max_sge: 59 max_sge_rd: 0 max_cq: 65408 max_cqe: 131071 max_mr: 131056 max_pd: 32768 max_qp_rd_atom: 4 max_ee_rd_atom: 0 max_res_rd_atom: 258048 max_qp_init_rd_atom: 128 max_ee_init_rd_atom: 0 atomic_cap: ATOMIC_HCA (1) max_ee: 0 max_rdd: 0 max_mw: 0 max_raw_ipv6_qp: 0 max_raw_ethy_qp: 0 max_mcast_grp: 8192 max_mcast_qp_attach: 56 max_total_mcast_qp_attach: 458752 max_ah: 0 max_fmr: 0 max_srq: 960 max_srq_wr: 65535 max_srq_sge: 31 max_pkeys: 64 local_ca_ack_delay: 15 port: 1 state: PORT_ACTIVE (4) max_mtu: 2048 (4) active_mtu: 2048 (4) sm_lid: 35 port_lid: 3 port_lmc: 0x00 max_msg_sz: 0x80000000 port_cap_flags: 0x02510a68 max_vl_num: 8 (4) bad_pkey_cntr: 0x0 qkey_viol_cntr: 0x0 sm_sl: 0 pkey_tbl_len: 64 gid_tbl_len: 32 subnet_timeout: 18 init_type_reply: 0 active_width: 4X (2) active_speed: 2.5 Gbps (1) phys_state: LINK_UP (5) GID[ 0]: fe80:0000:0000:0001:0008:f104:0398:311d port: 2 state: PORT_ACTIVE (4) 
max_mtu: 2048 (4) active_mtu: 2048 (4) sm_lid: 35 port_lid: 18 port_lmc: 0x00 max_msg_sz: 0x80000000 port_cap_flags: 0x02510a68 max_vl_num: 8 (4) bad_pkey_cntr: 0x0 qkey_viol_cntr: 0x0 sm_sl: 0 pkey_tbl_len: 64 gid_tbl_len: 32 subnet_timeout: 18 init_type_reply: 0 active_width: 4X (2) active_speed: 2.5 Gbps (1) phys_state: LINK_UP (5) GID[ 0]: fe80:0000:0000:0001:0008:f104:0398:311e hca_id: mlx4_0 fw_ver: 2.2.000 node_guid: 0000:0002:c900:1a24 sys_image_guid: 0000:0002:c900:1a27 vendor_id: 0x08f1 vendor_part_id: 25418 hw_ver: 0xA0 board_id: VLT0130010001 phys_port_cnt: 2 max_mr_size: 0xffffffffffffffff page_size_cap: 0xfffff000 max_qp: 65472 max_qp_wr: 16384 device_cap_flags: 0x00041c66 max_sge: 32 max_sge_rd: 0 max_cq: 65408 max_cqe: 4194303 max_mr: 131056 max_pd: 32764 max_qp_rd_atom: 16 max_ee_rd_atom: 0 max_res_rd_atom: 1047552 max_qp_init_rd_atom: 128 max_ee_init_rd_atom: 0 atomic_cap: ATOMIC_HCA (1) max_ee: 0 max_rdd: 0 max_mw: 0 max_raw_ipv6_qp: 0 max_raw_ethy_qp: 0 max_mcast_grp: 8192 max_mcast_qp_attach: 56 max_total_mcast_qp_attach: 458752 max_ah: 0 max_fmr: 0 max_srq: 65472 max_srq_wr: 16383 max_srq_sge: 31 max_pkeys: 128 local_ca_ack_delay: 15 port: 1 state: PORT_DOWN (1) max_mtu: 2048 (4) active_mtu: 2048 (4) sm_lid: 0 port_lid: 0 port_lmc: 0x00 max_msg_sz: 0x40000000 port_cap_flags: 0x02510868 max_vl_num: 8 (4) bad_pkey_cntr: 0x0 qkey_viol_cntr: 0x0 sm_sl: 0 pkey_tbl_len: 128 gid_tbl_len: 128 subnet_timeout: 0 init_type_reply: 0 active_width: 4X (2) active_speed: 2.5 Gbps (1) phys_state: POLLING (2) GID[ 0]: fe80:0000:0000:0000:0000:0002:c900:1a25 port: 2 state: PORT_DOWN (1) max_mtu: 2048 (4) active_mtu: 2048 (4) sm_lid: 0 port_lid: 0 port_lmc: 0x00 max_msg_sz: 0x40000000 port_cap_flags: 0x02510868 max_vl_num: 8 (4) bad_pkey_cntr: 0x0 qkey_viol_cntr: 0x0 sm_sl: 0 pkey_tbl_len: 128 gid_tbl_len: 128 subnet_timeout: 0 init_type_reply: 0 active_width: 4X (2) active_speed: 2.5 Gbps (1) phys_state: POLLING (2) GID[ 0]: 
fe80:0000:0000:0000:0000:0002:c900:1a26 From vlad at lists.openfabrics.org Thu Jan 31 03:11:48 2008 From: vlad at lists.openfabrics.org (Vladimir Sokolovsky Mellanox) Date: Thu, 31 Jan 2008 03:11:48 -0800 (PST) Subject: [ofa-general] ofa_1_3_kernel 20080131-0200 daily build status Message-ID: <20080131111148.C1468E6109E@openfabrics.org> This email was generated automatically, please do not reply git_url: git://git.openfabrics.org/ofed_1_3/linux-2.6.git git_branch: ofed_kernel Common build parameters: --with-ipoib-mod --with-sdp-mod --with-srp-mod --with-user_mad-mod --with-user_access-mod --with-mthca-mod --with-mlx4-mod --with-core-mod --with-addr_trans-mod --with-rds-mod --with-cxgb3-mod --with-nes-mod Passed: Passed on i686 with 2.6.15-23-server Passed on i686 with linux-2.6.21.1 Passed on i686 with linux-2.6.22 Passed on i686 with linux-2.6.18 Passed on i686 with linux-2.6.17 Passed on i686 with linux-2.6.16 Passed on i686 with linux-2.6.19 Passed on i686 with linux-2.6.13 Passed on i686 with linux-2.6.14 Passed on i686 with linux-2.6.15 Passed on i686 with linux-2.6.12 Passed on x86_64 with linux-2.6.20 Passed on x86_64 with linux-2.6.16 Passed on ia64 with linux-2.6.18 Passed on ia64 with linux-2.6.17 Passed on ia64 with linux-2.6.14 Passed on ppc64 with linux-2.6.16 Passed on ppc64 with linux-2.6.15 Passed on ia64 with linux-2.6.12 Passed on x86_64 with linux-2.6.18 Passed on powerpc with linux-2.6.15 Passed on ia64 with linux-2.6.19 Passed on ppc64 with linux-2.6.14 Passed on ppc64 with linux-2.6.18 Passed on powerpc with linux-2.6.13 Passed on x86_64 with linux-2.6.19 Passed on powerpc with linux-2.6.12 Passed on ppc64 with linux-2.6.19 Passed on ia64 with linux-2.6.15 Passed on ia64 with linux-2.6.13 Passed on ia64 with linux-2.6.16 Passed on ppc64 with linux-2.6.17 Passed on ppc64 with linux-2.6.12 Passed on ppc64 with linux-2.6.18-8.el5 Passed on x86_64 with linux-2.6.9-42.ELsmp Passed on x86_64 with linux-2.6.14 Passed on x86_64 with 
linux-2.6.16.43-0.3-smp Passed on x86_64 with linux-2.6.13 Passed on powerpc with linux-2.6.14 Passed on x86_64 with linux-2.6.12 Passed on x86_64 with linux-2.6.17 Passed on ppc64 with linux-2.6.13 Passed on x86_64 with linux-2.6.15 Passed on ia64 with linux-2.6.21.1 Passed on x86_64 with linux-2.6.18-8.el5 Passed on x86_64 with linux-2.6.22 Passed on x86_64 with linux-2.6.18-1.2798.fc6 Passed on x86_64 with linux-2.6.16.21-0.8-smp Passed on ia64 with linux-2.6.22 Passed on x86_64 with linux-2.6.21.1 Passed on x86_64 with linux-2.6.9-55.ELsmp Passed on ia64 with linux-2.6.23 Passed on x86_64 with linux-2.6.22.5-31-default Passed on ia64 with linux-2.6.16.21-0.8-default Passed on x86_64 with linux-2.6.18-53.el5 Failed: From dotanb at dev.mellanox.co.il Thu Jan 31 03:19:00 2008 From: dotanb at dev.mellanox.co.il (Dotan Barak) Date: Thu, 31 Jan 2008 13:19:00 +0200 Subject: [ofa-general] back to the max_qp_wr attribute In-Reply-To: References: Message-ID: <47A1AEA4.7020206@dev.mellanox.co.il> Or Gerlitz wrote: > Hi, > > Doing HCA query (using ibv_devinfo on a system with two HCAs, memfull > Arbel and connectx, see details below), I have noticed that the value > of the max_qp_wr attribute is different, 64K on Arbel and 16K on connectx. > > I thought that its possible that on the mlx4 case some filter function is > applied on the values returned by the FW, but I could not find an evidence > for that in the code - Roland, am I right and you return the FW values? > > Jack, if this is not the case, can you confirm that the connectx supported > value is indeed 16K, is it FW depedent? > > In mlx4 (and in other HCAs too) this value comes from the FW using the command QUERY_DEV_CAP. 
Dotan From ogerlitz at voltaire.com Thu Jan 31 04:47:21 2008 From: ogerlitz at voltaire.com (Or Gerlitz) Date: Thu, 31 Jan 2008 14:47:21 +0200 Subject: [ofa-general] back to the max_qp_wr attribute In-Reply-To: <47A1AEA4.7020206@dev.mellanox.co.il> References: <47A1AEA4.7020206@dev.mellanox.co.il> Message-ID: <47A1C359.4050906@voltaire.com> Dotan Barak wrote: > In mlx4 (and in other HCAs too) this value comes from the FW using the > command QUERY_DEV_CAP. I see. Looking on the dynamic params which can be changed in the driver load (see below the mthca ones), this param can't be changed, so where the limit lies, is it in the connectx FW or HW? Or. > linux-2.6.24-rc8 # modinfo ib_mthca | grep num > parm: fmr_reserved_mtts:number of memory translation table segments reserved for FMR (int) > parm: num_udav:maximum number of UD address vectors per HCA (int) > parm: num_mtt:maximum number of memory translation table segments per HCA (int) > parm: num_mpt:maximum number of memory protection table entries per HCA (int) > parm: num_mcg:maximum number of multicast groups per HCA (int) > parm: num_cq:maximum number of CQs per HCA (int) > parm: rdb_per_qp:number of RDB buffers per QP (int) > parm: num_qp:maximum number of QPs per HCA (int) > From koen.segers at vrt.be Thu Jan 31 04:57:42 2008 From: koen.segers at vrt.be (Koen Segers) Date: Thu, 31 Jan 2008 13:57:42 +0100 Subject: [ofa-general] Bonding and hw_csum In-Reply-To: <47A0A90B.40506@mellanox.co.il> References: <479DFA58.7050800@intec.ugent.be> <47A07CC9.8030005@voltaire.com> <47A0A90B.40506@mellanox.co.il> Message-ID: <1201784262.7095.17.camel@koenVRT> On Wed, 2008-01-30 at 18:42 +0200, Tziporet Koren wrote: > Or Gerlitz wrote: > > > > This is interesting report, however, since currently the hw checksum > > patch in not being submitted to the mainline kernel and it is also > > about to be removed from ofed 1.3 (Tziporet, can you update on that?), > > I am not going to look into that. > > > > Or. 
> > > the hw checksum patch was removed from OFED 1.3 I just saw some patches on the mailing list concerning csum offloading. Are these applied in RC3? Or are they going to be introduced in the daily build of tomorrow? Is it correct to state that these patches replace the hw_csum parameter by offloading the csum computation to the mthca? This would mean that the results should be similar also. Does the new offload patch depend on the type of hca being used? According to lspci, we have the "InfiniBand: Mellanox Technologies MT25208 InfiniHost III Ex (rev a0)" card. Do these patches work on a sles 10 sp1 installed on x3755 and x3655 machines of IBM that have this card inserted? Is bonding going to work with this type of offloading? Kind Regards Koen > > Tziporet > > _______________________________________________ > general mailing list > general at lists.openfabrics.org > http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general > > To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general *** Disclaimer *** Vlaamse Radio- en Televisieomroep Auguste Reyerslaan 52, 1043 Brussel nv van publiek recht BTW BE 0244.142.664 RPR Brussel http://www.vrt.be/disclaimer From ogerlitz at voltaire.com Thu Jan 31 05:08:11 2008 From: ogerlitz at voltaire.com (Or Gerlitz) Date: Thu, 31 Jan 2008 15:08:11 +0200 Subject: [ofa-general] Bonding and hw_csum In-Reply-To: <1201784262.7095.17.camel@koenVRT> References: <479DFA58.7050800@intec.ugent.be> <47A07CC9.8030005@voltaire.com> <47A0A90B.40506@mellanox.co.il> <1201784262.7095.17.camel@koenVRT> Message-ID: <47A1C83B.1040700@voltaire.com> Koen Segers wrote: > I just saw some patches on the mailing list concerning csum offloading. > Are these applied in RC3? Or are they going to be introduced in the > daily build of tomorrow? > Is it correct to state that these patches replace the hw_csum parameter > by offloading the csum computation to the mthca? This would mean that > the results should be similar also. 
no and no, best if you take a look on the presentation @ http://openfabrics.org/archives/nov2007sc/IPoIB-UD%20SO.pdf Basically the "checksum offloading" patches are for the datagram mode and is the standard offload as in the Ethernet world, where the "hw_csum" patch was for the connected mode. > Does the new offload patch depend on the type of hca being used? > According to lspci, we have the "InfiniBand: Mellanox Technologies > MT25208 InfiniHost III Ex (rev a0)" card. Do these patches work on a > sles 10 sp1 installed on x3755 and x3655 machines of IBM that have this > card inserted? checksum offloading is supported by the connectx and some of the other Mellanox devices, I am quite sure that 25208 is one of them, but you have to clarify this with Mellanox > Is bonding going to work with this type of offloading? sure! and if not, we will fix it. Or. From ogerlitz at voltaire.com Thu Jan 31 05:15:40 2008 From: ogerlitz at voltaire.com (Or Gerlitz) Date: Thu, 31 Jan 2008 15:15:40 +0200 Subject: [ofa-general] Re: [PATCH 0/16] ipoib offload patches In-Reply-To: <1201710638.28794.166.camel@mtls03> References: <1201710638.28794.166.camel@mtls03> Message-ID: <47A1C9FC.7000609@voltaire.com> Eli Cohen wrote: > following this email is a list of patches aimed at improving ipoib UD > mode performance by using various stateless offload facilities. > I am resending the full list again and would like to see it get into > 2.6.25. I denote them all as v4. 
OK, Roland, except for the LSO patches I am done with my review so for the below patches I say > 0001-IB-ipoib-Add-high-dma-support.patch > 0002-IB-ipoib-Add-s-g-support.patch > 0003-IB-core-Add-checksum-offload-support.patch > 0004-IB-ipoib-Add-checksum-offload-support.patch > 0005-IB-mlx4-Add-checksum-offload-support.patch > 0006-IB-mthca-Add-checksum-offload-support.patch > 0012-IB-ipoib-Add-ethtool-support-to-IPOIB.patch > 0013-IB-core-Add-support-for-modify-CQ.patch > 0014-IB-ipoib-Support-modifying-IPOIB-CQ-moderation-para.patch > 0015-IB-mlx4-Add-support-for-modifying-CQ-parameters.patch > 0016-IB-ipoib-Set-default-CQ-moderation-parameters.patch Reviewed-by: Or Gerlitz I might be able to look on the below patches later next week, however since the merge window has few more days, best if we can make progress here with or without them. > 0007-IB-core-Add-creation-flags-to-QPs.patch > 0008-IB-core-Add-support-for-LSO.patch > 0009-IB-ipoib-Add-LSO-support.patch > 0010-IB-mlx4-Add-creation-flags-to-mlx4-QPs.patch > 0011-IB-mlx4-Add-LSO-support-to-mlx4.patch and, Eli, keep with the good work! Or From tziporet at dev.mellanox.co.il Thu Jan 31 05:20:33 2008 From: tziporet at dev.mellanox.co.il (Tziporet Koren) Date: Thu, 31 Jan 2008 15:20:33 +0200 Subject: [ewg] Re: [ofa-general] OFED Jan 28 meeting summary on RC3readiness In-Reply-To: References: <6C2C79E72C305246B504CBA17B5500C903368531@mtlexch01.mtl.com><1201639285.28486.101.camel@firewall.xsintricity.com> <47A0A86A.5060003@mellanox.co.il> Message-ID: <47A1CB21.40103@mellanox.co.il> Woodruff, Robert J wrote: > > I hate to keep slipping this, but I think it is important to get > what RedHat needs into OFED 1.3, so I am not apposed to this. > > I think however that perhaps after 1.3, we should discuss our process > a bit to try to get a little better at making our original > release dates. 
I think we are getting hit with feature creep, allowing > some pretty major changes after the feature freeze date, late in the > release cycle. > I agree - we must do a better work in OFED 1.4 Main thing is that all companies will think in advance on the new features they want to drive and not come with features in the last minute. > I also think that we do need to be a little more careful > and selective about what features go into OFED, as it is suppose to be > an enterprise release rather than an experimental code release. > This is true but from first OFED version we decided that not all components must be in production level and that we allow components that are in experimental state as long as they do not harm the stability of the full package We may revisit this decision. I think we should have a session on OFED target and expectations in Sonoma > For the kernel code, I think that this means keeping things a little > closer to the kernel.org kernel features and if something is not > upstream, then > press for getting it upstream (or at least queued for upsteam) > rather than allowing big patches into OFED that have not had a good > review. > The way we are working now, if it is getting into OFED, people are less > aggressive at getting things upstream. > > Perhaps we can have a discussion about this at the Sonoma workshop. > > > I agree we should have such a discussion at Sonoma Tziporet From eli at mellanox.co.il Thu Jan 31 05:21:40 2008 From: eli at mellanox.co.il (Eli Cohen) Date: Thu, 31 Jan 2008 15:21:40 +0200 Subject: [ofa-general] Re: [PATCH 0/16] ipoib offload patches In-Reply-To: <47A1C9FC.7000609@voltaire.com> References: <1201710638.28794.166.camel@mtls03> <47A1C9FC.7000609@voltaire.com> Message-ID: <1201785700.27803.44.camel@mtls03> On Thu, 2008-01-31 at 15:15 +0200, Or Gerlitz wrote: > Eli Cohen wrote: > > following this email is a list of patches aimed at improving ipoib UD > > mode performance by using various stateless offload facilities. 
> > I am resending the full list again and would like to see it get into > > 2.6.25. I denote them all as v4. > > OK, Roland, except for the LSO patches I am done with my review so for > the below patches I say > > 0001-IB-ipoib-Add-high-dma-support.patch > > 0002-IB-ipoib-Add-s-g-support.patch > > 0003-IB-core-Add-checksum-offload-support.patch > > 0004-IB-ipoib-Add-checksum-offload-support.patch > > 0005-IB-mlx4-Add-checksum-offload-support.patch > > 0006-IB-mthca-Add-checksum-offload-support.patch > > 0012-IB-ipoib-Add-ethtool-support-to-IPOIB.patch > > 0013-IB-core-Add-support-for-modify-CQ.patch > > 0014-IB-ipoib-Support-modifying-IPOIB-CQ-moderation-para.patch > > 0015-IB-mlx4-Add-support-for-modifying-CQ-parameters.patch > > 0016-IB-ipoib-Set-default-CQ-moderation-parameters.patch > > Reviewed-by: Or Gerlitz > > I might be able to look on the below patches later next week, however > since the merge window has few more days, best if we can make progress > here with or without them. > > > 0007-IB-core-Add-creation-flags-to-QPs.patch > > 0008-IB-core-Add-support-for-LSO.patch > > 0009-IB-ipoib-Add-LSO-support.patch > > 0010-IB-mlx4-Add-creation-flags-to-mlx4-QPs.patch > > 0011-IB-mlx4-Add-LSO-support-to-mlx4.patch > > and, Eli, keep with the good work! > Or, thanks for investing the time and effort in reviewing. I would appreciate if more people could review and provide comments on the patches above. Specifically on the patches the Or did not review as I think they're important since they give performance boost for TCP. Thanks.
From mashirle at us.ibm.com Wed Jan 30 21:36:56 2008 From: mashirle at us.ibm.com (Shirley Ma) Date: Wed, 30 Jan 2008 21:36:56 -0800 Subject: [ofa-general] [PATCH] IPoIB UD 4K MTU support In-Reply-To: <1201774303.27803.26.camel@mtls03> References: <1201025321.756.33.camel@localhost.localdomain> <1201102346.6925.215.camel@mtls03> <1201068498.756.59.camel@localhost.localdomain> <4e6a6b3c0801231354v24baa6d8q5034a1cdaf3c704f@mail.gmail.com> <1201162492.9739.21.camel@localhost.localdomain> <1201181656.9739.42.camel@localhost.localdomain> <1201340657.10918.8.camel@dyn9047018117.beaverton.ibm.com> <1201774303.27803.26.camel@mtls03> Message-ID: <1201757816.19565.62.camel@localhost.localdomain> Hello Eli, > I don't think it's a good idea to make the code more generic and use the > same rx buffer scheme for both UD and CM. In the case of UD we would > need 2 scatter entries - on for the dummy GRH bytes which can be > initialized once to point at the same buffer, and the second to point to > the real data buffer. I prefer to modify the UD code to work with 4K > MTU. > The reasoning is that CM scatters can reach on overall size of 17 which > would require more memory and would consume more CPU cycles to handle > (e.g. when a packet is received). This can be crucial for the cases > where small UDP packets performance is needed. What I meant here is to have some generic APIs for buffer allocation based on PAGE_SIZE, IPoIB payload size, HEAD_SIZE. In this way, if PAGE_SIZE is greater than IPoIB payload size (IPoIB-CM payload size is different than IPoIB-CM size) plus padding (IPoIB-CM padding is different than IPoIB-UD padding) to align IP header to 16 bytes, then only one buffer is needed. Number of buffers is dependent on IPoIB payload size + padding size.
For example, let's assume the PAGE_SIZE = 4K, only one buffer is allocated for 2K MTU (IPoIB mtu is 2K-4), two buffers are allocated for 4K MTU (IPoIB mtu is 4K-4), 16 buffers are allocated for 64K MTU (IPoIB mtu is 64K-12). If PAGE_SIZE = 64K, then only one buffer is allocated for both IPoIB-UD and IPoIB-CM. Actually, I found a problem in IPoIB-CM here. Looks like most HCAs do support S/G numbers > 16. Then the other OS implementation could have IPoIB link MTU = 64K - 4 instead of current implementation 64K-16. It could be a problem for this implementation to be in a mixed OS environment. Just like my previous simple approach implementation for IPoIB-UD, which is IPoIB link MTU = 4K - 48. Roland, Do you want me to fix this? Thanks Shirley From mashirle at us.ibm.com Wed Jan 30 21:44:08 2008 From: mashirle at us.ibm.com (Shirley Ma) Date: Wed, 30 Jan 2008 21:44:08 -0800 Subject: [ofa-general] Re: [PATCH 0/3] ib/ipoib: Enable IPoIB-UD 4K MTU support In-Reply-To: <1201771703.27803.18.camel@mtls03> References: <1201718540.6850.41.camel@localhost.localdomain> <47A1798C.8050202@voltaire.com> <1201771703.27803.18.camel@mtls03> Message-ID: <1201758248.19565.69.camel@localhost.localdomain> Thanks Eli, I found it yesterday night too. I made a mistake, my local tree was not clean somehow. I have been working too much in the past few days for OFED-1.3 validation. When you are tired, it's easy to make a mistake. Sorry about that. I sent out an email for checking today's patch already. Somehow, I couldn't receive your email on time. Looks like some of my emails got a warning saying that the email needs to be approved since it matches spam contents. Do you have any idea why it's blocked? Thanks Shirley From jlentini at netapp.com Thu Jan 31 07:50:35 2008 From: jlentini at netapp.com (James Lentini) Date: Thu, 31 Jan 2008 10:50:35 -0500 (EST) Subject: [ofa-general] Re: Status of NFS-RDMA ?
In-Reply-To: References: Message-ID: Krishna, If you would like to do some testing/development on NFS/RDMA, take a look at the current NFS/RDMA code. There are instructions on were to get it and how to set it up here: http://nfs-rdma.sourceforge.net/Documents/README I'm revising the instructions for 2.6.25. I'll be posting the new version once the first 2.6.25-rc is released. We would appreciate feedback in this area as well. james On Thu, 31 Jan 2008, Krishna Kumar2 wrote: > Hi Jeff & James, > > Great. If you let me know when the bits are ready (I don't always read the > mailing list), > I should be able to get some testing done. > > Thanks, > > - KK > > Jeff Becker wrote on 01/30/2008 11:02:09 PM: > > > Hi all. > > > > James Lentini wrote: > > > On Wed, 30 Jan 2008, Krishna Kumar2 wrote: > > > > > > > > >> Hi James, > > >> > > >> Since you had mentioned in an earlier email that NFS-RDMA server > > >> side will be present in OFED1.4, > > >> > > > > > > Actually, that was Tziporet. > > > > > > > > >> do you know if any port of the server code to OFED1.3 (when it comes > > >> out) will happen? Is there any effort for that, any work ongoing, > > >> any help required, etc? > > >> > > > > > > Jeff Becker had looked into this. We would definitely appreciate the > > > help. > > > > > I have set up a git tree for NFSoRDMA and succesfully merged it with, > > and built it on OFED 1.3-rcx. I'm currently doing the backports (SLES 10 > > SP1 first). All this is in preparation for OFED 1.4, as that is when > > NFSoRDMA will be included in OFED. I think I have this > > patching/backporting stuff under control. However, my testing resources > > are limited. Thus depending on your platform, I might be able to point > > you at OFED 1.3 based bits for testing if/when they are ready. Thanks. > > > > -jeff > > > > > The NFS framework has changed significantly in several areas in recent > > > kernel releases. 
This has made backporting the NFS/RDMA code to older > > > kernels challenging. > > > > > > If you are interested in working on OFED1.3 support, let us know. > > > > > > > > >> I couldn't find the release time lines for OFED1.4, is there any > > >> link on openfabrics homepage? > > >> > > > > > > I'm not involved with the OFED1.4 planning. Tziporet, is there > > > information on this? > > > > > > > > >> Thanks, > > >> > > >> - KK > > >> > > >> general-bounces at lists.openfabrics.org wrote on 01/29/2008 08:23:46 PM: > > >> > > >> > > >>> On Tue, 29 Jan 2008, Pawel Dziekonski wrote: > > >>> > > >>> > > >>>> On Mon, 28 Jan 2008 at 10:14:22AM -0500, James Lentini wrote: > > >>>> > > >>>>> On Sat, 26 Jan 2008, Pawel Dziekonski wrote: > > >>>>> > > >>>>> > > >>>>>> I pulled Tom's tree from new url and build a kernel. > > >>>>>> > > >>>>> If you enabled support for INFINIBAND drivers (IB and iWARP > support) > > >>>>> and NFS client/server support, the kernel should be ready to go > (run > > >>>>> "grep RDMA /your_kernel_sources/.config" to confirm that > > >>>>> CONFIG_SUNRPC_XPRT_RDMA is either m or y). > > >>>>> > > >>>>> NFS/RDMA doesn't require OFED be installed. OFED is a release of > the > > >>>>> Linux kernel sources and some userspace libraries/tools. If you are > > >>>>> > > >>>>>> then I downloaded OFED from > > >>>>>> http://www.mellanox.com/downloads/NFSoRDMA/OFED-1.2-NFS-RDMA.gz, > > >>>>>> > > >>>>> I don't know what the above URL contains. The latest code is in Tom > > >>>>> Tucker's tree (and now NFS server maintainer Bruce Fields tree). It > > >>>>> > > >> is > > >> > > >>>> hi, > > >>>> > > >>>> back to subject on a proper mailing list. > > >>>> > > >>>> I have a >3 year experience with mellanox hardware and IBGold so I > > >>>> basically know what OFED is all about. up to now i was only using > > >>>> IBGold since IB drivers appeared in kernel pretty recently. > > >>>> > > >>> You'll want to use the mainline kernel's IB drivers for NFS/RDMA. 
> > >>> We've been developing the NFS/RDMA software on the OpenFabrics (aka > > >>> OpenIB) code since it was merged into 2.6.10 in Dec 2004. > > >>> > > >>> > > >>>> currently I have new hardware. I'm running Tom's kernel and already > > >>>> did some MPI tests. SDP is not working, probably because sdp kernel > > >>>> modules where not build. ;) I understand that those modules are only > > >>>> available from ofa-kernel. please correct me if i'm wrong. > > >>>> > > >>> Correct. SDP has never been submitted to mainline Linux. > > >>> > > >>> > > >>>> system is Scientic Linux 4.5, which is supposed to be a fully > > >>>> compatible RH4 clone. hardware is Supermicro mobos with Mellanox > > >>>> MT25204 and Flextronisc switch. > > >>>> > > >>>> error log from ofa-kernel build: > > >>>> > > >>> Is your goal to build a kernel with an NFS/RDMA server? If so, the > > >>> kernel sources from Tom Tucker's git tree are the ones you want, not > > >>> the old OFED 1.2-based packages which are out of date. > > >>> > > >>> Did you try setting up the NFS/RDMA server on the kernel used for > your > > >>> MPI tests above? > > >>> > > >>> > > >>>>>> make[1]: Entering directory `/usr/src/ib/xprt-switch-2.6' > > >>>>>> test -e include/linux/autoconf.h -a -e include/config/auto.conf || > > >>>>>> > > >> ( \ > > >> > > >>>>>> echo; \ > > >>>>>> echo " ERROR: Kernel configuration is invalid."; \ > > >>>>>> echo " include/linux/autoconf.h or > include/config/auto.conf > > >>>>>> > > >> are > > >> > > >>> missing."; \ > > >>> > > >>>>>> echo " Run 'make oldconfig && make prepare' on kernel src > > >>>>>> > > >> to fix it."; \ > > >> > > >>>>>> echo; \ > > >>>>>> /bin/false) > > >>>>>> > > >>>>>> obviously, doing 'make oldconfig && make prepare' does not help. 
> > >>>>>> anyway, above mentioned files do exist: > > >>>>>> > > >>>>>> # ls -la /usr/src/ib/xprt-switch-2.6/{include/linux/autoconf.h, > > >>>>>> > > >>> include/config/auto.conf} > > >>> > > >>>>>> -rw-r--r-- 1 root root 10156 Jan 25 17:42 > > >>>>>> > > >> /usr/src/ib/xprt-switch-2. > > >> > > >>> 6/include/config/auto.conf > > >>> > > >>>>>> -rw-r--r-- 1 root root 14733 Jan 25 17:42 > > >>>>>> > > >> /usr/src/ib/xprt-switch-2. > > >> > > >>> 6/include/linux/autoconf.h > > >>> > > >>>>>> despite of above, compilation continues but fails with: > > >>>>>> > > >>>>>> gcc -Wp,-MD,/var/tmp/OFEDRPM/BUILD/ofa_kernel-1. > > >>>>>> > > >>> 2/drivers/infiniband/core/.mad.o.d -nostdinc -isystem > > >>> > > >> /usr/lib/gcc/x86_64- > > >> > > >>> redhat-linux/3.4.6/include -D__KERNEL__ > > >>> > > >> -I/var/tmp/OFEDRPM/BUILD/ofa_kernel-1. > > >> > > >>> 2/include -I/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2 > > >>> > > >> /drivers/infiniband/include > > >> > > >>> -Iinclude -include include/linux/autoconf.h -include > > >>> /var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2/include/linux/autoconf.h > -Wall > > >>> > > >> -Wundef > > >> > > >>> -Wstrict-prototypes -Wno-trigraphs -fno-strict-aliasing -fno-common > > >>> > > >> -Werror- > > >> > > >>> implicit-function-declaration -Os -m64 -mno-red-zone > -mcmodel=kernel > > >>> > > >> -pipe - > > >> > > >>> Wno-sign-compare -fno-asynchronous-unwind-tables -funit-at-a-time > > >>> > > >> -mno-sse - > > >> > > >>> mno-mmx -mno-sse2 -mno-3dnow -maccumulate-outgoing-args > -DCONFIG_AS_CFI=1 > > >>> > > >> - > > >> > > >>> DCONFIG_AS_CFI_SIGNAL_FRAME=1 -fomit-frame-pointer > -Wdeclaration-after- > > >>> statement -DMODULE -D"KBUILD_STR(s)=#s" - > > >>> D"KBUILD_BASENAME=KBUILD_STR(mad)" > -D"KBUILD_MODNAME=KBUILD_STR(ib_mad)" > > >>> > > >> -c - > > >> > > >>> o /var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2/drivers/infiniband/core/.! 
> > >>> tmp > > >>> > > >>>> _mad.o /var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2 > > >>>> > > >> /drivers/infiniband/core/mad.c > > >> > > >>>>>> /var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2 > > >>>>>> > > >> /drivers/infiniband/core/mad.c: In > > >> > > >>> function `ib_mad_init_module': > > >>> > > >>>>>> /var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2 > > >>>>>> > > >> /drivers/infiniband/core/mad.c: > > >> > > >>> 2966: error: too many arguments to function `kmem_cache_create' > > >>> > > >>>>>> make[4]: *** [/var/tmp/OFEDRPM/BUILD/ofa_kernel-1. > > >>>>>> > > >>> 2/drivers/infiniband/core/mad.o] Error 1 > > >>> > > >>>>>> make[3]: *** [/var/tmp/OFEDRPM/BUILD/ofa_kernel-1. > > >>>>>> > > >>> 2/drivers/infiniband/core] Error 2 > > >>> > > >>>>>> make[2]: *** [/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2 > > >>>>>> > > >> /drivers/infiniband] Error 2 > > >> > > >>>>>> make[1]: *** [_module_/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2] Error > > >>>>>> > > >> 2 > > >> > > >>>>>> make[1]: Leaving directory `/usr/src/ib/xprt-switch-2.6' > > >>>>>> make: *** [kernel] Error 2 > > >>>>>> error: Bad exit status from /var/tmp/rpm-tmp.3877 (%install) > > >>>>>> > > >>>>>> full log: > > >>>>>> https://cefeid.wcss.wroc.pl/d/tmp/OFED.build.32122.log > > >>>>>> > > >>>> thanks in advance for any help, P > > >>>> > > >>>> > > >>>> -- > > >>>> Pawel Dziekonski > > >>>> Wroclaw Centre for Networking & Supercomputing, HPC Department > > >>>> Politechnika Wr., pl. Grunwaldzki 9, bud. D2/101, 50-377 Wroclaw, > > >>>> > > >> POLAND > > >> > > >>>> phone: +48 71 3202043, fax: +48 71 3225797, http://www.wcss.wroc.pl > > >>>> > > >>>> > > >>>> > > >> > ------------------------------------------------------------------------- > > >> > > >>>> This SF.net email is sponsored by: Microsoft > > >>>> Defy all challenges. Microsoft(R) Visual Studio 2008. 
From eli at mellanox.co.il Thu Jan 31 07:51:21 2008
From: eli at mellanox.co.il (Eli Cohen)
Date: Thu, 31 Jan 2008 17:51:21 +0200
Subject: [ofa-general] Re: [PATCH 0/3] ib/ipoib: Enable IPoIB-UD 4K MTU support
In-Reply-To: <1201758248.19565.69.camel@localhost.localdomain>
References: <1201718540.6850.41.camel@localhost.localdomain> <47A1798C.8050202@voltaire.com> <1201771703.27803.18.camel@mtls03> <1201758248.19565.69.camel@localhost.localdomain>
Message-ID: <1201794681.5131.1.camel@mtls03>

> Somehow, I couldn't receive your email on time. Looks like, some of my
> emails got warning saying that the email needs to be approved since it
> matches spam contents. Do you have any idea why it's blocked?

Maybe the list administrator can help with this issue.

From Harris.Shi at lsi.com Thu Jan 31 07:57:07 2008
From: Harris.Shi at lsi.com (Shi, Harris)
Date: Thu, 31 Jan 2008 08:57:07 -0700
Subject: [ofa-general] (no subject)
Message-ID: <18A61515E49B764AB09447A336E51F560102A941@NAMAIL2.ad.lsil.com>

Please add me to your email list.

Thanks.
Harris

From bob.kossey at hp.com Thu Jan 31 07:59:54 2008
From: bob.kossey at hp.com (Kossey, Robert)
Date: Thu, 31 Jan 2008 10:59:54 -0500
Subject: [ewg] Re: [ofa-general] OFED Jan 28 meeting summary on RC3 readiness
In-Reply-To: <47A0EDB6.5050804@mellanox.co.il>
References: <93703EDD1F125544A59BD25384B175621C86870E96@G1W0485.americas.hpqcorp.net> <47A0EDB6.5050804@mellanox.co.il>
Message-ID: <47A1F07A.9010502@hp.com>

Yes, this is better. I have no particular objections to these changes and appreciate your efforts to hold the line on the OFED 1.3 schedule.

Thanks,
Bob

Tziporet Koren wrote:
>
> The main reason is not the bugs but the features supported by IBM - CM
> support for non SRQ and 4K MTU
>
> I see that these are important for IBM (see other mails)
>
> Another thing we can do in order not to delay the release is insert the
> changes tomorrow (immediately after RC3 is out) and do RC4 next week
> (instead of 2 weeks between every RC), and RC5 the week after.
> In this way we will have enough time for testing and if we find some bug
> we can fix then in RC5
>
> Is this better?
>
> Tziporet
>

From eli at dev.mellanox.co.il Thu Jan 31 07:58:43 2008
From: eli at dev.mellanox.co.il (Eli Cohen)
Date: Thu, 31 Jan 2008 17:58:43 +0200
Subject: [ofa-general] Bonding and hw_csum
In-Reply-To: <47A1C83B.1040700@voltaire.com>
References: <479DFA58.7050800@intec.ugent.be> <47A07CC9.8030005@voltaire.com> <47A0A90B.40506@mellanox.co.il> <1201784262.7095.17.camel@koenVRT> <47A1C83B.1040700@voltaire.com>
Message-ID: <1201795123.5131.7.camel@mtls03>

On Thu, 2008-01-31 at 15:08 +0200, Or Gerlitz wrote:

> checksum offloading is supported by the connectx and some of the other
> Mellanox devices, I am quite sure that 25208 is one of them, but you
> have to clarify this with Mellanox
>

Device ID 25208, known as Tavor mode, does not support checksum offloading. A card has to have device ID 25218 to have this capability. Some of the cards can be burnt with FW that makes them 25218, and thus gives them checksum offloading.

From mashirle at us.ibm.com Wed Jan 30 22:06:54 2008
From: mashirle at us.ibm.com (Shirley Ma)
Date: Wed, 30 Jan 2008 22:06:54 -0800
Subject: [ewg] Re: [ofa-general] OFED Jan 28 meeting summary on RC3readiness
In-Reply-To: <47A1CB21.40103@mellanox.co.il>
References: <6C2C79E72C305246B504CBA17B5500C903368531@mtlexch01.mtl.com> <1201639285.28486.101.camel@firewall.xsintricity.com> <47A0A86A.5060003@mellanox.co.il> <47A1CB21.40103@mellanox.co.il>
Message-ID: <1201759614.19565.84.camel@localhost.localdomain>

Thanks to everyone here. I appreciate your comments and effort.

The big challenge for us is how to sync features/blockers between OFED releases and distro releases. Most of our customers prefer distro releases so they can get the same level of support as for other pieces. If OFED could work with the distro releases, there would be fewer problems for both end users and distros. That's just my personal opinion.

We are here to support, in a timely manner, any issues found with these patches during the OFED release cycle.

Thanks again!
Shirley

From mashirle at us.ibm.com Wed Jan 30 22:19:05 2008
From: mashirle at us.ibm.com (Shirley Ma)
Date: Wed, 30 Jan 2008 22:19:05 -0800
Subject: [ofa-general] [PATCH] IPoIB-UD S/G 4K MTU patch against OFED-1.3 RC2
In-Reply-To: <20080131111148.C1468E6109E@openfabrics.org>
References: <20080131111148.C1468E6109E@openfabrics.org>
Message-ID: <1201760345.19565.95.camel@localhost.localdomain>

Hello Vlad,

This is a patch built against OFED-1.3 RC2 on top of Pradeep's noSRQ patch. I am not sure whether it will apply cleanly to your current OFED-1.3 tree. If not, please let me know.

This patch is in our test bed. It has been tested on two nodes against: IPoIB-CM SRQ, IPoIB-UD 2K mtu, IPoIB 4K mtu, IPoIB-CM noSRQ. Cluster testing is going on.

Thanks
Shirley

This patch makes IPoIB-CM S/G RX more generic, and enables IPoIB UD S/G 4K MTU support.
--Signed-off-by: Shirley Ma diff -urpN ofa_1_3_kernel-20080114-0200_a/drivers/infiniband/ulp/ipoib/ipoib_cm.c ofa_1_3_kernel-20080114-0200_b/drivers/infiniband/ulp/ipoib/ipoib_cm.c --- ofa_1_3_kernel-20080114-0200_a/drivers/infiniband/ulp/ipoib/ipoib_cm.c 2008-01-27 14:20:17.000000000 -0600 +++ ofa_1_3_kernel-20080114-0200_b/drivers/infiniband/ulp/ipoib/ipoib_cm.c 2008-01-27 14:26:18.000000000 -0600 @@ -77,17 +77,6 @@ static struct ib_send_wr ipoib_cm_rx_dra static int ipoib_cm_tx_handler(struct ib_cm_id *cm_id, struct ib_cm_event *event); -static void ipoib_cm_dma_unmap_rx(struct ipoib_dev_priv *priv, int frags, - u64 mapping[IPOIB_CM_RX_SG]) -{ - int i; - - ib_dma_unmap_single(priv->ca, mapping[0], IPOIB_CM_HEAD_SIZE, DMA_FROM_DEVICE); - - for (i = 0; i < frags; ++i) - ib_dma_unmap_single(priv->ca, mapping[i + 1], PAGE_SIZE, DMA_FROM_DEVICE); -} - static int ipoib_cm_post_receive_srq(struct net_device *dev, int id) { struct ipoib_dev_priv *priv = netdev_priv(dev); @@ -102,8 +91,9 @@ static int ipoib_cm_post_receive_srq(str ret = ib_post_srq_recv(priv->cm.srq, &priv->cm.rx_wr, &bad_wr); if (unlikely(ret)) { ipoib_warn(priv, "post srq failed for buf %d (%d)\n", id, ret); - ipoib_cm_dma_unmap_rx(priv, priv->cm.num_frags - 1, - priv->cm.srq_ring[id].mapping); + ipoib_dma_unmap_rx(priv, IPOIB_CM_RX_SG - 1, + IPOIB_CM_HEAD_SIZE, + priv->cm.srq_ring[id].mapping); dev_kfree_skb_any(priv->cm.srq_ring[id].skb); priv->cm.srq_ring[id].skb = NULL; } @@ -126,8 +116,9 @@ static int ipoib_cm_post_receive_nonsrq( ret = ib_post_recv(rx->qp, &priv->cm.rx_wr, &bad_wr); if (unlikely(ret)) { ipoib_warn(priv, "post recv failed for buf %d (%d)\n", id, ret); - ipoib_cm_dma_unmap_rx(priv, IPOIB_CM_RX_SG - 1, - rx->rx_ring[id].mapping); + ipoib_dma_unmap_rx(priv, IPOIB_CM_RX_SG - 1, + IPOIB_CM_HEAD_SIZE, + rx->rx_ring[id].mapping); dev_kfree_skb_any(rx->rx_ring[id].skb); rx->rx_ring[id].skb = NULL; } @@ -135,69 +126,16 @@ static int ipoib_cm_post_receive_nonsrq( return ret; } -static 
struct sk_buff *ipoib_cm_alloc_rx_skb(struct net_device *dev, - struct ipoib_cm_rx_buf *rx_ring, - int id, int frags, - u64 mapping[IPOIB_CM_RX_SG]) -{ - struct ipoib_dev_priv *priv = netdev_priv(dev); - struct sk_buff *skb; - int i; - - skb = dev_alloc_skb(IPOIB_CM_HEAD_SIZE + 12); - if (unlikely(!skb)) - return NULL; - - /* - * IPoIB adds a 4 byte header. So we need 12 more bytes to align the - * IP header to a multiple of 16. - */ - skb_reserve(skb, 12); - - mapping[0] = ib_dma_map_single(priv->ca, skb->data, IPOIB_CM_HEAD_SIZE, - DMA_FROM_DEVICE); - if (unlikely(ib_dma_mapping_error(priv->ca, mapping[0]))) { - dev_kfree_skb_any(skb); - return NULL; - } - - for (i = 0; i < frags; i++) { - struct page *page = alloc_page(GFP_ATOMIC); - - if (!page) - goto partial_error; - skb_fill_page_desc(skb, i, page, 0, PAGE_SIZE); - - mapping[i + 1] = ib_dma_map_page(priv->ca, skb_shinfo(skb)->frags[i].page, - 0, PAGE_SIZE, DMA_FROM_DEVICE); - if (unlikely(ib_dma_mapping_error(priv->ca, mapping[i + 1]))) - goto partial_error; - } - - rx_ring[id].skb = skb; - return skb; - -partial_error: - - ib_dma_unmap_single(priv->ca, mapping[0], IPOIB_CM_HEAD_SIZE, DMA_FROM_DEVICE); - - for (; i > 0; --i) - ib_dma_unmap_single(priv->ca, mapping[i], PAGE_SIZE, DMA_FROM_DEVICE); - - dev_kfree_skb_any(skb); - return NULL; -} - static void ipoib_cm_free_rx_ring(struct net_device *dev, - struct ipoib_cm_rx_buf *rx_ring) + struct ipoib_rx_buf *rx_ring) { struct ipoib_dev_priv *priv = netdev_priv(dev); int i; for (i = 0; i < ipoib_recvq_size; ++i) if (rx_ring[i].skb) { - ipoib_cm_dma_unmap_rx(priv, IPOIB_CM_RX_SG - 1, - rx_ring[i].mapping); + ipoib_dma_unmap_rx(priv, IPOIB_CM_RX_SG - 1, + IPOIB_CM_HEAD_SIZE, rx_ring[i].mapping); dev_kfree_skb_any(rx_ring[i].skb); } @@ -345,9 +283,11 @@ static int ipoib_cm_nonsrq_init_rx(struc spin_unlock_irq(&priv->lock); for (i = 0; i < ipoib_recvq_size; ++i) { - if (!ipoib_cm_alloc_rx_skb(dev, rx->rx_ring, i, IPOIB_CM_RX_SG - 1, - rx->rx_ring[i].mapping)) { - 
ipoib_warn(priv, "failed to allocate receive buffer %d\n", i); + rx->rx_ring[i].skb = ipoib_alloc_rx_skb(dev, i, IPOIB_CM_RX_SG - 1, + IPOIB_CM_HEAD_SIZE, 12, + rx->rx_ring[i].mapping); + if (!rx->rx_ring[i].skb) { + ipoib_warn(priv, "failed to allocate receive buffer %d\n", i); ret = -ENOMEM; goto err_count; } @@ -480,43 +420,11 @@ static int ipoib_cm_rx_handler(struct ib return 0; } } -/* Adjust length of skb with fragments to match received data */ -static void skb_put_frags(struct sk_buff *skb, unsigned int hdr_space, - unsigned int length, struct sk_buff *toskb) -{ - int i, num_frags; - unsigned int size; - - /* put header into skb */ - size = min(length, hdr_space); - skb->tail += size; - skb->len += size; - length -= size; - - num_frags = skb_shinfo(skb)->nr_frags; - for (i = 0; i < num_frags; i++) { - skb_frag_t *frag = &skb_shinfo(skb)->frags[i]; - - if (length == 0) { - /* don't need this page */ - skb_fill_page_desc(toskb, i, frag->page, 0, PAGE_SIZE); - --skb_shinfo(skb)->nr_frags; - } else { - size = min(length, (unsigned) PAGE_SIZE); - - frag->size = size; - skb->data_len += size; - skb->truesize += size; - skb->len += size; - length -= size; - } - } -} void ipoib_cm_handle_rx_wc(struct net_device *dev, struct ib_wc *wc) { struct ipoib_dev_priv *priv = netdev_priv(dev); - struct ipoib_cm_rx_buf *rx_ring; + struct ipoib_rx_buf *rx_ring; unsigned int wr_id = wc->wr_id & ~(IPOIB_OP_CM | IPOIB_OP_RECV); struct sk_buff *skb, *newskb; struct ipoib_cm_rx *p; @@ -582,7 +490,7 @@ void ipoib_cm_handle_rx_wc(struct net_de frags = PAGE_ALIGN(wc->byte_len - min(wc->byte_len, (unsigned)IPOIB_CM_HEAD_SIZE)) / PAGE_SIZE; - newskb = ipoib_cm_alloc_rx_skb(dev, rx_ring, wr_id, frags, mapping); + newskb = ipoib_alloc_rx_skb(dev, wr_id, frags, IPOIB_CM_HEAD_SIZE, 12, mapping); if (unlikely(!newskb)) { /* * If we can't allocate a new RX buffer, dump @@ -592,8 +500,9 @@ void ipoib_cm_handle_rx_wc(struct net_de ++dev->stats.rx_dropped; goto repost; } + rx_ring[wr_id].skb = 
newskb; - ipoib_cm_dma_unmap_rx(priv, frags, rx_ring[wr_id].mapping); + ipoib_dma_unmap_rx(priv, frags, IPOIB_CM_HEAD_SIZE, rx_ring[wr_id].mapping); memcpy(rx_ring[wr_id].mapping, mapping, (frags + 1) * sizeof *mapping); ipoib_dbg_data(priv, "received %d bytes, SLID 0x%04x\n", @@ -1508,9 +1417,10 @@ int ipoib_cm_dev_init(struct net_device if (ipoib_cm_has_srq(dev)) { for (i = 0; i < ipoib_recvq_size; ++i) { - if (!ipoib_cm_alloc_rx_skb(dev, priv->cm.srq_ring, i, - priv->cm.num_frags - 1, - priv->cm.srq_ring[i].mapping)) { + priv->cm.srq_ring[i].skb = ipoib_alloc_rx_skb(dev, i, priv->cm.num_frags - 1, + IPOIB_CM_HEAD_SIZE, 12, + priv->cm.srq_ring[i].mapping); + if (!priv->cm.srq_ring[i].skb) { ipoib_warn(priv, "failed to allocate receive buffer %d\n", i); ipoib_cm_dev_cleanup(dev); return -ENOMEM; diff -urpN ofa_1_3_kernel-20080114-0200_a/drivers/infiniband/ulp/ipoib/ipoib.h ofa_1_3_kernel-20080114-0200_b/drivers/infiniband/ulp/ipoib/ipoib.h --- ofa_1_3_kernel-20080114-0200_a/drivers/infiniband/ulp/ipoib/ipoib.h 2008-01-27 14:20:17.000000000 -0600 +++ ofa_1_3_kernel-20080114-0200_b/drivers/infiniband/ulp/ipoib/ipoib.h 2008-01-27 14:23:34.000000000 -0600 @@ -56,10 +56,9 @@ /* constants */ enum { - IPOIB_PACKET_SIZE = 2048, - IPOIB_BUF_SIZE = IPOIB_PACKET_SIZE + IB_GRH_BYTES, - IPOIB_ENCAP_LEN = 4, + IPOIB_MAX_IB_MTU = 4096, /* max ib device payload is 4096 */ + IPOIB_UD_MAX_RX_SG = ALIGN(IPOIB_MAX_IB_MTU + IB_GRH_BYTES + 4, PAGE_SIZE) / PAGE_SIZE, /* padding to align IP header */ IPOIB_CM_MTU = 0x10000 - 0x10, /* padding to align header to 16 */ IPOIB_CM_BUF_SIZE = IPOIB_CM_MTU + IPOIB_ENCAP_LEN, @@ -142,7 +141,7 @@ struct ipoib_mcast { struct ipoib_rx_buf { struct sk_buff *skb; - u64 mapping; + u64 mapping[IPOIB_CM_RX_SG]; }; struct ipoib_tx_buf { @@ -261,7 +260,7 @@ enum ipoib_cm_state { struct ipoib_cm_rx { struct ib_cm_id *id; struct ib_qp *qp; - struct ipoib_cm_rx_buf *rx_ring; + struct ipoib_rx_buf *rx_ring; struct list_head list; struct net_device *dev; 
unsigned long jiffies; @@ -285,14 +284,9 @@ struct ipoib_cm_tx { struct ib_wc ibwc[IPOIB_NUM_WC]; }; -struct ipoib_cm_rx_buf { - struct sk_buff *skb; - u64 mapping[IPOIB_CM_RX_SG]; -}; - struct ipoib_cm_dev_priv { struct ib_srq *srq; - struct ipoib_cm_rx_buf *srq_ring; + struct ipoib_rx_buf *srq_ring; struct ib_cm_id *id; struct list_head passive_ids; /* state: LIVE */ struct list_head rx_error_list; /* state: ERROR */ @@ -398,6 +392,9 @@ struct ipoib_dev_priv { struct dentry *path_dentry; #endif struct ipoib_ethtool_st etool; + unsigned int max_ib_mtu; + struct ib_sge rx_sge[IPOIB_UD_MAX_RX_SG]; + struct ib_recv_wr rx_wr; }; struct ipoib_ah { @@ -493,6 +490,14 @@ int ipoib_ib_dev_stop(struct net_device int ipoib_dev_init(struct net_device *dev, struct ib_device *ca, int port); void ipoib_dev_cleanup(struct net_device *dev); +void ipoib_dma_unmap_rx(struct ipoib_dev_priv *priv, int frags, int head_size, + u64 *mapping); +void skb_put_frags(struct sk_buff *skb, unsigned int hdr_space, + unsigned int length, struct sk_buff *toskb); +struct sk_buff *ipoib_alloc_rx_skb(struct net_device *dev, + int id, int frags, int head_size, + int pad, u64 *mapping); + void ipoib_mcast_join_task(struct work_struct *work); void ipoib_mcast_send(struct net_device *dev, void *mgid, struct sk_buff *skb); @@ -542,6 +547,11 @@ void ipoib_drain_cq(struct net_device *d void ipoib_set_ethtool_ops(struct net_device *dev); +#define IPOIB_UD_MTU(ib_mtu) (ib_mtu - IPOIB_ENCAP_LEN) +#define IPOIB_UD_BUF_SIZE(ib_mtu) (ib_mtu + IB_GRH_BYTES + 4) /* padding to align IP header */ +#define IPOIB_UD_HEAD_SIZE(ib_mtu) (IPOIB_UD_BUF_SIZE(ib_mtu)) % PAGE_SIZE +#define IPOIB_UD_RX_SG(ib_mtu) ALIGN(IPOIB_UD_BUF_SIZE(ib_mtu), PAGE_SIZE) / PAGE_SIZE + #ifdef CONFIG_INFINIBAND_IPOIB_CM #define IPOIB_FLAGS_RC 0x80 diff -urpN ofa_1_3_kernel-20080114-0200_a/drivers/infiniband/ulp/ipoib/ipoib_ib.c ofa_1_3_kernel-20080114-0200_b/drivers/infiniband/ulp/ipoib/ipoib_ib.c --- 
ofa_1_3_kernel-20080114-0200_a/drivers/infiniband/ulp/ipoib/ipoib_ib.c 2008-01-27 14:20:17.000000000 -0600 +++ ofa_1_3_kernel-20080114-0200_b/drivers/infiniband/ulp/ipoib/ipoib_ib.c 2008-01-27 14:26:05.000000000 -0600 @@ -89,63 +89,118 @@ void ipoib_free_ah(struct kref *kref) spin_unlock_irqrestore(&priv->lock, flags); } -static int ipoib_ib_post_receive(struct net_device *dev, int id) +/* Adjust length of skb with fragments to match received data */ +void skb_put_frags(struct sk_buff *skb, unsigned int hdr_space, + unsigned int length, struct sk_buff *toskb) { - struct ipoib_dev_priv *priv = netdev_priv(dev); - struct ib_sge list; - struct ib_recv_wr param; - struct ib_recv_wr *bad_wr; - int ret; + int i, num_frags; + unsigned int size; - list.addr = priv->rx_ring[id].mapping; - list.length = IPOIB_BUF_SIZE; - list.lkey = priv->mr->lkey; - - param.next = NULL; - param.wr_id = id | IPOIB_OP_RECV; - param.sg_list = &list; - param.num_sge = 1; + /* put header into skb */ + size = min(length, hdr_space); + skb->tail += size; + skb->len += size; + length -= size; - ret = ib_post_recv(priv->qp, &param, &bad_wr); - if (unlikely(ret)) { - ipoib_warn(priv, "receive failed for buf %d (%d)\n", id, ret); - ib_dma_unmap_single(priv->ca, priv->rx_ring[id].mapping, - IPOIB_BUF_SIZE, DMA_FROM_DEVICE); - dev_kfree_skb_any(priv->rx_ring[id].skb); - priv->rx_ring[id].skb = NULL; + num_frags = skb_shinfo(skb)->nr_frags; + for (i = 0; i < num_frags; i++) { + skb_frag_t *frag = &skb_shinfo(skb)->frags[i]; + + if (length == 0) { + /* don't need this page */ + skb_fill_page_desc(toskb, i, frag->page, 0, PAGE_SIZE); + --skb_shinfo(skb)->nr_frags; + } else { + size = min(length, (unsigned) PAGE_SIZE); + + frag->size = size; + skb->data_len += size; + skb->truesize += size; + skb->len += size; + length -= size; + } } +} - return ret; +void ipoib_dma_unmap_rx(struct ipoib_dev_priv *priv, int frags, int head_size, u64 *mapping) +{ + int i; + ib_dma_unmap_single(priv->ca, mapping[0], head_size, +
DMA_FROM_DEVICE); + for (i = 0; i < frags; i++) + ib_dma_unmap_single(priv->ca, mapping[i+1], PAGE_SIZE, + DMA_FROM_DEVICE); } -static int ipoib_alloc_rx_skb(struct net_device *dev, int id) +struct sk_buff *ipoib_alloc_rx_skb(struct net_device *dev, int id, int frags, + int head_size, int pad, u64 *mapping) { struct ipoib_dev_priv *priv = netdev_priv(dev); struct sk_buff *skb; - u64 addr; + int i; - skb = dev_alloc_skb(IPOIB_BUF_SIZE + 4); - if (!skb) - return -ENOMEM; + skb = dev_alloc_skb(head_size + pad); + if (unlikely(!skb)) + return NULL; /* - * IB will leave a 40 byte gap for a GRH and IPoIB adds a 4 byte - * header. So we need 4 more bytes to get to 48 and align the + * IPoIB adds a 4 byte header. So we need 12 more bytes to align the * IP header to a multiple of 16. */ - skb_reserve(skb, 4); + skb_reserve(skb, pad); - addr = ib_dma_map_single(priv->ca, skb->data, IPOIB_BUF_SIZE, - DMA_FROM_DEVICE); - if (unlikely(ib_dma_mapping_error(priv->ca, addr))) { + mapping[0] = ib_dma_map_single(priv->ca, skb->data, head_size, + DMA_FROM_DEVICE); + if (unlikely(ib_dma_mapping_error(priv->ca, mapping[0]))) { dev_kfree_skb_any(skb); - return -EIO; + return NULL; } - priv->rx_ring[id].skb = skb; - priv->rx_ring[id].mapping = addr; + for (i = 0; i < frags; i++) { + struct page *page = alloc_page(GFP_ATOMIC); - return 0; + if (!page) + goto partial_error; + skb_fill_page_desc(skb, i, page, 0, PAGE_SIZE); + + mapping[i + 1] = ib_dma_map_page(priv->ca, skb_shinfo(skb)->frags[i].page, + 0, PAGE_SIZE, DMA_FROM_DEVICE); + if (unlikely(ib_dma_mapping_error(priv->ca, mapping[i + 1]))) + goto partial_error; + } + + return skb; + +partial_error: + + ib_dma_unmap_single(priv->ca, mapping[0], head_size, DMA_FROM_DEVICE); + + for (; i > 0; --i) + ib_dma_unmap_single(priv->ca, mapping[i], PAGE_SIZE, DMA_FROM_DEVICE); + + dev_kfree_skb_any(skb); + return NULL; +} + +static int ipoib_ib_post_receive(struct net_device *dev, int id) +{ + struct ipoib_dev_priv *priv = netdev_priv(dev); + 
struct ib_recv_wr *bad_wr; + int ret, i; + + priv->rx_wr.wr_id = id | IPOIB_OP_RECV; + for (i = 0; i < IPOIB_UD_RX_SG(priv->max_ib_mtu); ++i) + priv->rx_sge[i].addr = priv->rx_ring[id].mapping[i]; + ret = ib_post_recv(priv->qp, &priv->rx_wr, &bad_wr); + if (unlikely(ret)) { + ipoib_warn(priv, "receive failed for buf %d (%d)\n", id, ret); + ipoib_dma_unmap_rx(priv, IPOIB_UD_RX_SG(priv->max_ib_mtu) - 1, + IPOIB_UD_HEAD_SIZE(priv->max_ib_mtu), priv->rx_ring[id].mapping); + dev_kfree_skb_any(priv->rx_ring[id].skb); + priv->rx_ring[id].skb = NULL; + } + + return ret; } static int ipoib_ib_post_receives(struct net_device *dev) @@ -153,13 +208,24 @@ static int ipoib_ib_post_receives(struct struct ipoib_dev_priv *priv = netdev_priv(dev); int i; + for (i = 0; i < IPOIB_UD_RX_SG(priv->max_ib_mtu); ++i) + priv->rx_sge[i].lkey = priv->mr->lkey; + priv->rx_sge[0].length = IPOIB_UD_HEAD_SIZE(priv->max_ib_mtu); + for (i = 0; i < IPOIB_UD_RX_SG(priv->max_ib_mtu) - 1; ++i) + priv->rx_sge[i+1].length = PAGE_SIZE; + priv->rx_wr.num_sge = IPOIB_UD_RX_SG(priv->max_ib_mtu); + priv->rx_wr.next = NULL; + priv->rx_wr.sg_list = priv->rx_sge; + for (i = 0; i < ipoib_recvq_size; ++i) { - if (ipoib_alloc_rx_skb(dev, i)) { - ipoib_warn(priv, "failed to allocate receive buffer %d\n", i); + priv->rx_ring[i].skb = ipoib_alloc_rx_skb(dev, i, IPOIB_UD_RX_SG(priv->max_ib_mtu) - 1, + IPOIB_UD_HEAD_SIZE(priv->max_ib_mtu), 4, + priv->rx_ring[i].mapping); + if (!priv->rx_ring[i].skb) return -ENOMEM; - } if (ipoib_ib_post_receive(dev, i)) { ipoib_warn(priv, "ipoib_ib_post_receive failed for buf %d\n", i); + ipoib_dev_cleanup(dev); return -EIO; } } @@ -171,9 +237,10 @@ static void ipoib_ib_handle_rx_wc(struct { struct ipoib_dev_priv *priv = netdev_priv(dev); unsigned int wr_id = wc->wr_id & ~IPOIB_OP_RECV; - struct sk_buff *skb; + struct sk_buff *skb, *newskb; + u64 mapping[IPOIB_UD_RX_SG(priv->max_ib_mtu)]; struct ipoib_header *header; - u64 addr; + int frags; ipoib_dbg_data(priv, "recv completion: id %d, 
status: %d\n", wr_id, wc->status); @@ -185,15 +252,15 @@ static void ipoib_ib_handle_rx_wc(struct } skb = priv->rx_ring[wr_id].skb; - addr = priv->rx_ring[wr_id].mapping; if (unlikely(wc->status != IB_WC_SUCCESS)) { if (wc->status != IB_WC_WR_FLUSH_ERR) ipoib_warn(priv, "failed recv event " "(status=%d, wrid=%d vend_err %x)\n", wc->status, wr_id, wc->vendor_err); - ib_dma_unmap_single(priv->ca, addr, - IPOIB_BUF_SIZE, DMA_FROM_DEVICE); + ipoib_dma_unmap_rx(priv, IPOIB_UD_RX_SG(priv->max_ib_mtu) - 1, + IPOIB_UD_HEAD_SIZE(priv->max_ib_mtu), + priv->rx_ring[wr_id].mapping); dev_kfree_skb_any(skb); priv->rx_ring[wr_id].skb = NULL; return; @@ -206,21 +273,28 @@ static void ipoib_ib_handle_rx_wc(struct if (wc->slid == priv->local_lid && wc->src_qp == priv->qp->qp_num) goto repost; + frags = PAGE_ALIGN(wc->byte_len - min(wc->byte_len, + (unsigned)IPOIB_UD_HEAD_SIZE(priv->max_ib_mtu))) / PAGE_SIZE; + newskb = ipoib_alloc_rx_skb(dev, wr_id, frags, + IPOIB_UD_HEAD_SIZE(priv->max_ib_mtu), 4, mapping); /* * If we can't allocate a new RX buffer, dump * this packet and reuse the old buffer. 
*/ - if (unlikely(ipoib_alloc_rx_skb(dev, wr_id))) { - ++dev->stats.rx_dropped; - goto repost; - } + if (unlikely(!newskb)) { + ipoib_dbg(priv, "failed to allocate receive buffer %d\n", wr_id); + ++dev->stats.rx_dropped; + goto repost; + } + priv->rx_ring[wr_id].skb = newskb; ipoib_dbg_data(priv, "received %d bytes, SLID 0x%04x\n", wc->byte_len, wc->slid); - ib_dma_unmap_single(priv->ca, addr, IPOIB_BUF_SIZE, DMA_FROM_DEVICE); + ipoib_dma_unmap_rx(priv, frags, IPOIB_UD_HEAD_SIZE(priv->max_ib_mtu), priv->rx_ring[wr_id].mapping); + memcpy(priv->rx_ring[wr_id].mapping, mapping, (frags + 1) * sizeof *mapping); - skb_put(skb, wc->byte_len); + skb_put_frags(skb, IPOIB_UD_HEAD_SIZE(priv->max_ib_mtu), wc->byte_len, newskb); skb_pull(skb, IB_GRH_BYTES); header = (struct ipoib_header *)skb->data; @@ -687,10 +761,10 @@ int ipoib_ib_dev_stop(struct net_device rx_req = &priv->rx_ring[i]; if (!rx_req->skb) continue; - ib_dma_unmap_single(priv->ca, - rx_req->mapping, - IPOIB_BUF_SIZE, - DMA_FROM_DEVICE); + ipoib_dma_unmap_rx(priv, + IPOIB_UD_RX_SG(priv->max_ib_mtu) - 1, + IPOIB_UD_HEAD_SIZE(priv->max_ib_mtu), + priv->rx_ring[i].mapping); dev_kfree_skb_any(rx_req->skb); rx_req->skb = NULL; } diff -urpN ofa_1_3_kernel-20080114-0200_a/drivers/infiniband/ulp/ipoib/ipoib_main.c ofa_1_3_kernel-20080114-0200_b/drivers/infiniband/ulp/ipoib/ipoib_main.c --- ofa_1_3_kernel-20080114-0200_a/drivers/infiniband/ulp/ipoib/ipoib_main.c 2008-01-27 14:20:17.000000000 -0600 +++ ofa_1_3_kernel-20080114-0200_b/drivers/infiniband/ulp/ipoib/ipoib_main.c 2008-01-27 14:23:34.000000000 -0600 @@ -196,7 +196,7 @@ static int ipoib_change_mtu(struct net_d return 0; } - if (new_mtu > IPOIB_PACKET_SIZE - IPOIB_ENCAP_LEN) + if (new_mtu > IPOIB_UD_MTU(priv->max_ib_mtu)) return -EINVAL; priv->admin_mtu = new_mtu; @@ -1024,10 +1024,6 @@ static void ipoib_setup(struct net_devic set_bit(IPOIB_FLAG_HW_CSUM, &priv->flags); } - /* MTU will be reset when mcast join happens */ - dev->mtu = IPOIB_PACKET_SIZE - 
IPOIB_ENCAP_LEN; - priv->mcast_mtu = priv->admin_mtu = dev->mtu; - memcpy(dev->broadcast, ipv4_bcast_addr, INFINIBAND_ALEN); netif_carrier_off(dev); @@ -1182,6 +1178,7 @@ static struct net_device *ipoib_add_port struct ib_device *hca, u8 port) { struct ipoib_dev_priv *priv; + struct ib_port_attr attr; int result = -ENOMEM; priv = ipoib_intf_alloc(format); @@ -1192,6 +1189,18 @@ static struct net_device *ipoib_add_port priv->dev->features |= NETIF_F_HIGHDMA; + if (!ib_query_port(hca, port, &attr)) + priv->max_ib_mtu = ib_mtu_enum_to_int(attr.max_mtu); + else { + printk(KERN_WARNING "%s: ib_query_port %d failed\n", + hca->name, port); + goto device_init_failed; + } + + /* MTU will be reset when mcast join happens */ + priv->dev->mtu = IPOIB_UD_MTU(priv->max_ib_mtu); + priv->mcast_mtu = priv->admin_mtu = priv->dev->mtu; + result = ib_query_pkey(hca, port, 0, &priv->pkey); if (result) { printk(KERN_WARNING "%s: ib_query_pkey port %d failed (ret = %d)\n", diff -urpN ofa_1_3_kernel-20080114-0200_a/drivers/infiniband/ulp/ipoib/ipoib_multicast.c ofa_1_3_kernel-20080114-0200_b/drivers/infiniband/ulp/ipoib/ipoib_multicast.c --- ofa_1_3_kernel-20080114-0200_a/drivers/infiniband/ulp/ipoib/ipoib_multicast.c 2008-01-14 04:00:04.000000000 -0600 +++ ofa_1_3_kernel-20080114-0200_b/drivers/infiniband/ulp/ipoib/ipoib_multicast.c 2008-01-27 14:23:34.000000000 -0600 @@ -567,9 +567,7 @@ void ipoib_mcast_join_task(struct work_s return; } - priv->mcast_mtu = ib_mtu_enum_to_int(priv->broadcast->mcmember.mtu) - - IPOIB_ENCAP_LEN; - + priv->mcast_mtu = IPOIB_UD_MTU(ib_mtu_enum_to_int(priv->broadcast->mcmember.mtu)); if (!ipoib_cm_admin_enabled(dev)) dev->mtu = min(priv->mcast_mtu, priv->admin_mtu); diff -urpN ofa_1_3_kernel-20080114-0200_a/drivers/infiniband/ulp/ipoib/ipoib_verbs.c ofa_1_3_kernel-20080114-0200_b/drivers/infiniband/ulp/ipoib/ipoib_verbs.c --- ofa_1_3_kernel-20080114-0200_a/drivers/infiniband/ulp/ipoib/ipoib_verbs.c 2008-01-27 14:20:17.000000000 -0600 +++ 
ofa_1_3_kernel-20080114-0200_b/drivers/infiniband/ulp/ipoib/ipoib_verbs.c 2008-01-27 14:23:34.000000000 -0600 @@ -150,7 +150,7 @@ int ipoib_transport_dev_init(struct net_ .max_send_wr = ipoib_sendq_size, .max_recv_wr = ipoib_recvq_size, .max_send_sge = dev->features & NETIF_F_SG ? MAX_SKB_FRAGS + 1 : 1, - .max_recv_sge = 1 + .max_recv_sge = IPOIB_UD_RX_SG(priv->max_ib_mtu) }, .sq_sig_type = IB_SIGNAL_ALL_WR, .qp_type = IB_QPT_UD, @@ -212,6 +212,16 @@ int ipoib_transport_dev_init(struct net_ priv->tx_wr.sg_list = priv->tx_sge; priv->tx_wr.send_flags = IB_SEND_SIGNALED; + priv->rx_sge[0].length = IPOIB_UD_HEAD_SIZE(priv->max_ib_mtu); + for (i = 0; i < IPOIB_UD_RX_SG(priv->max_ib_mtu) - 1; ++i) { + priv->rx_sge[i].lkey = priv->mr->lkey; + priv->rx_sge[i + 1].length = PAGE_SIZE; + } + priv->rx_sge[i + 1].lkey = priv->mr->lkey; + priv->rx_wr.num_sge = IPOIB_UD_RX_SG(priv->max_ib_mtu); + priv->rx_wr.next = NULL; + priv->rx_wr.sg_list = priv->rx_sge; + return 0; out_free_cq: From tziporet at dev.mellanox.co.il Thu Jan 31 08:40:34 2008 From: tziporet at dev.mellanox.co.il (Tziporet Koren) Date: Thu, 31 Jan 2008 18:40:34 +0200 Subject: [ofa-general] OFED 1.3 new schedule Message-ID: <47A1FA02.8010209@mellanox.co.il> Hi All, Due to feedback we got regarding Redhat inclusion of OFED 1.3 in RHEL 5.2 and 4.7 we are changing the release schedule (See the thread on OFED Jan 28 meeting summary on RC3 readiness) This is the new schedule: * RC3 - done (30-Jan) * RC4 - Feb 6 * RC5 - Feb 18 <== Gold * GA - Feb 25 RC4 should include: 1. Critical and major bug fixes 2. IPoIB: Non-SRQ for CM mode 3. IPOIB: 4K MTU - assuming this will converge in few days To make sure we can do it without more slips the following procedure should be used: 1. Every patch that is sent to OFED will include not only the kernel code against 2.6.24 but also the backport patches 2. Send me the patch for approval including explanation what is the reason for this patch 3. 
Do not mix code cleanup into a patch that fixes a critical bug

Thanks,
Tziporet

From sashak at voltaire.com Thu Jan 31 09:23:34 2008
From: sashak at voltaire.com (Sasha Khapyorsky)
Date: Thu, 31 Jan 2008 17:23:34 +0000
Subject: [ofa-general] Re: [PATCH] opensm: diags better error checking for DR option
In-Reply-To: <47A0F7EC.3040004@llnl.gov>
References: <47A0F7EC.3040004@llnl.gov>
Message-ID: <20080131172334.GC29624@sashak.voltaire.com>

Hi Tim,

On 14:19 Wed 30 Jan , Timothy A. Meier wrote:
> Sasha, this patch contains a bug fix for the -D option for the two perl
> utils
> iblinkinfo.pl and ibqueryerrors.pl.
>
> Since I touched these files, I also ran them through 'perltidy' with your
> suggestion options.

It would be much better not to mix formatting with other changes. I'm splitting this patch into two separate ones.

Applied. Thanks.

Sasha

From sashak at voltaire.com Thu Jan 31 09:25:00 2008
From: sashak at voltaire.com (Sasha Khapyorsky)
Date: Thu, 31 Jan 2008 17:25:00 +0000
Subject: [ofa-general] Re: [PATCH] opensm: diags bug fix in iblinkinfo.pl
In-Reply-To: <47A110DE.6090103@llnl.gov>
References: <47A110DE.6090103@llnl.gov>
Message-ID: <20080131172500.GD29624@sashak.voltaire.com>

On 16:05 Wed 30 Jan , Timothy A. Meier wrote:
> Sasha,
>
> Sorry, I missed this in my previous patch.
>
> From b9bd2d2e5be0121c148fe7087ca7e6cce357a55e Mon Sep 17 00:00:00 2001
> From: Tim Meier
> Date: Wed, 30 Jan 2008 16:00:50 -0800
> Subject: [PATCH] opensm: diags bug fix in iblinkinfo.pl
>
> Fixes -D bug determining guid from dr
>
> Signed-off-by: Tim Meier

Applied. Thanks.
Sasha From parks at lanl.gov Thu Jan 31 09:36:51 2008 From: parks at lanl.gov (Parks Fields) Date: Thu, 31 Jan 2008 10:36:51 -0700 Subject: [ewg] Re: [ofa-general] OFED Jan 28 meeting summary on RC3 readiness In-Reply-To: References: <93703EDD1F125544A59BD25384B175621C86870E96@G1W0485.americas.hpqcorp.net> Message-ID: <7.0.1.0.2.20080131103316.0291de00@lanl.gov> At 09:16 PM 1/30/2008, Chet Mehta wrote: >Robert, > >In response to your question...... > > A more general question I would like to ask the group is how many > people use OFED from the RH or SUSE distros as > > is, as compared with using OFED releases from other sources like > the IB vendors, or building their own from > > openfabrics.org? We use RH distros, but to this point, the OFED > support provided in RH distros has lagged > > behind the latest releases available from openfabrics.org. This > is not to fault Red Hat, but OFED is still > > changing too rapidly, with minor point releases and bug fixes, > for a distro to keep up. I think many of us hope > > that someday that will not be the case, but appears to be true > for the foreseeable future. Right now, our mode > > of operation is to remove whatever IB support comes in the distro > and replace it, so it does not help us to > > delay OFED 1.3 to get a particular bug fix in a distro. We have found that there are some vendors who dictate that they will only support a Distro EX: RHXXX. Then if you layer the latest OFED on top, then the support is nullified. Or to get any bug fixes or support you have to uninstall the what you did and repeat the bug/error. SO I think it is very important to keep the Distros very close the latest OFED stack. >I believe the question that should be asked is "'How many IB >customers would like to use the OFED distribution if provided by the >distro?" The answer at least for the customer set we deal with is >pretty much unanimous. 
The fact is that customers are already >dealing with a distro for their base OS so obtaining the >interconnect code & support from the same sources is highly >desirable. When OpenIB was new (in 2004/5) and "common" IB code was >still in its infancy, the customer set was tolerant of 'build your >own' or vendor provided distribution mechanism. However if IB is to >become a ubiquitous interconnect, we in OFA have to strive to tailor >our deliverables to meet distro requirements. Until we do that, IB >will have difficultly gaining broader market acceptance. > >Just my perspective. > >:Chet. >_______________________________________________ >general mailing list >general at lists.openfabrics.org >http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general > >To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general ***** Correspondence ***** This email contains no programmatic content that requires independent ADC review -------------- next part -------------- An HTML attachment was scrubbed... URL: From mashirle at us.ibm.com Wed Jan 30 23:37:17 2008 From: mashirle at us.ibm.com (Shirley Ma) Date: Wed, 30 Jan 2008 23:37:17 -0800 Subject: [ofa-general] [PATCH] IPoIB-UD S/G 4K MTU patch against OFED-1.3 RC2 In-Reply-To: <1201760345.19565.95.camel@localhost.localdomain> References: <20080131111148.C1468E6109E@openfabrics.org> <1201760345.19565.95.camel@localhost.localdomain> Message-ID: <1201765038.19565.99.camel@localhost.localdomain> Hello Vlad, There is one line backport patch needed for this patch: priv->stats vs. dev->stats. If needed, let me know. 
Thanks Shirley From changquing.tang at hp.com Thu Jan 31 09:36:53 2008 From: changquing.tang at hp.com (Tang, Changqing) Date: Thu, 31 Jan 2008 17:36:53 +0000 Subject: [ofa-general] Re: Dapl 2 question/issue In-Reply-To: <479FA93C.6050305@ichips.intel.com> References: <1201628522.28486.36.camel@firewall.xsintricity.com> <479F73F9.9050704@ichips.intel.com> <1201634763.28486.63.camel@firewall.xsintricity.com> <1201636696.28486.75.camel@firewall.xsintricity.com> <479F9EA5.4020407@ichips.intel.com> <1201644204.28486.116.camel@firewall.xsintricity.com> <479FA93C.6050305@ichips.intel.com> Message-ID: I look at the v2 header file udat.h, it still has #include it should be #include If I install v2 header at include/dat2 right? --CQ > -----Original Message----- > From: Arlin Davis [mailto:ardavis at ichips.intel.com] > Sent: Tuesday, January 29, 2008 4:31 PM > To: Tang, Changqing > Cc: Doug Ledford; general > Subject: Re: [ofa-general] Re: Dapl 2 question/issue > > Tang, Changqing wrote: > > devel environments are just runtimes plus header files, so > the libraries won't overwrite each other, but header files will ? > > v2 header files are include/dat2, v1 headers are include/dat > > From sean.hefty at intel.com Thu Jan 31 10:07:42 2008 From: sean.hefty at intel.com (Sean Hefty) Date: Thu, 31 Jan 2008 10:07:42 -0800 Subject: [ewg] Re: [ofa-general] OFED Jan 28 meeting summary onRC3readiness In-Reply-To: <1201733706.28486.218.camel@firewall.xsintricity.com> References: <93703EDD1F125544A59BD25384B175621C86870E96@G1W0485.americas.hpqcorp.net> <47A0EDB6.5050804@mellanox.co.il> <000601c8638c$03fc8070$0dfd070a@amr.corp.intel.com> <1201733706.28486.218.camel@firewall.xsintricity.com> Message-ID: <000201c86434$2cbc6400$a937170a@amr.corp.intel.com> >In all fairness, the kernel portion of all of this, and the process of >getting things into Linus' kernel, has *always* been a case of staging >things in Roland's tree and then merging upstream. 
So, at least for the >kernel, that's mostly true as OFED is pretty close to Roland's tree >generally speaking. As for the user space packages though, you guys >*are* the upstream. There's no one to merge upstream to and very little >oversight by anyone. So, it's entirely up to all of you just how much >your package seems to be a feature of the day change-athon versus a >solid, stable program. I don't believe that this is the model actually in use. OFED has accepted kernel features that have not been submitted for upstream inclusion, or, in some cases, that were, but were rejected. (For examples, see local SA, SA event subscription, XRC, SDP, and some of the previous incarnations of IPoIB CM.) There are thousands of lines of code difference between OFED and the kernel upon which it's based. (To be clear, I'm not objecting to any changes, just the sheer volume.) The OFED releases of the userspace libraries are not identical to those provided by the maintainers. (See libibverbs.) Whose version of libibverbs does RedHat plan on using? How do you manage the differences between OFED and Roland's libibverbs libraries? And I'm really not trying to come across harsh here, but if the distros are willing to pull the OFED code, why should OFA bother trying to merge anything upstream? - Sean From bob.kossey at hp.com Thu Jan 31 10:20:46 2008 From: bob.kossey at hp.com (Kossey, Robert) Date: Thu, 31 Jan 2008 13:20:46 -0500 Subject: [ewg] Re: [ofa-general] OFED Jan 28 meeting summary on RC3 readiness In-Reply-To: <7.0.1.0.2.20080131103316.0291de00@lanl.gov> References: <93703EDD1F125544A59BD25384B175621C86870E96@G1W0485.americas.hpqcorp.net> <7.0.1.0.2.20080131103316.0291de00@lanl.gov> Message-ID: <47A2117E.6030604@hp.com> Conversely, there are some vendors who will *not* support OFED from a distro, but only a version they supply and control. 
The customers I work with are more sensitive to performance, as well as quick turnaround for bug fixes, so having the latest releases, and being able to update them quickly is critical. I agree that for other customers, a more slowly changing supported stack is suitable. These may also be the same customers more likely to use 10GE. I agree more discussion and adjustment is needed for OFED to be able to balance the needs of the spectrum of users, from the latest kernel.org users, IB vendor based users and distro based users. Bob Parks Fields wrote: > At 09:16 PM 1/30/2008, Chet Mehta wrote: > >> Robert, >> >> In response to your question...... >> > A more general question I would like to ask the group is how many >> people use OFED from the RH or SUSE distros as >> > is, as compared with using OFED releases from other sources like >> the IB vendors, or building their own from >> > openfabrics.org? We use RH distros, but to this point, the OFED >> support provided in RH distros has lagged >> > behind the latest releases available from openfabrics.org. This is >> not to fault Red Hat, but OFED is still >> > changing too rapidly, with minor point releases and bug fixes, for >> a distro to keep up. I think many of us hope >> > that someday that will not be the case, but appears to be true for >> the foreseeable future. Right now, our mode >> > of operation is to remove whatever IB support comes in the distro >> and replace it, so it does not help us to >> > delay OFED 1.3 to get a particular bug fix in a distro. > > > We have found that there are some vendors who dictate that they will > only support a Distro EX: RHXXX. Then if you layer the latest OFED > on top, then the support is nullified. Or to get any bug fixes or > support you have to uninstall the what you did and repeat the > bug/error. SO I think it is very important to keep the Distros very > close the latest OFED stack. 
> > > > >> I believe the question that should be asked is "'How many IB >> customers would like to use the OFED distribution if provided by the >> distro?" The answer at least for the customer set we deal with is >> pretty much unanimous. The fact is that customers are already dealing >> with a distro for their base OS so obtaining the interconnect code & >> support from the same sources is highly desirable. When OpenIB was >> new (in 2004/5) and "common" IB code was still in its infancy, the >> customer set was tolerant of 'build your own' or vendor provided >> distribution mechanism. However if IB is to become a ubiquitous >> interconnect, we in OFA have to strive to tailor our deliverables to >> meet distro requirements. Until we do that, IB will have difficultly >> gaining broader market acceptance. >> >> Just my perspective. >> >> :Chet. >> _______________________________________________ >> general mailing list >> general at lists.openfabrics.org >> http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general >> >> To unsubscribe, please visit >> http://openib.org/mailman/listinfo/openib-general > > ***** Correspondence ***** > > This email contains no programmatic content that requires independent > ADC review > From dledford at redhat.com Thu Jan 31 10:30:23 2008 From: dledford at redhat.com (Doug Ledford) Date: Thu, 31 Jan 2008 13:30:23 -0500 Subject: [ewg] Re: [ofa-general] OFED Jan 28 meeting summary onRC3readiness In-Reply-To: <000201c86434$2cbc6400$a937170a@amr.corp.intel.com> References: <93703EDD1F125544A59BD25384B175621C86870E96@G1W0485.americas.hpqcorp.net> <47A0EDB6.5050804@mellanox.co.il> <000601c8638c$03fc8070$0dfd070a@amr.corp.intel.com> <1201733706.28486.218.camel@firewall.xsintricity.com> <000201c86434$2cbc6400$a937170a@amr.corp.intel.com> Message-ID: <1201804223.28486.284.camel@firewall.xsintricity.com> On Thu, 2008-01-31 at 10:07 -0800, Sean Hefty wrote: > >In all fairness, the kernel portion of all of this, and the process of > >getting 
things into Linus' kernel, has *always* been a case of staging > >things in Roland's tree and then merging upstream. So, at least for the > >kernel, that's mostly true as OFED is pretty close to Roland's tree > >generally speaking. As for the user space packages though, you guys > >*are* the upstream. There's no one to merge upstream to and very little > >oversight by anyone. So, it's entirely up to all of you just how much > >your package seems to be a feature of the day change-athon versus a > >solid, stable program. > > I don't believe that this is the model actually in use. OFED has accepted > kernel features that have not been submitted for upstream inclusion, or, in some > cases, that were, but were rejected. (For examples, see local SA, SA event > subscription, XRC, SDP, and some of the previous incarnations of IPoIB CM.) > There are thousands of lines of code difference between OFED and the kernel upon > which it's based. (To be clear, I'm not objecting to any changes, just the > sheer volume.) > > The OFED releases of the userspace libraries are not identical to those provided > by the maintainers. (See libibverbs.) Whose version of libibverbs does RedHat > plan on using? How do you manage the differences between OFED and Roland's > libibverbs libraries? > > And I'm really not trying to come across harsh here, but if the distros are > willing to pull the OFED code, why should OFA bother trying to merge anything > upstream? I pull *some* OFED code. I don't pull it all. There are things in OFED I won't accept until they've gone upstream. Hence, RDS is not in our offering. We made the mistake of taking SDP long ago and we'll carry that forward, but we generally look for things to be upstream before pulling them from OFED at this point (or at least have been submitted upstream and is being worked towards acceptance). In terms of user space, given a choice between a released tarball or the custom OFED tarball, I choose the released tarball. 
So, I currently have Roland's libibverbs, libmthca, and libmlx4. -- Doug Ledford GPG KeyID: CFBFF194 http://people.redhat.com/dledford Infiniband specific RPMs available at http://people.redhat.com/dledford/Infiniband -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 189 bytes Desc: This is a digitally signed message part URL: From rdreier at cisco.com Thu Jan 31 10:47:50 2008 From: rdreier at cisco.com (Roland Dreier) Date: Thu, 31 Jan 2008 10:47:50 -0800 Subject: [ofa-general] [PATCH] IPoIB UD 4K MTU support In-Reply-To: <1201757816.19565.62.camel@localhost.localdomain> (Shirley Ma's message of "Wed, 30 Jan 2008 21:36:56 -0800") References: <1201025321.756.33.camel@localhost.localdomain> <1201102346.6925.215.camel@mtls03> <1201068498.756.59.camel@localhost.localdomain> <4e6a6b3c0801231354v24baa6d8q5034a1cdaf3c704f@mail.gmail.com> <1201162492.9739.21.camel@localhost.localdomain> <1201181656.9739.42.camel@localhost.localdomain> <1201340657.10918.8.camel@dyn9047018117.beaverton.ibm.com> <1201774303.27803.26.camel@mtls03> <1201757816.19565.62.camel@localhost.localdomain> Message-ID: > Actually, I found a problem in IPoIB-CM here. Looks like most of HCAs do > support S/G numbers > 16. Then the other OS implementation could have > IPoIB link MTU = 64K -4 instead of current implementation 64K-16. It > could be a problem for this implementation to be in a mixed OS > evviornment. Just like my previous simple approach implementation for > IPoIB-UD, which is IPoIB link MTU = 4K - 48. I don't think CM is quite the same issue as for 4K-4 MTU, as one could easily imagine an implementation that allows an MTU of 128KB or 1MB or whatever. Also the CM protocol negotiates the receive size available, so we shouldn't have the problems of getting a local length error. - R. 
From Jeffrey.C.Becker at nasa.gov Thu Jan 31 10:49:07 2008 From: Jeffrey.C.Becker at nasa.gov (Jeff Becker) Date: Thu, 31 Jan 2008 10:49:07 -0800 Subject: [ofa-general] Re: [PATCH 0/3] ib/ipoib: Enable IPoIB-UD 4K MTU support In-Reply-To: <1201794681.5131.1.camel@mtls03> References: <1201718540.6850.41.camel@localhost.localdomain> <47A1798C.8050202@voltaire.com> <1201771703.27803.18.camel@mtls03> <1201758248.19565.69.camel@localhost.localdomain> <1201794681.5131.1.camel@mtls03> Message-ID: <47A21823.8040307@nasa.gov> Hi. Eli Cohen wrote: > > >> Somehow, I couldn't receive your email on time. Looks like, some of my >> emails got warning saying that the email needs to be approved since it >> matches spam contents. Do you have any idea why it's blocked? >> > > Maybe the list administrator can help with this issue. > Every once in a while, perfectly normal mail trips the spam filter. If this is really a problem, I can look into it. However, training the filter is kind of a black art so I'm not sure how successful I'll be. 
-jeff > _______________________________________________ > general mailing list > general at lists.openfabrics.org > http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general > > To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general > From mashirle at us.ibm.com Thu Jan 31 01:04:01 2008 From: mashirle at us.ibm.com (Shirley Ma) Date: Thu, 31 Jan 2008 01:04:01 -0800 Subject: [ofa-general] [PATCH] IPoIB UD 4K MTU support In-Reply-To: References: <1201025321.756.33.camel@localhost.localdomain> <1201102346.6925.215.camel@mtls03> <1201068498.756.59.camel@localhost.localdomain> <4e6a6b3c0801231354v24baa6d8q5034a1cdaf3c704f@mail.gmail.com> <1201162492.9739.21.camel@localhost.localdomain> <1201181656.9739.42.camel@localhost.localdomain> <1201340657.10918.8.camel@dyn9047018117.beaverton.ibm.com> <1201774303.27803.26.camel@mtls03> <1201757816.19565.62.camel@localhost.localdomain> Message-ID: <1201770241.19565.108.camel@localhost.localdomain> Hello Roland, > I don't think CM is quite the same issue as for 4K-4 MTU, as one could > easily imagine an implementation that allows an MTU of 128KB or 1MB or > whatever. Also the CM protocol negotiates the receive size available, > so we shouldn't have the problems of getting a local length error. There is no issue on getting local length error, but there is an issue when the other side set to 64k - 4 mtu, linux side set to 64k - 12 mtu, the none TCP based application will be broken without amdin's notice. TCP should be OK since it does MSS negotiation. 
Thanks Shirley From rdreier at cisco.com Thu Jan 31 11:18:15 2008 From: rdreier at cisco.com (Roland Dreier) Date: Thu, 31 Jan 2008 11:18:15 -0800 Subject: [ofa-general] [PATCH] IPoIB UD 4K MTU support In-Reply-To: <1201770241.19565.108.camel@localhost.localdomain> (Shirley Ma's message of "Thu, 31 Jan 2008 01:04:01 -0800") References: <1201025321.756.33.camel@localhost.localdomain> <1201102346.6925.215.camel@mtls03> <1201068498.756.59.camel@localhost.localdomain> <4e6a6b3c0801231354v24baa6d8q5034a1cdaf3c704f@mail.gmail.com> <1201162492.9739.21.camel@localhost.localdomain> <1201181656.9739.42.camel@localhost.localdomain> <1201340657.10918.8.camel@dyn9047018117.beaverton.ibm.com> <1201774303.27803.26.camel@mtls03> <1201757816.19565.62.camel@localhost.localdomain> <1201770241.19565.108.camel@localhost.localdomain> Message-ID: > There is no issue on getting local length error, but there is an issue > when the other side set to 64k - 4 mtu, linux side set to 64k - 12 mtu, > the none TCP based application will be broken without amdin's notice. > TCP should be OK since it does MSS negotiation. But the other side could easily have an MTU of 1 MB or 1 GB with a different IPoIB CM implementation. So I don't see how we can fix this really. 
From glebn at voltaire.com Thu Jan 31 11:27:30 2008 From: glebn at voltaire.com (Gleb Natapov) Date: Thu, 31 Jan 2008 21:27:30 +0200 Subject: [ewg] Re: [ofa-general] OFED Jan 28 meeting summary onRC3readiness In-Reply-To: <1201804223.28486.284.camel@firewall.xsintricity.com> References: <93703EDD1F125544A59BD25384B175621C86870E96@G1W0485.americas.hpqcorp.net> <47A0EDB6.5050804@mellanox.co.il> <000601c8638c$03fc8070$0dfd070a@amr.corp.intel.com> <1201733706.28486.218.camel@firewall.xsintricity.com> <000201c86434$2cbc6400$a937170a@amr.corp.intel.com> <1201804223.28486.284.camel@firewall.xsintricity.com> Message-ID: <20080131192730.GA22614@minantech.com> On Thu, Jan 31, 2008 at 01:30:23PM -0500, Doug Ledford wrote: > > And I'm really not trying to come across harsh here, but if the distros are > > willing to pull the OFED code, why should OFA bother trying to merge anything > > upstream? > > I pull *some* OFED code. I don't pull it all. There are things in OFED > I won't accept until they've gone upstream. Hence, RDS is not in our > offering. We made the mistake of taking SDP long ago and we'll carry What about XRC? -- Gleb. From rdreier at cisco.com Thu Jan 31 11:30:08 2008 From: rdreier at cisco.com (Roland Dreier) Date: Thu, 31 Jan 2008 11:30:08 -0800 Subject: [ofa-general] Re: [PATCH 1 of 2] mthca: mthca_QUERY_ADAPTER reads fields which are reserved in memfree In-Reply-To: <200801271813.21048.jackm@dev.mellanox.co.il> (Jack Morgenstein's message of "Sun, 27 Jan 2008 18:13:20 +0200") References: <200801271813.21048.jackm@dev.mellanox.co.il> Message-ID: Thanks, applied. > I left the non-memfree implementation as it was before. Its possible that > the memfree implementation is good for non-memfree as well, in which case > the "if (mthca_is_memfree)" conditions can be eliminated (i.e., just > use the memfree implementation unconditionally). What would make the memfree implementation no good for non-memfree? How can we find out whether we can make that simplification?
From mashirle at us.ibm.com Thu Jan 31 01:32:37 2008 From: mashirle at us.ibm.com (Shirley Ma) Date: Thu, 31 Jan 2008 01:32:37 -0800 Subject: [ofa-general] [PATCH] IPoIB UD 4K MTU support In-Reply-To: References: <1201025321.756.33.camel@localhost.localdomain> <1201102346.6925.215.camel@mtls03> <1201068498.756.59.camel@localhost.localdomain> <4e6a6b3c0801231354v24baa6d8q5034a1cdaf3c704f@mail.gmail.com> <1201162492.9739.21.camel@localhost.localdomain> <1201181656.9739.42.camel@localhost.localdomain> <1201340657.10918.8.camel@dyn9047018117.beaverton.ibm.com> <1201774303.27803.26.camel@mtls03> <1201757816.19565.62.camel@localhost.localdomain> <1201770241.19565.108.camel@localhost.localdomain> Message-ID: <1201771957.19565.111.camel@localhost.localdomain> On Thu, 2008-01-31 at 11:18 -0800, Roland Dreier wrote: > > There is no issue on getting local length error, but there is an issue > > when the other side set to 64k - 4 mtu, linux side set to 64k - 12 mtu, > > the none TCP based application will be broken without amdin's notice. > > TCP should be OK since it does MSS negotiation. > > But the other side could easily have an MTU of 1 MB or 1 GB with a > different IPoIB CM implementation. So I don't see how we can fix this > really. It could be fixed by being able to set max IPv4 MTU 64k-4 and IPv6 MTU 4G-4. Thanks Shirley From rdreier at cisco.com Thu Jan 31 11:33:07 2008 From: rdreier at cisco.com (Roland Dreier) Date: Thu, 31 Jan 2008 11:33:07 -0800 Subject: [ofa-general] Re: [PATCH 2 of 2] mlx4: mlx4__QUERY_ADAPTER reads fields which are reserved In-Reply-To: <200801271813.25879.jackm@dev.mellanox.co.il> (Jack Morgenstein's message of "Sun, 27 Jan 2008 18:13:25 +0200") References: <200801271813.25879.jackm@dev.mellanox.co.il> Message-ID: thanks, applied. 
From dledford at redhat.com Thu Jan 31 11:39:22 2008 From: dledford at redhat.com (Doug Ledford) Date: Thu, 31 Jan 2008 14:39:22 -0500 Subject: [ewg] Re: [ofa-general] OFED Jan 28 meeting summary onRC3readiness In-Reply-To: <20080131192730.GA22614@minantech.com> References: <93703EDD1F125544A59BD25384B175621C86870E96@G1W0485.americas.hpqcorp.net> <47A0EDB6.5050804@mellanox.co.il> <000601c8638c$03fc8070$0dfd070a@amr.corp.intel.com> <1201733706.28486.218.camel@firewall.xsintricity.com> <000201c86434$2cbc6400$a937170a@amr.corp.intel.com> <1201804223.28486.284.camel@firewall.xsintricity.com> <20080131192730.GA22614@minantech.com> Message-ID: <1201808362.28486.294.camel@firewall.xsintricity.com> On Thu, 2008-01-31 at 21:27 +0200, Gleb Natapov wrote: > On Thu, Jan 31, 2008 at 01:30:23PM -0500, Doug Ledford wrote: > > > And I'm really not trying to come across harsh here, but if the distros are > > > willing to pull the OFED code, why should OFA bother trying to merge anything > > > upstream? > > > > I pull *some* OFED code. I don't pull it all. There are things in OFED > > I won't accept until they've gone upstream. Hence, RDS is not in our > > offering. We made the mistake of taking SDP long ago and we'll carry > What about XRC? It's currently in, although that isn't written in stone either. Individual changes to existing components, like xrc, can slip past me easier than whole unsubmitted subsystems like RDS. -- Doug Ledford GPG KeyID: CFBFF194 http://people.redhat.com/dledford Infiniband specific RPMs available at http://people.redhat.com/dledford/Infiniband -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 189 bytes Desc: This is a digitally signed message part URL: From sean.hefty at intel.com Thu Jan 31 11:44:58 2008 From: sean.hefty at intel.com (Sean Hefty) Date: Thu, 31 Jan 2008 11:44:58 -0800 Subject: [ewg] Re: [ofa-general] OFED Jan 28 meeting summaryonRC3readiness In-Reply-To: <1201808362.28486.294.camel@firewall.xsintricity.com> References: <93703EDD1F125544A59BD25384B175621C86870E96@G1W0485.americas.hpqcorp.net> <47A0EDB6.5050804@mellanox.co.il> <000601c8638c$03fc8070$0dfd070a@amr.corp.intel.com> <1201733706.28486.218.camel@firewall.xsintricity.com> <000201c86434$2cbc6400$a937170a@amr.corp.intel.com> <1201804223.28486.284.camel@firewall.xsintricity.com> <20080131192730.GA22614@minantech.com> <1201808362.28486.294.camel@firewall.xsintricity.com> Message-ID: <000a01c86441$c3577640$a937170a@amr.corp.intel.com> >It's currently in, although that isn't written in stone either. >Individual changes to existing components, like xrc, can slip past me >easier than whole unsubmitted subsystems like RDS. I think for RedHat it would end up being in the kernel, but Roland's userspace library doesn't support it, so it ends up being unused code. 
From rdreier at cisco.com Thu Jan 31 11:47:33 2008 From: rdreier at cisco.com (Roland Dreier) Date: Thu, 31 Jan 2008 11:47:33 -0800 Subject: [ofa-general] [PATCH] IPoIB UD 4K MTU support In-Reply-To: <1201771957.19565.111.camel@localhost.localdomain> (Shirley Ma's message of "Thu, 31 Jan 2008 01:32:37 -0800") References: <1201025321.756.33.camel@localhost.localdomain> <1201102346.6925.215.camel@mtls03> <1201068498.756.59.camel@localhost.localdomain> <4e6a6b3c0801231354v24baa6d8q5034a1cdaf3c704f@mail.gmail.com> <1201162492.9739.21.camel@localhost.localdomain> <1201181656.9739.42.camel@localhost.localdomain> <1201340657.10918.8.camel@dyn9047018117.beaverton.ibm.com> <1201774303.27803.26.camel@mtls03> <1201757816.19565.62.camel@localhost.localdomain> <1201770241.19565.108.camel@localhost.localdomain> <1201771957.19565.111.camel@localhost.localdomain> Message-ID: > It could be fixed by being able to set max IPv4 MTU 64k-4 and IPv6 MTU > 4G-4. But we could never receive a message of size even close to 4G into a single skb. So IPv6 at least will always be a problem. Even for the IPv4 case I think CM is OK: the two sides exchange the size of the message they can receive, so the not quite 64K MTU we support should be fine. The current CM implementation handles pmtu etc when a remove system has a smaller receive side. 
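[Editorial sketch, not part of the original thread.] The point made above, that each side of an IPoIB connected-mode link advertises how large a message it can receive and the sender limits itself accordingly, can be illustrated with a few lines of C. This is only a minimal model under stated assumptions: the helper names cm_path_mtu and cm_fits are hypothetical and are not taken from the ipoib driver, and the advertised receive size is assumed to include the 4-byte IPoIB encapsulation header.

```c
#include <assert.h>
#include <stdint.h>

/* 4-byte IPoIB encapsulation header that precedes every IP packet. */
#define ENCAP_LEN 4u

/*
 * Hypothetical helper (not a driver function): largest IP packet that may
 * be sent to a peer, given the local interface MTU and the receive size in
 * bytes that the peer advertised at connection setup. The advertised size
 * is assumed to include the encapsulation header.
 */
static inline uint32_t cm_path_mtu(uint32_t local_mtu, uint32_t peer_rx_size)
{
    assert(peer_rx_size > ENCAP_LEN);

    uint32_t peer_limit = peer_rx_size - ENCAP_LEN;

    /* The sender is bounded by whichever side is smaller. */
    return local_mtu < peer_limit ? local_mtu : peer_limit;
}

/*
 * Returns 1 if an IP packet of ip_len bytes can be sent as is, 0 if the
 * sender has to fall back to path-MTU handling (report "packet too big").
 */
static inline int cm_fits(uint32_t ip_len, uint32_t local_mtu,
                          uint32_t peer_rx_size)
{
    return ip_len <= cm_path_mtu(local_mtu, peer_rx_size);
}
```

Under these assumptions a side configured for 64K-4 talking to a peer that can only receive 64K-16 would be held to the smaller value, which matches the argument that a mismatch in configured MTUs is absorbed by the exchange of receive sizes rather than showing up as a local length error.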
From mashirle at us.ibm.com Thu Jan 31 02:15:56 2008 From: mashirle at us.ibm.com (Shirley Ma) Date: Thu, 31 Jan 2008 02:15:56 -0800 Subject: [ofa-general] [PATCH] IPoIB UD 4K MTU support In-Reply-To: References: <1201025321.756.33.camel@localhost.localdomain> <1201102346.6925.215.camel@mtls03> <1201068498.756.59.camel@localhost.localdomain> <4e6a6b3c0801231354v24baa6d8q5034a1cdaf3c704f@mail.gmail.com> <1201162492.9739.21.camel@localhost.localdomain> <1201181656.9739.42.camel@localhost.localdomain> <1201340657.10918.8.camel@dyn9047018117.beaverton.ibm.com> <1201774303.27803.26.camel@mtls03> <1201757816.19565.62.camel@localhost.localdomain> <1201770241.19565.108.camel@localhost.localdomain> <1201771957.19565.111.camel@localhost.localdomain> Message-ID: <1201774556.19565.113.camel@localhost.localdomain> Hello Roland, > Even for the IPv4 case I think CM is OK: the two sides exchange the > size of the message they can receive, so the not quite 64K MTU we > support should be fine. The current CM implementation handles pmtu > etc when a remove system has a smaller receive side. If pmtu is there, then there should be no problem for current CM. I didn't pay attention to this. Thanks Shirley
From Harris.Shi at lsi.com Thu Jan 31 13:53:42 2008 From: Harris.Shi at lsi.com (Shi, Harris) Date: Thu, 31 Jan 2008 14:53:42 -0700 Subject: [ofa-general] OFED 1.2.5 SRP driver did not send DID_NO_CONNECT on target failure Message-ID: <18A61515E49B764AB09447A336E51F560102AA6C@NAMAIL2.ad.lsil.com> Hi, Currently when I was working a failover solution on Engenio storage array with IB host connection, I noticed that there is no DID_NO_CONNECT notification to upper level driver when the link to target is failed. Our failover driver relied heavily on this notice from OFED 1.2 SRP driver to send out command to do failover at the expiration of link_down_timeout period. Due to this reason, the IO command eventually times out and failover occurred much later than what we expected. I am wondering if anyone is familiar with SRP driver and possibly have something for me to work around the issue. My system setting is as follows, Mellanox HCA with Dell Server and Engenio storage array with IB host card. Many thanks. Harris -------------- next part -------------- An HTML attachment was scrubbed... URL:
From changquing.tang at hp.com Thu Jan 31 14:50:49 2008 From: changquing.tang at hp.com (Tang, Changqing) Date: Thu, 31 Jan 2008 22:50:49 +0000 Subject: [ofa-general] [PATCH 0/ 8] XRC patch series (including xrc receive-only QPs) In-Reply-To: <200801231159.30989.jackm@dev.mellanox.co.il> References: <200801231159.30989.jackm@dev.mellanox.co.il> Message-ID: Jack: In order to open a new XRC domain, all processes on a node open a file descriptor using the same pathname, and pass the fd to ibv_open_xrc_domain(). When can I close the fd ? when can I remove the temp file ? Can I close the fd and unlink the temp file right after ibv_open_xrc_domain() returns ? Does ibv_open_xrc_domain() increase the fd reference count and ibv_close_xrc_domain() decrease the fd reference count ? Thanks. --CQ > -----Original Message----- > From: general-bounces at lists.openfabrics.org > [mailto:general-bounces at lists.openfabrics.org] On Behalf Of > Jack Morgenstein > Sent: Wednesday, January 23, 2008 4:00 AM > To: Roland Dreier > Cc: general at lists.openfabrics.org > Subject: [ofa-general] [PATCH 0/ 8] XRC patch series > (including xrc receive-only QPs) > > This patch series is the updated XRC implementation (kernel > and user (libibverbs and libmlx4)). > > Please give feedback -- I'm still reviewing the locking in > this implementation.
> > The kernel patches are all based on > git: //git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband.git > branch: for-2.6.25 > commit: 5e8a3c6041ded7e306607bb6c96a0e68ca4dd2b4 > > ***** > In addition, the kernel patch series requires that Eli > Cohen's patch 7/16, posted January 16, be applied first > ([ofa-general] [PATCH 7/16] ib/core: Add creation flags to QPs ) > ***** > > The patches should be applied in the order posted. > > Changes: > - Added creation of XRC receive-only QPs for userspace, which > reside in kernel space (user cannot post-to or poll these QPs). > Motivation: MPI community required XRC receive QPs which would > not be destroyed when the creating process terminated. > > Solution: Userspace requests that a QP be created in kernel space. > Each userspace process using that QP (i.e. receiving > packets on an XRC SRQ via the qp), registers with > that QP (-- the creator is also registered, whether > or not it is a user of the QP). When the last > userspace user > unregisters with the QP, it is destroyed. Unregistration > is also part of userspace cleanup, so there is > no leakage. > > API for this: > ibv_create_xrc_rcv_qp > ibv_modify_xrc_rcv_qp > ibv_query_xrc_rcv_qp > ibv_reg_xrc_rcv_qp > ibv_unreg_xrc_rcv_qp > > Creating process workflow: > ibv_create_xrc_rcv_qp -- to create > ibv_modify_xrc_rcv_qp -- to move QP to INIT > ibv_modify_xrc_rcv_qp -- to move QP to RTR > (to RTS is not needed for receive-only QPs) > > ibv_unreg_xrc_rcv_qp -- instead of destroy. > > Using process workflow > ibv_create_xrc_srq -- to create an SRQ > ibv_reg_xrc_rcv_qp -- to register with the QP as a user > > ibv_destroy_srq > ibv_unreg_xrc_rcv_qp -- to "unregister" with the QP. If no > user process remain registered, the > QP is destroyed. > > NOTES: > 1. Since there is no userspace object for the QP, the API uses > the XRC domain object and qp number instead. > > 2. 
Registration needs to be performed only once per process > (multiple registrations count as a single registration). > > 3. Async events for the receive QP are delivered to all registered > processes. The event ID is "OR'ed" with 0x80000000, to indicate > that this is an XRC receive-only QP event. The element field > union value "xrc_qp_num" is set to the QP number which > generated the > event. > > - Jack > > _______________________________________________ > general mailing list > general at lists.openfabrics.org > http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general > > To unsubscribe, please visit > http://openib.org/mailman/listinfo/openib-general > From ardavis at ichips.intel.com Thu Jan 31 17:02:46 2008 From: ardavis at ichips.intel.com (Arlin Davis) Date: Thu, 31 Jan 2008 17:02:46 -0800 Subject: [ofa-general] Re: Dapl 2 question/issue In-Reply-To: References: <1201628522.28486.36.camel@firewall.xsintricity.com> <479F73F9.9050704@ichips.intel.com> <1201634763.28486.63.camel@firewall.xsintricity.com> <1201636696.28486.75.camel@firewall.xsintricity.com> <479F9EA5.4020407@ichips.intel.com> <1201644204.28486.116.camel@firewall.xsintricity.com> <479FA93C.6050305@ichips.intel.com> Message-ID: <47A26FB6.8070306@ichips.intel.com> Tang, Changqing wrote: > I look at the v2 header file udat.h, it still has > > #include > > it should be > > #include > > If I install v2 header at include/dat2 > > right? correct. This is a bug. 
I will fix in RC4 From kliteyn at mellanox.co.il Thu Jan 31 17:07:48 2008 From: kliteyn at mellanox.co.il (kliteyn at mellanox.co.il) Date: 1 Feb 2008 03:07:48 +0200 Subject: [ofa-general] nightly osm_sim report 2008-02-01:normal completion Message-ID: OSM Simulation Regression Summary [Generated mail - please do NOT reply]
OpenSM binary date = 2008-01-31
OpenSM git rev = Wed_Jan_30_16:05:50_2008 [6972276aab52479428a0f689907bcb40d95ca2ec]
ibutils git rev = Mon_Dec_24_10:42:01_2007 [675bec82306d6920555dd0b5e2f664983e27e60f]
Total=400 Pass=400 Fail=0
Pass:
30 Stability IS1-16.topo
30 Pkey IS1-16.topo
30 OsmTest IS1-16.topo
30 OsmStress IS1-16.topo
30 Multicast IS1-16.topo
30 LidMgr IS1-16.topo
10 Stability IS3-loop.topo
10 Stability IS3-128.topo
10 Pkey IS3-128.topo
10 OsmTest IS3-loop.topo
10 OsmTest IS3-128.topo
10 OsmStress IS3-128.topo
10 Multicast IS3-loop.topo
10 Multicast IS3-128.topo
10 LidMgr IS3-128.topo
10 FatTree merge-roots-4-ary-2-tree.topo
10 FatTree merge-root-4-ary-3-tree.topo
10 FatTree gnu-stallion-64.topo
10 FatTree blend-4-ary-2-tree.topo
10 FatTree RhinoDDR.topo
10 FatTree FullGnu.topo
10 FatTree 4-ary-2-tree.topo
10 FatTree 2-ary-4-tree.topo
10 FatTree 12-node-spaced.topo
10 FTreeFail 4-ary-2-tree-missing-sw-link.topo
10 FTreeFail 4-ary-2-tree-links-at-same-rank-2.topo
10 FTreeFail 4-ary-2-tree-links-at-same-rank-1.topo
10 FTreeFail 4-ary-2-tree-diff-num-pgroups.topo
Failures:
From mhanafi at csc.com Thu Jan 31 17:20:31 2008 From: mhanafi at csc.com (Mahmoud Hanafi) Date: Thu, 31 Jan 2008 20:20:31 -0500 Subject: [ofa-general] ofed1.2.5rc2 and intel mpi error Message-ID: I am getting this error when trying to run xhpl using 1.2.5rc2. Any one else seen this error? [112][rdma_iba.c:260] Intel MPI fatal error: DTO operation completed with error. status=0x8. cookie=0x2aaaaaefe300 Mahmoud Hanafi Sr. System Administrator CSC HPC COE Bld.
676 2435 Fifth Street WPAFB, Ohio 45433 (937) 255-1536 Computer Sciences Corporation Registered Office: 2100 East Grand Avenue, El Segundo California 90245, USA Registered in USA No: C-489-59 ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- This is a PRIVATE message. If you are not the intended recipient, please delete without copying and kindly advise us by e-mail of the mistake in delivery. NOTE: Regardless of content, this e-mail shall not operate to bind CSC to any order or other contract unless pursuant to explicit written agreement or government initiative expressly permitting the use of e-mail for such purpose. ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- -------------- next part -------------- An HTML attachment was scrubbed... URL: From ardavis at ichips.intel.com Thu Jan 31 17:26:12 2008 From: ardavis at ichips.intel.com (Arlin Davis) Date: Thu, 31 Jan 2008 17:26:12 -0800 Subject: [ofa-general] ofed1.2.5rc2 and intel mpi error In-Reply-To: References: Message-ID: <47A27534.6060600@ichips.intel.com> Mahmoud Hanafi wrote: > > I am getting this error when trying to run xphl using 1.2.5rc2. Any one > else seen this error? > > [112][rdma_iba.c:260] Intel MPI fatal error: DTO operation completed > with error. status=0x8. cookie=0x2aaaaaefe300 > What adapter and what mpiexec options are being used? 
-arlin From mhanafi at csc.com Thu Jan 31 19:02:31 2008 From: mhanafi at csc.com (Mahmoud Hanafi) Date: Thu, 31 Jan 2008 22:02:31 -0500 Subject: [ofa-general] ofed1.2.5rc2 and intel mpi error In-Reply-To: <47A27534.6060600@ichips.intel.com> Message-ID: here is my mpirun command mpirun -np 128 -env I_MPI_DEVICE rdma:OpenIB-cma -env I_MPI_DEBUG 2 /home/hanafim/HPL/xhpl Mahmoud Hanafi Sr. System Administrator CSC HPC COE Bld. 676 2435 Fifth Street WPAFB, Ohio 45433 (937) 255-1536 Computer Sciences Corporation Registered Office: 2100 East Grand Avenue, El Segundo California 90245, USA Registered in USA No: C-489-59 Arlin Davis Sent by: general-bounces at lists.openfabrics.org 01/31/2008 08:26 PM To Mahmoud Hanafi/DEF/CSC at CSC cc general at lists.openfabrics.org Subject Re: [ofa-general] ofed1.2.5rc2 and intel mpi error Mahmoud Hanafi wrote: > > I am getting this error when trying to run xphl using 1.2.5rc2. Any one > else seen this error? > > [112][rdma_iba.c:260] Intel MPI fatal error: DTO operation completed > with error. status=0x8. cookie=0x2aaaaaefe300 > What adapter and what mpiexec options are being used?
-arlin _______________________________________________ general mailing list general at lists.openfabrics.org http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general -------------- next part -------------- An HTML attachment was scrubbed... URL: From prescott at hpc.ufl.edu Thu Jan 31 20:29:39 2008 From: prescott at hpc.ufl.edu (Craig Prescott) Date: Thu, 31 Jan 2008 23:29:39 -0500 Subject: [ofa-general] SDP and iWARP In-Reply-To: <4798D0D2.5070103@opengridcomputing.com> References: <4783A5B0.6040603@hpc.ufl.edu> <4783B3F5.20600@opengridcomputing.com> <4783BDD5.7000702@hpc.ufl.edu> <4783C326.3070306@opengridcomputing.com> <478634A5.3080204@hpc.ufl.edu> <47863794.9080709@opengridcomputing.com> <47865A4A.4070603@hpc.ufl.edu> <47865E5B.4030607@opengridcomputing.com> <4787936E.5010603@hpc.ufl.edu> <4787977E.509@opengridcomputing.com> <479765AC.1040600@hpc.ufl.edu> <8A71B368A89016469F72CD08050AD33401FCDA2F@maui.asicdesigners.com> <47977262.1060906@hpc.ufl.edu> <4798CB4C.7070706@opengridcomputing.com> <4798D0D2.5070103@opengridcomputing.com> Message-ID: <47A2A033.4060208@hpc.ufl.edu> Steve Wise wrote: > Roland Dreier wrote: >> Sorry to come into this thread so late, but does it make sense to try >> the current SDP code over iWARP? As I understand things, the RDMA >> consortium has its own spec for SDP on iWARP, which may not precisely >> correspond to the IBA SDP annex. So probably the SDP code would need >> updating to work over iWARP. >> > > I didn't think they were that different, but I don't know for sure. > However, unless the IB-SDP uses atomics or some other IB-specific work > request, it just might work. > Sorry for the slow follow-up. 
SDP on iWARP is working now:

[root at tebow2 ~]# /opt/netperf/bin/netperf -H 128.227.253.91 -L 128.227.253.92 -t SDP_STREAM -c -C -l 10 -p 5006
SDP STREAM TEST from 128.227.253.92 (128.227.253.92) port 0 AF_INET to 128.227.253.91 (128.227.253.91) port 0 AF_INET
Recv   Send   Send                          Utilization       Service Demand
Socket Socket Message  Elapsed              Send     Recv     Send    Recv
Size   Size   Size     Time     Throughput  local    remote   local   remote
bytes  bytes  bytes    secs.    10^6bits/s  % S      % S      us/KB   us/KB

262144 262144 262144    10.00      6305.54   16.39    14.38    0.852   1.495

The patch to enable this is not big - I will produce one and send it to the list. Might not happen before next week.

There is only one other remarkable problem encountered which is not already documented in this thread. That is, when SDP tries to resize the receive private buffers (receiver gets an SDP_MID_CHRCVBUF), this can create up to 9 scatter-gather entries for each associated work request. This is larger than the Chelsio RNICs I am using can handle (T3_MAX_SGE), and the building of work requests fails.

Does T3_MAX_SGE come from hardware?

Anyway, one way to work around this was to deny SDP any "large" sockets via module parameter (max_large_sockets=0). It would be good if SDP queried the RNIC for the max number of SGEs when an SDP_MID_CHRCVBUF is encountered, and resized the private buffers in a way that will not exceed the capability of the device.

I haven't tried that yet, but I have limited the requested receive buffer size to something the Chelsio RNIC could handle. The above netperf result uses this hack.

Thanks again for all your help. Will post some numbers shortly.
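(Illustration only, not part of the original message.) The idea of sizing the receive private buffers to what the device can handle might look roughly like the C sketch below. It assumes one scatter-gather entry per page of buffer; sge_needed() and clamp_rcvbuf() are hypothetical names, not SDP or cxgb3 functions, and the 4096-byte page and 4-entry limit in the note after the code are example values rather than figures from this thread. In a real driver the limit would presumably be read from the device attributes instead of being passed in by hand.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: number of scatter-gather entries needed to cover
 * buf_size bytes when each entry maps at most one page (rounds up).
 */
size_t sge_needed(size_t buf_size, size_t page_size)
{
    assert(page_size > 0);
    return (buf_size + page_size - 1) / page_size;
}

/*
 * Hypothetical helper: cap a requested receive buffer size so that it
 * never needs more than max_sge entries of page_size bytes each.
 */
size_t clamp_rcvbuf(size_t requested, size_t page_size, size_t max_sge)
{
    size_t limit;

    assert(page_size > 0 && max_sge > 0);
    limit = max_sge * page_size;   /* largest size the device can map */
    return requested < limit ? requested : limit;
}
```

With 4096-byte pages, a 36864-byte buffer needs 9 entries; clamping it against an assumed 4-entry device limit yields 16384 bytes, which needs exactly 4. Requests that already fit are returned unchanged.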
Cheers, Craig From johann.george at qlogic.com Thu Jan 31 20:35:57 2008 From: johann.george at qlogic.com (Johann George) Date: Thu, 31 Jan 2008 20:35:57 -0800 Subject: [ofa-general] Re: [PATCH] qperf: adding SL option for RDMA tests In-Reply-To: <47A0881C.9020509@dev.mellanox.co.il> References: <47A0881C.9020509@dev.mellanox.co.il> Message-ID: <20080201043557.GA12942@cuprite.pathscale.com> Hello Yevgeny. You did quite well with the patch but missed a few items. We should also increment the version since the data we are passing between the client and the server has changed to include the service level. I'll fix it up and check it in. Thanks for the patch. It is a useful option. Johann On Wed, Jan 30, 2008 at 04:22:20PM +0200, Yevgeny Kliteynik wrote: > Hi Johann, > > I'd like to add to qperf a command line parameter to for > changing the SL of the QP/AH (for RDMA tests). > This is being used mainly in order to check the QoS feature > and performance on different VLs. > > I'm not sure I figured out the option thing in qperf right, > but in any case, the following patch does the job for me. > > Please review and let me know what you think. > > -- Yevgeny > > Signed-off-by: Yevgeny Kliteynik > --- > src/help.txt | 4 ++++ > src/qperf.c | 11 ++++++++++- > src/qperf.h | 3 +++ > src/rdma.c | 6 ++++-- > 4 files changed, 21 insertions(+), 3 deletions(-) > > diff --git a/src/help.txt b/src/help.txt > index 59195cb..ec00356 100644 > --- a/src/help.txt > +++ b/src/help.txt > @@ -77,6 +77,7 @@ Opts > --timeout Time (-T) Set timeout > --loc_timeout Time (-lT) Set local timeout > --rem_timeout Time (-rT) Set remote timeout > + --service_level SL (-sl) Set Service Level to SL for RDMA tests > --unify_nodes (-un) Unify nodes > --unify_units (-uu) Unify units > --use_bits_per_sec (-ub) Use bits/sec rather than bytes/sec > @@ -184,6 +185,9 @@ Options > Set local timeout to Time. > --rem_timeout Time (-rT) > Set local timeout to Time. 
> + --service_level SL (-sl) > + Set Service Level to SL. This is the SL used for RDMA tests only. > + The default SL is 0. > --unify_nodes (-un) > Unify the nodes. Describe them in terms of local and remote rather > than send and receive. > diff --git a/src/qperf.c b/src/qperf.c > index fd4c24f..06ff2d5 100644 > --- a/src/qperf.c > +++ b/src/qperf.c > @@ -280,6 +280,7 @@ PAR_NAME ParName[] ={ > { "sock_buf_size", L_SOCK_BUF_SIZE, R_SOCK_BUF_SIZE }, > { "time", L_TIME, R_TIME }, > { "timeout", L_TIMEOUT, R_TIMEOUT }, > + { "service_level", L_SL, R_SL }, > }; > > > @@ -317,6 +318,8 @@ PAR_INFO ParInfo[P_N] ={ > { R_TIME, 't', &RReq.time }, > { L_TIMEOUT, 't', &Req.timeout }, > { R_TIMEOUT, 't', &RReq.timeout }, > + { L_SL, 'q', &Req.sl }, > + { R_SL, 'q', &RReq.sl }, > }; > > > @@ -392,6 +395,8 @@ OPTION Options[] ={ > { "-rT", 0, &opt_time, R_TIMEOUT }, > { "--server_timeout", 0, &opt_misc, 's', 't' }, > { "-st", 0, &opt_misc, 's', 't' }, > + { "--service_level", 0, &opt_long, L_SL, R_SL }, > + { "-sl", 0, &opt_long, L_SL, R_SL }, > { "--unify_nodes", 0, &opt_misc, 'u', 'n' }, > { "-un", 0, &opt_misc, 'u', 'n' }, > { "--unify_units", 0, &opt_misc, 'u', 'u' }, > @@ -1217,6 +1222,8 @@ client(TEST *test) > par_use(R_AFFINITY); > par_use(L_TIME); > par_use(R_TIME); > + par_use(L_SL); > + par_use(R_SL); > > set_affinity(); > RReq.ver_maj = VER_MAJ; > @@ -1848,7 +1855,7 @@ show_rest(void) > uint64_t lr = LStat.r.no_bytes; > uint64_t rs = RStat.s.no_bytes; > uint64_t rr = RStat.r.no_bytes; > - > + > if (ls && !rs && rr && !lr) { > srmode = 1; > resnS = &Res.l; > @@ -2385,6 +2392,7 @@ enc_req(REQ *host) > enc_int(host->no_msgs, sizeof(host->no_msgs)); > enc_int(host->sock_buf_size, sizeof(host->sock_buf_size)); > enc_int(host->time, sizeof(host->time)); > + enc_int(host->sl, sizeof(host->sl)); > enc_str(host->id, sizeof(host->id)); > } > > @@ -2411,6 +2419,7 @@ dec_req(REQ *host) > host->no_msgs = dec_int(sizeof(host->no_msgs)); > host->sock_buf_size = 
dec_int(sizeof(host->sock_buf_size)); > host->time = dec_int(sizeof(host->time)); > + host->sl = dec_int(sizeof(host->sl)); > dec_str(host->id, sizeof(host->id)); > } > > diff --git a/src/qperf.h b/src/qperf.h > index 0c42361..2539f61 100644 > --- a/src/qperf.h > +++ b/src/qperf.h > @@ -119,6 +119,8 @@ typedef enum { > R_TIME, > L_TIMEOUT, > R_TIMEOUT, > + L_SL, > + R_SL, > P_N > } PAR_INDEX; > > @@ -156,6 +158,7 @@ typedef struct REQ { > uint32_t no_msgs; /* Number of messages */ > uint32_t sock_buf_size; /* Socket buffer size */ > uint32_t time; /* Duration in seconds */ > + uint8_t sl; /* Service Level */ > char id[STRSIZE]; /* Identifier */ > char rate[STRSIZE]; /* Rate */ > } REQ; > diff --git a/src/rdma.c b/src/rdma.c > index b0cd067..5208d4d 100644 > --- a/src/rdma.c > +++ b/src/rdma.c > @@ -1596,7 +1596,8 @@ ib_prepare(IBDEV *ibdev) > .ah_attr = { > .dlid = ibdev->rcon.lid, > .port_num = ibdev->port, > - .static_rate = ibdev->rate > + .static_rate = ibdev->rate, > + .sl = Req.sl > } > }; > struct ibv_qp_attr rts_attr ={ > @@ -1610,7 +1611,8 @@ ib_prepare(IBDEV *ibdev) > struct ibv_ah_attr ah_attr ={ > .dlid = ibdev->rcon.lid, > .port_num = ibdev->port, > - .static_rate = ibdev->rate > + .static_rate = ibdev->rate, > + .sl = Req.sl > }; > > if (ibdev->trans == IBV_QPT_UD) { > -- > 1.5.1.4 From swise at opengridcomputing.com Thu Jan 31 20:50:03 2008 From: swise at opengridcomputing.com (Steve Wise) Date: Thu, 31 Jan 2008 22:50:03 -0600 Subject: [ofa-general] SDP and iWARP In-Reply-To: <47A2A033.4060208@hpc.ufl.edu> References: <4783A5B0.6040603@hpc.ufl.edu> <4783B3F5.20600@opengridcomputing.com> <4783BDD5.7000702@hpc.ufl.edu> <4783C326.3070306@opengridcomputing.com> <478634A5.3080204@hpc.ufl.edu> <47863794.9080709@opengridcomputing.com> <47865A4A.4070603@hpc.ufl.edu> <47865E5B.4030607@opengridcomputing.com> <4787936E.5010603@hpc.ufl.edu> <4787977E.509@opengridcomputing.com> <479765AC.1040600@hpc.ufl.edu> 
<8A71B368A89016469F72CD08050AD33401FCDA2F@maui.asicdesigners.com> <47977262.1060906@hpc.ufl.edu> <4798CB4C.7070706@opengridcomputing.com> <4798D0D2.5070103@opengridcomputing.com> <47A2A033.4060208@hpc.ufl.edu> Message-ID: <47A2A4FB.503@opengridcomputing.com> Craig Prescott wrote: > Steve Wise wrote: >> Roland Dreier wrote: >>> Sorry to come into this thread so late, but does it make sense to try >>> the current SDP code over iWARP? As I understand things, the RDMA >>> consortium has its own spec for SDP on iWARP, which may not precisely >>> correspond to the IBA SDP annex. So probably the SDP code would need >>> updating to work over iWARP. >>> >> >> I didn't think they were that different, but I don't know for sure. >> However, unless the IB-SDP uses atomics or some other IB-specific work >> request, it just might work. >> > Sorry for the slow follow-up. SDP on iWARP is working now: > Good work! > [root at tebow2 ~]# /opt/netperf/bin/netperf -H 128.227.253.91 -L > 128.227.253.92 -t SDP_STREAM -c -C -l 10 -p 5006 > SDP STREAM TEST from 128.227.253.92 (128.227.253.92) port 0 AF_INET to > 128.227.253.91 (128.227.253.91) port 0 AF_INET > Recv Send Send Utilization Service > Demand > Socket Socket Message Elapsed Send Recv Send Recv > Size Size Size Time Throughput local remote local > remote > bytes bytes bytes secs. 10^6bits/s % S % S us/KB > us/KB > > 262144 262144 262144 10.00 6305.54 16.39 14.38 0.852 > 1.495 > The patch to enable this is not big - I will produce one and send it to > the list. Might not happen before next week. > What mtu are you using? > There is only one other remarkable problem encountered which is not > already documented in this thread. That is, when SDP tries to resize > the receive private buffers (receiver gets an SDP_MID_CHRCVBUF), this > can create up to 9 scatter-gather entries for each associated work > request. This is larger than the Chelsio RNICs I am using can handle > (T3_MAX_SGE), and the building of work requests fails. 
> > Does T3_MAX_SGE come from hardware? > yes. > Anyway, one way to work around this was to deny SDP any "large" sockets > via moddule parameter (max_large_sockets=0). It would be good if SDP > queried the RNIC for the max number of SGEs when an SDP_MID_CHRCVBUF > is encountered, and resize the private buffers in a way that will not > exceed the capability of the device. > > I haven't tried that yet, but I have limited the requested > receive buffer size to something the Chelsio RNIC could handle. > The above netperf result uses this hack. > > Thanks again for all your help. Will post some numbers shortly. > > Cheers, > Craig From prescott at hpc.ufl.edu Thu Jan 31 20:58:18 2008 From: prescott at hpc.ufl.edu (Craig Prescott) Date: Thu, 31 Jan 2008 23:58:18 -0500 Subject: [ofa-general] SDP and iWARP In-Reply-To: <47A2A4FB.503@opengridcomputing.com> References: <4783A5B0.6040603@hpc.ufl.edu> <4783B3F5.20600@opengridcomputing.com> <4783BDD5.7000702@hpc.ufl.edu> <4783C326.3070306@opengridcomputing.com> <478634A5.3080204@hpc.ufl.edu> <47863794.9080709@opengridcomputing.com> <47865A4A.4070603@hpc.ufl.edu> <47865E5B.4030607@opengridcomputing.com> <4787936E.5010603@hpc.ufl.edu> <4787977E.509@opengridcomputing.com> <479765AC.1040600@hpc.ufl.edu> <8A71B368A89016469F72CD08050AD33401FCDA2F@maui.asicdesigners.com> <47977262.1060906@hpc.ufl.edu> <4798CB4C.7070706@opengridcomputing.com> <4798D0D2.5070103@opengridcomputing.com> <47A2A033.4060208@hpc.ufl.edu> <47A2A4FB.503@opengridcomputing.com> Message-ID: <47A2A6EA.60605@hpc.ufl.edu> Steve Wise wrote: > Craig Prescott wrote: >> [root at tebow2 ~]# /opt/netperf/bin/netperf -H 128.227.253.91 -L >> 128.227.253.92 -t SDP_STREAM -c -C -l 10 -p 5006 >> SDP STREAM TEST from 128.227.253.92 (128.227.253.92) port 0 AF_INET >> to 128.227.253.91 (128.227.253.91) port 0 AF_INET >> Recv Send Send Utilization >> Service Demand >> Socket Socket Message Elapsed Send Recv >> Send Recv >> Size Size Size Time Throughput local remote >> local remote 
>> bytes bytes bytes secs. 10^6bits/s % S % S >> us/KB us/KB >> >> 262144 262144 262144 10.00 6305.54 16.39 14.38 >> 0.852 1.495 >> The patch to enable this is not big - I will produce one and send it to >> the list. Might not happen before next week. >> > > What mtu are you using? 9000. Cheers, Craig From mashirle at us.ibm.com Thu Jan 31 10:58:26 2008 From: mashirle at us.ibm.com (Shirley Ma) Date: Thu, 31 Jan 2008 10:58:26 -0800 Subject: [ofa-general] [UPDATE][PATCH 1/3] ib/ipoib: Make IPoIB-CM RX S/G APIs more generic (for-2.6.25) In-Reply-To: <1201718716.6850.46.camel@localhost.localdomain> References: <1201718540.6850.41.camel@localhost.localdomain> <1201718716.6850.46.camel@localhost.localdomain> Message-ID: <1201805906.19565.118.camel@localhost.localdomain> This patch makes IPoIB-CM RX S/G APIs more generic for IPoIB-UD RX S/G to be reused later. Signed-off-by: Shirley Ma --- drivers/infiniband/ulp/ipoib/ipoib.h | 26 +++++- drivers/infiniband/ulp/ipoib/ipoib_cm.c | 135 ++++++------------------------- drivers/infiniband/ulp/ipoib/ipoib_ib.c | 85 +++++++++++++++++++ 3 files changed, 132 insertions(+), 114 deletions(-) diff --git a/drivers/infiniband/ulp/ipoib/ipoib.h b/drivers/infiniband/ulp/ipoib/ipoib.h index fe250c6..d1d3ca2 100644 --- a/drivers/infiniband/ulp/ipoib/ipoib.h +++ b/drivers/infiniband/ulp/ipoib/ipoib.h @@ -141,6 +141,11 @@ struct ipoib_rx_buf { u64 mapping; }; +struct ipoib_cm_rx_buf { + struct sk_buff *skb; + u64 mapping[IPOIB_CM_RX_SG]; +}; + struct ipoib_tx_buf { struct sk_buff *skb; u64 mapping; @@ -212,11 +217,6 @@ struct ipoib_cm_tx { struct ib_wc ibwc[IPOIB_NUM_WC]; }; -struct ipoib_cm_rx_buf { - struct sk_buff *skb; - u64 mapping[IPOIB_CM_RX_SG]; -}; - struct ipoib_cm_dev_priv { struct ib_srq *srq; struct ipoib_cm_rx_buf *srq_ring; @@ -458,6 +458,22 @@ int ipoib_vlan_delete(struct net_device *pdev, unsigned short pkey); void ipoib_pkey_poll(struct work_struct *work); int ipoib_pkey_dev_delay_open(struct net_device *dev); void
ipoib_drain_cq(struct net_device *dev); +void skb_put_frags(struct sk_buff *skb, unsigned int hdr_space, + unsigned int length, struct sk_buff *toskb); +struct sk_buff *ipoib_cm_alloc_rx_skb(struct net_device *dev, + int id, int frags, int head_size, + int pad, u64 *mapping); +static void inline ipoib_dma_unmap_rx(struct ipoib_dev_priv *priv, int frags, + int head_size, u64 *mapping) +{ + int i; + ib_dma_unmap_single(priv->ca, mapping[0], head_size, DMA_FROM_DEVICE); + for (i = 0; i < frags; i++) + ib_dma_unmap_single(priv->ca, mapping[i + 1], PAGE_SIZE, + DMA_FROM_DEVICE); + +} + #ifdef CONFIG_INFINIBAND_IPOIB_CM diff --git a/drivers/infiniband/ulp/ipoib/ipoib_cm.c b/drivers/infiniband/ulp/ipoib/ipoib_cm.c index 1818f95..2c2c6b2 100644 --- a/drivers/infiniband/ulp/ipoib/ipoib_cm.c +++ b/drivers/infiniband/ulp/ipoib/ipoib_cm.c @@ -77,17 +77,6 @@ static struct ib_send_wr ipoib_cm_rx_drain_wr = { static int ipoib_cm_tx_handler(struct ib_cm_id *cm_id, struct ib_cm_event *event); -static void ipoib_cm_dma_unmap_rx(struct ipoib_dev_priv *priv, int frags, - u64 mapping[IPOIB_CM_RX_SG]) -{ - int i; - - ib_dma_unmap_single(priv->ca, mapping[0], IPOIB_CM_HEAD_SIZE, DMA_FROM_DEVICE); - - for (i = 0; i < frags; ++i) - ib_dma_unmap_single(priv->ca, mapping[i + 1], PAGE_SIZE, DMA_FROM_DEVICE); -} - static int ipoib_cm_post_receive_srq(struct net_device *dev, int id) { struct ipoib_dev_priv *priv = netdev_priv(dev); @@ -102,8 +91,9 @@ static int ipoib_cm_post_receive_srq(struct net_device *dev, int id) ret = ib_post_srq_recv(priv->cm.srq, &priv->cm.rx_wr, &bad_wr); if (unlikely(ret)) { ipoib_warn(priv, "post srq failed for buf %d (%d)\n", id, ret); - ipoib_cm_dma_unmap_rx(priv, priv->cm.num_frags - 1, - priv->cm.srq_ring[id].mapping); + ipoib_dma_unmap_rx(priv, priv->cm.num_frags - 1, + IPOIB_CM_HEAD_SIZE, + priv->cm.srq_ring[id].mapping); dev_kfree_skb_any(priv->cm.srq_ring[id].skb); priv->cm.srq_ring[id].skb = NULL; } @@ -126,8 +116,8 @@ static int 
ipoib_cm_post_receive_nonsrq(struct net_device *dev, ret = ib_post_recv(rx->qp, &priv->cm.rx_wr, &bad_wr); if (unlikely(ret)) { ipoib_warn(priv, "post recv failed for buf %d (%d)\n", id, ret); - ipoib_cm_dma_unmap_rx(priv, IPOIB_CM_RX_SG - 1, - rx->rx_ring[id].mapping); + ipoib_dma_unmap_rx(priv, IPOIB_CM_RX_SG - 1, IPOIB_CM_HEAD_SIZE, + rx->rx_ring[id].mapping); dev_kfree_skb_any(rx->rx_ring[id].skb); rx->rx_ring[id].skb = NULL; } @@ -135,59 +125,6 @@ static int ipoib_cm_post_receive_nonsrq(struct net_device *dev, return ret; } -static struct sk_buff *ipoib_cm_alloc_rx_skb(struct net_device *dev, - struct ipoib_cm_rx_buf *rx_ring, - int id, int frags, - u64 mapping[IPOIB_CM_RX_SG]) -{ - struct ipoib_dev_priv *priv = netdev_priv(dev); - struct sk_buff *skb; - int i; - - skb = dev_alloc_skb(IPOIB_CM_HEAD_SIZE + 12); - if (unlikely(!skb)) - return NULL; - - /* - * IPoIB adds a 4 byte header. So we need 12 more bytes to align the - * IP header to a multiple of 16. - */ - skb_reserve(skb, 12); - - mapping[0] = ib_dma_map_single(priv->ca, skb->data, IPOIB_CM_HEAD_SIZE, - DMA_FROM_DEVICE); - if (unlikely(ib_dma_mapping_error(priv->ca, mapping[0]))) { - dev_kfree_skb_any(skb); - return NULL; - } - - for (i = 0; i < frags; i++) { - struct page *page = alloc_page(GFP_ATOMIC); - - if (!page) - goto partial_error; - skb_fill_page_desc(skb, i, page, 0, PAGE_SIZE); - - mapping[i + 1] = ib_dma_map_page(priv->ca, skb_shinfo(skb)->frags[i].page, - 0, PAGE_SIZE, DMA_FROM_DEVICE); - if (unlikely(ib_dma_mapping_error(priv->ca, mapping[i + 1]))) - goto partial_error; - } - - rx_ring[id].skb = skb; - return skb; - -partial_error: - - ib_dma_unmap_single(priv->ca, mapping[0], IPOIB_CM_HEAD_SIZE, DMA_FROM_DEVICE); - - for (; i > 0; --i) - ib_dma_unmap_single(priv->ca, mapping[i], PAGE_SIZE, DMA_FROM_DEVICE); - - dev_kfree_skb_any(skb); - return NULL; -} - static void ipoib_cm_free_rx_ring(struct net_device *dev, struct ipoib_cm_rx_buf *rx_ring) { @@ -196,8 +133,9 @@ static void 
ipoib_cm_free_rx_ring(struct net_device *dev, for (i = 0; i < ipoib_recvq_size; ++i) if (rx_ring[i].skb) { - ipoib_cm_dma_unmap_rx(priv, IPOIB_CM_RX_SG - 1, - rx_ring[i].mapping); + ipoib_dma_unmap_rx(priv, IPOIB_CM_RX_SG - 1, + IPOIB_CM_HEAD_SIZE, + rx_ring[i].mapping); dev_kfree_skb_any(rx_ring[i].skb); } @@ -345,8 +283,12 @@ static int ipoib_cm_nonsrq_init_rx(struct net_device *dev, struct ib_cm_id *cm_i spin_unlock_irq(&priv->lock); for (i = 0; i < ipoib_recvq_size; ++i) { - if (!ipoib_cm_alloc_rx_skb(dev, rx->rx_ring, i, IPOIB_CM_RX_SG - 1, - rx->rx_ring[i].mapping)) { + rx->rx_ring[i].skb = ipoib_cm_alloc_rx_skb(dev, i, + IPOIB_CM_RX_SG - 1, + IPOIB_CM_HEAD_SIZE, + 12, + rx->rx_ring[i].mapping); + if (!rx->rx_ring[i].skb) { ipoib_warn(priv, "failed to allocate receive buffer %d\n", i); ret = -ENOMEM; goto err_count; @@ -480,38 +422,6 @@ static int ipoib_cm_rx_handler(struct ib_cm_id *cm_id, return 0; } } -/* Adjust length of skb with fragments to match received data */ -static void skb_put_frags(struct sk_buff *skb, unsigned int hdr_space, - unsigned int length, struct sk_buff *toskb) -{ - int i, num_frags; - unsigned int size; - - /* put header into skb */ - size = min(length, hdr_space); - skb->tail += size; - skb->len += size; - length -= size; - - num_frags = skb_shinfo(skb)->nr_frags; - for (i = 0; i < num_frags; i++) { - skb_frag_t *frag = &skb_shinfo(skb)->frags[i]; - - if (length == 0) { - /* don't need this page */ - skb_fill_page_desc(toskb, i, frag->page, 0, PAGE_SIZE); - --skb_shinfo(skb)->nr_frags; - } else { - size = min(length, (unsigned) PAGE_SIZE); - - frag->size = size; - skb->data_len += size; - skb->truesize += size; - skb->len += size; - length -= size; - } - } -} void ipoib_cm_handle_rx_wc(struct net_device *dev, struct ib_wc *wc) { @@ -581,7 +491,8 @@ void ipoib_cm_handle_rx_wc(struct net_device *dev, struct ib_wc *wc) frags = PAGE_ALIGN(wc->byte_len - min(wc->byte_len, (unsigned)IPOIB_CM_HEAD_SIZE)) / PAGE_SIZE; - newskb = 
ipoib_cm_alloc_rx_skb(dev, rx_ring, wr_id, frags, mapping); + newskb = ipoib_cm_alloc_rx_skb(dev, wr_id, frags, IPOIB_CM_HEAD_SIZE, + 12, mapping); if (unlikely(!newskb)) { /* * If we can't allocate a new RX buffer, dump @@ -592,7 +503,10 @@ void ipoib_cm_handle_rx_wc(struct net_device *dev, struct ib_wc *wc) goto repost; } - ipoib_cm_dma_unmap_rx(priv, frags, rx_ring[wr_id].mapping); + rx_ring[wr_id].skb = newskb; + + ipoib_dma_unmap_rx(priv, frags, IPOIB_CM_HEAD_SIZE, + rx_ring[wr_id].mapping); memcpy(rx_ring[wr_id].mapping, mapping, (frags + 1) * sizeof *mapping); ipoib_dbg_data(priv, "received %d bytes, SLID 0x%04x\n", @@ -1481,9 +1395,12 @@ int ipoib_cm_dev_init(struct net_device *dev) if (ipoib_cm_has_srq(dev)) { for (i = 0; i < ipoib_recvq_size; ++i) { - if (!ipoib_cm_alloc_rx_skb(dev, priv->cm.srq_ring, i, - priv->cm.num_frags - 1, - priv->cm.srq_ring[i].mapping)) { + priv->cm.srq_ring[i].skb = + ipoib_cm_alloc_rx_skb(dev, i, + priv->cm.num_frags - 1, + IPOIB_CM_HEAD_SIZE, 12, + priv->cm.srq_ring[i].mapping); + if (!priv->cm.srq_ring[i].skb) { ipoib_warn(priv, "failed to allocate " "receive buffer %d\n", i); ipoib_cm_dev_cleanup(dev); diff --git a/drivers/infiniband/ulp/ipoib/ipoib_ib.c b/drivers/infiniband/ulp/ipoib/ipoib_ib.c index 52bc2bd..c40329f 100644 --- a/drivers/infiniband/ulp/ipoib/ipoib_ib.c +++ b/drivers/infiniband/ulp/ipoib/ipoib_ib.c @@ -116,6 +116,91 @@ static int ipoib_ib_post_receive(struct net_device *dev, int id) return ret; } +struct sk_buff *ipoib_cm_alloc_rx_skb(struct net_device *dev, + int id, int frags, int head_size, + int pad, u64 *mapping) +{ + struct ipoib_dev_priv *priv = netdev_priv(dev); + struct sk_buff *skb; + int i; + + skb = dev_alloc_skb(head_size + pad); + if (unlikely(!skb)) + return NULL; + + /* + * IPoIB adds a 4 byte header. So we need pad more bytes to align the + * IP header to a multiple of 16. For CM mode, you add pad 12, + * for UD mode, we add pad 4. 
+ */ + skb_reserve(skb, pad); + + mapping[0] = ib_dma_map_single(priv->ca, skb->data, head_size, + DMA_FROM_DEVICE); + if (unlikely(ib_dma_mapping_error(priv->ca, mapping[0]))) { + dev_kfree_skb_any(skb); + return NULL; + } + + for (i = 0; i < frags; i++) { + struct page *page = alloc_page(GFP_ATOMIC); + + if (!page) + goto partial_error; + skb_fill_page_desc(skb, i, page, 0, PAGE_SIZE); + + mapping[i + 1] = ib_dma_map_page(priv->ca, skb_shinfo(skb)->frags[i].page, + 0, PAGE_SIZE, DMA_FROM_DEVICE); + if (unlikely(ib_dma_mapping_error(priv->ca, mapping[i + 1]))) + goto partial_error; + } + + return skb; + +partial_error: + + ib_dma_unmap_single(priv->ca, mapping[0], head_size, DMA_FROM_DEVICE); + + for (; i > 0; --i) + ib_dma_unmap_single(priv->ca, mapping[i], PAGE_SIZE, DMA_FROM_DEVICE); + + dev_kfree_skb_any(skb); + return NULL; +} + +/* Adjust length of skb with fragments to match received data */ +void skb_put_frags(struct sk_buff *skb, unsigned int hdr_space, + unsigned int length, struct sk_buff *toskb) +{ + int i, num_frags; + unsigned int size; + + /* put header into skb */ + size = min(length, hdr_space); + skb->tail += size; + skb->len += size; + length -= size; + + num_frags = skb_shinfo(skb)->nr_frags; + for (i = 0; i < num_frags; i++) { + skb_frag_t *frag = &skb_shinfo(skb)->frags[i]; + + if (length == 0) { + /* don't need this page */ + skb_fill_page_desc(toskb, i, frag->page, 0, PAGE_SIZE); + --skb_shinfo(skb)->nr_frags; + } else { + size = min(length, (unsigned) PAGE_SIZE); + + frag->size = size; + skb->data_len += size; + skb->truesize += size; + skb->len += size; + length -= size; + } + } +} + static int ipoib_alloc_rx_skb(struct net_device *dev, int id) { struct ipoib_dev_priv *priv = netdev_priv(dev); From mashirle at us.ibm.com Thu Jan 31 11:35:48 2008 From: mashirle at us.ibm.com (Shirley Ma) Date: Thu, 31 Jan 2008 11:35:48 -0800 Subject: [ofa-general] [UPDATE][PATCH 2/3] ib/ipoib: set IPoIB-UD RX S/G parameters In-Reply-To: 
<1201721611.6850.48.camel@localhost.localdomain> References: <1201718540.6850.41.camel@localhost.localdomain> <1201721611.6850.48.camel@localhost.localdomain> Message-ID: <1201808148.19565.121.camel@localhost.localdomain> This patch sets up all IPoIB-UD RX S/G related parameters. Signed-off-by: Shirley Ma --- drivers/infiniband/ulp/ipoib/ipoib.h | 13 +++++++++++++ drivers/infiniband/ulp/ipoib/ipoib_main.c | 19 ++++++++++++++----- drivers/infiniband/ulp/ipoib/ipoib_multicast.c | 3 +-- drivers/infiniband/ulp/ipoib/ipoib_verbs.c | 14 ++++++++++++-- 4 files changed, 40 insertions(+), 9 deletions(-) diff --git a/drivers/infiniband/ulp/ipoib/ipoib.h b/drivers/infiniband/ulp/ipoib/ipoib.h index d1d3ca2..004a80b 100644 --- a/drivers/infiniband/ulp/ipoib/ipoib.h +++ b/drivers/infiniband/ulp/ipoib/ipoib.h @@ -61,6 +61,10 @@ enum { IPOIB_ENCAP_LEN = 4, + IPOIB_MAX_IB_MTU = 4096, + IPOIB_UD_MAX_RX_SG = ALIGN(IPOIB_MAX_IB_MTU + IB_GRH_BYTES + 4, + PAGE_SIZE) / PAGE_SIZE, /* padding to align IP header */ + IPOIB_CM_MTU = 0x10000 - 0x10, /* padding to align header to 16 */ IPOIB_CM_BUF_SIZE = IPOIB_CM_MTU + IPOIB_ENCAP_LEN, IPOIB_CM_HEAD_SIZE = IPOIB_CM_BUF_SIZE % PAGE_SIZE, @@ -319,6 +323,9 @@ struct ipoib_dev_priv { struct dentry *mcg_dentry; struct dentry *path_dentry; #endif + int max_ib_mtu; + struct ib_sge rx_sge[IPOIB_UD_MAX_RX_SG]; + struct ib_recv_wr rx_wr; }; struct ipoib_ah { @@ -359,6 +366,12 @@ struct ipoib_neigh { struct list_head list; }; +#define IPOIB_UD_MTU(ib_mtu) (ib_mtu - IPOIB_ENCAP_LEN) +/* padding to align IP header */ +#define IPOIB_UD_BUF_SIZE(ib_mtu) (ib_mtu + IB_GRH_BYTES + 4) +#define IPOIB_UD_HEAD_SIZE(ib_mtu) (IPOIB_UD_BUF_SIZE(ib_mtu)) % PAGE_SIZE +#define IPOIB_UD_RX_SG(ib_mtu) ALIGN(IPOIB_UD_BUF_SIZE(ib_mtu), PAGE_SIZE) / PAGE_SIZE + /* * We stash a pointer to our private neighbour information after our * hardware address in neigh->ha. 
The ALIGN() expression here makes diff --git a/drivers/infiniband/ulp/ipoib/ipoib_main.c b/drivers/infiniband/ulp/ipoib/ipoib_main.c index a082466..242591f 100644 --- a/drivers/infiniband/ulp/ipoib/ipoib_main.c +++ b/drivers/infiniband/ulp/ipoib/ipoib_main.c @@ -194,7 +194,7 @@ static int ipoib_change_mtu(struct net_device *dev, int new_mtu) return 0; } - if (new_mtu > IPOIB_PACKET_SIZE - IPOIB_ENCAP_LEN) + if (new_mtu > IPOIB_UD_MTU(priv->max_ib_mtu)) return -EINVAL; priv->admin_mtu = new_mtu; @@ -968,10 +968,6 @@ static void ipoib_setup(struct net_device *dev) dev->tx_queue_len = ipoib_sendq_size * 2; dev->features = NETIF_F_VLAN_CHALLENGED | NETIF_F_LLTX; - /* MTU will be reset when mcast join happens */ - dev->mtu = IPOIB_PACKET_SIZE - IPOIB_ENCAP_LEN; - priv->mcast_mtu = priv->admin_mtu = dev->mtu; - memcpy(dev->broadcast, ipv4_bcast_addr, INFINIBAND_ALEN); netif_carrier_off(dev); @@ -1103,6 +1099,7 @@ static struct net_device *ipoib_add_port(const char *format, struct ib_device *hca, u8 port) { struct ipoib_dev_priv *priv; + struct ib_port_attr attr; int result = -ENOMEM; priv = ipoib_intf_alloc(format); @@ -1111,6 +1108,18 @@ static struct net_device *ipoib_add_port(const char *format, SET_NETDEV_DEV(priv->dev, hca->dma_device); + if (!ib_query_port(hca, port, &attr)) + priv->max_ib_mtu = ib_mtu_enum_to_int(attr.max_mtu); + else { + printk(KERN_WARNING "%s: ib_query_port %d failed\n", + hca->name, port); + goto device_init_failed; + } + + /* MTU will be reset when mcast join happens */ + priv->dev->mtu = IPOIB_UD_MTU(priv->max_ib_mtu); + priv->mcast_mtu = priv->admin_mtu = priv->dev->mtu; + result = ib_query_pkey(hca, port, 0, &priv->pkey); if (result) { printk(KERN_WARNING "%s: ib_query_pkey port %d failed (ret = %d)\n", diff --git a/drivers/infiniband/ulp/ipoib/ipoib_multicast.c b/drivers/infiniband/ulp/ipoib/ipoib_multicast.c index 2628339..630b429 100644 --- a/drivers/infiniband/ulp/ipoib/ipoib_multicast.c +++ 
b/drivers/infiniband/ulp/ipoib/ipoib_multicast.c @@ -567,8 +567,7 @@ void ipoib_mcast_join_task(struct work_struct *work) return; } - priv->mcast_mtu = ib_mtu_enum_to_int(priv->broadcast->mcmember.mtu) - - IPOIB_ENCAP_LEN; + priv->mcast_mtu = IPOIB_UD_MTU(ib_mtu_enum_to_int(priv->broadcast->mcmember.mtu)); if (!ipoib_cm_admin_enabled(dev)) dev->mtu = min(priv->mcast_mtu, priv->admin_mtu); diff --git a/drivers/infiniband/ulp/ipoib/ipoib_verbs.c b/drivers/infiniband/ulp/ipoib/ipoib_verbs.c index 433e99a..7e2d4d6 100644 --- a/drivers/infiniband/ulp/ipoib/ipoib_verbs.c +++ b/drivers/infiniband/ulp/ipoib/ipoib_verbs.c @@ -150,13 +150,13 @@ int ipoib_transport_dev_init(struct net_device *dev, struct ib_device *ca) .max_send_wr = ipoib_sendq_size, .max_recv_wr = ipoib_recvq_size, .max_send_sge = 1, - .max_recv_sge = 1 + .max_recv_sge = IPOIB_UD_RX_SG(priv->max_ib_mtu) }, .sq_sig_type = IB_SIGNAL_ALL_WR, .qp_type = IB_QPT_UD }; - int ret, size; + int ret, size, i; priv->pd = ib_alloc_pd(priv->ca); if (IS_ERR(priv->pd)) { @@ -208,6 +208,16 @@ int ipoib_transport_dev_init(struct net_device *dev, struct ib_device *ca) priv->tx_wr.num_sge = 1; priv->tx_wr.send_flags = IB_SEND_SIGNALED; + priv->rx_sge[0].length = IPOIB_UD_HEAD_SIZE(priv->max_ib_mtu); + for (i = 0; i < IPOIB_UD_RX_SG(priv->max_ib_mtu) - 1; ++i) { + priv->rx_sge[i + 1].length = PAGE_SIZE; + priv->rx_sge[i].lkey = priv->mr->lkey; + } + priv->rx_sge[i].lkey = priv->mr->lkey; + priv->rx_wr.num_sge = IPOIB_UD_RX_SG(priv->max_ib_mtu); + priv->rx_wr.next = NULL; + priv->rx_wr.sg_list = priv->rx_sge; + return 0; out_free_cq: From mashirle at us.ibm.com Thu Jan 31 12:20:20 2008 From: mashirle at us.ibm.com (Shirley Ma) Date: Thu, 31 Jan 2008 12:20:20 -0800 Subject: [ofa-general] [UPDATE] [PATCH 3/3] ib/ipoib: IPoIB-UD RX S/G support In-Reply-To: <1201725009.6850.54.camel@localhost.localdomain> References: <1201718540.6850.41.camel@localhost.localdomain> <1201725009.6850.54.camel@localhost.localdomain> Message-ID: 
<1201810821.19565.125.camel@localhost.localdomain> This patch enables IPoIB-UD RX to allocate S/G buffer up to payload size 4096. The link IPoIB MTU size is up to 4K - 4. Signed-off-by: Shirley Ma --- drivers/infiniband/ulp/ipoib/ipoib.h | 14 +---- drivers/infiniband/ulp/ipoib/ipoib_cm.c | 25 ++++---- drivers/infiniband/ulp/ipoib/ipoib_ib.c | 95 +++++++++++------------------- 3 files changed, 50 insertions(+), 84 deletions(-) diff --git a/drivers/infiniband/ulp/ipoib/ipoib.h b/drivers/infiniband/ulp/ipoib/ipoib.h index 004a80b..57d33d5 100644 --- a/drivers/infiniband/ulp/ipoib/ipoib.h +++ b/drivers/infiniband/ulp/ipoib/ipoib.h @@ -56,9 +56,6 @@ /* constants */ enum { - IPOIB_PACKET_SIZE = 2048, - IPOIB_BUF_SIZE = IPOIB_PACKET_SIZE + IB_GRH_BYTES, - IPOIB_ENCAP_LEN = 4, IPOIB_MAX_IB_MTU = 4096, @@ -142,11 +139,6 @@ struct ipoib_mcast { struct ipoib_rx_buf { struct sk_buff *skb; - u64 mapping; -}; - -struct ipoib_cm_rx_buf { - struct sk_buff *skb; u64 mapping[IPOIB_CM_RX_SG]; }; @@ -198,7 +190,7 @@ enum ipoib_cm_state { struct ipoib_cm_rx { struct ib_cm_id *id; struct ib_qp *qp; - struct ipoib_cm_rx_buf *rx_ring; + struct ipoib_rx_buf *rx_ring; struct list_head list; struct net_device *dev; unsigned long jiffies; @@ -223,7 +215,7 @@ struct ipoib_cm_tx { struct ipoib_cm_dev_priv { struct ib_srq *srq; - struct ipoib_cm_rx_buf *srq_ring; + struct ipoib_rx_buf *srq_ring; struct ib_cm_id *id; struct list_head passive_ids; /* state: LIVE */ struct list_head rx_error_list; /* state: ERROR */ @@ -473,7 +465,7 @@ int ipoib_pkey_dev_delay_open(struct net_device *dev); void ipoib_drain_cq(struct net_device *dev); void skb_put_frags(struct sk_buff *skb, unsigned int hdr_space, unsigned int length, struct sk_buff *toskb); -struct sk_buff *ipoib_cm_alloc_rx_skb(struct net_device *dev, +struct sk_buff *ipoib_alloc_rx_skb(struct net_device *dev, int id, int frags, int head_size, int pad, u64 *mapping); static void inline ipoib_dma_unmap_rx(struct ipoib_dev_priv *priv, int frags, 
diff --git a/drivers/infiniband/ulp/ipoib/ipoib_cm.c b/drivers/infiniband/ulp/ipoib/ipoib_cm.c index 2c2c6b2..b2fe0f8 100644 --- a/drivers/infiniband/ulp/ipoib/ipoib_cm.c +++ b/drivers/infiniband/ulp/ipoib/ipoib_cm.c @@ -126,7 +126,7 @@ static int ipoib_cm_post_receive_nonsrq(struct net_device *dev, } static void ipoib_cm_free_rx_ring(struct net_device *dev, - struct ipoib_cm_rx_buf *rx_ring) + struct ipoib_rx_buf *rx_ring) { struct ipoib_dev_priv *priv = netdev_priv(dev); int i; @@ -283,11 +283,11 @@ static int ipoib_cm_nonsrq_init_rx(struct net_device *dev, struct ib_cm_id *cm_i spin_unlock_irq(&priv->lock); for (i = 0; i < ipoib_recvq_size; ++i) { - rx->rx_ring[i].skb = ipoib_cm_alloc_rx_skb(dev, i, - IPOIB_CM_RX_SG - 1, - IPOIB_CM_HEAD_SIZE, - 12, - rx->rx_ring[i].mapping); + rx->rx_ring[i].skb = ipoib_alloc_rx_skb(dev, i, + IPOIB_CM_RX_SG - 1, + IPOIB_CM_HEAD_SIZE, + 12, + rx->rx_ring[i].mapping); if (!rx->rx_ring[i].skb) { ipoib_warn(priv, "failed to allocate receive buffer %d\n", i); ret = -ENOMEM; @@ -426,7 +426,7 @@ static int ipoib_cm_rx_handler(struct ib_cm_id *cm_id, void ipoib_cm_handle_rx_wc(struct net_device *dev, struct ib_wc *wc) { struct ipoib_dev_priv *priv = netdev_priv(dev); - struct ipoib_cm_rx_buf *rx_ring; + struct ipoib_rx_buf *rx_ring; unsigned int wr_id = wc->wr_id & ~(IPOIB_OP_CM | IPOIB_OP_RECV); struct sk_buff *skb, *newskb; struct ipoib_cm_rx *p; @@ -491,8 +491,7 @@ void ipoib_cm_handle_rx_wc(struct net_device *dev, struct ib_wc *wc) frags = PAGE_ALIGN(wc->byte_len - min(wc->byte_len, (unsigned)IPOIB_CM_HEAD_SIZE)) / PAGE_SIZE; - newskb = ipoib_cm_alloc_rx_skb(dev, wr_id, frags, IPOIB_CM_HEAD_SIZE, - 12, mapping); + newskb = ipoib_alloc_rx_skb(dev, wr_id, frags, IPOIB_CM_HEAD_SIZE, 12, mapping); if (unlikely(!newskb)) { /* * If we can't allocate a new RX buffer, dump @@ -1396,10 +1395,10 @@ int ipoib_cm_dev_init(struct net_device *dev) if (ipoib_cm_has_srq(dev)) { for (i = 0; i < ipoib_recvq_size; ++i) { priv->cm.srq_ring[i].skb = - 
ipoib_cm_alloc_rx_skb(dev, i, - priv->cm.num_frags - 1, - IPOIB_CM_HEAD_SIZE, 12, - priv->cm.srq_ring[i].mapping); + ipoib_alloc_rx_skb(dev, i, + priv->cm.num_frags - 1, + IPOIB_CM_HEAD_SIZE, 12, + priv->cm.srq_ring[i].mapping); if (!priv->cm.srq_ring[i].skb) { ipoib_warn(priv, "failed to allocate " "receive buffer %d\n", i); diff --git a/drivers/infiniband/ulp/ipoib/ipoib_ib.c b/drivers/infiniband/ulp/ipoib/ipoib_ib.c index c40329f..e6540a4 100644 --- a/drivers/infiniband/ulp/ipoib/ipoib_ib.c +++ b/drivers/infiniband/ulp/ipoib/ipoib_ib.c @@ -90,25 +90,16 @@ void ipoib_free_ah(struct kref *kref) static int ipoib_ib_post_receive(struct net_device *dev, int id) { struct ipoib_dev_priv *priv = netdev_priv(dev); - struct ib_sge list; - struct ib_recv_wr param; struct ib_recv_wr *bad_wr; int ret; - list.addr = priv->rx_ring[id].mapping; - list.length = IPOIB_BUF_SIZE; - list.lkey = priv->mr->lkey; - - param.next = NULL; - param.wr_id = id | IPOIB_OP_RECV; - param.sg_list = &list; - param.num_sge = 1; - - ret = ib_post_recv(priv->qp, &param, &bad_wr); + priv->rx_wr.wr_id = id | IPOIB_OP_RECV; + ret = ib_post_recv(priv->qp, &priv->rx_wr, &bad_wr); if (unlikely(ret)) { ipoib_warn(priv, "receive failed for buf %d (%d)\n", id, ret); - ib_dma_unmap_single(priv->ca, priv->rx_ring[id].mapping, - IPOIB_BUF_SIZE, DMA_FROM_DEVICE); + ipoib_dma_unmap_rx(priv, IPOIB_UD_RX_SG(priv->max_ib_mtu) - 1, + IPOIB_UD_HEAD_SIZE(priv->max_ib_mtu), + priv->rx_ring[id].mapping); dev_kfree_skb_any(priv->rx_ring[id].skb); priv->rx_ring[id].skb = NULL; } @@ -116,9 +107,9 @@ static int ipoib_ib_post_receive(struct net_device *dev, int id) return ret; } -struct sk_buff *ipoib_cm_alloc_rx_skb(struct net_device *dev, - int id, int frags, int head_size, - int pad, u64 *mapping) +struct sk_buff *ipoib_alloc_rx_skb(struct net_device *dev, + int id, int frags, int head_size, + int pad, u64 *mapping) { struct ipoib_dev_priv *priv = netdev_priv(dev); struct sk_buff *skb; @@ -201,43 +192,17 @@ void 
skb_put_frags(struct sk_buff *skb, unsigned int hdr_space, } } -static int ipoib_alloc_rx_skb(struct net_device *dev, int id) -{ - struct ipoib_dev_priv *priv = netdev_priv(dev); - struct sk_buff *skb; - u64 addr; - - skb = dev_alloc_skb(IPOIB_BUF_SIZE + 4); - if (!skb) - return -ENOMEM; - - /* - * IB will leave a 40 byte gap for a GRH and IPoIB adds a 4 byte - * header. So we need 4 more bytes to get to 48 and align the - * IP header to a multiple of 16. - */ - skb_reserve(skb, 4); - - addr = ib_dma_map_single(priv->ca, skb->data, IPOIB_BUF_SIZE, - DMA_FROM_DEVICE); - if (unlikely(ib_dma_mapping_error(priv->ca, addr))) { - dev_kfree_skb_any(skb); - return -EIO; - } - - priv->rx_ring[id].skb = skb; - priv->rx_ring[id].mapping = addr; - - return 0; -} - static int ipoib_ib_post_receives(struct net_device *dev) { struct ipoib_dev_priv *priv = netdev_priv(dev); int i; for (i = 0; i < ipoib_recvq_size; ++i) { - if (ipoib_alloc_rx_skb(dev, i)) { + priv->rx_ring[i].skb = ipoib_alloc_rx_skb(dev, i, + IPOIB_UD_RX_SG(priv->max_ib_mtu) - 1, + IPOIB_UD_HEAD_SIZE(priv->max_ib_mtu), 4, + priv->rx_ring[i].mapping); + if (!priv->rx_ring[i].skb) { ipoib_warn(priv, "failed to allocate receive buffer %d\n", i); return -ENOMEM; } @@ -254,8 +219,9 @@ static void ipoib_ib_handle_rx_wc(struct net_device *dev, struct ib_wc *wc) { struct ipoib_dev_priv *priv = netdev_priv(dev); unsigned int wr_id = wc->wr_id & ~IPOIB_OP_RECV; - struct sk_buff *skb; - u64 addr; + struct sk_buff *skb, *newskb; + u64 mapping[IPOIB_UD_RX_SG(priv->max_ib_mtu)]; + int frags; ipoib_dbg_data(priv, "recv completion: id %d, status: %d\n", wr_id, wc->status); @@ -267,15 +233,15 @@ static void ipoib_ib_handle_rx_wc(struct net_device *dev, struct ib_wc *wc) } skb = priv->rx_ring[wr_id].skb; - addr = priv->rx_ring[wr_id].mapping; if (unlikely(wc->status != IB_WC_SUCCESS)) { if (wc->status != IB_WC_WR_FLUSH_ERR) ipoib_warn(priv, "failed recv event " "(status=%d, wrid=%d vend_err %x)\n", wc->status, wr_id, 
wc->vendor_err); - ib_dma_unmap_single(priv->ca, addr, - IPOIB_BUF_SIZE, DMA_FROM_DEVICE); + ipoib_dma_unmap_rx(priv, IPOIB_UD_RX_SG(priv->max_ib_mtu) - 1, + IPOIB_UD_HEAD_SIZE(priv->max_ib_mtu), + priv->rx_ring[wr_id].mapping); dev_kfree_skb_any(skb); priv->rx_ring[wr_id].skb = NULL; return; @@ -288,11 +254,17 @@ static void ipoib_ib_handle_rx_wc(struct net_device *dev, struct ib_wc *wc) if (wc->slid == priv->local_lid && wc->src_qp == priv->qp->qp_num) goto repost; + frags = PAGE_ALIGN(wc->byte_len - min(wc->byte_len, + (unsigned)(IPOIB_UD_HEAD_SIZE(priv->max_ib_mtu)))) / PAGE_SIZE; + newskb = ipoib_alloc_rx_skb(dev, wr_id, frags, + IPOIB_UD_HEAD_SIZE(priv->max_ib_mtu), + 4, mapping); /* * If we can't allocate a new RX buffer, dump * this packet and reuse the old buffer. */ - if (unlikely(ipoib_alloc_rx_skb(dev, wr_id))) { + if (unlikely(!newskb)) { + ipoib_dbg(priv, "failed to allocate receive buffer %d\n", wr_id); ++dev->stats.rx_dropped; goto repost; } @@ -300,9 +272,12 @@ static void ipoib_ib_handle_rx_wc(struct net_device *dev, struct ib_wc *wc) ipoib_dbg_data(priv, "received %d bytes, SLID 0x%04x\n", wc->byte_len, wc->slid); - ib_dma_unmap_single(priv->ca, addr, IPOIB_BUF_SIZE, DMA_FROM_DEVICE); + ipoib_dma_unmap_rx(priv, frags, IPOIB_UD_HEAD_SIZE(priv->max_ib_mtu), + priv->rx_ring[wr_id].mapping); + memcpy(priv->rx_ring[wr_id].mapping, mapping, + (frags + 1) * sizeof *mapping); - skb_put(skb, wc->byte_len); + skb_put_frags(skb, IPOIB_UD_HEAD_SIZE(priv->max_ib_mtu), wc->byte_len, newskb); skb_pull(skb, IB_GRH_BYTES); skb->protocol = ((struct ipoib_header *) skb->data)->proto; @@ -715,10 +690,10 @@ int ipoib_ib_dev_stop(struct net_device *dev, int flush) rx_req = &priv->rx_ring[i]; if (!rx_req->skb) continue; - ib_dma_unmap_single(priv->ca, - rx_req->mapping, - IPOIB_BUF_SIZE, - DMA_FROM_DEVICE); + ipoib_dma_unmap_rx(priv, + IPOIB_UD_RX_SG(priv->max_ib_mtu) - 1, + IPOIB_UD_HEAD_SIZE(priv->max_ib_mtu), + priv->rx_ring[i].mapping); 
dev_kfree_skb_any(rx_req->skb); rx_req->skb = NULL; }

From mashirle at us.ibm.com Thu Jan 31 12:31:05 2008
From: mashirle at us.ibm.com (Shirley Ma)
Date: Thu, 31 Jan 2008 12:31:05 -0800
Subject: [ofa-general] [UPDATE][PATCH 0/3] ib/ipoib: Enable IPoIB-UD 4K MTU support (for-2.6.25)
In-Reply-To: <1201718540.6850.41.camel@localhost.localdomain>
References: <1201718540.6850.41.camel@localhost.localdomain>
Message-ID: <1201811465.19565.137.camel@localhost.localdomain>

Hello Roland,

I have finally cleaned up my local git tree and built this patchset against the 2.6.25 kernel. The patchset has been split into three patches. Each patch builds on its own when applied in sequence, so they are easy to test.

1/3. Make IPoIB-CM RX S/G APIs generic
2/3. Set IPoIB-UD RX S/G parameters
3/3. Enable IPoIB-UD RX S/G

Please review these patches as soon as possible so we can meet the OFED-1.3-RC4 schedule. I appreciate your timely help.

The current IPoIB-UD implementation limits the IPoIB payload size to 2048 by hard-coding IPOIB_PACKET_SIZE. The implementation is designed for a kernel PAGE_SIZE of 4K or greater. If the kernel PAGE_SIZE is 2K, buffer allocation will fail when large memory buffers are scarce. However, most distros support PAGE_SIZE >= 4K, so this implementation has no problem with a 2048 payload. It is simple, but it prevents HCAs that do support a 4096 payload, such as the IBM eHCA2, from taking advantage of it.

This patchset allows an IPoIB-UD MTU of up to 4092 (4K - IPOIB_ENCAP_LEN) when the HCA supports a 4K MTU. The S/G buffer allocation APIs from IPoIB-CM mode have been made generic so that IPoIB-UD and IPoIB-CM can share the S/G code. When PAGE_SIZE is equal to or greater than IPOIB_UD_BUF_SIZE plus the padding bytes that align the IP header, only one buffer is needed for a 4K MTU; otherwise two buffers are allocated and used with S/G.
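
The sizing rule described above can be checked with a small stand-alone C sketch. It mirrors the IPOIB_UD_MTU, IPOIB_UD_BUF_SIZE, IPOIB_UD_HEAD_SIZE and IPOIB_UD_RX_SG macros from patch 2/3, but the function names and the explicit page_size parameter are illustrative assumptions, not kernel code. The constants (40-byte GRH, 4-byte IPoIB header, 4 bytes of padding for UD) are the values used in the patches.

```c
#include <stddef.h>

/* Constants as used in the patches; the names here are local to this sketch. */
enum {
	GRH_BYTES = 40,	/* IB_GRH_BYTES: gap the HCA leaves for a GRH */
	ENCAP_LEN = 4,	/* IPOIB_ENCAP_LEN: IPoIB header */
	UD_PAD    = 4	/* padding that aligns the IP header in UD mode */
};

/* Largest IPoIB-UD link MTU for a given IB MTU (cf. IPOIB_UD_MTU). */
static size_t ud_mtu(size_t ib_mtu)
{
	return ib_mtu - ENCAP_LEN;
}

/* Receive buffer size for a given IB MTU (cf. IPOIB_UD_BUF_SIZE). */
static size_t ud_buf_size(size_t ib_mtu)
{
	return ib_mtu + GRH_BYTES + UD_PAD;
}

/* Bytes placed in the linear skb head (cf. IPOIB_UD_HEAD_SIZE). */
static size_t ud_head_size(size_t ib_mtu, size_t page_size)
{
	return ud_buf_size(ib_mtu) % page_size;
}

/* Number of scatter/gather entries needed (cf. IPOIB_UD_RX_SG). */
static size_t ud_rx_sg(size_t ib_mtu, size_t page_size)
{
	return (ud_buf_size(ib_mtu) + page_size - 1) / page_size;
}
```

With a 4096-byte IB MTU and 4K pages this gives a 4140-byte buffer split into a 44-byte head plus one page (two S/G entries). With 64K pages the same buffer fits in a single entry.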
A node's IPoIB link MTU is the minimum of the admin-configured MTU (set through ifconfig) and the MTU of the default IPoIB broadcast group. When the Subnet Manager enables the default broadcast group at startup, the IPoIB link MTU of the subnet is the MTU of that broadcast group. A node whose IB MTU is smaller than this value cannot join the IPoIB subnet. A node whose IB MTU is greater than this value joins the subnet, and this value becomes its IPoIB link MTU. If the Subnet Manager disables the default broadcast group at startup, the first node brought up in the subnet creates the default IPoIB broadcast group based on negotiation with the Subnet Manager; the default is currently 2K according to the IPoIB RFC.

Thanks
Shirley

From krkumar2 at in.ibm.com Thu Jan 31 23:44:51 2008
From: krkumar2 at in.ibm.com (Krishna Kumar2)
Date: Fri, 1 Feb 2008 13:14:51 +0530
Subject: [ofa-general] Re: Status of NFS-RDMA ?
In-Reply-To:
Message-ID:

Hi James,

Sure, I will start in a couple of weeks and report how it goes.

Thanks,

- KK

James Lentini wrote on 01/31/2008 09:20:35 PM:

>
> Krishna,
>
> If you would like to do some testing/development on NFS/RDMA, take a
> look at the current NFS/RDMA code. There are instructions on where to
> get it and how to set it up here:
>
> http://nfs-rdma.sourceforge.net/Documents/README
>
> I'm revising the instructions for 2.6.25. I'll be posting the new
> version once the first 2.6.25-rc is released. We would appreciate
> feedback in this area as well.
>
> james

From mashirle at us.ibm.com Thu Jan 31 21:57:40 2008
From: mashirle at us.ibm.com (Shirley Ma)
Date: Thu, 31 Jan 2008 21:57:40 -0800
Subject: [ofa-general] Re: [PATCH 0/5]: Improve small UDP messages
In-Reply-To: <1201873218.6677.4.camel@eli-laptop>
References: <1201873218.6677.4.camel@eli-laptop>
Message-ID: <1201845460.19565.153.camel@localhost.localdomain>

Hello Eli,

I am going to verify this in our lab.
I assume this patchset is built against 2.6.25?

On Fri, 2008-02-01 at 15:40 +0200, Eli Cohen wrote:
> The following patches, based on ofed 1.3, are intended to address bugs
> https://bugs.openfabrics.org/show_bug.cgi?id=760 and
> https://bugs.openfabrics.org/show_bug.cgi?id=761. They address UD mode
> both send and receive and improve performance when using small messages
> UDP traffic. The observation we had is that at small UDP messages, the
> message rate is high and so what limits throughput is CPU, e.g. CPU is
> 100% busy.

What's the configuration for this test? How many CPUs?

> In the send flow I use a dedicated CQ for the send flow which in turn is
> never armed. CQEs consumption is done by polling after posting a send
> message. Also, the QP is configured for selective signaling and polling
> the CQ is done once in 16 messages.

I did see selective signaling affect performance. In my experience, depending on how many packets you poll at once, performance can be better or worse. I haven't fully understood this yet. I have a similar patch. But why did you pick 16 messages?

> On the receive side the code is changed to post to receive queue once in
> 16 completions. This is done for both UD and CM.

Hmm, have you tested latency? I think it will increase latency for small messages.

Thanks
Shirley
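
For readers following the thread, the "once in 16" scheme Eli describes comes down to two counters. The stand-alone C sketch below shows only that counting logic. The helper names and the BATCH constant are hypothetical and are not taken from Eli's patches, which are not shown in this message. No verbs calls are made.

```c
#include <stdbool.h>

enum { BATCH = 16 };	/* the "once in 16" figure quoted in the thread */

/*
 * Hypothetical helper: returns true when the send with sequence number
 * `posted` (counting from 0) should request a completion, i.e. every
 * 16th send. The other sends would be posted unsignaled.
 */
static bool send_needs_signal(unsigned int posted)
{
	return (posted % BATCH) == BATCH - 1;
}

/*
 * Hypothetical helper: called once per receive completion. It counts
 * consumed buffers in *pending and returns how many to repost now:
 * 0 until 16 have accumulated, then 16 (and the counter resets).
 */
static unsigned int rx_repost_batch(unsigned int *pending)
{
	unsigned int n;

	if (++*pending < BATCH)
		return 0;
	n = *pending;
	*pending = 0;
	return n;
}
```

Under this scheme a consumed receive buffer can wait for up to 15 further completions before it is reposted.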