[ofa-general] Re: [PATCH 3/3] osm: QoS - parsing port names

Yevgeny Kliteynik kliteyn at dev.mellanox.co.il
Sun Oct 14 15:32:45 PDT 2007


Hi Sasha,

Sasha Khapyorsky wrote:
> On 12:24 Sun 14 Oct     , Yevgeny Kliteynik wrote:
>>  Sasha Khapyorsky wrote:
>>> Hi Yevgeny,
>>> On 11:01 Tue 09 Oct     , Yevgeny Kliteynik wrote:
>>>> Added CA-by-name hash to the QoS policy object and
>>> Why it is called "CA"-by-name? In the code below I see that hash is
>>> created for all nodes (including switches and routers).
>>  In osm_qos_policy.c:
>>
>>  	if (p_node->node_info.node_type == IB_NODE_TYPE_CA)
>>  		st_insert(p_qos_policy->p_ca_hash,
>>  			  (st_data_t)p_node->print_desc,
>>  			  (st_data_t)p_node);
> 
> Ok, so what is wrong with switches and routers? Why it cannot be
> specified by "name"?

Switches have the NodeDescription filled by FW, and it's usually the
same string for all the switches. Also, what would be the meaning of
"host id" for switches?
As for routers - I don't know what's going on there, since I didn't
get the chance to lay my hands on it yet (those IB routers are hard
to get right now :-)

>>>> diff --git a/opensm/opensm/osm_qos_parser.y 
>>>> b/opensm/opensm/osm_qos_parser.y
>>>> index 2405519..cf342d3 100644
>>>> --- a/opensm/opensm/osm_qos_parser.y
>>>> +++ b/opensm/opensm/osm_qos_parser.y
>>>> @@ -603,23 +603,74 @@ port_group_use_start:   TK_USE {
>>>>
>>>>  port_group_port_name:   port_group_port_name_start string_list {
>>>>                              /* 'port-name' in 'port-group' - any num of 
>>>> instances */
>>>> -                            cl_list_iterator_t    list_iterator;
>>>> -                            char                * tmp_str;
>>>> -
>>>> -                            list_iterator = 
>>>> cl_list_head(&tmp_parser_struct.str_list);
>>>> -                            while( list_iterator != 
>>>> cl_list_end(&tmp_parser_struct.str_list) )
>>>> +                            cl_list_iterator_t list_iterator;
>>>> +                            osm_node_t * p_node;
>>>> +                            osm_physp_t * p_physp;
>>>> +                            unsigned port_num;
>>>> +                            char * name_str;
>>>> +                            char * tmp_str;
>>>> +                            char * host_str;
>>>> +                            char * ca_str;
>>>> +                            char * port_str;
>>>> +                            char * node_desc = 
>>>> (char*)malloc(IB_NODE_DESCRIPTION_SIZE + 1);
>>>> +
>>>> +                            /* parsing port name strings */
>>>> +                            for (list_iterator = 
>>>> cl_list_head(&tmp_parser_struct.str_list);
>>>> +                                 list_iterator != 
>>>> cl_list_end(&tmp_parser_struct.str_list);
>>>> +                                 list_iterator = 
>>>> cl_list_next(list_iterator))
>>>>                              {
>>>>                                  tmp_str = 
>>>> (char*)cl_list_obj(list_iterator);
>>>> +                                if (tmp_str && *tmp_str)
>>>> +                                {
>>>> +                                    name_str = tmp_str;
>>>> +                                    host_str = strtok (name_str,"/");
>>>> +                                    ca_str = strtok (NULL, "/");
>>>> +                                    port_str = strtok (NULL, "/");
>>>> +
>>>> +                                    if (!host_str || !(*host_str) ||
>>>> +                                        !ca_str || !(*ca_str) ||
>>>> +                                        !port_str || !(*port_str) ||
>>>> +                                        (port_str[0] != 'p' && 
>>>> port_str[0] != 'P')) {
>>>> +                                        yyerror("illegal port name");
>>>> +                                        free(tmp_str);
>>>> +                                        free(node_desc);
>>>> +                                        
>>>> cl_list_remove_all(&tmp_parser_struct.str_list);
>>>> +                                        return 1;
>>>> +                                    }
>>>>
>>>> -                                /*
>>>> -                                 * TODO: parse port name strings
>>>> -                                 */
>>>> +                                    if (!(port_num = 
>>>> strtoul(&port_str[1],NULL,0))) {
>>>> +                                        yyerror("illegal port number in 
>>>> port name");
>>>> +                                        free(tmp_str);
>>>> +                                        free(node_desc);
>>>> +                                        
>>>> cl_list_remove_all(&tmp_parser_struct.str_list);
>>>> +                                        return 1;
>>>> +                                    }
>>>>
>>>> -                                if (tmp_str)
>>>> -                                    
>>>> cl_list_insert_tail(&p_current_port_group->port_name_list,tmp_str);
>>>> -                                list_iterator = 
>>>> cl_list_next(list_iterator);
>>>> +                                    sprintf(node_desc,"%s 
>>>> %s",host_str,ca_str);
>>>> +                                    free(tmp_str);
>>>> +
>>>> +                                    if 
>>>> (st_lookup(p_qos_policy->p_ca_hash,
>>>> +                                                  (st_data_t)node_desc,
>>>> +                                                  (st_data_t*)&p_node))
>>> I am not following this. Hash key is generated as "host_str ca_str", but
>>> below where hash table is filled NodeDescription string is used. Why
>>> this should be same?
>>  Because that's how node description is created.
>>  From /etc/init.d/openibd:
>>
>>      # Add node description to sysfs
>>      IBSYSDIR="/sys/class/infiniband"
>>      if [ -d ${IBSYSDIR} ]; then
>>          declare -i hca_id=1
>>          for hca in ${IBSYSDIR}/*
>>          do
>>              if [ -e ${hca}/node_desc ]; then
>>                  echo -n "$(hostname -s) HCA-${hca_id}" >> ${hca}/node_desc
>>              fi
>>              let hca_id++
>>          done
>>      fi
> 
> This script is optional, even when used the way how node_desc is
> generated can be easily changed. I think it is not good idea to copy the
> algorithm to OpenSM code and in this way to enforce an user to use the
> only this hardcoded node_desc format.
> 
> Actually this (or another similar) script is sort of config file, as
> well as qos policy file, and both are in admin's hands. So basically I
> agree that it is ok to require to define node_desc (if an admin wishes
> to use names for her QoS). _But_ we cannot dictate how it should be
> generated - it clearly must be user's and not our choice.
> 
> So instead of approaching hardcoded node_desc format I think that name
> definition in qos policy file should refer node_desc as whole string
> (well, in improved case it could be single substring with wild cards).

Ok, so let's elaborate on this.

Currently NodeDescription is filled by the openibd script.
Although this script can be modified by admin, I doubt that an average
admin would like to tweak it. Thus, I believe that in most cases the
NodeDescription will look this way: "node-id hca-num".

If we want to allow port names to have number ranges or asterisk
(and we do want it), then we have to have *some* format.

So here's my suggestion:
1. First of all, when the ca-by-name hash is created, osm will check that
    the NodeDescriptions are unique. If they aren't - parsing of the port
    names will be off, even if specified in the policy file.
2. If a port name doesn't have any special characters, it will be compared
    to the NodeDescription as is, and it'd better be unique
3. If the admin would like to include num. ranges and asterisks in the
    port name, he has to make sure that the NodeDescription is created
    like it is created now by openibd.

Sounds ok?

>>>> +	for (p_node = (osm_node_t *) cl_qmap_head(p_node_guid_tbl);
>>>> +	     p_node != (osm_node_t *) cl_qmap_end(p_node_guid_tbl);
>>>> +	     p_node = (osm_node_t *) cl_qmap_next(&p_node->map_item)) {
>>>> +		if (p_node->node_info.node_type == IB_NODE_TYPE_CA)
>>>> +			st_insert(p_qos_policy->p_ca_hash,
>>>> +				  (st_data_t)p_node->print_desc,
>>>> +				  (st_data_t)p_node);
>>> Hmm, why do you think NodeDescription will be unique for each node in a
>>> fabric?
>>  NodeDescription is a combination of host id and hca number.
>>  If nobody "plays" with these values (and doesn't modify this area
>>  of /etc/init.d/openibd), then NodeDescription will be unique.
>>  But IB spec doesn't require it to be unique. In fact, it doesn't say
>>  anything at all about how this NodeDescription should look.
> 
> Yes, it is the point.
> 
> OTOH as I stated above I think it ok to require node_desc setup for "by
> node_desc" resolution, and in this case an user is responsible to have
> it unique.
> 
> But let's do by node_desc and not by "$host $hca" or any another
> hardcoded format.
> 
>>  Moreover, if the device won't be found in /sys/class/infiniband,
>>  or if it won't have /sys/class/infiniband/${hca}/node_desc, I have no
>>  idea what would be the content of NodeDescription.
>>
>>  I am, however, trying to give best value for the money :)
>>
>>  OSM doesn't know what is the host id of a certain hca.
>>  The only thing I can think of right now is that OpenSM can
>>  check the NodeDescription before inserting it to the hash
>>  (which can be done in a consequent patch).
> 
> This is handled internally in st_insert() - new value will just replace
> old one.

I mean that OSM will check the description format, not its existence in the hash

>>  It can check that the description looks like this:
>>     <some string w/o white space><space><HCA-><some number>
> 
> Please no. :)

OK, but only because you asked nicely :)

-- Yevgeny

> Sasha
> 




More information about the general mailing list