[openib-general] Automatically connect to SRP target
PN
poknam at gmail.com
Mon Dec 11 18:41:08 PST 2006
Hi Vu,
I have two more questions.
I now have 3 SRP targets and use LVM on top of them to form a GFS file system.
After setting SRPHA_ENABLE=yes, I found that on some reboots (~30%) one
target is missed.
I then have to run "srp_daemon -e -o" manually to discover the missing target.
Is there a way to make srp_daemon keep retrying until all targets have
been found?
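
To make the question concrete, here is a minimal sketch of the kind of retry being asked about. It is a wrapper around the existing command, not something srp_daemon is known to provide. Assumptions: the function names are hypothetical, each target exports a single LUN, the targets show up in /proc/scsi/scsi with a model string containing "IBSRP" (as in the log below), and the SCSI_LIST variable is only there so the listing file can be swapped out.

```shell
#!/bin/sh
# Sketch only: re-run SRP discovery until the expected number of
# SRP disks is visible. All names here are illustrative.

# count_srp_disks FILE
# Print how many lines of a /proc/scsi/scsi-style listing mention
# "IBSRP" (ASSUMPTION: one such line per target, one LUN per target).
count_srp_disks() {
    n=$(grep -c 'IBSRP' "$1" 2>/dev/null) || true
    echo "${n:-0}"
}

# retry_discovery EXPECTED MAX_TRIES DELAY_SECONDS COMMAND [ARGS...]
# Run COMMAND until at least EXPECTED SRP disks are listed or
# MAX_TRIES attempts have been made. Returns 0 on success.
retry_discovery() {
    expected=$1
    max_tries=$2
    delay=$3
    shift 3
    try=1
    while [ "$try" -le "$max_tries" ]; do
        found=$(count_srp_disks "${SCSI_LIST:-/proc/scsi/scsi}")
        [ "$found" -ge "$expected" ] && return 0
        "$@" || true          # discovery command; failures are tolerated
        sleep "$delay"
        try=$((try + 1))
    done
    # Final check after the last attempt.
    found=$(count_srp_disks "${SCSI_LIST:-/proc/scsi/scsi}")
    [ "$found" -ge "$expected" ]
}
```

With 3 targets this could be called as: retry_discovery 3 10 5 srp_daemon -e -o
Once it returns 0, the volume group could be activated again (for example with "vgchange -ay gfsvg") before the GFS file system is mounted.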
Also, each dual-port client currently has only one cable connected.
Is it normal to see the following messages?
Dec 12 10:18:10 storage02 run_srp_daemon[5471]: starting srp_daemon:
[HCA=mthca0] [port=2]
Dec 12 10:18:13 storage02 run_srp_daemon[5483]: failed srp_daemon:
[HCA=mthca0] [port=2] [exit status=0]
Dec 12 10:18:43 storage02 run_srp_daemon[5489]: starting srp_daemon:
[HCA=mthca0] [port=2]
Dec 12 10:18:46 storage02 run_srp_daemon[5501]: failed srp_daemon:
[HCA=mthca0] [port=2] [exit status=0]
..... [repeats indefinitely]
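
To check whether the port named in these messages is the one without a cable, the port state can be read from sysfs. This is a minimal sketch and says nothing about whether run_srp_daemon should behave this way on a down port. Assumptions: the function names are hypothetical, and the state file lives at /sys/class/infiniband/<HCA>/ports/<N>/state and holds text such as "4: ACTIVE" or "1: DOWN".

```shell
#!/bin/sh
# Sketch only: list the ports of an HCA whose link state is ACTIVE.

# port_is_active STATE_FILE
# Succeeds if the state file is readable and reports ACTIVE.
port_is_active() {
    [ -r "$1" ] && grep -q 'ACTIVE' "$1"
}

# active_ports HCA_DIR
# Print the port numbers under HCA_DIR/ports that are ACTIVE,
# one per line. Ports without a readable state file are skipped.
active_ports() {
    for p in "$1"/ports/*; do
        [ -d "$p" ] || continue
        if port_is_active "$p/state"; then
            basename "$p"
        fi
    done
}
```

For the HCA in the log this would be called as: active_ports /sys/class/infiniband/mthca0
If port 2 is not in the output, it is the port with no cable attached.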
Thanks a lot,
PN
Below is the log:
Dec 12 10:17:18 storage02 network: Setting network parameters: succeeded
Dec 12 10:17:18 storage02 network: Bringing up loopback interface:
succeeded
Dec 12 10:17:23 storage02 network: Bringing up interface eth0: succeeded
Dec 12 10:17:23 storage02 network: Bringing up interface ib0: succeeded
Dec 12 10:17:26 storage02 kernel: REJ reason 0xa
Dec 12 10:17:26 storage02 kernel: ib_srp: Connection failed
Dec 12 10:17:26 storage02 kernel: scsi3 : SRP.T10:00D0680000000578
Dec 12 10:17:26 storage02 kernel: Vendor: Mellanox Model:
IBSRP10-TGT Rev: 1.46
Dec 12 10:17:26 storage02 kernel: Type:
Direct-Access ANSI SCSI revision: 03
Dec 12 10:17:26 storage02 kernel: SCSI device sdb: 160086528 512-byte hdwr
sectors (81964 MB)
Dec 12 10:17:26 storage02 kernel: SCSI device sdb: drive cache: write back
Dec 12 10:17:26 storage02 kernel: SCSI device sdb: 160086528 512-byte hdwr
sectors (81964 MB)
Dec 12 10:17:26 storage02 kernel: SCSI device sdb: drive cache: write back
Dec 12 10:17:26 storage02 rpcidmapd: rpc.idmapd startup succeeded
Dec 12 10:17:26 storage02 kernel: sdb: sdb1 sdb2 sdb3 sdb4 < sdb5 sdb6 sdb7
>
Dec 12 10:17:26 storage02 kernel: Attached scsi disk sdb at scsi3, channel
0, id 0, lun 0
Dec 12 10:17:26 storage02 kernel: scsi4 : SRP.T10:00D06800000007B2
Dec 12 10:17:26 storage02 kernel: Vendor: Mellanox Model: IBSRP10-TGT
hy-b Rev: 1.46
Dec 12 10:17:26 storage02 kernel: Type:
Direct-Access ANSI SCSI revision: 03
Dec 12 10:17:26 storage02 kernel: SCSI device sdc: 160086528 512-byte hdwr
sectors (81964 MB)
Dec 12 10:17:26 storage02 kernel: SCSI device sdc: drive cache: write back
Dec 12 10:17:26 storage02 kernel: SCSI device sdc: 160086528 512-byte hdwr
sectors (81964 MB)
Dec 12 10:17:26 storage02 kernel: SCSI device sdc: drive cache: write back
Dec 12 10:17:26 storage02 kernel: sdc: sdc1 sdc2 sdc3 sdc4 < sdc5 sdc6 >
Dec 12 10:17:26 storage02 kernel: Attached scsi disk sdc at scsi4, channel
0, id 0, lun 0
Dec 12 10:17:26 storage02 scsi.agent[3668]: disk at
/devices/pci0000:00/0000:00:02.0/0000:01:00.0/host3/target3:0:0/3:0:0:0
Dec 12 10:17:26 storage02 scsi.agent[3705]: disk at
/devices/pci0000:00/0000:00:02.0/0000:01:00.0/host4/target4:0:0/4:0:0:0
Dec 12 10:17:26 storage02 ccsd[3769]: Starting ccsd 1.0.7:
Dec 12 10:17:26 storage02 ccsd[3769]: Built: Aug 26 2006 15:01:49
Dec 12 10:17:26 storage02 ccsd[3769]: Copyright (C) Red Hat, Inc. 2004
All rights reserved.
Dec 12 10:17:26 storage02 kernel: NET: Registered protocol family 10
Dec 12 10:17:26 storage02 kernel: Disabled Privacy Extensions on device
ffffffff80405540(lo)
Dec 12 10:17:26 storage02 kernel: IPv6 over IPv4 tunneling driver
Dec 12 10:17:26 storage02 ccsd: succeeded
Dec 12 10:17:26 storage02 kernel: CMAN 2.6.9-45.4.centos4 (built Aug 26 2006
14:55:55) installed
Dec 12 10:17:26 storage02 kernel: NET: Registered protocol family 30
Dec 12 10:17:26 storage02 kernel: DLM 2.6.9-42.12.centos4 (built Aug 27 2006
05:25:40) installed
Dec 12 10:17:27 storage02 ccsd[3769]: cluster.conf (cluster name =
GFS_Cluster, version = 21) found.
Dec 12 10:17:27 storage02 ccsd[3769]: Unable to perform sendto: Cannot
assign requested address
Dec 12 10:17:27 storage02 run_srp_daemon[3845]: failed srp_daemon:
[HCA=mthca0] [port=2] [exit status=0]
Dec 12 10:17:28 storage02 run_srp_daemon[3851]: starting srp_daemon:
[HCA=mthca0] [port=2]
Dec 12 10:17:29 storage02 ccsd[3769]: Remote copy of cluster.conf is from
quorate node.
Dec 12 10:17:29 storage02 ccsd[3769]: Local version # : 21
Dec 12 10:17:29 storage02 ccsd[3769]: Remote version #: 21
Dec 12 10:17:29 storage02 kernel: CMAN: Waiting to join or form a
Linux-cluster
Dec 12 10:17:29 storage02 kernel: CMAN: sending membership request
Dec 12 10:17:29 storage02 ccsd[3769]: Connected to cluster infrastruture
via: CMAN/SM Plugin v1.1.7.1
Dec 12 10:17:29 storage02 ccsd[3769]: Initial status:: Inquorate
Dec 12 10:17:30 storage02 kernel: CMAN: got node storage01
Dec 12 10:17:30 storage02 kernel: CMAN: got node storage03
Dec 12 10:17:30 storage02 kernel: CMAN: quorum regained, resuming activity
Dec 12 10:17:30 storage02 ccsd[3769]: Cluster is quorate. Allowing
connections.
Dec 12 10:17:30 storage02 cman: startup succeeded
Dec 12 10:17:30 storage02 lock_gulmd: no <gulm> section detected in
/etc/cluster/cluster.conf succeeded
Dec 12 10:17:31 storage02 fenced: startup succeeded
Dec 12 10:17:31 storage02 run_srp_daemon[4196]: failed srp_daemon:
[HCA=mthca0] [port=2] [exit status=0]
Dec 12 10:17:33 storage02 run_srp_daemon[4224]: starting srp_daemon:
[HCA=mthca0] [port=2]
Dec 12 10:17:36 storage02 run_srp_daemon[4236]: failed srp_daemon:
[HCA=mthca0] [port=2] [exit status=0]
Dec 12 10:17:40 storage02 run_srp_daemon[4242]: starting srp_daemon:
[HCA=mthca0] [port=2]
Dec 12 10:17:42 storage02 clvmd: Cluster LVM daemon started - connected to
CMAN
Dec 12 10:17:42 storage02 kernel: CMAN: WARNING no listener for port 11 on
node storage01
Dec 12 10:17:42 storage02 kernel: CMAN: WARNING no listener for port 11 on
node storage03
Dec 12 10:17:42 storage02 clvmd: clvmd startup succeeded
Dec 12 10:17:42 storage02 vgchange: Couldn't find device with uuid
'U8viRP-K6Ev-0HlZ-5pwK-09co-tXgh-sJJKXT'.
Dec 12 10:17:42 storage02 vgchange: Couldn't find all physical volumes for
volume group gfsvg.
Dec 12 10:17:42 storage02 vgchange:
Dec 12 10:17:42 storage02 vgchange: Couldn't find device with uuid
'U8viRP-K6Ev-0HlZ-5pwK-09co-tXgh-sJJKXT'.
Dec 12 10:17:42 storage02 vgchange: Couldn't find all physical volumes for
volume group gfsvg.
Dec 12 10:17:42 storage02 vgchange: Couldn't find device with uuid
'U8viRP-K6Ev-0HlZ-5pwK-09co-tXgh-sJJKXT'.
Dec 12 10:17:42 storage02 vgchange: Couldn't find all physical volumes for
volume group gfsvg.
Dec 12 10:17:42 storage02 vgchange: Couldn't find device with uuid
'U8viRP-K6Ev-0HlZ-5pwK-09co-tXgh-sJJKXT'.
Dec 12 10:17:42 storage02 vgchange: Couldn't find all physical volumes for
volume group gfsvg.
Dec 12 10:17:42 storage02 vgchange: Volume group "gfsvg" not found
Dec 12 10:17:42 storage02 clvmd: Activating VGs: failed
Dec 12 10:17:42 storage02 netfs: Mounting other filesystems: succeeded
Dec 12 10:17:42 storage02 kernel: Lock_Harness 2.6.9-58.2.centos4 (built Aug
27 2006 05:27:43) installed
Dec 12 10:17:42 storage02 kernel: GFS 2.6.9-58.2.centos4 (built Aug 27 2006
05:28:00) installed
Dec 12 10:17:42 storage02 mount: mount: special device /dev/gfsvg/gfslv does
not exist
Dec 12 10:17:42 storage02 gfs: Mounting GFS filesystems: failed
Dec 12 10:17:42 storage02 kernel: i2c /dev entries driver
.....
2006/12/12, Vu Pham <vuhuong at mellanox.com>:
>
> PN,
> Edit file /etc/infiniband/openib.conf and set
>
> SRPHA_ENABLE=yes
>
> this will start srp_daemon by default
>
> -vu
>
> > No one can help me? :(
> >
> > PN
> >
> >
> > 2006/12/7, Lai Dragonfly <poknam at gmail.com>:
> >
> > Hi all,
> >
> > I'm using CentOS 4.4 (kernel 2.6.9-42.ELsmp) with OFED-1.1 on the
> > clients and
> > IBGD-1.8.2-srpt on the targets.
> > I found that even if I run "modprobe ib_srp" or set SRP_LOAD=yes in
> > openib.conf,
> > the SRP targets are not found.
> > Only after I execute "srp_daemon -e -o" do all the targets appear
> > as /dev/sdX.
> >
> > Since I want to export the targets to other nodes,
> > is there any way to connect to the targets automatically at each
> > reboot,
> > without typing "srp_daemon -e -o" each time?
> >
> > Thanks in advance.
> >
> > PN