vgscan fails when other nodes quit cleanly.
Last updated: Tue, 09 Mar 2010 00:03:49 -0800
LVM General Discussion
Hi all,
Here's an interesting issue. When we shut down the cluster stack
cleanly on the other nodes, all LVM commands on the remaining node
fail to grab the global lock, like this:
--->8----
sys3:~ # vgscan
cluster request failed: Host is down
Unable to obtain global lock.
---8<----
I went through the code history a bit. It seems to be caused by
e65ffb8e, which I think was meant for gulm only.
--->8----
commit e65ffb8e687bbce4e7edff70ebff2b3f1c0b6157
Author: Christine Caulfield <ccaulfie*******>
Make clvmd return immediately if other nodes are down in a gulm cluster.
bz#447799
diff --git a/WHATS_NEW b/WHATS_NEW
index ec7ff54..023659e 100644
--- a/WHATS_NEW
+++ b/WHATS_NEW
@@ -1,5 +1,6 @@
Version 2.02.39 -
================================
+ Make clvmd return immediately if other nodes are down in a gulm cluster.
Improve/Fix read ahead 'auto' calculation for stripe_size
Fix lvchange output for -r auto setting if auto is already set
Add testcase for read ahead
diff --git a/daemons/clvmd/clvmd-gulm.c b/daemons/clvmd/clvmd-gulm.c
index 3a230b5..a2f2148 100644
--- a/daemons/clvmd/clvmd-gulm.c
+++ b/daemons/clvmd/clvmd-gulm.c
@@ -665,6 +665,7 @@ static int _cluster_do_node_callback(struct local_client *master_client,
{
struct dm_hash_node *hn;
struct node_info *ninfo;
+ int somedown = 0;
dm_hash_iterate(hn, node_hash)
{
@@ -686,12 +687,14 @@ static int _cluster_do_node_callback(struct local_client *master_client,
client = dm_hash_lookup_binary(sock_hash, csid, GULM_MAX_CSID_LEN);
}
+ DEBUGLOG("down_callback2. node %s, state = %d\n", ninfo->name, ninfo->state);
if (ninfo->state != NODE_DOWN)
callback(master_client, csid, ninfo->state == NODE_CLVMD);
-
+ if (ninfo->state != NODE_CLVMD)
+ somedown = -1;
}
- return 0;
+ return somedown;
}
/* Convert gulm error codes to unix errno numbers */
---8<----
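To make the effect easier to see, here is a minimal standalone sketch of the loop as it looks after the patch. It is not the clvmd code: the struct layout, the NODE_UP state, the simplified callback signature and the node name "sys4" are assumptions for illustration only. Only NODE_DOWN, NODE_CLVMD and the somedown logic are taken from the diff above.

```c
#include <stddef.h>

/* Illustrative states; only NODE_DOWN and NODE_CLVMD appear in the patch. */
enum node_state { NODE_DOWN, NODE_UP, NODE_CLVMD };

/* Simplified stand-in for the real node_info structure. */
struct node_info {
    const char *name;
    enum node_state state;
};

/* Counts how many nodes the (stubbed) callback was invoked for. */
static int callbacks_run;

/* Stub for the per-node callback; the real one takes different arguments. */
static void callback(const struct node_info *ninfo, int clvmd_running)
{
    (void)ninfo;
    (void)clvmd_running;
    callbacks_run++;
}

/*
 * Same decision logic as the patched loop: nodes that are down are
 * skipped by the callback, but any node that is not in NODE_CLVMD
 * state (including one that is simply down) turns the result into -1.
 */
static int do_node_callback(const struct node_info *nodes, size_t count)
{
    int somedown = 0;

    for (size_t i = 0; i < count; i++) {
        if (nodes[i].state != NODE_DOWN)
            callback(&nodes[i], nodes[i].state == NODE_CLVMD);
        if (nodes[i].state != NODE_CLVMD)
            somedown = -1;
    }
    return somedown;
}
```

Called with sys3 in NODE_CLVMD state and a hypothetical sys4 in NODE_DOWN state, do_node_callback() runs the callback once (for sys3) and still returns -1. If the membership list keeps a cleanly stopped node around in a non-CLVMD state, every request would fail this way, which I assume is what shows up as "Host is down".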
clvmd-corosync.c was copied from clvmd-openais.c, which in turn came
from clvmd-gulm.c, so it carries the same logic.
I'd suggest removing this patch from both clvmd-corosync and clvmd-gulm.
Any comments?
Thanks.
