Hi,
Sorry for my bad English, I am French.
I am using heartbeat to run a three-node cluster with a VIP.
I have a big problem. In bcast mode, heartbeat brings the
network down.
I switched to the ucast method to solve the problem, and it
works.
But if the first server fails, the VIP moves to the second
one, and the network becomes slow
for the client application.
authkeys :
auth 1
1 sha1 net-cluster
2 md5 "cluster"
3 crc
ha.cf
use_logd on
#debugfile /var/log/ha-debug
#logfile /var/log/ha-log
keepalive 1
warntime 2
deadtime 3
initdead 60
#bcast eth1
ucast eth1 192.168.1.1
ucast eth1 192.168.1.2
ucast eth1 192.168.1.3
udpport 694
node SVR01
node SVR02
node SVR03
auto_failback on
crm on
I had a power failure yesterday, and this problem (slow
network) has appeared since then. Some logs:
d22312/HBWRITE]
heartbeat[16404]: 2008/07/23_15:11:16 info: cl_malloc stats: 446/5809982 42328/20020 [pid22312/HBWRITE]
heartbeat[16404]: 2008/07/23_15:11:16 info: RealMalloc stats: 50916 total malloc bytes. pid [22312/HBWRITE]
heartbeat[16404]: 2008/07/23_15:11:16 info: Current arena value: 0
heartbeat[16404]: 2008/07/23_15:11:16 info: MSG stats: 0/0 ms age 1314784650 [pid22313/HBREAD]
heartbeat[16404]: 2008/07/23_15:11:16 info: cl_malloc stats: 447/22018891 42412/20064 [pid22313/HBREAD]
heartbeat[16404]: 2008/07/23_15:11:16 info: RealMalloc stats: 50584 total malloc bytes. pid [22313/HBREAD]
heartbeat[16404]: 2008/07/23_15:11:16 info: Current arena value: 0
heartbeat[16404]: 2008/07/23_15:11:16 info: These are nothing to worry about.
heartbeat[16404]: 2008/07/23_15:13:23 WARN: 2 lost packet(s) for [svr03] [4233315:4233318]
heartbeat[16404]: 2008/07/23_15:13:23 WARN: Late heartbeat: Node svr03: interval 3000 ms
heartbeat[16404]: 2008/07/23_15:13:23 info: No pkts missing from svr03!
heartbeat[16404]: 2008/07/23_15:13:24 WARN: node svr02: is dead
heartbeat[16404]: 2008/07/23_15:13:24 info: Link svr02:eth1 dead.
heartbeat[16404]: 2008/07/23_15:13:24 CRIT: Cluster node svr02 returning after partition.
heartbeat[16404]: 2008/07/23_15:13:24 info: For information on cluster partitions, See URL: http://linux-ha.org/SplitBrain
heartbeat[16404]: 2008/07/23_15:13:24 WARN: Deadtime value may be too small.
heartbeat[16404]: 2008/07/23_15:13:24 info: See FAQ for information on tuning deadtime.
heartbeat[16404]: 2008/07/23_15:13:24 info: URL: http://linux-ha.org/FAQ#heavy_load
heartbeat[16404]: 2008/07/23_15:13:24 info: Link svr02:eth1 up.
heartbeat[16404]: 2008/07/23_15:13:24 WARN: Late heartbeat: Node svr02: interval 4000 ms
heartbeat[16404]: 2008/07/23_15:13:24 info: Status update for node svr02: status active
heartbeat[16404]: 2008/07/23_15:13:26 WARN: 1 lost packet(s) for [svr03] [4233322:4233324]
heartbeat[16404]: 2008/07/23_15:13:26 info: No pkts missing from svr03!
heartbeat[16404]: 2008/07/23_16:06:24 WARN: 2 lost packet(s) for [svr03] [4236562:4236565]
heartbeat[16404]: 2008/07/23_16:06:24 WARN: Late heartbeat: Node svr03: interval 3000 ms
heartbeat[16404]: 2008/07/23_16:06:24 info: No pkts missing from svr03!
heartbeat[16404]: 2008/07/23_16:10:49 WARN: 1 lost packet(s) for [svr02] [4236937:4236939]
heartbeat[16404]: 2008/07/23_16:10:49 WARN: 1 lost packet(s) for [svr03] [4236828:4236830]
heartbeat[16404]: 2008/07/23_16:10:49 info: No pkts missing from svr02!
heartbeat[16404]: 2008/07/23_16:10:50 info: No pkts missing from svr03!
heartbeat[16404]: 2008/07/23_16:10:57 WARN: 1 lost packet(s) for [svr03] [4236835:4236837]
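
To compare the "Late heartbeat" warnings with the timers in my ha.cf, here is a small Python sketch. It is not part of heartbeat: the function name late_heartbeats and the SAMPLE text are only for illustration, and SAMPLE holds unwrapped copies of three warning lines from the logs above. It assumes a node only counts as past the deadtime when the interval is strictly greater than deadtime (3 s here), which seems to match the log, where svr02 at 4000 ms was declared dead and svr03 at 3000 ms was not.

```python
import re

# Timer value copied from the ha.cf above, in seconds.
DEADTIME_S = 3

# Matches e.g. "WARN: Late heartbeat: Node svr02: interval 4000 ms".
LATE_RE = re.compile(r"Late heartbeat: Node (\S+): interval (\d+) ms")


def late_heartbeats(log_text):
    """Return (node, interval_ms, exceeds_deadtime) for each late heartbeat."""
    results = []
    for node, interval in LATE_RE.findall(log_text):
        ms = int(interval)
        # Assumption: "exceeds" means strictly greater than deadtime.
        results.append((node, ms, ms > DEADTIME_S * 1000))
    return results


# Three warning lines from the logs above, each on a single line.
SAMPLE = """\
heartbeat[16404]: 2008/07/23_15:13:23 WARN: Late heartbeat: Node svr03: interval 3000 ms
heartbeat[16404]: 2008/07/23_15:13:24 WARN: Late heartbeat: Node svr02: interval 4000 ms
heartbeat[16404]: 2008/07/23_16:06:24 WARN: Late heartbeat: Node svr03: interval 3000 ms
"""

for node, ms, over in late_heartbeats(SAMPLE):
    print(node, ms, "exceeds deadtime" if over else "within deadtime")
```

With keepalive 1 and deadtime 3, an interval of 3000 ms or more means at least two heartbeats in a row were late or lost, so the margin before a node is declared dead is very small.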
How should I resolve the problem?
Do I need specific equipment to use the multicast method?
---
Reza ISSANY
Ingénieur Systèmes
ZA Les Playes - Jean Monnet Sud
Avenue de Lisbonne
83500 La Seyne sur Mer
Mail : issanyr olympecti.fr <mailto:issanyr olympecti.fr>
_______________________________________________
Linux-HA mailing list
Linux-HA lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems