I’m running Keepalived on our loadbalancers for many years and I’m really happy with it. Today I run into an issue that took me some time to solve. I thought I’d share it 🙂
In my current setup I have a pair of Debian Squeeze boxes running version 1.1.20. Since I’m rolling out ipv6 at the moment, I need to upgrade to 1.2.2. Fortunately, Debian provides this version in Squeeze-Backports.
So, I decided to upgrade the Backup loadbalancer first (lb-1). It was a simple ‘apt-get’ procedure to get it installed. But soon these errors popped up in my syslog:
Jun 16 21:12:25 lb-1 Keepalived_vrrp: bogus VRRP packet received on bond1 !!! Jun 16 21:12:25 lb-1 Keepalived_vrrp: VRRP_Instance(CLOUD_MGT_GW) ignoring received advertisment... Jun 16 21:12:25 lb-1 Keepalived_vrrp: receive an invalid passwd!
No messages were to be found in the primary loadbalancer, lb-0. The two loadbalancers weren’t talking to each other any more. I did’t try a failover, as this apparently wouldn’t work. To be sure, I stopped keepalived on the lb-1.
Using tcpdump I found the problem: version 1.1.20 uses a password in its broadcast advertisement that was truncated to the first 7 characters, while the version 1.2.2 uses the full length password, as configured in /etc/keepalived/keepalived.conf. This of course did not match, and so they refused to talk to each other.
The solution was simple: I changed the password in the version 1.2.2 loadbalancer, to be 7 characters long. Then restarted keepalived, and all was working again. After upgrading both loadbalancers, I changed back the password to the longer version and since the versions are now both 1.2.2 it still worked 🙂