Archive for the ‘Computers’ Category

Simple IP Failover

Thursday, August 13th, 2009

At work, we have a few virtual machines that are part of some sort of cluster.  Some are active/active and some are active/passive.  Some are load balancers and some are webservers.  Using clustering or IP failover for high availability is great: it's much easier to update nodes one at a time without having to schedule downtime or cause noticeable impact to the end user.  In the past I've been using the Linux-HA software.  It's full-featured but very complicated.

Recently, I've been working on moving one of our services from Redhat Enterprise Linux based virtual machines to Ubuntu Linux.  Redhat has always served us pretty well, but some of the project requirements included newer versions of software than what was available in the latest distribution.  These requirements were met with the latest long term release of Ubuntu, which is now over a year old.  I appreciate the timely release schedule that Ubuntu uses as well as the inclusion of the latest versions of various packages.  But, I'm getting off topic here.  I was utilizing the Linux-HA software with Redhat to run an active/active cluster.  Each cluster node handled HTTP requests directly (one hostname had a couple of IP addresses associated).  This worked well, but the Linux-HA software wasn't fun to manage.  I didn't use any front end tools, just edited the XML files and loaded them.  My other complaint was that the requests were not properly balanced over all the nodes using the DNS round robin approach.

So the new implementation now has redundant backend workers (running Tomcat), with a single Apache load balancer on the front end.  The Apache load balancer works as a reverse proxy and gracefully handles conditions where workers stop responding.  The load is appropriately dispersed between the workers and I am extremely pleased with the results.

But, there is one problem.  The Apache load balancer isn't highly available.  I didn't want to set up the Linux-HA software again, so I started looking around for a simpler solution (think KISS).  I soon found this blog post and it was exactly what I was looking for.

After reading the article, I decided that I would like to write a Perl script that would use lock files and daemonize instead of a shell script.  I had just done another script that did that and was very happy with how it worked.  After putting together the script and doing some testing, I decided I needed two scripts: one for the active node and one for the standby node.  The basic idea is that the standby node checks the active node to see that it is up and running.  If it detects a failure, it brings up the service IP address, sends the ARP packet, and restarts the Apache daemon (so it sees all IP addresses).  When the standby node detects the primary is back on the network, it shuts down the service IP address.  The primary node checks the default gateway to determine whether it is up on the network.  If it detects a failure, it shuts down the service IP address.  When it regains network connectivity, it adds the service IP, sends the ARP packet, and restarts the Apache service.
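
The standby node's decision logic described above can be sketched roughly as follows. This is only an illustration, not the script from this post: the function name, the action names, and the threshold of two consecutive missed pings are all assumptions, and the real work (ifconfig, send_arp, restarting Apache) is left out.

```shell
#!/bin/sh
# Hypothetical sketch of the standby node's decision logic.
# THRESHOLD and the action names are assumptions made for illustration.
THRESHOLD=2

# next_state MISSED HOLDING PING_OK
#   MISSED  - consecutive missed pings so far
#   HOLDING - 1 if this node currently holds the service IP, else 0
#   PING_OK - 1 if the latest ping to the active node succeeded, else 0
# Prints "<new_missed> <new_holding> <action>"; action is takeover, release or none.
next_state() {
    missed=$1 holding=$2 ping_ok=$3 action=none
    if [ "$ping_ok" -eq 1 ]; then
        missed=0
        if [ "$holding" -eq 1 ]; then
            holding=0
            action=release     # active node is back: drop the service IP
        fi
    else
        missed=$((missed + 1))
        if [ "$missed" -ge "$THRESHOLD" ] && [ "$holding" -eq 0 ]; then
            holding=1
            action=takeover    # bring up the IP, send the ARP packet, restart Apache
        fi
    fi
    echo "$missed $holding $action"
}

# Replay a made-up sequence of ping results: one success, three misses, recovery.
state="0 0 none"
for result in 1 0 0 0 1; do
    set -- $state
    state=$(next_state "$1" "$2" "$result")
    echo "ping_ok=$result -> $state"
done
```

Tracking whether the node already holds the IP keeps the takeover from being repeated on every loop while the active node stays down.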

So the two scripts are below.  The first runs on the primary and the second runs on the standby node.  I hope to use these with few modifications for all our applications.  Some have multiple IP addresses for IP based virtualhosting (SSL sites).  I also installed the fake package to utilize the send_arp program. I will probably need to make some more revisions, but I thought this might be helpful to other people out there trying to accomplish the same thing. These scripts come "as-is" with no warranty whatsoever. Feel free to do what you'd like with them. If anyone has better solutions, feel free to post a comment!

EDIT (19-August-2009): I had some issues with a race condition in one of the scripts. The script that brought the IP address up on the primary node had an issue with the ping counter. I decided to just run one script on the standby node to simplify things. On the primary, I scheduled a cron job to run the send_arp command every minute. This sends an ARP packet to update the tables on the router whenever the primary is online. I've also made some slight modifications to the script that runs on the backup host.
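
For illustration, a cron job along those lines might look something like the sketch below. The wrapper's path and its "run" argument are assumptions, and the IP and MAC are the placeholder values used in the script; the send_arp argument order is the one the script uses.

```shell
#!/bin/sh
# Hypothetical wrapper for the primary node's cron job (a sketch only).
# The wrapper's path below is an assumption.
#
# Example crontab entry, run every minute:
#   * * * * * /usr/local/sbin/announce-service-ip.sh run

SEND_ARP=/usr/sbin/send_arp
SERVICE_IP=192.168.1.200
THIS_MAC=00:11:22:33:44:55
BROADCAST_MAC=ff:ff:ff:ff:ff:ff

# Print the ARP announcement command, using the same argument order as the
# failover script: source IP, source MAC, target IP, target MAC.
arp_command() {
    echo "$SEND_ARP $SERVICE_IP $THIS_MAC $SERVICE_IP $BROADCAST_MAC"
}

# Only send the packet when called with "run"; otherwise just show the command.
if [ "${1:-}" = "run" ]; then
    $(arp_command)
else
    arp_command
fi
```

Running the wrapper without arguments prints the command, which is a cheap way to check the values before putting it in the crontab.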

EDIT (28-March-2010): I've taken down my git repository and I'm including the script below.

#!/usr/bin/perl -w
#
# ipfaild.pl
#
# Daemon to handle IP failover and restart necessary services.
#
# 11-Aug-2009 - Patrick Hennessy
#
use strict;
use Sys::Syslog;
use POSIX qw(setsid);
use Fcntl ':flock';
use Net::Ping::External qw(ping);

# Vars
#
my $pid;
my $progname = "ipfaild";
my $daemon_pidfile = "/var/run/$progname/$progname.pid";
my $daemon_lockfile = "/var/run/$progname/$progname.lock";
my $log_facility = "LOG_DAEMON";
my $ifconfig = '/sbin/ifconfig';
my $send_arp = '/usr/sbin/send_arp';
my $apache2ctl = '/usr/sbin/apache2ctl';
my $ping_timeout = 1;
my $sleep_time = 2;
my $missed = 0;
my $ipRec;

my $otherHost = 'otherhost.domain.com';
my $thisMAC = '00:11:22:33:44:55';

my @ipRecords = (
        { name=> 'fooservice', pubip => '192.168.1.200', dev => 'eth0:10', mask => '255.255.255.0' },
);

# Subroutines
#
sub daemonize;

# Daemonize process
#
daemonize;

# Acquire exclusive lock
#
open LOCKFILE, ">$daemon_lockfile" or die "$progname: can't write to $daemon_lockfile: $!\n";
flock(LOCKFILE, LOCK_EX | LOCK_NB) or die "$progname: can't acquire lock: $daemon_lockfile: $!\n";
# $$ is the PID of this (daemonized) process; the file-scoped $pid is never set.
print LOCKFILE "$$\n";

# Open syslog.
#
openlog($progname, "pid", $log_facility);

# Signal handlers
#
my $keep_processing = 1;
$SIG{HUP}  = sub { syslog("info", "Caught SIGHUP:  exiting gracefully"); $keep_processing = 0; };
$SIG{INT}  = sub { syslog("info", "Caught SIGINT:  exiting gracefully"); $keep_processing = 0; };
$SIG{QUIT}  = sub { syslog("info", "Caught SIGQUIT:  exiting gracefully"); $keep_processing = 0; };
$SIG{TERM}  = sub { syslog("info", "Caught SIGTERM:  exiting gracefully"); $keep_processing = 0; };

# Bring down interfaces.
#
for $ipRec (@ipRecords) {
        syslog("info", "Running: $ifconfig $ipRec->{'dev'} down");
        system($ifconfig, $ipRec->{'dev'}, 'down') == 0
                or syslog("info", "Error: $? Could not run: $ifconfig $ipRec->{'dev'} down");
}

# Main loop
#
while ($keep_processing) {
        # Ping the other host and count dropped packets
        #
        if (! ping(host => $otherHost, timeout => $ping_timeout)) {
                $missed++;
        } else {
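                # Use >= 2 so the IPs are also released when the other host
                # recovers right after takeover (when $missed is exactly 2);
                # otherwise the takeover block below would run on every loop.
                if ($missed >= 2) {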
                        for $ipRec (@ipRecords) {
                                syslog("info", "Running: $ifconfig $ipRec->{'dev'} down");
                                system($ifconfig, $ipRec->{'dev'}, 'down') == 0
                                        or syslog("info", "Error: $? Could not run: $ifconfig $ipRec->{'dev'} down");
                        }
                        $missed = 0;
                }
        }

        # Bring up IP addresses if packets dropped
        #
        if ($missed == 2) {
                for $ipRec (@ipRecords) {
                        syslog("info", "Running: $ifconfig $ipRec->{'dev'} $ipRec->{'pubip'} netmask $ipRec->{'mask'}");
                        system($ifconfig, $ipRec->{'dev'}, $ipRec->{'pubip'}, 'netmask', $ipRec->{'mask'}) == 0
                                or syslog("info", "Error: $? Could not run: $ifconfig $ipRec->{'dev'} $ipRec->{'pubip'} netmask $ipRec->{'mask'}");
                        syslog("info", "Running: $send_arp $ipRec->{'pubip'} $thisMAC $ipRec->{'pubip'} ff:ff:ff:ff:ff:ff");
                        system($send_arp, $ipRec->{'pubip'}, $thisMAC, $ipRec->{'pubip'}, 'ff:ff:ff:ff:ff:ff') == 0
                                or syslog("info", "Error: $? Could not run: $send_arp $ipRec->{'pubip'} $thisMAC $ipRec->{'pubip'} ff:ff:ff:ff:ff:ff");
                }
                syslog("info", "Running: $apache2ctl restart");
                system($apache2ctl, 'restart') == 0
                        or syslog("info", "Error: $? Could not run: $apache2ctl restart");
        }

        # Sleep
        #
        sleep($sleep_time);
}

# Bring down interfaces.
#
for $ipRec (@ipRecords) {
        syslog("info", "Running: $ifconfig $ipRec->{'dev'} down");
        system($ifconfig, $ipRec->{'dev'}, 'down') == 0
                or syslog("info", "Error: $? Could not run: $ifconfig $ipRec->{'dev'} down");
}

# Close syslog.
#
closelog();

# Close lockfile.
close (LOCKFILE);

# Exit.
#
exit(0);

# Functions
#
sub daemonize {
        open STDIN, '/dev/null' or die "$progname: can't read /dev/null: $!";
        open STDOUT, '>/dev/null' or die "$progname: can't write to /dev/null: $!";
        defined(my $pid = fork) or die "$progname: can't fork: $!";
        if($pid) {
                # parent
                open PIDFILE, ">$daemon_pidfile" or die "$progname: can't write to $daemon_pidfile: $!\n";
                print PIDFILE "$pid\n";
                close(PIDFILE);
                exit;
        }
        # child
        setsid or die "$progname: can't start a new session: $!";
        open STDERR, '>&STDOUT' or die "$progname: can't dup stdout: $!";
}

Linode Dynamic DNS Ash Script

Monday, May 11th, 2009

After my post last night regarding a bash script to update Linode's DNS Manager, opello from #linode on OFTC provided some sed commands to parse the JSON output. Using his commands, I made another version of the script.  I've tested this on my router running the Tomato firmware (which runs BusyBox).

I found that the wget command built into this version of BusyBox does not support HTTP POST. It also does not support HTTPS URLs, only HTTP. This means your Linode API key would be transmitted in clear text, which probably isn't a good thing.
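
One way to guard against sending the key in clear text is to check the URL scheme before making any request. The sketch below assumes a hypothetical helper named require_https; it is not part of the script in this post.

```shell
#!/bin/sh
# Sketch: refuse to send an API key unless the endpoint URL uses HTTPS.
# require_https is a hypothetical helper, not part of the original script.
require_https() {
    case "$1" in
        https://*) return 0 ;;
        *) echo "refusing to send API key over insecure URL: $1" >&2
           return 1 ;;
    esac
}

API_URL="http://api.linode.com/api/"
if require_https "$API_URL"; then
    echo "ok to call $API_URL"
else
    echo "skipping update"
fi
```

The case pattern only inspects the start of the URL, so it works in ash without any extra tools.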

Another solution that was suggested was to simply wget a CGI script running on your webserver, which could update the Linode DNS Manager using perl or python over secure channels. That would reduce the complexity on the home router side and allow you to use the developed Linode API libraries.

Therefore, I wouldn't recommend using this unless you are able to send the requests over SSL channels. I am glad to have a slightly better understanding of sed. I'll probably modify the original bash script to use that as well.
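
As a small illustration of the sed approach, the sketch below pulls a numeric id out of a single line of JSON. The sample response is made up for the example and the real API output may differ; extract_domain_id is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: extract a numeric id from a single line of JSON with sed.
# The sample response below is made up; the real API output may differ.
DOMAIN="domain.com"
SAMPLE='{"DATA":[{"DOMAIN":"domain.com","DOMAINID":12345,"STATUS":1}]}'

# -n suppresses the default output, -r enables extended regular expressions,
# and the trailing p prints only lines where the substitution matched.
extract_domain_id() {
    sed -nr "s#.*\"DOMAIN\":\"$DOMAIN\",\"DOMAINID\":([0-9]+),.*#\1#p"
}

echo "$SAMPLE" | extract_domain_id
```

Using # as the delimiter avoids having to escape slashes, and the escaped double quotes keep the JSON quotes inside the shell string.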

#!/bin/ash
#
# Script to update Linode's DNS Manager for a given name.
#

# Things you need to change.
APIKEY=$(cat /home/root/linode-apikey)
LASTIP="/tmp/lastip"
DOMAIN="domain.com"
SOAEMAIL="hostmaster@domain.com"
STATUS="1"
RRTYPE="A"
RRNAME="home"
IFACE="vlan1"

# Shouldn't need to change anything below here.

WGET="wget -qO - http://api.linode.com/api/?api_key=$APIKEY"
NEWIP=$(ifconfig $IFACE | head -n2 | tail -n1 | cut -d: -f2 | cut -d' ' -f1)
test -e $LASTIP && OLDIP=$(cat $LASTIP) || OLDIP=""

if [ x"$OLDIP" = x"$NEWIP" ]; then
  logger "No IP address change detected. Keeping $NEWIP"
else
   DOMAINID=$($WGET"&action=domainList" |
        sed -nr "s#.*\"DOMAIN\":\"$DOMAIN\",\"DOMAINID\":([0-9]+),.*#\1#p")
   RESOURCEID=$($WGET"&action=domainResourceList&DomainID=$DOMAINID" |
        sed -nr "s#.*\"RESOURCEID\":([0-9]+),\"DOMAINID\":$DOMAINID,\"TYPE\":\"$RRTYPE\",\"NAME\":\"$RRNAME\".*#\1#p")
   $WGET"&action=domainResourceSave&ResourceID=$RESOURCEID&DomainID=$DOMAINID&Name=$RRNAME&Type=$RRTYPE&Target=$NEWIP"; echo
   $WGET"&action=domainSave&DomainID=$DOMAINID&Domain=$DOMAIN&Type=master&Status=$STATUS&SOA_Email=$SOAEMAIL"; echo
   echo $NEWIP > $LASTIP
   logger "Updated IP address to $NEWIP"
fi

Linode Dynamic DNS Bash Script

Monday, May 11th, 2009

So Mark Walling was working on an ash script for his router running OpenWRT to update Linode's DNS Manager with his IP address. I liked the idea of a simple shell script to update without needing to install libraries for Perl or Python.  I took his script and tried to adapt it to get the DOMAINID and RESOURCEID using sed or awk.  Those utilities seem great for manipulating multiline files of text, but I wasn't getting anywhere trying to parse one line of JSON from wget.  So I used Perl to extract the id numbers.  I believe OpenWRT has some sort of Perl with limited functionality, so maybe this will work with that or at least be easily adapted.  This could also be easily modified to use curl instead of wget. I suspect someone out there will find this useful.

#!/bin/bash
#
# Script to update Linode's DNS Manager for a given name.
#

# Things you need to change.
APIKEY=$(cat ~/.linode-apikey)
LASTIP="/tmp/lastip"
DOMAIN="domain.com"
SOAEMAIL="hostmaster@domain.com"
STATUS="1"
RRTYPE="A"
RRNAME="home"
IFACE="eth0"

# Shouldn't need to change anything below here.

WGET="wget -qO - https://api.linode.com/api/"
NEWIP=$(ifconfig $IFACE | head -n2 | tail -n1 | cut -d: -f2 | cut -d' ' -f1)
test -e $LASTIP && OLDIP=$(cat $LASTIP) || OLDIP=""

if [ x"$OLDIP" = x"$NEWIP" ]; then
  logger "No IP address change detected. Keeping $NEWIP"
else
   DOMAINID=$($WGET --post-data "api_key=$APIKEY&action=domainList" |
        perl -e 'if (<STDIN> =~ /"DOMAIN":"'"$DOMAIN"'","DOMAINID":([0-9]+),/) { print $1; }')
   RESOURCEID=$($WGET --post-data "api_key=$APIKEY&action=domainResourceList&DomainID=$DOMAINID" |
        perl -e 'if (<STDIN> =~ /"RESOURCEID":([0-9]+),"DOMAINID":'"$DOMAINID"',"TYPE":"'"$RRTYPE"'","NAME":"'"$RRNAME"'"/) { print $1; }')
   $WGET --post-data "api_key=$APIKEY&action=domainResourceSave&ResourceID=$RESOURCEID&DomainID=$DOMAINID&Name=$RRNAME&Type=$RRTYPE&Target=$NEWIP"; echo
   $WGET --post-data "api_key=$APIKEY&action=domainSave&DomainID=$DOMAINID&Domain=$DOMAIN&Type=master&Status=$STATUS&SOA_Email=$SOAEMAIL"; echo
   echo $NEWIP > $LASTIP
   logger "Updated IP address to $NEWIP"
fi
