Archives For March 2012

Today I came across a nice post by Todd Werth, who created a great theme for the OS X Terminal program called IR_Black.

All you have to do is download his theme and open it in Terminal. Then tweak the colors a bit to fully meet your needs.

Last step: enable colors in your profile:

vim ~/.bash_profile

Add this line:

export CLICOLOR=1;
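
If you prefer not to edit the file by hand, the line can also be appended from the shell. This is only a minimal sketch: the function name enable_clicolor is made up for illustration, and it assumes your shell reads ~/.bash_profile.

```shell
# enable_clicolor is a hypothetical helper name, not a standard command.
# It appends "export CLICOLOR=1" to a profile file, but only if no
# CLICOLOR export is present yet, so it is safe to run more than once.
enable_clicolor() {
    profile="${1:-$HOME/.bash_profile}"
    if ! grep -qs '^export CLICOLOR=' "$profile"; then
        echo 'export CLICOLOR=1' >> "$profile"
    fi
}
```

Call enable_clicolor without arguments to update ~/.bash_profile, then open a new Terminal window to pick up the change.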

When you open a new window and enter a simple ‘ls’ command, it looks like this:

So that is pretty cool 🙂 Thanks Todd!

I found that these settings caused some trouble when working in vim and nano. Changing the terminal type from “xterm-256color” to “xterm-color” fixed that for me.

Networking in CloudStack 3.0 is awesome; the Virtual Router provides many cool features like LoadBalancing, PortForwarding, (s)NAT, DHCP, VPN and so on. When a new network is created and used, a Virtual Router is automatically launched to support these features. Since the Virtual Router is a Single Point of Failure, you should turn on the HA (High Availability) option, which adds a second Virtual Router to each network. While this is pretty cool, it makes the number of System VMs go up, and when you don’t need them it’s a waste of resources.

For example, when I was creating a network for the web servers to talk privately to the database, I didn’t need a Virtual Router. All I wanted was for them to be able to talk to each other, nothing more. With the default settings, a Virtual Router is launched anyway.

So how do you tell CloudStack you don’t need a Virtual Router? This is done through Service Offerings, the last option in the menu on the left. Select Network Offerings and a list is displayed.

Click Add Network Offering at the right and fill in the form. When you do not select any service, you create a Network Offering for which CloudStack does not spin up Virtual Routers.

Now, when you create a new Guest Network, make sure to select the Network Offering you just created. This will make sure your new Guest network will have no Virtual Router launched when in use 🙂

Update: I’ve written another blog post with more details on how to use this network. Also have a look at the comments on both posts for some examples and ideas. Feel free to ask me any questions you have below!

We still have some older hardware running for non-critical and lightweight operations. Since upgrading these old boxes to Debian Squeeze, weird things started to happen. After some time, the system clock stops working, which causes processes to crash and prevents new ones from starting. Of course, nothing works properly on any computer without a working system clock. Here is what that looks like:

My first thought was to replace the CMOS battery, but that didn’t help, and the problem occurred on multiple machines around the same time. Rebooting solves the problem for a while, but 24-36 hours later the system clock stops again. Sometimes the problem stays away much longer. Also, some servers with the same hardware do not seem to have the problem. It seems hardware related. But is that really true?

Since the server crashes it is kinda hard to figure out what exactly happens. Logs aren’t properly written anymore and system processes are unreliable. Sometimes it is possible to access the server via SSH, sometimes it isn’t.

Today this happened again with one of our servers. Fortunately I was able to SSH into the machine and I found this warning in the logs:

Mar  14 13:00:05 server kernel: [97383.660485] Clocksource tsc unstable (delta = 4686838547 ns)
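
As long as you can still log in, it helps to search the kernel log for this message. A small sketch of such a search follows. The function name is made up, and the default path /var/log/kern.log is an assumption about a Debian-style syslog setup, so adjust it to wherever your kernel messages end up.

```shell
# find_unstable_clocksource is a hypothetical helper name.
# It prints every "Clocksource ... unstable" kernel message found in the
# given log file; the default path assumes a Debian-style /var/log/kern.log.
find_unstable_clocksource() {
    logfile="${1:-/var/log/kern.log}"
    grep -E 'Clocksource [[:alnum:]_]+ unstable' "$logfile"
}
```

Running dmesg | grep -i clocksource shows the same kind of message for the current boot.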

TSC is the Time Stamp Counter. Processors change their clock speed dynamically (to save power, of course). The TSC is supposed to tick at the CPU rate, so when the frequency changes a warning like this is to be expected, and the kernel automatically switches to another clock source. That is how it works on modern hardware, but this old CPU reports constant_tsc (it’s one of the flags shown by cat /proc/cpuinfo), so its TSC is supposed to run at the same frequency at all times.
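
To check for the constant_tsc flag without reading through the whole cpuinfo output, something like the following works. It is a sketch only: has_cpu_flag is a made-up name, and the optional second argument exists just so the function can be pointed at a saved copy of cpuinfo.

```shell
# has_cpu_flag is a hypothetical helper name.
# It succeeds (exit status 0) when the given flag appears as a whole word
# on a "flags" line of /proc/cpuinfo (or of the file passed as 2nd argument).
has_cpu_flag() {
    flag="$1"
    cpuinfo="${2:-/proc/cpuinfo}"
    grep '^flags' "$cpuinfo" | grep -qw -- "$flag"
}
```

For example, has_cpu_flag constant_tsc && echo "constant_tsc is set" prints the message only on CPUs that report the flag.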

The system might have multiple available clock sources. You can list the available clock sources in a system like this:

cat /sys/devices/system/clocksource/clocksource0/available_clocksource

On my system this returns:

tsc acpi_pm jiffies

To see which one is actually in use, issue:

cat /sys/devices/system/clocksource/clocksource0/current_clocksource

On my system this returns:

tsc

So this tells me I’m using ‘tsc’ as a Clock Source, and my system has two more options. Since I’m having trouble with ‘tsc’, let’s change the Clock Source from ‘tsc’ to ‘acpi_pm’.

echo acpi_pm > /sys/devices/system/clocksource/clocksource0/current_clocksource

Verify the new setting like this:

cat /sys/devices/system/clocksource/clocksource0/current_clocksource

It should now return ‘acpi_pm’.

This can be done at run time, but the setting is lost when you reboot, which makes it a great way to test. To make it permanent, add the following to the kernel boot parameters in /boot/grub/menu.lst:
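
The check, switch and verify steps above can be combined into one small function. This is a sketch under a few assumptions: the name set_clocksource is made up, the optional second argument is only there to point it at a different directory, and on a real system it has to run as root.

```shell
# set_clocksource is a hypothetical helper name. It switches the clock
# source only if the kernel lists the requested one as available, then
# reads the value back to confirm. Needs root on a real system.
set_clocksource() {
    wanted="$1"
    dir="${2:-/sys/devices/system/clocksource/clocksource0}"
    # -w matches whole words, so a partial name is not accepted.
    if ! grep -qw -- "$wanted" "$dir/available_clocksource"; then
        echo "clock source '$wanted' is not available" >&2
        return 1
    fi
    echo "$wanted" > "$dir/current_clocksource" || return 1
    # Succeed only if the kernel really reports the new clock source.
    [ "$(cat "$dir/current_clocksource")" = "$wanted" ]
}
```

Calling set_clocksource acpi_pm then either switches and confirms, or refuses with a message when the name is not in the available list.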

notsc clocksource=acpi_pm
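
After the next reboot you can confirm that the kernel was really started with these parameters by looking at /proc/cmdline. A minimal sketch follows; booted_with is a made-up name, and the second argument is optional and only useful for checking a saved copy of the command line.

```shell
# booted_with is a hypothetical helper name. It succeeds when the given
# parameter appears as a separate word on the kernel command line.
booted_with() {
    param="$1"
    cmdline="${2:-/proc/cmdline}"
    # Split the command line into one parameter per line, then match exactly.
    tr ' ' '\n' < "$cmdline" | grep -qxF -- "$param"
}
```

For example, booted_with notsc && booted_with clocksource=acpi_pm succeeds only when both parameters were passed at boot.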

I’ve just changed this on the server and I’m really curious whether this will be the solution to prevent the hangs! At least the time is still ticking.. 😉

If anyone has more suggestions or info, please let me know! I’ll keep you posted.

Update: Just reached 4 days of uptime!

Update 2: Today it’s 4 months after I wrote this blog. The machine is still up & running and has reached 106 days of uptime!

Update 3: After 7 months (222 days of uptime) I finally retired the machine since we migrated to our CloudStack cloud. The fix described above really works 🙂

In a previous post I described how to restore an OpenLDAP server from backup. But how do you back up OpenLDAP?

The backups I make consist of two parts:

1. First, back up the LDAP database itself using a program called ‘slapcat’. Slapcat is used to generate LDAP Directory Interchange Format (LDIF) output based upon the contents of a given LDAP database. This is a text version of your database which can be imported later. Think of it as the equivalent of a SQL dump for relational databases. Anyway, here’s how to run slapcat on the OpenLDAP server:

slapcat -l backup.ldif

This will back up the whole database into the file called ‘backup.ldif’. You can then use this file to restore an OpenLDAP server later, using slapadd. Be sure to run this in a backup script from crontab and make a backup at least once per day.

2. The second thing I do is back up the config of the OpenLDAP server. This config is usually in /etc/ldap. Back it up using tar, or using a tool like rsnapshot.

When you have this in place (and store the backups in a different location), you’ll be able to rebuild an OpenLDAP server without problems.
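
Both parts can be combined in one small backup function. This is a sketch rather than the exact script I run: the name backup_openldap, the dated file names and the target directory are made up for illustration, and it assumes slapcat is in the PATH and the config lives in /etc/ldap.

```shell
# backup_openldap is a hypothetical helper name. It writes a dated LDIF
# dump and a dated tarball of the config directory into the target directory.
backup_openldap() {
    dest="$1"
    confdir="${2:-/etc/ldap}"
    stamp="$(date +%Y-%m-%d)"
    mkdir -p "$dest" || return 1
    # Part 1: dump the database as LDIF (restorable with slapadd).
    slapcat -l "$dest/ldap-$stamp.ldif" || return 1
    # Part 2: archive the configuration directory.
    tar -czf "$dest/ldap-config-$stamp.tar.gz" \
        -C "$(dirname "$confdir")" "$(basename "$confdir")"
}
```

Saved in a script and called from root’s crontab once a day, this gives you a dated dump and config archive for every day. Copying the target directory to another machine is still up to you.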

Then the code is probably not closing its MySQL connection properly. The time MySQL waits for the next command to be sent can be controlled by the ‘wait_timeout’ parameter. The default is a massive 28800 seconds (8h). Setting this value too low gives ‘MySQL server has gone away’ errors, so be careful.

Debugging scripts that create many ‘sleeping’ connections is challenging. An easy solution is setting the ‘wait_timeout’ parameter to something low after the MySQL connection is started. This setting is then valid only for this current session.

Issue this query after every connect:

SET wait_timeout=60;

Database administrators can permanently change this setting in ‘/etc/mysql/my.cnf’ or by issuing a

SET GLOBAL wait_timeout=60;

to change it during runtime. Don’t forget to edit ‘my.cnf’ also, otherwise the setting is back to default the next time MySQL restarts.
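
To see whether the lower timeout has any effect, it helps to count the sleeping connections before and after the change. The sketch below does that from the shell. The function name count_sleeping is made up, and it assumes the tab-separated batch output of the mysql client, where Command is the fifth column of SHOW PROCESSLIST.

```shell
# count_sleeping is a hypothetical helper name. It reads the tab-separated
# output of "SHOW PROCESSLIST" on stdin and prints how many connections
# have "Sleep" in the Command column (the 5th column).
count_sleeping() {
    # NR > 1 skips the header row; n + 0 prints 0 when nothing matched.
    awk -F '\t' 'NR > 1 && $5 == "Sleep" { n++ } END { print n + 0 }'
}
```

Use it as mysql -B -e 'SHOW PROCESSLIST' | count_sleeping, with whatever login options your setup needs.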