
Back in June, just before I went off on holiday, I attended a CFEngine training in Amsterdam. When I returned from holiday a few weeks later, my team and I started making plans to implement CFEngine in our environment. After two months of hard work, I’m proud to say we manage about 350 out of our 400 Linux servers with CFEngine!

The ride has been fun, although not always easy. In this post I’ll give a quick overview of our CFEngine implementation, where I found useful info, etc.

CFEngine is different
To start, let me tell you that one of the most difficult parts of learning CFEngine is getting used to the terminology and learning to ‘think’ CFEngine. For example, a ‘class’ in CFEngine is not what you think it is. It has nothing to do with object-oriented programming; it’s more like a ‘context’ that you can use to make decisions. There’s no ‘flow control’ in CFEngine either: no IF/THEN/ELSE, no FOR/FOREACH/WHILE, etcetera. In CFEngine, classes are used for decision making, and, since CFEngine is smart, it does looping automatically. This results in clean and easy-to-read code.

CFEngine works on top of a theoretical model called ‘Promise Theory’, by Mark Burgess (the author of CFEngine). This theory models the behavior of autonomous agents in an environment without central authority, based on only promises of behavior made by each agent, and shows that even without central control, the system can converge to a stable state.

To get used to it, read ‘Learning CFEngine 3‘ by Diego Zamboni, as it will walk you through all of it with a lot of examples. The quote above is also from the book.

The basic idea is that each agent makes promises only about its own behavior, since that is all it can control. In CFEngine 3, everything is a promise.

Examples:
– a file promises to have a certain content and to be executable
– a service promises to be running
– a user account promises to exist (or not to exist) and have certain properties

When CFEngine finds that a promise is not kept, it will do everything it knows to make the promise true. If it cannot reach the promised state at once, it tries the next best thing. Over time, the system converges to the desired (promised) state.
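You can watch this convergence happen by running the agent by hand. A quick sketch (these are standard cf-agent flags: -K skips the time-based locks, -I reports what was repaired):

cf-agent -K -I   # run now and print each promise that was not kept and got repaired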

Once you get it and get used to it, it actually makes sense and is pretty easy to implement.

With great power comes great responsibility
This one-liner says it all. When you have a configuration management system that manages a lot of servers, you’d better be careful which promises you have it keep. This is why you need to manage CFEngine promises like software: you need version control, and it needs to be flexible as well. I’ve read a lot about this subject and I believe Git is the way to go. This blog post by Brian Bennett pretty much nails it. I got a lot of inspiration from it, thanks Brian!

I implemented these ‘branches’ in Git:
– development (aka master)
– beta
– pre-production
– production

This works perfectly: we develop new promises in the ‘development’ branch, then merge to the ‘beta’ branch to test on some of our own test servers. When everything works together and seems stable, we merge to the next branch, ‘pre-production’. This is then tested on ~15 real production servers, so it had better be good. But when it isn’t, the impact is still limited and the problem can be fixed before it ever hits ‘production’. The ‘production’ branch contains everything that is stable and is used on all ~350 servers.

Every time we merge to either ‘pre-production’ or ‘production’, we create a Git ‘tag’ with a date, which allows for easy rollbacks. Whenever we need to get back to a certain state, we can always just check out a tag. This is also very useful for audit trails, by the way.
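A release merge plus tag might look like this (a sketch; the date in the tag name is of course just an example):

git checkout production
git merge pre-production
git tag production-20130904        # dated tag, for audit trails and easy rollbacks
git checkout production-20130904   # later: roll back to this known-good state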

Actually, we’re using one more branch, called ‘hotfix’. Whenever there’s an emergency to fix, for example a misbehaving promise, we branch a ‘hotfix’ from ‘production’ and do the fix there. This branch is then merged to ‘production’ when ready, and also to ‘development’. Git handles this nicely: when the hotfix later makes it all the way from ‘development’ to ‘production’ again, Git recognizes that this commit was already processed earlier and ignores it.
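A hotfix round trip could then look something like this sketch:

git checkout -b hotfix production   # branch off 'production'
# ... commit the fix ...
git checkout production && git merge hotfix
git checkout development && git merge hotfix
git branch -d hotfix                # clean up once both merges are done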

Git commits, branches and tags in CFEngine repo

This is a screenshot from ‘Gitk’ that shows the commits, branches (green) and tags (yellow). As you can see, ‘production’ and ‘pre-production’ are at the same level now, so nothing new is being tested in ‘pre-production’ at the moment. Quite a lot of work is being tested in the ‘beta’ branch, and there are already some fixes committed in ‘master’. Recently there was a ‘hotfix’ branch that has now been merged. This should give an idea of how it works. It provides a clear overview, and we now know about every change to the configuration of our servers. Clicking on a commit shows what changed, who did it, etc.
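To get the same view of your own repository, start Gitk with all branches and tags shown:

gitk --all &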

CFEngine Policy hubs
For each of the 4 branches we’ve created a CFEngine policy hub. A policy hub is a server running the CFEngine software that serves the given branch to the agents (the Linux servers connected to it). Linux servers can even switch between them by ‘bootstrapping’ to one of the 4 policy hubs, although we only use that on our test servers.
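Bootstrapping is a single command on the agent side; a sketch with a hypothetical hub address (this is the CFEngine 3.5 syntax):

cf-agent --bootstrap 10.0.0.10   # point this server at, say, the 'beta' policy hub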

Manage what’s ‘in flight’ with a CFEngine Trello board
Trello provides an intuitive and modern web interface that allows you to manage ‘cards’ on different ‘lists’ on a ‘board’. To get an idea, see the example Trello board below.

Trello CFEngine board

New cards are usually created in ‘Feature requests’ or ‘Bugs’ and then moved to ‘Working on it!’. The number of cards in this stage should be limited, as you can only work on so many things at the same time. This is actually Kanban style. Next, we’ve created a list for each Git branch we have, and cards flow from ‘beta’ to ‘pre-production’ and finally ‘production’. Moving cards is just dragging and dropping. Each month, the cards in ‘production’ are archived. This gives an overview of what new work is to be done (‘Feature requests’ and ‘Bugs’), what we’re currently working on, and what’s in each of the branches. Trello has the overview; Git has the code and the details. Trello is also perfect for communication between team members: notes, comments, documents, lists, etcetera can all be created with ease.

Testing promises
To be able to test the promises on our local laptops, we’re using a tool called Vagrant. Vagrant sets up virtual machines (for example using VirtualBox) and allows you to ‘destroy’ and ‘create’ them within minutes. All team members have a local Git checkout, which is also available in the Vagrant boxes. This allows us to test any change before even committing. We have Vagrant boxes set up for all Linux distributions we support. It’s so easy and so fast to test changes that everybody actually does it. And even when an error slips through, other team members will soon notice, and it’s usually fixed within minutes, before it ever hits the ‘beta’ branch.
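A typical test cycle then looks something like the sketch below (the box name ‘centos6’ is just an example):

vagrant destroy -f centos6   # throw away the old VM
vagrant up centos6           # build a fresh one from the base box
vagrant ssh centos6 -c 'sudo cf-agent -K -I'   # run the agent against the local Git checkout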

Bugs
We encountered a strange bug when using SLES 11 and CFEngine 3.5: CFEngine (community edition) ended up running with the ‘SIGPIPE’ signal blocked. When CFEngine restarts SSH, SSH inherits the blocked ‘SIGPIPE’ signal as well. This results in ‘sudo’ no longer working: it would just return nothing at all. It took us quite some time to figure out that it was the ‘SIGPIPE’ signal being blocked. The root cause probably lies in the old ‘Bash’ version (3.51) that SLES uses, combined with something CFEngine triggers. We’ve now implemented an automated work-around (a CFEngine promise) that fixes the problem. We did some nice teamwork on this one!
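For anyone debugging something similar: a process’s blocked signals show up in the ‘SigBlk’ mask in /proc. SIGPIPE is signal 13, so when bit 13 (hex value 0x1000) is set, SIGPIPE is blocked. A diagnostic sketch (the PID is hypothetical):

grep SigBlk /proc/1234/status   # e.g. 'SigBlk: 0000000000001000' means only SIGPIPE is blocked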

Conclusion
CFEngine’s learning curve might be steep, but the result is definitely rewarding. Combined with Git and Trello, it allows for fine-grained control and a great overview of configuration changes. Our whole team is involved in changes; they are reviewed and result in high-quality code. This ultimately makes the Linux servers we manage more stable. Also, it’s a great feeling to be in control and to know what’s going on on our servers.

From this point on, we’ll continue to scale both horizontally (add more servers) and vertically (add more promises). After two months of working with CFEngine daily, I have to say I really like it and I enjoy writing promises.

I’ll keep you posted, I promise 😉

Data security is getting more and more important these days. Imagine you work as a sysadmin on a laptop, and it gets lost. Of course you might lose your work (but you have a backup, right?). The real problem: your sensitive data (your SSH private key, for example) is no longer under your control.

In this post I explain how you can add an encrypted partition to Linux. As long as you also use a password-protected screensaver with a decent password (to protect a running, logged-in laptop), no one can access your data. Even rebooting into single-user mode (to bypass the login screen) won’t help: there is no access to the encrypted disk without the passphrase. Wiping the disk and reinstalling is an option, but your data is not revealed.

LUKS: Linux Unified Key Setup
Linux ships with LUKS, which is short for ‘Linux Unified Key Setup’. It’s a tool and technique to set up encrypted devices. Such a device can be a laptop hard disk, but also a USB pen drive or a virtual disk (when you have a virtualized server). LUKS encryption works on the block level, so the filesystems above it are not even aware of it. This is nice, because it also means you can use LVM inside a LUKS-encrypted block device. The encrypted drive is protected with a passphrase that you need to enter at boot.

Below I’ll show you how to set it up. Be aware that we are formatting partitions; in other words, you will lose all data on the partition you experiment with. For testing purposes, use a USB pen drive or a spare partition. I’m using a virtual disk called /dev/sdc for this demo.

Note: you need the kernel module dm_crypt loaded for this to work.

modprobe dm_crypt

Don’t forget to make it persistent after a reboot.
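On a Debian-like system that means listing the module in /etc/modules; on RHEL-like systems a script in /etc/sysconfig/modules/ does the same (adapt this to your distribution):

echo dm_crypt >> /etc/modules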

To format the partition:

cryptsetup luksFormat /dev/sdc

Output:

WARNING!
========
This will overwrite data on /dev/sdc irrevocably.

Are you sure? (Type uppercase yes): YES
Enter LUKS passphrase: 
Verify passphrase:

Now that the encrypted disk is created, we’ll open it:

cryptsetup luksOpen /dev/sdc encrypted_disk

Output:

Enter passphrase for /dev/sdc:

To unlock the disk, you need to enter the passphrase you just set. You need to do this every time you want to unlock the disk.

The ‘encrypted_disk‘ part of the command above is used to map a device name to your encrypted disk. In this case, ‘/dev/mapper/encrypted_disk’ is created. So the encrypted disk called ‘/dev/sdc’ now has an extra name that refers to its unlocked state: ‘/dev/mapper/encrypted_disk’. This is also a block device, on which you could run ‘fdisk’ or ‘mkfs.ext4’, etc.

But wait before you do that. If you’d like to use multiple partitions, I’d suggest using LVM inside the encrypted disk. It isn’t visible from the outside; only when you’ve unlocked the disk do the LVM volumes appear. This saves you from entering a passphrase for each and every encrypted disk you create. Also, LVM is more flexible when it comes to resizing its logical volumes.

Let me show you how to setup LVM on the encrypted disk:

pvcreate /dev/mapper/encrypted_disk

Output:

Physical volume "/dev/mapper/encrypted_disk" successfully created

vgcreate encrypted /dev/mapper/encrypted_disk

Output:

Volume group "encrypted" successfully created

lvcreate encrypted --name disk1 --size 10G

Output:

Logical volume "disk1" created

lvcreate encrypted --name disk2 --size 10G

Output:

Logical volume "disk2" created

You now have two more block devices called ‘/dev/encrypted/disk1’ and ‘/dev/encrypted/disk2’. Let’s put a file system on top of them:

mkfs.ext4 -m0 /dev/encrypted/disk1
mkfs.ext4 -m0 /dev/encrypted/disk2

The two encrypted partitions are now ready to be used. Let’s mount them somewhere:

mkdir -p /mnt/disk1 /mnt/disk2
mount /dev/encrypted/disk1 /mnt/disk1
mount /dev/encrypted/disk2 /mnt/disk2

This all works pretty nicely already, but after a reboot you’ll have to run:

cryptsetup luksOpen /dev/sdc encrypted_disk
vgscan
lvchange --activate y encrypted/disk1
lvchange --activate y encrypted/disk2
mount /dev/encrypted/disk1 /mnt/disk1
mount /dev/encrypted/disk2 /mnt/disk2

Entering the passphrase is required after the first command. Line 2 scans for new LVM devices (because unlocking the encrypted device makes a new block device appear). Lines 3 and 4 activate the two logical volumes, and finally they are mounted.
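If you prefer doing this by hand, you could wrap the steps in a small script; a sketch (the name unlock-disks.sh is made up, the devices are the ones from this post):

#!/bin/sh
# unlock-disks.sh: unlock the LUKS device, activate LVM and mount the volumes
cryptsetup luksOpen /dev/sdc encrypted_disk || exit 1   # asks for the passphrase
vgscan                                                  # detect the LVM volume group
lvchange --activate y encrypted/disk1
lvchange --activate y encrypted/disk2
mount /dev/encrypted/disk1 /mnt/disk1
mount /dev/encrypted/disk2 /mnt/disk2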

Automating these steps
It is possible to automate this. That is, Linux will then ask for the passphrase at boot time and mount everything for you. Just think about that for a while. When booting a laptop, this is probably what you want. But if it is a server in a remote location, it might not be, as you need to enter the passphrase on the (virtual) console for the boot to continue. There is no SSH access at that time.

Anyway, you need to do two things. The first is to tell Linux to unlock the encrypted device at boot time. The second is to mount the logical volumes.

To start, look up the UUID of the encrypted disk, /dev/sdc:

cryptsetup luksDump /dev/sdc | grep UUID

The result should be something like:

UUID: b8f60c1d-ffeb-4aaf-8368-9e5d4d29fc52

Open /etc/crypttab and enter this line:

encrypted_disk UUID=b8f60c1d-ffeb-4aaf-8368-9e5d4d29fc52 none

The first field is the name of the device that is created; use the same name as with luksOpen above. The second field is the UUID we just found. The final field is the key, which should be set to ‘none‘ so that you are prompted for the passphrase at boot. Entering the passphrase in this file is a bad idea, if you ask me.

The final step is to set up /etc/fstab to mount the encrypted disks automatically. Add these lines:

/dev/encrypted/disk1 /mnt/disk1 ext4 defaults 0 0
/dev/encrypted/disk2 /mnt/disk2 ext4 defaults 0 0

I’m using device names here, because LVM gives me the /dev/encrypted/disk[12] names every time. If you did not use LVM, it’s probably wise to use UUIDs in /etc/fstab instead. This makes sure the right filesystem is mounted, regardless of the device’s name.
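You can look up a filesystem’s UUID with blkid; the UUID below is of course just an example:

blkid /dev/sdb1   # prints something like: /dev/sdb1: UUID="2f9f0c2e-..." TYPE="ext4"

An fstab entry then looks like:

UUID=2f9f0c2e-8139-4bb1-9106-a43f08d823a6 /mnt/disk1 ext4 defaults 0 0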

Time to reboot. During this reboot, Linux will ask for the passphrase of the /dev/sdc device. On RHEL 6 it looks like:

RHEL 6 asks for the LUKS passphrase of /dev/sdc during boot

It might look different on your OS. The Ubuntu version looks a bit prettier, for example.

Enter the passphrase, hit enter, and Linux should continue booting normally. Then log in to the console (or via SSH) and verify that the two disks are mounted.
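For example, by running mount; the relevant lines look like this:

mount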

...
/dev/mapper/encrypted-disk1 on /mnt/disk1 type ext4 (rw)
/dev/mapper/encrypted-disk2 on /mnt/disk2 type ext4 (rw)
...

Ubuntu makes this very easy to set up: the installer has a checkbox that says ‘encrypt disk’ and will set up LUKS for you. But then you end up with everything in one big / partition without LVM. That’s why I prefer to configure it myself, and with these instructions so can you.

Conclusion
It is cool that these security features are now mainstream and easy to use. Do yourself a favor and set up LUKS today!

Two colleagues and I went to Amsterdam today for a one-day CFEngine 3 training. I’ve worked with configuration management before (Puppet), and my goal is to explore alternatives so I can pick the right tool. Today’s quick overview training was a nice opportunity to get into CFEngine and meet some of the people behind the scenes.

Impression of CFEngine training

Diego Zamboni, author of the book Learning CFEngine 3, taught us the concepts behind CFEngine, how the language is built and how to get started. He demoed a lot of things and answered all of our questions. It was a very informative training that really inspired me.

What surprised me most was that CFEngine is actually a pretty nice monitoring tool as well! A smart one, because it is able to either fix things or report them. All in all, I have to say I’m really impressed by CFEngine 3.5. We now have the inspiration to create a good plan and then start working on implementing it. Looking forward to it!

Thanks Diego and Carsten for this training and the nice drinks we had afterwards 🙂

Today I passed the senior-level LPIC-301 core exam. I am now LPIC-3 certified!

The exam objectives are OpenLDAP and capacity planning, both of which I used in past projects, so I was already pretty familiar with them. In my current job I came across AD integration and Kerberos, and those were objectives as well.

In the past weeks I went through every detail of the objectives and learned some new stuff. My score on the exam was 690/800.

I’ve mainly used the IBM Developer Works LPIC-3 website, the OpenLDAP manual and the LPIC-3 page hosted at rootkit.nl. As always, the man pages of the Linux commands are very useful.

After studying the materials, it’s just practice, practice, practice. I set up an environment in which I could test and play with the tools, and that worked really well for me.

The Security LPIC-3 specialty exam is the next one on my wish-list. I’ve already started to study it because I’m really interested in these topics.

Later this year I’ll be doing the Red Hat RHCSA and RHCE exams. I’ve booked them on the same day, so that’s probably going to be a bit tough 😉 But I’m looking forward to it and enjoy the steps on the certification road.

As discussed in a previous post, multi-factor authentication really makes things more secure. Let’s see how we can secure services like SSH and the Gnome desktop with multi-factor authentication.

The Google Authenticator project provides a PAM module that can be integrated with your Linux server or desktop. This PAM module is designed for home and other small environments. There is no central management of keys, and the configuration is saved in each user’s home folder. I’ve successfully deployed this on my home server (a Raspberry Pi) and on my work laptop.

When using a Debian-like OS, you can install it with a one-liner:

apt-get install libpam-google-authenticator

But note that the packaged version is old and does not support all documented options. Below I talk about the ‘nullok’ option, for example, which is not supported in the packaged version. You’d then see this error:

sshd(pam_google_authenticator)[884]: Unrecognized option "nullok"
sshd(pam_google_authenticator)[887]: Failed to read "/home/pi/.google_authenticator"

That’s why I suggest building from source, as this can be done quickly:

apt-get remove libpam-google-authenticator
apt-get install libpam0g-dev libqrencode3
cd ~
wget https://google-authenticator.googlecode.com/files/libpam-google-authenticator-1.0-source.tar.bz2
tar jvxf libpam-google-authenticator-1.0-source.tar.bz2
cd libpam-google-authenticator-1.0/
make
make install

Open a shell for the user you want to enable two-factor authentication for, and run ‘google-authenticator’ to configure it.

google-authenticator
Configure Google Authenticator

Nice!

Configure your Mobile device
Install the ‘Google Authenticator’ app and just scan the QR code. It will automatically configure itself and start displaying verification codes.

Verification codes displayed by the app

You should notice that it displays a new code every 30 seconds.

Configure SSH
Two files need to be edited in order to enable two-factor authentication in SSH.

vim /etc/pam.d/sshd

Add this line:

auth required pam_google_authenticator.so nullok

Where to put this in the file? That depends. When you put it at the top, SSH will first ask for a verification code, and then for a password. To me this sounds illogical, so I placed it just below this line:

@include common-auth

The ‘nullok’ option, by the way, tells PAM that whenever no two-factor configuration is found for a user, it should just ignore it. If you want SSH logins to fail when no two-factor authentication is configured, you can remove the option. But be warned: configure it for at least one user first, or you will lock yourself out of your server.

Now tell SSH to ask for the verification code:

vim /etc/ssh/sshd_config

Edit this setting; it’s probably set to ‘no’:

ChallengeResponseAuthentication yes

Now all you need to do is restart SSH. Keep a spare SSH session logged in, just in case.

/etc/init.d/ssh restart

SSH two-factor authentication in action

SSH will now ask for a verification code when you do an interactive login. When a certificate (key) is used, no verification code is asked for.
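By the way, if you want a verification code even for key-based logins, newer OpenSSH versions (6.2 and up, so check yours first) can chain authentication methods in sshd_config; a sketch:

AuthenticationMethods publickey,keyboard-interactive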

Configuring Ubuntu Desktop
The Ubuntu desktop can also ask for a verification code. Just edit one file:

vim /etc/pam.d/lightdm

And add this line, just like SSH:

auth required pam_google_authenticator.so nullok

The login screen should now ask for a verification code. It looks like:

Two-factor authentication in Ubuntu

Two-factor authentication in Ubuntu

You can set up any PAM service like this; the basic principles are always the same.
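For example, to also protect ‘su’, you could add the very same line to /etc/pam.d/su (a sketch, with the same ‘nullok’ caveats as above):

auth required pam_google_authenticator.so nullok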

Skip two-factor authentication if logging in from the local network
At first this is all very cool, but soon it becomes a bit annoying, too. When I SSH from the local network, I just don’t want to enter a verification code, because I trust my local network. When I SSH from a remote location, a verification code is required. One way to arrange that is to always log in with certificates. But there is another way to configure it: using the pam_access module. Try this config:

auth [success=1 default=ignore] pam_access.so accessfile=/etc/security/access-local.conf
auth required pam_google_authenticator.so nullok

The config file, /etc/security/access-local.conf, looks like:

# Two-factor can be skipped on local network
+ : ALL : 10.0.0.0/24
+ : ALL : LOCAL
- : ALL : ALL

Login attempts from the local network (10.0.0.0/24) and from the local console will not require two-factor authentication, while all others will.

Two-factor only for users of a certain group
You can even enable two-factor authentication only for users of a certain group. Say you’d want only users in the ‘sudo’ group to use two-factor authentication. You could use:

auth [success=1 default=ignore] pam_succeed_if.so user notingroup sudo
auth required pam_google_authenticator.so nullok

By the way, “[success=1 default=ignore]” in PAM means: on success, skip the next 1 auth module (the two-factor authentication); on failure, just pretend this line wasn’t there.
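You can even combine this with the pam_access recipe from the previous section; note how the skip counts change (a sketch I haven’t battle-tested, so verify it before relying on it):

auth [success=2 default=ignore] pam_access.so accessfile=/etc/security/access-local.conf
auth [success=1 default=ignore] pam_succeed_if.so user notingroup sudo
auth required pam_google_authenticator.so nullok

From the local network both remaining auth lines are skipped; everywhere else, only users in the ‘sudo’ group reach the two-factor prompt.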

Moving the config outside the user’s home
Some people prefer not to store the authentication data in the user’s home directory. You can move it elsewhere using the ‘secret’ parameter:

mkdir -p /var/lib/google-authenticator/
mv ~remibergsma/.google_authenticator /var/lib/google-authenticator/remibergsma
chown root.root /var/lib/google-authenticator/*

The PAM recipe will then become:

auth required pam_google_authenticator.so nullok user=root secret=/var/lib/google-authenticator/${USER}

As you’ve seen, PAM is very flexible; you can configure almost anything you want.

Give Google Authenticator two-factor authentication a try. It’s free and adds another security layer to your home server.