Archives For devops

Be the automator!

30 November 2014

Today I saw an awesome video of a presentation by Glenn O’Donnell.

In his presentation, Glenn states that it’s not about technology, it’s about services. Service design is modular with a logical structure. Approach it as a system and try to improve it as a whole, not just tiny pieces of it. To do that, you need systems engineers. Although most are lazy, or, as Glenn puts it: “Locally brilliant, globally stupid”.

Accept we have no full control
The whole ecosystem includes a lot of infrastructure and application components, including services from third parties. We tend to zoom in a lot but then we miss the point: it’s not about the servers, the storage and the network. It’s about how it all works together. And let’s face it, we’ll not have full control over the ecosystem because it contains components managed by third parties.

We IT people have a hard time accepting we have no control. We think we’re the only ones who can maintain that server. But in fact, software can do better and will put us out of business. That software is already out there today.

In this new world, we need people with a different skill set who can manage this complex ecosystem. It’s all software these days. Obviously the application is software, but so is the infrastructure on which it runs. Cloud infrastructure is all software defined. Even physical servers should be software controlled, instead of manipulating them manually.

How? Well, you can automate if you have a model. The model is a software description of reality. Tools consume this model and create reality out of it.

Software Model drives Automation Tool that produces the Service.
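To make the idea a bit more concrete, here is a minimal sketch in Python of a tool that consumes a model and works out what to change in reality. It is not from Glenn’s presentation: the `Server` type, the `plan` function and the server names are all made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Server:
    """One entry in the model: what a server should look like (hypothetical)."""
    name: str
    cpus: int


def plan(model: dict[str, Server], reality: dict[str, Server]) -> list[str]:
    """Compare the model with reality and list the actions that close the gap."""
    actions = []
    for name, wanted in sorted(model.items()):
        actual = reality.get(name)
        if actual is None:
            actions.append(f"create {name}")   # in the model, missing in reality
        elif actual != wanted:
            actions.append(f"update {name}")   # exists, but differs from the model
    for name in sorted(reality.keys() - model.keys()):
        actions.append(f"remove {name}")       # exists, but not in the model
    return actions


# Illustrative data only: the model is what we want, reality is what we have.
MODEL = {"web1": Server("web1", 2), "web2": Server("web2", 2)}
REALITY = {"web1": Server("web1", 1), "db1": Server("db1", 4)}

if __name__ == "__main__":
    for action in plan(MODEL, REALITY):
        print(action)
```

The point of the sketch is that nobody edits `REALITY` by hand. You change `MODEL`, and the tool figures out the difference.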

Glenn compares this with building a plane: first models are built and simulated before they ever put a plane in the air. Makes sense, right?

We should not be on the command line
This means we should not be on the command line. Let’s get away from the command line! We should manipulate the model instead, and let the model create or change reality. This model is our system software, which we should treat the same way as we treat application software. It’s software, and we can automate software.  By the way, no automation means no DevOps because it’s gonna be too slow.

You’ll also get better quality because human beings are bad at repetitive tasks. And it’s a waste having smart people do repetitive work. Software can do that instead. The model is the language, the secret code.

Automate yourself out of your job
Although this is cool, it does render some jobs obsolete. Glenn states that if you have “administrator” in your job title, you’ll be replaced by software (that can do better). But don’t worry, there will be other, more interesting jobs instead. Automate yourself out of your job. It’s fun! In short: be the automator, not the automated.

To drive this movement, we need innovators! Geeks are innovators 🙂 Geeks love change, they automate, they create, and they want to move on to the next interesting thing to discover.

Next to Geeks, there are also Geek Imposters. They might do the same job, but they hate change and want to keep everything as it is. To them, Glenn has some nice advice: “Learn to say: Would you like fries with that?”

View the video of Glenn’s presentation:

Geeks are changing the world. If you think you are a Geek that loves change and loves automation, you could be what we call a Cupfighter at Schuberg Philis.

Are you a Cupfighter?

If you think you are a Cupfighter, please contact me and we’ll change the world 😉

Back in June, just before I went off for holiday, I attended a CFEngine training in Amsterdam. When I returned from holiday a few weeks later, my team and I started making plans to implement CFEngine in our environment. After two months of hard work, I’m proud to say we manage about 350 out of our 400 Linux servers with CFEngine!

The ride has been fun, although not always easy. In this post I’ll give a quick overview of our CFEngine implementation, where I found useful info, etc.

CFEngine is different
To start, let me tell you that one of the most difficult parts of learning CFEngine is to get used to the terminology and to ‘think’ CFEngine. For example, a ‘class’ in CFEngine is not what you think it is. It has nothing to do with object oriented programming. It’s more like a ‘context’ that you can use to make decisions. There’s no ‘flow control’ in CFEngine either: no IF/THEN/ELSE, no FOR/FOREACH/WHILE etcetera. In CFEngine classes are used for decision making, and, since CFEngine is smart, it does looping automatically. This results in clean and easy-to-read code.
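To show what “classes as contexts” and automatic looping mean, here is a small Python sketch. This is not CFEngine syntax and not how CFEngine is implemented. The class names and promise texts are invented for the example, apart from `any`, which CFEngine always defines.

```python
def applicable(promises: list[tuple[str, str]], classes: set[str]) -> list[str]:
    """Return the promises whose guarding class is defined on this host."""
    return [text for guard, text in promises if guard in classes]


def expand(template: str, items: list[str]) -> list[str]:
    """Mimic implicit looping: a list value yields one promise per item."""
    return [template.format(item=item) for item in items]


# Hypothetical classes that were discovered or set for one host.
HOST_CLASSES = {"any", "linux", "sles"}

# Hypothetical promises, each guarded by a single class.
PROMISES = [
    ("any", "ntp is running"),
    ("debian", "apt cache is fresh"),
    ("sles", "zypper repos are configured"),
]

if __name__ == "__main__":
    print(applicable(PROMISES, HOST_CLASSES))
    print(expand("package {item} is installed", ["vim", "git"]))
```

Note that the policy itself contains no IF or FOR. The decision comes from which classes are defined, and the loop is something the engine does for you.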

CFEngine works on top of a theoretical model called ‘Promise Theory’ by Mark Burgess (author of CFEngine). This theory models the behavior of autonomous agents in an environment without central authority, based on only promises of behavior made by each agent, and shows that even without central control, the system can converge to a stable state.

To get used to it, read ‘Learning CFEngine 3’ by Diego Zamboni, as it will walk you through all of it with a lot of examples. The quote above is also from the book.

The basic idea is that each agent makes promises only about its own behavior, since that is all it can control. In CFEngine 3, everything is a promise.

– a file promises to have a certain content and to be executable
– a service promises to be running
– a user account promises to exist (or not to exist) and have certain properties

When CFEngine finds a promise is not kept, it will do everything it knows about to make the promise true. If it cannot reach the promised state at first, it tries the next best thing. Over time, the system converges to the desired (promised) state.
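The convergence idea can be sketched in a few lines of Python. This is only an illustration and not how CFEngine works internally: the `Promise` type, the `converge` loop and the two example promises are assumptions of mine. The service promise can only be repaired once the config file exists, so the first pass leaves it unkept and the second pass fixes it.

```python
from dataclasses import dataclass
from typing import Callable

State = dict[str, bool]


@dataclass
class Promise:
    """A desired state plus a way to repair it (hypothetical structure)."""
    name: str
    kept: Callable[[State], bool]
    repair: Callable[[State], None]


def converge(promises: list[Promise], state: State, max_passes: int = 5) -> int:
    """Repair unkept promises pass after pass; return the passes needed."""
    for passes in range(1, max_passes + 1):
        for promise in promises:
            if not promise.kept(state):
                promise.repair(state)
        if all(promise.kept(state) for promise in promises):
            return passes
    raise RuntimeError("promises did not converge")


def write_config(state: State) -> None:
    state["config_present"] = True


def start_service(state: State) -> None:
    # The repair can fail for now: no config file means the service stays down.
    if state["config_present"]:
        state["service_running"] = True


PROMISES = [
    Promise("service is running", lambda s: s["service_running"], start_service),
    Promise("config file exists", lambda s: s["config_present"], write_config),
]

if __name__ == "__main__":
    state = {"config_present": False, "service_running": False}
    print(converge(PROMISES, state), state)
```

Running `converge` again on an already converged state changes nothing, which is exactly what you want from a tool that runs over and over on every server.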

Once you get it and get used to it, it actually makes sense and is pretty easy to implement.

With great power comes great responsibility
This one-liner says it all. When you have a configuration management system that manages a lot of servers, you better be careful what promises you have it keep. This is why you need to manage CFEngine promises like software. You need version control and it needs to be flexible as well. I’ve read a lot about this subject and I believe Git is the way to go. This blog by Brian Bennett pretty much nails it. I got a lot of inspiration from it, thanks Brian!

I implemented these ‘branches’ in Git:
– development (aka master)
– beta
– pre-production
– production

This works perfectly: develop new promises in the ‘development’ branch, then merge to ‘beta’ branch to test on some of our own test servers. When everything works together and seems stable, we merge to the next branch ‘pre-production’. This is then tested on ~15 real production servers so it better be good. But when it isn’t, the impact is still not too high and it should be fixed before it ever hits ‘production’. Production branch is everything that is stable and is used on all ~350 servers.

Every time we merge to either ‘pre-production’ or ‘production’, we create a Git ‘tag’ with a date, which allows for easy rollbacks. Whenever we need to get back to a certain state, we can always just check out a tag. This is also very useful for audit trails, by the way.

Actually, we’re using another branch called ‘hotfix’. Whenever there’s an emergency to fix, we branch a ‘hotfix’ from ‘production’ and do the fix. This is for example when a promise misbehaves. This branch is then merged to production when ready, and also to ‘development’. Git handles this nicely: whenever the hotfix makes it all the way from ‘development’ to ‘production’, Git recognizes this commit was already processed earlier and ignores it.

Git commits, branches and tags in CFEngine repo

This is a screenshot from ‘Gitk’ that shows the commits, branches (green) and tags (yellow). As you can see, ‘production’ and ‘pre-production’ are at the same level now, so nothing new is tested in ‘pre-production’ at the moment. Quite some work is tested in the ‘beta’ branch and there are already some fixes committed in ‘master’. Recently there was a ‘hotfix’ branch that has now been merged. It should give an idea of how it works. It provides a clear overview and we now know about every change to the configuration of our servers. Clicking on a commit shows what changed, who did it, etc.

CFEngine Policy hubs
For each of the 4 branches we’ve created a CFEngine policy hub. The policy hub is a server running CFEngine software that serves the given branch to the agents (the Linux servers connected to it). Linux servers can even switch between them by ‘bootstrapping’ to one of the 4 policy hubs, although we only use that on our test servers.

Manage what’s ‘in flight’ with a CFEngine Trello board
Trello provides an intuitive and modern web interface that allows you to manage ‘cards’ on different ‘lists’ on a ‘board’. To get an idea, see the example Trello board below.

Trello CFEngine board

New cards are usually created in ‘Feature requests’ or ‘Bugs’ and then transferred to ‘Working on it!’. The number of cards in this stage should be limited, as you can only work on a number of things at the same time. This is actually Kanban style. Next, we’ve created a list for each Git ‘branch’ we have and cards flow from ‘beta’ to ‘pre-production’ and finally ‘production’. Moving cards is just dragging & dropping. Each month, cards in ‘production’ are archived. This creates an overview of what new work is to be done (‘Feature requests’ and ‘Bugs’), what we’re currently working on and what’s in each of the branches. Trello has the overview, Git has the code and the details. Also, Trello is perfect for communication between team members. Notes, comments, documents, lists, etcetera can all be created with ease.

Testing promises
To be able to test the promises on our local laptops, we’re using a tool called Vagrant. Vagrant sets up Virtual Machines (for example using VirtualBox) and allows you to ‘destroy’ and ‘create’ them within minutes. All team members have a local Git checkout that is also available in the Vagrant boxes. This allows us to test any change before even committing. We have Vagrant boxes set up for all Linux distributions we support. It’s so easy and so fast to test changes that everybody does. And even when an error slips through, other team members will soon notice and it’s usually fixed within minutes, before it ever hits the ‘beta’ branch.

We encountered a strange bug when using SLES 11 and CFEngine 3.5: CFEngine (community edition) ended up running with the ‘SIGPIPE’ signal blocked. When CFEngine restarts SSH, SSH too ends up running with ‘SIGPIPE’ blocked. This results in ‘sudo’ no longer working: it would just return nothing at all. It took us quite some time to figure out it was the ‘SIGPIPE’ signal that was blocked. The root cause probably lies in an old ‘Bash’ version (3.51) that SLES uses, combined with something CFEngine triggers. We’ve now implemented an automated work-around (made a CFEngine promise) that fixes the problem. We did some nice team work on this one!

CFEngine’s learning curve might be steep, but the result is definitely rewarding. Combined with Git and Trello it allows for fine control and a great overview of configuration changes. Our whole team is involved in changes; they are reviewed and result in high quality code. This eventually makes the Linux servers we manage more stable. Also, it’s a great feeling to be in control and know what’s going on on our servers.

From this point on, we’ll continue to scale both horizontally (add more servers) and vertically (add more promises). After two months of daily working with CFEngine, I have to say I really like it and I enjoy writing promises.

I’ll keep you posted, I promise 😉

Two colleagues and I went to Amsterdam today for a 1-day CFEngine 3 training. I’ve worked with configuration management before (Puppet), and my goal is to explore alternatives to be able to pick the right tool. Today’s quick overview training was a nice opportunity to get into CFEngine and meet some people behind the scenes.

Impression of CFEngine training

Diego Zamboni, author of the book Learning CFEngine 3, taught us what the concepts behind CFEngine are, how the language is built and how to get started. He demoed a lot of things and answered all of our questions. It was a very informative training that really inspired me.

What surprised me most was that CFEngine is actually a pretty nice monitoring tool! A smart one, because it is able either to fix things or to report them. All in all, I have to say I’m really impressed by CFEngine 3.5. We now have the inspiration to create a good plan and then start working on implementing it. Looking forward to it!

Thanks Diego and Carsten for this training and the nice drinks we had afterwards 🙂

Today while I was running I realized I’ve been doing a lot of coding lately. Coding our new infrastructure, to be exact. I remembered Kris Buytaert’s talk about DevOps back in February when I was in Antwerp. One of the key statements is that there are ‘sysadmin coders’ and not ‘sysadmins’ and ‘coders’. The only way to achieve great results is when these two work together and communicate with each other. The ideas behind this are called DevOps. IT Operations starts using code to manage configurations and infrastructure instead of doing it by hand over and over again. Thanks to CloudStack and Puppet this is now possible. Ideally, you would not have two groups, but one. Stephen Nelson-Smith describes it like this:

So, the Devops movement is characterized by people with a multidisciplinary skill set – people who are comfortable with infrastructure and configuration, but also happy to roll up their sleeves, write tests, debug, and ship features. These are people who making connections, because they can – because they have feet in multiple camps, they can be ambassadors, peace makers, facilitators and communicators. And the point of the movement is to identify these, currently rare, people and encourage them, compare ideas, and start to identify, train, recruit and popularize this way of doing IT.

More on Stephen’s blog..

I didn’t realize back then what this would mean because I was focused on CloudStack and the tools around it. But it is not only about the tools, it’s the way you look at managing infrastructure and development. What we’re doing looks like DevOps but we’re not there yet ;-). In the coming weeks I’ll spend some more time reading about DevOps to see how we can implement this in our organization. Because I really believe this is the way to go..

Going to Antwerp by train

On Thursday February 2nd my colleague Pim and I went to Antwerp by train to attend the “Build an Open Source Cloud” day, hosted by INUITS the following day.

The programme looked promising and I really looked forward to meeting David Nalley and Mark Hinkle from Citrix’ CloudStack. Over the last months we kept an eye on CloudStack: we tested their current 2.2 release and the 3.0 betas. Although we had CloudStack more or less up and running, there were still many questions to ask and many things to learn. This was a perfect opportunity for that.

Dinner at Brasserie Appelmans

But that would follow the next day. So we first decided to have some dinner in Antwerp. A friend of mine suggested Brasserie Appelmans and that really was a good suggestion! Service was friendly and nice also. 🙂

Back in the hotel the wifi wasn’t working. The reception didn’t know why (I only got a link-local ip, so probably a DHCP problem). With some guessing and trying I managed to get it to work. Assign yourself an ip in the 192.168.0.0/24 range, .1 as gw and you’ll be good to go 😉
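For the curious, the guesswork can be written down with Python’s standard `ipaddress` module. This is just a sketch: the helper names are mine, the example addresses are arbitrary, and it only computes the settings, it doesn’t configure any interface.

```python
import ipaddress


def is_link_local(address: str) -> bool:
    """True for self-assigned addresses (169.254.0.0/16 for IPv4)."""
    return ipaddress.ip_address(address).is_link_local


def manual_settings(network: str, host_number: int) -> dict[str, str]:
    """Pick an address in 'network' and assume the gateway sits on .1."""
    net = ipaddress.ip_network(network)
    gateway = net.network_address + 1
    address = net.network_address + host_number
    reserved = (net.network_address, net.broadcast_address, gateway)
    if address not in net or address in reserved:
        raise ValueError(f"host number {host_number} is not usable in {net}")
    return {"address": f"{address}/{net.prefixlen}", "gateway": str(gateway)}


if __name__ == "__main__":
    # Arbitrary example values: a self-assigned address and host number 50.
    print(is_link_local("169.254.10.20"))
    print(manual_settings("192.168.0.0/24", 50))
```

Of course you still have to guess the right range, and you risk picking an address somebody else is already using.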

Crash Course on Open Source Cloud Computing

The following morning we went to the “Build an OpenSource Cloud” event. Mark Hinkle kicked off with an interesting “Crash Course on Open Source Cloud Computing”. He showed us what a cloud really is, what OpenSource tools are available and what makes a cloud scalable. Scale up (add more compute nodes) and scale out (using loadbalancing). Mark talked about PaaS and IaaS, and listed associated Open Source software solutions. I found it really interesting and refreshing to kick off with such a broad overview of Cloud Computing!

Xen Cloud Platform

Next, Lars Kurth told us all about Xen, Xen Cloud Platform and Citrix XenServer. A lot of hard work has been done in getting Xen into the Linux Kernel and in building Citrix XenServer from the OpenSource code, although a lot of work still needs to be done.

Both Citrix XenServer (the commercially supported version) and Xen Cloud Platform are supported by CloudStack and integrate nicely.

Build Your Cloud -CloudStack

Lunch time!  Wow – we’ve had a really tasty lunch 🙂 During lunch I had the opportunity to chat with David Nalley and Mark Hinkle about CloudStack and our experience with it so far. They kindly answered all of our questions and had some nice suggestions, too. It gave me the feeling CloudStack is the best choice for us. A good product and nice and friendly people behind it. Great! I’ll write in some more detail about CloudStack and our progress with the project in a later post. After lunch, David presented and demo’ed both the current and upcoming CloudStack release. Good news: CloudStack 3.0 release is targeted at the end of the month 🙂

Automatic Configuration of Your Cloud with Puppet

I have to be honest: the main reason to come was CloudStack and meeting up with David and Mark. But the other presentations added up to a nice broad overview and even changed the way I look at building our Cloud. So that’s cool!

Carl Caum really impressed me with his Puppet presentation. He’s a pretty good presenter and came with solutions to problems that many sysadmins have: there’s a limit to how many servers you can manage by hand. Of course with some smart automation this number will go up, but it just isn’t scalable. And worse, the systems then aren’t always the same, which may lead to unexpected trouble. Puppet is a system for automating system administration tasks that makes our life easier!

Puppet overview

This is how it works: from a central location – Puppetmasterd – configuration of groups of servers is managed. Want to change a file, package or setting? Do it once, Puppet makes it happen, and makes sure it is always in that given state. So, you tell Puppet “what” you want, not “how”. It’s pretty cool stuff that I’ll look into in the coming weeks.

The presentations made me realize administering servers (VM’s) in the Cloud is very different compared to traditional sysadmin work. You should no longer think of servers as something that stays there all the time. VM’s should be spun up when needed and destroyed when no longer needed. For this to work, configuration and user-data must be separated from the VM itself. The loadbalancer has the public ip and decides how many VM’s are needed to handle the load. Via API calls it can deploy new VM’s. The VM therefore should be easy to re-deploy. Puppet & CloudStack together can do that! Now, that is true scaling-out.
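The decision the loadbalancer makes can be sketched like this in Python. The numbers and function names are made up, and a real setup would call the cloud API to deploy or destroy VM’s; here the result is only a description of what should happen.

```python
import math


def vms_needed(requests_per_second: float, capacity_per_vm: float, minimum: int = 1) -> int:
    """How many VM's are needed for the load, never fewer than 'minimum'."""
    return max(minimum, math.ceil(requests_per_second / capacity_per_vm))


def scaling_action(running: int, needed: int) -> str:
    """Describe what has to happen to get from 'running' to 'needed' VM's."""
    if needed > running:
        return f"deploy {needed - running} VMs"
    if needed < running:
        return f"destroy {running - needed} VMs"
    return "no change"


if __name__ == "__main__":
    # Hypothetical load: 950 requests/s, each VM handles about 200 requests/s.
    needed = vms_needed(950, 200)
    print(needed, scaling_action(3, needed))
```

This only works when a fresh VM is useful right after deployment, which is why the configuration has to come from something like Puppet instead of living inside the VM.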

David Nalley also presented some slides about Zenoss Core: “Monitoring the Cloud with Zenoss Core”. There are three really cool things about it:

  1. it integrates nicely with CloudStack
  2. it has an API (whoohoo! Monitoring system with an API!)
  3. it is compatible with Nagios-plugins, so previous work in that area can be re-used.

To me, it seems this is gonna be our new monitoring system 🙂

Devops the Future is Here

The final presentation was by Kris Buytaert from INUITS.

Kris had a really interesting talk about Devs (software development) and Ops (IT-Operations) and how the two should start working together. “DevOps” is an emerging set of principles, methods and practices for communication, collaboration and integration between the two.

Look here for all the details!

Snow in Antwerp

When we came outside there was a nice little surprise for us: all of Antwerp had become white! That made it something of a challenge to get back to The Netherlands, since many trains were delayed or cancelled.

After having a nice dinner in Antwerp we managed to get a train and only 4 hours later we were home 😉

It was an inspiring day and I’ve learned a lot. It motivates me to get our CloudStack cloud ready and implement it the way a Cloud is meant to be.

Thanks guys!