Archives For devops

At DevOpsDays Amsterdam 2016 I hosted a 3 hour hands-on workshop with my colleague Fred Neubauer. In these 3 hours we had 12 people build their own cloud based on Cosmic Cloud. It was great fun!

DevOpsDays Amsterdam 2016 Cosmic workshop

The slides are here:

customuiversionI was the Release Manager for Apache CloudStack versions 4.6, 4.7 and 4.8 and during this time many people told me they thought it’s hard to build and package Apache CloudStack. The truth is, it’s not that hard once you know the steps to take. 😉

Next to that there’s the release pace. You may want to move in a different pace than the project is. There’ve been lots of discussions for example on a fast release cycle, and also on LTS-releases on the other hand. In this blog post I’ll show you how easy it is to create your own CloudStack release to satisfy the demands of your organisation.

Maven
Compiling Apache CloudStack is done using Maven. You need to install this tool in order to work with releases so let’s do it. Let’s assume a CentOS 7 box you want to compile this on. It should pretty much work on any OS.

yum install maven java-1.8.0-openjdk mkisofs ws-commons-util genisoimage gcc

We also install Java and some tools needed to compile Apache CloudStack.

Versioning
When you’re building your own release of Apache CloudStack, you have two options:

  1. Rebuild an existing version
  2. Create a new version

You’ve to keep in mind that when you create a new version, you need also to create so-called upgrade-paths: how the database can be upgraded to your version. When you choose option 1, and rebuild an existing version, this is not necessary. This sounds easy, but on the other hand, it’s confusing as there’s no way to tell the difference later on.

Apache CloudStack works with versions like 4.8.0 and 4.7.1, etc. The upgrade mechanism will only consider the first 3 parts. This means, we are free to create 4.8.0.16 (our 16th custom version of 4.8.0) as long as we do not touch the database. That sounds like a nice way to make custom daily releases, and at the same time can be used for those wanting to build LTS (Long Term Support) versions.

Setting the version
The question then is, how can we modify the version of Apache CloudStack? Well, there’s actually a tool supplied in the source that does this for you. It’s called setnextversion.sh. Here’s how it works:

usage: ./tools/build/setnextversion.sh -v version [-b branch] [-s source dir] [-h]
  -v sets the version
  -b sets the branch (defaults to 'master')
  -s sets the source directory
  -h

To build our custom 4.8.0.16 version off the 4.8 branch we run:

./tools/build/setnextversion.sh -v 4.8.0.16 -b 4.8 -s /data/git/cs1/cloudstack/

Output shows a lot of things, most interesting:

found 4.8.0 setting version numbers 
[master 27fa04a] Updating pom.xml version numbers for release 4.8.0.16
126 files changed, 130 insertions(+), 130 deletions(-)
committed as 858805957e2c7e0a0fbeb170a7f1681a73b4fb7a

The result is a new commit that changed the versions in the POMs. You an see it here:

git log

setnextversion

You’ve successfully set the version!

Compiling the custom version
Let’s compile it. This works like with any release build:

mvn clean install -P systemvm

This will take a few minutes. You should see your new version flying by lots of times.

customversion

After a few minutes the build completes.

By the way, if you want to skip the unit tests to speed up the process, add -DskipTests to the mvn command.

custombuilddone

RPM packages
The source also contains scripts to build packages. To build packages for CentOS 7 for example, do this:

cd ./packaging
./package.sh -d centos7

You should see this:

Preparing to package Apache CloudStack 4.8.0.16

When the script finishes, you can find the RPM packages here:

ls -la ../dist/rpmbuild/RPMS/x86_64/

customrpmpackages

Installing the new version
You can either use the war that is the result of the compile, or install the generated RPM packages. Installing and upgrading is out-of-scope in this blog, so I assume you know how to install a stock version of CloudStack. I first installed a stock 4.8.0 version, then (as shown above) built 4.8.0.16 and will show the upgrade.

As soon as you start the management server with the new version you will see this:

2016-03-14 12:34:09,986 DEBUG [c.c.u.d.VersionDaoImpl] (main:null) (logid:) Checking to see if the database is at a version before it was the version table is created
2016-03-14 12:34:09,995 INFO  [c.c.u.DatabaseUpgradeChecker] (main:null) (logid:) DB version = 4.8.0 Code Version = 4.8.0.16
2016-03-14 12:34:09,995 INFO  [c.c.u.DatabaseUpgradeChecker] (main:null) (logid:) DB version and code version matches so no upgrade needed.

This is actually very nice. We made a custom version and CloudStack still assumes it’s 4.8.0 and so no upgrade of the data base is needed. This obviously means that you cannot do this when your patch requires a data base change.

When we look at the database, we can confirm we still run a 4.8.0-compatible release:

databasecustomrelease

From this table, one cannot tell we upgraded to our custom version. But when you look closer, the new version is active.

This is the version an agent reports:

customapiversion

Also, the UI will show the custom version in the About box. This way users can easily tell what version they are running.

Conclusion
Creating your own custom version of Apache CloudStack may sound complicated but we’ve seen it’s pretty easy to do so. Creating a custom release will provide you with a lot of flexibility, especially if you combine it with war dropping. By numbering your version like 4.8.0.x you don’t have to worry about upgrade paths.

Happy releasing!

dod-amsterdamToday I hosted a CloudStack hands-on workshop at DevOpsDays Amsterdam, together with my colleague Fred Neubauer. It was really awesome to see all attendees building their own CloudStack cloud. And succeeding!

Here are the slides and instructions: We proposed the workshop to the CloudStack Collaboration Conference Europe, in Dublin next October so we might do it again!

dod_fred

Fred in action!

That's me in action

That’s me in action!

one-does-not-pull-request

Recently I became a committer in the Apache CloudStack project. Last week when I was working with Rohit Yadav from ShapeBlue he showed me how he had automated the process with a Git alias. I really like it so I’ll share it here as well.

First of all, pull requests are created on Github.com. Rohit created a git alias called simply ‘pr’. This is how it looks like (for the impatient: copy/paste the one-liner below). The command below is for easy reading, it will print a syntax error.

[alias]
pr= "!apply_pr() { set -e; 
rm -f githubpr.patch; 
wget $1.patch -O githubpr.patch
--no-check-certificate; 
git am -s githubpr.patch; 
rm -f githubpr.patch; 
pr_num=$(echo $1 | sed 's/.*pull\\///'); 
git log -1 --pretty=%B > prmsg.txt; 
echo \"This closes #$pr_num\" >> 
prmsg.txt; git commit --amend -m \"$(cat prmsg.txt)\"; 
rm prmsg.txt; }; apply_pr"

Copy/paste this next two lines into your .gitconfig file (usually located in ‘~/.gitconfig’.):

[alias]
pr= "!apply_pr() { set -e; rm -f githubpr.patch; wget $1.patch -O githubpr.patch --no-check-certificate; git am -s githubpr.patch; rm -f githubpr.patch; pr_num=$(echo $1 | sed 's/.*pull\\///'); git log -1 --pretty=%B > prmsg.txt; echo \"This closes #$pr_num\" >> prmsg.txt; git commit --amend -m \"$(cat prmsg.txt)\"; rm prmsg.txt; }; apply_pr"

This alias allows you to do this:

git pr https://pull-request-url

It will then fetch the patch, extract the pull request number, adds a note that closes the pull request, finally commits all commits to your current branch. All you have to do is review and push.

Let’s demo this on a pull request I openend on the CloudStack documentation. Goes like this:

git pr https://github.com/apache/cloudstack-docs-rn/pull/21
--2015-05-24 19:22:54-- 
https://github.com/apache/cloudstack-docs-rn/pull/21.patch
Resolving github.com... 192.30.252.130
Connecting to github.com|192.30.252.130|:443... connected.
HTTP request sent, awaiting response... 302 Found
Location: https://patch-diff.githubusercontent.com/raw/apache/cloudstack-docs-rn/pull/21.patch [following]
--2015-05-24 19:22:54-- 
https://patch-diff.githubusercontent.com/raw/apache/cloudstack-docs-rn/pull/21.patch
Resolving patch-diff.githubusercontent.com... 192.30.252.130
Connecting to patch-diff.githubusercontent.com|192.30.252.130|:443... connected.
HTTP request sent, awaiting response... 200 OK
Cookie coming from patch-diff.githubusercontent.com attempted to set domain to github.com
Length: unspecified [text/plain]
Saving to: 'githubpr.patch'
githubpr.patch [ <=> ] 9.23K --.-KB/s in 0.002s
2015-05-24 19:22:55 (4.96 MB/s) - 'githubpr.patch' saved [9449]

Applying: add note on XenServer: we depend on pool HA these days
Applying: remove cached python classes and add ignore file for it
Applying: explicitly mention the undocumented timeout setting

[master 08e325d] explicitly mention the undocumented timeout setting
 Author: Remi Bergsma <github@remi.nl>
 Date: Sun May 24 08:28:43 2015 +0200
 3 files changed, 21 insertions(+), 3 deletions(-)

The pull request has 3 commits that are now committed to the current branch. Check the log:

git log

git_log

Isn’t that great? Now all you have to do is push it to a upstream repository.

Signed-off-by
This is achieved by adding the following to ‘.gitconfig’:

[format]
 signoff = true

Thanks Rohit for sharing!

git_logoWhen contributing to open source projects, it’s pretty common these days to fork the project on Github, add your contribution, and then send your work as a so-called “pull request” to the project for inclusion. It’s nice, clean and fast. I did this last week to contribute to Apache CloudStack. When I wanted to contribute again today, I had to figure out how to get my “forked” repo up-to-date before I could send a new contribution.

Remember, you can read/write to your fork but only-read from the upstream repository.

Adding upstream as a remote
When you clone your forked repo to a local development machine, you get it setup like this:

git remote -v
origin git@github.com:remibergsma/cloudstack.git (fetch)
origin git@github.com:remibergsma/cloudstack.git (push)

As this refers to the “static” forked version, no new commits come in. For that to happen, we need to add the original repo as an extra “remote” that we’ll call “upstream”:

git remote add upstream https://github.com/apache/cloudstack

Now, run the same command again and you’ll see two:

git remote -v
origin git@github.com:remibergsma/cloudstack.git (fetch)
origin git@github.com:remibergsma/cloudstack.git (push)
upstream https://github.com/apache/cloudstack (fetch)
upstream https://github.com/apache/cloudstack (push)

The cloned git repo is now configured to both the forked and the upstream repo.

Let’s fetch the updates from upstream:

git fetch upstream

Sample output:

remote: Counting objects: 151, done.
remote: Compressing objects: 100% (123/123), done.
remote: Total 151 (delta 39), reused 0 (delta 0)
Receiving objects: 100% (151/151), 153.30 KiB | 0 bytes/s, done.
Resolving deltas: 100% (39/39), done.
From https://github.com/apache/cloudstack
   2f2ff4b..49cf2ac  4.4        -> upstream/4.4
   aca0f79..66b7738  4.5        -> upstream/4.5
* [new branch]      hotfix/4.4/CLOUDSTACK-8073 -> upstream/hotfix/4.4/CLOUDSTACK-8073
   85bb685..356793d  master     -> upstream/master
   b963bb1..36c0c38  volume-upload -> upstream/volume-upload

We now got the new updates in. Before you continue, be sure to be on the master branch:

git checkout master

Then we will rebase the new changes to our own master branch:

git rebase upstream/master

You can achieve the same by merging, but rebasing is usually cleaner and doesn’t add the extra merge commit.

Sample output:

Updating 4e1527e..356793d
Fast-forward
SSHKeyPairResponse.java                       | 12 ++++++++++++
SolidFireSharedPrimaryDataStoreLifeCycle.java | 33 +++++++++++++++++++++++++++++++++
RulesManagerImpl.java                         |  2 +-
ManagementServerImpl.java                     |  5 +----
4 files changed, 47 insertions(+), 5 deletions(-)

Finally, update your fork at Github with the new commits:

git push origin master

Branches other than master
Imagine you want to track another branch and sync that as well.

git checkout -b 4.5 origin/4.5

This will setup a local branch called ‘4.5’ that is linked to ‘origin/4.5’.

If you want to get them in sync again later on, the workflow is similar to above:

git checkout 4.5
git fetch upstream
git rebase upstream/4.5
git push origin 4.5

Automating this process
I wrote this script to synchronise my clones with upstream:


#!/bin/bash

# Sync upstream repo with fork repo
# Requires upstream repo to be defined

# Check if local repo is specified and exists
if [ -z $1 ]; then
 echo "Please specify repo to sync: $0 <dir>"
 exit 1
fi
if [ ! -d $1 ]; then
 echo "Dir $1 does not exist!"
 exit 1
fi

# Go into git repo and update
cd $1

# Check upstream
git remote -v | grep upstream >/dev/null 2>&1
RES=$?

if [ $RES -gt 0 ]; then
 echo "Upstream repo not defined. Please add it:
 git remote add http://github.com/..."
 exit 1
fi

# Update and push
git fetch upstream
git rebase upstream/master
git push origin master

Execute like this:

./update_origin_with_upstream.sh /path/to/git/repo

 

Happy contributing!

Be the automator!

30 November 2014 — Leave a comment

Today I saw an awesome video of a presentation by Glenn O’Donnell.

In his presentation, Glenn states that it’s not about technology, it’s about services. Service design is modular with a logical structure. Approach it as a system and try to improve it as a whole, not just tiny pieces of it. To do that, you need systems engineers. Although most are lazy, or, as Glenn puts it: “Locally brilliant, globally stupid”.

Accept we have no full control
The whole eco system includes a lot of infrastructure and application components. Including services from third parties. We tend to zoom in a lot but then we miss the point: it’s not about the servers, the storage and the network. It’s about how it all works together. And let’s face it, we’ll not have full control over the eco system because it contains components managed by third parties.

We IT people have a hard time accepting we have no control. We think we’re the only one that can maintain that server. But in fact, software can do better and will put us out of business. That software is already out there today.

In this new world, we need people with a different skill set that can manage this complex eco system. It’s all software these days. Obviously the application is software, but so is the infrastructure on which it runs. Cloud infrastructure is all software defined. Even physical servers should be software controlled, instead of manipulating them manually.

How? Well, you can automate if you have a model. The model is a software description of reality. Tools consume this model and create reality out of it.

 ne the automator model

Software Model drives Automation Tool that produces the Service.

Glenn compares this with building a plane: first models are build and simulated before they ever put a plane in the air. Makes sense, right?

We should not be on the command line
This means we should not be on the command line. Let’s get away from the command line! We should manipulate the model instead, and let the model create or change reality. This model is our system software, which we should treat the same way as we treat application software. It’s software, and we can automate software.  By the way, no automation means no DevOps because it’s gonna be too slow.

You’ll also get a better quality because human beings are bad at repetitive tasks. And it’s a waste having smart people do repetitive work. Software can do that instead. The model is the language, the secret code.

Automate yourself out of your job
Although this is cool, it does render some jobs obsolete. Glenn states that if you have “administrator” in jour job title, you’ll be replaced by software (that can do better). But don’t worry, there will be other, more interesting jobs, instead. Automate yourself out of your job. It’s fun! In short: be the automator, not the automated.

be the automator jobs

To drive this movement, we need innovators! Geeks are innovators 🙂 Geeks love change, they automate, they create, and they want to move on to the next interesting thing to discover.

Next to Geeks, there are also Geek Imposters. They might do the same job, but they hate change and want to keep everything as it is. To them, Glenn has a nice advice: “Learn to say: Would you like fries with that?“.

View the video of Glenn’s presentation:

Geeks are changing the world. If you think you are a Geek that loves change and loves automation, you could be what we call a Cupfighter at Schuberg Philis.

Are you a Cupfighter?

If you think you are a Cupfighter, please contact me and we’ll change the world 😉

Back in June, just before I went off for holiday, I attended a CFEngine training in Amsterdam. When I returned from holiday a few weeks later, me and my team started making plans to implement CFEngine in our environment. After two months of hard work, I’m proud to say we manage about 350 out of our 400 Linux servers with CFEngine!

The ride has been fun, although not always easy. In this post I’ll give a quick overview of our CFEngine implementation, where I found useful info, etc.

CFEngine is different
To start, let me tell you that one of the most difficult parts of learning CFEngine is to get used to the terminology and to ‘think’ CFEngine. For example, a ‘class’ in CFEngine is not what you think it is. It has nothing to do with object oriented programming. It’s more like a ‘context’ that you can use to make decisions. There’s no ‘flow control’ in CFEngine either: no IF/THEN/ELSE, no FOR/FOREACH/WHILE etcetera. In CFEngine classes are used for decision making, and, since CFEngine is smart, it does looping automatically. This results in clean and easy-to-read code.

CFEngine works on top of a theoretical model called ‘Promise Theory‘ by Mark Burgess (author of CFEngine). This theory models the behavior of autonomous agents in an environment without central authority, based on only promises of behavior made by each agent, and shows that even without central control , the system can converge to a stable state.

To get used to it, read ‘Learning CFEngine 3‘ by Diego Zamboni, as it will walk you through all of it with a lot of examples. The quote above is also from the book.

The basic idea is that each agent makes promises only about its own behavior, since that is all it can control. In CFEngine 3, everything is a promise.

Examples:
– a file promises to have a certain content and to be executable
– a service promises to be running
– a user account promises to exist (or not to exist) and have certain properties

When CFEngine finds a promise is not kept, it will do everything it knows about to make the promise true. If it cannot reach the promised state at first, it tries to the next best. Over time, the system converges to the desired (promised) state.

Once you get it and get used to it, it actually makes sense and is pretty easy to implement.

With great power comes great responsibility
This one-liner says it all. When you have a configuration management system that manages a lot of servers, you better be careful what promises you have it keep. This is why you need to manage CFEngine promises like software. You need version control and it needs to be flexible as well. I’ve read a lot about this subject and I believe Git is the way to go. This blog by Brian Bennett pretty much nails it. I got a lot of inspiration from it, thanks Brian!

I implemented these ‘branches’ in Git:
– development (aka master)
– beta
– pre-production
– production

This works perfectly: develop new promises in the ‘development’ branch, then merge to ‘beta’ branch to test on some of our own test servers. When everything works together and seems stable, we merge to the next branch ‘pre-production’. This is then tested on ~15 real production servers so it better be good. But when it isn’t, the impact is still not too high and it should be fixed before it ever hits ‘production’. Production branch is everything that is stable and is used on all ~350 servers.

Every time we merge to either ‘pre-production’ or ‘production’, we create a Git ‘tag’ with a date, that allows for easy roll backs. Whenever we need to get back to a certain state, we can always just checkout a tag. This is also very useful for audit trails, by the way.

Actually, we’re using another branch called ‘hotfix’. Whenever there’s an emergency to fix, we branch a ‘hotfix’ from ‘production’ and do the fix. This is for example when a promise misbehaves. This branch is then merged to production when ready, and also to ‘development’. Git handles this nicely: whenever the hotfix makes it all the way from ‘development’ to ‘production’, Git recognizes this commit was already processed earlier and ignores it.

Git commits, branches and tags in CFEngine repo

Git commits, branches and tags in CFEngine repo

 

This is a screenshot from ‘Gitk’ that shows the commits, branches (green) and tags (yellow). As you can see, ‘production’ and ‘pre-production’ are at the same level now, so nothing new is tested in ‘pre-production’ at the moment. Quite some work is tested in the ‘beta’ branch and there are already some fixes committed in ‘master’. Recently there was a ‘hotfix’ branch that has now been merged. It should give an idea of how it works. It provides a clear overview and we now know about every change on the configuration of our servers. Clicking on a commit show what changed, who did it, etc.

CFEngine Policy hubs
For each of the 4 branches we’ve created a CFEngine policy hub. The policy hub is a server running CFEngine software that serves the given branch to the agents (the Linux servers connected to it). Linux servers can even switch between them by ‘bootstrapping’ to one of the 4 policy hubs. Although we only use that on our test servers.

Manage what’s ‘in flight’ with a CFEngine Trello board
Trello provides an intuitive and modern web interface that allows you to manage ‘cards’ on different ‘lists’ on a ‘board’. To get an idea, see the example Trello board below (click on it to enlarge).

Trello CFEngine board

Trello CFEngine board

 

New cards are usually created in ‘Feature requests’ or ‘Bugs’ and then transferred to ‘Working on it!’. The number of cards in this stage should be limited, as you can only work on a number of things at the same time. This is actually Kanban style. Next, we’ve created a list for each Git ‘branch’ we have and cards flow from ‘beta’ to ‘pre-production’ and finally ‘production’. Moving cards is just dragging & dropping. Each month, cards in ‘production’ are archived. This creates an overview of what new work is to be done (‘Feature requests’ and ‘Bugs’), what we’re currently working on and what’s in each of the branches. Trello has the overview, Git has the code and the details. Also, Trello is perfect for communication between team members. Notes, comments, documents, lists, etcetera can all be created with ease.

Testing promises
To be able to test the promises on our local laptops, we’re using a tool called Vagrant. Vagrant sets up Virtual Machines (for example using Virtual Box) and allows you to ‘destroy’ and ‘create’ them within minutes. All team members have a local Git checkout, that is also available in the Vagrant boxes. This allows us to test any change before even committing. We have Vagrant boxes setup for all Linux distributions we support. It’s so easy and so fast to test changes that everybody does. And even when an error slips through, other team members will soon notice and it’s usually fixed within minutes, before it ever hits the ‘beta’ branch.

Bugs
We encountered a strange bug when using SLES 11 and CFEngine 3.5: CFEngine (community edition) got running with the ‘SIGPIPE’ signal blocked. When CFEngine restarts SSH, this too gets running with ‘SIGPIPE’ blocked. This results in ‘sudo’ no longer working. It would just return nothing at all. It took us quite some time to figure out it was the ‘SIGPIPE’ signal that was blocked. The root cause probably lies in an old ‘Bash’ version (3.51) that SLES uses, combined with something CFEngine triggers. We’ve now implemented an automated work-around (made a CFEngine promise) that fixes the problem. We did some nice team work on this one!

Conclusion
CFEngine’s learning curve might be steep, but the result is definitely rewarding. Combined with Git and Trello it allows for fine control and great overview of configuration changes. Our whole team is involved in changes, they are reviewed and result in high quality code. This eventually makes the Linux servers we manage more stable. Also, it’s a great feeling to be in-control and know what’s going on our servers.

From this point on, we’ll continue to both scale horizontally (add more servers) and vertically (add more promises). After two months of daily working with CFEngine, I’ve to say I really like it and I enjoy writing promises.

I’ll keep you posted, I promise 😉