Archives For automation

I was the Release Manager for Apache CloudStack versions 4.6, 4.7 and 4.8, and during this time many people told me they thought it’s hard to build and package Apache CloudStack. The truth is, it’s not that hard once you know the steps to take. 😉

Next to that, there’s the release pace. You may want to move at a different pace than the project does. There have been lots of discussions, for example on a faster release cycle on the one hand, and on LTS releases on the other. In this blog post I’ll show you how easy it is to create your own CloudStack release to satisfy the demands of your organisation.

Maven
Compiling Apache CloudStack is done using Maven, so you need to install this tool in order to work with releases. Let’s assume a CentOS 7 box to compile on; it should work on pretty much any OS.

yum install maven java-1.8.0-openjdk mkisofs ws-commons-util genisoimage gcc

We also install Java and some tools needed to compile Apache CloudStack.

Versioning
When you’re building your own release of Apache CloudStack, you have two options:

  1. Rebuild an existing version
  2. Create a new version

Keep in mind that when you create a new version, you also need to create so-called upgrade paths: how the database can be upgraded to your version. When you choose option 1 and rebuild an existing version, this is not necessary. That sounds easy, but on the other hand it’s confusing, as there’s no way to tell the difference later on.

Apache CloudStack works with versions like 4.8.0 and 4.7.1, etc. The upgrade mechanism will only consider the first 3 parts. This means, we are free to create 4.8.0.16 (our 16th custom version of 4.8.0) as long as we do not touch the database. That sounds like a nice way to make custom daily releases, and at the same time can be used for those wanting to build LTS (Long Term Support) versions.
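To illustrate the point (a hypothetical snippet, not part of CloudStack itself): the part of the version the upgrade mechanism cares about is just the first three dot-separated components, which you can extract with cut:

```shell
# The upgrade checker only compares the first three components,
# so a custom 4.8.0.16 build is treated as 4.8.0 for database purposes.
full_version="4.8.0.16"
db_version=$(echo "$full_version" | cut -d. -f1-3)
echo "$db_version"
```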

Setting the version
The question then is: how can we modify the version of Apache CloudStack? Well, there’s actually a tool supplied in the source that does this for you, called setnextversion.sh. Here’s how it works:

usage: ./tools/build/setnextversion.sh -v version [-b branch] [-s source dir] [-h]
  -v sets the version
  -b sets the branch (defaults to 'master')
  -s sets the source directory
  -h

To build our custom 4.8.0.16 version off the 4.8 branch we run:

./tools/build/setnextversion.sh -v 4.8.0.16 -b 4.8 -s /data/git/cs1/cloudstack/

The output shows a lot of things; the most interesting part:

found 4.8.0 setting version numbers 
[master 27fa04a] Updating pom.xml version numbers for release 4.8.0.16
126 files changed, 130 insertions(+), 130 deletions(-)
committed as 858805957e2c7e0a0fbeb170a7f1681a73b4fb7a

The result is a new commit that changed the versions in the POMs. You can see it here:

git log


You’ve successfully set the version!

Compiling the custom version
Let’s compile it. This works like with any release build:

mvn clean install -P systemvm

This will take a few minutes. You should see your new version flying by lots of times.


After a few minutes the build completes.

By the way, if you want to skip the unit tests to speed up the process, add -DskipTests to the mvn command.


RPM packages
The source also contains scripts to build packages. To build packages for CentOS 7 for example, do this:

cd ./packaging
./package.sh -d centos7

You should see this:

Preparing to package Apache CloudStack 4.8.0.16

When the script finishes, you can find the RPM packages here:

ls -la ../dist/rpmbuild/RPMS/x86_64/


Installing the new version
You can either use the war that results from the compile, or install the generated RPM packages. Installing and upgrading are out of scope for this blog, so I assume you know how to install a stock version of CloudStack. I first installed a stock 4.8.0 version, then (as shown above) built 4.8.0.16, and will show the upgrade.

As soon as you start the management server with the new version you will see this:

2016-03-14 12:34:09,986 DEBUG [c.c.u.d.VersionDaoImpl] (main:null) (logid:) Checking to see if the database is at a version before it was the version table is created
2016-03-14 12:34:09,995 INFO  [c.c.u.DatabaseUpgradeChecker] (main:null) (logid:) DB version = 4.8.0 Code Version = 4.8.0.16
2016-03-14 12:34:09,995 INFO  [c.c.u.DatabaseUpgradeChecker] (main:null) (logid:) DB version and code version matches so no upgrade needed.

This is actually very nice. We made a custom version and CloudStack still assumes it’s 4.8.0, so no upgrade of the database is needed. This obviously means that you cannot do this when your patch requires a database change.

When we look at the database, we can confirm we still run a 4.8.0-compatible release:


From this table, one cannot tell we upgraded to our custom version. But when you look closer, the new version is active.

The agents also report the new version.

Also, the UI will show the custom version in the About box. This way users can easily tell what version they are running.

Conclusion
Creating your own custom version of Apache CloudStack may sound complicated, but we’ve seen it’s pretty easy to do. Creating a custom release provides you with a lot of flexibility, especially if you combine it with war dropping. By numbering your versions like 4.8.0.x, you don’t have to worry about upgrade paths.

Happy releasing!

Upgrading CloudStack can be tough at times. Some say you should therefore do as few upgrades as possible. My approach is the opposite: make upgrading easy and keep each change as small as possible. The more upgrades, the better. The smallest unit is a single Pull Request with a single commit, which we should be able to deploy to production in an automated way with almost no effort. At Schuberg Philis, our current deploy record to our Mission Critical Clouds is five times a day. Here’s how we do it.

CloudStack is a Java application

CloudStack is a Java application, and when you compile it, it produces jars and also a war. A war-file (short for Web application ARchive), according to Wikipedia, is used “to distribute a collection of JavaServer Pages, Java Servlets, Java classes, XML files, tag libraries, static web pages (HTML and related files) and other resources that together constitute a web application.” All-in-one, that’s handy! Tomcat is used to serve Java applications, and one can add an application to it by what we call “dropping a war”. This means that Tomcat will unpack and start the application inside the war-file.

Our goal: compile a war from the CloudStack source, then tell Tomcat to run it. Let me show you how we do this.

Installing Tomcat

In this blog I’ll show you how to do it on CentOS 7, as this is currently my favourite distribution. Tomcat runs on any OS, so with slight adjustments you can use another distribution as well.

yum install tomcat mkisofs

The package mkisofs is used by the Management Server. We’ll configure Tomcat later on.


The MySQL connector story

If you try dropping a CloudStack war into Tomcat, you will find that it doesn’t work because the MySQL driver cannot be found. My colleague Miguel Ferreira sent a Pull Request to add it, but it seems that wasn’t allowed due to licensing issues:

There’s a reason why the MySQL connector is not a dependency – it is Cat-X licensed, which means we may not depend on it in the default build.

Miguel then investigated alternatives and finally found that when you install the mysql-connector-java RPM package, and then put it at the front of the ClassPath, Tomcat will pick it up automatically. Details are in the Pull Request. Awesome work dude!

In short, this makes it work:

yum install -y -q tomcat mysql-connector-java
echo "CLASSPATH=\"/usr/share/java/mysql-connector-java.jar:${CLASSPATH}\"" >> /etc/sysconfig/tomcat

Building a CloudStack war-file

Time to build a CloudStack war-file. In order to do that, you need the CloudStack source and Maven + Java.

Clone the source:

git clone https://github.com/apache/cloudstack.git

Install these packages:

yum install maven tomcat mkisofs python-paramiko jakarta-commons-daemon-jsvc jsvc ws-commons-util genisoimage gcc python MySQL-python openssh-clients wget git python-ecdsa bzip2 python-setuptools mariadb-server mariadb python-devel nfs-utils setroubleshoot openssh-askpass java-1.8.0-openjdk-devel.x86_64 rpm-build

Set the Maven options like this:

export MAVEN_OPTS="-Xmx1024m -XX:MaxPermSize=512m -Xdebug -Xrunjdwp:transport=dt_socket,address=8787,server=y,suspend=n -Djava.net.preferIPv4Stack=true"

Set Tomcat options:

echo "JAVA_OPTS=\"-Djava.awt.headless=true -Dfile.encoding=UTF-8 -server -Xms1536m -Xmx3584m -XX:MaxPermSize=256M\"" >> ~tomcat/conf/tomcat.conf

Then go into the root of the source folder and run:

mvn clean install -P systemvm

If you have a fast computer, you may add -T 4 to compile with 4 threads. It’s faster, that’s all.

After a while you see:

[INFO] --------------------------------------------
[INFO] BUILD SUCCESS
[INFO] --------------------------------------------
[INFO] Total time: 06:09 min (Wall Clock)
[INFO] Finished at: 2016-02-29T11:21:50+01:00
[INFO] Final Memory: 109M/1571M
[INFO] --------------------------------------------

Notice this line:

[INFO] Building war: /data/git/cs1/cloudstack/dist/rpmbuild/BUILD/cloudstack-4.8.0/client/target/cloud-client-ui-4.8.0.war

So, the war is in client/target/ folder.

Copy the war-file to the Management server that will run Tomcat.

You can build the war-file on the management server itself, on another server, even on your laptop if you like. Just make sure you copy it to the management server (mine is named cs1).

scp client/target/cloud-client-ui-4.8.0.war root@cs1.cloud.lan:/tmp

Dropping a war-file into Tomcat

The war-file needs to be added to Tomcat. You can use the manager UI for this and do it while Tomcat keeps running. To keep this demo simple, let’s stop Tomcat and assume CloudStack is the only application it runs:

systemctl stop tomcat

Be sure to go to the Tomcat home folder:

cd ~tomcat/webapps/

Then copy the war-file you built, but rename it to ‘client.war’. This is because the application is supposed to be called ‘client’; that’s also why the URL contains ‘/client’.

cp /tmp/cloud-client-ui-4.8.0.war ~tomcat/webapps/client.war

Folder now looks like:

[root@cs1 webapps]# ls -al
total 247836
drwxrwxr-x. 3 root tomcat 4096 Feb 29 10:42 .
drwxr-xr-x. 3 root tomcat 4096 Oct 28 12:49 ..
-rw-r--r--. 1 root root 253766159 Feb 29 10:41 client.war

Starting Tomcat will automatically unpack the war and start it:

systemctl start tomcat

That’s it! Now, have a look at the folder:

[root@cs1 webapps]# ls -al
total 247836
drwxrwxr-x. 3 root tomcat 4096 Feb 29 10:42 .
drwxr-xr-x. 3 root tomcat 4096 Oct 28 12:49 ..
drwxr-xr-x. 11 tomcat tomcat 4096 Feb 29 10:42 client
-rw-r--r--. 1 root root 253766159 Feb 29 10:41 client.war

Management server log is here:

less ~tomcat/vmops.log

As you’ll see, CloudStack is started and the UI is available after a few minutes.

Missing files in war-file

If you want to deploy a new cloud, you’ll find that some stuff goes wrong. First of all, the database scripts are not in the war. You can place them in the Tomcat home dir.

To generate an archive of the files from the CloudStack source:

cd setup/db; tar -c -f ~/Downloads/db-scripts-${CLOUD_VERSION}.tar.gz -pPz db/; cd -

Copy this to the Tomcat homedir and extract it:

cd ~tomcat
tar zvxf db-scripts-${CLOUD_VERSION}.tar.gz
chown tomcat.tomcat ./db -R

The same goes for any additional scripts, like ‘cloudstack-setup-management’ etc. They are not part of the war, so you need to copy them should you want to use them.

Legacy /etc/cloudstack folder

If you’re looking for config files, they’re all inside the webapps folder. This means both a new location and that you need to back them up and restore them when dropping a new war-file.

You could make a symlink to the new location like this:

mkdir -p /etc/cloudstack
cd /etc/cloudstack
ln -s /var/lib/tomcat/webapps/client/WEB-INF/classes management 

Another option is to copy the files and add ‘/etc/cloudstack/management’ to the CLASSPATH. The benefit is that you keep your config files out of the client folder, so persisting them is easier.
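As a rough sketch of that second option (paths and the choice of db.properties are assumptions; adjust to your install):

```shell
# Sketch: copy config out of the exploded war and put it first on the
# classpath, so a fresh war drop doesn't wipe your settings.
mkdir -p /etc/cloudstack/management
cp -a /var/lib/tomcat/webapps/client/WEB-INF/classes/db.properties /etc/cloudstack/management/
echo 'CLASSPATH="/etc/cloudstack/management:${CLASSPATH}"' >> /etc/sysconfig/tomcat
```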

The user tomcat runs as

CloudStack wants to run as user ‘cloud’, don’t ask me why. For development it works fine as user tomcat, but for some things to work (like encryption) you’d better change it to user ‘cloud’:

vim /etc/sysconfig/tomcat 
TOMCAT_USER="cloud"


Automation!

We obviously want automation in place for this to work without even thinking about it. You can use your favourite configuration management tool.


Flipping an attribute does the trick.

The work-flow we use looks like this:

– Build a new war, then upload it to some central place
– Flip a Chef attribute to indicate the new version should be installed
– Chef then takes over and:
  – downloads the war-file
  – downloads the db-scripts.tar.gz
  – stops Tomcat
  – backs up settings, like db.properties and such
  – removes old files
  – puts the new client.war in the ~tomcat/webapps folder
  – extracts db-scripts.tar.gz in the Tomcat home folder
  – starts Tomcat
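Those steps can be sketched as a plain shell sequence (a hypothetical sketch: the repository URL, version and paths are placeholders; in reality Chef performs these steps):

```shell
#!/bin/sh
# Sketch of the deploy steps Chef automates for us.
set -e
VERSION="4.8.0.16"                          # placeholder version
REPO="http://repo.example.com/cloudstack"   # placeholder artifact store

wget -q "${REPO}/cloud-client-ui-${VERSION}.war" -O /tmp/client.war
wget -q "${REPO}/db-scripts-${VERSION}.tar.gz" -O /tmp/db-scripts.tar.gz

systemctl stop tomcat
cp -a /var/lib/tomcat/webapps/client/WEB-INF/classes/db.properties /tmp/  # back up settings
rm -rf /var/lib/tomcat/webapps/client /var/lib/tomcat/webapps/client.war  # remove old files
cp /tmp/client.war /var/lib/tomcat/webapps/client.war
tar zxf /tmp/db-scripts.tar.gz -C /var/lib/tomcat/
chown -R tomcat:tomcat /var/lib/tomcat/db
systemctl start tomcat
```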

Loadbalanced Management Servers

As you can see, there is some downtime involved when upgrading to a newer version. When you set up two or more CloudStack application servers with a loadbalancer in front of them, you can upgrade the management servers one by one without service interruption.

Conclusion

CloudStack clearly wasn’t designed for easy war-dropping into Tomcat, but as we’ve seen it can be done. Once you overcome a few hurdles, you’ll find it is actually much easier than upgrading RPMs all the time. This comes in handy especially when you want to deploy the same version with an extra bugfix.

It’d be nice to enhance CloudStack to fully support war drops. Even though users may want to use RPM packages, we could simplify those by fetching a (signed) war artifact from Maven Central or similar, and having the RPM package set up everything around it. We would then get the best of both worlds: use RPMs if you want, or simply drop a war-file if you want to go faster.

For Schuberg Philis, this allows us to do continuous deployments to our production clouds. By keeping the number of changes as small as possible, we have many small deployments that are easy to test and easy to revert, should we have to.

Welcome to the new ‘continuous everything‘ world!

Watch my talk about how we do CloudStack Operations. It was presented at the CloudStack Collaboration Conference in Dublin, on October 9th 2015.

The slides can be found in this post, and the code can be found on Github.

Here are the slides from my talk about CloudStack automation, at the CloudStack Collaboration Conference in Dublin today.

At Schuberg Philis, we manage quite a farm of XenServer clusters. As the number of clusters we operate goes up, so does the time it takes to manage them all. And, let’s be honest: patching and upgrading isn’t the most exciting work. We needed more automation, and I took the challenge 😉

The challenge
Patching itself is not a challenge, as we automated that already using Chef. The challenge comes when maintenance tasks require a hypervisor reboot in order to take effect. In such cases, some magic is needed to keep all VMs running, because we aim for zero down time due to maintenance. This blog describes how I automated rebooting full XenServer clusters while all VMs keep running.

N+1 concept
Our clusters are designed based on N+1, which means that if we have a cluster of 6 XenServer hypervisors, we use ~83% (5/6) of its capacity. That way, when one hypervisor crashes, we still have room for the VMs of the crashed hypervisor to start again on one of the remaining hosts.
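A quick back-of-the-envelope check of that number:

```shell
# N+1 capacity: with N hosts, plan to use at most (N-1)/N of the cluster,
# leaving one host worth of headroom for failures or maintenance.
hosts=6
usable=$(( (hosts - 1) * 100 / hosts ))
echo "${usable}% usable"
```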

When executing maintenance on our hypervisors, we also use this concept: we empty one hypervisor by live migrating all its VMs to the other hypervisors. With no VMs left on it, we can easily patch or upgrade that hypervisor without impact.

Goal
The goal that I had in mind was that I wanted to be able to run a one-liner to automatically reboot a XenServer cluster. Based on the N+1 principle above, that should be possible. The image below shows the process:

This is what happens:

  • Set the specified cluster to unmanage in CloudStack
  • Turn OFF XenServer poolHA for the specified cluster
  • For any hypervisor it will do this (poolmaster first):
    • put it to Disabled aka Maintenance in XenServer
    • live migrate all VMs off of it using XenServer evacuate command
    • when empty, it will reboot the hypervisor
    • will wait for it to come back online (checks SSH connection)
    • set the hypervisor to Enabled in XenServer
    • continues to the next hypervisor
  • When the rebooting is done, it enables XenServer poolHA again for the specified cluster
  • Finally, it sets the specified cluster to Managed again in CloudStack
  • CloudStack will update its administration according to the new situation. The reboot cycle for the specified cluster is then done!
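The per-hypervisor part of that cycle can be sketched with plain xe commands (a rough sketch: the host list is a placeholder, and the real xenserver_rolling_reboot.py also handles the CloudStack unmanage, poolHA toggling and poolmaster-first ordering):

```shell
# Rough sketch of the rolling reboot, one hypervisor at a time.
for host in xen1 xen2 xen3; do                # placeholder host list
  uuid=$(xe host-list name-label="$host" --minimal)
  xe host-disable uuid="$uuid"                # Disabled aka Maintenance
  xe host-evacuate uuid="$uuid"               # live migrate all VMs off
  xe host-reboot uuid="$uuid"
  until ssh -o ConnectTimeout=5 "root@$host" true 2>/dev/null; do
    sleep 30                                  # wait for SSH to come back
  done
  xe host-enable uuid="$uuid"
done
```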

About CloudStack and XenServer
Since XenServer has the concept of a pool (you could call it a cluster) and one of the hosts is the pool master (it distributes the work to the other cluster members), it is important to know that we need to start with the pool master.

If you want to be sure CloudStack is not touching the cluster while you are performing maintenance, you can put it into the ‘Unmanaged’ state. In the old days we had to do this to prevent CloudStack from electing a new pool master while the previous one was rebooting. As it now relies on XenHA to handle electing a new pool master, I think it’s not strictly needed. I just kept it in the scripts to be sure.



The script I wrote for this is called ‘xenserver_rolling_reboot.py’ and is part of the CloudStackOps Toolbox. If you haven’t heard of it before, please check this video of a recent talk I did.

The slides can be found here.

You can get the code from Github.

The one-liner
Now the fun part! All I have to do to reboot a XenServer cluster that is called ‘CLUSTER-1‘ is this:

./xenserver_rolling_reboot.py --clustername CLUSTER-1

To be sure, you need to specify --exec to make it actually do something.

./xenserver_rolling_reboot.py --clustername CLUSTER-1 --exec

Demo time 🙂
I recorded a screencast that demonstrates how it works. Check this video:

As we saw, the script live migrates all VMs off of the pool master, then reboots it. After that, it does the same with all the other hosts. One-by-one. Until they are all rebooted. All VMs stay online during the maintenance.

About live migrations
You would think it’d be easiest to have CloudStack do the live migrations, and I thought so too. The problem is that CloudStack live migrates the VMs to a random hypervisor in the same cluster. It depends a bit on the deployment planner, but I found little difference in this case. Why is that a problem? Well, you could end up ‘pushing VMs forward’ all the time. I want to live migrate each VM as few times as possible; this makes the whole process a lot faster. Let’s have a look at how the live migrations should work in my opinion:

(diagrams: the starting situation, then migrating the pool master, host2, host3 and host4 in turn)

Conclusion: VMs on the pool master are live migrated twice; all others only once. Optionally we could rebalance the cluster, causing more live migrations. For now, I kept it to the minimum, as I’m OK with some hosts being almost empty.

In order to control the live migrations shown in the images above, I decided to talk to Xapi directly instead of to CloudStack.

Speeding up live-migrations
Unfortunately the whole process was quite slow when I first tested it. The first idea was to use ‘xe host-evacuate’, but that command can only do one live migration at a time. I asked some XenServer folks about it; the conclusion is that it cannot be changed at this time.


Well, that meant I had to come up with something of my own to speed up the live migrations. To prove it was possible to migrate faster, we used XenCenter and clicked around. We could easily do 5-10 threads and it was a lot faster. But who wants to click these days? Not me.

After some searching on Google I came up with a very simple solution: use the -P flag of xargs. I wrote a small script that looks up the VMs running on the hypervisor you run the script on, and calculates a migration plan as discussed above. That gives me a list of ‘xe vm-migrate’ commands, which I run in parallel using xargs. Very simple, very effective.

By using 5 threads, I got twice the speed, which saved a lot of waiting time. I also tried more threads, but it didn’t go any faster. The ‘xenserver_rolling_reboot.py’ script has a --threads flag, so feel free to play with it. It defaults to 5 threads.
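The xargs trick itself is simple enough to show in isolation. Below is a dry-run sketch: the migration plan and its VM/host names are made up for this example; each line is a complete ‘xe vm-migrate’ command, and xargs -P runs up to 5 of them at once. Drop the echo to execute them for real.

```shell
# Build a (hypothetical) migration plan: one full xe command per line.
printf '%s\n' \
  'xe vm-migrate vm=vm1 host=xen2 live=true' \
  'xe vm-migrate vm=vm2 host=xen3 live=true' > /tmp/migration-plan.txt

# -P 5: up to 5 parallel jobs; -I {}: substitute the whole line.
# The echo makes this a dry run; remove it to actually migrate.
xargs -P 5 -I {} sh -c 'echo {}' < /tmp/migration-plan.txt
```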

Verified on Production
My team uses this script to do automated patching and rebooting. For the patching itself we use Chef, but that’s the easy part. Rebooting the clusters is what takes time. Now that it is automated, we only have to keep an eye on it. We used it in production to reboot hundreds of XenServer hypervisors without issues.

Conclusion
Automation is key in today’s cloudy environments. This blog shows how I automated the regular maintenance of our XenServer farm. Patching and rebooting is now easy!

Here are the slides from my presentation at the CloudStack meetup today in Amsterdam.

Get the code on Github.

At work we’re creating a CloudStack development environment that can be created and destroyed as we go. One of the requirements is to quickly create a couple of hypervisors to run some tests. KVM is easy: those hypervisors are just Linux boxes with some extra packages. XenServer is a different story.

Imagine you create a XenServer by installing it from ISO, then save it as a template. Then you create two instances from it called xen1 and xen2. If you try to add them to CloudStack, you’ll notice only the first one succeeds.


The logs tell you why it did not work:

Skipping xen2 because 7af82904-8924-49fa-b0bc-49f8dbff8d44 is already in the database.

Hmm… so do both have the same UUID? Let’s see. To get the UUID of a XenServer, simply run this:

xe host-list

The output of both my XenServers is below:

[root@xen1 ~]# xe host-list 
uuid ( RO) : 7af82904-8924-49fa-b0bc-49f8dbff8d44
 name-label ( RW): xen1
 name-description ( RW): Default install of XenServer
[root@xen2 ~]# xe host-list 
uuid ( RO) : 7af82904-8924-49fa-b0bc-49f8dbff8d44
 name-label ( RW): xen2
 name-description ( RW): Default install of XenServer

CloudStack is now confused and wants them to have a unique UUID, which makes sense.

It was quite a puzzle to come up with an automatic process. A couple of colleagues of mine each contributed a part, and we finally came up with something that works on both XenServer 6.2 and 6.5.

First, you need to generate new UUIDs in ‘/etc/xensource-inventory’ for both the XenServer host and dom0. Do this after you’ve stopped the Xapi process. You should also wipe Xapi’s state.db.

An important part consists of running ‘/opt/xensource/libexec/create_templates’ and then resetting the network, after which the XenServer reboots.
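In outline, the idea looks like this (a rough, untested sketch; use the published script instead, since it handles the ordering and the two required reboots):

```shell
# Rough outline of the procedure; not a drop-in replacement for the script.
service xapi stop

# Generate fresh UUIDs for the host and its control domain (dom0).
sed -i "s/^INSTALLATION_UUID=.*/INSTALLATION_UUID='$(uuidgen)'/" /etc/xensource-inventory
sed -i "s/^CONTROL_DOMAIN_UUID=.*/CONTROL_DOMAIN_UUID='$(uuidgen)'/" /etc/xensource-inventory

rm -f /var/xapi/state.db             # wipe Xapi's state database
/opt/xensource/libexec/create_templates
xe-reset-networking                  # resets the network; the host reboots
```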

To make it easy for others, we’ve published the script on Github.

It should be run as a ‘first boot’ script. It needs two reboots to fully complete the procedure, and it does this automatically.

When you look at the UUIDs again after running the script:

[root@xen1 ~]# xe host-list 
uuid ( RO) : ae2eeb9b-ed7e-4c3c-a619-376f8a632815
 name-label ( RW): xen1
 name-description ( RW): Default install of XenServer
[root@xen2 ~]# xe host-list 
uuid ( RO) : 50adc2d0-8420-425c-a1c8-68621bd68931
 name-label ( RW): xen2
 name-description ( RW): Default install of XenServer

They’re unique now! Adding them works fine… but hey, they have the same name in CloudStack? That’s because the name-label also needs to be set to the new hostname:

xe host-param-set uuid=$(xe host-list params=uuid | awk '{print $5}' | head -n 1) \
  name-label=$HOSTNAME

Just use our script and you’ll be fine!

Update:

Rohit from ShapeBlue, a fellow engineer working on CloudStack, almost immediately responded to my blog:


He sent me a pull request that turns the script into an init-script for use in XenServer. That makes it even easier for others to use it. He also updated the documentation:

To use this in a VM template for testing XenServer hosts in VMs:

scp xenserver_make_unique.sh root@xenserver-host-ip:/opt/ 
scp init.d/xenserver_make_unique root@xenserver-host-ip:/etc/init.d/ 
ssh root@xenserver-host-ip "chmod +x /etc/init.d/xenserver_make_unique && chkconfig xenserver_make_unique on"

When the XenServer host starts for the first time, it resets the host uuid by running the script from /opt/xenserver_make_unique.sh, then removes the init.d script and reboots the host.

You only need to do this once in your template, and all XenServers you will create from it will be unique. Enjoy!