Friday, February 17, 2012

How to use the vhd-util tool in xenserver




I have always found that the vdi chain in xenserver is one area where there is always some misunderstanding on how it all hangs together so I thought I would blog about how the vdi chains work and the vhd files make up the vdi chain. When running blocked based storage in XenServer, it isn't easy to see the underlying files and to do so you have to use the vhd-util tool. I have put together a guide which helps how you can see what vhd files make up the vdi chain.
 First of all, run vgs to display the volume groups, look for the the iscsi volume group and note down the name

VG_XenStorage-7ec18595-0bf0-859a-1e85-7e19721dad84   1  51   0 wz--n- 799.99G 495.84G

Here are some vhd-util scan options you can use.

usage: [OPTIONS] FILES

options: [-m match filter] [-f fast] [-c continue on failure] [-l LVM volume] [-p pretty print] [-a scan parents] [-v verbose] [-h help]

A typical command would be.

vhd-util scan -m "VHD-*" -f -c -l VG_XenStorage-7ec18595-0bf0-859a-1e85-7e19721dad84 -p –v

And the output would be this:

[root@xenserver ~]# vhd-util scan -m "VHD-*" -f -c -l VG_XenStorage-7ec18595-0bf0-859a-1e85-7e19721dad84 -p –v

vhd=VHD-79d0aca7-eda7-4d73-9238-c3b5c9378aa7 capacity=26843545600 size=4630511616 hidden=1 parent=none
   vhd=VHD-691d986e-e84a-4ab4-9672-b6f9b3148cab capacity=26843545600 size=26902265856 hidden=0 parent=VHD-79d0aca7-eda7-4d73-9238-c3b5c9378aa7
   vhd=VHD-838154db-db5c-42b7-8ec3-2ebb31f73683 capacity=26843545600 size=8388608 hidden=0 parent=VHD-79d0aca7-eda7-4d73-9238-c3b5c9378aa7
vhd=VHD-80c4d84c-8aea-4391-9d31-fcc2be388ce6 capacity=21474836480 size=4626317312 hidden=1 parent=none
   vhd=VHD-0eaa0588-3668-4697-956d-bd6e98478585 capacity=21474836480 size=8388608 hidden=0 parent=VHD-80c4d84c-8aea-4391-9d31-fcc2be388ce6
   vhd=VHD-240a20e8-0d70-4b4a-88c1-e4b2a6e138e7 capacity=21474836480 size=21525168128 hidden=0 parent=VHD-80c4d84c-8aea-4391-9d31-fcc2be388ce6
vhd=VHD-a5d73baa-1eb6-4fe6-a2e9-32ddcf14555a capacity=2147483648 size=2101346304 hidden=1 parent=none
   vhd=VHD-87e8553e-340d-4269-8401-f2b4be874b62 capacity=2147483648 size=8388608 hidden=0 parent=VHD-a5d73baa-1eb6-4fe6-a2e9-32ddcf14555a
vhd=VHD-1b10a183-ef8d-4c53-a031-b9aeeb38e0be capacity=10737418240 size=10766778368 hidden=0 parent=nonevhd=VHD-2e45811f-c3a1-48bf-b6b0-0d0fb671da8e capacity=5368709120 size=5385486336 hidden=0 parent=none
vhd=VHD-6fe32f92-bc88-4f98-a4e1-47bf04894ce8 capacity=21474836480 size=21525168128 hidden=0 parent=none
vhd=VHD-8336df72-992d-4a52-a5e3-4a776f1f86f8 capacity=5368709120 size=5385486336 hidden=0 parent=none

Typically, the chain consists of a base-copy, where VHD has the attributes hidden=1 parent=none. Subsequent VHD chain’s have the following attributes - hidden=0 parent=<VHD-UUID>. The following is a good example of a two child chain, which is actually the domain router in Cloudstack

vhd=VHD-ff1c907a-d75d-444e-af85-735121fb9794 capacity=2097152000 size=2101346304 hidden=1 parent=none
   vhd=VHD-14175cb0-35f5-4b45-b686-7d34348283b4 capacity=2097152000 size=2109734912 hidden=0 parent=VHD-ff1c907a-d75d-444e-af85-735121fb9794

It starts getting tricky when snapshots get involved. There is a coalescing process which runs and changes the chains – good article on how snapshots work http://support.citrix.com/servlet/KbServlet/download/21626-102-671572/XenServer_-_Understanding_Snapshots_(v1.1).pdf

Relating this to how Cloudstack typically works, we should only see a maximum chain on 2 childs. Each base-copy relates to a unique template. Chains are then created from the base-copy. Multiple templates will mean multiple base-copies. A base-copy which stems from a cloudstack template can exists on many SRs across many pools. If you were taking multiple snapshots using XenCenter, you would have multiple VHD chains which would reflect the snapshot structure.

Hope that helps understanding VHD chains. It is handy to see how the guts of the disk structure hangs together as sometimes you need to troubleshoot broken vdi health, I dare I say the dreaded error 22.

Sunday, February 5, 2012

Using the cloudstack API

I’ve recently been working on a project to enhance a testing strategy that involved revisiting cloudstack API. I dove headfirst into the deep end, which made me realise just how much you can do with it. I’d like to share some of my experiences, in the hope of getting some feedback from people who have used CloudStack or CloudPlatform or are thinking about it and want to find out more. After plunging into the depths of the API, this is what I discovered.

First, it’s good! Everything you can do in our compute platform can be done via the API and it’s actually quite simple once you get the hang of things. I wrote my test scripts in Python, but there are a few different languages you can use and I know some guys  have dived in using PHP and .NET. There is also a Java library available from Jclouds so you can make your choice depending on your comfort zone.

Second, most of the API commands are asynchronous, depending on what you’re trying to do. This means you can call an API command and move on to the next one or, like me, hold out for the async command response. This way you can do one task, wait for the response and grab certain criteria for the next command. You can then build up a fairly comprehensive set of commands. For example, you can deploy a virtual machine (VM) using the deployVirtualMachine together with the following:
  • Serviceofferingid: The serviceofferingid relates to what instance size you wish to use. You can get a list of available service offerings by using the listServiceOfferingscommand.
  • Templateid: The templateid refers to the ID of either the templates we offer to customers, or one you have preconfigured. These values can be obtained by using the listTemplates command.
  • Zoneid: The zoneid signifies the zone in which you deploy the VM. This is achieved by running the listZones command.
The good news is that once you have these values, you can use them over and over again. Or you can list all the template IDs and service offerings and pass a value from a collection if you want different size instances or want to use different templates.

Once you have built up your deployVirtualMachine API command string, you’re ready to move on to the next command – and this is just the beginning. As the deployVirtualMachine is an async command, what you really need before you can move on is the ID of the VM you have deployed. 
Then you can do other things like attach disks if required, assign an IP or enable a static NAT, as these commands need a VM ID to work. This is where the queryAsyncJobResult command comes into play. Once you run your command to deploy a VM, it responds with a jobid and jobstatus. The jobid is the asyncjobid number, which you can query using the queryAsyncJobResult command. Once the deployVirtualMachine job finishes, the jobstatus changes to 1, which means you can find out all sorts of information about the VM that has been deployed. One key piece of information is the VM ID, which is required for most commands to work. Once you have this, the world is your oyster.
What I did was create a class in Python, which I was able to reuse to find out the status of the async job. Once I had this working, I was able to call the method in the class every time I wanted to find out the status of the async job.
So, let’s quickly look at attaching a disk to a virtual machine. First, you need to create a disk using createVolume and, again, apply the queryAsyncJobResult to find out the ID of the volume you have created. You will also use the diskofferingid to create a disk of a certain size (I created a 50GB disk). Once you have this information and the queryAsyncJobResult comes back, you are ready to attach the disk to the VM. Here are the required API commands to do so:
  • The virtualmachineid is the ID of the VM, which you can get when using the queryAsyncJobResult of the deployVirtualMachine job.
  • ID refers to the ID of the disk volume you can obtain when using the queryAsyncJobResult of the createVolume.
Once you have built up your createVolume and have the corresponding async job results, you’re ready to use the attachVolume commands. Hey presto! You’ve now deployed a VM and attached a disk. And so it continues…
This isn’t a tutorial or documentation explaining how to use our API but merely a blog on how I did it. I’ve just finished testing all our API commands and yes, there are many ways to skin a cat, but the point is that it can be done and you can achieve a lot with just the click of a button. It’s not all plain sailing – it never is – but once you get involved, you can easily work these things out and it does become quite simple.

I’m keen to hear from people who have used our API and what programming language they used, or from people who are thinking about using the API its capabilities. I’m happy to share some of the framework of the Python scripts I’ve put together, so get in touch if you’d like a hand getting started with our compute API.