Monday, March 12, 2012

How to enable DNS on a NetApp running ONTAP
I had to enable DNS on a NetApp ONTAP device the other day and, as I am in the thick of learning NetApp, I thought I would write this one down. It is pretty easy, but you need to do it through the CLI.

At first, DNS is not enabled. You can see this in the System Manager UI:

You have to drop to the command line and edit a few files to enable DNS. The cut-down FreeBSD-style CLI is, yep, cut down. However, once you know your way around, it is pretty easy. When you need to read and write a file, there are two key commands:

wrfile
rdfile

wrfile writes to a file. I don't like this, but you need to press CTRL-C to quit out of the file, and you get this message when you do so:

read: error reading standard input: Interrupted system call

However, you can also use wrfile -a, which appends to the file. It's not vi, that's for sure.

Anyway, back to the point. Below is how to set up DNS, plus a sneaky gotcha you need to be aware of.

If you just try to enable DNS by running options dns.enable on, you might get this message:

Setting option dns.enable to 'on' conflicts with /etc/rc that sets it to 'off'

There is an rc file that loads services at boot, and this is where DNS is set. By default it is off, as you can see:

ontap> rdfile /etc/rc
hostname ontap
ifconfig e0a `hostname`-e0a mediatype auto flowcontrol full netmask 255.255.255.0 mtusize 1500
route add default 10.10.10.1 1
routed on
options dns.enable off

options nis.enable off
savecore


You can see it states dns.enable off. This means that, while you can start DNS by running options dns.enable on, with the rc file set this way the setting is not persistent across reboots. So first you need to update the /etc/rc file and set DNS to be enabled.

Hint: you can rdfile /etc/rc, then copy and paste the contents in appropriately when you run wrfile /etc/rc. You'll get the drift when you have a go yourself. So here goes:

ontap> wrfile /etc/rc
hostname ontap
ifconfig e0a `hostname`-e0a mediatype auto flowcontrol full netmask 255.255.255.0 mtusize 1500
route add default 10.10.10.1 1
routed on
options dns.enable on
options nis.enable off
read: error reading standard input: Interrupted system call


Here you can see the read: error reading standard input: Interrupted system call message again. This is because you have to CTRL-C out of the wrfile command to save your changes. If anyone knows a way around this, please leave a comment. However, man wrfile doesn't suggest one.
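As an aside, if you'd rather prepare the new rc contents somewhere less unforgiving than wrfile before pasting them in, a rough Python sketch like this (run on your own workstation, since the ONTAP CLI has no Python; the filename and option strings are just the ones shown above) does the edit for you:

```python
def enable_dns_in_rc(rc_text):
    """Return /etc/rc contents with dns.enable switched on.

    Every other line is left untouched; the option is appended if missing.
    """
    out, found = [], False
    for line in rc_text.splitlines():
        if line.strip().startswith("options dns.enable"):
            out.append("options dns.enable on")
            found = True
        else:
            out.append(line)
    if not found:
        out.append("options dns.enable on")
    return "\n".join(out) + "\n"
```

You can then paste the result into wrfile /etc/rc and CTRL-C out as before.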

Now that you have the /etc/rc file set up with DNS enabled, you need to change /etc/resolv.conf. Here you can use the wrfile -a command. Just append your DNS nameserver like so:

ontap> wrfile -a /etc/resolv.conf nameserver 10.10.10.10

Lastly, you need to run the following command to turn on DNS:

ontap> options dns.enable on

And there you have it. To prove DNS is now running, the UI will show your changes:
Now I can continue setting up CIFS and adding the ONTAP device to AD. Pretty straightforward once you know how.

Friday, February 17, 2012

How to use the vhd-util tool in XenServer

I have always found that the VDI chain in XenServer is one area where there is some misunderstanding about how it all hangs together, so I thought I would blog about how VDI chains work and how VHD files make up the chain. When running block-based storage in XenServer, it isn't easy to see the underlying files; to do so you have to use the vhd-util tool. I have put together a guide that shows how you can see which VHD files make up a VDI chain.
First of all, run vgs to display the volume groups, look for the iSCSI volume group and note down its name:

VG_XenStorage-7ec18595-0bf0-859a-1e85-7e19721dad84   1  51   0 wz--n- 799.99G 495.84G

Here are some vhd-util scan options you can use.

usage: [OPTIONS] FILES

options: [-m match filter] [-f fast] [-c continue on failure] [-l LVM volume] [-p pretty print] [-a scan parents] [-v verbose] [-h help]

A typical command would be:

vhd-util scan -m "VHD-*" -f -c -l VG_XenStorage-7ec18595-0bf0-859a-1e85-7e19721dad84 -p -v

And the output would be this:

[root@xenserver ~]# vhd-util scan -m "VHD-*" -f -c -l VG_XenStorage-7ec18595-0bf0-859a-1e85-7e19721dad84 -p -v

vhd=VHD-79d0aca7-eda7-4d73-9238-c3b5c9378aa7 capacity=26843545600 size=4630511616 hidden=1 parent=none
   vhd=VHD-691d986e-e84a-4ab4-9672-b6f9b3148cab capacity=26843545600 size=26902265856 hidden=0 parent=VHD-79d0aca7-eda7-4d73-9238-c3b5c9378aa7
   vhd=VHD-838154db-db5c-42b7-8ec3-2ebb31f73683 capacity=26843545600 size=8388608 hidden=0 parent=VHD-79d0aca7-eda7-4d73-9238-c3b5c9378aa7
vhd=VHD-80c4d84c-8aea-4391-9d31-fcc2be388ce6 capacity=21474836480 size=4626317312 hidden=1 parent=none
   vhd=VHD-0eaa0588-3668-4697-956d-bd6e98478585 capacity=21474836480 size=8388608 hidden=0 parent=VHD-80c4d84c-8aea-4391-9d31-fcc2be388ce6
   vhd=VHD-240a20e8-0d70-4b4a-88c1-e4b2a6e138e7 capacity=21474836480 size=21525168128 hidden=0 parent=VHD-80c4d84c-8aea-4391-9d31-fcc2be388ce6
vhd=VHD-a5d73baa-1eb6-4fe6-a2e9-32ddcf14555a capacity=2147483648 size=2101346304 hidden=1 parent=none
   vhd=VHD-87e8553e-340d-4269-8401-f2b4be874b62 capacity=2147483648 size=8388608 hidden=0 parent=VHD-a5d73baa-1eb6-4fe6-a2e9-32ddcf14555a
vhd=VHD-1b10a183-ef8d-4c53-a031-b9aeeb38e0be capacity=10737418240 size=10766778368 hidden=0 parent=none
vhd=VHD-2e45811f-c3a1-48bf-b6b0-0d0fb671da8e capacity=5368709120 size=5385486336 hidden=0 parent=none
vhd=VHD-6fe32f92-bc88-4f98-a4e1-47bf04894ce8 capacity=21474836480 size=21525168128 hidden=0 parent=none
vhd=VHD-8336df72-992d-4a52-a5e3-4a776f1f86f8 capacity=5368709120 size=5385486336 hidden=0 parent=none

Typically, the chain consists of a base copy, where the VHD has the attributes hidden=1 parent=none. Subsequent VHDs in the chain have the attributes hidden=0 parent=<VHD-UUID>. The following is a good example of a simple two-VHD chain, which is actually the domain router in CloudStack:

vhd=VHD-ff1c907a-d75d-444e-af85-735121fb9794 capacity=2097152000 size=2101346304 hidden=1 parent=none
   vhd=VHD-14175cb0-35f5-4b45-b686-7d34348283b4 capacity=2097152000 size=2109734912 hidden=0 parent=VHD-ff1c907a-d75d-444e-af85-735121fb9794
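When there is a lot of scan output, the indentation hints at the hierarchy, but the parent= field is what's authoritative. Here is a small, unofficial Python sketch that groups children under their base copy; it assumes nothing beyond the key=value layout shown above:

```python
def parse_vhd_scan(output):
    """Parse `vhd-util scan -p` style output into a list of dicts."""
    records = []
    for line in output.splitlines():
        line = line.strip()
        if not line.startswith("vhd="):
            continue
        # Each line is whitespace-separated key=value fields.
        records.append(dict(field.split("=", 1) for field in line.split()))
    return records

def chains(records):
    """Map each parent VHD name to the list of its children's names."""
    children = {}
    for rec in records:
        if rec["parent"] != "none":
            children.setdefault(rec["parent"], []).append(rec["vhd"])
    return children
```

Running this over the scan output above would show each hidden base copy with its children gathered underneath it, which is handy when the indentation gets hard to follow.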

It starts getting tricky when snapshots get involved. There is a coalescing process which runs and changes the chains. There is a good article on how snapshots work: http://support.citrix.com/servlet/KbServlet/download/21626-102-671572/XenServer_-_Understanding_Snapshots_(v1.1).pdf

Relating this to how CloudStack typically works, we should only see a maximum chain of two children. Each base copy relates to a unique template. Chains are then created from the base copy. Multiple templates will mean multiple base copies. A base copy which stems from a CloudStack template can exist on many SRs across many pools. If you were taking multiple snapshots using XenCenter, you would have multiple VHD chains reflecting the snapshot structure.

Hope that helps with understanding VHD chains. It is handy to see how the guts of the disk structure hang together, as sometimes you need to troubleshoot broken VDI health, dare I say the dreaded error 22.

Sunday, February 5, 2012

Using the CloudStack API

I’ve recently been working on a project to enhance a testing strategy that involved revisiting the CloudStack API. I dove headfirst into the deep end, which made me realise just how much you can do with it. I’d like to share some of my experiences, in the hope of getting some feedback from people who have used CloudStack or CloudPlatform, or are thinking about it and want to find out more. After plunging into the depths of the API, this is what I discovered.

First, it’s good! Everything you can do in our compute platform can be done via the API, and it’s actually quite simple once you get the hang of things. I wrote my test scripts in Python, but there are a few different languages you can use and I know some guys have dived in using PHP and .NET. There is also a Java library available from jclouds, so you can make your choice depending on your comfort zone.

Second, most of the API commands are asynchronous, depending on what you’re trying to do. This means you can call an API command and move on to the next one or, like me, wait for the async command response. This way you can do one task, wait for the response and grab certain criteria for the next command. You can then build up a fairly comprehensive set of commands. For example, you can deploy a virtual machine (VM) using the deployVirtualMachine command together with the following parameters:
  • serviceofferingid: relates to the instance size you wish to use. You can get a list of available service offerings using the listServiceOfferings command.
  • templateid: refers to the ID of either one of the templates we offer to customers or one you have preconfigured. These values can be obtained using the listTemplates command.
  • zoneid: signifies the zone in which you deploy the VM. You can find it by running the listZones command.
The good news is that once you have these values, you can use them over and over again. Or you can list all the template IDs and service offerings and pass a value from a collection if you want different size instances or want to use different templates.
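To give a flavour of what a raw call involves, here is a Python sketch of the request-signing step as I understand it from the API documentation (sort the lowercased key=value pairs, HMAC-SHA1 them with your secret key, then base64-encode the digest). The keys below are obviously placeholders, and this is a sketch rather than a definitive client:

```python
import base64
import hashlib
import hmac
import urllib.parse

def sign_request(params, api_key, secret_key):
    """Build a signed CloudStack API query string."""
    params = dict(params, apikey=api_key, response="json")
    # The signature covers the sorted, lowercased, URL-encoded key=value pairs.
    to_sign = "&".join(
        f"{k.lower()}={urllib.parse.quote(str(v), safe='').lower()}"
        for k, v in sorted(params.items())
    )
    digest = hmac.new(secret_key.encode(), to_sign.encode(), hashlib.sha1).digest()
    params["signature"] = base64.b64encode(digest).decode()
    return urllib.parse.urlencode(params)

# e.g. sign_request({"command": "deployVirtualMachine",
#                    "serviceofferingid": "...", "templateid": "...",
#                    "zoneid": "..."}, api_key, secret_key)
```

Append the resulting query string to your management server's API endpoint URL and you have a complete request.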

Once you have built up your deployVirtualMachine API command string, you’re ready to move on to the next command – and this is just the beginning. As deployVirtualMachine is an async command, what you really need before you can move on is the ID of the VM you have deployed.
Then you can do other things like attach disks if required, assign an IP or enable a static NAT, as these commands need a VM ID to work. This is where the queryAsyncJobResult command comes into play. Once you run your command to deploy a VM, it responds with a jobid and jobstatus. The jobid is the asyncjobid number, which you can query using the queryAsyncJobResult command. Once the deployVirtualMachine job finishes, the jobstatus changes to 1, which means you can find out all sorts of information about the VM that has been deployed. One key piece of information is the VM ID, which is required for most commands to work. Once you have this, the world is your oyster.
What I did was create a class in Python, which I was able to reuse to find out the status of the async job. Once I had this working, I was able to call the method in the class every time I wanted to find out the status of the async job.
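That polling pattern looks roughly like this in Python. The jobstatus values (0 pending, 1 success, 2 failure) are from the API docs; the query argument stands in for whichever HTTP helper you use to call queryAsyncJobResult, so treat this as a sketch:

```python
import time

def wait_for_job(query, jobid, interval=2.0, timeout=600):
    """Poll queryAsyncJobResult (via `query`) until the job finishes."""
    deadline = time.time() + timeout
    while time.time() < deadline:
        result = query(jobid)
        status = result.get("jobstatus", 0)
        if status == 1:
            # Success: jobresult carries the details, e.g. the new VM's ID.
            return result.get("jobresult", {})
        if status == 2:
            raise RuntimeError(f"job {jobid} failed: {result.get('jobresult')}")
        time.sleep(interval)
    raise TimeoutError(f"job {jobid} still pending after {timeout}s")
```

Wrapping this in a small class, as I did, means one call per async job and the rest of the script stays readable.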
So, let’s quickly look at attaching a disk to a virtual machine. First, you need to create a disk using createVolume and, again, apply the queryAsyncJobResult to find out the ID of the volume you have created. You will also use a diskofferingid to create a disk of a certain size (I created a 50GB disk). Once you have this information and the queryAsyncJobResult comes back, you are ready to attach the disk to the VM. Here are the parameters attachVolume requires:
  • virtualmachineid: the ID of the VM, which you get from the queryAsyncJobResult of the deployVirtualMachine job.
  • id: the ID of the disk volume, which you get from the queryAsyncJobResult of the createVolume job.
Once you have built up your createVolume command and have the corresponding async job results, you’re ready to use the attachVolume command. Hey presto! You’ve now deployed a VM and attached a disk. And so it continues…
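Putting those pieces together, the create-then-attach flow can be sketched like this. Here api is a hypothetical helper (not part of CloudStack itself) that is assumed to send a signed command and block until its async job result comes back:

```python
def add_disk(api, vm_id, diskofferingid, zoneid, name="data01"):
    """Create a volume and attach it to an existing VM."""
    # createVolume is async; its finished job result carries the new volume's ID.
    volume = api("createVolume", name=name,
                 diskofferingid=diskofferingid, zoneid=zoneid)
    # attachVolume needs both the volume ID and the VM ID.
    return api("attachVolume", id=volume["id"], virtualmachineid=vm_id)
```

The same shape (call, wait, feed the result into the next call) repeats for static NAT, IP assignment and the rest.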
This isn’t a tutorial or documentation explaining how to use our API but merely a blog on how I did it. I’ve just finished testing all our API commands and yes, there are many ways to skin a cat, but the point is that it can be done and you can achieve a lot with just the click of a button. It’s not all plain sailing – it never is – but once you get involved, you can easily work these things out and it does become quite simple.

I’m keen to hear from people who have used our API and what programming language they used, or from people who are thinking about using the API and want to explore its capabilities. I’m happy to share some of the framework of the Python scripts I’ve put together, so get in touch if you’d like a hand getting started with our compute API.

Thursday, November 24, 2011

Cloud jargon

Having been deeply involved in the cloud revolution, I thought it would be good to actually take a step back and unpick some of the cloud computing jargon I've heard over the last twelve months. What does it all actually mean?

I like to keep things simple, so let me try to simplify the tangle of definitions currently floating out there.


IaaS, PaaS, SaaS


Well, here's a confusing one to begin with! Why IaaS over PaaS? Why SaaS over IaaS? Do we all understand the difference? Maybe, maybe not. Here's my interpretation, for what it's worth.

IaaS is 'Infrastructure as a Service'. It means service providers deliver an infrastructure to you, the customer, as an ongoing service and not as a one-off hardware purchase. OPEX versus CAPEX for you accountants out there. That infrastructure includes data storage, CPU, memory and networking. What a self-service cloud means is that this is all available through a user interface where the customer has instant access to provision and configure the infrastructure.

PaaS is 'Platform as a Service'. This is similar but not quite the same as IaaS as it is still about providing a virtual cloud computing infrastructure. PaaS differs by having a software layer on top of the IaaS as an intermediary between the customer and the infrastructure itself. Therefore, although the software may make some tasks easier, it may lack the flexibility that comes with direct access.

SaaS is 'Software as a Service'. This is a further step above IaaS and PaaS, where you use software that happens to be powered by the cloud and is accessible online. Salesforce is a great example of SaaS.



Public, Private and Hybrid clouds


Public cloud computing is where you run a virtual infrastructure entirely through an external cloud provider. This would be a typical scenario for startups or people who have fully migrated their data centres into the cloud.

Private cloud is where a cloud computing environment is run internally on your own hardware infrastructure. This can often be as simple as running KVM, VMware or Citrix XenServer to configure your infrastructure to offer cloud flexibility to your internal IT department.

A hybrid cloud leverages both private and public cloud.

As customer accounts are kept entirely separate and all virtual resources are dedicated (i.e. not impacted by the activities of other users), there is often very little benefit to a private cloud but plenty of additional hardware expense. We might call it public cloud computing, but it's still virtually (pun intended) as private as your traditional server infrastructure.
 

Online Storage versus Compute Storage


Online and compute storage are two very different service offerings. Compute (your virtual servers) obviously needs storage to run the operating system and is bundled as part of the offering. However what happens when you want to just use offsite storage - to backup or archive your files, for example - without the expense of running virtual servers to access the necessary storage? Would you buy a new computer just because you needed more hard drive space?

True public cloud storage can be used completely independently of compute and costs a lot less as a result. By connecting either your public, private or hybrid cloud to online cloud storage, you can simply extend your local storage setup to a safe and secure offsite location. With an API, you can also do lots of cool things with it, such as storing all your website assets (images, video, etc) in cloud storage instead of driving more expensive compute server requests every time someone visits one of your web pages.

Horizontal and vertical cloud compute

This really can be simplified as the difference between the number of cloud servers you're running compared to how powerful those servers are. If you have ten web servers and you need to accommodate a period of heavy traffic, you may want to add another ten servers to your server farm to deal with the increased demand. This is horizontal scaling. However, if you decide to scale up the size - and therefore the memory and CPU - of those servers from 2GB to 4GB, then this is vertical scaling. Most providers enable you to do both. Of course, you can also scale down the resources you use once demand dies down. You could shut down the ten VMs and keep them on standby, or decrease the vCPUs and memory on each server. This can be done via our API or our UI.


Cloud bursting


Cloud bursting is a term often thrown around that I particularly like. This is really the gem in the crown for cloud computing in my opinion.

Remember you only pay for what you use. Imagine this: it's nearly Christmas and you sell customised Christmas cards. Your busy period is unlikely to be January, but December is likely to be a mad rush. You need compute resources to deal with increased customer demand in the weeks up to Christmas, but for the other ten months of the year you won't need anywhere near the same level of resources. A more efficient way is to burst into the cloud. Cloud bursting enables you to spin up cloud compute resources, manually or automatically, to cover your busy period and then, once this period is over, simply power down or completely remove the compute instances.

You can spin up on demand and only be charged for the period you ran the servers. It can be as little as an hour or as much as a year.
 

Cloud Compute versus VPS


VPS stands for virtual private server. A VPS runs on compute hardware with allocated resources, shared or explicit. Sometimes a VPS server can run on one dedicated physical server. It can be configured with high availability but doesn't always allow you to easily manage your compute resources. So VPS is similar to cloud computing, but not as scalable or flexible.

Cloud compute pools a large number of resources (compute, network, storage, etc.) and presents them to the end user so they can leverage the entire service to scale and provision quickly using either a user interface or an API. Cloud compute runs on large and powerful clusters, configured with redundancy and high availability as standard, enabling virtualisation of compute assets on demand over the internet.
 

Cloud storming


This is similar to cloud bursting, but is really for the dedicated cloud user or cloud junkie. Cloud storming is when you leverage a number of different cloud service offerings for your own compute environment.

Why would one do this? For benefits such as redundancy, reduced latency in different geographical locations, or operating in a relevant time zone.

These are just some of the most common terms and I'm sure some of you may disagree on my definitions. Let me know what bits of cloud jargon I've missed or offer your alternative definitions in the comments below.

Sunday, October 9, 2011

Moving to the cloud


Recently I've attended a few cloud forums and have listened to some very interesting discussions. Why interesting? Well, discussions seem to have moved on from the cloud computing technology to how it affects the consumer. However, as similar questions are popping up again and again, I thought I would provide some insight to some of the most common consumer questions from these forums.

Security

The question about security never seems to go away. The feeling I get, as ever, is that common sense prevails. It’s not really the responsibility of any one particular person to cover the security aspect of cloud computing; it is down to both the consumer and the provider to adhere to best practice, and it should be ongoing. I work for a cloud provider and we have in-house skills to keep our cloud secure. We also have external third-party professionals testing our security on a regular basis to see if we are as good as we think we are! However, the consumer should still take some responsibility for implementing security best practice, including a good firewall config, good anti-virus tools, regular OS patching and, importantly, hardening the security of the OS and the applications you run in the cloud. It is also advisable for the consumer to monitor their systems.

The IT Department


The IT department obviously takes ownership of infrastructure issues. But when access to the cloud becomes so simple, why bother asking the IT department for a server when you can get your own compute resource from the cloud within minutes? Assuming a certain skill level, it is often quicker and less hassle to go to the cloud than to wait for the IT department to configure a server on your behalf. However, it is still important to recognise that your IT department has a vast amount of experience which should not be bypassed, especially where security is concerned. Spinning up your own server will most likely need to comply with your internal IT policies so as to avoid conflicts or bugs. Also, ongoing management of cloud compute resources can be handled by your IT department and often falls in line with the processes they have in place. Therefore, the feedback I keep hearing is that IT departments need to change the way they operate to deal with what cloud computing offers, adapting their processes to allow the speed and flexibility other departments require.

How and Why?


The other interesting question that still crops up is how and why you move to the cloud. 'Why' is really down to your specific use case. One of the big wins is the fact that you can manage your cost. OPEX compared to CAPEX is a great win for finance people: instead of the up-front cost of a new server and the associated resources, you can have the cost split across the relevant months, which you can easily forecast.

Questions will keep coming, I'm sure. And this is just a very high-level view of three key topics. If you still have more questions or need more detail about whether the cloud is right for your business, drop me a line - @oliverleach