Tuesday, June 5, 2012

Puppet integration with NetApp




I was looking at the Puppet Forge the other day and noticed that NetApp integrates with Puppet, which made me smile. Take a look here:

https://forge.puppetlabs.com/fatmcgav/netapp/0.1.0

I have copied some of the README notes. Here is an extract.

NetApp operations

As part of this module, there is a defined type called ‘netapp::vqe’, which can be used to create a volume, add a qtree and create an NFS export. An example of this is:

netapp::vqe { 'volume_name':
  ensure         => present,
  size           => '1t',
  aggr           => 'aggr2',
  spaceres       => 'volume',
  snapresv       => 20,
  autoincrement  => true,
  persistent     => true
}


This will create a NetApp volume called ‘v_volume_name’ with a qtree called ‘q_volume_name’. The volume will have an initial size of 1 terabyte in aggregate aggr2. The space reservation mode will be set to volume, and the snapshot space reserve will be set to 20%. The volume will be able to auto-increment, and the NFS export will be persistent. To be honest, that is awesome if you need to build up and automate, say, infrastructure deployments.


I have used many auto-deployment tools in my time, but I have never seen adoption as great as I have with Puppet, and this just proves it. Why is this? Well, I think the main reasons are that Puppet is open source and it doesn't require you to lock yourself in with a particular product, i.e. it runs on CentOS, Windows, Ubuntu etc. It is also very flexible, and you can code it in a way that doesn't require integration with the target node. So in the example, the NetApp is completely unaware of Puppet and needs no agent installed, because Puppet talks to it through the NetApp Manageability SDK Ruby libraries. How cool is that? A great example of how good open source can really be.
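
For anyone wanting to try it, my understanding is that the module follows Puppet's standard network-device pattern: you describe the filer in device.conf and run puppet device, which applies the catalog on the filer's behalf. The hostname and credentials below are made up for illustration, and the path is the usual default:

/etc/puppet/device.conf:

[netapp01.example.com]
type netapp
url https://root:secret@netapp01.example.com

puppet device --verbose

The netapp::vqe example above would then be declared against the netapp01.example.com node on the master; the filer itself never runs any Puppet code.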

Hoorah to Puppet and NetApp!


Tuesday, April 17, 2012

NetApp storage solutions


I have worked with many different storage solutions, really since the beginning of my journey with VMware, which was at the end of 2005 with VMware ESX 2.5 - no Virtual Center, imagine that! The first SAN I worked with still stays close to my heart as one of the best products out on the market, DataCore SAN Symphony. It was a great virtualisation SAN where you plugged vendor-neutral storage arrays into a physical server running Windows Server and the DataCore SAN Symphony management software, which in turn managed those arrays. It offered both iSCSI and fabric, and at the time it was a really great product to have got involved with. I have used many other products, including EMC's CX-120, the DS series from IBM and, mainly over the last few years, the Dell EqualLogic PS series. I've also built my own SAN using Solaris ZFS, and using just plain old CentOS with iSCSI targets and tgtadm, running off big disks on a bunch of Supermicro servers. I also use the Nexenta community edition in a lab, which, out of the box, is pretty darn good. However, my other favourite SAN technology that I have used is NetApp. It just oozes quality, and what I like is that it seems to have taken all the good features from all of the other SANs I have used and put them all together.

I was given a chance to use NetApp when I was working for a large service provider down under. However, it didn't make the cut due to the price point, and also because it was too difficult to migrate from the existing SAN that was running our cloud offering at the time. So we ended up persevering with our existing solution and carried on ploughing our investment into what was a flaky product at the time (no names mentioned).

One of the ways you can get your hands dirty with NetApp is to run the evaluation product. A great colleague of mine, @wadeis, taught me all the tricks there are to know with the NetApps. Here are a couple of key offerings the NetApp solution has:

The NetApp Data ONTAP 8 Operating System

This is the operating system for your storage devices. You can run it in two modes: Cluster-Mode and 7-Mode. It runs on FreeBSD, like the products of many other providers such as Dell and IBM (Juniper also runs FreeBSD on its network devices), so it's a popular choice. It also has a great command line interface, which seems to be very popular. To manage ONTAP, you run System Manager, which works on both Windows and Linux and has a nice UI. You connect to your ONTAP device and then have all the management tools available to you, depending on the licences you have. Here is a screenshot of System Manager running on Windows managing an ONTAP device.

It's pretty nicely laid out. There are a few key concepts with ONTAP which I want to run through:

Aggregates:

A NetApp aggregate is a way of pooling disks from your disk arrays together to provide raw usable disk space; you then create logical volumes on top of this raw space. An aggregate can hold disks from different arrays, but it is highly recommended to use the same type of disk, i.e. don't mismatch SAS with SATA, or 300GB with 144GB disks, for example. Keep them the same; the key is that you can have many spindles making up an aggregate. There are some limitations you need to consider, for example the maximum aggregate size.

Here are some concepts you'll need to understand when managing your aggregates (there is a quick CLI sketch after this list):

    Benefits of keeping your RAID groups homogeneous for disk size and speed
    What types of disks can be used together (FCAL and SAS, SATA and ATA)
    How to add disks to aggregates from a heterogeneous disk pool
    The requirement to add disks owned by the same system and pool
    Best practices for providing hot spares
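
As a rough illustration, creating and checking an aggregate from the 7-Mode CLI looks something like this (the aggregate name, disk count and disk size are hypothetical):

ontap> aggr create aggr2 -t raid_dp -r 16 24@300
ontap> aggr status -r aggr2
ontap> aggr show_space aggr2

The first command builds aggr2 from 24 x 300GB disks in RAID-DP groups of 16; the other two show the RAID layout and how the space is being consumed.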


Qtrees:

Qtree stands for quota-based tree. It is a concept that is confusing to start with, but you need not overthink how qtrees work: you create qtrees inside volumes and slice up quotas for the resources created under the volume. It's a way of partitioning a volume into quotas for various purposes.

For example, you may create a volume, vol01, and share it out using CIFS. Then you create two qtrees, one called qtree_one and the other qtree_two. You can set quotas on both qtrees but still share out the volume. If you lock a user down to only being able to use a certain amount of space in qtree_one, they can't add more than their set quota, and qtree_two can have a different quota to qtree_one. You would need to have the ONTAP device connected to your domain so it can pick up the domain groups and users to apply the quotas to.
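
A rough 7-Mode sketch of that setup, with hypothetical names and sizes (tree quotas live in /etc/quotas, which you can append to with wrfile -a):

ontap> vol create vol01 aggr2 100g
ontap> qtree create /vol/vol01/qtree_one
ontap> qtree create /vol/vol01/qtree_two
ontap> cifs shares -add vol01_share /vol/vol01
ontap> wrfile -a /etc/quotas /vol/vol01/qtree_one tree 10G
ontap> wrfile -a /etc/quotas /vol/vol01/qtree_two tree 25G
ontap> quota on vol01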

Another way of using quotas on a volume is to control how large your LUNs can be. For example, you might want to assign volume space to various database administrators and allow them to create and manage their own LUNs. You can organise the volume into qtrees with quotas and let the individual database administrators manage the space they have been allocated.

If you organize your LUNs in qtrees with quotas, make sure the quota limit can accommodate the sizes of the LUNs you want to create. Data ONTAP does not allow you to create a LUN in a qtree with a quota if the LUN size exceeds the quota.
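
Continuing the sketch above, carving a LUN inside a quota-limited qtree would look something like this (the ostype and paths are hypothetical; quota resize just re-reads /etc/quotas):

ontap> qtree create /vol/vol01/q_luns
ontap> wrfile -a /etc/quotas /vol/vol01/q_luns tree 10G
ontap> quota resize vol01
ontap> lun create -s 10g -t windows /vol/vol01/q_luns/lun0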


I created a volume with 100GB of space and then created a qtree under the volume with a quota limit of 10GB. When I then tried to create a 10GB LUN in that qtree, I got an error: it rounded the LUN size down to 9.96GB, but it shows how you can manage space allocation for LUNs using qtrees.

WAFL


WAFL is the file layout used by the NetApp filers. It stands for Write Anywhere File Layout. Apparently it is not classified as a file system, but it does act like one. WAFL supports all different kinds of storage access. For example, it can handle CIFS for Windows shares, NFS for UNIX network file system shares, and block-based storage for iSCSI and FC, so it needs to handle a few different types of files. WAFL is best thought of as a tree of blocks, and at the root of the tree is the root inode. The root inode describes the inode file, and the inode file describes the rest of the files in the file system, including the block-map and inode-map files. When WAFL loads, it needs to locate the root inode, so this needs to be in a fixed location, which is the only exception to the write-anywhere rule.

Here is a diagram that shows how a root inode tree can be made up.

Now, one of the greatest benefits of using this type of technology is how snapshots work. NetApp can snapshot LUNs instantaneously, and it does this by copying the root inode. If a snapshot or clone changes a block, it remaps the new block in the inode tree, as if it were a new branch.
So this gives NetApp some really great flexibility when it comes to snapshot and cloning techniques.

So it's now time to discuss some of the other tools NetApp offers to help with DR and backups.

NetApp SnapShot

NetApp SnapShot software enables you to protect your data with no performance impact and minimal consumption of storage space.

NetApp Snapshot technology enables you to create point-in-time copies of file systems, which you can use to protect data—from a single file to a complete disaster recovery solution.


SnapShot key points:

  •     Perform instant backups by copying the root inode tree
  •     You can have up to 255 Snapshot copies per volume
  •     You can combine it with other technologies such as SnapMirror to build a data protection solution
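
For a feel of how simple this is day to day, here is a rough 7-Mode sketch (the volume and snapshot names are hypothetical, and SnapRestore needs its own licence):

ontap> snap create vol01 pre_upgrade
ontap> snap list vol01
ontap> snap restore -t vol -s pre_upgrade vol01

The first command takes the point-in-time copy instantly, the second lists the Snapshot copies held on the volume, and the last rolls the whole volume back to the chosen copy.
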
SyncMirror

NetApp SyncMirror ensures your data is available and up to date at all times. By maintaining two copies of data online, SyncMirror protects your data against all types of hardware outage. This is similar to SAN Symphony's mirrored volumes.


It's best to have two arrays: ONTAP writes the blocks of data destined for the volume to both, so you can lose a whole array and the ONTAP cluster will seamlessly fail over to the other array without any intervention. This is the same methodology as SAN Symphony. The only catch is that you are duplicating your data set, as you are effectively mirroring the volumes. You can, by all means, mirror on the same ONTAP device.

SyncMirror also allows you to split the data copies so that the mirrored data can be used by another application. This allows for backup, application testing or data mining against up-to-date production data, but on the passive mirrored volumes, so you can perform these background tasks without affecting your production environment.

SnapMirror

You can schedule snapshots in NetApp at regular intervals, which again is handy for a DR strategy. SnapMirror then replicates those snapshots to another NetApp system, either synchronously or asynchronously, so you always have an up-to-date copy of the data off the primary system.
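
A minimal 7-Mode sketch of setting that up (filer and volume names are hypothetical; the destination volume has to exist and be restricted before the baseline, and the schedule line lives in /etc/snapmirror.conf on the destination):

filer2> vol restrict vol01_dr
filer2> snapmirror initialize -S filer1:vol01 filer2:vol01_dr
filer2> snapmirror status

And a snapmirror.conf entry to replicate at 15 minutes past every hour:

filer1:vol01 filer2:vol01_dr - 15 * * *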

FlexClone

Again, using WAFL's inode tree structure, FlexClone is able to create instant clones. You can use these clones for many different purposes, for example setting up a dev environment, mimicking production data for a test environment, or as part of a DR strategy. You cannot commit the changes of the clone back into the production LUN though, so once it is cloned, you can only branch the dataset.
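
As a rough 7-Mode sketch (FlexClone needs its own licence; the volume and snapshot names are hypothetical):

ontap> vol clone create vol01_dev -s none -b vol01 nightly.0
ontap> vol clone split start vol01_dev

The first command creates vol01_dev instantly, backed by the nightly.0 Snapshot copy of vol01; the optional split then makes the clone fully independent of its parent, at the cost of consuming its own space.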


Here is some more NetApp functionality, which also explains the benefits of using their technologies.

NetApp Data Compression

- Transparent inline data compression for data reduction
- Reduces the amount of storage you need to purchase and maintain

NetApp Deduplication


- General-purpose deduplication for removal of redundant data objects
- Reduces the amount of storage you need to purchase and maintain

FlexCache

- Caches NFS volumes for accelerated file access in remote offices and for server compute farms
- Improves your system’s performance, response times, and data availability for particular workloads

FlexClone

- Instantaneously creates file, LUN, and volume clones without requiring additional storage
- Enables you to save time in testing and development and increases your storage capacity

FlexShare

- Prioritizes storage resource allocation to highest value workloads on a heavily loaded system
- Provides you with better performance for designated high-priority applications

FlexVol

- Creates flexibly sized LUNs and volumes across a large pool of disks and one or more RAID groups
- Enables your storage systems to be used at maximum efficiency and reduces your hardware investment

MetroCluster

- An integrated high-availability/disaster recovery solution for campus and metro-area deployments
- Enables you to have immediate data availability if a site fails

MultiStore

- Securely partitions a storage system into multiple virtual storage appliances
- Allows you to consolidate multiple domains and file servers

Operations Manager

- Manages multiple NetApp systems from a single administrative console
- Simplifies your NetApp deployment and allows you to consolidate management of multiple NetApp systems

Protection Manager

- Backup and replication management software for NetApp disk-to-disk environments
- Lets you automate data protection, enabling you to have mistake-free backup

System Manager

- Provides setup, provisioning, and configuration management of a Data ONTAP storage system
- Simplifies out-of-box setup and device management using an intuitive Windows-based interface

SnapDrive

- Provides host-based data management of NetApp storage from Windows, UNIX, and Linux servers
- Allows you to initiate error-free system restores should servers fail

SnapManager

- Provides host-based data management of NetApp storage for databases and business applications
- Lets you automate error-free data restores and provides you with application-aware disaster recovery

SnapMirror

- Enables automatic, incremental data replication between systems: synchronous or asynchronous
- Provides you with flexibility and efficiency when mirroring for data distribution and disaster recovery

SnapMover

- Enables rapid reassignment of disks between controllers within a system, without disruption
- Lets you load balance an active-active controller system with no disruption to data flow

SnapRestore

- Rapidly restores single files, directories, or entire LUNs and volumes from any Snapshot copy backup

- Instantaneously recovers your files, databases, and complete volumes from your backup

Snapshot

- Makes incremental, data-in-place, point-in-time copies of a LUN or volume with minimal performance impact
- Enables you to create frequent, space-efficient backups with no disruption to data traffic

SnapValidator

- Maximizes data integrity for Oracle databases
- Allows you to enhance the resiliency of Oracle databases so they comply with the Oracle HARD initiative

SnapVault

- Exports Snapshot copies to another NetApp system, providing an incremental block-level backup solution

- Provides you with cost-effective, long-term backups of disk-based data

SyncMirror

- Maintains two online copies of data with RAID-DP protection on each side of the mirror
- Protects your system from all types of hardware outages, including triple disk failure



The summary for me is that NetApp are way up there for offering performance with functionality, but it does come at a cost. The product reminds me of SAN Symphony, which also carries a fairly hefty cost based on data size.

I had real fun working with the NetApps and would stick my hand up anytime if offered another opportunity. 

Have fun!

Monday, March 12, 2012

How to enable dns on a NetApp running ONTAP

I had to enable DNS on a NetApp ONTAP device the other day, and as I am in the thick of learning NetApp, I thought I would write this one down. It is pretty easy, but you need to do it through the CLI.

At first, DNS is not enabled. You can see this in the System Manager UI:

You have to drop to the command line and edit a few files to enable DNS. The cut-down FreeBSD CLI is, yep, cut down. However, once you know your way around, it is pretty easy. When you need to write and read a file, there are two key commands:

wrfile
rdfile

wrfile writes to a file. I don't like the way it works: you need to press CTRL-C to quit out of the file, and you get this message when you do so:

read: error reading standard input: Interrupted system call

However you can also use wrfile -a which appends to the file. It's not vi, that's for sure.

However, back to the point. Below shows how one can set up DNS and a sneaky gotcha you need to be aware of.

If you just try to enable DNS with the command options dns.enable on, you might get this message:

Setting option dns.enable to 'off' conflicts with /etc/rc that sets it to 'on'

There is an rc file that loads services on boot, and this is where DNS is set, which by default is off, as you can see:

ontap> rdfile /etc/rc
hostname ontap
ifconfig e0a `hostname`-e0a mediatype auto flowcontrol full netmask 255.255.255.0 mtusize 1500
route add default 10.10.10.1 1
routed on
options dns.enable off

options nis.enable off
savecore


You can see it states dns.enable off. This means that, whilst you can start DNS by running options dns.enable on, with the rc file set this way the change is not persistent. So first you need to update the /etc/rc file and set DNS to be enabled.

Hint: you can rdfile /etc/rc, then copy and paste the contents back in (with your change) when you run wrfile /etc/rc. You'll get the drift when you have a go yourself. So here goes:

ontap> wrfile /etc/rc
hostname ontap
ifconfig e0a `hostname`-e0a mediatype auto flowcontrol full netmask 255.255.255.0 mtusize 1500
route add default 10.10.10.1 1
routed on
options dns.enable on
options nis.enable off
read: error reading standard input: Interrupted system call


Here you can see the read: error reading standard input: Interrupted system call message again. This is because you have to CTRL-C out of the wrfile command to save your changes. If anyone knows a way around this, please send a comment; a man wrfile doesn't suggest one.

So now you have the /etc/rc file set up with DNS enabled, you need to change /etc/resolv.conf. Here you can use the wrfile -a command. Just append your DNS nameserver like so:

ontap> wrfile -a /etc/resolv.conf nameserver 10.10.10.10

Lastly, you need to run the following command to turn on DNS:

ontap> options dns.enable on
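
If you would rather not go back to the UI, dns info from the same CLI session should confirm that resolution is enabled and list the nameserver you added (output omitted here):

ontap> dns info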

And there you have it. To prove DNS is now running, the UI will reflect your changes.

Now I can continue setting up CIFS and adding the ONTAP device to AD. Pretty straightforward once you know how.

Friday, February 17, 2012

How to use the vhd-util tool in xenserver




I have always found that the VDI chain in XenServer is one area where there is always some misunderstanding about how it all hangs together, so I thought I would blog about how the VDI chains work and how the VHD files make up the VDI chain. When running block-based storage in XenServer, it isn't easy to see the underlying files; to do so you have to use the vhd-util tool. I have put together a guide which helps you see which VHD files make up the VDI chain.
First of all, run vgs to display the volume groups, look for the iSCSI volume group and note down the name:

VG_XenStorage-7ec18595-0bf0-859a-1e85-7e19721dad84   1  51   0 wz--n- 799.99G 495.84G

Here are some vhd-util scan options you can use.

usage: [OPTIONS] FILES

options: [-m match filter] [-f fast] [-c continue on failure] [-l LVM volume] [-p pretty print] [-a scan parents] [-v verbose] [-h help]

A typical command would be:

vhd-util scan -m "VHD-*" -f -c -l VG_XenStorage-7ec18595-0bf0-859a-1e85-7e19721dad84 -p -v

And the output would be this:

[root@xenserver ~]# vhd-util scan -m "VHD-*" -f -c -l VG_XenStorage-7ec18595-0bf0-859a-1e85-7e19721dad84 -p -v

vhd=VHD-79d0aca7-eda7-4d73-9238-c3b5c9378aa7 capacity=26843545600 size=4630511616 hidden=1 parent=none
   vhd=VHD-691d986e-e84a-4ab4-9672-b6f9b3148cab capacity=26843545600 size=26902265856 hidden=0 parent=VHD-79d0aca7-eda7-4d73-9238-c3b5c9378aa7
   vhd=VHD-838154db-db5c-42b7-8ec3-2ebb31f73683 capacity=26843545600 size=8388608 hidden=0 parent=VHD-79d0aca7-eda7-4d73-9238-c3b5c9378aa7
vhd=VHD-80c4d84c-8aea-4391-9d31-fcc2be388ce6 capacity=21474836480 size=4626317312 hidden=1 parent=none
   vhd=VHD-0eaa0588-3668-4697-956d-bd6e98478585 capacity=21474836480 size=8388608 hidden=0 parent=VHD-80c4d84c-8aea-4391-9d31-fcc2be388ce6
   vhd=VHD-240a20e8-0d70-4b4a-88c1-e4b2a6e138e7 capacity=21474836480 size=21525168128 hidden=0 parent=VHD-80c4d84c-8aea-4391-9d31-fcc2be388ce6
vhd=VHD-a5d73baa-1eb6-4fe6-a2e9-32ddcf14555a capacity=2147483648 size=2101346304 hidden=1 parent=none
   vhd=VHD-87e8553e-340d-4269-8401-f2b4be874b62 capacity=2147483648 size=8388608 hidden=0 parent=VHD-a5d73baa-1eb6-4fe6-a2e9-32ddcf14555a
vhd=VHD-1b10a183-ef8d-4c53-a031-b9aeeb38e0be capacity=10737418240 size=10766778368 hidden=0 parent=none
vhd=VHD-2e45811f-c3a1-48bf-b6b0-0d0fb671da8e capacity=5368709120 size=5385486336 hidden=0 parent=none
vhd=VHD-6fe32f92-bc88-4f98-a4e1-47bf04894ce8 capacity=21474836480 size=21525168128 hidden=0 parent=none
vhd=VHD-8336df72-992d-4a52-a5e3-4a776f1f86f8 capacity=5368709120 size=5385486336 hidden=0 parent=none

Typically, the chain consists of a base copy, where the VHD has the attributes hidden=1 parent=none. Subsequent VHDs in the chain have the attributes hidden=0 parent=<VHD-UUID>. The following is a good example of a small chain (a base copy and one child), which is actually the domain router in CloudStack:

vhd=VHD-ff1c907a-d75d-444e-af85-735121fb9794 capacity=2097152000 size=2101346304 hidden=1 parent=none
   vhd=VHD-14175cb0-35f5-4b45-b686-7d34348283b4 capacity=2097152000 size=2109734912 hidden=0 parent=VHD-ff1c907a-d75d-444e-af85-735121fb9794

It starts getting tricky when snapshots get involved. There is a coalescing process which runs and changes the chains; there is a good article on how snapshots work here: http://support.citrix.com/servlet/KbServlet/download/21626-102-671572/XenServer_-_Understanding_Snapshots_(v1.1).pdf

Relating this to how CloudStack typically works, we should only see a maximum chain of two children. Each base copy relates to a unique template, and chains are then created from the base copy. Multiple templates will mean multiple base copies. A base copy which stems from a CloudStack template can exist on many SRs across many pools. If you were taking multiple snapshots using XenCenter, you would have multiple VHD chains which would reflect the snapshot structure.
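
If you want to tie a running VM back to its entry in the scan output, something like the following works (the VM name is hypothetical, and the LV names in a VHD-based SR are simply VHD-<vdi-uuid>):

[root@xenserver ~]# xe vbd-list vm-name-label="my-vm" type=Disk params=vdi-uuid --minimal
[root@xenserver ~]# vhd-util scan -m "VHD-*" -f -c -l VG_XenStorage-7ec18595-0bf0-859a-1e85-7e19721dad84 -p | grep <vdi-uuid>

The first command returns the VDI UUID behind the VM's disk, and grepping the scan output for that UUID shows you where it sits in the chain and which base copy it hangs off.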

Hope that helps with understanding VHD chains. It is handy to see how the guts of the disk structure hang together, as sometimes you need to troubleshoot broken VDI health, and dare I say it, the dreaded error 22.