Las Solanas Consulting


Storage Pooling, Thin Provisioning And Over Subscription

Author: Tim Warden

Storage Pooling: The Foundation of Virtualization

Storage Pooling is an abstraction of physical storage resources from the logical volumes or LUNs published by a storage system. Storage Pooling is the very foundation of Storage Virtualization. With Storage Pooling, the disks, whether physical or logical, can be aggregated into pools from which the logical volumes are allocated. The objective is to simplify storage allocation by eliminating the need to manage partitioned space on the physical storage resources.

Thin Provisioning Explained

Thin Provisioning refers to a method of improving storage utilization by only allocating to a volume the physical storage required to hold its data. Thin Provisioning is made possible by the layer of abstraction created by storage pooling in a Storage Virtualization system. The storage array represents a served logical volume or disk as a contiguous mapped space made up of addressable blocks. Generally, the logical blocks are remapped by the storage controller to a physical space which may have any geometry supported by the controller. For instance, the logical volume may be remapped to a particular set of blocks on several physical disks that comprise a "partition" in a RAID-5 group. The correlation between logical and physical is one-to-one, although the physical blocks may be scattered across the disks of the array.

Thin Provisioning takes this abstraction one step further. It presumes that until an application writes a particular block or group of blocks in a virtual volume, there is no need to allocate the physical space to accommodate those blocks. In Thin Provisioning, physical storage resources are placed into logical storage pools. Virtual Volumes are created from these pools as logical entities, comprised initially of only a volume header identifying the volume and providing a volume map table. The volume is, in essence, an empty shell. The volume map table is used to keep track of where allocated blocks physically exist. When an application server writes a block on the volume, the Thin Provisioning manager hashes the volume map table to determine where in the pool the block physically resides. If the block has not yet been allocated, a new allocation unit is taken from the logical volume's associated storage pool to accommodate the new block.
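The allocate-on-write behavior described above can be illustrated with a minimal Python sketch. It is not DataCore's implementation: the class names, the 1 MiB allocation unit and the use of a dictionary as the volume map table are all assumptions made for illustration.

```python
# Hypothetical sketch of allocate-on-write. All names and the 1 MiB
# allocation unit are assumptions, not taken from any real product.
ALLOCATION_UNIT = 1024 * 1024  # bytes per allocation unit (assumed)


class StoragePool:
    """A pool that hands out fixed-size allocation units."""

    def __init__(self, total_units):
        self.total_units = total_units
        self.allocated_units = 0

    def allocate_unit(self):
        if self.allocated_units >= self.total_units:
            raise RuntimeError("storage pool depleted")
        unit = self.allocated_units
        self.allocated_units += 1
        return unit


class ThinVolume:
    """A virtual volume: an advertised size plus a volume map table."""

    def __init__(self, pool, size_bytes):
        self.pool = pool
        self.size_bytes = size_bytes
        self.map_table = {}  # logical unit index -> physical unit in the pool

    def write(self, offset):
        """Record a write at a byte offset; return the backing physical unit."""
        if not 0 <= offset < self.size_bytes:
            raise ValueError("offset outside the advertised volume size")
        logical_unit = offset // ALLOCATION_UNIT
        if logical_unit not in self.map_table:
            # First write to this region: take a new unit from the pool.
            self.map_table[logical_unit] = self.pool.allocate_unit()
        return self.map_table[logical_unit]


pool = StoragePool(total_units=1000)
vol = ThinVolume(pool, size_bytes=300 * ALLOCATION_UNIT)
vol.write(0)                    # first write: one unit taken from the pool
vol.write(10)                   # same allocation unit, nothing new taken
vol.write(5 * ALLOCATION_UNIT)  # a different region: a second unit is taken
print(pool.allocated_units)     # 2
```

A freshly created volume holds an empty map table and consumes nothing from the pool, which is the "empty shell" described above.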

The concept of Thin Provisioning was first introduced to the Open Computing world of SANs when DataCore Software Corporation released the revolutionary "Network Managed Volume" feature in their SANsymphony product back in 2001, years before 3PAR and others introduced their own Thin Provisioning implementations. DataCore renamed the feature "Auto Provisioning" in their SANmelody Storage Server products, but it's the same mature code set found in the SANsymphony suite.

Thin Provisioning has many merits:

Over Subscription

Thin Provisioning makes possible another practice called Over Subscription (or Over Provisioning): publishing volumes whose cumulative advertised capacity exceeds the physical amount of storage available in the pool.

Thin Provisioning Versus Over Subscription

Some industry pundits (usually Vice Presidents of storage vendors whose offerings don't include Thin Provisioning) have little good to say about the feature. Their principal argument against Thin Provisioning runs along the lines of, "it's a lie; you're telling a user they have more storage than they really do. It can create headaches for the storage admins." Their arguments fail to differentiate between Thin Provisioning and Over Subscription, two distinct concepts. Administrators of Thin Provisioning are by no means required to employ Over Subscription, although there are many cases, discussed in a later section, where they will want to do so.

Even without Over Subscription, Thin Provisioning in and of itself makes storage management easier by simplifying storage allocation. New volumes can be created with a simple click of the mouse, without having to find a large enough contiguous block of free space to hold the new volume.

An Example Sans Auto-Provisioning: Wasted Resources

Let us consider a project of provisioning storage using a non-Thin Provisioning storage array. For the sake of simplicity we will use a small array with only 1TB of physical storage. We will not concern ourselves with the specific composition of the storage — the type of disks or the RAID groups employed. Suffice it to say that the management console of the array will permit us to slice up the storage resources into "partitions" and those partitions will become the logical volumes that we need to complete our project.

Our project requires us to create four volumes for various applications. We have already determined the size of the data requirements, but we wish to allocate additional capacity for each of the volumes in anticipation of future growth. Table 1 below indicates the current sizes of our data, the sizes of the volumes we will create, and the space that will consequently be reserved or allocated for those volumes.

Table 1

              Current Data Size   Volume Size   Allocated Space
Volume1       150GB               300GB         300GB
Volume2       170GB               300GB         300GB
Volume3       160GB               250GB         250GB
Volume4       60GB                100GB         100GB
Total Sizes   540GB               950GB         950GB

Our total data is only 540GB of space, but we have almost fully utilized the 1TB of storage in the array, leaving us only 50GB of free space. Thus we are effectively only using 57% of the allocated storage, while 43% is wasted, statically reserved for future growth of the four existing volumes, but unusable should other volumes need to be deployed.
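The percentages above can be checked with a few lines of Python. This is only a worked calculation over the Table 1 figures; the variable names are ours, and the 1TB array is taken as 1000GB, as the 50GB remainder in the text implies.

```python
POOL_GB = 1000  # the 1TB array, taken as 1000GB

# (current data, volume size) in GB, from Table 1
volumes = {
    "Volume1": (150, 300),
    "Volume2": (170, 300),
    "Volume3": (160, 250),
    "Volume4": (60, 100),
}

data_gb = sum(data for data, _ in volumes.values())
# Without Thin Provisioning, the allocated space equals the volume size.
allocated_gb = sum(size for _, size in volumes.values())
free_gb = POOL_GB - allocated_gb
used_pct = round(100 * data_gb / allocated_gb)
wasted_pct = 100 - used_pct

print(data_gb, allocated_gb, free_gb)  # 540 950 50
print(used_pct, wasted_pct)            # 57 43
```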

An Example with Thin Provisioning: Fully Utilized Storage Resources

Now let us consider the same example using Thin Provisioning. We use the Thin Provisioning manager's GUI to create the four volumes instantly, without creating any partitions and without any notion of where the volumes will be placed in the pool. We then set the volume sizes as per our plan. The Thin Provisioning manager makes creating the four new volumes as simple as a few mouse clicks.

At the time of creation, no space will have been taken from the pool. It is only when we begin writing data onto the volumes that space will be allocated to them. So in our example, we assign or map the four volumes to their respective application servers. From those servers, we format the volumes and write our data onto them. Table 2 reveals that we have only allocated from the pool the physical disk space required to hold the current data.

Table 2

              Current Data Size   Volume Size   Allocated Space
Volume1       150GB               300GB         150GB
Volume2       170GB               300GB         170GB
Volume3       160GB               250GB         160GB
Volume4       60GB                100GB         60GB
Total Sizes   540GB               950GB         540GB

So far, there has been no Over Subscription. With little effort we have created four volumes totaling 950GB, nearly all of our 1TB pool. Our applications could fully utilize their allotted storage as there is enough physical storage in the pool, but on examining the Thin Provisioning manager, we determine that the pool is only about 54% allocated. In fact, as we populated our volumes, the Thin Provisioning manager only allocated the storage necessary to hold the real data. We still have 46% of our storage pool remaining, available should we need to use it.

An Example of Over Subscription: No-More-Headaches Storage Management

Let us now suppose a few months have passed and our applications have continued using the available storage of the pool, each growing at some rate. One day we need to create another volume for a project that will require about 90GB today, but could grow to 200GB. We use the Thin Provisioning manager to create another volume from the pool and size it to 200GB.

At this point, we have over subscribed: we have provisioned more storage to volumes than we physically have in the pool. See Table 3 below.

Table 3

              Current Data Size   Volume Size   Allocated Space
Volume1       170GB               300GB         170GB
Volume2       185GB               300GB         185GB
Volume3       165GB               250GB         165GB
Volume4       65GB                100GB         65GB
Volume5       90GB                200GB         90GB
Total Sizes   675GB               1150GB        675GB

Note that in spite of the fact we are over subscribed by 150GB, the pool is still only 67.5% utilized, and 325GB of storage remains available for the growth of the 5 volumes.
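The distinction between what has been advertised and what has actually been allocated can be made concrete with a small Python sketch. The function name pool_report and its output keys are hypothetical, chosen for illustration; the input figures come from Tables 2 and 3.

```python
def pool_report(pool_gb, volumes):
    """Summarize a pool; volumes maps name -> (volume size GB, allocated GB)."""
    subscribed = sum(size for size, _ in volumes.values())
    allocated = sum(alloc for _, alloc in volumes.values())
    return {
        "subscribed_gb": subscribed,
        "allocated_gb": allocated,
        "free_gb": pool_gb - allocated,
        # Over Subscription depends on advertised sizes, not on usage.
        "over_subscribed_gb": max(0, subscribed - pool_gb),
        "utilization_pct": 100 * allocated / pool_gb,
    }


table2 = {
    "Volume1": (300, 150),
    "Volume2": (300, 170),
    "Volume3": (250, 160),
    "Volume4": (100, 60),
}
table3 = {
    "Volume1": (300, 170),
    "Volume2": (300, 185),
    "Volume3": (250, 165),
    "Volume4": (100, 65),
    "Volume5": (200, 90),
}

before = pool_report(1000, table2)
after = pool_report(1000, table3)
print(before["over_subscribed_gb"], before["utilization_pct"])  # 0 54.0
print(after["over_subscribed_gb"], after["utilization_pct"])    # 150 67.5
print(after["free_gb"])                                         # 325
```

Over Subscription is measured against the advertised volume sizes, while utilization is measured against the space actually allocated, which is why a pool can be over subscribed and still have plenty of free space.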

What Happens When the Pool is Depleted?

(Coming Soon)

Growing The Pool With No Downtime

The abstraction layer created by Storage Pooling allows us to add storage to the pool at any time and potentially without disrupting production. Since our Thin-Provisioned volumes are based on a virtual geometry and not on any particular physical disk structure or RAID group, we can easily add more disks or LUNs to the storage pool without modification to those volumes and without informing their associated application servers. We are, in effect, pouring more storage into the storage pool from which our virtual volumes feed.

If the new physical storage can be added without shutting down the server, the procedure is as simple as discovering the new physical storage (i.e. "Rescan Disks" to find new LUNs) and adding them to the pool.
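As a rough illustration of why growing the pool does not touch the volumes, consider the following Python sketch. The class GrowablePool, the disk names and the 500GB LUN are all hypothetical; the 1000GB, 675GB and 1150GB figures are those of the Over Subscription example.

```python
class GrowablePool:
    """Hypothetical pool whose capacity is the sum of its member disks."""

    def __init__(self):
        self.disks = {}        # disk or LUN name -> capacity in GB
        self.allocated_gb = 0  # space already handed out to volumes

    def add_disk(self, name, capacity_gb):
        if name in self.disks:
            raise ValueError(f"{name} is already in the pool")
        self.disks[name] = capacity_gb

    @property
    def capacity_gb(self):
        return sum(self.disks.values())

    @property
    def free_gb(self):
        return self.capacity_gb - self.allocated_gb


pool = GrowablePool()
pool.add_disk("array0", 1000)  # the original 1TB of storage
pool.allocated_gb = 675        # allocated space from the example
advertised_gb = 1150           # sum of the five advertised volume sizes

print(pool.free_gb, max(0, advertised_gb - pool.capacity_gb))  # 325 150

pool.add_disk("new_lun", 500)  # hypothetical LUN found by a rescan

print(pool.free_gb, max(0, advertised_gb - pool.capacity_gb))  # 825 0
```

Nothing about the volumes changes in this sketch: their advertised sizes and existing allocations stay as they were, and only the pool's capacity grows.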

If adding the new storage requires shutting down the storage server, using synchronous mirroring between two storage servers can allow you to bring down that server without interrupting production: you simply fail all of the server's active volumes over to the surviving server, effectively performing a "rolling upgrade".

Thin Provisioning with VMware VMFS Volumes

Using Thin Provisioning with your VMFS volumes radically improves capacity utilization and simplifies allocating space for VMDKs. It also helps to decouple VM Server Sprawl from capacity allocation, meaning your calls to the storage vendor are less frequent. [More on this later as I find time...]

Thin Provisioning With Snapshots

Using Thin Provisioning with point-in-time Snapshots can dramatically improve storage utilization. Thin Provisioned snapshot volumes take the guesswork out of sizing snapshot reserve space and permit the sharing of space among many snapshots. DataCore implements Snapshots using Copy-On-Write technology: before a change is committed to a block on a snapshot-enabled source volume, the original contents of that block are copied to the snapshot destination volume. Using Thin Provisioning means you no longer have to pre-allocate reserved disk space to hold the snapshot image; physical storage will be allocated from the pool as needed to hold the original blocks when the source is changed. Consider the example of taking a nightly snapshot of our volumes for backup purposes. [To be continued...]
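The Copy-On-Write behavior described above can be sketched in Python. This is a generic illustration of the technique, not DataCore's code: the Volume and Snapshot classes are hypothetical, and a dictionary stands in for the thin-provisioned snapshot destination, growing only as original blocks need to be preserved.

```python
class Snapshot:
    """Point-in-time view of a volume, populated lazily (Copy-On-Write)."""

    def __init__(self, source):
        self.source = source
        self.saved = {}  # block number -> original contents at snapshot time

    def preserve(self, block_no):
        # Copy the original block only once, before its first overwrite.
        # A block that did not exist at snapshot time is recorded as None.
        if block_no not in self.saved:
            self.saved[block_no] = self.source.blocks.get(block_no)

    def read(self, block_no):
        if block_no in self.saved:
            return self.saved[block_no]
        # Unchanged since the snapshot: read through to the source volume.
        return self.source.blocks.get(block_no)


class Volume:
    """Source volume that preserves originals in its snapshots on write."""

    def __init__(self, blocks):
        self.blocks = dict(blocks)  # block number -> data
        self.snapshots = []

    def snapshot(self):
        snap = Snapshot(self)
        self.snapshots.append(snap)
        return snap

    def write(self, block_no, data):
        for snap in self.snapshots:
            snap.preserve(block_no)  # copy the original before committing
        self.blocks[block_no] = data


vol = Volume({0: b"alpha", 1: b"beta"})
snap = vol.snapshot()
vol.write(1, b"BETA")
vol.write(1, b"gamma")   # second overwrite: nothing more is copied

print(snap.read(0))      # b'alpha' (read through to the source)
print(snap.read(1))      # b'beta'  (preserved original)
print(len(snap.saved))   # 1
```

The snapshot consumes space only for blocks that have changed on the source since it was taken, which is why no reserve needs to be sized in advance.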