Storage Pooling, Thin Provisioning And Over Subscription
Author: Tim Warden
Storage Pooling: The Foundation of Virtualization
Storage Pooling is an abstraction of physical storage resources from the logical volumes or LUNs
published by a storage system. Storage Pooling is the very foundation of Storage Virtualization.
With Storage Pooling, the disks, whether physical or logical, can be aggregated into pools
from which the logical volumes are allocated. The objective is to simplify storage allocation by
eliminating the need to manage partitioned space on the physical storage resources.
Thin Provisioning Explained
Thin Provisioning refers to a method of improving storage utilization by only allocating to a volume
the physical storage required to hold its data. Thin Provisioning is made possible by the layer of
abstraction created by storage pooling in a Storage Virtualization system. The storage array represents
a served logical volume or disk as a contiguous mapped space made up of addressable blocks. Generally,
the logical blocks are remapped by the storage controller to a physical space which may have any geometry
supported by the controller. For instance, the logical volume may be remapped to a particular set of blocks
on several physical disks that comprise a "partition" in a RAID-5 group. The correlation between
logical and physical is one-to-one, albeit the physical blocks may be scattered across the disks
of the array.
Thin Provisioning takes this abstraction one step further. It presumes that until an application
writes a particular block or group of blocks in a virtual volume, there is no need to allocate the
physical space to accommodate those blocks. In Thin Provisioning, physical storage resources are
placed into logical storage pools. Virtual Volumes are created from these pools as logical entities,
comprised initially of only a volume header identifying the volume and providing a volume map table.
The volume is, in essence, an empty shell. The volume map table is used to keep track of where
allocated blocks physically exist. When an application server writes a block on the volume, the
Thin Provisioning manager hashes the volume map table to determine where in the pool the block
physically resides. If the block has not yet been allocated, a new allocation unit is taken from
the logical volume's associated storage pool to accommodate the new block.
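The allocate-on-first-write behavior described above can be sketched in a few lines of Python. This is only an illustrative model, not DataCore's implementation: the class names `StoragePool` and `ThinVolume`, the `blocks_per_unit` parameter, and the use of a dictionary as the volume map table are all assumptions made for the example.

```python
class StoragePool:
    """A pool of fixed-size allocation units (hypothetical model)."""

    def __init__(self, total_units):
        self.total_units = total_units
        self.next_free = 0  # index of the next unallocated unit

    def allocate_unit(self):
        if self.next_free >= self.total_units:
            raise RuntimeError("storage pool depleted")
        unit = self.next_free
        self.next_free += 1
        return unit


class ThinVolume:
    """A virtual volume: an advertised size plus an initially empty map table."""

    def __init__(self, pool, size_blocks, blocks_per_unit=8):
        self.pool = pool
        self.size_blocks = size_blocks          # advertised size, in blocks
        self.blocks_per_unit = blocks_per_unit  # assumed allocation unit size
        self.map_table = {}  # logical unit index -> physical unit in the pool
        self.data = {}       # (physical unit, offset) -> payload; stands in for disk

    def write(self, block, payload):
        if not 0 <= block < self.size_blocks:
            raise IndexError("block outside advertised volume size")
        logical_unit, offset = divmod(block, self.blocks_per_unit)
        if logical_unit not in self.map_table:
            # First write to this region: take a new unit from the pool.
            self.map_table[logical_unit] = self.pool.allocate_unit()
        self.data[(self.map_table[logical_unit], offset)] = payload

    def read(self, block):
        if not 0 <= block < self.size_blocks:
            raise IndexError("block outside advertised volume size")
        logical_unit, offset = divmod(block, self.blocks_per_unit)
        physical = self.map_table.get(logical_unit)
        if physical is None:
            return None  # never written, so nothing was ever allocated
        return self.data.get((physical, offset))

    def allocated_units(self):
        return len(self.map_table)


pool = StoragePool(total_units=100)
vol = ThinVolume(pool, size_blocks=1000)
vol.write(0, b"a")
vol.write(3, b"b")    # falls in the same allocation unit as block 0
vol.write(800, b"c")  # first write to a different region
print(vol.allocated_units())
```

In this model the volume advertises 1000 blocks, yet only the regions that have actually been written consume units from the pool.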
The concept of Thin Provisioning was first introduced to the Open Computing world of SANs when
DataCore Software Corporation released the revolutionary "Network Managed Volume" feature in their
SANsymphony product back in 2001, years before 3Par and others introduced their own Thin Provisioning
implementations. DataCore renamed the feature "Auto Provisioning" in their SANmelody Storage Server
products, but it's the same mature code set found in the SANsymphony suite.
Thin Provisioning has many merits:
- It centralizes storage utilization monitoring.
- It radically simplifies storage allocation. The storage administrator no longer needs to carefully pre-plan partition sizes.
- It radically simplifies capacity planning. The storage administrators feed pools with additional storage as needed, without down time.
- It maximizes capacity utilization. This is particularly valuable in light of advanced features such as Snapshots,
Synchronous Mirrors, and Asynchronous Replication.
- It can facilitate better bandwidth utilization when implementing Synchronous and Asynchronous Replication, particularly
during Initialization or Mirror Recovery tasks.
Over Subscription
Thin Provisioning permits implementing another concept called Over Subscription (or Over Provisioning)
which is the concept of publishing volume sizes whose cumulative advertised capacity exceeds the physical
amount of storage available in the pool.
Thin Provisioning Versus Over Subscription
Some industry pundits, usually Vice Presidents of storage vendors whose offerings don't include
Thin Provisioning, have little good to say about the feature. Their principal argument against
Thin Provisioning runs along the lines of, "it's a lie; you're telling a user they have more storage
than they really do. It can create headaches for the storage admins." Their arguments fail to
differentiate between Thin Provisioning and Over Subscription, two distinct concepts.
Administrators of Thin Provisioning are by no means required to employ Over Subscription,
although there are many cases where they will want to do so; we will discuss those in a later
section.
Even without Over Subscription, Thin Provisioning in and of itself eases storage management
by simplifying storage allocation. New volumes can be created with a simple click of the mouse,
without first having to find a large enough contiguous block of free space to hold the new
volume.
An Example Sans Auto-Provisioning: Wasted Resources
Let us consider a project of provisioning storage using a non-Thin Provisioning storage array.
For the sake of simplicity we will use a small array with only 1TB of physical storage. We will
not concern ourselves with the specific composition of the storage (the type of disks or the RAID
groups employed). Suffice it to say that the management console of the array will permit us to slice up
the storage resources into "partitions" and those partitions will become the logical volumes that we need
to complete our project.
Our project requires us to create four volumes for various applications. We have already determined
the size of the data requirements, but we wish to allocate additional capacity for each of the volumes
in anticipation of future growth. Table 1 below indicates the current sizes of our data, the sizes of
the volumes we will create, and the space that will consequently be reserved or allocated for those
volumes.
| Volume | Current Data Size | Volume Size | Allocated Space |
| Volume1 | 150GB | 300GB | 300GB |
| Volume2 | 170GB | 300GB | 300GB |
| Volume3 | 160GB | 250GB | 250GB |
| Volume4 | 60GB | 100GB | 100GB |
| Total Sizes | 540GB | 950GB | 950GB |
Our total data is only 540GB of space, but we have almost fully utilized the 1TB of storage in the array,
leaving us only 50GB of free space. Thus we are effectively only using 57% of the allocated storage, while 43%
is wasted, statically reserved for future growth of the four existing volumes, but unusable should other volumes
need to be deployed.
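The figures quoted for Table 1 can be checked with a short calculation. The sketch below simply re-derives them from the table; it treats the 1TB array as 1000GB, as the text does, and the variable names are ours.

```python
POOL_GB = 1000  # the 1TB array, counted as 1000GB as in the text

# volume -> (current data in GB, partition size in GB), taken from Table 1
volumes = {
    "Volume1": (150, 300),
    "Volume2": (170, 300),
    "Volume3": (160, 250),
    "Volume4": (60, 100),
}

data_gb = sum(data for data, _ in volumes.values())
allocated_gb = sum(size for _, size in volumes.values())
free_gb = POOL_GB - allocated_gb

# Share of the statically allocated space that actually holds data.
used_pct = 100 * data_gb / allocated_gb
reserved_pct = 100 - used_pct

# Space stranded inside each partition, unusable by any other volume.
stranded = {name: size - data for name, (data, size) in volumes.items()}

print(f"data: {data_gb}GB, allocated: {allocated_gb}GB, free: {free_gb}GB")
print(f"used: {used_pct:.0f}%, reserved but idle: {reserved_pct:.0f}%")
print(stranded)
```

The per-volume breakdown shows where the idle space sits: it is locked inside each partition rather than shared.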
An Example with Thin Provisioning: Fully Utilized Storage Resources
Now let us consider the same example using Thin Provisioning. We use the Thin Provisioning manager's
GUI to instantly allocate the four volumes without any partition creation, without notion of where
they will be placed in the pool. We then set the volume sizes as per our plan. The Thin Provisioning
manager makes the allocation of the four new volumes as simple as a few mouse clicks.
At the time of creation, no space will have been taken from the pool. It is only when we begin
writing data onto the volumes that space will be allocated to them. So in our example, we assign
or map the four volumes to their respective application servers. From those servers, we format
the volumes and write our data onto them. Table 2 reveals that we have only allocated
from the pool the physical disk space required to hold the current data.
| Volume | Current Data Size | Volume Size | Allocated Space |
| Volume1 | 150GB | 300GB | 150GB |
| Volume2 | 170GB | 300GB | 170GB |
| Volume3 | 160GB | 250GB | 160GB |
| Volume4 | 60GB | 100GB | 60GB |
| Total Sizes | 540GB | 950GB | 540GB |
So far, there has been no Over Subscription. With little effort we have created four volumes
totaling 950GB, nearly all of our 1TB pool. Our applications could fully utilize their allotted
storage as there is enough physical storage in the pool, but on examining the Thin Provisioning
manager, we determine that the pool is only about 54% allocated. In fact, as we populated our
volumes, the Thin Provisioning manager only allocated the storage necessary to hold the real data.
We still have 46% of our storage pool remaining, available should we need to use it.
An Example of Over Subscription: No-More-Headaches Storage Management
Let us now suppose a few months have passed and our applications have continued using the available storage
of the pool, each growing at some rate. One day we need to create another volume for a project that
will require about 90GB today, but could grow to 200GB. We use the Thin Provisioning manager to create
another volume from the pool and size it to 200GB.
At this point, we have over subscribed: we have provisioned more storage to volumes than we
physically have in the pool. See Table 3 below.
| Volume | Current Data Size | Volume Size | Allocated Space |
| Volume1 | 170GB | 300GB | 170GB |
| Volume2 | 185GB | 300GB | 185GB |
| Volume3 | 165GB | 250GB | 165GB |
| Volume4 | 65GB | 100GB | 65GB |
| Volume5 | 90GB | 200GB | 90GB |
| Total Sizes | 675GB | 1150GB | 675GB |
Note that in spite of the fact we are over subscribed by 150GB, the pool is still only 67.5%
utilized, and 325GB of storage remains available for the growth of the 5 volumes.
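The distinction between what has been promised and what has been consumed can be made concrete with the Table 3 figures. The sketch below is only a worked calculation; the "subscription ratio" it prints is our own label for advertised capacity divided by physical capacity, not a term from any product.

```python
POOL_GB = 1000  # physical capacity of the pool, counted as 1000GB

# volume -> (allocated space in GB, advertised volume size in GB), from Table 3
volumes = {
    "Volume1": (170, 300),
    "Volume2": (185, 300),
    "Volume3": (165, 250),
    "Volume4": (65, 100),
    "Volume5": (90, 200),
}

allocated_gb = sum(alloc for alloc, _ in volumes.values())
advertised_gb = sum(size for _, size in volumes.values())

over_subscribed_gb = max(0, advertised_gb - POOL_GB)  # promised beyond physical
subscription_ratio = advertised_gb / POOL_GB          # > 1.0 means over subscribed
utilization_pct = 100 * allocated_gb / POOL_GB        # what is really consumed
remaining_gb = POOL_GB - allocated_gb

print(f"advertised: {advertised_gb}GB, over subscribed by: {over_subscribed_gb}GB")
print(f"subscription ratio: {subscription_ratio:.2f}")
print(f"pool utilization: {utilization_pct:.1f}%, remaining: {remaining_gb}GB")
```

Over Subscription is a property of the advertised sizes, while pool utilization depends only on what has been written; an administrator would watch the latter to decide when to add storage.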
What Happens When the Pool is Depleted?
(Coming Soon)
Growing The Pool With No Downtime
The abstraction layer created by Storage Pooling allows us to add storage to the pool at any time and potentially
without disrupting production. Since our Thin-Provisioned volumes are based on a virtual geometry and not on any
particular physical disk structure or RAID group, we can easily add more disks or LUNs to the storage pool without
modification to those volumes and without informing their associated application servers. We are, in effect, pouring
more storage into the storage pool from which our virtual volumes feed.
If the new physical storage can be added without shutting down the server, the procedure is as simple as discovering
the new physical storage (i.e. "Rescan Disks" to find new LUNs) and adding them to the pool.
If adding the new storage requires shutting down the server, using synchronous mirroring between two storage servers
can allow you to bring down the server without interrupting production: you simply fail all the server's active volumes
over to the surviving server, effectively performing a "rolling upgrade".
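To illustrate why growing the pool needs no change to the volumes, here is a minimal model in Python. The `Pool` class, its `add_device` method, and the device names are hypothetical and stand in for whatever the Thin Provisioning manager actually does; the sketch tracks capacity only and ignores data placement.

```python
class Pool:
    """Capacity-only model of a storage pool fed by several backing devices."""

    def __init__(self):
        self.devices = {}      # device name -> capacity in GB
        self.allocated_gb = 0  # space already handed out to thin volumes

    def add_device(self, name, capacity_gb):
        if name in self.devices:
            raise ValueError(f"device {name!r} is already in the pool")
        self.devices[name] = capacity_gb

    @property
    def capacity_gb(self):
        return sum(self.devices.values())

    @property
    def free_gb(self):
        return self.capacity_gb - self.allocated_gb

    def allocate(self, gb):
        if gb > self.free_gb:
            raise RuntimeError("storage pool depleted")
        self.allocated_gb += gb


pool = Pool()
pool.add_device("raid5-group-0", 1000)  # the original 1TB of storage
pool.allocate(675)                      # space consumed by the five volumes

advertised_sizes = {"Volume1": 300, "Volume2": 300, "Volume3": 250,
                    "Volume4": 100, "Volume5": 200}

free_before = pool.free_gb
pool.add_device("new-lun-1", 500)       # newly discovered LUN added to the pool
free_after = pool.free_gb

print(free_before, free_after)
print(advertised_sizes)                 # the volumes themselves are untouched
```

Adding the device changes only the pool's capacity; the advertised volume sizes, and therefore what the application servers see, stay exactly as they were.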
Thin Provisioning with VMware VMFS Volumes
Using Thin Provisioning with your VMFS volumes radically improves capacity utilization and simplifies allocating
space for VMDKs. It also helps to decouple VM Server Sprawl from capacity allocation, meaning your calls to
the storage vendor are less frequent. [More on this later as I find time...]
Thin Provisioning With Snapshots
Using Thin Provisioning with point-in-time Snapshots can dramatically improve storage utilization.
Thin Provisioned snapshot volumes take the guesswork out of sizing snapshot reserve space and permit
the sharing of space among many snapshots.
DataCore implements Snapshots using Copy-On-Write technology: before a change is committed to a block on a
snapshot-enabled source volume, the original contents of that block are copied to the snapshot destination volume.
Using Thin Provisioning means you no longer have to pre-allocate reserved disk space to hold the snapshot image:
physical storage will be allocated from the pool as needed to hold the original blocks when the source is changed.
Consider the example of taking a nightly snapshot of our volumes for backup purposes. [To be continued...]
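The Copy-On-Write behavior described above can be sketched as follows. This is a generic illustration of the technique, not DataCore's code: the `CowSnapshot` class and its method names are invented for the example, and Python dictionaries stand in for the source and snapshot destination volumes.

```python
class CowSnapshot:
    """Point-in-time view of a source volume using copy-on-write."""

    def __init__(self, source):
        self.source = source  # dict: block number -> data (the live volume)
        self.preserved = {}   # original blocks, copied only when first overwritten

    def write_source(self, block, data):
        # Preserve the original contents the first time a block is changed.
        # Later writes to the same block must not disturb the preserved copy.
        if block not in self.preserved:
            self.preserved[block] = self.source.get(block)
        self.source[block] = data

    def read_snapshot(self, block):
        # Changed blocks come from the preserved copies;
        # unchanged blocks are read straight from the source.
        if block in self.preserved:
            return self.preserved[block]
        return self.source.get(block)

    def snapshot_blocks_used(self):
        return len(self.preserved)


source = {0: "A", 1: "B", 2: "C"}
snap = CowSnapshot(source)
snap.write_source(0, "A2")
snap.write_source(0, "A3")  # second change to the same block
snap.write_source(5, "F")   # block that did not exist at snapshot time

print(snap.read_snapshot(0), source[0])
print(snap.snapshot_blocks_used())
```

The snapshot consumes space only for blocks that have changed since it was taken, which is why drawing that space from a thin-provisioned pool removes the need to size a reserve in advance.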
© 2003-2006 LAS SOLANAS CONSULTING. ALL RIGHTS RESERVED.
Various trademarks held by their respective owners.