
Should I defragment SAN-based volumes?

A White Paper by Tim Warden

Updated 2012.05.17

This is a complicated question that cannot be answered with a simple yes or no. The objective of volume defragmentation is to improve I/O performance principally by reducing seek latency on an inherently slow physical medium: the disk drive or "spindle".

In a single physical disk environment, such as the hard drive on your laptop, occasional volume defragmentation can significantly improve performance, as most of us have observed. Reading a large file or document from disk is much faster when the file is stored on consecutive blocks. If the file is heavily fragmented, the actuator assembly (actuator, arms, heads, etc.) of the physical drive may have to "travel" some distance and wait for the disk platter to rotate into position before the read can continue. Defragmentation in this environment allows reading an entire file without "thrashing the disk". The blocks are contiguous, and so seek and rotational latencies are minimized.
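
To put rough numbers on the difference, here is a back-of-the-envelope sketch in Python. The figures are illustrative assumptions (an 8ms average seek, roughly 4.2ms average rotational latency for a 7,200 RPM drive, and 100MB/s sequential throughput), not measurements of any particular drive.

    # Back-of-the-envelope comparison of contiguous vs. fragmented reads.
    # All figures below are illustrative assumptions, not measurements.
    AVG_SEEK_S     = 0.008     # ~8 ms average seek
    AVG_ROTATION_S = 0.0042    # ~4.2 ms average rotational latency (7,200 RPM)
    TRANSFER_MB_S  = 100.0     # ~100 MB/s sustained sequential transfer

    def read_time_seconds(file_mb, fragments):
        transfer = file_mb / TRANSFER_MB_S
        positioning = fragments * (AVG_SEEK_S + AVG_ROTATION_S)
        return transfer + positioning

    print(read_time_seconds(100, 1))       # contiguous file:  ~1.0 s
    print(read_time_seconds(100, 1000))    # 1,000 fragments:  ~13.2 s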

In a Storage Area Network (SAN) or Shared Storage Array, the physical disk or spindle has been virtualized; your server volumes are based on logical entities or "LUNs" that often have no direct relationship with the geometry of a particular physical spindle. We must therefore take into account many other factors affecting volume performance before deciding whether to defragment.

To begin with, consider the nature of a Shared Storage Array. Generally speaking, its purpose is to offer a set of disk services to consolidate and enhance data storage. The array is typically shared by several storage clients (i.e. Application Servers) who may all be concurrently vying for the shared resources. These resources include:

  • Network bandwidth (Fibre Channel or IP/iSCSI Cables and Switches)
  • Storage Array Front-End bandwidth (the Fibre or iSCSI ports on the array)
  • Storage Processor CPU cycles
  • Storage Processor cache
  • RAID controller cycles and cache
  • Bandwidth of Back-End physical disk connections
  • Physical disks
  • Capacity of Storage Tiers in Auto-Tiered environments
  • Synchronous Mirroring and/or Asynchronous Replication Bandwidth

How these resources are configured and how the disk space is allocated both influence the decision on whether to defragment. Let us look at defragmenting from two points of view: the process of defragmentation itself, and the end result of a defragmented logical volume.

Defragmenting is Very I/O Intensive

Anyone who has defragmented a laptop or desktop computer's internal hard disk knows that defragmentation is extremely disk intensive and can disrupt other concurrent processes.

This is exacerbated in a shared storage environment. The storage processor begins allotting the shared resources (cache, channels, RAID controller activity) to the heavy I/O load generated by the incessant reads and writes of the defragmentation process. Additionally, if the volume you are attempting to defragment is a partition on a RAID group, other servers whose volumes reside on other partitions of that same RAID group will all suffer seriously degraded performance.

Some defragmentation programs quiesce or throttle their activity when they sense other I/O on the server, but alas, they can only listen for I/O activity on the server on which they are running. They have no visibility into I/O activity from other servers in the SAN.

If the volume you are defragmenting is on a RAID-5 group, performance is further degraded depending on whether the storage processor implements write cache coalescing and on the size of the defragmenter's writes. The size of a RAID-5 stripe is, among other things, dictated by the number of disks in the stripe. If the defragmenter's writes are smaller than the stripe, the physical disks containing the write destination blocks as well as the disk containing the parity information will need to be read into cache, the parity recalculated, and those same disk blocks re-written.

Worse still, if these small writes cross RAID-5 stripe boundaries, the result will be two stripes' worth of reads, parity calculations, and writes. (For a detailed discussion of RAID concepts, consult the RAID article on Wikipedia.)
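
As a rough illustration of that penalty, the following Python sketch counts the back-end disk operations generated by one small front-end write on a RAID-5 group: read the old data and old parity, recompute the parity, then write the new data and new parity, with the cost doubling when the write straddles a stripe boundary. The stripe geometry is hypothetical and the model deliberately ignores write cache coalescing.

    # Simplified RAID-5 small-write accounting (ignores cache coalescing).
    # A small write contained in one stripe costs 4 back-end I/Os:
    #   read old data + read old parity, then write new data + new parity.
    # A write that straddles a stripe boundary pays that cost on both stripes.
    def raid5_backend_ios(write_offset_kb, write_size_kb, stripe_kb):
        first_stripe = write_offset_kb // stripe_kb
        last_stripe = (write_offset_kb + write_size_kb - 1) // stripe_kb
        stripes_touched = last_stripe - first_stripe + 1
        return stripes_touched * 4

    # Example: 64KB writes against a hypothetical 256KB stripe.
    print(raid5_backend_ios(0, 64, 256))      # aligned write:       4 back-end I/Os
    print(raid5_backend_ios(224, 64, 256))    # crosses a boundary:  8 back-end I/Os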

Of course, if the volume you are defragmenting is small, the immediate impact on performance will be temporary. However, if the target volume is over 10GB, the time required to complete defragmentation can be disruptive to a production environment, having an adverse effect on other critical, I/O intensive servers. Your only option is to defragment during "off hours", if your organization has such a concept in this 24/7 world.

Does Defragmenting Always Improve Performance On Logical Volumes?

Not necessarily. While defragmentation will certainly reduce File System processing for a given volume, it may not improve the responsiveness of retrieving data from the spindles. In fact, when multiple volumes are placed on partitions sharing a particular RAID group of spindles, defragmentation may actually degrade overall performance of the group of volumes, depending on how thorough the defragmentation process was. Depending on the RAID type employed, defragmented volumes may actually increase the distance the heads of the individual spindles have to travel (seek latency and subsequent rotational latency) to respond to various-sized concurrent read and write requests for the multiple volumes. It is difficult to draw a general conclusion that regrouping sectors of a logical volume into contiguous "logical" blocks on a RAID group will have a positive or an adverse effect on performance.

Advocates of defragmenting will point out that some defrag programs optimize the file system organization to improve performance. While their assertion has merit in terms of the number of meta-data I/O requests potentially required to satisfy a read or write, the impact of a file system reorg in a SAN shared storage environment is again difficult to measure, as defragmenting could actually result in some frequently accessed blocks being placed on slower sectors of the RAID group's spindles, which could outweigh any anticipated gains of the File System optimization. Remember that in a shared storage environment the logical blocks are often scattered across the physical spindles in a RAID group and not necessarily on the "fastest" sectors, unless you have gone to great lengths to isolate your most speed-critical volumes to known fast areas (e.g., the "Hyper-0" sectors on a Symmetrix DMX). Unfortunately, most shops don't have the time or resources to micro-manage or plan their SAN storage at that level.

Defragmenting Not Recommended for Virtual Storage Pools

In Virtual Storage Pools, the physical storage and/or logical volumes of RAID groups are completely abstracted from the virtual volumes mapped to storage clients. For instance, consider a Windows server in a SAN. In the Logical Disk Manager, a Virtual Volume may show up as, say, Disk 2. Format that volume with an NTFS file system and write several gigabytes worth of data onto it. If Disk 2 is a volume coming from a Virtual Storage Pool based on several RAID-5 stripes, the Disk 2 data may in fact be spread across multiple disks in multiple RAID groups. Storage Allocation Units (SAUs) are assigned to specific groups of contiguous sectors for a volume on an as-needed (i.e. as-written) basis, and may be distributed across the storage pool members in a round-robin or striped fashion.

Virtual Storage Pools that employ striping often yield outstanding performance results, similar to the striping used in RAID-10, RAID-50, or RAID-100 type configurations. However, defragmentation of a logical volume becomes meaningless because there is no linear sector-to-sector, block-to-block correlation between the logical and physical disks. Defragmenting will typically only result in additional latency as the "logically contiguous" blocks become even more scattered across the pool.

Finally, defragmenting a Thin-Provisioned or Sparse Volume has the undesirable effect of causing unnecessary allocation of storage from the pool.
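
The toy model below (Python, with made-up pool geometry) illustrates both points: Storage Allocation Units are handed out round-robin across the pool members on first write, so "logically contiguous" blocks land on different physical members anyway, and a defragmenter that writes into previously untouched regions of a thin volume forces the pool to allocate SAUs the application never needed.

    # Toy model of a striped, thin-provisioned virtual storage pool.
    # Assumed geometry: 128MB SAUs handed out round-robin across 4 pool members.
    SAU_MB = 128
    MEMBERS = 4                          # e.g. four RAID group LUNs backing the pool

    class ThinVolume:
        def __init__(self):
            self.sau_map = {}            # logical SAU index -> (member, slot)
            self.next_alloc = 0
        def write(self, logical_mb):
            sau = logical_mb // SAU_MB
            if sau not in self.sau_map:  # first touch allocates from the pool
                self.sau_map[sau] = (self.next_alloc % MEMBERS,
                                     self.next_alloc // MEMBERS)
                self.next_alloc += 1
            return self.sau_map[sau]

    vol = ThinVolume()
    for mb in (0, 200, 400):             # the application writes ~3 SAUs worth of data
        vol.write(mb)
    print(len(vol.sau_map) * SAU_MB)     # 384 MB allocated from the pool
    # A defragmenter shuffling data into "empty" logical space triggers new allocations:
    for mb in (600, 800, 1000, 1200):
        vol.write(mb)
    print(len(vol.sau_map) * SAU_MB)     # 896 MB allocated, though the data didn't grow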

Defragmenting Not Recommended for Auto-Tiered Storage

Several storage vendors are now introducing Auto-Tiering into their products. Compellent pioneered this feature in 2006, well before their acquisition by Dell. Auto-Tiering exists in a few flavors based on vendor implementations — some are real time, others scheduled, some may be file based while others are sub-LUN block or "chunk" based — but in general the idea is for the storage controllers to monitor storage access and automatically promote the most frequently accessed storage to higher tiers, while demoting infrequently or rarely accessed blocks to lower tiers.

DataCore introduced Auto-Tiering into their SANsymphony-V Storage Hypervisor in Release 8.1. The implementation is "sub-LUN", based on the pool's settable Storage Allocation Unit size, which by default is 128MB. The feature uses a periodic ROI (Return On Investment) calculation to determine, based on the virtual disk's profile, whether to promote or demote an SAU, thus eliminating "thrashing" and assuring that the tier assets are best utilized.

In a defragmentation operation with ROI-based Auto-Tiering, the single read of one block and its subsequent write to a different block would not necessarily trigger the promotion or demotion of the block that was read. However, based on the owning virtual disk's profile and the occupancy of the tiers, the block written could indeed be placed on a lower or higher tier. If the file system on the virtual disk were heavily fragmented, this could have the undesirable effect of causing much of the virtual disk's blocks to change tiers.
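
To make the mechanism concrete, here is a deliberately simplified Python sketch. The placement policy shown (the fastest tier the virtual disk's profile allows that still has free capacity) is an illustration of sub-LUN tiering in general, not DataCore's ROI calculation; the point is simply that when a defragmenter rewrites cold data into new logical locations, the newly allocated SAUs can land on a different tier than the ones Auto-Tiering had chosen, consuming fast-tier capacity along the way.

    # Illustrative sub-LUN placement sketch (NOT DataCore's ROI algorithm).
    def place_new_sau(profile_top_tier, free_slots):
        """Place a newly written SAU on the fastest tier the profile allows
        that still has room. free_slots: free SAU slots per tier, fastest first."""
        for tier in range(profile_top_tier, len(free_slots)):
            if free_slots[tier] > 0:
                free_slots[tier] -= 1
                return tier
        raise RuntimeError("pool full")

    free = [10, 50, 500]                  # tier 0 nearly full, tier 2 has plenty of room
    # Cold blocks previously demoted to tier 2 get rewritten by the defragmenter
    # into new logical locations; the new SAUs land wherever there is room:
    moves = [place_new_sau(0, free) for _ in range(30)]
    print(moves.count(0), moves.count(1), moves.count(2))   # 10 20 0: cold data pulled up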

Finally, if you are using a Storage Hypervisor such as SANsymphony-V to virtualize storage, then you are likely aware that you can use cloud storage products, such as TwinStrata and Amazon S3, for your lowest or "Archive" tier. Defragmenting an Auto-Tiered LUN whose infrequently accessed blocks have been relegated to the cloud would again have the undesirable effect of potentially yanking those blocks back to a local storage tier.

Defragmenting Not Recommended for Solid State Drives

I think you can figure this one out without much explanation. Briefly: no moving parts, thus no seek or rotational latency, thus defragmentation has no real benefit. Worse still, you've no doubt heard that repeated program-erase cycles will eventually wear out the flash cells.

Defragmenting Not Recommended for Replicated Volumes

If your volumes are the source of a Synchronous Data Mirror — or worse still, Asynchronous Data Replication — then you should definitely avoid defragmentation. Each write associated with the defrag process will result in corresponding mirror writes to the destination or replication site. This can easily consume the bandwidth of the interconnect. In the case of Async Replication over a slower link (T1, etc.), it can overwhelm the replication mechanism, potentially forcing a near or full resynchronization of the entire volume across the slow link (cf. "Internet Tubes").
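
A quick back-of-the-envelope calculation in Python (with illustrative numbers and an assumed 80% usable link efficiency) shows why: a defragmentation pass that rewrites just 20GB of data would take roughly a day and a half to drain across a T1, during which the replica falls further and further behind.

    # Time for the replication link to drain the writes generated by a defrag pass.
    # Illustrative numbers; a real link also carries normal application traffic.
    def drain_hours(rewritten_gb, link_mbps, efficiency=0.8):
        bits = rewritten_gb * 8 * 1e9
        return bits / (link_mbps * 1e6 * efficiency) / 3600

    print(round(drain_hours(20, 1.544), 1))   # T1:           ~36 hours
    print(round(drain_hours(20, 100), 1))     # 100 Mbps WAN: ~0.6 hours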

Defragmenting Not Recommended for Snapshot Source Volumes

Defragmenting a volume that has active snapshots associated with it will have the adverse effect of causing the snapshot manager to preserve the original blocks as they are reorganized on the source volume. In the common "Copy On Write" implementations, this can create an additional performance lag.

In those implementations that simply write the new blocks into "reserve space" and re-direct pointers (such as in NetApp's Snapshot implementation), the defragmentation process can easily eat up all the reserve space in the volume.
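
Here is a minimal copy-on-write model in Python, with a hypothetical block size and reserve chosen purely for illustration: the first time any block is overwritten after a snapshot is taken, its original contents must be preserved, so a defragmenter that relocates most of a volume's blocks can consume the snapshot reserve almost as quickly as it runs.

    # Minimal copy-on-write snapshot model (illustrative sizes).
    BLOCK_KB = 64
    RESERVE_MB = 2048                     # snapshot reserve set aside for the volume

    class CowSnapshot:
        def __init__(self):
            self.preserved = set()        # blocks whose original contents were saved
        def overwrite(self, block):
            if block not in self.preserved:
                self.preserved.add(block) # first touch copies the old block aside
        def reserve_used_mb(self):
            return len(self.preserved) * BLOCK_KB / 1024

    snap = CowSnapshot()
    for block in range(50_000):           # defrag relocates 50,000 blocks (~3GB)
        snap.overwrite(block)
    print(snap.reserve_used_mb())         # 3125 MB, well past the 2048 MB reserve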

Defragmenting Not Recommended for CDP Volumes

If you've implemented a CDP (Continuous Data Protection) system, you'll want to avoid defragmenting those volumes, as the procedure will quickly eat up your CDP journal space. Hopefully, you will have chosen to implement CDP at the storage controller level and not as an agent on your server. However, if you are using a host-based CDP agent that runs at the volume level (but not necessarily the file level), you will further impair performance on the server, as each defrag write will correspond to two physical writes: one to the disk, and another agent write to the CDP server.

Alternatives For Improving Performance

If poor SAN I/O performance is causing you to study articles on disk defragmentation, you might want to consider looking for other ways to reduce I/O latency.

Use Intelligent Caching Storage Processors

Caching is a common and effective means of improving performance for just about any application. Caching is used ubiquitously, from the CPU (with its three levels of cache), to the RAM in your application servers, to the expensive cache on your storage processors. The idea is always to avoid accessing data over slower media, whether that be the spindles, the 1, 2, or 4Gb fibres, the PCI, PCI-X or PCI Express buses, the 667MHz memory buses, or the L3 and L2 CPU caches. Slow is relative, with L2 being the fastest and the spindles being the slowest.

Most spindles give I/O response times in thousands of microseconds. The typical storage processor gives I/O response times in hundreds of microseconds on a cache hit (i.e. where the data you are looking for is already in the storage processor's memory).
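
The arithmetic behind those figures is straightforward. Using illustrative latencies of 200 microseconds for a cache hit and 8,000 microseconds for a trip to the spindles, the effective response time is simply the weighted average, so every additional point of cache hit rate pays off directly:

    # Effective I/O latency as a function of cache hit rate (illustrative figures).
    CACHE_US = 200.0      # ~hundreds of microseconds on a storage processor cache hit
    DISK_US = 8000.0      # ~several milliseconds from the spindles

    def effective_latency_us(hit_rate):
        return hit_rate * CACHE_US + (1 - hit_rate) * DISK_US

    for hit_rate in (0.50, 0.80, 0.95):
        print(hit_rate, round(effective_latency_us(hit_rate)))
    # 0.5 -> 4100 us, 0.8 -> 1760 us, 0.95 -> 590 us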

One Storage Virtualization vendor, DataCore Software Corporation, measures its I/O response times at 10 microseconds or less. This is a result of their sophisticated caching engine combined with a real-time I/O Subsystem. The fact that the software runs on common x86 servers is also a contributing factor, given Time-To-Market considerations, not to mention that you have a choice of the hardware on which you build the Storage Processor.

Placing their SANsymphony-V Storage Hypervisor in front of your existing SAN Storage Arrays will have the effect of turbo-charging performance for a variety of reasons:

  • Additional layer of high performance adaptive cache (e.g. server RAM) that will change the character of both front-end and back-end I/Os on your existing SAN storage array, reducing contention.
  • Real-Time I/O subsystem uses I/O polling instead of interrupt handlers to complete I/Os
  • Takes advantage of new technologies your existing SAN may not have: the latest PCI-Express bus and memory speeds, 8Gbps (and soon 16Gbps) Fibre Channel, etc. For example, SANsymphony-V can talk 8Gb FC between your application servers and its cache, and on the backend talk 4Gb FC between its cache and your older SAN storage arrays.
  • Allows channel fan-out, effectively creating a Network Storage Processor that has more FC and/or iSCSI front-end ports than your existing SAN storage array.
  • Can be used to implement higher performance nested RAID levels on top of your existing SAN storage array or arrays: RAID 10, RAID 100, RAID 10+1, RAID 0+1, RAID 50+1, etc.
  • Allows Auto-Tiering of storage to assure the most critical and frequently accessed volume blocks are on the fastest disks.
  • Allows easy federation of different vendors' disk technologies, such as SSD drives from FusionIO, Violin or Texas Memory alongside lower tier storage from EMC, HP, Dell, XIO, Nexsan, NetApp, etc.

Use Tiered Virtual Storage Pools

Tiering your storage through virtualization can allow you to move less critical volumes onto lower-cost, lower-performance storage, freeing up your higher-performance, higher-cost resources for your most critical applications. For instance, rather than implement snapshots on your Tier-1 storage hardware, using a Storage Virtualization platform such as SANsymphony-V can allow you to take snapshots of your Tier-1 volumes on another storage platform. Thus you recover expensive Tier-1 storage space otherwise reserved for snapshots, and the Tier-1 Array is no longer involved in the Snapshot processing.

Distribute the Work Load Across Storage Processors and Channels

Consult your SAN Storage Arrays' utilities to determine whether you have adequately distributed (or "Load Balanced") the I/O load across your HBAs, fibres, switches, and storage processors.

Choose Optimized RAID levels for performance-critical volumes

Depending on the characteristics of your I/Os, optimizing RAID levels and/or reorganizing the physical layout of your LUNs can improve performance. If, for example, your database tables and logs are located on the same RAID 5 spindles and you are having performance issues, you should consider moving the logs onto another set of spindles, perhaps in a RAID 1 or RAID 10 configuration. Keeping small-transaction, write-intensive volumes off RAID-5 will definitely improve performance.
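
One way to see why is to count the back-end disk operations each RAID level incurs per small random write: mirrored layouts (RAID-1, RAID-10) pay two writes, while RAID-5 pays two reads plus two writes and RAID-6 pays three of each. The Python sketch below applies those standard penalties to a hypothetical set of spindles; it ignores caching and assumes purely small random writes.

    # Approximate usable random-write IOPS for a set of spindles under different
    # RAID levels. Simplified model: ignores cache, assumes small random writes.
    WRITE_PENALTY = {"RAID-10": 2, "RAID-5": 4, "RAID-6": 6}   # back-end I/Os per write

    def usable_write_iops(spindles, iops_per_spindle, raid_level):
        return spindles * iops_per_spindle / WRITE_PENALTY[raid_level]

    # Example: 8 spindles at ~180 IOPS each (an illustrative 10k RPM figure).
    for level in WRITE_PENALTY:
        print(level, usable_write_iops(8, 180, level))
    # RAID-10: 720, RAID-5: 360, RAID-6: 240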

While attempting to avoid contention for the spindles, you should also consult your SAN Storage Arrays' documentation on how the spindles are physically connected to the storage processor. Keep in mind that in a dual-storage-processor array, the disks are dual-ported. Laying out the spindles of different RAID groups across different back-end channels and keeping performance-critical volumes on separate channels and spindles can reduce contention on the back-end resources.

Reduce Filesystem Fragmentation at the Source

Some of the more modern filesystems attempt to reduce fragmentation by scattering their writes widely over the logical surface of today's large volumes. The idea is to leave plenty of white space between files so that as they grow over time there is no need to fragment them, and thus no reason to defragment them either. "Plenty" is obviously a relative term, depending on the size of the filesystem and how full it is.

Condusiv Technologies, the maker of Diskeeper®, claims that their patented IntelliWrite technology proactively reduces fragmentation of NTFS filesystems. This is a far superior approach, eliminating fragmentation at the source and thus eliminating the need to defragment.

I asked Gary Quan, Senior Vice President of Product Strategy at Condusiv, for advice on how best to use Diskeeper® in a SAN environment. "IntelliWrite will help prevent up to 85% of the fragmentation, and thus AutoDefrag will not have to do as much." Nonetheless, for SAN devices he advised "our best practices are to just enable IntelliWrite and disable AutoDefrag. You may still need to run a traditional Defrag at the outset, especially for the new users who have never defragmented their filesystems." He mentioned, as an example, one of Condusiv's larger Diskeeper customers that had over 40 million fragments on one volume.

Gary adds, "By the way, was just at NetApp Testing facilities this last week to test out IntelliWrite and it worked as expected, preventing fragmentation and thus not causing any provisioning, copy-on-write, or dedup activity." That's good news for NetApp users, as the WAFL filesystem is notorious for fragmenting badly over time, particularly when making heavy use of snapshots which use a "Redirect On Write" technology, or their SIS (Single Instance Storage) feature that reduces duplication but effectively creates additional fragmentation at the block level.

If you would like to learn more, Condusiv offers a brief white paper explaining the IntelliWrite technology and the Diskeeper product in light of SAN storage volumes.

Test Drive DataCore's SANsymphony-V Storage Hypervisor

SANsymphony-V implements a Storage Hypervisor at the heart of the SAN, creating a new layer of cache, offering Sub-LUN Auto-Tiering and abstracting the storage from the servers. You can download a full-featured free 30-day trial here: http://www.datacore.com/30-Day-Trial.aspx

Full Disclosure

The author of this white paper is an employee of DataCore Software Corporation, the publisher of the SANsymphony-V software mentioned above.

The author of this white paper is in no way affiliated with Condusiv or the Diskeeper® product, and has no financial interest whatsoever in Condusiv.

The opinions expressed above are those of the author and do not necessarily represent those of DataCore, Condusiv, or any of the other vendors mentioned herein.