Should I defragment SAN-based volumes?
A White Paper by Tim Warden
This is a complicated question that cannot be answered with a simple yes or no.
The objective of volume defragmentation is to improve I/O performance, principally
by reducing seek latency on an inherently slow physical medium: the disk drive.
In a single physical disk environment, such as the hard drive on your laptop,
occasional volume defragmentation can significantly improve performance, as most
of us have observed. Reading a large file or document from disk is much faster
when the file is stored on consecutive blocks. If the file is heavily fragmented,
the actuator assembly (actuator, arms, heads, etc.) of the physical drive may have
to "travel" some distance and wait for the disk platter to rotate into position
before the read can continue. Defragmentation in this environment allows reading
an entire file without "thrashing the disk". The blocks are contiguous, and so
seek and rotational latency is minimized.
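To put rough numbers on this, here is a back-of-the-envelope sketch. The figures (average seek time, rotational latency for a 7,200 RPM drive, sequential throughput) are assumptions typical of a consumer hard disk, not measurements:

```python
# Back-of-the-envelope comparison of contiguous vs. fragmented reads.
# All constants are assumptions typical of a 7,200 RPM consumer drive.
AVG_SEEK_MS = 9.0           # average seek time
AVG_ROTATION_MS = 4.17      # half a revolution at 7,200 RPM
SEQ_THROUGHPUT_MBS = 100.0  # sustained sequential read rate

def read_time_ms(file_mb: float, fragments: int) -> float:
    """Each fragment costs one seek plus rotational delay before streaming."""
    positioning = fragments * (AVG_SEEK_MS + AVG_ROTATION_MS)
    transfer = (file_mb / SEQ_THROUGHPUT_MBS) * 1000.0
    return positioning + transfer

print(read_time_ms(100, 1))    # contiguous file:  ~1013 ms
print(read_time_ms(100, 500))  # 500 fragments:    ~7585 ms
```

The transfer time is identical in both cases; the sevenfold difference is pure positioning overhead, which is exactly what defragmentation eliminates.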
In a Storage Area Network (SAN) or Shared Storage Array, the physical disk or
spindle has been virtualized; your server volumes are based on logical entities or
"LUNs" that often have no direct relationship with the geometry of a particular
physical spindle. We must therefore take into account many other factors
affecting volume performance when considering defragmentation.
To begin with, consider the nature of a Shared Storage Array. Generally
speaking, its purpose is to offer a set of disk services to consolidate and enhance
data storage. The array is typically shared by several storage clients
(i.e. Application Servers) who may all be concurrently vying for the shared
resources. These resources include:
- Network bandwidth (Fibre Channel or IP/iSCSI Cables and Switches)
- Storage Array Front-End bandwidth (the Fibre or iSCSI ports on the array)
- Storage Processor CPU cycles
- Storage Processor cache
- RAID controller cycles and cache
- Bandwidth of Back-End physical disk connections
- Physical disks
- Capacity of Storage Tiers in Auto-Tiered environments
- Synchronous Mirroring and/or Asynchronous Replication Bandwidth
How these resources are configured and how the disk space is allocated
all influence the decision on whether to defragment. Let us look at
defragmenting from two points of view: the process of defragmentation
itself, and the end result of a defragmented logical volume.
Defragmenting is Very I/O Intensive
Anyone who has defragmented a laptop or desktop computer's internal
hard disk knows that defragmentation is extremely disk intensive and
can disrupt other concurrent processes.
This is exacerbated in a shared storage environment. The storage
processor begins allotting the shared resources (cache, channels, RAID
controller activity) to the heavy I/O load generated by the incessant
reads and writes of the defragmentation process. Additionally, if the
volume you are attempting to defragment is a partition on a RAID
group, other servers owning volumes carved from other partitions
on that same RAID group will all suffer seriously degraded performance.
Some defragmentation programs quiesce or throttle their activity
when they sense other I/O on the server, but alas, they can only
listen for I/O activity on the server on which they are running.
They have no visibility into I/O activity on other servers in the SAN.
If the volume you are defragmenting is on a RAID-5 group,
performance is further degraded depending on whether the storage
processor implements write cache coalescing and the size of the
defragment program's writes. The size of a RAID-5 stripe is,
among other things, dictated by the number of disks in the stripe.
If the defragmenter's writes are smaller than the stripe, those
physical disks containing the write destination blocks, as well
as the disk containing the parity information, will need to be
read into cache, the parity recalculated, and those same disk
blocks written back out. Worse still, if these small writes cross
RAID-5 stripe boundaries, the result will be two stripes' worth of
reads, parity calculations, and writes. (For a detailed discussion
of RAID concepts, consult a tutorial on RAID levels.)
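The RAID-5 "small write penalty" described above is easy to quantify. Here is a minimal sketch, assuming the simplest read-modify-write model with no write-cache coalescing in the storage processor (real arrays often coalesce writes, which softens the penalty):

```python
# RAID-5 small-write penalty: each partially-updated stripe costs a
# read-modify-write cycle (read old data, read old parity, write new
# data, write new parity). Assumes no write-cache coalescing.
def raid5_physical_ios(offset_kb: int, write_kb: int, stripe_kb: int) -> int:
    """Physical disk I/Os generated by one logical write."""
    first_stripe = offset_kb // stripe_kb
    last_stripe = (offset_kb + write_kb - 1) // stripe_kb
    stripes_touched = last_stripe - first_stripe + 1
    return stripes_touched * 4

print(raid5_physical_ios(0, 64, 256))    # inside one stripe:   4 I/Os
print(raid5_physical_ios(224, 64, 256))  # crosses a boundary:  8 I/Os
```

A 64KB defragmenter write that happens to straddle a stripe boundary doubles the back-end work, precisely the scenario described above.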
Of course, if the volume you are defragmenting is small, the
immediate impact on performance will be temporary. However, if the
target volume is over 10GB, the time required to complete defragmentation
can be disruptive to a production environment, having an
adverse effect on other critical, I/O-intensive servers. Your only
option is to defragment during "off hours" if your organization has
such a concept in this 24/7 world.
Does Defragmenting Always Improve Performance On Logical Volumes?
Not necessarily. While defragmentation will certainly reduce File System processing
for a given volume, it may not improve the responsiveness of retrieving the data
from the spindles. In fact, when multiple volumes are placed on partitions
sharing a particular RAID group of spindles, defragmentation may actually
degrade overall performance of the group of volumes, depending on how
thorough the defragmentation process was. Depending on the RAID type
employed, defragmented volumes may actually increase the distance the heads
of the individual spindles have to travel (seek latency and subsequent
rotational latency) to respond to various-sized concurrent read and write
requests for the multiple volumes. It is difficult to draw a general
conclusion that regrouping the sectors of a logical volume into contiguous
"logical" blocks on a RAID group will have a positive or adverse effect
on overall performance.
Advocates of defragmenting will point out that some defrag
programs optimize the file system organization to improve performance.
While their assertion has merit in terms of the number of meta-data I/O
requests potentially required to satisfy a read or write, the impact of
a file system reorganization in a SAN shared storage environment is again
difficult to measure: defragmenting could actually result in some
frequently accessed blocks being placed on slower sectors of the RAID group's
spindles, which could negate any anticipated gains from the File
System optimization. Remember that in a shared storage environment the
logical blocks are often scattered across the physical spindles in a RAID
group and not necessarily on the "fastest" sectors unless you have
gone to great lengths to isolate your most speed-critical volumes to known
fast areas (e.g., the "Hyper-0" sectors on a Symmetrix DMX). Unfortunately,
most shops don't have the time or resources to micro-manage or plan their
SAN storage at that level.
Defragmenting Not Recommended for Virtual Storage Pools
In Virtual Storage Pools, the physical storage and/or logical volumes
of RAID groups are completely abstracted from the virtual volumes
mapped to storage clients. For instance, consider a Windows server
in a SAN. In the Logical Disk Manager, a Virtual Volume may show
up as, say, Disk 2. Format that volume with an NTFS file system.
Write several gigabytes worth of data onto it. If Disk 2 is a
volume coming from a Virtual Storage Pool based on several
RAID-5 stripes, the Disk 2 data may in fact be spread across multiple
disks on multiple RAID groups. Storage Allocation Units (SAU's) are
assigned to specific groups of contiguous sectors for a volume
on an as-needed (i.e. as-written) basis, and may be distributed across
the storage pool members in a round-robin or striped fashion.
Virtual Storage Pools that employ striping often yield outstanding
performance results, similar to striping used in RAID-10, RAID-50, or
RAID-100 type configurations. However, defragmentation of a logical
volume becomes meaningless because there is no linear sector-to-sector,
block-to-block correlation between the logical and physical
disks. Defragmenting will typically only result in additional latency
as the "logically contiguous" blocks become even more scattered across
the pool's physical disks.
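To see why "contiguous" loses its meaning in a pool, consider a minimal sketch of round-robin SAU placement. The 128MB SAU size and the four-member pool are illustrative assumptions, not any particular vendor's layout:

```python
# Illustrative round-robin mapping of a virtual volume's logical offsets
# to Storage Allocation Units (SAUs) on pool members. Sizes are assumed.
SAU_MB = 128
POOL_MEMBERS = ["RAID5-A", "RAID5-B", "RAID5-C", "RAID5-D"]

def locate(logical_offset_mb: int) -> str:
    """Return the pool member holding the SAU for a logical offset."""
    sau_index = logical_offset_mb // SAU_MB
    return POOL_MEMBERS[sau_index % len(POOL_MEMBERS)]

# Logically adjacent regions land on entirely different RAID groups:
for offset in (0, 128, 256, 384):
    print(offset, "->", locate(offset))   # A, B, C, D
```

Making blocks "contiguous" in the volume's logical address space does nothing to bring them together physically; it merely shuffles which SAUs they occupy.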
Finally, defragmenting a Thin-Provisioned or Sparse Volume
has the undesirable effect of causing unnecessary allocation
of storage from the pool.
Defragmenting Not Recommended for Auto-Tiered Storage
Several storage vendors are now introducing Auto-Tiering into
their products. Compellent pioneered this feature in 2006 well before
their acquisition by Dell. Auto-Tiering exists in a few flavors
depending on the vendor's implementation (some are real-time, others scheduled;
some may be file based while others are sub-LUN block or "chunk" based),
but in general the idea is for the storage controllers to monitor storage
access and automatically promote the most frequently accessed blocks to
higher tiers, while demoting infrequently or rarely accessed blocks to
lower tiers.
DataCore introduced Auto-Tiering
into their SANsymphony-V
Storage Hypervisor in Release 8.1.
The implementation is "sub-LUN", based on the pool's settable
Storage Allocation Unit size, which by default is set to 128MB.
The feature uses a periodic ROI (Return On Investment) calculation to
determine, based on the virtual disk's profile, whether to promote or
demote an SAU, thus eliminating "thrashing" and assuring that the
tier assets are best utilized.
In a defragmentation operation with ROI-based Auto-Tiering, the single
read of one block and its subsequent single write to another different
block would not necessarily trigger the promotion or demotion of said
read block. However, based on the owning virtual disk's profile and the
occupancy of the tiers, the block written could indeed be placed on a
lower or higher tier. If the file system on the virtual disk were heavily
fragmented, this could have the undesirable effect of causing
much of the virtual disk's blocks to change tiers.
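As a toy illustration only (the counters and thresholds below are invented for the sketch; this is not DataCore's actual ROI calculation), a promote/demote pass might look like the following. Note how a defragmenter's single read and write of each block stays far below any promotion threshold, while the rewritten blocks still land wherever the destination tier has room:

```python
# Toy sub-LUN auto-tiering decision. Thresholds and scoring are invented
# for illustration; real ROI-based implementations are more sophisticated.
from dataclasses import dataclass

@dataclass
class SAU:
    tier: int       # 0 = fastest tier
    accesses: int   # accesses observed this evaluation period

def retier(sau: SAU, promote_at: int = 100, demote_at: int = 5) -> None:
    """Promote hot SAUs, demote cold ones, then reset the counter."""
    if sau.accesses >= promote_at and sau.tier > 0:
        sau.tier -= 1
    elif sau.accesses <= demote_at:
        sau.tier += 1
    sau.accesses = 0

sau = SAU(tier=0, accesses=2)   # a defrag pass touched it only twice
retier(sau)
print(sau.tier)                 # 1: quiet blocks drift down-tier
```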
Finally, if you are using a Storage Hypervisor such as SANsymphony-V
to virtualize storage, then you are likely aware that you can use cloud
storage products, such as TwinStrata
and Amazon S3 for your lowest or
"Archive" tier. Defragmenting an Auto-Tiered LUN whose infrequently accessed blocks
have been relegated to the cloud would again have the undesirable effect
of potentially yanking those blocks back to a local storage tier.
Defragmenting Not Recommended for Solid State Drives
I think you can figure this one out without much explanation.
Briefly: no moving parts, thus no seek or rotational latency, thus
defragmentation has no real benefit. Worse still, you've no doubt
heard that repeated program-erase cycles will eventually
burn out the transistors.
Defragmenting Not Recommended for Replicated Volumes
If your volumes are the source of a Synchronous Data Mirror,
or worse still, Asynchronous Data Replication, then you should
definitely avoid defragmentation. Each write associated with the
defrag process will result in corresponding mirror writes to the
destination or replication site. This can easily consume the bandwidth
of the interconnect. In the case of Asynch Replication over a slower
link (T1, etc.), it can break the model, potentially causing a near or
full resynchronization of the entire volume across the slow link
(cf. "Internet Tubes").
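A quick back-of-the-envelope calculation shows how easily a defrag pass can saturate a slow link. The volume size and the fraction of blocks the defragmenter rewrites are assumptions for illustration:

```python
# Time for defrag churn to cross an async replication link.
# Volume size and churn fraction are illustrative assumptions.
T1_BPS = 1.544e6   # T1 line rate in bits per second

def replication_hours(volume_gb: float, churn_fraction: float) -> float:
    """Hours needed to ship the defragmenter's rewrites over the link."""
    bits_moved = volume_gb * churn_fraction * 8e9
    return bits_moved / T1_BPS / 3600

print(round(replication_hours(100, 0.30), 1))   # ~43.2 hours over a T1
```

Shuffling just 30% of a 100GB volume generates nearly two days of replication traffic on a T1; if the buffer protecting the asynchronous model overflows first, you are looking at a full resynchronization.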
Defragmenting Not Recommended for Snapshot Source Volumes
Defragmenting a volume which has active snapshots associated with it will
have the adverse effect of causing the snapshot manager to save
the original blocks as the blocks are reorganized on the source
volume. In the common "Copy On Write" implementations, this can
create an additional performance lag.
In those implementations that simply write the new blocks
into "reserve space" and re-direct pointers (such as in NetApp's
Snapshot implementation), the defragmentation process can easily
eat up all the reserve space in the volume.
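A minimal sketch of copy-on-write mechanics makes the cost visible. The block map and snapshot store here are simplified stand-ins, not any particular vendor's implementation:

```python
# Simplified copy-on-write snapshot: the first write to each block after
# the snapshot forces an extra read and write to preserve the original.
class CowVolume:
    def __init__(self, num_blocks: int):
        self.blocks = {n: b"orig" for n in range(num_blocks)}
        self.snapshot_store: dict[int, bytes] = {}  # preserved originals
        self.extra_ios = 0

    def write(self, block_no: int, data: bytes) -> None:
        if block_no not in self.snapshot_store:
            # Copy-on-write: read and save the original block first.
            self.snapshot_store[block_no] = self.blocks[block_no]
            self.extra_ios += 2   # one extra read + one extra write
        self.blocks[block_no] = data

vol = CowVolume(1000)
for n in range(1000):          # a defrag pass rewrites nearly every block
    vol.write(n, b"moved")
print(vol.extra_ios)           # 2000 extra I/Os...
print(len(vol.snapshot_store)) # ...and 1000 blocks of reserve space consumed
```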
Defragmenting Not Recommended for CDP Volumes
If you've implemented a
CDP (Continuous Data Protection)
system, you'll want to avoid defragmenting those volumes, as the procedure
will quickly eat up your CDP
journal space. Hopefully, you will have
chosen to implement CDP
at the storage controller level and not as
an agent on your server. However, if you are using a host-based CDP
agent that runs at the volume level (but not necessarily the file
level), you will further impair performance on the server as each
defrag write will correspond to two physical writes: one to the disk,
and another agent write to the CDP server.
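To illustrate the write amplification (the moved-data figure is an assumption for illustration), every block the defragmenter moves is both written to disk and journaled by the CDP agent:

```python
# Write amplification from host-based, volume-level CDP during a defrag.
def total_writes_gb(moved_gb: float) -> float:
    """Each defrag write lands twice: on disk and in the CDP journal."""
    return moved_gb * 2

print(total_writes_gb(30.0))   # a 30 GB shuffle produces 60 GB of writes
```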
Alternatives For Improving Performance
If poor SAN I/O performance is causing you to study articles on
disk defragmentation, you might want to consider looking for
other ways to reduce I/O latency.
Use Intelligent Caching Storage Processors
Caching is a common and effective means of improving performance for
just about any application. Caching is used ubiquitously from the
CPU (with 3 levels of cache), to the RAM in your application servers,
to the expensive cache on your storage processors. The idea is always
to avoid accessing data over slow media, whether that be the slow
spindles, the 1, 2, or 4Gb fibres, the slow PCI, PCI-X or PCI Express
busses, the slow 667MHz memory busses or the slow L3 and L2 CPU cache.
Slow is relative, with L2 being the fastest and the spindles being the slowest.
Most spindles give I/O response times in thousands of microseconds.
The typical storage processor gives I/O response times in hundreds of
microseconds on a cache hit (e.g. where the data you are looking for
is already in the storage processor's memory).
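The leverage cache provides is captured by the standard average-access-time formula. The latencies below are assumptions consistent with the figures just quoted, and the hit ratios are illustrative:

```python
# Effective I/O latency as a function of cache hit ratio.
# Latencies are assumptions consistent with the figures quoted above.
SPINDLE_US = 5000.0   # cache miss served from disk, in microseconds
CACHE_US = 200.0      # storage processor cache hit, in microseconds

def effective_latency_us(hit_ratio: float) -> float:
    """Weighted average of hit and miss latencies."""
    return hit_ratio * CACHE_US + (1.0 - hit_ratio) * SPINDLE_US

for h in (0.50, 0.90, 0.99):
    print(f"{h:.0%} hit ratio -> {effective_latency_us(h):.0f} us")
# 50% -> 2600 us, 90% -> 680 us, 99% -> 248 us
```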
One Storage Virtualization
vendor, DataCore Software Corporation,
measures its I/O response times at 10 microseconds or less. This is a
result of their sophisticated caching engine combined with a real-time
I/O subsystem. The fact that the software runs on common x86 servers
is also a contributing factor, given time-to-market considerations, not
to mention that you have a choice in the hardware on which you build
the Storage Processor.
Placing their SANsymphony-V
Storage Hypervisor in front of your existing SAN Storage Arrays will have the effect of
turbo-charging performance for a variety of reasons:
- Additional layer of high performance adaptive cache (e.g. server RAM) that
will change the character of both front-end and back-end I/Os on your existing
SAN storage array, reducing contention.
- Real-Time I/O subsystem uses I/O polling instead of interrupt handlers to complete I/Os
- Takes advantage of new technologies your existing SAN may not have: the latest PCI-Express bus and memory
speeds, 8Gbps (and soon 16Gbps) Fibre Channel, etc. For example,
SANsymphony-V can talk 8Gb FC between your
application servers and its cache, and on the backend talk 4Gb FC between its cache and
your older SAN storage arrays.
- Allows channel fan-out, effectively creating a Network Storage Processor that
has more FC and/or iSCSI front-end ports than your existing SAN storage array.
- Can be used to implement higher performance nested RAID levels on top of your
existing SAN storage array or arrays: RAID 10, RAID 100, RAID 10+1, RAID 0+1, RAID 50+1, etc.
- Allows Auto-Tiering of storage to assure the most critical and frequently accessed
volume blocks are on the fastest disks.
- Allows easy federation of different vendors' disk technologies, such as SSD drives from
Texas Memory alongside lower tier storage
from EMC, HP, Dell, XIO,
Nexsan, NetApp, etc.
Use Tiered Virtual Storage Pools
Tiering your storage through virtualization can allow you to move less critical
volumes off onto lower cost, lower performance storage, freeing up your higher
performance, higher cost resources to focus on your most critical applications.
For instance, rather than implementing snapshots on your Tier-1 storage hardware, using
a Storage Virtualization platform such as SANsymphony-V
can allow you to take snapshots of your Tier-1 volumes on another
storage platform. Thus you recover expensive Tier-1 storage space otherwise
reserved for snapshots, and the Tier-1 array is no longer involved in the
snapshot processing.
Distribute the Work Load Across Storage Processors and Channels
Consult your SAN Storage Arrays' utilities to determine if you have
adequately distributed (or "Load Balanced") the I/O charge across
your HBAs, fibres, switches, and storage processors.
Choose Optimized RAID levels for performance-critical volumes
Depending on the characteristics of your I/Os, optimizing RAID levels
and / or re-organizing the physical layout of your LUNs can improve
performance. If, for example, your database tables and logs are
located on the same RAID 5 spindles and you are having performance
issues, you should consider moving the logs onto another set of
spindles, perhaps in a RAID 1 or RAID 10 configuration. Keeping
small, transaction-oriented, write-intensive volumes off RAID-5 will definitely help.
While attempting to avoid contention for the spindles, you should
also consult your SAN Storage Arrays' documentation on how the spindles
are physically connected to the storage processor. Keep in mind that
in a dual-storage processor array, the disks are dual-ported. Laying
out the spindles of different RAID groups across different back-end
channels and keeping performance-critical volumes on separate channels
and spindles can reduce contention on the back end resources.
Reduce Filesystem Fragmentation at the Source
Some of the more modern filesystems will attempt to reduce fragmentation
by scattering their writes wide over the logical surface of today's large
volumes. The idea is to leave plenty of white space between
files so that as they grow over time there is no need to fragment them,
thus no reason to defragment them either. The term "plenty" is obviously
a relative term, depending on the size of the filesystem and how
populated it is.
Condusiv Technologies, the editor of Diskeeper®,
claims that their patented IntelliWrite technology proactively reduces
fragmentation of NTFS filesystems. This is a far superior approach,
eliminating fragmentation at the source, and thus eliminating the need
to defragment at all.
I asked Gary Quan, Senior Vice President of Product Strategy at
Condusiv, for advice on how best to use Diskeeper®
in a SAN environment. "IntelliWrite will help prevent up to 85% of the
fragmentation, and thus AutoDefrag will not have to do as much." Nonetheless,
for SAN devices he advised "our best practices are to just enable IntelliWrite
and disable AutoDefrag. You may still need to run a traditional Defrag
at the outset, especially for the new users who have never defragmented
their filesystems." He mentioned one of Condusiv's larger Diskeeper
customers that had over 40 million fragments on one volume as an example.
Gary adds, "By the way, was just at NetApp
Testing facilities this last week to test out IntelliWrite and it worked as
expected, preventing fragmentation and thus not causing any provisioning,
copy-on-write, or dedup activity." That's good news for
NetApp users, as the
WAFL filesystem is notorious for fragmenting badly over time, particularly
when making heavy use of snapshots which use a "Redirect On Write" technology,
or their SIS (Single Instance Storage) feature that reduces duplication
but effectively creates additional fragmentation at the block level.
If you would like to learn more, see the brief white paper from Condusiv
explaining the IntelliWrite technology and the Diskeeper product in light
of SAN storage volumes.
Test Drive DataCore's SANsymphony-V Storage Hypervisor
SANsymphony-V implements a Storage Hypervisor
at the heart of the SAN, creating a new layer of cache, offering Sub-LUN Auto-Tiering
and abstracting the storage from the servers. You can
download a full-featured, free 30-day trial.
The author of this white paper is an employee of
DataCore Software Corporation,
editor of the SANsymphony-V
software mentioned above.
The author of this white paper is in no way affiliated with
Condusiv Technologies or the Diskeeper®
product, and has no financial interest whatsoever in either.
The opinions expressed above are those of the author and do not necessarily
represent those of DataCore, Condusiv, or any of the other vendors mentioned herein.