Last Access And Shared Storage - A Simple But Effective Performance Tweak
A White Paper by Tim Warden, June 1, 2008
NTFS AND THE LAST ACCESS TIMESTAMP
Some time ago, I discovered a minor annoyance with the NTFS file system on
Windows machines. My fileserver uses SANmelody multi-path mirrored virtual
volumes for the fileshares. I use SANmelody Thin Provisioned Snapshots to
give me a rollback point in case I accidentally delete something. I noticed
that over time the thin-provisioned snapshot volume was growing, even though
I wasn't modifying anything on the source volume. With a bit of investigation,
I noticed that whenever I copied files from the fileshare, or even just
browsed its folders, thin-provisioned storage was allocated on
the corresponding snapshot volume. This indicated that even though I was
only reading the volume, something was writing to the volume, my snapshot
source.
I asked one of the DataCore software engineers if he knew why I was getting
these unexpected writes. He advised me to try the Sysinternals Filemon
and Diskmon utilities to see what was up. As it turns out, every time I
accessed a file, a "SET INFORMATION" was showing up in Filemon, which
corresponded to a WRITE in Diskmon. It occurred to me that the write was
most likely caused by updates to the "Last Accessed" timestamp in the NTFS
file system.
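The same effect can be checked without Filemon by comparing a file's access
timestamp before and after a plain read. The following is a minimal Python
sketch, not the Filemon/Diskmon procedure described above. The helper name
read_updates_atime is hypothetical, and the result depends on the operating
system, the file system, and its settings (some systems defer or suppress
access-time updates), so treat the output as an indication only.

```python
import os
import tempfile


def read_updates_atime(path):
    """Read a file and report whether its access timestamp changed.

    Returns (before_ns, after_ns, changed). Hypothetical helper for
    illustration; os.stat() itself does not update the access time.
    """
    before_ns = os.stat(path).st_atime_ns
    with open(path, "rb") as handle:
        handle.read()  # a pure read; this script writes nothing to the file
    after_ns = os.stat(path).st_atime_ns
    return before_ns, after_ns, after_ns != before_ns


if __name__ == "__main__":
    # Demonstrate on a throwaway file so no real data is touched.
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"sample data")
    try:
        before, after, changed = read_updates_atime(tmp.name)
        print("access time changed by read:", changed)
    finally:
        os.remove(tmp.name)
```

If the timestamp changes after a read-only open, the file system had to
record that change somewhere, which is the metadata write that showed up
in Diskmon.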
I don't know about you, but I've never found any good use for that
"Last Accessed" timestamp. It seems like it would be good for security
monitoring or forensics, but anytime I've opened the "Properties" dialog
on a file, the "GET INFORMATION" call accesses the file, and so the file's
"Last Accessed" field immediately gets updated showing the current date
and time, which seems to defeat the purpose. Oh, I suppose I could call
"DIR /TA" from the command line (which doesn't "touch" the files, although it
does touch the directory), but it isn't imperative that I know when
files were last touched as long as I can at least know when they were
last modified.
More importantly, those Last Access updates increase "write" traffic to
the disk, which ultimately impedes performance, particularly when RAID5 is
in use. When taken in the context of a SAN environment with many Windows
servers, these "Last Access" updates can generate a lot of superfluous
write traffic, which can have some less than obvious consequences.
LAST ACCESS AND SNAPSHOTS
I've already pointed out one such consequence as it relates to thin-provisioned
snapshots: the "Copy On Write" mechanism employed by most SAN vendors'
snapshot implementations must copy "chunks" from the source
to the snapshot area each time a file is first accessed or "touched",
needlessly consuming additional storage from the Thin Provisioned Storage Pool.
Those Last Access writes also slow snapshot performance, because each
"copy on write" operation results in a read and two writes across the
snapshot relationship; again, using RAID5 will only exacerbate the
snapshot write penalty.
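To put rough numbers on that penalty, here is a small Python sketch that
counts back-end disk I/Os for a single Last Access update. It is only an
illustration under stated assumptions: the function name backend_ios is
hypothetical, every write is treated as a small RAID5 read-modify-write
(four disk I/Os), both source and snapshot are assumed to sit on the same
RAID level, and caching or write coalescing is ignored. Real arrays will
behave differently.

```python
def backend_ios(first_touch_on_snapshot, raid5):
    """Count back-end disk I/Os caused by one Last Access update.

    Assumptions (illustrative only): no cache, no coalescing, and
    source and snapshot storage share the same RAID level.
    """
    # A small RAID5 write is a read-modify-write: read old data, read old
    # parity, write new data, write new parity. Otherwise a write is one I/O.
    ios_per_write = 4 if raid5 else 1

    if first_touch_on_snapshot:
        # Copy on write: one read of the original chunk, then two writes
        # (the preserved chunk to the snapshot, the update to the source).
        reads, writes = 1, 2
    else:
        # No snapshot involvement: just the timestamp update itself.
        reads, writes = 0, 1

    return reads + writes * ios_per_write


if __name__ == "__main__":
    for cow in (False, True):
        for raid5 in (False, True):
            print(f"copy-on-write={cow!s:5} raid5={raid5!s:5} "
                  f"-> {backend_ios(cow, raid5)} disk I/Os")
```

Under these assumptions a single timestamp update costs 1 disk I/O on a
plain volume, 3 with a first-touch copy on write, 4 on RAID5 alone, and 9
when both apply.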
LAST ACCESS AND VIRTUAL MACHINES
In Virtual Server environments such as Citrix XenServer, Virtual Iron, or
VMware ESX, the Last Access timestamp updates on NTFS virtual machine disks
(e.g. VHD or VMDK files) can degrade performance beyond the simple
expected write penalty on the datastore. SCSI Reservations are used in
implementing XenMotion, LiveMigrate, VMotion, or any clustered filesystem, to
decide which physical host gets to write to the shared LUN at any given instant.
These additional Last Access writes will simply increase the likelihood of
contention on the shared resource; the potential increase in SCSI Reservation
conflicts would result in decreased performance.
As those writes clearly turn into changes on the associated VM disk
file, they will accordingly increase the workload and backup file sizes
of your VM backup system (e.g. vRanger Pro). Worse still, if you are using
or abusing a filesystem snapshot mechanism (like VMware snapshots), the
Last Access writes will cause the snapshot file to grow needlessly.
LAST ACCESS AND VOLUME REPLICATION
Finally, the writes associated with Last Access timestamp updates can have
a negative impact on bandwidth utilization and overall performance in
synchronous and asynchronous mirroring.
In Synchronous Mirroring (used in Business Continuity), the Last Access
writes will be faithfully mirrored between the primary and secondary volumes,
consuming a small increment of bandwidth and adding latency, particularly
if write cache on the mirror is in a "pending" state, waiting on a cache flush
to complete.
As for Asynchronous Replication, many shops have bandwidth limited to T1
speeds; Last Access writes occurring on replicated volumes simply add
superfluous traffic over the inter-site link.
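A back-of-the-envelope calculation shows how quickly this adds up on a slow
link. The Python sketch below estimates the share of a T1 line consumed by
replicated Last Access writes. The function name link_fraction, the update
rate, and the bytes replicated per update are all hypothetical values chosen
for illustration; the only fixed figure is the nominal T1 line rate of
1.544 Mbit/s. Protocol overhead and any coalescing done by the replication
software are ignored.

```python
T1_BITS_PER_SECOND = 1_544_000  # nominal T1 line rate


def link_fraction(updates_per_second, bytes_per_update,
                  link_bps=T1_BITS_PER_SECOND):
    """Return the fraction of the link consumed by timestamp updates."""
    bits_per_second = updates_per_second * bytes_per_update * 8
    return bits_per_second / link_bps


if __name__ == "__main__":
    # Hypothetical workload: 20 file accesses per second, each causing
    # one 4 KB write to be replicated.
    fraction = link_fraction(updates_per_second=20, bytes_per_update=4096)
    print(f"share of T1 used by Last Access writes: {fraction:.1%}")
```

With those assumed figures, roughly 42% of the T1 would be carrying writes
that record nothing more than the fact that files were read.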
DISABLING LAST ACCESS UPDATES
As it turns out, there's a simple solution to the Last Access annoyance.
Recently I decided to perform an internet search (notice how I avoided
using the brand-name "Google" as a verb? :-) to see if there was a way to
disable "Last Access" timestamp updates in NTFS. And sure enough, Microsoft
has a Knowledge Base article on the subject:
http://msdn.microsoft.com/en-us/library/ms940846.aspx
Disabling Last Access timestamps is as simple as adding a registry value
in RegEdit, as documented in the following table.
|