
A Replication Solution Based on DataCore Products

A White Paper by Tim Warden

[Return to Page 1 - Overview of Async Replication]

AIM : Asynchronous IP Mirroring

AIM is DataCore's async replication package; the acronym stands for Asynchronous IP Mirroring. The package is available for both our SANsymphony and SANmelody SAN Storage Virtualization products, and as the name implies, AIM uses standard IP to replicate volumes... no expensive protocol converters or dedicated links required.

AIM Prerequisites

Unlike typical hardware- or agent-based replication methods, AIM is fairly straightforward and doesn't have a long list of prerequisites for the implementation.

Obviously, all the topics we've covered thus far concerning inter-site bandwidth and quantity of data to replicate still apply with AIM. However, aside from implementing a DataCore SANmelody SAN storage array or a DataCore SANsymphony Storage Virtualization project, there's not much more to do.

You will need to have the AIM option enabled on the replication controllers (it's just a license key). You'll also need an IP connection between the two sites that can be used for replicating your selected SAN volumes. Finally, you will need to allocate some disk space on the source and destination controllers to buffer the asynchronous replication stream.

Replication Stream Buffers

AIM uses standard NTFS space to buffer replication at the source and destination. For each replication stream (i.e., each volume you are replicating), a unique folder is created to contain a set of replication files. These files contain the block-level writes to be replicated, as well as metadata such as timestamp and sequence information.

At the source site, the buffer is necessary to accumulate changes until the replication engine is able to transfer them to the destination site. The volume holding the source buffers should be sized to accommodate link outages or slow links. The AIM documentation explains how to calculate the space, and any of DataCore's SEs (such as myself) can help you determine how large the buffer should be.
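As a rough, purely hypothetical sizing exercise (the AIM documentation remains the authoritative reference): suppose your hosts average 2MB/s of writes to the replicated volumes and you want the source buffer to ride out a 4-hour link outage. The buffer must then absorb roughly 2MB/s x 14,400s = about 28GB, plus headroom for metadata and write bursts, so a 40 to 50GB buffer partition would be a comfortable starting point.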

AIM also uses buffer space at the destination site to hold arriving replication files. This is necessary to make sure changes are destaged in order even if files arrive out of order. The buffer is also useful should the destination volume go offline; the replicated changes will be held at the destination until you have brought the destination volume back online.

I would recommend placing these buffers on a separate NTFS partition, for example a partition on the relatively static boot/system disks. You've probably created those disks as a RAID-1 set anyway, and your C: drive doesn't need to be that big... use the extra space to hold your AIM buffers. By isolating the buffers on a separate partition, there won't be any contention with other system or application files if the AIM buffers fill completely.

You may also want to ensure that the AIM buffers do not physically share spindles or controllers with your SAN volumes, so that the AIM reads and writes do not contend with your backend production traffic.

Preparing The Source Site

AIM can be configured to use one of two transport mechanisms: CIFS or FTP. CIFS is, of course, Microsoft's implementation of SMB, or common Windows File Sharing; FTP is the ubiquitous File Transfer Protocol.

[Screenshot: Configuring The Async Connection]

AIM uses the Windows API to access FTP commands. As such, any compliant FTP package for Windows will work, including, of course, the FTP engine provided in Microsoft's IIS. Other FTP packages known to work include Bullet Proof FTP and Vermilion FTP; there are many, many others out there that would probably work fine as well.

By default, AIM will use port 21 for FTP. Obviously, you can change this, as you will note in the dialog above.
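Before pointing AIM at the destination, it is worth a quick sanity check that the FTP service is reachable from the source storage server across the WAN. The host name below is a placeholder; substitute your own destination server and, if you've changed it, your own port:

    C:\> telnet dr-datacore.example.local 21
    C:\> ftp dr-datacore.example.local

If the telnet connection opens and the ftp login succeeds with the account you intend to use for replication, the transport portion of the configuration is in good shape.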

Establishing Replication Relationships

[Screenshot: Creating An Async Replication Set]

Initializing The Volumes

When you issue an "Initialize" command for an AIM source volume, AIM begins a full replication of the source volume, copying all blocks into files in the source buffer.

Initializing With Limited Bandwidth

If we are using a T1 for our replication pipe and have several large volumes, the initialization step required to bring the source and destination into sync may take too long. We will need to use an alternate means to perform the initialization.
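To put numbers on it: a T1 carries roughly 1.5Mbit/s, or about 16GB per day with the link completely saturated. A single 500GB volume would therefore need on the order of a month to initialize over the wire, and that is before any production write traffic competes for the same pipe.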

While there are a few ways to accomplish this, we will focus on one that uses the replication buffering mechanism inherent to AIM.

[Describe using a USB2 drive, etc., to ship the initialization data to the remote site.]

Establishing Replication Schedules And Throttling

[Screenshot: Scheduling Replication Throttles]

AIM allows you to create a Transfer Schedule for each replication destination node. The transfer schedule works by injecting delays between replication file transfers. In the screenshot, I've put together an elaborate schedule that caps replication throughput during the working hours of the day, leaving it unthrottled from 8pm to 7am. The schedule is based on hours and minutes, so you have adequate granularity to control bandwidth utilization.

As the throttle works by injecting delays between the file transfers, the replication will be somewhat "bursty". You can control the length of the bursts by increasing or decreasing replication file sizes; you can control the frequency of the bursts by increasing or decreasing the delay.
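As a hypothetical illustration of the arithmetic: if the replication files are 4MB each and the schedule inserts a 10-second delay between transfers, the sustained rate during throttled hours works out to roughly 4MB every 10 seconds, or about 0.4MB/s (3.2Mbit/s), delivered as short bursts at full link speed rather than as a steady trickle.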

Using Snapshots With AIM

Using Custom Markers With AIM

[Screenshot: Automating DR Site Tasks With Custom Markers]

In addition to Snapshot Markers, AIM includes a Custom Marker facility that can be used to trigger housekeeping scripts or programs at the DR site. Custom Markers can be as simple as a Windows command file or VB script that takes a snapshot and triggers a tape backup.

You can have multiple custom markers for different actions, such as taking a nightly incremental backup, taking a weekly full backup, or sending a weekly storage utilization report via email.
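As a minimal sketch of what such a marker script might look like at the DR site: the snapshot command below is a hypothetical placeholder for DataCore's actual CLI (consult the AIM documentation for the real command name and syntax), while ntbackup is the standard Windows backup utility; the drive letter, media pool and job name are likewise made up for the example.

    @echo off
    rem Runs at the DR site when the custom marker arrives in the replication stream.
    rem Refresh a snapshot of the replicated destination volume (hypothetical CLI name).
    "C:\Program Files\DataCore\SnapshotCLI.exe" /update Test-DR-SS
    rem Back the snapshot up to tape with the built-in Windows backup utility.
    ntbackup backup S:\ /j "Nightly DR Backup" /p "DLT"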

Using AIM Groups to Promote Content Consistency Across Volumes

[Screenshot: Volume Consistency Groups]

AIM has a facility to create volume consistency groups for inter-dependent volumes. With AIM Groups, a Group Snapshot or Group Custom Marker will execute against all the volumes in the group at the same point in time to assure consistency is maintained.

This is particularly useful, for example, when replicating an Oracle database whose tables span multiple volumes. With an AIM Group snapshot, you would have what some vendors call "Storage-based Consistency" (as opposed to Application-based Consistency, achieved by stopping or quiescing the application, flushing file manager and I/O subsystem caches, etc.). The set of volumes would be in a state consistent with having experienced a system crash or power failure.

From a purely logical point of view, so-called Storage-based Consistency does not guarantee that the set of volumes will indeed be in an inter-dependent content consistent state, because the Storage Processor has no control over the caching performed by the host OS that the application (in this example, an Oracle database) is running on.

[Screenshot: Example of Snapshots on Inter-Dependent Oracle Volumes]

Nonetheless, placing inter-dependent volumes in a Consistency Group such as that offered by AIM improves the coherency of snapshots across the set and also facilitates scripting, as you'll only be issuing a command once across the group instead of multiple times.

Finally, AIM Group commands can be pushed from the host owning the volumes; the host can flush application server cache before taking the snapshot, further improving content consistency across the volumes in the set.

Replicating Source Snapshots To Implement De-Duplication

AIM can replicate any SANmelody or SANsymphony virtual volume, including snapshots. We can use this functionality to de-duplicate the replicated data with respect to our RPO.

Consider a volume "Test" which we have mapped to an application server, perhaps a SQL or Exchange server. We use the Snapshot Manager in SANmelody or SANsymphony to create a snapshot volume called "Test-SS". We make the snapshot a "Complete Image" or "Clone" (or BCV, if you prefer) of our "Test" volume.

We create a snapshot script on our SQL or Exchange server to refresh or increment our snapshot every 15 minutes. Thus every 15 minutes, our Test-SS clone volume will be updated with the most recent changes to the Test volume.

[Screenshot: A Complete Image Snapshot]

Now we make our Test-SS volume the source of an AIM replication to the DR site. Every 15 minutes, as the snapshot increments, those latest changed blocks will be copied to the snapshot's virtual volume and so, as writes, they will be replicated to the DR site. We have, in effect, de-duplicated any multiple writes that may have occurred on any given block during the past 15 minutes.
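To put it in concrete terms: if a busy database page is rewritten 30 times during a 15-minute window, only the final image of that block crosses the WAN when the snapshot increments, a 30:1 reduction in replication traffic for that particular block.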

[Screenshot: A Complete Image Snapshot... Replicated]

How useful is de-duplication in async replication? That all depends on the application and the server's file system that owns the volume you are replicating. If, when re-writing a file, the file system re-writes the same blocks, then de-duplication can pay off handsomely. This behavior would typically be the case with large files such as database tables where specific records occupying specific blocks on the volume may be written multiple times.

However, in the case of office productivity applications like word processors, it is not a given that de-duplication will have much value. As a software engineer working in a text editor all day, I developed the habit of frequently hitting "Control-S" (in fact, it was an "Apple-S"...) to make sure I didn't lose my work. So how does a given word processor work? Each time I hit "Save", is it writing a new copy of the file before deleting the previous saved copy, with the file system therefore writing the new file into different blocks? If so, none of the industry's block-level de-duplication technologies is going to help. And since the application is also doing file renames during the saves, chances are most of the file-level replication packages won't be able to de-duplicate those files, either. C'est la vie.

Pushing Snapshots From The Hosts

[Screenshot: Take A Snapshot Every 15 Minutes]

In addition to the automated mechanisms available in the AIM GUI, AIM includes a command line interface that can be driven by script from the hosts or from the DataCore storage controllers.

A Windows script to invoke a snapshot from the hosts can be all of 2 or 20 lines, depending on how elaborate you want to get. In the example above where we periodically replicate source snapshot updates, the script I used was all of 5 lines, including the "@echo off".
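For illustration only, a sketch of such a script might look like the following. The snapshot executable and its arguments are hypothetical placeholders; the actual command name and syntax are documented with the SANmelody/SANsymphony CLI.

    @echo off
    rem Refresh the "Test-SS" Complete Image snapshot from the "Test" volume.
    rem (Hypothetical CLI name; substitute the actual DataCore snapshot command.)
    "C:\Program Files\DataCore\SnapshotCLI.exe" /update Test-SS
    if errorlevel 1 eventcreate /t ERROR /id 900 /l APPLICATION /so AIMSnapshot /d "Snapshot refresh failed"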

The script was installed as a Windows Scheduled Task, set to run every 15 minutes. Similar scripts can easily be run as cron jobs in Unix implementations.
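For example, registering the script with the Windows task scheduler from the command line might look like the following (the task name and script path are, again, just placeholders):

    C:\> schtasks /create /tn "AIM Snapshot Refresh" /tr "C:\Scripts\refresh-snapshot.cmd" /sc minute /mo 15 /ru SYSTEM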

Using VSS with SQL2005 to Automate Snapshots

To be completed...

Replicating Non-SAN-Attached Windows Servers and Laptops with AIM Client

The AIM Client package from DataCore allows you to replicate individual Windows servers, desktops or laptops back to a SANmelody or SANsymphony storage server. The product is ideal for remote or satellite offices where the cost of a SAN storage array is unjustified. With AIM Client you can integrate smaller satellite offices into an overall DR plan without great expense.

[Screenshot: The AIM Client GUI]

Examples of implementation forthcoming... it all takes time, folks... If you would like to see this now, give me a call and I'll set up a webinar and demonstrate the product. AIM Client is an elegant solution.

The Value of Thin Provisioning in DR Implementations

Whether or not you currently use Thin Provisioning in your environment, deploying it in the context of a DR or Async Replication project will save you a LOT in both time and money. As you will have noted in these discussions, async replication means double the storage. Some of the techniques we have discussed, such as using snapshot clones or BCV volumes to implement de-duplication, further gobble up disk space.

With Thin Provisioning, you can effectively reduce disk consumption while still taking "Complete Images" or BCVs of your volumes. Imagine you have a 300GB VMFS volume based on Thin Provisioned storage, and suppose it currently has 125GB of data on it, and thus 125GB of allocated storage. Suppose you wish to use our aforementioned de-duplication technique on this volume. You create a "Complete Image" snapshot of the volume onto Thin Provisioned storage: the BCV or clone will weigh in at 125GB, but can grow to 300GB as necessary. You use AIM to replicate the snapshot... you replicate 125GB, not 300GB.

Your actively replicated AIM destination (or DR) volume, also based on pooled, Thin Provisioned storage, weighs only 125GB, but can grow to 300GB. You take a snapshot of the replicated volume so you can test your DR or run a backup. The Thin Provisioned snapshot begins its life as an empty disk geometry, weighing all of one "Storage Allocation Unit", perhaps 8 or 16MB. As replication activity continues and new blocks arrive and are destaged into the DR volume, the snapshot volume will slowly grow.

Thin Provisioning's Storage Pooling aspect means you won't be spending hours studying how you're going to allocate partitions for the replicated volumes on the DR site's storage arrays. DataCore's Thin Provisioning manager will handle all that for you.

For a complete discussion of Storage Pooling, Thin Provisioning and Over Subscription, please read my white paper.