Session 3042 - Snapshot Performance Revisited: The Picture Has Changed

SHARE 90
February 22-27, 1998


This was a tag-team matchup of two StorageTek performance people. Al Permut was billed as a performance specialist, and Minda Larson as an analyst. They took turns presenting a set of foils on a pair of overheads.

(I noticed that there were several such sessions at this SHARE. I'm not sure why these things require two people; maybe this is a travel boondoggle for the speakers.)

Snapshot is an extra-charge feature of STK's Iceberg (which IBM has licensed and renamed "RAMAC Virtual Array" or simply "RVA"). It is a highly original and clever hack. You can tell Snapshot to make a copy of individual datasets or entire disk volumes, and the copy is made within the subsystem. What is magic about this is that the copy is accomplished in seconds.

How does this work? The answer is that the subsystem doesn't really make a copy of the data; it just updates some tables internally so that you think there are two copies. Here's the clever part: after a Snapshot copy has taken place, if you write to the old copy of the dataset, Iceberg/RVA writes the new data to a new place, but keeps the old copy of the data.

It is possible to make Snapshot copies of all your production disk volumes, and then start up your online systems. While the online systems are running, you can take your standard backups of the volumes processed by Snapshot. And when your backups are done - simply invalidate the Snapshot copies, and the subsystem will delete any old data that was recently "rewritten".
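The table-only copy and the later invalidation can be sketched in a few lines of Python. This is a toy model for illustration only, not how the Iceberg firmware is actually built: the class name, the per-track table, and the reference counts are all my assumptions.

```python
class ToySubsystem:
    """Toy model of a virtual disk subsystem (all names here are hypothetical)."""

    def __init__(self):
        self.blocks = {}    # physical location -> data
        self.refs = {}      # physical location -> number of table entries using it
        self.volumes = {}   # volume name -> {track number: physical location}
        self._next_loc = 0

    def create(self, name):
        self.volumes[name] = {}

    def write(self, name, track, data):
        # New data always lands in a new physical location.
        loc = self._next_loc
        self._next_loc += 1
        self.blocks[loc] = data
        self.refs[loc] = 1
        old = self.volumes[name].get(track)
        self.volumes[name][track] = loc
        if old is not None:
            self._release(old)  # old data survives if a snapshot still points at it

    def read(self, name, track):
        return self.blocks[self.volumes[name][track]]

    def snapshot(self, source, target):
        # The "copy" is only a copy of the table; no data moves.
        if target in self.volumes:
            raise ValueError("target already exists in this toy model")
        table = dict(self.volumes[source])
        for loc in table.values():
            self.refs[loc] += 1
        self.volumes[target] = table

    def drop(self, name):
        # Invalidate a volume; data nobody else references is deleted.
        for loc in self.volumes.pop(name).values():
            self._release(loc)

    def _release(self, loc):
        self.refs[loc] -= 1
        if self.refs[loc] == 0:
            del self.refs[loc]
            del self.blocks[loc]

    def physical_tracks(self):
        return len(self.blocks)


box = ToySubsystem()
box.create("PROD01")
box.write("PROD01", 0, "payroll v1")
box.write("PROD01", 1, "ledger v1")
box.snapshot("PROD01", "BKUP01")          # instant: two volumes, two physical tracks
tracks_after_snap = box.physical_tracks()
box.write("PROD01", 0, "payroll v2")      # online system updates production
tracks_after_update = box.physical_tracks()
backup_view = box.read("BKUP01", 0)       # backup still sees the old data
box.drop("BKUP01")                        # backups done: invalidate the copy
tracks_after_drop = box.physical_tracks()
```

In this model the backup job reads a frozen view of the volume while production keeps writing, and dropping the copy is what lets the subsystem discard the superseded data.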

Snapshot has a firmware component, and some software that runs under MVS. As an optional feature of Snapshot, it will fall back to running a real-no-fooling copy utility (DFDSS) in the unlikely event that Snapshot fails. This is useful for sites that have multiple Icebergs, or for when you take your JCL to a disaster recovery site and it has no Iceberg.
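The fallback behavior amounts to a try-the-fast-path-first pattern. Here is a rough Python sketch of the idea; every name in it is hypothetical, and it does not represent the real Snapshot software or DFDSS interfaces, which the session did not describe at this level.

```python
class SnapshotUnavailable(Exception):
    """Raised when the fast table-only copy cannot be used (hypothetical)."""


def snapshot_copy(source, target):
    # Assumption for this sketch: a snapshot only works when both volumes
    # live in the same snapshot-capable subsystem.
    src_box, _ = source
    tgt_box, _ = target
    if src_box is None or src_box != tgt_box:
        raise SnapshotUnavailable(f"{source} and {target} do not share a subsystem")


def conventional_copy(source, target):
    # Stand-in for a real data-moving copy utility; does nothing here.
    pass


def copy_volume(source, target):
    """Try the snapshot first; fall back to a real copy if it is unavailable."""
    try:
        snapshot_copy(source, target)
        return "snapshot"
    except SnapshotUnavailable:
        conventional_copy(source, target)
        return "conventional"


# Volumes are modeled as (subsystem, volser) pairs; None means "no Iceberg".
same_box = copy_volume(("ICE1", "PROD01"), ("ICE1", "TEST01"))
cross_box = copy_volume(("ICE1", "PROD01"), ("ICE2", "TEST01"))
dr_site = copy_volume((None, "PROD01"), (None, "TEST01"))
```

The point is that the same job works unchanged in all three situations; only the elapsed time differs.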

An interminable period was spent reviewing actual performance numbers for different backup scenarios. I snooze. Some of the performance numbers involve a bogus marketing metric that they call "virtual transfer rate" - the number of gigabytes per minute that Iceberg is pretending to copy. What meaningless drivel.

One of the performance tests they ran was to do a Snapshot of a volume, then read both the source and destination volumes. The access times were of course equal. Duh.

The more interesting performance tests were run between so-called "Turbo" boxes and "non-Turbo". The Turbo upgrade for Iceberg appears to be an improvement to the memory subsystem. The tables inside the Iceberg that keep track of the virtual volumes can be manipulated faster on a Turbo-equipped box. Their measurements suggested that a non-Turbo RVA begins to suffer performance degradation at about 1,000 I/O operations per second. The Turbo box will operate at approximately twice this rate before noticeably degrading.

Here's a cute trick: running a Snapshot copy can actually reduce the amount of data you have in your subsystem. Suppose you had run Snapshot copy against, say, some production volumes to create test volumes. The production and test volumes occupy exactly the same disk locations as the day starts, but during a day of updates the volumes diverge and take up more of your DASD subsystem. In the worst case, if you had rewritten all of your production data, then there would actually be two complete sets of data out on disk.

But the next Snapshot copy would delete the old test data, and you would be back to a single physical copy again. That's sort of counterintuitive at first, but it begins to make sense after a while.
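The arithmetic of that worst case is easy to check with a small Python sketch. It assumes, purely for illustration, that physical space is whatever distinct locations are still referenced by some volume table; the 1,000-track volume size and the location labels are made up.

```python
def physical_tracks(*tables):
    """Count distinct physical locations referenced by any volume table."""
    return len({loc for table in tables for loc in table.values()})


TRACKS = 1000  # made-up volume size

# Morning: production is written, then snapped to create the test volume.
prod = {t: ("monday", t) for t in range(TRACKS)}
test = dict(prod)                      # table-only copy, shares every location
start_of_day = physical_tracks(prod, test)

# Worst case: every production track is rewritten during the day.
for t in range(TRACKS):
    prod[t] = ("tuesday", t)           # new data goes to a new location
end_of_day = physical_tracks(prod, test)

# Next Snapshot: the test table is replaced, so the old data is unreferenced.
test = dict(prod)
after_next_snapshot = physical_tracks(prod, test)
```

With these numbers the subsystem holds 1,000 physical tracks at the start of the day, 2,000 at the end, and 1,000 again after the next Snapshot.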

