Wednesday, December 23, 2009

Why is the Qtree SnapMirror transfer size larger than expected?

 

Solution ID: ntapcs8544
Last updated: 6 DEC 2007

Why is the Qtree SnapMirror transfer size larger than expected?


Symptoms
SnapMirror status output shows transfers that are larger than the sum of changes in the qtree that was replicated.

Qtree SnapMirror (QSM) transfer size is higher than what client utilities calculate.

Why am I getting Stale File Handle errors from my mounted exports on a QSM destination?

 

Cause of this problem
Although the qtree was unchanged, something changed in the volume where the qtree resides.

Solution
The files in the qtree are unchanged, but a change was made to an inode that shares an "inodefile" block with an inode belonging to the qtree. Because of this, Data ONTAP transfers the metadata containing the associated attributes, such as timestamps and permissions, for the qtree inode. None of the unchanged data is transferred. This prevents a security hole in which permissions are changed on a file but the change is not replicated.

Client utilities such as rsync can determine how many bytes have changed. However, they cannot determine how many blocks of a file were affected and therefore must be retransmitted. For example:

If 100 bytes are inserted into the middle of a 10 MB file, all of the trailing bytes are shifted down. A utility like rsync sees the 100-byte change, but it does not see the roughly 5 MB (5,242,880 bytes) of trailing data, or the inode metadata, that must be transmitted because of the shifting of blocks.
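
To see how much block-level change Data ONTAP itself records on the source volume, which tracks QSM transfer size more closely than client-side byte counts because it includes shifted data blocks and metadata blocks, the snap delta command can be run on the source. This is a sketch only; the command exists in recent Data ONTAP 7.x releases, but the volume name, Snapshot names, and figures below are illustrative:

srcfiler> snap delta srcvol
Volume srcvol
working...

From Snapshot   To                  KB changed   Time         Rate (KB/hour)
--------------- ------------------- ------------ ------------ ---------------
hourly.0        Active File System  5308         0d 03:00     1769.333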

 

Environment
Data ONTAP
NetApp filer
NearStore appliance
SnapMirror

 

Which snapshots are deleted automatically and which ones are supposed to be busy?

 

Solution ID: kb17046
Last updated: 14 OCT 2008

Which snapshots are deleted automatically and which ones are supposed to be busy?


Symptoms
Should I delete a busy Snapshot?
Which snapshots are deleted automatically and which ones are supposed to be busy?

Keywords : busy snapshot snapmirror

Solution
Which Snapshots are supposed to be deleted automatically?

SnapMirror keeps only the most recent Snapshot and deletes the previous one as soon as the next update completes successfully.


SnapVault and regular volume Snapshots are deleted according to the Snapshot retention value set in the schedule.
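
For regular volume Snapshots, the retention counts are the values in the volume's Snapshot schedule, which can be checked with snap sched (the volume name and schedule values below are illustrative):

system> snap sched vol1
Volume vol1: 0 2 6@8,12,16,20

Here 0 weekly, 2 nightly, and 6 hourly Snapshots (taken at 8:00, 12:00, 16:00, and 20:00) are retained; as each new scheduled Snapshot is created, the oldest one of that type is deleted automatically.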
 
Which Snapshots are not deleted automatically?

Snapshots created by SnapMirror resync, SnapVault restore, snap restore, dump, volcopy, and ndmpcopy are not deleted automatically.
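
Because these Snapshots are not removed automatically, they must be deleted manually once the operation that created them no longer needs them. A minimal sketch, with an illustrative volume name and a placeholder Snapshot name (identify the leftover copy with snap list, then delete it by name):

system> snap list vol1
system> snap delete vol1 <leftover_snapshot_name>
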
Which Snapshots are expected to be in a busy state?

Qtree SnapMirror (QSM) and SnapVault lock the last created Snapshot on the destination because it is needed for the next incremental update. On the source, the Snapshot is owned by the service that created it, but it is not locked.


Note that for volume SnapMirror (VSM), all Snapshots on the source volume will be marked "busy" during a SnapMirror update because all Snapshots are transferred. Once the VSM update completes, the busy status will automatically clear from the volume Snapshots.

Once the transfer has completed, the newest Snapshot returns to the "snapmirror" state, the older SnapMirror base Snapshot is deleted, and any Snapshot that was held "busy" only because of the transfer is released.
For example:

f840-ca1> snap list testvol1
Volume testvol1
working...

%/used     %/total    date         name
---------- ---------- ------------ --------
 0% ( 0%)   0% ( 0%)  Apr 16 10:33 f840-ca3(0033583371)_dstsm.31 (busy,snapmirror)
 0% ( 0%)   0% ( 0%)  Apr 16 10:33 test1 (busy)
 1% ( 0%)   0% ( 0%)  Apr 16 10:31 f840-ca3(0033583371)_dstsm.30 (busy,snapmirror)

f840-ca1> snapmirror status
Snapmirror is on.
Source Destination State Lag Status

f840-ca1:testvol1 f840-ca3:dstsm Source 00:02:10 Transferring (136 KB done)

f840-ca1> snap list testvol1
Volume testvol1
working...

%/used     %/total    date         name
---------- ---------- ------------ --------
 0% ( 0%)   0% ( 0%)  Apr 16 10:33 f840-ca3(0033583371)_dstsm.31 (snapmirror)
 0% ( 0%)   0% ( 0%)  Apr 16 10:33 test1


Also, if a Snapshot is being used by an NDMP backup or dump, the Snapshot will be marked "(busy,backup[#],snapmirror)". The busy status clears automatically once the backup completes or is terminated. To determine whether a backup is using the Snapshot, run the "backup status" command.
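
For example, a Snapshot held by a running backup shows the backup[#] owner in snap list, and backup status identifies the backup that holds it. The Snapshot name, backup ID, device, and output columns below are illustrative and vary by Data ONTAP release:

system> snap list vol1
...
 0% ( 0%)   0% ( 0%)  Apr 16 10:33 nightly.0 (busy,backup[0],snapmirror)

system> backup status
ID  State    Type  Device   Start Date     Level  Path
--  -------  ----  -------  -------------  -----  -----------
0   ACTIVE   NDMP  nrst0a   Apr 16 10:35   0      /vol/vol1/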

 

LUN clones, volume clones, CIFS shares, RAID mirroring, and similar features also lock their respective Snapshots.
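
As an illustration of clone locking, creating a FlexClone from an existing Snapshot immediately makes that Snapshot busy, and snap list -b (described below) reports the owner. The volume, Snapshot, and clone names here are illustrative:

system> vol clone create clonevol -s none -b vol1 base_snap
system> snap list -b vol1
Volume vol1 working...
name                                 owners
-----------                          -----------
base_snap                            volume clone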

Note: See Deleting busy snapshots for more information on locked Snapshots.

 

Starting in Data ONTAP 7.3, FlexClone volumes can be created from SnapMirror destinations without causing SnapMirror transfers to fail. In versions prior to Data ONTAP 7.3, however, FlexClone locks SnapMirror Snapshots: you can clone a SnapMirror destination volume, but the clone locks the Snapshot copy from which it was created. It also locks that copy in the source volume and, if the volume is part of a SnapMirror cascade, in every volume in the cascade. Furthermore, if a FlexClone volume is created from a Snapshot copy on the destination that is not the most recent copy, and that Snapshot copy no longer exists on the source volume, every update needs to delete that copy on the destination; in this case, all SnapMirror updates to the destination volume will fail until the clone is destroyed or split. This does not occur if the clone is created from the most recent Snapshot copy on the SnapMirror destination, because that copy still exists in the source volume.
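
If SnapMirror updates are already failing for this reason on a pre-7.3 system, the lock can be released by splitting the clone from its parent (first command) or by taking the clone offline and destroying it (last two commands); a minimal sketch with an illustrative clone name:

dstfiler> vol clone split start clonevol

dstfiler> vol offline clonevol
dstfiler> vol destroy clonevol

Once the clone is split or destroyed, the Snapshot copy is unlocked and the next SnapMirror update can delete it. Note that splitting requires enough free space in the containing aggregate to hold an independent copy of the clone's data.
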
Use the snap list -b vol_name command to determine which service owns a Snapshot. The "-b" option is available in Data ONTAP 7.0.2 and later. The first column lists all Snapshots in the volume; the second column lists the owners of each Snapshot that is busy. If a Snapshot is not busy, no owner is reported.

system>snap list -b vol1
Volume vol1 working...
name                                 owners
-----------                          -----------
snap1                                LUN clone
clone_vclone.1                       volume clone
system(0033604314)_vol1_q1-dst.2     snapmirror

When snap delete is attempted on a busy Snapshot, the command fails and reports a similar list of the Snapshot's owners. For example:

system> snap delete -a vol1
Are you sure you want to delete all snapshots for volume vol1? y
snap delete -a: Remaining snapshots are currently in use by dump, snap restore, SnapMirror, a CIFS share, RAID mirroring, LUNs or retained by SnapLock.
Please try to delete remaining snapshots later.
system> snap delete vol1 system(0033604314)_vol1_q1-dst.2
Snapshot system(0033604314)_vol1_q1-dst.2 is busy because of snapmirror
 

Environment
Data ONTAP 7G and earlier
All Filer models
SnapVault

 
 
If you would like more help, please try the new NOW Support Communities where registered customers, partners, and NetApp technical experts discuss technical questions and issues. (https://forums.netapp.com/community/support)

 

Qtree SnapMirror update states that file system is full



Symptoms
Qtree SnapMirror update states that file system is full
SnapMirror fails with the error: [snapmirror.dst.waflErr:error]: SnapMirror destination transfer from filer:/vol/srcvol/srcqtree to /vol/dstvol/dstqtree : qtree snapmirror destination write failed: No space left on device.

Cause of this problem

For qtree SnapMirror (QSM), the volume on the destination filer that contains the destination qtree can require more space than the corresponding volume on the source filer. This occurs for two reasons:

  1. During a qtree SnapMirror update, changes must be replicated to the destination before data can be removed. Thus, at least 5% free space should be available per QSM relationship to allow for the temporary space needed during the transfer.

  2. Volume-level Snapshots independent of the QSM base Snapshot (such as the nightly Snapshots) can retain data in the QSM qtrees. Because these volume Snapshots are unique to the destination filer, they may contain data that has already been deleted from the source filer.

If the destination volume has enough space to hold the data in the source qtree, but not enough to hold the source qtree plus the Snapshot delta, the QSM update fails with the following error:
[snapmirror.dst.waflErr:error]: SnapMirror destination transfer from srcfiler:/vol/srcvol/srcqtree to /vol/dstvol/dstqtree : qtree snapmirror destination write failed: No space left on device.


Solution

Increase the size of the destination volume so that it can hold the sum of the following (a command sketch follows the list):

  • The amount of data stored in the source qtree
  • The amount of data stored in snapshots
  • 5% free space
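
On a flexible volume, the destination can be grown online with the vol size command. This is a minimal sketch, assuming the destination is a FlexVol named dstvol that needs roughly 20 GB of additional space; the names, sizes, and output wording are illustrative:

dstfiler> vol size dstvol
vol size: Flexible volume 'dstvol' has size 100g.
dstfiler> vol size dstvol +20g
vol size: Flexible volume 'dstvol' size set to 120g.

For a traditional volume, space is added instead by adding disks with the vol add command. Shortening the destination's Snapshot schedule or lowering its snap reserve can also free space without growing the volume.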

The following scenario is an example of how the destination volume for a QSM relationship can become full even though it is the same size as the source filer's volume. A QSM relationship is set up from source srcfiler:/vol/srcvol/srcqtree to destination dstfiler:/vol/dstvol/dstqtree. The srcvol on srcfiler is 100GB in size and contains a qtree using 20 GB.  A "df -g" on this filer shows:

srcfiler> df -g
Filesystem total used avail capacity
/vol/srcvol/ 80GB 20GB 60GB 25%  
/vol/srcvol/.snapshot 20GB  0GB 20GB  0%   

Volume snapshots are disabled on the source volume /vol/srcvol. The dstvol on dstfiler is also 100GB in size and contains the QSM destination qtree, which holds the same 20GB of data as the source qtree (srcfiler:/vol/srcvol/srcqtree).  A "df -g" on dstfiler shows:

dstfiler> df -g
Filesystem total used avail capacity
/vol/dstvol/ 80GB 20GB 60GB 25% 
/vol/dstvol/.snapshot 20GB  0GB 20GB  0%
Volume snapshots are enabled on the destination volume /vol/dstvol:

dstfiler> snap sched
Volume dstvol: 0 2 6@8,12,16,20

The nightly snapshot has been taken on the destination volume.  This snapshot contains the 20GB of data in /vol/dstvol/dstqtree that was replicated over by QSM.
dstfiler> snap list dstvol
Volume dstvol
working...

%/used     %/total    date         name
---------- ---------- ------------ --------
25% ( 25%) 10% ( 10%) Aug 20 16:00 nightly.0
On the source filer, 20 GB of files are added, and 10 GB of files are deleted.  The srcvol now contains 30GB of data:

srcfiler> df -g
Filesystem total used avail capacity
/vol/srcvol/ 80GB 30GB 50GB 38%
/vol/srcvol/.snapshot 20GB  0GB 20GB  0% 

A QSM update occurs, and the changes are replicated to the destination. QSM sends delete information for the files that have been deleted and complete data for the newly created files. Thus, the data transferred is equal to 20 GB of new data plus roughly 4 KB per deleted file. Once the transfer completes, the destination filer has the following space used:

dstfiler> df -g
Filesystem total used avail capacity
/vol/dstvol/ 80GB 30GB 50GB 38% 
/vol/dstvol/.snapshot 20GB 10GB 10GB 50%

Notice that there is space used in the destination filer's snap reserve.  This space is held in the nightly snapshot:

dstfiler> snap list dstvol
Volume dstvol
working...

%/used     %/total    date         name
---------- ---------- ------------ --------
25% ( 25%) 10% ( 10%) Aug 20 16:00 nightly.0

Thus, because the destination volume has Snapshots enabled, it can use more space than the source volume. Therefore, it is important to consider the Snapshot retention period and Snapshot delta when sizing the destination volume for QSM relationships.
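
If growing the destination volume is not possible, the space held by destination-only Snapshots can be reclaimed by deleting them or by reducing the Snapshot schedule. A sketch using the names from the example above (the new schedule values are illustrative):

dstfiler> snap delete dstvol nightly.0
dstfiler> snap sched dstvol 0 1 4@8,12,16,20

In the example above, deleting nightly.0 releases the 10 GB of deleted data it holds, at the cost of losing that restore point on the destination.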

Last Updated:  24 AUG 2006

Environment
Data ONTAP
All NetApp filer models
NearStore
SnapMirror

If you would like more help, please try the new NOW Support Communities where registered customers, partners, and NetApp technical experts discuss technical questions and issues.