Tuesday, June 10, 2008

Compressed snapshot check makes it fail on one subscriber

As this blog has my personal bread crumbs I thought I would better record this.

We have problems with our merge replication topology (SQL Server 2005 9.0.3228 Publisher and Dist and SQLE 9.0.3228 as Subscribers, pull subscriptions) the other day as we switched to compressed snapshots. I posted the problem in the forum and searched the books about it without luck...

Here's the problem

We noticed one subscriber downloaded the compressed dynamic snapshot, but was unable to apply it:

The trace at the subscriber showed the following error:

Partitioned snapshot validation failed for this Subscriber. The snapshot validation token stored in the specified partitioned snapshot location does not match the value ''b422311'' used by the Merge Agent when evaluating the parameterized filter function. If specifying the location of the partitioned snapshot (using -DynamicSnapshotLocation), you must ensure that the snapshot files in that directory belong to the correct partition or allow the Merge Agent to automatically detect the location.',1,1,N'
SQL Merge Agent encountered an error.


The merge agent log with OutputVerbose level 3 showed the following:

2008-06-03 18:07:48.222 [0%] [978 sec remaining] Snapshot will be applied from a compressed cabinet file
2008-06-03 18:07:48.238 OLE DB Distributor 'TORDISTRIBUTOR': {call sys.sp_MSadd_merge_history90 (?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?)}
2008-06-03 18:07:48.254 OLE DB Subscriber 'TORSUBSCRIBER': {call sys.sp_MSadd_merge_history90 (?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?)}
2008-06-03 18:07:48.285 [0%] [978 sec remaining] A dynamic snapshot will be applied from '\\TORDISTRIBUTOR\repldata\unc\TORPUBLISHER_PUBLISHER_ PUBLICATION\B422311_1\'
2008-06-03 18:07:48.285 OLE DB Distributor 'TORDISTRIBUTOR': {call sys.sp_MSadd_merge_history90 (?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?)}
2008-06-03 18:07:48.300 OLE DB Subscriber 'TORSUBSCRIBER': {call sys.sp_MSadd_merge_history90 (?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?)}
2008-06-03 18:07:48.316 [0%] [732 sec remaining] Validating dynamic snapshot
2008-06-03 18:07:48.316 OLE DB Distributor 'TORDISTRIBUTOR': {call sys.sp_MSadd_merge_history90 (?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?)}
2008-06-03 18:07:48.332 OLE DB Subscriber 'TORSUBSCRIBER': {call sys.sp_MSadd_merge_history90 (?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?)}
2008-06-03 18:07:48.347 [0%] [732 sec remaining] Extracting snapshot file 'dynsnapvalidation.tok' from cabinet file
2008-06-03 18:07:48.363 OLE DB Distributor 'TORDISTRIBUTOR': {call sys.sp_MSadd_merge_history90 (?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?)}
2008-06-03 18:07:48.534 OLE DB Subscriber 'TORSUBSCRIBER': {call sys.sp_MSadd_merge_history90 (?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?)}
2008-06-03 18:07:48.566 [0%] [732 sec remaining] Extracted file 'dynsnapvalidation.tok'
2008-06-03 18:07:48.566 OLE DB Distributor 'TORDISTRIBUTOR': {call sys.sp_MSadd_merge_history90 (?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?)}
2008-06-03 18:07:48.566 OLE DB Subscriber 'TORSUBSCRIBER': sys.sp_MSregisterdynsnapseqno @snapshot_session_token = N'\\TORDISTRIBUTOR\ReplData\unc\TORPUBLISHER_PUBLISHER_ PUBLICATION\20080527113713\dynsnap', @dynsnapseqno = 'E69C5BB3-1095-429C-92BC-46747E49A155'
2008-06-03 18:07:48.675 OLE DB Subscriber 'TORSUBSCRIBER': sp_MSreleasesnapshotdeliverysessionlock
2008-06-03 18:07:48.753 The merge process was unable to deliver the snapshot to the Subscriber. If using Web synchronization, the merge process may have been unable to create or write to the message file. When troubleshooting, restart the synchronization with verbose history logging and specify an output file to which to write.
2008-06-03 18:07:48.753 OLE DB Subscriber 'TORSUBSCRIBER': {call sys.sp_MSadd_merge_history90 (?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?)}
2008-06-03 18:07:48.862 The merge process was unable to deliver the snapshot to the Subscriber. If using Web synchronization, the merge process may have been unable to create or write to the message file. When troubleshooting, restart the synchronization with verbose history logging and specify an output file to which to write.

It seems that when we use compressed snapshots and we validate the subscriptions using the HOST_NAME() there is an extra check done by the sync agent. The sync agent checks a token file inside the snapshot.cab file for that user. The filename is dynsnapvalidation.tok and contains the HOST_NAME() in our case.

The HOST_NAME() the subscriber passed to the agent was lower cased, and the partition, token file and sync agent job and token file had the HOST_NAME() upper cased.

Partitioned snapshot validation failed for this Subscriber. The snapshot validation token stored in the specified partitioned snapshot location does not match the value ''b422311'' used by the Merge Agent when evaluating the parameterized filter function. If specifying the location of the partitioned snapshot (using -DynamicSnapshotLocation), you must ensure that the snapshot files in that directory belong to the correct partition or allow the Merge Agent to automatically detect the location.',1,1,N'

SQL Merge Agent encountered an error.

Here's the solution:

The way we solved the issue by changing the reference data to be lower cased, deleted the snapshot folder, deleted the sync agent job at the distributor for that partition. Dropped the subscription and created it again and it recreated the sync agent job at the distributor with the proper lower cased Host_Name().

We haven't been able to reproduce it again, but time permitting, we will :-p

I tried to reproduce the scenario using a publisher and subscriber on the same server (SQL Server 2005 Developer build 9.00.3228) but couldn’t reproduce the error. On our real life scenario we use SQL Server Express as subscriber and RMO to trigger replication.

Labels:

6 Comments:

Anonymous Crash said...

We had a similiar issue with merge replication and non-compressed snapshots. Our application's publisher/distributor is running SQL Server 2005 enterprise and subscribers are running SQL Server Express. We use pull subscriptions and we run the merge agent from the subscriber via code in C# and RMO.

We were 'randomly' experiencing the token validation error - until we traced it back to spelling/case sensitivity of the value we pass for HOST_NAME.

Long story short: after a subscription is reinitialized the HOST_NAME is validated against the .tok file the next time the merge agent is run. If no spelling & case match is found between the distributor and the HOST_NAME passed up from the subscriber then replication fails with the token validation error.

We were able to reproduce the error and we were able to make it happen or not at will by editing the token value in the .tok file.

Moral of the story: Use an immutable unique value for your HOST_NAME! Never use a value that users or some other part of the application might alter after the subscription has been created.

3:40 PM  
Blogger Lizet Pena de Sola said...

...even if that "immutable" value is the domain login for the given user. Your Active Directory administrator can change the way the name is capitalized :-p

4:48 PM  
Blogger Lizet Pena de Sola said...

This post has been removed by the author.

10:46 PM  
Blogger Lizet Pena de Sola said...

Thanks to Crash for pointing out that the .tok file is not related to the snapshot folder compression. afaik this file shows up if you have partition enabled due to dynamic filtering...

10:56 PM  
Blogger Wiz said...

I cannot thank you enough !! I've had this problem for months now, and I couldnt figure out why it happened. When creating the subscriptions, I sometimes typed the HOST_NAME in uppercase, and sometimes with lowercase. I didnt think this mattered. It's solved now!

Thanks a million

12:57 AM  
Blogger Wiz said...

I cannot thank you enough !! I've had this problem for months now, and I couldnt figure out why it happened. When creating the subscriptions, I sometimes typed the HOST_NAME in uppercase, and sometimes with lowercase. I didnt think this mattered. It's solved now!

Thanks a million

12:58 AM  

Post a Comment

<< Home