http://www.mssqlcity.com/Articles/Replic/SetupSR/SetupSR.htm
Setup Snapshot replication: step by step guide
Alexander Chigrik
chigrik@mssqlcity.com
-
Introduction
General concepts
Replication topologies
Replication types
Replication agents
Checking necessary conditions
Step by step example
Backup and restore strategies
Literature
Introduction
In this article, I want to tell you about some general Microsoft SQL Server replication's topics: replication topologies, replication types, replication agents; and about Snapshot replication: how to check necessary conditions for this replication type, how to backup and restore databases participated in this replication scenario, and how to setting up Snapshot replication step by step.
Because it's only test example, I used only one server to replicate data: Publisher, Subscriber and Distributor databases were resided on the same machine.
General concepts
Replication is the process whereby data is copied between databases on the same server or different servers connected by LANs, WANs, or the Internet.
Microsoft SQL Server replication uses a publisher , distributor and subscriber metaphor.
Publisher is the server or database that sends its data to another server or database.
Subscriber is the server or database that receives data from another server or database.
Distributor is the server that manages the flow of data through the replication system. This server contains the distribution database .
Publisher contains publication/publications . Publication is a collection of one or more articles that is sent to a subscriber server or database.
Article is the basic unit of replication and can be a table or a subset of a table.
Subscription is the group of data that a server or database will receive.
There are push and pull subscriptions . Push subscription is subscription when the publishing server will periodically push transactions out to the subscribing server or database.
Pull subscription is subscription when the subscribing server will periodically connect to the distribution database and pull information.
The Distribution database is a system database , which is stored on the Distributor and does not contain any user tables. This database is used to store snapshot jobs and all transactions waiting to be distributed to Subscribers.
Replication topologies
Microsoft SQL Server supports the following replication topologies:
-
- Central publisher
- Central subscriber
- Central publisher with remote distributor
- Central distributor
- Publishing subscriber
Central publisher
This is one of the most used replication topologies. In this scenario, one server is configured as Publisher and Distributor and another server/servers is/are configured as Subscriber/Subscribers.
Central subscriber
This is a common topology in data warehousing. Many servers or databases replicate their data to a single central server in one or more databases.
Central publisher with remote distributor
In this topology, distribution database resides on another server than publisher. This topology uses for performance reasons when the level of replication activity increases or the server or network resources become constrained. It reduces Publisher loading, but it increases overall network traffic.This topology requires separate Microsoft SQL Server installations, one for the Publisher and one for the Distributor.
Central distributor
In this topology, several publishers use only one distributor, which resides on another server than publishers. This is one of the most unused replication topologies, because it has only single point of failure (on the single server with central distributor), and if distributor's server will fail, entire replication scenario will be destroyed.
Publishing subscriber
This is a dual role topology. In this topology, two servers publish the same data. One publishing server sends data to subscriber, and then this subscriber publish data to any number of other subscribers. This is useful when a Publisher should send data to Subscribers over a slow or expensive communications link.
Replication types
Microsoft SQL Server 7.0/2000 supports the following replication types:
-
- Snapshot
- Transactional
- Merge
Snapshot replication is a simplest type of replication. With this kind of replication, all replicated data (replica) will be copied from the Publisher database to Subscriber/Subscribers database on a periodic basis. Snapshot replication is best used as a method for replicating data that changes infrequently and when the size of replicated data is not very large.
With Transactional replication, SQL Server captures all changes that were made in the articles and stores INSERT, UPDATE, and DELETE statements in the distribution database. This changes then sent to subscribers from the distribution database and applied in the same order. Transactional replication is best used when the replicated data changes frequently or when the size of replicated data is not small and is not necessary to support autonomous changes the replicated data on the Publisher and on the Subscriber.
Merge replication is a most difficult replication type. It allows making autonomous changes to replicated data on the Publisher and on the Subscriber. With Merge replication, SQL Server captures all incremental data changes in the source and in the target databases, and reconciles conflicts according to rules you configure or using a custom resolver you create. Merge replication is best used when you want to support autonomous changes the replicated data on the Publisher and on the Subscriber.
Replication agents
Microsoft SQL Server 7.0/2000 supports the following replication agents:
-
- Snapshot Agent
- Log Reader Agent
- Distribution Agent
- Merge Agent
The Snapshot Agent is a replication agent that makes snapshot files , s t ores the snapshot on the Distributor , and records information about the synchronization status in the distribution database. The Snapshot Agent is used in all replication types (Snapshot, Transactional, and Merge replications) , and can be administered by using SQL Server Enterprise Manager.
The Log Reader Agent is a replication agent that moves transactions m arked for replication from the transaction log on the Publisher to the distribution database . This replication agent is not used in Snapshot replication.
The Distribution Agent is a replication agent that moves the snapshot jobs from the distribution database to Subscribers , and moves all transactions waiting to be distributed to Subscribers. The Distribution Agent is used in Snapshot and Transactional replications , and can be administered by using SQL Server Enterprise Manager.
The Merge Agent is a replication agent that applies initial snapshot jobs from the publication database tables to Subscribers , and merges incremental data changes that have occurred since the initial snapshot was created. The Merge Agent is used only in Merge replication.
Checking necessary conditions
Check the following before setting up Snapshot replication:
1. The Localsystem account has no access to shares on the network as it isn't an authenticated network account. So, if you want to setting up replication you must change the account the MSSQLServer and SQLServerAgent services runs under to a account with the Windows NT/Windows 2000 administrator's rights. If your Microsoft SQL Server runs on Windows NT or Windows 2000, you can create Windows NT/Windows 2000 account and include it into local Administrators group, into Domain Users group, and set Log in as a service permission for this account.
Because Windows 9x does not support Windows NT services, so if your Microsoft SQL Server runs on Windows 9x, you do not need to create SQL account.
2. Only members of the sysadmin server role can setting up and configure replication, so if you have not these rights, you cannot setting up replication.
3. Don't forgive to start SQLServerAgent service (and MSSQLServer service, of course).
4. You should allocate adequate disk space in the snapshot folder.
5. You should allocate adequate disk space for the distribution database.
6. You should ensure the server that is being replicated to, is defined as a remote server.
How Snapshot Replication Works
http://msdn.microsoft.com/en-us/library/Aa256290
Snapshot replication is implemented by the Snapshot Agent and the Distribution Agent.
The Snapshot Agent prepares snapshot files containing schema and data of published tables and database objects, stores the files in the snapshot folder, and records synchronization jobs in the distribution database on the Distributor. By default, the snapshot folder is located on the Distributor, but you can specify an alternate location instead of or in addition to the default.
The Distribution Agent moves the snapshot held in the distribution database tables to the destination tables at the Subscribers. The distribution database is used only by replication and does not contain any user tables.
NOTE: WHEN HAPPEN
"The log was not truncated because records at the beginning of the log are
pending replication. Ensure the Log Reader Agent is running or use
sp_repldone to mark transactions as distributed." DURING BACKUP TRANSACTION LOG
Reason:
The transaction marked for replication could not be removed from the log during normal backup .
Normally in this situation executing sp_repldone will clean these up.
HOW TO CHECK?
HOW TO SOLVE?
To clear the pending transactions, run the following against the published database, after stopping the logreader and distribution agent:
stop&restart agent:
exec sp_repldone @xactid = NULL, @xact_seqno = NULL, @numtrans = 0, @time = 0, @reset = 1
sp_repldone : Updates the record that identifies the last distributed transaction of the server. This stored procedure is executed at the Publisher on the publication database.
HOW WE KNOW WE HAVE DONE THE MARK REPLICATED TRANSACTIONS AS distributed ?
issue the call to DBCC OPENTRAN in the Publication database.
Look for the value of Oldest non-distributed LSN . if the value is : (0:0:0) then the log reader agent is all caught up.
If you try to start the log reader now you will see the following error:
2009-11-25 22:40:00.995 Status: 0, code: 18752, text: 'Only one Log Reader Agent or log-related procedure (sp_repldone, sp_replcmds, and sp_replshowcmds) can connect to a database at a time. If you executed a log-related procedure, drop the connection over which the procedure was executed or execute sp_replflush over that connection before starting the Log Reader Agent or executing another log-related procedure.'.
2009-11-25 22:40:00.995 Status: 0, code: 22017, text: 'The process could not set the last distributed transaction.'.
2009-11-25 22:40:00.995 Status: 0, code: 22037, text: 'The process could not execute 'sp_repldone/sp_replcounters' on ‘Server/Instance'.
HOW TO SOLVE?
WAY 1 : sp_replflush (Transact-SQL)
Flushes the article cache. This stored procedure is executed at the Publisher on the publication database.
http://207.46.16.252/en-gb/library/ms174992%28SQL.100%29.aspx
WAY 2 : sp_replrestart (Transact-SQL)
Used by transactional replication during backup and restore so that the replicated data at the Distributor is synchronized with data at the Publisher. This stored procedure is executed at the Publisher on the publication database.
http://technet.microsoft.com/en-us/library/ms174390.aspx
IMPORTANT!!!:
xec sp_repldone @xactid = NULL, @xact_seqno = NULL, @numtrans = 0, @time = 0, @reset = 1
sp_replshowcmds
sp_replrestart....
those command can only be run in when publication database is SINGLE USER mode