In Microsoft SQL Server, you can replicate data and schema changes from one database server to another. Baya Pavliashvili walks us through the use of the SQL Server 2005 wizards and scripts for setting up transactional replication and points out some of the cool new features of the latest version.
Replication is an advanced feature of Microsoft SQL Server, allowing you to move data and schema changes from one database server to another. In this article, I’ll show you how to set up transactional replication in a simple environment using SQL Server 2005. Subsequent articles in this series will discuss maintaining and troubleshooting transactional replication and replicating code modules.
The examples in this article are based on SQL Server 2005 Service Pack 1. Although this article isn’t intended to be a comparison of features between versions 2000 and 2005, I will underline the improvements that I consider worth observing. This article teaches you how to set up transactional replication using wizards and scripts. In most cases, you’ll set up the initial publication and subscribers using wizards; however, if you need to apply the same publication in multiple environments, you’ll appreciate the option of applying the scripts as opposed to going through wizards time and again.

Configuring a Distributor

SQL Server 2005 introduced numerous welcome improvements to replication, not the least of which is shorter wizards. Following a wizard isn’t difficult, but fewer wizard screens certainly make replication setup quicker. As a rule, replication wizards in SQL Server 2005 are nearly 50% shorter than those in SQL Server 2000.
The first step in configuring replication is designating a server that will take care of storing and delivering replicated transactions—the distributor. A single server can act as a publisher, distributor, and a subscriber, all at the same time. However, in a realistic scenario you’re likely to use two different servers as publisher and subscriber. Using a separate server as the distributor can help to reduce the load on the publisher.
To invoke the Configure Distribution Wizard, connect to an instance of SQL Server by using the SQL Server Management Studio (SSMS), navigate to the "replication" folder, right-click this folder, and choose Configure Distribution from the pop-up menu. Replication wizards are no longer modal; that is, you can continue working with SSMS while the wizard is active. The first screen of the wizard simply informs you of the tasks that this wizard can help you to accomplish. If you don’t ever want to see this screen again, simply check the option to skip the introductory screen in the future.
The next screen asks whether you want to use the local server or a different server as the distributor ( see Figure 1).
Figure
Figure 1
If you want to use a remote distributor, you must first run the Configure Distribution Wizard on that server. For this example, I’ll use the same instance as both publisher and distributor. The next screen allows you to specify the snapshot folder where data and schema of the published database will be stored ( see Figure 2). By default, the snapshot folder is called ReplData and is created within the directory where the current SQL Server instance is installed.
Figure
Figure 2
Notice the warning in the dialog box, indicating that the current directory doesn’t support pull subscriptions. To use pull subscriptions, you need a network folder for storing snapshots. Because both publisher and subscriber instances of SQL Server in this example will reside on the same computer, I can safely disregard this message, and simply click Next.
The following screen allows for configuring the distribution database’s name and the location for its data and transaction log files. By default, the distribution database is called distribution; you can modify the name if you have a compelling reason to do so. For example, if you have dozens or hundreds of publications, you might want to have multiple distribution databases, with descriptive names for each one. The wizard will use the default location for database and log files. You can configure the default location on the Database Settings tab in the Server Properties dialog box in SSMS (right-click the server and choose Properties to access the dialog box). Alternatively, you can change file locations in the wizard, as shown in Figure 3.
Figure
Figure 3
The next screen enables servers to use the current distributor when they’re configured as publishers ( see Figure 4). This screen has a couple of interesting options. First, if you click the ellipsis (...) button next to a publisher, you’ll get a dialog box that allows you to configure the log reader agent’s security credentials as well as the snapshot folder for this publisher, as shown in Figure 5.
Figure
Figure 4
Figure
Figure 5
Second, the Add button allows you to add a SQL Server or Oracle publisher. This feature is worth your attention because using the distribution database for an Oracle publisher wasn’t available in previous versions.
After you’ve enabled the publishers, you can set a password for remote publishers ( see Figure 6). You must enter the same password twice. SQL Server 2005 allows the administrator to enforce password policies and password expiration. Hence, the wizard warns you that the password you enter for a remote publisher must meet the password policy requirements.
Figure
Figure 6
After you click Next on this screen, you can configure distribution right away, save the script for later execution, or perform both actions. If you choose to save the script, you’ll be asked for the location where you want to save the file. At this point, the wizard presents a synopsis of the steps it’s about to undertake; once you click Finish, the wizard will create the script for adding a distributor and/or save the script, depending on what you specified.
Listing 1 shows a script for configuring a distributor and adding a publisher.
Listing 1 Sample script for configuring a distributor and adding a publisher.
use master
GO
exec sp_adddistributor @distributor = N’server\instance’, @password = N’’
GO

/* create and configure distribution database */

exec sp_adddistributiondb @database = N’distribution’,
@data_folder = N’C:\Program Files\Microsoft SQL Server\MSSQL.4\MSSQL\Data’,
@data_file_size = 4, @log_folder = N’C:\Program Files\Microsoft SQL Server\MSSQL.4\MSSQL\Data’,
@log_file_size = 2, @min_distretention = 0, @max_distretention = 72, @history_retention = 48,
@security_mode = 1
GO

use [distribution]
if (not exists (select * from sysobjects where name = ’UIProperties’ and type = ’U ’))
create table UIProperties(id int)
if (exists
(select * from ::fn_listextendedproperty(’SnapshotFolder’, ’user’, ’dbo’, ’table’,
’UIProperties’, null, null)))

EXEC sp_updateextendedproperty
N’SnapshotFolder’, N’C:\Program Files\Microsoft SQL Server\MSSQL.4\MSSQL\ReplData’,
’user’, dbo, ’table’, ’UIProperties’
else
EXEC sp_addextendedproperty N’SnapshotFolder’,
’C:\Program Files\Microsoft SQL Server\MSSQL.4\MSSQL\ReplData’, ’user’, dbo, ’table’,
’UIProperties’
GO

/* add a publisher */

exec sp_adddistpublisher @publisher = N’server\instance’, @distribution_db = N’distribution’,
@security_mode = 1,
@working_directory = N’C:\Program Files\Microsoft SQL Server\MSSQL.4\MSSQL\ReplData’,
@trusted = N’false’, @thirdparty_flag = 0, @publisher_type = N’MSSQLSERVER’
GO
Once you’ve configured the distribution database, you can read or modify the distributor properties by right-clicking the replication folder and choosing Distributor Properties. The resulting dialog box has two pages—a "general" page and a "publishers" page. The "general" page allows you to view distribution database properties and modify settings for transaction retention and/or history retention ( see Figure 7).
Figure
Figure 7
Notice that you’re also allowed to create and configure a queue reader agent from this screen. The queue reader agent is beyond the scope of this article.
The "publishers" page of the Distribution Database Properties dialog box lets you add a publisher or change existing publishers’ properties.


Creating a Publication

Once you’ve configured a distributor, you’re ready to create publications. To invoke the publication wizard, right-click the local publications folder and choose New Publication from the pop-up menu. As with the Distribution Configuration Wizard, the first screen of this wizard is introductory in nature and can be skipped. The second screen allows you to choose the database in which you want to create a publication; for purposes of this article, I’ll create a publication within the AdventureWorksDW database that can be created as part of SQL Server 2005 installation. After selecting the database, you must choose the publication type. The wizard offers the following options:
  • Snapshot Publication
  • Transactional Publication
  • Transactional Publication with Updatable Subscriptions
  • Merge Publication
The wizard includes a brief description of each type of publication. I’ll use the transactional publication for this example; refer to my earlier articles for more info about other publication types.
A transactional publication can contain one or more articles. An article can be a table, a view (including indexed views), a user-defined function, or a stored procedure. For this example, I’ll replicate the dimAccount table from the AdventureWorksDW database. As shown in Figure 8, I can replicate all columns or a subset of all columns within a given table.
Figure
Figure 8
Replication has certain rules as far as which columns can be filtered. Transactional replication prohibits filtering primary-key columns. In addition, if your publication allows updateable subscriptions, you must replicate the msrepl_tran_version column (added by SQL Server when you create such publications). Further, publications that allow updateable subscriptions must replicate any column that doesn’t allow nulls, doesn’t have a predefined default, and isn’t an identity column.
If you check the box Show Only Checked Objects in the List, the wizard limits the list of articles to only those that have been checked. The Article Properties button allows you to set properties for the highlighted article or for all table articles. As Figure 9 shows, you can set a multitude of replication-related properties for each article.
Figure
Figure 9
Most properties you can set for table articles are self-explanatory; for example, the Copy Foreign Key Constraints option instructs the replication to include foreign key constraints when creating the table in the subscriber database.
A few properties deserve additional consideration:
  • Destination Object Name, Destination Object Owner. The destination table isn’t required to have the same name or the same owner as the source object.
  • Convert Data Types. This option automatically changes a user-defined data type to the base data type, because the user-defined data type might not exist on the subscriber(s).
  • Convert TIMESTAMP to BINARY. When replicating a column with a TIMESTAMP data type, you can convert it to BINARY. The TIMESTAMP data type tracks the sequence of modifications; every time you change a data row, SQL Server will automatically change the value of the column with the TIMESTAMP data type. This is important because, if you’re not careful, you might end up with different values in the column with the TIMESTAMP data type on the publisher and the subscriber.
  • Convert MAX Data Types to NTEXT and IMAGE. This option translates VARCHAR(MAX) and VARBINARY(MAX) data types, which are new in SQL Server 2005, to respective data types supported in previous versions.
  • Convert XML to NTEXT. Translates the new XML data type to NTEXT.
  • Another option that wasn’t available through wizards in previous versions of SQL Server is automatic identity range management. This option allows the database administrator to set the ranges of valid values for the identity column in the publisher and subscriber databases. For example, we could assign values 1,000,000 and greater to the publisher and 1 to 1,000,000 to the subscriber. When the publisher database reaches the upper limit for the identity range, it will automatically assign a new range so that publisher and subscriber identity values don’t overlap.
  • The final group of options (not shown in Figure 9) determines how to replicate INSERT, UPDATE, and DELETE statements to the subscriber.
Once you’ve set the necessary properties for the article you want to replicate, you can add publication filters ( see Figure 10). In previous versions of SQL Server, these filters were referred to as horizontal filters—you create them by supplying a WHERE clause to limit the number of published rows. As shown earlier, now you can filter the publication vertically by specifying which columns to publish.
Figure
Figure 10
The next step is to create a snapshot and/or specify the snapshot agent’s schedule, as shown in Figure 11.
Figure
Figure 11
The snapshot agent copies the schema and data of the replicated article(s) into the snapshot folder. If you click the Change button on this screen, you’ll get the typical dialog box for creating job schedules; you can run the snapshot agents monthly, weekly, daily, or even multiple times per day.
Next you specify the security settings for the snapshot and log reader agents ( see Figure 12). I’ll discuss replication security in greater detail in a later article about transactional replication agents. For now, you just need to know that you can customize security for each agent or use different credentials for each.
Figure
Figure 12
The wizard next offers you the option to script the commands for creating the publication. Review the synopsis of the steps the wizard is about to undertake; then specify the publication name and click Finish to create the publication.
Listing 2 shows the script for creating the publication.
Listing 2 Script for creating the example publication.
use [AdventureWorksDW]
exec sp_replicationdboption @dbname = N’AdventureWorksDW’, @optname = N’publish’, @value = N’true’
GO
-- Adding the transactional publication
use [AdventureWorksDW]
exec sp_addpublication @publication = N’DimAccount’,
@description = N’Transactional publication of database ’’AdventureWorksDW’’ from Publisher ’’server\instance’’.’,
@sync_method = N’concurrent’, @retention = 0, @allow_push = N’true’, @allow_pull = N’true’, @allow_anonymous = N’true’,
@enabled_for_internet = N’false’, @snapshot_in_defaultfolder = N’true’, @compress_snapshot = N’false’,
@ftp_port = 21, @ftp_login = N’anonymous’, @allow_subscription_copy = N’false’, @add_to_active_directory = N’false’,
@repl_freq = N’continuous’, @status = N’active’, @independent_agent = N’true’, @immediate_sync = N’true’,
@allow_sync_tran = N’false’, @autogen_sync_procs = N’false’, @allow_queued_tran = N’false’, @allow_dts = N’false’,
@replicate_ddl = 1, @allow_initialize_from_backup = N’false’, @enabled_for_p2p = N’false’,
@enabled_for_het_sub = N’false’
GO


exec sp_addpublication_snapshot @publication = N’DimAccount’, @frequency_type = 4, @frequency_interval = 1,
@frequency_relative_interval = 1, @frequency_recurrence_factor = 0, @frequency_subday = 8,
@frequency_subday_interval = 1, @active_start_time_of_day = 0, @active_end_time_of_day = 235959,
@active_start_date = 0, @active_end_date = 0, @job_login = null, @job_password = null, @publisher_security_mode = 1


use [AdventureWorksDW]
exec sp_addarticle @publication = N’DimAccount’, @article = N’DimAccount’, @source_owner = N’dbo’,
@source_object = N’DimAccount’, @type = N’logbased’, @description = null, @creation_script = null,
@pre_creation_cmd = N’drop’, @schema_option = 0x000000000803509F, @identityrangemanagementoption = N’manual’,
@destination_table = N’DimAccount’, @destination_owner = N’dbo’, @vertical_partition = N’false’,
@ins_cmd = N’CALL sp_MSins_dboDimAccount’, @del_cmd = N’CALL sp_MSdel_dboDimAccount’,
@upd_cmd = N’SCALL sp_MSupd_dboDimAccount’
GO
You can view the newly created publication’s properties by expanding the local publications folder, right-clicking the publication, and choosing Properties from the pop-up menu. The properties dialog box has several pages, each of which has a specific purpose:
  • General. Shows the publication’s name, description, type, and the database on which the publication is based. You can modify subscription expiration options from this page.
  • Articles. Lets you review the published articles, modify their properties, or add new articles to the publication.
  • Filter Rows. Allows you to create horizontal filters for articles.
  • Snapshot. Enables you to specify the snapshot folder location, snapshot format, or additional scripts to be executed before and after applying the snapshot.
  • FTP Snapshot. Settings that let you allow subscribers to download the snapshot from an FTP share and configure FTP security.
  • Agent Security. Controls security settings for the log reader and snapshot agents.
  • Publication Access List. Specifies SQL Server and Windows logins who have permissions to create and synchronize subscriptions.
  • Subscription Options. Provides a multitude of options for subscribers to the current publication.
The following table describes the subscription options you can set through the Publication Properties dialog box. Note that several of these options are new in SQL Server 2005.
Subscription Option
New in SQL Server 2005
Description
Independent Distribution Agent
No
Specifies whether to use an agent independent of other publications in the same database.
Snapshot Always Available
Yes
Makes snapshot files always be available for initializing subscriptions. This option requires an independent distribution agent.
Allow Anonymous Subscriptions
No
Supports anonymous subscriptions, typically used for replicating over the Internet. This option requires Snapshot Always Available.
Attachable Subscription Database
No
Specifies whether subscriptions can be created by attaching the publication database on the subscriber. This option requires Snapshot Always Available.
Allow Pull Subscriptions
No
Provides support for pull subscriptions.
Allow Initialization from Backup Files
Yes
Enables subscriptions to be created by restoring the publication database backup on the subscriber.
Allow Non-SQL Server Subscribers
No
Permits non-SQL Server subscribers.
Allow Data Transformations
No
Allows for data to be transformed before it’s sent to subscribers. This option is deprecated.
Replicate Schema Changes
Yes
Perhaps the most important change introduced with SQL Server 2005. If this option is set to YES (the default value), replication can deliver to the subscriber(s) any ALTER TABLE statements executed on publishers. In previous releases, some schema changes required removing articles from the publication and were therefore associated with downtime.
Allow Peer-to-Peer Subscriptions
Yes
Determines whether subscribers can participate in peer-to-peer relationship with publishers. Peer-to-peer topology allows multiple nodes to act as publishers and subscribers at the same time. This is similar to bidirectional replication available in previous releases.
Allow Immediate Updating Subscriptions
No
Controls whether data changes on subscriber(s) can be delivered immediately to the publisher.
Allow Queued Updating Subscriptions
No
Specifies whether data changes on subscriber(s) can be queued and delivered to the publisher at a later time. This option is useful if the network line between the publisher and subscriber isn’t always reliable.



Creating Subscriptions

Unlike previous versions, of SQL Server 2005 allows you to use the same wizard to create either pull or push subscriptions. To invoke the new subscription wizard right-click the publication (or the local subscriptions folder) and choose New Subscriptions from the pop-up menu. After you get past the introductory screen, select the publication for which you want to create subscription(s). Next, indicate whether you want to use pull or push subscriptions ( see Figure 13). Pull subscriptions reduce the load on the publisher, whereas push subscriptions make it easy to administer all subscriptions at the central location. For this example, I’ll use push subscriptions, but the wizard screens are nearly identical for pull subscriptions.
Figure
Figure 13
Next you choose a subscribing server and database, as shown in Figure 14. You can use an existing database or create a new database; if you choose to create a new database on the subscribing server, you’ll get the typical dialog box for creating databases. More interestingly, note that the wizard allows you to use a non-SQL Server subscriber. You can choose either an Oracle or IBM DB2 subscriber for push subscriptions; only SQL Server subscribers are supported if using pull subscriptions.
Figure
Figure 14
Non-SQL Server subscriptions have certain limitations and other considerations of which you need to be aware. If you plan to use a non-SQL Server subscription be sure to consult the Heterogeneous Database Replication topic in SQL Server Books Online.
After specifying the subscriber server and database, you need to configure distribution agent security ( see Figure 15). I’ll discuss replication agent security in greater detail in another article. For now, keep in mind that you can either impersonate the SQL Server Agent or use a separate Windows login or SQL Server login for the distribution agent. For this example, I’ll use the SQL Server Agent service account for running the distribution agent and for connecting to the subscriber.
Figure
Figure 15
Now it’s time to define a synchronization schedule—how often you want the replicated transactions to be delivered to the subscriber(s). Replicating transactions continuously is the best option if you want to achieve minimal latency; however, this option requires more work on the publisher for push subscriptions and on the subscriber for pull subscriptions. Scheduled delivery is a good option if you want to minimize the load during business hours and deliver commands only at certain times each day. On-demand delivery can be a viable option if you want to synchronize your databases only occasionally.
After indicating the desired synchronization schedule, you can initialize the subscription database ( see Figure 16). During initialization, replication creates the published objects’ schemas and copies data from the snapshot folder to the subscription database; in addition, the stored procedures used for replication are created in the subscriber database. In the dialog box, you can specify that you don’t want to initialize the subscriptions—this option is useful if the schema and data already exist on the subscriber. Other options are to initialize subscriptions immediately or at first synchronization—that is, the first time the snapshot agent runs.
Figure
Figure 16
You’re done specifying all the information that the wizard needs to create subscriptions. At this point, you have the option to script the subscription and/or to create subscriptions. The wizard allows you to review the summary of the steps it’s about to undertake before you click the Finish button.
Listing 3 shows a script for creating a subscription.
Listing 3 Sample script for creating a subscription.
-----------------BEGIN: Script to be run at Publisher ---------------
use [AdventureWorksDW]
GO
exec sp_addsubscription @publication = N’DimAccount’,
@subscriber = N’server\subscriber_instance’,
@destination_db = N’AdventureWorksDWSub’,
@subscription_type = N’Push’,
@sync_type = N’automatic’,
@article = N’all’,
@update_mode = N’read only’,
@subscriber_type = 0

exec sp_addpushsubscription_agent @publication = N’DimAccount’,
@subscriber = N’server\subscriber_instance’,
@subscriber_db = N’AdventureWorksDWSub’,
@job_login = null,
@job_password = null,
@subscriber_security_mode = 1,
@frequency_type = 64,
@frequency_interval = 0,
@frequency_relative_interval = 0,
@frequency_recurrence_factor = 0,
@frequency_subday = 0,
@frequency_subday_interval = 0,
@active_start_time_of_day = 0,
@active_end_time_of_day = 235959,
@active_start_date = 20060627,
@active_end_date = 99991231,
@enabled_for_syncmgr = N’False’,
@dts_package_location = N’Distributor’
To review subscription properties, expand the publication found in the local publications folder, right-click the subscription, and select Properties from the pop-up menu. Figure 17 shows the resulting screen for this example.
Figure
Figure 17
If we check the stored procedures folder on the subscriber database, we’ll find three new procedures that will be used for delivering the replicated transactions:
  • sp_MSupd_dboDimAccount
  • sp_MSdel_dboDimAccount
  • sp_MSins_dboDimAccount





Testing the Replication

Now that we’ve configured replication, we can run a few SQL statements to test it. I’ll execute the following UPDATE statement on the publisher:
UPDATE dimAccount
SET AccountDescription = ’work in progress’
WHERE AccountKey = 11
SQL Server informs me that this query affected one row. Next, I switch the database context to the distribution database and run the stored procedure sp_browsereplcmds. SQL Server shows the following command being delivered:
{CALL [sp_MSupd_dboDimAccount] (,,,,N’work in progress’,,,,,,11,0x1000)}
Finally, I query the subscribing database to see whether the changes made on the publisher were replicated to the subscriber:
SELECT
AccountDescription
FROM dimAccount
WHERE AccountKey = 11

Results:

AccountDescription
--------------------------------------------------
work in progress


ummary

In this article, I showed you how to set up transactional replication using SQL Server 2005. We focused on how to configure a distributor, create a publication, and subscribe to the publication. The example used in this article was very simplistic. In the real world, setting up replication will present many challenges, but this overview gives you enough of the general pattern for replication with SQL Server 2005 to get you started.
Future articles in this series will dig deeper into transactional replication. I’ll show you how to maintain and troubleshoot replication and use some of its advanced functionality.