Change Data Capture(CDC) in SQL Server 2008
Change Data Capture is a process of logging of all the DML operations, applied on source tables, into a change tables. The change tables look similar to your source tables with extra Meta data information used to track changes done on the modified rows.
Change data capture is available only on the Enterprise, Developer, and Evaluation editions of SQL Server.
Change Data Capture Data Flow:
1.The source of change data for change data capture is the SQL Server transaction log.
2.As inserts, updates, and deletes are applied to tracked source tables, entries that describe those changes
are added to the log. The log serves as input to the change data capture capture process.
3.This reads the log and adds information about changes to the tracked table’s associated change table.
4.Functions are provided to enumerate the changes that appear in the change tables over a specified range,
returning the information in the form of a filtered result set.
5.The filtered result set is typically used by an application process to update a representation of the source in
some external.
Important CDC related procedures given below:
- sys.sp_cdc_enable_db
- sys.sp_cdc_disable_db
- sys.sp_cdc_enable_table
sys.sp_cdc_enable_db:
Creates all system objects related to CDC, such as tables, jobs, sprocs & functions. It also creates the CDC schema and CDC db user and sets the is_cdc_enabled column in the sys.databases catalog view to 1.
Note: Change Data Capture (CDC) cannot be enabled on system or distributed data bases.
Ex:
USE AdventureWorks;
EXEC sys.sp_cdc_enable_db
GO
sys.sp_cdc_disable_db:
Disables change data capture for all tables in the data bases currently enabled. All system objects related to change data capture, such as change tables, joins, sprocs, and functions are dropped.
The is_cdc_enabled column in the sys.databases catalog view is set to 0.
Ex:
USE AdventureWorks
EXEC sys.sp_cdc_disable_db
When you execute “sp_cdc_enable_db” sproc it will creates a user & schema name cdc in the current db.
Note: CDC is available only in SQL Server 2008 Enterprise, Dev, and Evaluation Editions.
sys.sp_cdc_enable_table:
Syntax:
Sys.sp_cdc_enable_table
[@source_schema=] ‘source_schema’,
[@source_name=] ‘source_name’,
[@role_name=] ‘role_name’,
[,[@capture_instance=] ‘capture_instance’]
[,[@support_net_changes=] ‘support_net_changes’]
[,[@index_name=] ‘index_name’]
[,[@captured_column_list=] ‘captured_column_list’]
[,[@filegroup_name=] ‘filegroup_name’]
[,[@partition_switch=] ‘partition_switch’]
Note: Tables in the CDC schema cannot ne enabled for change data capture.
@role_name = If the role currently exists, it is used, if the role does not exist, an attempt is made to create a db role with the specified name. The role name is trimmed of white space at the right of the string before attempting to create the role.
If the caller is not authorized to create a role within the database, the stored procedure operation fails.
@capture_instance = If not specified, the name is derived from the source schema name plus the source table name in the format schemaname_sourcename. @capture_instance cannot exceed 100 characters and must be unique within the data base. A source table can have a maximum of two capture instances.
There is a new column named is_tracked_by_cdc in sys.tables; you can query it to determine whether CDC will create certain tables, jobs, sprocs, and functions in the CDC enabled data bases.
Note: Disabling CDC at the table and/or db level will drop the respective tables, jobs, stored procedures and functions that were created in the data base when CDC was enabled.
To extract the changes for a table that has CDC enabled, you have to supply the relevant LSN (Log Sequence Number). An LSN is a number that uniquely identifies entries in the database transaction log.
If this is the first time you are querying to extract changes, you can get the min LSN and max LSN using the functions sys.fn_cdc_get_min_lsn() and sys.fn_cdc_get_max_lsn().
If you set @support_net_changes = 1 when enabling CDC on the table, you can query for the net changes using CDC. The function cdc.fn_cdc_get_net_changes_dbo_A(). This will group multiple changes to a row based on the primary key or unique index you specified when enabling CDC.
You can always invoke cdc.fn_cdc_get_net_changes_dbo_A() to retrieve every change to the table within the LSN range.
Comments
Post a Comment