Search This Blog

Wednesday, October 6, 2010

Data Guard implementation

DSDB Data Guard Reference




1. Why Data Guard



o Complete Data Protection, High Availability, and Disaster Recovery

o Balance data availability against performance using 3 data protection mode:

 Maximum availability


 Maximum protection


 Maximum performance

o Centralize and simplify management

o Safeguard against physical corruptions



2. Data Guard Architecture



o Primary database

o Standby database

o Log transport service

o Log apply service

o Gap detection and resolution

o Role management services



3. Log Transport Services



o LGWR. Log writer can write to the online redo log, archive log and and redo log for standby database, which can be local or remote.

o ARCn. Archiver can write to archive log and log for standby database.

o FAL (fetch archive log) client

o FAL server



4. Log Apply Services



o RFS (remote file server): receives redo logs from the primary database either in the form of archived redo logs or standby redo logs.

o Archiver (ARCn): If standby redo logs are being used, the ARCn process archives the standby redo logs that are to be applied by the managed recovery process (MRP).

o MRP (managed recovery process): applied information from the archived redo logs or standby redo logs to the standby database.



5. Gap Detection and Resolution



o Automatic Gap Resolution

 As the redo logs being sent to the standby log compared with the last one received by the RFS process

 In 9.2, arch process at primary site will poll the logs sequence on the standby site.

o Fal Gap Resolution

 As the RFS process on the standby receives an archived log, it updates the standby controlfile with info the log file. One the MRP sees the update to the controlfile, it will recover the file. If the MRP process finds the archived log is missing or corrupt, FAL is called to resolved the gap or obtain a new copy.

 FAL_SERVER and FAL_CLIENT

o Manually Resolving Gap

 Find the missing logs.

 Manually copy over the log files.

 Register the log file to the standby database. (alter database register log file ‘file_name’;

 Restart the MRP.



o Data Guard Configuration Setup



On Primary database side



Step 1 . from primary database you need to create pfile from spfile

Create spfile from pfile;



Step 2 : create locations for the archive logs



log_archive_dest_1='LOCATION=D:\oracle\product\10.2.0\prd\arch'


log_archive_dest_2='LOCATION=D:\oracle\product\10.2.0\std\arch'


Step 3 . insert below parameter in pfile . sample initprd.ora file for pfile;
prd.__db_cache_size=398458880


prd.__java_pool_size=4194304



prd.__large_pool_size=4194304



prd.__shared_pool_size=155189248



prd.__streams_pool_size=0

*.audit_file_dest='D:\oracle\product\10.2.0/admin/prd/adump'

*.background_dump_dest='D:\oracle\product\10.2.0/admin/prd/bdump'

*.compatible='10.2.0.1.0'

*.control_files='D:\ORACLE\PRODUCT\10.2.0\ORADATA\PRD\CONTROLFILE\O1_MF_6BPGGPX9_.CTL','D:\ORACLE\PRODUCT\10.2.0\FLASH_RECOVERY_AREA\PRD\CONTROLFILE\O1_MF_6BPGGQ5C_.CTL'

*.core_dump_dest='D:\oracle\product\10.2.0/admin/prd/cdump'

*.db_block_size=8192

*.db_create_file_dest='D:\oracle\product\10.2.0\oradata'

*.db_domain=''

*.db_file_multiblock_read_count=16

*.db_name='prd'

*.db_recovery_file_dest='D:\oracle\product\10.2.0/flash_recovery_area'

*.db_recovery_file_dest_size=2147483648

*.dispatchers='(PROTOCOL=TCP) (SERVICE=prdXDB)'

*.job_queue_processes=10

*.log_archive_dest_1='LOCATION=D:\oracle\product\10.2.0\prd\arch'

*.log_archive_dest_2='LOCATION=D:\oracle\product\10.2.0\std\arch'

*.log_archive_format='ARC%S_%R.%T'

*.open_cursors=300

*.pga_aggregate_target=189792256

*.processes=150

*.remote_login_passwordfile='EXCLUSIVE'

*.sga_target=570425344

*.undo_management='AUTO'

*.undo_tablespace='UNDOTBS1'

*.user_dump_dest='D:\oracle\product\10.2.0/admin/prd/udump'





###########################################################

## STandy parameters

###########################################################

log_archive_dest_state_1= enable

log_archive_dest_state_2= enable

remote_archive_enable = true

local_listener= '(ADDRESS=(PROTOCOL=tcp)(HOST=oracle)(PORT=1521))'

standby_file_management= auto

*.log_archive_dest_1='LOCATION=D:\oracle\product\10.2.0\prd\arch'

*.log_archive_dest_2='service="(description=(address=(protocol=tcp)(host=oracle)(port=1521))(connect_data=(service_name=std)))" optional lgwr async affirm'


Step 4 : shutdown the database open with this parameter file that is initprd.ora file
Startup pfile =’d:\oracle\product\10.2.0\db_1\database\initprd.ora’;

Step 5 : Create control file for standby database

Alter database create standby controlfile as ‘d:\control1.ctl’;


Step 6 : shutdown the database and take a cold backup of primary database to standby location ,


On Standby side

Step 1: Copy all the files of primary database to the standby database location including the control file .

Step 2 : preparing the initstd.ora file for standby database as fallows copy the initprd.ora file in to the /$oracle_home/database location and edit the parameters as fallows


prd.__db_cache_size=398458880



prd.__java_pool_size=4194304



prd.__large_pool_size=4194304



prd.__shared_pool_size=155189248



prd.__streams_pool_size=0

*.audit_file_dest='D:\oracle\product\10.2.0/admin/std/adump'

*.background_dump_dest='D:\oracle\product\10.2.0/admin/std/bdump'

*.compatible='10.2.0.1.0'

*.control_files='D:\ORACLE\PRODUCT\10.2.0\ORADATA\std\CONTROLFILE\control01.CTL'

*.core_dump_dest='D:\oracle\product\10.2.0/admin/std/cdump'

*.db_block_size=8192

*.db_create_file_dest='D:\oracle\product\10.2.0\oradata'

*.db_domain=''

*.db_file_multiblock_read_count=16

*.db_name='prd'

*.instance_name='std'

*.service_names='std'

##*.db_name='prd'

*.db_recovery_file_dest='D:\oracle\product\10.2.0/flash_recovery_area'

*.db_recovery_file_dest_size=2147483648

*.dispatchers='(PROTOCOL=TCP) (SERVICE=stdXDB)'

*.job_queue_processes=10

*.log_archive_dest_1='LOCATION=D:\oracle\product\10.2.0\std\arch'

*.log_archive_dest_2='LOCATION=D:\oracle\product\10.2.0\std\arch'

*.log_archive_format='ARC%S_%R.%T'

*.open_cursors=300

*.pga_aggregate_target=189792256

*.processes=150

*.remote_login_passwordfile='EXCLUSIVE'

*.sga_target=570425344

*.undo_management='AUTO'

*.undo_tablespace='UNDOTBS1'

*.user_dump_dest='D:\oracle\product\10.2.0/admin/std/udump'

###########################################################

## STandy parameters

###########################################################

log_archive_dest_1 = ('location=D:\oracle\product\10.2.0\std')

log_archive_dest_state_1 = enable

log_archive_start = true

standby_archive_dest = "D:\oracle\product\10.2.0\std"

standby_file_management = auto

remote_archive_enable = true

log_archive_trace = 127

fal_server = prd

fal_client = std

*.DB_FILE_NAME_CONVERT='D:\oracle\PRODUCT\10.2.0\oradata\prd\','D:\oracle\PRODUCT\10.2.0\oradata\std\'

*.LOG_FILE_NAME_CONVERT='D:\oracle\product\10.2.0\oradata\prd\','D:\oracle\10.2.0\product\10.2.0\oradata\std\'

lock_name_space=std

we need to add all databsae base names in tnsname.ora, and listner.ora. mak soure this two database name are listion by the listener


Step 3: Create a service for the database by database

oradim -new -sid std -syspwd abbas -startmode AUTO -pfile D:\oracle\product\10.2.0\admin\ORCL1\pfile\initstd.ora
After creating the service then we need to stand the stadby database in nomount state .

Step 4 : Starting the standby database in nomount state


SQL> startup nomount pfile=’ D:\oracle\product\10.2.0\admin\ORCL1\pfile\initstd.ora’;


Step 5 : starting the standby database in mount state ;


SQL > alter database mount standby database;



o Put the standby database in managed recovery mode.
Alter database recover managed standby database disconnect



o Put the primary database in maximum performance mode and set force logging.


alter database set standby database to maximize availability;

alter database force logging



6. Data guard switchover procedure



The switchover is a graceful way to change the role of the primary database and data guard database. It will not lose any committed transaction during the switch and can be used in controlled DR scenario.



o ON THE PRIMARY SITE:

 Quiesce the database and get ready for the switchover



alter system set job_queue_processes=0;

alter system set aq_tm_processes=0;



 (optional) If the primary database is RAC enabled, leave only the primary instance up and shut down all the other instances.



 Switch the primary database into data guard mode. The following command will try to disconnect all the active sessions, stop the processing on the primary database, make sure all the online redo log files get sent over to the remote data guard site before committing to data guard mode. Depending on the activity of the database right before we do the switch, it will take anywhere between 20 minutes to 2 hours to finish.



alter database commit to switchover to physical standby with session shutdown;



• replace the primary init parameter file with the data guard init parameter file and mount database in data guard mode.



shutdown immediate


startup nomount


alter database mount standby database


alter database recover managed standby database disconnect;



o ON THE DATA GUARD SITE.



 (optional) check on the v$database view and make sure you got “SWITCHOVER PENDING” status in switchover_status column.. Lots of time we will get “SESSIONS ACTIVE” status back. It is still OK to move forward with switching the data guard database to primary.



Select switchover_status from v$database;



 (optional) If this is RAC enabled, make sure only the primary instance is up and running. Otherwise the following switch command will fail.



 Get the old data guard database to finish applying the archivelogs and switch over to primary database. If this is the first time to do the switchover, the online redo logs needs to be generated and you might get log files not exist. We can ignore those errors. Depends on the number of archivelogs need to be applied and the total size of online redo log files since all the online redo log files will be reset.



Alter database commit to switchover to primary with session shutdown;



 Replace the data guard init parameter file with the primary init parameter and recycle the database.



7. Data Guard Database Failover Procedure



This is when the primary database is completely lost during some unforeseeable event and the only thing we have is the data guard database.



 First finish applying the log files that has not been applied to the data guard databass.

Alter database recover managed standby database finish;



 If a standby redo log is not used, the above command may fail. Then we can use following command:

alter database recover managed standby database finish skip standby logfile;


 Then we can switch the database to primary:

Alter database commit to switchover to primary;


 If the switchover failed, then try to activate the database:



alter database activate standby database;



o After the switch or activation finish, an instance recycle is needed before turning the database over to the application:



 (optional) If you use temporary tempfile and the dba_temp_files view is empty, add tempfile accordingly.



Alter tablespace estm01 add tempfile ‘/dev/vx/rdsk/Otm01’ size 1m;



8. Disable the Data Guard database from Primary database


 In case of network issue, the primary database may get stopped or hung. Issue the following command to free it up:

Alter system set log_archive_dest_state_2 = defer;

o After the network issue is resolved, we need to re-establish the Data Guard setting: Sometimes we see this command doesn’t work in RAC database and the instance recycle is needed.

Alter system set log_archive_dest_state_2 = enable;



 (optional) If the standby database is RAC enabled, don’t bring up the second instance. It won’t be needed for the Data Guard process and will prevent the first instance from finishing the switching to primary database if needs to.



9. OPEN DATA GUARD DATABASE AS READONLY DATABASE



o If we need to open a read only standby database at a particular time:



Alter database recover managed standby database cancel;

Alter database open read only;



10. Common Issues and Recommendations



• Try to use maximum performance mode and lgwr async or even arch option with the remote destination when the data concurrency is not a high priority during the DR time. The maximum availability mode and lgwr sync option can incur anywhere from 100% to 400% overhead to the primary database, depending on the network and the nature of the application. If we do have to set up the maximum availability mode and lgwr sync option, make sure the application is very happy with performance and there won’t be significant increase with the workload in the foreseeable future.



• The following underscore parameter was introduced as of 9.2.0.5 to allow archiver write to local destination first before it try to write to the remote destination. This will free up the primary database from waiting on the slow transfer of the archive log over the remote network.



_LOG_ARCHIVE_CALLOUT='LOCAL_FIRST=TRUE'



• We can use multiple archiver if the network is so slow that the remote archiving can never keep up with the load of the primary database.



LOG_ARCHIVE_MAX_PROCESSES = n

• Following init parameter can be implemented to reduce contention with log writer process.

_lgwr_async_io = FALSE

• Change the SDU size to the maximum 32k from default 2k can help improve the remote archiving performance. The following setting can be put in the sqlnet.ora file starting 9.2.0.4.

DEFAULT_SDU_SIZE = 32767

• ARCHIVE_LAG_TARGET limits the amount of data that can be lost and effectively increases the availability of the standby database by forcing a log switch after a user-specified time period elapses.

archive_lag_target= 3600