Replicating RAC database using RMAN at Remote Server

Posted By Sagar Patil

Here I am duplicating an 11g RAC database from one RHEL server to another using the old 10g method, i.e. restoring and recovering from an RMAN backup. I could have used the 11g "DUPLICATE TARGET DATABASE TO target_db FROM ACTIVE DATABASE" feature, which doesn't need a previous RMAN backup at the source, but it may not be a good option for large databases or for sites with narrow network bandwidth.
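For reference, active duplication would look roughly like the sketch below; the auxiliary connect string and the DUP database name are placeholders, and the auxiliary instance would first need to be started NOMOUNT with a minimal pfile.

[oracle@NewNode dbs]$ rman TARGET sys/password@RAC AUXILIARY sys/password@DUP
RMAN> DUPLICATE TARGET DATABASE TO DUP
        FROM ACTIVE DATABASE
        NOFILENAMECHECK;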

Assumptions Made:

- RAC Clusterware and database binaries are installed on the destination nodes
- Clusterware is up on both destination nodes ("crsctl check crs" reports its services online; sample output below)
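Typical 11gR2 output of the check looks like this (illustrative):

[oracle@Node1 ~]$ crsctl check crs
CRS-4638: Oracle High Availability Services is online
CRS-4537: Cluster Ready Services is online
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online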

PRIMARY site Tasks (Ora01a1,Ora01a2):

  • Create a FULL RMAN backup (a sample backup run is sketched after this list)
  • Copy the backup files from the PRIMARY server to the new server
  • Create a pfile from the spfile at the source RAC
  • Copy init.ora from $Primary_Server:ORACLE_HOME/dbs to $New_Server:ORACLE_HOME/dbs
  • Copy the password file from $Primary_Server:ORACLE_HOME/dbs to $New_Server:ORACLE_HOME/dbs
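A minimal full backup of the database plus archived logs, with control file autobackup enabled so that the c-<dbid>-<date> pieces restored later are produced; the backup destination and format strings here are illustrative, not the ones used on Ora01a1:

[oracle@Ora01a1 ~]$ rman target / nocatalog
RMAN> CONFIGURE CONTROLFILE AUTOBACKUP ON;
RMAN> BACKUP AS COMPRESSED BACKUPSET DATABASE
        FORMAT '/backup/db_bk_%u_s%s_p%p_t%t.bkp';
RMAN> BACKUP ARCHIVELOG ALL
        FORMAT '/backup/arch_bk_%u_s%s_p%p_t%t.bkp';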

[oracle@Ora01a1 RAC1]$ scp Ora01a1BKUP.tgz oracle@Node1:/mnt/data
Warning: Permanently added (RSA) to the list of known hosts.
Ora01a1BKUP.tgz                                                          100%  274MB  11.4MB/s   00:24

SQL> show parameter pfile
NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
spfile                               string      /mnt/data/oradata/primary/spfileRAC.ora

SQL> show parameter spfile
NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
spfile                               string      /mnt/data/oradata/primary/spfileRAC.ora

SQL> create pfile='/mnt/data/oradata/primary/init.ora' from spfile;
File created.

[oracle@Ora01a1 RAC]$ scp init.ora oracle@Node1:/mnt/data/rman_backups/bkup/init.ora
init.ora                                                                 100% 1612     1.6KB/s   00:00
[oracle@Ora01a1 dbs]$ scp /mnt/data/oradata/primary/orapwRAC oracle@Node1:/mnt/data/rman_backups/bkup
orapwRAC                                                                 100% 1536     1.5KB/s   00:00

Destination Site Tasks (Node1,Node2)
Create the required directories (adump and the diagnostic destination) as well as the database mount volumes.

[oracle@Node1]$ grep /mnt initRAC.ora
*.control_files='/mnt/data/oradata/primary/control01.ctl','/mnt/data/oradata/primary/control02.ctl'
*.db_recovery_file_dest='/mnt/logs/oradata/primary/fast_recovery_area'
*.log_archive_dest_1='LOCATION=/mnt/logs/oradata/primary/arch'

[oracle@Node1]$ mkdir -p /mnt/data/oradata/primary/
[oracle@Node1]$ mkdir -p /mnt/logs/oradata/primary/fast_recovery_area
[oracle@Node1]$ mkdir -p /mnt/logs/oradata/primary/arch

"/opt" is a local volume on each node, so create these directories on both RAC nodes:
[oracle@Node1]$ grep /opt initRAC.ora
*.audit_file_dest='/opt/app/oracle/admin/primary/adump'
*.diagnostic_dest='/opt/app/oracle'

[oracle@Node1]$ mkdir -p /opt/app/oracle/admin/primary/adump
[oracle@Node1]$ mkdir -p /opt/app/oracle

[oracle@Node2]$ mkdir -p /opt/app/oracle/admin/primary/adump
[oracle@Node2]$ mkdir -p /opt/app/oracle

Under 11g, background traces are written under "$ORACLE_BASE/diag/rdbms"; if required, create the necessary directories there as well.

Modify the init.ora file ($ORACLE_HOME/dbs/init.ora) and amend the parameters as needed. I had to comment out the "remote_listener" parameter because the server names at the destination are different.
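The commented-out entry ends up looking like this (the SCAN name shown is hypothetical):

[oracle@Node1 dbs]$ grep remote_listener initRAC.ora
#*.remote_listener='ora01-scan:1521'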

Copy the init.ora into $ORACLE_HOME/dbs on both nodes (Node1, Node2).

[oracle@Node1 dbs]$ cp initRAC.ora initRAC1.ora

[oracle@Node1 dbs]$ echo $ORACLE_SID
RAC1
SQL> startup nomount;
ORACLE instance started.
Total System Global Area 9152860160 bytes
Fixed Size                  2234056 bytes
Variable Size            6945769784 bytes
Database Buffers         2181038080 bytes
Redo Buffers               23818240 bytes

[oracle@Node1 dbs]$ rman target / nocatalog
connected to target database: RAC (not mounted)
using target database control file instead of recovery catalog
RMAN> restore controlfile from '/mnt/data/rman_backups/bkup/c-4020163152-20110405-01';
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=588 instance=RAC1 device type=DISK
channel ORA_DISK_1: restoring control file
channel ORA_DISK_1: restore complete, elapsed time: 00:00:03
output file name=/mnt/data/oradata/primary/control01.ctl
output file name=/mnt/data/oradata/primary/control02.ctl
...
Finished restore at 05-APR-11

Verify that the control files have been restored to the right location:

[oracle@Node2 primary]$ pwd
/mnt/data/oradata/primary

[oracle@Node2 primary]$ ls -lrt
-rw-r-----  1 oracle oinstall 22986752 Apr  5 16:35 control01.ctl
-rw-r-----  1 oracle oinstall 22986752 Apr  5 16:35 control02.ctl

RMAN> alter database mount;
database mounted
RMAN> RESTORE DATABASE;
Starting restore at 05-APR-11
Starting implicit crosscheck backup at 05-APR-11
allocated channel: ORA_DISK_1
allocated channel: ORA_DISK_2
******* This returned errors
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of restore command at 04/05/2011 16:36:54
RMAN-06026: some targets not found – aborting restore
RMAN-06023: no backup or copy of datafile 3 found to restore

RMAN was not able to locate the backups because the backup pieces were copied to a different location and are not registered in the target's RMAN repository (the control file). Let's catalog the backup pieces that were shipped from the primary database.

Since there are multiple backup files in the directory, I cataloged the whole directory in one go:
RMAN> CATALOG START WITH '/mnt/data/rman_backups/bkup/' NOPROMPT;
List of Cataloged Files
=======================
File Name: /mnt/data/rman_backups/bkup/c-4020163152-20110405-04
File Name: /mnt/data/rman_backups/bkup/c-4020163152-20110405-05
File Name: /mnt/data/rman_backups/bkup/db_bk_ub8m91cg7_s3432_p1_t747680263.bkp
File Name: /mnt/data/rman_backups/bkup/db_bk_ub9m91cg8_s3433_p1_t747680264.bkp
File Name: /mnt/data/rman_backups/bkup/db_bk_ubam91cg9_s3434_p1_t747680265.bkp
File Name: /mnt/data/rman_backups/bkup/db_bk_ubcm91cgi_s3436_p1_t747680274.bkp
File Name: /mnt/data/rman_backups/bkup/db_bk_ubdm91cgi_s3437_p1_t747680274.bkp
File Name: /mnt/data/rman_backups/bkup/db_bk_ubbm91cgi_s3435_p1_t747680274.bkp
File Name: /mnt/data/rman_backups/bkup/db_bk_ubem91ck0_s3438_p1_t747680384.bkp
File Name: /mnt/data/rman_backups/bkup/db_bk_ubfm91ck0_s3439_p1_t747680384.bkp
File Name: /mnt/data/rman_backups/bkup/ctl_bk_ubhm91ck3_s3441_p1_t747680387.bkp

RMAN> RESTORE DATABASE;
channel ORA_DISK_1: starting datafile backup set restore
channel ORA_DISK_1: specifying datafile(s) to restore from backup set
channel ORA_DISK_1: restoring datafile 00001 to /mnt/data/oradata/primary/system01.dbf
channel ORA_DISK_1: restoring datafile 00005 to /mnt/data/oradata/primary/undotbs02.dbf
channel ORA_DISK_1: reading from backup piece /mnt/data/rman_backups/bkup/db_bk_ubdm91cgi_s3437_p1_t747680274.bkp
.
..
channel ORA_DISK_1: piece handle=/mnt/data/rman_backups/bkup/db_bk_ubdm91cgi_s3437_p1_t747680274.bkp tag=TAG20110405T165753
channel ORA_DISK_1: restored backup piece 1
channel ORA_DISK_1: restore complete, elapsed time: 00:01:26
channel ORA_DISK_2: piece handle=/mnt/data/rman_backups/bkup/db_bk_ubcm91cgi_s3436_p1_t747680274.bkp tag=TAG20110405T165753
channel ORA_DISK_2: restored backup piece 1
channel ORA_DISK_2: restore complete, elapsed time: 00:01:26
channel ORA_DISK_3: piece handle=/mnt/data/rman_backups/bkup/db_bk_ubbm91cgi_s3435_p1_t747680274.bkp tag=TAG20110405T165753
channel ORA_DISK_3: restored backup piece 1
channel ORA_DISK_3: restore complete, elapsed time: 00:01:56
Finished restore at 05-APR-11

RMAN> recover database;
channel ORA_DISK_1: starting archived log restore to default destination
channel ORA_DISK_1: restoring archived log
archived log thread=1 sequence=3289
channel ORA_DISK_1: reading from backup piece /mnt/data/rman_backups/bkup/db_bk_ubem91ck0_s3438_p1_t747680384.bkp
channel ORA_DISK_2: starting archived log restore to default destination
channel ORA_DISK_2: restoring archived log
archived log thread=2 sequence=3484
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of recover command at 04/05/2011 17:17:46
RMAN-06054: media recovery requesting unknown archived log for thread 1 with sequence 3290 and starting SCN of 246447604

The RMAN-06054 error is expected here: recovery has applied every archived log shipped with the backup and is now asking for the next sequence, which does not exist yet, so we stop and open the database with RESETLOGS.

RMAN> ALTER DATABASE OPEN RESETLOGS;
database opened

Shut down and restart instance RAC1.

SQL> shutdown abort;
ORACLE instance shut down.

SQL> set sqlprompt '&_CONNECT_IDENTIFIER > '
RAC1> startup;
ORACLE instance started.
Total System Global Area 9152860160 bytes
Fixed Size                  2234056 bytes
Variable Size            6945769784 bytes
Database Buffers         2181038080 bytes
Redo Buffers               23818240 bytes
Database mounted.
Database opened.

RAC1> set linesize 200;
RAC1> set pagesize 20;
RAC1> select inst_id,substr(member,1,35) from gv$logfile;
INST_ID SUBSTR(MEMBER,1,35)
———- ——————————————————————————————————————————————–
1 /mnt/data/oradata/primary/redo02.log
1 /mnt/data/oradata/primary/redo01.log
1 /mnt/data/oradata/primary/redo03.log
1 /mnt/data/oradata/primary/redo04.log

Instance 2 redo log files are not listed, so start the RAC2 instance on Node2.

[oracle@Node2 dbs]$ echo $ORACLE_SID
RAC2
SQL> set sqlprompt '&_CONNECT_IDENTIFIER > '
RAC2> startup;
ORACLE instance started.
Total System Global Area 9152860160 bytes
Fixed Size                  2234056 bytes
Variable Size            6610225464 bytes
Database Buffers         2516582400 bytes
Redo Buffers               23818240 bytes
Database mounted.
Database opened.

I can now see the redo log members for both instance 1 and instance 2:

select inst_id,substr(member,1,35) from gv$logfile;
INST_ID SUBSTR(MEMBER,1,35)
———- ——————————————————————————————————————————————–
2 /mnt/data/oradata/primary/redo02.log
2 /mnt/data/oradata/primary/redo01.log
2 /mnt/data/oradata/primary/redo03.log
2 /mnt/data/oradata/primary/redo04.log
1 /mnt/data/oradata/primary/redo02.log
1 /mnt/data/oradata/primary/redo01.log
1 /mnt/data/oradata/primary/redo03.log
1 /mnt/data/oradata/primary/redo04.log
8 rows selected.

I will carry out a few log switches to confirm that archive files are created at the archive destination "/mnt/logs/oradata/primary/arch".

RAC1 > alter system switch logfile;
System altered.
RAC1 > /
System altered.
RAC2 > alter system switch logfile;
System altered.
RAC2 > /
System altered.
[oracle@Node2 arch]$ pwd
/mnt/logs/oradata/primary/arch
[oracle@Node2 arch]$ ls -lrt
total 5348
-rw-r-----  1 oracle oinstall  777216 Apr  6 10:00 1_10_747681489.arc
-rw-r-----  1 oracle oinstall    4096 Apr  6 10:00 1_11_747681489.arc
-rw-r-----  1 oracle oinstall 4667392 Apr  6 10:00 2_11_747681489.arc
-rw-r-----  1 oracle oinstall   56832 Apr  6 10:01 2_12_747681489.arc

There are some background jobs in this database, so I will disable the job queue on both instances for the time being:

RAC1 > alter system set job_queue_processes=0;
System altered.

RAC2 > alter system set job_queue_processes=0;
System altered.

Check the alert logs on Node1/Node2 for any reported errors before registering the database with CRS.

RAC1> create spfile from pfile;
File created.

[oracle@Node1 dbs]$ pwd
/opt/app/oracle/product/11.2/db_1/dbs
-rw-r-----  1 oracle oinstall     3584 Apr  6 10:20 spfileRAC1.ora

Move the spfile to a shared clustered location accessible to both nodes/instances RAC1/RAC2.

[oracle@Node1 dbs]$ cp spfileRAC1.ora /mnt/data/oradata/primary/spfileRAC.ora

[oracle@(RAC1 or RAC2)]$ df -k
/dev/mapper/System-Opt        20314748  14636172   4630208  76% /opt       <- local storage
NETAPP_Server:/vol/prod_data  52428800  33919456  18509344  65% /mnt/data  <- clustered storage

[oracle@RAC1 PROD]$ ls -l /mnt/data/oradata/primary/spfile*
-rw-r----- 1 oracle oinstall 7680 May 10 15:18 spfileRAC.ora

Link the individual init files on Node1/Node2 to the shared spfile:

[oracle@RAC1]$ cd $ORACLE_HOME/dbs

[oracle@RAC1 dbs]$ cat initRAC1.ora
SPFILE='/mnt/data/oradata/primary/spfileRAC.ora'

[oracle@RAC2 dbs]$ cat initRAC2.ora
SPFILE='/mnt/data/oradata/primary/spfileRAC.ora'
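If these pointer files need to be (re)created, a one-liner on each node will do (same shared path as above):

[oracle@RAC1 dbs]$ echo "SPFILE='/mnt/data/oradata/primary/spfileRAC.ora'" > $ORACLE_HOME/dbs/initRAC1.ora
[oracle@RAC2 dbs]$ echo "SPFILE='/mnt/data/oradata/primary/spfileRAC.ora'" > $ORACLE_HOME/dbs/initRAC2.ora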

Registering the database with CRS:

[oracle@Node1 dbs]$ srvctl add database -d RAC -o /opt/app/oracle/product/11.2/db_1 -p  /mnt/data/oradata/primary/spfileRAC.ora
[oracle@Node1 dbs]$ srvctl add instance -d RAC -i RAC1 -n Node1
[oracle@Node1 dbs]$ srvctl add instance -d RAC -i RAC2 -n Node2
[oracle@Node2 arch]$ crsstat.sh  | grep RAC
ora.RAC.db                                 OFFLINE    OFFLINE

Before using the services, check that the cluster configuration is correct:

[oracle@Node1 dbs]$ srvctl config database -d RAC
Database unique name: RAC
Database name:
Oracle home: /opt/app/oracle/product/11.2/db_1
Oracle user: oracle
Spfile: /mnt/data/oradata/primary/spfileRAC.ora
Domain:
Start options: open
Stop options: immediate
Database role: PRIMARY
Management policy: AUTOMATIC
Server pools: RAC
Database instances: RAC1,RAC2
Disk Groups:
Mount point paths:
Services:
Type: RAC
Database is administrator managed

[oracle@Node1 dbs]$ srvctl start database -d RAC
PRCR-1079 : Failed to start resource ora.RAC.db
CRS-5017: The resource action “ora.RAC.db start” encountered the following error:
ORA-29760: instance_number parameter not specified

Solution to the Problem
srvctl is case sensitive, so the instance and database names defined in the spfile/pfile must match the case of the names registered in the OCR and used in the srvctl commands. I made a mistake here and registered the instance names in lowercase while creating the services, whereas the spfile defines them in uppercase (RAC1/RAC2).
Before going into the solution, make sure ORACLE_SID reflects the correct case so that the instance can be accessed using SQL*Plus.
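A quick way to double-check the exact case the database itself uses is to query the instance names (from either node, while the instances are up):

RAC1 > select instance_number, instance_name from gv$instance;

The names returned must match, case for case, what is passed to "srvctl add instance -i".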

I will have to remove the services created earlier and add them back with the uppercase instance names.

[oracle@Node1 dbs]$ srvctl remove database -d RAC

Remove the database RAC? (y/[n]) y
[oracle@Node1 dbs]$ srvctl remove instance -d RAC -i RAC1
PRCD-1120 : The resource for database RAC could not be found.
PRCR-1001 : Resource ora.RAC.db does not exist
[oracle@Node1 dbs]$ srvctl remove instance -d RAC -i RAC2
PRCD-1120 : The resource for database RAC could not be found.

[oracle@Node1 dbs]$ srvctl add database -d RAC -o /opt/app/oracle/product/11.2/db_1 -p /mnt/data/oradata/primary/spfileRAC.ora
[oracle@Node1 dbs]$ srvctl add instance -d RAC -i RAC1 -n Node1
[oracle@Node1 dbs]$ srvctl add instance -d RAC -i RAC2 -n Node2
[oracle@Node2 arch]$ crsstat.sh  | grep RAC
ora.RAC.db                                 OFFLINE OFFLINE

Moment of truth: start the database.

[oracle@Node1 dbs]$ srvctl start database -d RAC
[oracle@Node1 dbs]$ crsstat.sh  | grep RAC
ora.RAC.db                                 ONLINE ONLINE on Node1

[oracle@Node1 ~]$ export ORACLE_SID=RAC1
SQL> set sqlprompt '&_CONNECT_IDENTIFIER > '
RAC1 > archive log list;
Database log mode              Archive Mode
Automatic archival             Enabled
Archive destination            /mnt/logs/oradata/primary/arch
Oldest online log sequence     18
Next log sequence to archive   19
Current log sequence           19

SQL> set sqlprompt '&_CONNECT_IDENTIFIER > '
RAC2 > archive log list;
Database log mode              Archive Mode
Automatic archival             Enabled
Archive destination            /mnt/logs/oradata/primary/arch
Oldest online log sequence     20
Next log sequence to archive   21
Current log sequence           21

Finally, have a look at the alert logs for any reported issues.
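For example (the exact path under the ADR depends on db_unique_name and the instance; the one below is an assumption):

[oracle@Node1 ~]$ tail -100 /opt/app/oracle/diag/rdbms/rac/RAC1/trace/alert_RAC1.log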

To test session failover, I will create a SQL*Plus connection and check whether it migrates to the other node when its instance goes down.
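For the sessions to fail over transparently, the client needs a TAF-enabled connect descriptor along these lines; the VIP host names and service name below are assumptions, not taken from this setup:

RAC_TAF =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = Node1-vip)(PORT = 1521))
    (ADDRESS = (PROTOCOL = TCP)(HOST = Node2-vip)(PORT = 1521))
    (LOAD_BALANCE = yes)
    (CONNECT_DATA =
      (SERVER = DEDICATED)
      (SERVICE_NAME = RAC)
      (FAILOVER_MODE = (TYPE = SELECT)(METHOD = BASIC)(RETRIES = 20)(DELAY = 5))
    )
  )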

SQL> select machine from v$session where rownum <5;
MACHINE
—————————————————————-
Node1
Node1
Node1
Node1

Node1 RAC1> shutdown abort;
ORACLE instance shut down.

SQL> select machine from v$session where rownum <5;
MACHINE
—————————————————————-
Node2
Node2
Node2
Node2
