Thursday, September 15, 2011

DataGuard Process Architecture

Data Guard operations involve a number of processes on both the primary and standby sites.



Let us see what each process is used for one by one,

In the Primary Site,

LGWR or ARCH: The redo log/archived log data from the primary database is shipped to the standby site by the LGWR or ARCH process.

FAL (Fetch Archive Log): When there is a communication loss between the primary and the standby (for example, a network failure), log shipping is affected and gaps may arise. When communication is re-established, the FAL processes enable automatic gap resolution and resynchronization.

In the Standby Site,

RFS (Remote File Server): The RFS process receives the redo records from the Primary site.

MRP (Managed Recovery Process): MRP applies the redo records to the physical standby database.

LSP (Logical Standby Process): The LSP process translates the redo records into SQL and applies the SQL to the logical standby database.

If the Data Guard Broker is enabled, then the DMON process runs on both the primary and standby sites.
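You can see most of these processes at work on the standby by querying V$MANAGED_STANDBY (a quick check, not part of the original post; the columns shown are standard):

SQL> select process, status, sequence# from v$managed_standby;

The PROCESS column shows entries such as RFS, MRP0 and ARCH, along with the log sequence each one is working on.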

Thanks

Difference between Physical and Logical Standby

A Data Guard configuration consists of one production database and up to nine standby databases. The production database can be a single-instance or RAC database; similarly, a standby database can be single instance or RAC. It is also possible to have a single-instance standby database for a RAC production database.

The standby database is a transactionally consistent copy of the production database. It is created initially from a backup of the production database. Once created, Data Guard automatically keeps the standby synchronized.

There are two types of standby databases:

1. Physical Standby
2. Logical Standby

A physical standby uses media recovery to synchronize the standby database, whereas a logical standby uses the SQL Apply method to keep the two databases synchronized.
The following table lists the differences between physical and logical standby databases,

1. Physical Standby: Identical to the primary database, including the physical organization on disk.
   Logical Standby: Contains the same logical information, but the physical organization and structure of the data are different.

2. Physical Standby: Data Guard uses Redo Apply technology, which applies redo data using standard recovery techniques.
   Logical Standby: Data Guard uses SQL Apply, which first transforms the redo data into SQL statements and then executes the statements.

3. Physical Standby: Can be used for backups.
   Logical Standby: Can be opened for reporting.

4. Physical Standby: All data types are supported.
   Logical Standby: Not all data types are supported; for example LONG, LONG RAW, NCLOB, BFILE and XMLType are not supported.

5. Physical Standby: Can be opened read only, but redo cannot be applied while it is open; from 11g it can stay open read only while redo is applied (Active Data Guard), or be converted to a snapshot standby for read/write use.
   Logical Standby: Can be open in normal (read/write) mode while logs are being applied.

6. Physical Standby: No additional objects can be created.
   Logical Standby: Additional objects such as indexes and materialized views can be created.



Usually organizations use logical standby databases mainly for reporting and not for failover/switchover operations; for failover and switchover they use a physical standby database. The reason is that maintaining a logical standby is almost a full-time job: the log apply services need extensive tuning, and in practice the logical standby is often 3 to 5 hours behind the live database, making it unsuitable for failover/switchover.
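If you need to confirm which role a database currently has, a simple check (illustrative, not from the original post) is:

SQL> select database_role, open_mode from v$database;

DATABASE_ROLE returns PRIMARY, PHYSICAL STANDBY or LOGICAL STANDBY.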

Thanks

Wednesday, September 14, 2011

DataGuard Features History

The concept of a standby database has existed since Oracle version 7.3. Since then it has undergone a lot of changes and improvements. Let us summarise these changes and improvements from version 7.3 through 11g.

Oracle 7.3: Introduced the standby database. In this release the process of transferring archived redo logs to the standby was manual.

Oracle 8i: Supported automatic shipping and application of archived redo log files from the primary site to the standby site. It also supported opening the standby database read only; however, the recovery process had to be stopped in order to open it in read-only mode.

Oracle 9iR1: Supported automatic gap resolution. Introduced the various protection modes. Also introduced Data Guard Broker, an interface to manage the Data Guard environment.

Oracle 9iR2: Introduced the concept of Logical Standby Database.

Oracle 10g: Introduced Real-Time Apply and Fast-Start Failover, and provided RAC support.

Oracle 11g: Introduced Active Data Guard, which enables you to open the physical standby in read-only mode while redo apply is active. Also introduced the snapshot standby and redo compression.

Sample init.ora file for implementing Physical Standby

The following is a sample parameter file used for creating a physical standby database.

*.audit_file_dest='/home/oracle/admin/LCDR/adump'

*.background_dump_dest='/home/oracle/admin/LCDR/bdump'

*.compatible='10.2.0.3.0'

*.control_files='+ASMFLASH/LCDR/controlfile/current.261.740682331','+ASMUNDO/LCDR/controlfile/current.261.740682331','+ASMIND/LCDR/controlfile/current.261.740682331'

*.core_dump_dest='/home/oracle/admin/LCDR/cdump'

*.cursor_sharing='SIMILAR'
*.db_block_size=8192
*.db_cache_size=5368709120
*.db_create_file_dest='+ASMDB'
*.db_create_online_log_dest_1='+ASMFLASH'
*.db_create_online_log_dest_2='+ASMUNDO'
*.db_create_online_log_dest_3='+ASMIND'
*.db_domain=''
*.db_file_multiblock_read_count=32

# Convert primary (PRDB) datafile paths to the standby (LCDR) paths
*.db_file_name_convert='+ASMDB/PRDB','+ASMDB/LCDR','+ASMIND/PRDB','+ASMIND/LCDR','+ASMFLASH/PRDB','+ASMFLASH/LCDR','+ASMUNDO/PRDB','+ASMUNDO/LCDR'

*.db_flashback_retention_target=0
*.db_keep_cache_size=2147483648

*.db_name='PRDB'

*.db_recovery_file_dest='+ASMFLASH'
*.db_recovery_file_dest_size=96636764160

*.db_unique_name='LCDR'

*.db_writer_processes=8

*.dispatchers='(PROTOCOL=TCP)(SERVICE=LCDRXDB)'

*.dml_locks=24220
*.event='44951 trace name context forever,level 1024'
*.filesystemio_options='SETALL'
*.job_queue_processes=10


# FAL settings: this standby (LCDR) fetches missing archived logs from the primary (PRDB)
*.fal_client='LCDR'
*.fal_server='PRDB'

*.instance_name='LCDR'

# Data Guard configuration listing both db_unique_names, and the local archive destination
*.log_archive_config='DG_CONFIG=(LCDR,PRDB)'
*.log_archive_dest_1='LOCATION=/backup/prodarch VALID_FOR=(ALL_LOGFILES,ALL_ROLES) DB_UNIQUE_NAME=LCDR'

*.log_archive_dest_state_1='ENABLE'
*.log_buffer=67108864

# Convert primary (PRDB) log file paths to the standby (LCDR) paths
*.log_file_name_convert='+ASMDB/PRDB','+ASMDB/LCDR','+ASMIND/PRDB','+ASMIND/LCDR','+ASMFLASH/PRDB','+ASMFLASH/LCDR','+ASMUNDO/PRDB','+ASMUNDO/LCDR'

*.open_cursors=5000
*.pga_aggregate_target=4294967296
*.processes=500
*.query_rewrite_integrity='TRUSTED'
*.remote_login_passwordfile='EXCLUSIVE'

*.service_names='LCDR'

*.session_cached_cursors=5000
*.sessions=500
*.sga_max_size=21474836480
*.sga_target=21474836480
*.shared_pool_size=1073741824
# Automatically create/drop datafiles on the standby when they change on the primary
*.standby_file_management='AUTO'
*.undo_management='AUTO'
*.undo_retention=10800
*.undo_tablespace='UNDOTBS1'

Thanks

Duplicate Script for implementing Physical Standby

After you create a backup of the primary database using the backup script (see the Backup Script post below) and transfer the files to the standby server, run the following script to create the physical standby.

export ORACLE_BASE=/home/oracle
export ORACLE_HOME=/home/oracle/OraHome1
export PATH=/usr/bin:/etc:/usr/sbin:/usr/ucb:$HOME/bin:/usr/bin/X11:/sbin:/home/oracle/OraHome1/bin:.
export ORACLE_SID=LCDR

v_bdest=/backup/RMANBKUP/spool
####################### DUPLICATE DATABASE ##########################

rman target sys/password@PRDB auxiliary / << EOF
spool log to $v_bdest/StandbyDB.log append;
run {
allocate auxiliary channel a1 type disk format '/backup/RMANBKUP/%U';
allocate auxiliary channel a2 type disk format '/backup/RMANBKUP/%U';
duplicate target database for standby;
}
exit
EOF


Thanks

Backup Script for implementing Physical Standby Database.

The following script creates the RMAN backup required for implementing a physical standby database.

export ORACLE_BASE=/home/oracle
export ORACLE_HOME=/home/oracle/OraHome1
export PATH=/usr/bin:/etc:/usr/sbin:/usr/ucb:$HOME/bin:/usr/bin/X11:/sbin:/home/oracle/OraHome1/bin:.
export ORACLE_SID=PRDB

v_bdest=/backup/RMANBKUP
lev0="'$v_bdest/%U'"
ctlbak="'$v_bdest/%U'"

rman target sys/password@PRDB nocatalog << EOF
spool log to $v_bdest/spool/BACKUP.log append;
CONFIGURE CONTROLFILE AUTOBACKUP OFF;
run {
allocate channel c1 type disk;
allocate channel c2 type disk;
backup as compressed backupset incremental level 0 format $lev0 database plus archivelog;
backup current controlfile for standby format $ctlbak;
backup spfile;
release channel c1;
release channel c2;
}
exit
EOF

After the backup is finished, copy the backup files to the standby server and run the RMAN duplicate command to create the standby database.

Steps for implementing Physical Standby Database

In this post I am going to give a step-by-step procedure for creating a physical standby database.

Production Database: PRDB
Standby Database: LCDR
Production server:x.x.x.41
Standby Server: x.x.x.47
Database Version:10.2.0.4.0

Step1: Enable Force Logging in Primary Database

SQL> Alter database force logging;

Step2: Back up the primary database and copy the backup to the standby server. The directory structure of the backup must be the same as on the primary server.

For the backup script, see the Backup Script post above.

Step3: Update the oratab file in the standby server.

$ cd /etc
$ vi oratab

+ASM:/home/oracle/OraHome1:N
LCDR:/home/oracle/OraHome1:N

$

Step4: Create a password file for LCDR.
Copy the password file from the production server and rename it for LCDR.
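As a rough sketch, assuming the default orapw<SID> naming convention under $ORACLE_HOME/dbs and the ORACLE_HOME used elsewhere in this post, the copy and rename could look like this:

$ scp oracle@x.x.x.41:/home/oracle/OraHome1/dbs/orapwPRDB /home/oracle/OraHome1/dbs/orapwLCDR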

Step5: Create the necessary directories (bdump, cdump, etc.). Since this is a physical standby, the directory structure of the standby database must be identical to that of the primary database. If you are using ASM, the diskgroup names should be identical to the primary ASM diskgroups.

Step 6: Update the tnsnames.ora and listener.ora files on both servers.
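A minimal sketch of the tnsnames.ora entries on both servers, assuming the default listener port 1521 (the hosts are the server IPs listed above):

PRDB =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = x.x.x.41)(PORT = 1521))
    (CONNECT_DATA = (SERVICE_NAME = PRDB))
  )

LCDR =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = x.x.x.47)(PORT = 1521))
    (CONNECT_DATA = (SERVICE_NAME = LCDR))
  )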

Step 7: Test the connectivity from both servers.

$ tnsping PRDB
$ tnsping LCDR

Step 8: Create an init.ora file for the standby database.
Copy the init.ora file from the target (primary) database to the auxiliary (standby) server.
Modify/add the following parameters,

Audit_file_dest
Background_dump_dest
Core_dump_dest
User_dump_dest
Control_files
Db_name
Db_unique_name
dispatchers
Fal_client
Fal_server
Instance_name
Log_archive_config
Log_archive_dest_1
Service_names
Db_file_name_convert
Log_file_name_convert

Modify/Add the following parameters in the primary database (PRDB),
SQL> alter system set log_archive_dest_2='SERVICE=LCDR valid_for=(online_logfiles, primary_role) db_unique_name=LCDR' scope=both;

SQL> alter system set log_archive_config='dg_config=(LCDR,PRDB)' scope=both;

SQL> alter system set standby_file_management='AUTO' scope=both;

See the Sample init.ora post above for a full example.

Step 9: Startup standby database in nomount stage.

From standby server
$ export ORACLE_SID=LCDR
$ sqlplus / as sysdba
connected to an idle instance
SQL> create spfile from pfile;
File created
SQL> Startup nomount;

Step 10: Run the RMAN duplicate command to create standby database.
For the duplicate script, see the Duplicate Script post above.
Step 11: Create the standby redolog (SRL) groups in standby database.

The number of standby redo log (SRL) groups should be one more than the number of online redo log groups in the primary database.
For example, if the primary database has 3 online redo log groups, create four SRL groups.

SQL> Alter database add standby logfile group 4 '+ASMFLASH' size 512M;
SQL> Alter database add standby logfile group 5 '+ASMFLASH' size 512M;
SQL> Alter database add standby logfile group 6 '+ASMFLASH' size 512M;
SQL> Alter database add standby logfile group 7 '+ASMFLASH' size 512M;

Step 12: Start the Managed Recovery Process on the standby.

SQL> Alter database recover managed standby database disconnect from session;

Step 13: Verify log shipment.

On the primary

SQL> Alter system set log_archive_dest_state_2= 'ENABLE' scope=both;
SQL> Alter system Switch logfile;

Check whether the logfiles are shipped to the standby server.
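One way to check this on the standby (a quick sketch using standard views, not part of the original post):

SQL> select sequence#, applied from v$archived_log order by sequence#;

The shipped logs should appear here, and APPLIED should turn to YES as the MRP catches up.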

Thanks

Wednesday, June 29, 2011

ORA-29701: Unable to connect to Cluster Manager

Last week we restarted our production server; after the restart we were unable to start the ASM instance. When we tried to start it we got the following error,

ORA-29701: Unable to connect to Cluster Manager.

The reason was that the Oracle CSS daemon (cssd) process was not running.

Workaround:
Log in to the server as the root user.

Step 1: Go to the /etc folder
Step 2: Execute the following command.
# init.cssd start
startup will be queued to init within 30 seconds

Step 3: Wait for 2 minutes and execute the following commands
# cd $ORACLE_HOME/bin
# ./localconfig delete
Stopping CSSD
Failure in CSS initialization opening OCR.
Shutdown has begun. The daemons should exit soon.

Step 4: Execute the following command

# ./localconfig add
Successfully accumulated necessary OCR keys.
Creating OCR keys for user root, privgrp 'system'
Operation Successful
Configuration for local css has been initialized.
Adding to inittab
Startup will be queued to init within 30 seconds
Checking the status of new oracle init process..
Expecting the CRS daemons to be up within 600 seconds
CSS is active on these nodes
nodelist
CSS is active on all nodes
Oracle CSS service is installed and running under init(1M)

Step 5: Start the ASM instance now.


Thanks

Friday, June 3, 2011

Convert Archivelog Mode to NoArchivelog Mode in RAC

Step 1: Set the Environment Variable
$ hostname
node1
$ export ORACLE_SID=gcprod1
$ sqlplus system@gcprod1
password:

Step 2: Disable Cluster
SQL> alter system set CLUSTER_DATABASE=FALSE scope=spfile;
system altered

Step 3: Set the parameters (Optional)
SQL> alter system set LOG_ARCHIVE_START= FALSE scope=spfile;
system altered

Step 4: Shutdown the database
SQL> exit
$ srvctl stop database -d gcprod

Step 5: Mount the database

$ echo $ORACLE_SID
gcprod1

$ sqlplus "sys as sysdba"
SQL> startup mount
ORACLE instance started
Total System Global Area: 4.4678E+10 bytes
Fixed Size 2166864 bytes
Variable Size 4345302960 bytes
Database mounted.

Step 6: Disable Archive log
SQL> alter database noarchivelog;

Step 7: Enable the cluster
SQL> alter system set CLUSTER_DATABASE=TRUE scope=spfile;
system altered

Step 8: Shutdown the instance
SQL> shutdown immediate;

Step 9: Start all the instances
$ srvctl start database -d gcprod

Step 10: verify the status.
$ crs_stat -t
all instances must be online

$ sqlplus "sys as sysdba"
SQL> archive log list
Database log mode No Archive Mode
Automatic archival Disabled
Archive destination USE_DB_RECOVERY_FILE_DEST
Oldest online log sequence 308
Next log sequence to archive 309
Current log sequence 309

Your database is now in NOARCHIVELOG mode.

Configure Archivelog in RAC Database

Steps to configure ARCHIVELOG mode from NOARCHIVELOG mode.

Database name: gcprod
Instance 1: gcprod1
Instance 2: gcprod2

Step 1: Set the Environment Variable
$ hostname
node1
$ export ORACLE_SID=gcprod1
$ sqlplus system@gcprod1
password:

Step 2: Disable Cluster

SQL> alter system set CLUSTER_DATABASE=FALSE scope=spfile;
system altered

Step 3: Set the parameters

SQL> alter system set LOG_ARCHIVE_FORMAT= 'gcprod_%t_%s_%r.arc' scope=spfile;
system altered

Step 4: Shutdown the database

SQL> exit
$ srvctl stop database -d gcprod

Step 5: Mount the database

$ echo $ORACLE_SID
gcprod1

$ sqlplus "sys as sysdba"
SQL> startup mount
ORACLE instance started
Total System Global Area: 4.4678E+10 bytes
Fixed Size 2166864 bytes
Variable Size 4345302960 bytes
Database Buffers 4.0316E+10 bytes
Redo Buffers 14606336 bytes
Database mounted.

Step 6: Enable Archive log

SQL> alter database archivelog;

Step 7: Enable the cluster

SQL> alter system set CLUSTER_DATABASE=TRUE scope=spfile;
system altered

Step 8: Shutdown the instance

SQL> shutdown immediate;

Step 9: Start all the instances

$ srvctl start database -d gcprod

Step 10: verify the status.

$ crs_stat -t
all instances must be online

$ sqlplus "sys as sysdba"
SQL> archive log list

Database log mode Archive Mode
Automatic archival Enabled
Archive destination USE_DB_RECOVERY_FILE_DEST
Oldest online log sequence 308
Next log sequence to archive 309
Current log sequence 309

Wednesday, June 1, 2011

Change VIP address

You need to enter the virtual IP (VIP) and virtual host name during the clusterware installation. This information is stored in the OCR, and different components within RAC depend on these VIPs.
If for any reason you want to change the VIP, you can do it using the ifconfig and srvctl utilities.
The following steps are performed to change the VIP address.

Step 1: Confirm the current IP address for the VIP

$ ifconfig -a

Step 2: Stop all the resources that are dependent on the VIP on that particular node.

$ srvctl stop instance -d DB -i db1
$ srvctl stop asm -n node1
$ su - root

# srvctl stop nodeapps -n node1

Step 3: Verify that VIP is no longer running.

$ ifconfig -a

The VIP should no longer appear in the output. If it is still listed, some component that depends on the VIP is still running; stop those resources.

Step 4: Change /etc/hosts file

Edit the /etc/hosts file with the new ip address and hostname.

Step 5: Modify the nodeapps and provide the new VIP address using srvctl.

$ su - root

# srvctl modify nodeapps -n node1 -A 192.168.4.41/255.255.255.0/eth0

where 192.168.4.41 is the new IP address, 255.255.255.0 is the subnet mask, and eth0 is the interface you want your VIP to use.

Step 6: Start the nodeapps again

# srvctl start nodeapps -n node1

Step 7: Repeat steps 1-6 on the other available nodes.

Step 8: Modify the IP address in the tnsnames.ora and listener.ora files.

Thanks

Monday, May 23, 2011

Cluster Verification Utility CLUVFY

The CLUVFY utility is distributed with Oracle Clusterware. It is used to assist in the installation and configuration of Oracle Clusterware as well as RAC, and it helps verify whether all the components required for a successful installation of Clusterware and RAC are installed and configured correctly.


The CLUVFY commands are divided into two categories,


1. Stage Commands
2. Component Commands


Stage Commands:


There are various phases during a clusterware or RAC deployment, for example hardware and software configuration, CRS installation, RAC software installation and database creation. Each of these phases is called a stage. Each stage requires a set of prerequisite conditions to be met before entering the stage (pre-check) and another set of conditions to be met after the completion of that stage (post-check).


The pre-check and post-check verification can be done using CLUVFY commands. The commands used to perform these pre-checks and post-checks are called stage commands. To identify the various stages, use the following command; an example invocation is shown after the list below.


$ cd $ORA_CRS_HOME/bin
$ cluvfy stage -list


post hwos - post check for hardware and operating system
pre cfs - pre check for CFS (optional)
post cfs - post check for CFS (optional)
pre crsinst - pre check for clusterware installation
post crsinst - post check for clusterware installation
pre dbinst - pre check for database installation
pre dbcfg - pre check for database configuration
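For example, a pre-check for the clusterware installation stage can be invoked as shown below (the node names are illustrative):

$ cluvfy stage -pre crsinst -n node1,node2 -verbose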



Component Commands:


The commands in this category are used to verify the correctness of individual cluster components and are not associated with any stage. The various cluster components can be listed using the following command; an example check is shown after the list below.


$ cd $ORA_CRS_HOME/bin
$ cluvfy comp -list


nodereach - checks reachability between nodes
nodecon - checks node connectivity
cfs - checks cfs integrity
ssa - checks shared storage accessibility
space - checks space availability
sys - checks minimum system requirements
clu - checks cluster integrity
clumgr - checks cluster manager integrity
ocr - checks ocr integrity
nodeapp - checks existence of node applications
admprv - checks administrative privileges
peer - Compares properties with peers
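As an illustration (again with hypothetical node names), node connectivity can be checked with:

$ cluvfy comp nodecon -n node1,node2 -verbose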

Thanks

Network Time Protocol (NTP)

Network Time Protocol (NTP) is a protocol for synchronizing the clocks of computers in a network. If you are setting up a RAC environment, one of the requirements is to synchronize the clock time of all your RAC nodes to avoid unnecessary node evictions. A time difference of more than 15 minutes among nodes may cause node evictions. Also, trace file analysis and GV$ view analysis may not be accurate if the time is not synchronized among the nodes.



Configuring NTP:

NTP configuration has 3 files in the /etc folder

ntp.conf
ntp.drift
ntp.trace


Add the names of the nodes to the ntp.conf file, along with the names of the drift and trace files, as shown in the sketch below.



cmsprod1 and cmsprod2 are the node names.
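Since the original screenshot is not reproduced here, the following is a minimal sketch of what the /etc/ntp.conf entries could look like; the node names come from the post, while the driftfile/tracefile keywords and paths are assumed defaults, so verify them against your own file:

server cmsprod1
server cmsprod2
driftfile /etc/ntp.drift
tracefile /etc/ntp.trace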



Add this information on all the nodes participating in the cluster.
Leave the drift and trace files as they are.

Thanks.

Saturday, May 21, 2011

RFS Process not working

Problem:

The filesystem containing the archive destination on the DR server was not accessible, so log shipping stopped and we deferred log shipping on the production server. Once the filesystem was back, we enabled log shipping again, but the RFS process on the DR server was not running. The problem had occurred while log 24717 was being shipped. We then queried:

SQL> Select status, sequence# from v$managed_standby;

The status for MRP showed that it was waiting for the gap. We manually shipped the log file and applied it, but when we re-enabled shipping we found that the RFS process still did not start. There were no errors in the DR alert log, but we found a trace file on the production server with the following message:

tkcrrsarc: (WARN) Failed to find ARCH for message (message:0xa)
tkcrrpa: (WARN) Failed initial attempt to send ARCH message (message:0xa)

So we suspected that the issue was with the archiver processes.

Solution:

Check whether an archiver process is available to ship log files. You can identify this by querying the V$ARCHIVE_PROCESSES view.

SQL> Select * from v$archive_processes;

The output has the following columns:

Process: The archiver process number.
Status: This should be ACTIVE.
Log Sequence: The sequence number of the log that the archiver is shipping; if it is not shipping any log, this is 0.
State: IDLE if the archiver is not shipping any log, BUSY if it is.

In our case we had two archiver processes running.

The status of both archiver processes was ACTIVE. The log sequence of the first archiver was 0 and its state was IDLE, so it was healthy. However, the log sequence of the second archiver was 24717 and its state was BUSY.

This was interesting, because the problem had occurred while that archiver was transferring log 24717. The log had already been shipped and applied manually, yet the process still showed that it was shipping log 24717.

So we decided to increase the number of archiver processes from two to four.

SQL> alter system set log_archive_max_processes=4 scope=both;

When we queried v$archive_processes again, the 3rd and 4th archiver processes were ready to ship logs 24718 and 24719 respectively, with their state shown as IDLE.

We enabled log shipping; the RFS process on the DR server started, and log shipping went smoothly.

However, the 2nd archiver process still showed the same log sequence (24717) and state (BUSY), so we killed that archiver process.
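For reference, one way to locate the OS process of a stuck archiver before killing it is sketched below (the killed ARCn process is respawned automatically); the query is approximate and not from the original post:

SQL> select spid, program from v$process where program like '%ARC%';
$ kill -9 <spid>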

Thanks

Friday, May 20, 2011

Clusterware Log Files

In this post we will see where Oracle Clusterware stores its component log files; these files help in diagnostic information collection and problem analysis.

All clusterware log files are stored under $ORA_CRS_HOME/log/ directory.

1. alert<nodename>.log : Important clusterware alerts are stored in this log file. It is stored in $ORA_CRS_HOME/log/<hostname>/alert<hostname>.log.

2. crsd.log : CRS logs are stored in $ORA_CRS_HOME/log/<hostname>/crsd/ directory. The crsd.log file is archived every 10MB as crsd.101, crsd.102 ...

3. cssd.log : CSS logs are stored in $ORA_CRS_HOME/log/<hostname>/cssd/ directory. The cssd.log file is archived every 20MB as cssd.101, cssd.102....

4. evmd.log : EVM logs are stored in $ORA_CRS_HOME/log/<hostname>/evmd/ directory.

5. OCR logs : The OCR tool log files (ocrdump, ocrconfig, ocrcheck) are stored in the $ORA_CRS_HOME/log/<hostname>/client/ directory.

6. SRVCTL logs:
srvctl logs are stored in two locations, $ORA_CRS_HOME/log/<hostname>/client/ and in $ORACLE_HOME/log/<hostname>/client/ directories.

7. RACG logs : The high availability trace files are stored in two locations
$ORA_CRS_HOME/log/<hostname>/racg/ and in $ORACLE_HOME/log/<hostname>/racg/ directories.

RACG contains log files for node applications such as VIP, ONS etc.
Each RACG executable has a sub directory assigned exclusively for that executable.

racgeut  : $ORA_CRS_HOME/log/<hostname>/racg/racgeut/
racgevtf : $ORA_CRS_HOME/log/<hostname>/racg/racgevtf/
racgmain : $ORA_CRS_HOME/log/<hostname>/racg/racgmain/

racgeut  : $ORACLE_HOME/log/<hostname>/racg/racgeut/
racgmain : $ORACLE_HOME/log/<hostname>/racg/racgmain/
racgmdb  : $ORACLE_HOME/log/<hostname>/racg/racgmdb/
racgimon : $ORACLE_HOME/log/<hostname>/racg/racgimon/

In the racgimon directory, imon_<service>.log is archived every 10MB for each service.

Thanks

Tuesday, May 17, 2011

Diagcollection.pl

Diagcollection.pl is a script used to collect diagnostic information from a clusterware installation. The information it gathers helps Oracle Support resolve problems.

Invoking diagcollection script

Step 1: Log in as Root
Step 2: Set up the following environment variables

# export ORACLE_BASE=/....
# export ORACLE_HOME=/....
# export ORA_CRS_HOME=/...

Step 3: Run the script

# cd $ORA_CRS_HOME/bin
# ./diagcollection.pl -collect


The script generates the following files in the local directory,

basData_.tar.gz (contains logfiles from ORACLE_BASE/admin)
crsData_.tar.gz (logs from $ORA_CRS_HOME/log/)
ocrData_.tar.gz (results of ocrcheck, ocrdump and ocr backups)
oraData_.tar.gz (logs from $ORACLE_HOME/log/)

To collect only a subset of the log files, you can invoke it as follows,

# ./diagcollection.pl -collect -crs (CRS log files)
# ./diagcollection.pl -collect -oh (ORACLE_HOME logfiles)
# ./diagcollection.pl -collect -ob (ORACLE_BASE logfiles)
# ./diagcollection.pl -collect -all (default)

To clean out the files generated from the last run

# ./diagcollection.pl -clean

To extract only the core files found in the generated files and store them in a text file,

# ./diagcollection.pl -coreanalyze


Thanks

Wednesday, April 27, 2011

Performance Impact of Using Cache and Nocache in Sequences

When caching is used, the data dictionary is updated only when the cached set of values is exhausted. If you cache 1000 values, the data dictionary is updated once for every 1000 sequence numbers used.

When NOCACHE is used, the data dictionary is updated for every sequence number. Frequent updates mean the dictionary row for that sequence is locked, so when 5 sessions simultaneously request NEXTVAL, the other four sessions will wait on the "row cache lock" event, resulting in poor performance.

According to Metalink,

When caching and ordering are used in a RAC database, a session wanting to get the NEXTVAL of a sequence needs to acquire an exclusive instance SV lock before inserting or updating the sequence values in the shared pool. When multiple sessions want the NEXTVAL of the same sequence, some sessions will wait on the 'DFS lock handle' wait event.

As a consequence of those serialisation mechanisms, the sequence throughput, i.e. the maximum speed at which it is possible to increment a sequence, does not scale with the number of RAC nodes when the sequence is not cached and/or when the sequence is ordered. The sequence throughput is always better for cached sequences than for non-cached sequences.

In summary, due to the serialisation mechanisms for non-cached ordered sequences, it takes more time to get a row cache lock or the SQ enqueue when more nodes are involved. That decrease is clearly visible when going from 1 to 2 nodes, is still perceptible to a lesser extent when going from 2 to 3 nodes, and flattens out with more nodes.
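For reference, the caching and ordering behaviour is chosen when the sequence is created; the statements below are an illustrative sketch (the sequence names are made up):

SQL> create sequence seq_fast cache 1000 noorder;
SQL> create sequence seq_strict nocache order;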

Thanks

Gaps in Sequences in Non RAC databases

Gaps in Sequences may occur as a result of one of the following,

1. Rollback of a statement
2. Instance Crash or shutdown
3. Metadata aged out of the data dictionary cache (pin the sequence with DBMS_SHARED_POOL.KEEP to avoid this)
4. Process cleaned by PMON

As you can see above, the number of instances plays no part in creating gaps. Hence you can get gaps both in RAC databases and in non-RAC single-instance databases. Let us look at an example of how gaps occur in a single-instance database.

Create sequence gapSeq start with 1 increment by 1 cache 20;

User A queries nextval and gets 1.
User B queries nextval and gets 2.
User B commits.
User A rolls back.
User C queries nextval and gets 3.

Hence the sequence value 1 is lost due to the rollback by user A.
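The same scenario as a quick SQL sketch, using the gapSeq sequence created above:

-- Session A
SQL> select gapSeq.nextval from dual;   -- returns 1
SQL> rollback;                          -- the sequence increment is NOT rolled back
-- Session B
SQL> select gapSeq.nextval from dual;   -- returns 2
-- The value 1 is now lost: a gap.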

Also, when you flush the shared pool, the cached sequence numbers are lost.

Therefore gaps can occur in both single-instance and RAC databases.

Thanks

Tuesday, April 26, 2011

How Sequences Behave in RAC

Sequences in a RAC environment should be used carefully; otherwise there can be major performance problems.

The two main concerns for developers when using sequences are 1. Gaps and 2. Order.

See the post above to learn how gaps in sequences can occur even in non-RAC databases.

DBAs must know how to use options like Cache/NoCache and Order/NoOrder efficiently.

Let us see when to use the various combinations of Cache/NoCache and Order/NoOrder and what their performance impact is; a quick way to check an existing sequence's settings is shown after the list.

Cache + NoOrder

  • This is the default when a sequence is created without specifying these options.
  • Each instance caches a distinct set of values.
  • The sequence values are not globally ordered.
  • Gaps will occur.
  • Best performance.
Cache + Order

  • Each instance caches the same set of values.
  • The order of the sequence values is guaranteed.
  • Gaps will occur (shared pool flush or instance shutdown).
  • Performance is better than NoCache.
Note: the CACHE value must not be too low, otherwise there will be a significant performance problem.

NoCache + NoOrder

  • No Gaps will occur.
  • Ordering is not guaranteed.
  • Performance better than Nocache/order
NoCache + Order

  • No Gaps will occur.
  • Ordering is guaranteed.
  • Worst Performance of all.
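To check how an existing sequence is configured, the cache size and ordering flag can be read from the data dictionary (the sequence name here is illustrative):

SQL> select sequence_name, cache_size, order_flag from dba_sequences where sequence_name = 'GAPSEQ';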
See the post above for the performance impact of the Cache and NoCache options.

The following link will be useful
http://www.pythian.com/news/383/sequences-in-oracle-10g-rac/

Thursday, March 31, 2011

Shell Script to FTP

The following is a shell script that transfers files from one server to another using FTP.

#!/bin/ksh
HOST='172.16.4.41'    # Remote server IP
USER='oracle'         # Remote server user
PASSWD='******'       # Password

ftp -n $HOST << EOF
quote USER $USER
quote PASS $PASSWD
bin
prompt off
cd /backup/RMANBKUP/spool/
lcd /backup/RMANBKUP/spool/
mget test*
bye
EOF

Thanks
