Monday, May 23, 2011

Cluster Verification Utility CLUVFY

The CLUVFY utility is distributed with Oracle Clusterware. It assists in the installation and configuration of Oracle Clusterware and RAC by verifying that all the components required for a successful installation of Clusterware and RAC are installed and configured correctly.


The CLUVFY commands are divided into two categories:


1. Stage Commands
2. Component Commands


Stage Commands:


There are various phases during a Clusterware or RAC deployment, for example hardware and software configuration, CRS installation, RAC software installation, and database creation. Each of these phases is called a stage. Each stage requires a set of pre-requisite conditions to be met before entering the stage (pre-check) and another set of conditions to be met after the completion of that stage (post-check).


Both the pre-check and the post-check verification can be done with CLUVFY. The commands used to perform these checks are called stage commands. To list the available stages, use the following command:


$ cd $ORA_CRS_HOME/bin
$ cluvfy stage -list


post hwos - post check for hardware and operating system
pre cfs - pre check for CFS (optional)
post cfs - post check for CFS (optional)
pre crsinst - pre check for clusterware installation
post crsinst - post check for clusterware installation
pre dbinst - pre check for database installation
pre dbcfg - pre check for database configuration
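

For example, a pre-installation check for the Clusterware stage can be run against the cluster nodes as follows (the node names node1 and node2 are only illustrative):

$ cluvfy stage -pre crsinst -n node1,node2 -verbose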



Component Commands:


The commands in this category are used to verify the correctness of individual cluster components and are not associated with any stage. The various cluster components can be listed using the following command:


$ cd $ORA_CRS_HOME/bin
$ cluvfy comp -list


nodereach - checks reachability between nodes
nodecon - checks node connectivity
cfs - checks cfs integrity
ssa - checks shared storage accessibility
space - checks space availability
sys - checks minimum system requirements
clu - checks cluster integrity
clumgr - checks cluster manager integrity
ocr - checks ocr integrity
nodeapp - checks existence of node applications
admprv - checks administrative privileges
peer - Compares properties with peers
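
For example, node connectivity between the cluster nodes can be verified as follows (again, the node names are illustrative):

$ cluvfy comp nodecon -n node1,node2 -verbose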

Thanks

Network Time Protocol (NTP)

Network Time Protocol (NTP) is a protocol for synchronizing the clocks of computers in a network. When you set up a RAC environment, one of the requirements is to synchronize the clock time of all your RAC nodes to avoid unnecessary node evictions. A time difference of more than 15 minutes among the nodes may cause node evictions. Also, trace file analysis and GV$ view analysis may not be accurate if the time is not synchronized among the nodes.



Configuring NTP:

NTP configuration uses three files in the /etc directory:

ntp.conf
ntp.drift
ntp.trace


Add the names of the nodes to the ntp.conf file, along with the names of the drift and trace files. In our cluster, cmsprod1 and cmsprod2 are the node names.
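
A minimal sketch of what the ntp.conf entries might look like on a platform such as AIX (the exact directives depend on your operating system and NTP implementation):

# illustrative entries only; adjust for your environment
server cmsprod1
server cmsprod2
driftfile /etc/ntp.drift
tracefile /etc/ntp.trace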



Add this information on all the nodes participating in the cluster.
Leave the drift and trace files as they are.

Thanks.

Saturday, May 21, 2011

RFS Process not working

Problem:

The filesystem containing the archive destination on the DR server was not accessible. As a result, log shipping stopped, so we deferred log shipping on the production server. After the filesystem came back, we enabled log shipping again, but the RFS process on the DR server was not running. The problem occurred while log 24717 was being shipped. We then queried:

SQL> Select status, sequence# from v$managed_standby;

The status for MRP showed that it was waiting for the gap. We then manually shipped the log file and applied it. When we re-enabled shipping, we found that the RFS process had still not started. There were no errors in the alert log on DR. We found a trace file on the production server with the following messages:

tkcrrsarc: (WARN) Failed to find ARCH for message (message:0xa)
tkcrrpa: (WARN) Failed initial attempt to send ARCH message (message:0xa)

So we thought that the issue was with the archiver process.

Solution:

Check whether an archiver process is available to ship log files. You can identify this by querying the V$ARCHIVE_PROCESSES view.

SQL> Select * from v$archive_processes;

The output has the following columns:

Process: Indicates the process number.
Status: This should be ACTIVE
Log Sequence: Log sequence number of the log that is being shipped by the archiver. If it is not shipping any log then it should be 0.
State: This should be IDLE if the archiver is not shipping any log. If it is shipping any log then its state is BUSY.
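
A more focused query, using only the columns described above, might look like this (the filter on STATUS is optional):

SQL> select process, status, log_sequence, state from v$archive_processes where status = 'ACTIVE';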

In our case we had two archiver processes running.

The status of both archiver processes was ACTIVE.
The log sequence of the first archiver process was 0 and its state was IDLE, so it was healthy. However, the log sequence of the second archiver process was 24717 and its state was BUSY.

This was interesting because the problem occurred when that archiver process was transferring log 24717. The log had since been manually shipped and applied, but the process still showed that it was shipping log 24717.

So we thought of increasing the number of archiver processes, and increased it from two to four:

SQL> alter system set log_archive_max_processes=4 scope=both;
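
The new value can be confirmed with, for example:

SQL> show parameter log_archive_max_processes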

When we queried v$archive_processes again, the 3rd and 4th archiver processes were ready to ship logs 24718 and 24719 respectively, with their state shown as IDLE.

We enabled log shipping, the RFS process on DR started, and log shipping went on smoothly.

However, the 2nd archiver process was still showing the same log sequence (24717) and state (BUSY), so we killed that archiver process.

Thanks

Friday, May 20, 2011

Clusterware Log Files

In this post we will see where Oracle Clusterware stores its component log files. These files help in collecting diagnostic information and analyzing problems.

All clusterware log files are stored under $ORA_CRS_HOME/log/ directory.

1. alert<nodename>.log : Important clusterware alerts are stored in this log file. It is stored as $ORA_CRS_HOME/log/<hostname>/alert<hostname>.log.

2. crsd.log : CRS logs are stored in $ORA_CRS_HOME/log/<hostname>/crsd/ directory. The crsd.log file is archived every 10MB as crsd.101, crsd.102 ...

3. cssd.log : CSS logs are stored in $ORA_CRS_HOME/log/<hostname>/cssd/ directory. The cssd.log file is archived every 20MB as cssd.101, cssd.102....

4. evmd.log : EVM logs are stored in $ORA_CRS_HOME/log/<hostname>/evmd/ directory.

5. OCR logs : OCR logs (ocrdump, ocrconfig, ocrcheck) log files are stored in $ORA_CRS_HOME/log/<hostname>/client/ directory.

6. SRVCTL logs:
srvctl logs are stored in two locations, $ORA_CRS_HOME/log/<hostname>/client/ and in $ORACLE_HOME/log/<hostname>/client/ directories.

7. RACG logs : The high availability trace files are stored in two locations
$ORA_CRS_HOME/log/<hostname>/racg/ and in $ORACLE_HOME/log/<hostname>/racg/ directories.

RACG contains log files for node applications such as VIP, ONS etc.
Each RACG executable has a sub directory assigned exclusively for that executable.

racgeut : $ORA_CRS_HOME/log/<hostname>/racg/racgeut/
racgevtf :
$ORA_CRS_HOME/log/<hostname>/racg/racgevtf/
racgmain :
$ORA_CRS_HOME/log/<hostname>/racg/racgmain/

racgeut :
$ORACLE_HOME/log/<hostname>/racg/racgeut/
racgmain:
$ORACLE_HOME/log/<hostname>/racg/racgmain/
racgmdb :
$ORACLE_HOME/log/<hostname>/racg/racgmdb/
racgimon:
$ORACLE_HOME/log/<hostname>/racg/racgimon/

In that last directory imon_<service>.log is archived every 10MB for each service.
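
To inspect these logs on a node, you can, for example, tail the clusterware alert log (the hostname substitution is only illustrative; the actual file follows the naming pattern described above):

$ tail -100 $ORA_CRS_HOME/log/`hostname`/alert`hostname`.log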

Thanks

Tuesday, May 17, 2011

Diagcollection.pl

Diagcollection.pl is a script used to collect diagnostic information from a clusterware installation. The script provides additional information so that Oracle Support can resolve problems.

Invoking diagcollection script

Step 1: Log in as Root
Step 2: Set up the following environment variables

# export ORACLE_BASE=/..../
# export ORACLE_HOME=/..../
# export ORA_CRS_HOME=/.../
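
For example, on a typical installation the variables might look like the following (these paths are hypothetical; substitute the actual locations from your own installation):

# export ORACLE_BASE=/u01/app/oracle                          # hypothetical path
# export ORACLE_HOME=/u01/app/oracle/product/10.2.0/db_1      # hypothetical path
# export ORA_CRS_HOME=/u01/app/oracle/product/10.2.0/crs      # hypothetical path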

Step 3: Run the script

# cd $ORA_CRS_HOME/bin
# ./diagcollection.pl -collect


The script generates the following files in the local directory:

basData_.tar.gz (contains logfiles from ORACLE_BASE/admin)
crsData_.tar.gz (logs from $ORA_CRS_HOME/log/)
ocrData_.tar.gz (results of ocrcheck, ocrdump and ocr backups)
oraData_.tar.gz (logs from $ORACLE_HOME/log/)

To collect only a subset of the log files, invoke it as follows:

# ./diagcollection.pl -collect -crs (CRS log files)
# ./diagcollection.pl -collect -oh (ORACLE_HOME logfiles)
# ./diagcollection.pl -collect -ob (ORACLE_BASE logfiles)
# ./diagcollection.pl -collect -all (default)

To clean up the files generated by the last run:

# ./diagcollection.pl -clean

To extract only the core files found in the generated files and store them in a text file:

# ./diagcollection.pl -coreanalyze


Thanks
