ORACLE APPSDBA (EBS) TECHNOLOGY

Friday, April 8, 2016

R12.2. Start and Stop Procedure

Individual Components: Application (Middle Tier)

The control scripts are located in $INST_TOP/admin/scripts.

Start

In earlier releases, stopping all services with adstpall.sh needed only the APPS credentials (adstpall.sh apps/<apps_pwd>). In R12.2 the script also prompts for the WebLogic Server admin password, so that password must be supplied as well.

Component	Command	Prompt
Node Manager	$ adnodemgrctl.sh start	Enter Weblogic Admin Password:
Weblogic Admin Server	$ adadminsrvctl.sh start	Enter Weblogic Admin Password:
Application Listener	$ adalnctl.sh start	(none)
Oracle Process Manager	$ adopmnctl.sh start	(none)
Apache Services	$ adapcctl.sh start	(none)
Managed Server for OACORE Services	$ admanagedsrvctl.sh start oacore_server1	Enter Weblogic Admin Password:
Managed Server for Forms Services	$ admanagedsrvctl.sh start forms_server1	Enter Weblogic Admin Password:
Managed Server for Fusion Middleware Services	$ admanagedsrvctl.sh start oafm_server1	Enter Weblogic Admin Password:
Managed Server for Forms Web Services	$ admanagedsrvctl.sh start forms-c4ws_server1	Enter Weblogic Admin Password:
Concurrent Manager Service	$ adcmctl.sh start apps/apps	(none)
Fulfillment Server Services	$ jtffmctl.sh start	(none)

Stop

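When all services are stopped with adstpall.sh apps/<apps_pwd>, the script again prompts for the WebLogic admin password. To stop components individually, run the scripts in the reverse of the start order: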

Component	Command	Prompt
Fulfillment Server Services	$ jtffmctl.sh stop	(none)
Concurrent Manager Service	$ adcmctl.sh stop apps/apps	(none)
Managed Server for Forms Web Services	$ admanagedsrvctl.sh stop forms-c4ws_server1	Enter Weblogic Admin Password:
Managed Server for Fusion Middleware Services	$ admanagedsrvctl.sh stop oafm_server1	Enter Weblogic Admin Password:
Managed Server for Forms Services	$ admanagedsrvctl.sh stop forms_server1	Enter Weblogic Admin Password:
Managed Server for OACORE Services	$ admanagedsrvctl.sh stop oacore_server1	Enter Weblogic Admin Password:
Apache Services	$ adapcctl.sh stop	(none)
Oracle Process Manager	$ adopmnctl.sh stop	(none)
Application Listener	$ adalnctl.sh stop	(none)
Weblogic Admin Server	$ adadminsrvctl.sh stop	Enter Weblogic Admin Password:
Node Manager	$ adnodemgrctl.sh stop	Enter Weblogic Admin Password:
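
The stop sequence is simply the start sequence reversed. As a minimal sketch, the stop order can be derived from the start order instead of being maintained by hand. Assumptions: the script names are the ones that usually ship under $INST_TOP/admin/scripts (adalnctl.sh for the listener, adapcctl.sh for Apache), <apps_pwd> is a placeholder, and the function names start_order and stop_order are hypothetical. The functions only print the commands; they do not run them.

```shell
#!/bin/sh
# Start order, one control-script invocation per line (illustrative list).
start_order() {
    cat <<'EOF'
adnodemgrctl.sh start
adadminsrvctl.sh start
adalnctl.sh start
adopmnctl.sh start
adapcctl.sh start
admanagedsrvctl.sh start oacore_server1
admanagedsrvctl.sh start forms_server1
admanagedsrvctl.sh start oafm_server1
admanagedsrvctl.sh start forms-c4ws_server1
adcmctl.sh start apps/<apps_pwd>
jtffmctl.sh start
EOF
}

# Stop order: swap "start" for "stop", then reverse the list.
stop_order() {
    start_order |
        sed 's/ start/ stop/' |
        awk '{ line[NR] = $0 } END { for (i = NR; i >= 1; i--) print line[i] }'
}
```

Calling stop_order prints the eleven stop commands, beginning with the Fulfillment Server and ending with Node Manager.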

Thursday, March 31, 2016

About R12.2 Questions

About R12.2:
===========
Question 1: What is Online Patching?
Answer:
Online patching is a new patching mechanism that is available with R12.2 that allows the application of patches while the system is up and running, and the users are working as normal.

Question 2: In which Oracle E-Business Suite releases is the Online Patching feature available?
Answer: Online patching is used with Oracle E-Business Suite 12.2 and higher.

Question 3: What types of patch are applied online?
Answer: All Oracle E-Business Suite Release 12.2 patches are applied online. This includes one-off patches, patch rollups, consolidated updates and security patches.

Question 4: What is the Online Patching cycle?
Answer: The Online Patching cycle is a sequence of inter-related steps (phases) used to apply patches to an Oracle E-Business Suite system.

Question 5: What tool is used to apply online patches?
Answer: The AD Online Patching (adop) command-line utility is used to manage the Online Patching cycle. adop invokes adpatch in the background.

Question 6: Is there any downtime in Online Patching?
Answer: There is a short period of downtime when the application tier services are shut down and restarted. The database remains open all the time.

Question 7: Once I upgrade to Release 12.2, can I still apply patches in the traditional way?
Answer: No. All patches for Release 12.2 are online patches, and the traditional, pre-12.2 method of applying patches will not work. The downtime and hotpatch modes behave more like traditional patching, but they are meant only for patches where Oracle directs their use.

Question 8: What is the Online Patching infrastructure?
Answer: The infrastructure consists of the database editioning objects and the patch/run file system components.

Question 9. Does Online Patching require the 11gR2 Oracle Database Edition Based Redefinition (EBR) feature?
Answer: Yes. Online patching depends on the Edition Based Redefinition (EBR) feature that was introduced in the Oracle 11gR2 Database. Most notably, EBR allows editioning of code objects in the database. To do this, it provides new object types such as editions, editioning views, and cross-edition triggers, all of which are part of the Online Patching infrastructure.

Question 10. What are the phases that make up the Online Patching cycle?
Answer: The Online Patching cycle consists of the following phases:
1. Prepare a virtual copy (patch edition) of the running application (run edition).
2. Apply patches to the patch edition of the application.
3. Finalize the system in readiness for the cutover phase.
4. Cutover to the patch edition and make it the new run edition.
5. Cleanup obsolete definitions or data to recover space.
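
In adop terms, each of the five phases corresponds to one invocation. The sketch below prints that sequence for a given patch. It assumes the standard adop phase names (prepare, apply, finalize, cutover, cleanup); the helper name adop_cycle and the patch number used in the example are hypothetical, and the function only echoes the commands rather than running them.

```shell
#!/bin/sh
# Print the adop commands for one complete online patching cycle.
# $1 = patch number (placeholder supplied by the caller).
adop_cycle() {
    patch="$1"
    echo "adop phase=prepare"
    echo "adop phase=apply patches=${patch}"
    echo "adop phase=finalize"
    echo "adop phase=cutover"
    echo "adop phase=cleanup"
}
```

For example, adop_cycle 12345678 prints the five commands in the order they would be run from the run file system environment.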

Question 11. What downtime is required during an Online Patching cycle? Does the name "online" mean there is no downtime in the whole process?
Answer: Some downtime is still required. The cutover phase needs a short period of downtime (typically a few minutes) for transition tasks such as a restart of the application tier services.

Question 12. Is any downtime required for the database tier?
Answer: No. In fact, the database needs to be up and running during each phase of the Online Patching cycle.

Question 13. How does Online Patching work on the application tier?
Answer: During Release 12.2 installation, Rapid Install lays down two copies of the application tier file system. One copy is labeled the run file system and the other the patch file system. Subsequently, when a patch is applied, adop will:
1. Synchronize the contents of the run file system to the patch file system. This happens during the prepare phase.
2. Perform patching actions on the patch file system. This happens during the apply phase.
3. Finally, during the cutover phase, restart the application tier services. The patch file system is then promoted to be the new run file system, and the old run file system becomes the patch file system for the next patching cycle.
Note that a third file system, the non-editioned file system (fs_ne), is created to store files containing data that is needed across all file systems, such as log files.

Question 14. How do I apply Oracle Fusion Middleware patches in Oracle E-Business Suite Release 12.2?
Answer: During the apply phase of an Online Patching cycle, you apply Oracle Fusion Middleware patches to the Oracle homes of the patch edition file system. Then, after the cutover phase is complete, you synchronize the file systems by performing an fs_clone operation. (Also see My Oracle Support Knowledge Document 1355068.1.)

Question 15. Can I use the patch edition for testing and development purposes?
Answer: As a specialized component of the Online Patching infrastructure, the patch edition is not supported for use as a test environment. You should continue to employ a separate, dedicated test environment.

Question 16. Can Online Patching be used with database technologies such as Active Data Guard and Flashback?
Answer: Yes. Online patching can be used alongside Active Data Guard and Flashback. In fact, Flashback can be used to roll back the changes after the final cutover.

Question 17. What are the key differences between the DBA_OBJECTS, DBA_OBJECTS_AE, and AD_OBJECTS tables?
Answer: DBA_OBJECTS shows object information for the current edition, but the STATUS column in this view may show the object as VALID even if the object actually needs to be compiled before use.
DBA_OBJECTS_AE is similar to DBA_OBJECTS, but shows object information across all editions. This has the drawback of showing objects in old editions that are no longer accessible to the application.
AD_OBJECTS is the Oracle E-Business Suite workaround to the unreliable STATUS column in DBA_OBJECTS. AD_OBJECTS shows the correct status for each object visible in the current edition. It also shows whether the object is "actual" (a real object) in the current edition, or a "stub" object (the object definition was inherited from a previous edition). You can query AD_OBJECTS to locate objects that need to be recompiled before use:
SQL> select owner, object_name, object_type
from ad_objects
where status = 'INVALID'
order by 1,2,3
/
The same check can be run with the script:
$ sqlplus apps/apps @$AD_TOP/sql/ADZDSHOWINVALID

Question 18. Does Online Patching increase the network port requirements on an Oracle E-Business Suite instance?
Answer: Yes. Online patching requires an additional set of network ports for the Oracle WebLogic Server managed servers on the second file system. During the cutover phase, the managed servers run simultaneously on the patch file system and run file system for a brief period, in a rolling transition process.

Question 19. Is it possible to abort an Online Patching session?
Answer: Yes. Up to cutover, you can run the abort phase to undo the changes made so far in the patching cycle. It is not possible to back out patches once cutover is complete.

Question 20. Is the shared APPL_TOP configuration supported with Online Patching?
Answer: Yes. A shared APPL_TOP configuration is supported and recommended for multi-node application tier implementations in Release 12.2.

Question 21. How does adop work in a multi-node environment?
Answer: The adop Online Patching tool uses remote APIs and ssh login to execute patching operations on remote nodes in a multi-node environment. The node that launches adop becomes the ‘master’ node, and the remote nodes are referred to as ‘slaves’.

Question 22. How do I determine the status of my Online Patching session?
Answer: You can run the adop -status command. This will display information that includes phases completed and the time taken. If you want additional details of operations performed, you can run the adop -status -detail command.

Question 23. What is downtime mode and when can it be used?
Answer: To optimize the process of upgrading to E-Business Suite Release 12.2, the AD Delta 5 Release Update Pack introduced downtime mode, which is used as follows:
$ adop phase=apply patches=<patch_number> apply_mode=downtime
Downtime mode does not use an online patching cycle. The process of applying a patch in downtime mode completes more quickly than in online mode, but at the cost of increased system downtime.
When applying Oracle E-Business Suite patches in this mode, adop will first confirm that the application tier services are down, and will then proceed to apply the patch to the run edition of the Oracle E-Business Suite database and file system.
Downtime mode is supported for:
- All patching (including post-upgrade patching) that forms part of the Release 12.2 upgrade process and is completed before the system is scaled up, the application tier services are started, and users log in to the upgraded system.
- Single-node development or test environments, where production support and high availability are not required.
Downtime mode allows the 12.2 upgrade process to be completed as quickly as possible. Once the upgrade is complete and users are online, all subsequent patching on a production system should use online mode, not downtime mode, unless the patch readme states otherwise.
Several restrictions apply to the use of downtime mode:
- You cannot validate successful patch application before cutover to the updated code takes place.
- There is no capability to abort a failed patch and return to the existing run edition.
- Release 12.2 patches are not normally tested in downtime mode.
- Use of downtime mode in a multi-node application tier environment is not tested or supported.

Question 24. What can I do to reduce the time required for cutover?
Answer: It is important to distinguish between the time needed for the whole cutover phase, and the downtime period within the phase. The actual downtime (during which users cannot log in) is significantly shorter than the whole phase. To help reduce the overall time taken by cutover, you can do three things:
- Run the finalize phase explicitly, so that cutover does not need to do so.
- Shut down the concurrent managers before running cutover, to avoid having to wait for concurrent requests to complete. Alternatively, ensure no long-running concurrent jobs are submitted while a patching cycle is in progress.
- Ensure you are using the maximum number of parallel workers your system will support.

Question 25. What is fs_clone and how is it used?
Answer: The command adop phase=fs_clone is a special command that is used to copy the run file system to the patch file system. Also see Question 14.

Question 26. Will AutoConfig and adadmin maintenance tasks such as adrelink, forms compilation, and report compilation be performed online?
Answer: Yes, these maintenance tasks will be performed online. The relevant operations will be targeted to the patch file system, and should be performed during a patching cycle. They will not have any impact on the run file system.

Question 27. Does Online Patching change the way data fix patches are applied to Oracle E-Business Suite 12.2?
Answer: Yes. Data fix patches (used to fix transactional data) require special handling. The patch readme will give full instructions.

Question 28. How do I apply or patch my customizations in Oracle E-Business Suite Release 12.2?
Answer: You should apply your customizations to the patch edition during the apply phase of the Online Patching cycle. Because this happens prior to the cutover phase, your changes will be propagated to the new run edition (along with all the fixes in the patches applied during the patching cycle).

Question 29. If custom code is installed on a separate database schema, do I have to edition-enable my custom database schema?
Answer: The coding standards in the Oracle E-Business Suite Developer’s Guide state that the first step to any custom application development is to register the custom Oracle schema with the Oracle E-Business Suite applications. The Online Patching enablement patch enables editioning on all the schemas registered with the application. If you follow this process, your schema will be edition-enabled automatically.

Question 30. Are there any special considerations for creating custom patches that are compliant with Online Patching?
Answer: Yes. There are some special considerations for creating custom patches that are compliant with Online Patching. Refer to the Patching Standards section of Oracle E-Business Suite Developer’s Guide.

Question 31. How is a non Oracle E-Business Suite database schema able to access the Oracle E-Business Suite tables?
Answer: Any third-party schema, either from third-party products or custom code, must access Oracle E-Business Suite tables via the synonyms in the APPS schema. Direct access to Oracle E-Business Suite tables may produce incorrect results.

Question 32: What are the main technological differences between R12.2 and R12.1?
Answer: R12.2 uses Oracle WebLogic Server, while R12.1 uses OC4J. In addition, R12.2 introduces online patching, which relies on database editions and the patch/run file systems.

Question 33: How do I change the APPS password in R12.2?
Answer: The procedure is the same as in R12.1, except that the new password must also be updated in the WebLogic console.

Question 34: Where are the adop log files stored?
Answer: They are stored in the third file system, the non-editioned file system.

Question 35. How do you connect to the patch edition?
Answer: Source the environment file with the patch option, for example:
. /u71/R122/EBSapps.env patch

Question 36: How to determine the weblogic version in R12.2
Answer How to find Weblogic Version

Question 37. How do I add or remove managed servers in R12.2?
Answer:
How to add a managed server in R12.2
How to delete a managed server in R12.2

Question 38. Where are the Apache and WebLogic log files located in R12.2?
Answer:
Apache Logs
$IAS_ORACLE_HOME/instances/*/diagnostics/logs/OHS/EBS_web_*/*log
OPMN Log
$IAS_ORACLE_HOME/instances/*/diagnostics/logs/OPMN/opmn/*
Weblogic Logs
$IAS_ORACLE_HOME/../wlserver_10.3/common/nodemanager
$EBS_DOMAIN_HOME/servers/oa*/logs/*
$EBS_DOMAIN_HOME/servers/forms*/logs/*
$EBS_DOMAIN_HOME/servers/AdminServer/logs/*
$EBS_DOMAIN_HOME/sysman/log/*

Question 39. How do I stop and start the services in R12.2?
Answer:
In R12.1.3, stopping all application services with adstpall.sh requires only the APPS password. In R12.2 the script also prompts for the WebLogic admin password before it brings down the services, so that password must be supplied as well.

Thursday, January 7, 2016

Meaning of status_code and phase_code in FND_CONCURRENT_REQUESTS table

STATUS_CODE Column:
=================

A - Waiting
B - Resuming
C - Normal
D - Cancelled
E - Error
F - Scheduled
G - Warning
H - On Hold
I - Normal
M - No Manager
Q - Standby
R - Normal
S - Suspended
T - Terminating
U - Disabled
W - Paused
X - Terminated
Z - Waiting

PHASE_CODE column
=================

C - Completed
I - Inactive
P - Pending
R - Running

Tuesday, December 29, 2015

11gR2 RAC Administration Commands

1. Checking CRS status:

The two commands below are generally used to check the status of CRS on the local node and on all nodes of the cluster.

crsctl check crs ==> checks the status of the clusterware stack on the local node.

# pwd
/u01/app/11.2.0.3/grid/bin
21:34:56 root@its0003: /u01/app/11.2.0.3/grid/bin
# ./crsctl check crs
CRS-4638: Oracle High Availability Services is online
CRS-4537: Cluster Ready Services is online
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online
21:35:12 root@its0003: /u01/app/11.2.0.3/grid/bin
#

crsctl check cluster ==> checks the cluster services (CRS, CSS, EVM); without options it reports on the local node, and with -all it reports on every node in the cluster.

21:35:12 root@its0003: /u01/app/11.2.0.3/grid/bin
# ./crsctl check cluster
CRS-4537: Cluster Ready Services is online
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online
21:39:57 root@its0003: /u01/app/11.2.0.3/grid/bin
#

2.Viewing Cluster name:

Below are a few ways to get the name of the cluster.

# ./cemutlo -n

itsrac01

22:01:01 root@its0003: /u01/app/11.2.0.3/grid/bin

Oracle creates a directory named after the cluster under $ORA_CRS_HOME/cdata, so you can also get the cluster name from this directory.

22:06:00 root@its0003: /u01/app/11.2.0.3/grid/cdata

# ls

its0003 its0003.olr itsrac01 localhost

22:06:18 root@its0003: /u01/app/11.2.0.3/grid/cdata

olsnodes -c ===> displays the name of the cluster.

# ./olsnodes -c

itsrac01

06:28:57 root@its0003: /u01/app/11.2.0.3/grid/bin

3.Viewing Number Of Nodes configured in Cluster:

The command below lists the nodes registered in the cluster, which also gives the node count. It can display other information as well; see the usage details below.

olsnodes -n -s

# ./olsnodes -n -s

its0001 1 Active

its0002 2 Active

its0003 3 Active

06:25:53 root@its0003:

Usage: olsnodes [ [-n] [-i] [-s] [-t] [<node> | -l [-p]] | [-c] ] [-g] [-v]

where

-n print node number with the node name

-p print private interconnect address for the local node

-i print virtual IP address with the node name

<node> print information for the specified node

-l print information for the local node

-s print node status - active or inactive

-t print node type - pinned or unpinned

-g turn on logging

-v Run in debug mode; use at direction of Oracle Support only.

-c print clusterware name
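
To turn the olsnodes -n -s listing into an actual node count, the output can be piped through a small filter. This is a minimal sketch that assumes the three-column layout shown above (node name, node number, status); the function name count_active_nodes is hypothetical.

```shell
#!/bin/sh
# Count the nodes whose status column (third field) is "Active".
# Reads "olsnodes -n -s" style output on stdin; blank lines are ignored.
count_active_nodes() {
    awk '$3 == "Active" { n++ } END { print n + 0 }'
}
```

Typical use would be ./olsnodes -n -s | count_active_nodes, run from the Grid home bin directory.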

4.Votedisk information:

The below command will display the number of votedisks configured in the Cluster.

crsctl query css votedisk

# ./crsctl query css votedisk

## STATE File Universal Id File Name Disk group

-- ----- ----------------- --------- ---------

1. ONLINE 039f2497bfbf4f63bfb6ba0455c69921 (ORCL:OCR02) [OCR_VOTE]

Located 1 voting disk(s).

06:37:07 root@its0003: /u01/app/11.2.0.3/grid/bin

#

- Check the ocssd.log file for voting disk issues.
$ grep voting <grid_home>/log/<hostname>/cssd/ocssd.log

5.Viewing OCR Disk Information:

The command below displays the OCR files configured in the cluster, along with the OCR version and storage information.

A minimum of 1 and a maximum of 5 copies of the OCR are possible. Run this command as the root user; if it is run as the oracle user, it reports "logical corruption check bypassed due to non-privileged user".

- Use the cluvfy utility or the ocrcheck command to check the integrity of the OCR.

# cluvfy comp ocr -n all -verbose

ocrcheck

# ./ocrcheck

Status of Oracle Cluster Registry is as follows :

Version : 3

Total space (kbytes) : 262120

Used space (kbytes) : 5320

Available space (kbytes) : 256800

ID : 828879957

Device/File Name : +OCR_VOTE

Device/File integrity check succeeded

Device/File not configured

Cluster registry integrity check succeeded

Logical corruption check succeeded

06:42:03 root@its0003: /u01/app/11.2.0.3/grid/bin

- To determine the location of the OCR:

$ cat /etc/oracle/ocr.loc

ocrconfig_loc=+DATA

local_only=FALSE

6. Various Timeout Settings in Cluster:

Disktimeout:

Disk latency in seconds from node to voting disk. Default value is 200. (Disk IO)

Misscount:

Network latency in seconds from node to node (interconnect). Default value is 60 seconds on Linux and 30 seconds on Unix platforms. (Network IO)

Misscount < Disktimeout

NOTE: Do not change these values without contacting Oracle Support. Doing so may cause logical corruption of the data.

IF (Disk IO Time > Disktimeout) OR (Network IO Time > Misscount)
THEN
   REBOOT NODE
ELSE
   DO NOT REBOOT
END IF;
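
The rule above can be written as a small runnable function. This is only a sketch of the decision as stated here, not of how CSS is implemented internally; the function name should_reboot is hypothetical, all values are whole seconds, and the defaults of 200 and 60 are the ones quoted in this post.

```shell
#!/bin/sh
# Decide whether a node would be rebooted under the rule above.
# $1 = observed disk I/O time, $2 = observed network I/O time,
# $3 = disktimeout (default 200), $4 = misscount (default 60).
should_reboot() {
    disk_io="$1"
    net_io="$2"
    disktimeout="${3:-200}"
    misscount="${4:-60}"
    if [ "$disk_io" -gt "$disktimeout" ] || [ "$net_io" -gt "$misscount" ]; then
        echo "REBOOT NODE"
    else
        echo "DO NOT REBOOT"
    fi
}
```

For example, should_reboot 10 5 stays within both limits, while should_reboot 10 61 exceeds the misscount and therefore reports a reboot.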

crsctl get css disktimeout
crsctl get css misscount
crsctl get css reboottime

Disktimeout:

# ./crsctl get css disktimeout

CRS-4678: Successful get disktimeout 200 for Cluster Synchronization Services.

06:48:54 root@its0003: /u01/app/11.2.0.3/grid/bin

Misscount:

# ./crsctl get css misscount

CRS-4678: Successful get misscount 60 for Cluster Synchronization Services.

06:49:26 root@its0003: /u01/app/11.2.0.3/grid/bin
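
For scripting, the numeric value can be pulled out of the CRS-4678 message. The sketch below assumes the message format shown in the two outputs above; the function name css_value is hypothetical, and lines that do not match that format produce no output.

```shell
#!/bin/sh
# Extract the numeric setting from a "CRS-4678: Successful get ..." line.
# Reads crsctl output on stdin and prints only the number.
css_value() {
    sed -n 's/^CRS-4678: Successful get [a-z]* \([0-9][0-9]*\) for Cluster Synchronization Services\.$/\1/p'
}
```

Typical use would be ./crsctl get css misscount | css_value, for example to confirm from a script that misscount is still lower than disktimeout.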

- You can change the misscount value as shown below.

# ./crsctl set css misscount 80

Configuration parameter misscount is now set to 80

# ./crsctl get css misscount

- To set misscount back to its default value:

crsctl unset css misscount

# ./crsctl unset css misscount

Configuration parameter misscount is reset to default operation value.

# ./crsctl get css misscount

Reboottime:

# ./crsctl get css reboottime

7. OCR and Voting disks info.

OCR: It is created at the time of Grid installation. It stores the information needed to manage Oracle Clusterware and its components, such as RAC databases, listeners, VIPs, SCAN IPs, and services.

A minimum of 1 and a maximum of 5 copies of the OCR are possible.

Voting Disk: It manages information about node membership. Each voting disk must be accessible by all nodes in the cluster. If a node fails to send its heartbeat to the other nodes or to the voting disk, that node is evicted from the cluster.

A minimum of 1 and a maximum of 15 copies of the voting disk are possible.

New Facts in 11gR2:
• The OCR and voting disks can be stored on ASM or on a certified cluster file system.
• Voting disks and the OCR can be added or replaced dynamically.
• Backing up the voting disk with the “dd” command is not supported.
• The voting disks and the OCR can be kept in the same disk group or in different disk groups.
• Automatic backups of the voting disk and the OCR are kept together in a single file.
• Automatic backups of the voting disk and the OCR are taken every four hours, at the end of each day, and at the end of each week.
• You need a root or sudo-privileged account to manage them.

OCR and Voting Disks Backup:

In 11g Release 2 you no longer have to back up the voting disks separately, because voting disk data is included in every OCR backup (automatic and manual).

- OCR backups are made to the GRID_HOME/cdata/<cluster name> directory on the node performing the backups. These backups are named as follows:

4-hour backups (3 max) – backup00.ocr, backup01.ocr, and backup02.ocr.

Daily backups (2 max) – day.ocr and day_.ocr

Weekly backups (2 max) – week.ocr and week_.ocr

- Note that RMAN does not backup the OCR.

- You can use the ocrconfig command to view the current OCR backups. For example, to list the automatic backups:
# ./ocrconfig -showbackup auto

its0002 2014/06/04 05:43:16 /u01/app/11.2.0.3/grid/cdata/itsrac01/backup00.ocr

its0002 2014/06/04 01:43:14 /u01/app/11.2.0.3/grid/cdata/itsrac01/backup01.ocr

its0002 2014/06/03 21:43:14 /u01/app/11.2.0.3/grid/cdata/itsrac01/backup02.ocr

its0002 2014/06/02 09:43:07 /u01/app/11.2.0.3/grid/cdata/itsrac01/day.ocr

its0002 2014/05/22 09:42:21 /u01/app/11.2.0.3/grid/cdata/itsrac01/week.ocr

09:30:05 root@its0003: /u01/app/11.2.0.3/grid/bin
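
When a restore is needed, the usual first question is which automatic backup is the newest. As a minimal sketch, the listing above can be filtered for that. It assumes the four-column layout shown (node, date, time, file path) and paths without spaces; the function name latest_ocr_backup is hypothetical.

```shell
#!/bin/sh
# Print the path of the most recent backup from "ocrconfig -showbackup" output.
# Reads the listing on stdin; lines that do not have exactly four fields
# (blank lines, prompts) are skipped.
latest_ocr_backup() {
    awk 'NF == 4 { print $2, $3, $4 }' |   # date, time, path
        LC_ALL=C sort |                    # YYYY/MM/DD HH:MM:SS sorts as text
        tail -n 1 |
        awk '{ print $3 }'
}
```

Typical use would be ./ocrconfig -showbackup auto | latest_ocr_backup, which for the listing above would pick backup00.ocr.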

- One thing to be aware of is that if your cluster is shutdown, then the automatic backups will not occur (nor will the purging).

- If you need to back up the OCR immediately (for example, after making a number of cluster-related changes), you can use the ocrconfig command to perform a manual backup:

ocrconfig -manualbackup

- You can list the manual backups with the ocrconfig command too:

ocrconfig -showbackup manual

# ./ocrconfig -showbackup manual

its0002 2013/07/03 19:31:44 /u01/app/11.2.0.3/grid/cdata/itsrac01/backup_20130703_193144.ocr

its0002 2013/07/01 15:52:04 /u01/app/11.2.0.3/grid/cdata/itsrac01/backup_20130701_155204.ocr
09:34:24 root@itsolx0003: /u01/app/11.2.0.3/grid/bin
#

- ocrconfig also supports the creation of a logical backup of the OCR, as seen here:

ocrconfig -export /tmp/ocr.exp

- It is recommended that the OCR backup location be on a shared file system and that the cluster be configured to write the backups to that file system. To change the location of the OCR backups, you can use the ocrconfig command as seen in this example:

ocrconfig -backuploc /u01/app/oracle/ocrloc

- Note that the ASM Cluster File System (ACFS) does not support storage of OCR backups.

Add/Remove Votedisks

- To add or remove voting disks on non-Automatic Storage Management (ASM) storage, use the following commands:
# crsctl delete css votedisk path_to_voting_disk
# crsctl add css votedisk path_to_voting_disk

- To add a voting disk to ASM:
# crsctl replace votedisk +asm_disk_group

- Use the crsctl replace votedisk command to replace a voting disk on ASM. You do not have to delete any voting disks from ASM using this command.

Restoring the OCR

- If you back it up, there might come a time to restore it. Recovering the OCR from the physical backups is fairly straightforward; just follow these steps:

1. Locate the OCR backup using the ocrconfig command.
  ocrconfig -showbackup
2. Stop Oracle Clusterware (on all nodes)
  crsctl stop cluster -all
3. Stop CRS on all nodes
  crsctl stop crs   ----> stops CRS only on the node where it is run, so repeat it on each node.
4. Restore the OCR backup (physical) with the ocrconfig command.
  ocrconfig -restore {path_to_backup/backup_file_to_restore}
5. Restart CRS
  crsctl start crs
6. Check the integrity of the newly restored OCR:
  cluvfy comp ocr -n all

You can also restore the OCR using a logical backup as seen here:

1. Locate your logical backup.
2. Stop Oracle Clusterware (on all nodes)
   crsctl stop cluster -all
3. Stop CRS on all nodes
   crsctl stop crs
4. Restore the OCR backup (logical) with the ocrconfig command.
   ocrconfig -import /tmp/export_file.fil
5. Restart CRS
   crsctl start crs
6. Check the integrity of the newly restored OCR:
   cluvfy comp ocr -n all

- If you are upgrading to Oracle Database 11g you can migrate your voting disks to ASM easily with the crsctl replace command.

- You can also use the crsctl query command to locate the voting disks as seen in this example:

crsctl query css votedisk

- You can also migrate voting disks between NAS and ASM, in either direction, using the crsctl replace command.

To check the clusterware version:
$ crsctl query crs activeversion
Oracle Clusterware active version on cluster is [11.2.0.1.0]

Troubleshooting Oracle Clusterware

Oracle Clusterware Main Log Files:

Cluster Ready Service (CRS) logs are in <Grid_Home>/log/<hostname>/crsd/. The crsd.log file is archived every 10 MB (crsd.l01, crsd.l02,...)
Cluster Synchronization Service (CSS) logs are in <Grid_Home>/log/<hostname>/cssd/. The ocssd.log file is archived every 20 MB (ocssd.l01, ocssd.l02,...)
Event Manager (EVM) logs are in <Grid_Home>/log/<hostname>/evmd.
SRVM (srvctl) and OCR (ocrdump, ocrconfig, ocrcheck) logs are in <Grid_Home>/log/<hostname>/client/ and $ORACLE_HOME/log/<hostname>/client/.
Important Oracle Clusterware alerts can be found in alert<nodename>.log in the <Grid_Home>/log/<hostname> directory.
Oracle Cluster Registry tools (ocrdump, ocrcheck, ocrconfig) logs can be found in <Grid_Home>/log/<hostname>/client.
In addition, important Automatic Storage Management (ASM) related trace and alert information can be found in the <Grid_Base>/diag/asm/+asm/+ASMn directory, specifically the log and trace directories.

Diagnostics Collection Script:

- Use the diagcollection.pl script to collect diagnostic information from an Oracle Grid Infrastructure installation. The diagnostics provide additional information so that Oracle Support can resolve problems. This script is located in <Grid_Home>/bin

/u01/app/11.2.0/grid/bin/diagcollection.pl --collect

# /u01/app/11.2.0/grid/bin/diagcollection.pl --collect
Production Copyright 2004, 2008, Oracle. All rights reserved
Cluster Ready Services (CRS) diagnostic collection tool
The following diagnostic archives will be created in the local directory.
crsData_host01_20090729_1013.tar.gz -> logs,traces and cores from CRS home. Note: core files will be packaged only with the --core option.
ocrData_host01_20090729_1013.tar.gz -> ocrdump, ocrcheck etc
coreData_host01_20090729_1013.tar.gz -> contents of CRS core files
osData_host01_20090729_1013.tar.gz -> logs from Operating System

....

- To check CSS modules:

$ crsctl lsmodules css
The following are the Cluster Synchronization Services modules:
CSSD COMCRS COMMNS CLSF SKGFD

To enable tracing for cluvfy, netca, and srvctl, set SRVM_TRACE to TRUE:

$ export SRVM_TRACE=TRUE
$ srvctl config database -d orcl > /tmp/srvctl.trc
$ cat /tmp/srvctl.trc
...
[main] [ 2009-09-16 00:58:53.197 EDT ] [CRSNativeResult.addRIAttr:139] addRIAttr: name 'ora.orcl.db 3 1', 'USR_ORA_INST_NAME@SERVERNAME(host01)':'orcl1'
[main] [ 2009-09-16 00:58:53.197 EDT ] [CRSNativeResult.addRIAttr:139] addRIAttr: name 'ora.orcl.db 3 1', 'USR_ORA_INST_NAME@SERVERNAME(host02)':'orcl2'
[main] [ 2009-09-16 00:58:53.198 EDT ] [CRSNativeResult.addRIAttr:139] addRIAttr: name 'ora.orcl.db 3 1', 'USR_ORA_INST_NAME@SERVERNAME(host03)':'orcl3'
[main] [ 2009-09-16 00:58:53.198 EDT ] [CRSNative.searchEntities:857] found 3 ntitie
...

Cluster Verify Components:

CVU supports the notion of component verification. The verifications in this category are not associated with any specific stage. A component can range from basic, such as free disk space, to complex (spanning over multiple subcomponents), such as the Oracle Clusterware stack. Availability, integrity, or any other specific behavior of a cluster component can be verified.

You can list verifiable CVU components with the cluvfy comp -list command:

$ cluvfy comp -list

nodereach - Checks node reachability
peer - Compares properties with peers
nodecon - Checks node connectivity
ha - Checks HA integrity
cfs - Checks CFS integrity
asm - Checks ASM integrity
ssa - Checks shared storage
acfs - Checks ACFS integrity
space - Checks space availability
olr - Checks OLR integrity
sys - Checks minimum requirements
gpnp - Checks GPnP integrity
clu - Checks cluster integrity
gns - Checks GNS integrity
clumgr - Checks cluster manager integrity
scan - Checks SCAN configuration
ocr - Checks OCR integrity
ohasd - Checks OHASD integrity
admprv - Checks administrative privileges
crs - Checks CRS integrity
software - Checks software distribution
vdisk - Checks Voting Disk Udev settings
clocksync - Checks clock synchronization
nodeapp - Checks node applications’ existence

Note: For manual installation, you need to install CVU on only one node. CVU deploys itself on remote nodes during executions that require access to remote nodes.

Cluster Verify Output: Example

$ cluvfy comp crs -n all -verbose
Verifying CRS integrity
Checking CRS integrity...
The Oracle clusterware is healthy on node "host03"
The Oracle clusterware is healthy on node "host02"
The Oracle clusterware is healthy on node "host01"
CRS integrity check passed
Verification of CRS integrity was successful.
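
On a larger cluster it can help to check the verbose output against the list of nodes you expect. The following is a rough sketch, assuming cluvfy keeps the wording shown above ('healthy on node "<name>"'). The function name unhealthy_nodes is hypothetical.

```shell
#!/bin/sh
# unhealthy_nodes: hypothetical helper, not part of CVU.
# Reads cluvfy output on stdin and prints every expected node (given as
# arguments) that has no 'healthy on node "<name>"' line in that output.
unhealthy_nodes() {
    output=$(cat)
    for node in "$@"; do
        printf '%s\n' "$output" | grep -q "healthy on node \"$node\"" \
            || printf '%s\n' "$node"
    done
}

# Usage:
# cluvfy comp crs -n all -verbose | unhealthy_nodes host01 host02 host03
```

An empty result means every listed node was reported healthy.
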

Write a shell script to copy log files before they wrap:

#!/bin/sh
# Script to archive log files before wrapping occurs.
# Written for CSS logs. Modify for other log file types.
CSSLOGDIR=/u01/app/11.2.0/grid/log/host01/cssd
# Loop until interrupted, taking a snapshot every 5 minutes.
while true; do
   # One archive per run, named by date and time (MMDDYY_HHMM).
   CSSFILE=/tmp/css_$(date +%m%d%y_%H%M).tar
   tar -cf "$CSSFILE" "$CSSLOGDIR"/*
   sleep 300
done
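
The loop above writes a new archive to /tmp every 300 seconds and never removes old ones, so the file system can fill up on a long run. A possible companion is sketched below. The function name prune_css_archives and the retention period are assumptions, and -maxdepth and -mmin are find extensions available in GNU and BSD find rather than POSIX.

```shell
#!/bin/sh
# prune_css_archives: hypothetical companion to the archive loop above.
# Deletes css_*.tar files directly under directory $1 that were last
# modified more than $2 minutes ago.
prune_css_archives() {
    find "$1" -maxdepth 1 -type f -name 'css_*.tar' -mmin +"$2" -exec rm -f {} +
}

# Usage: keep roughly one day of archives.
# prune_css_archives /tmp 1440
```
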

Processes That Can Reboot Nodes:

The following processes can evict nodes from the cluster or cause a node reboot:

hangcheck-timer: Monitors for machine hangs and pauses (it is not required in 11gR2 but required for 11gR1)
oclskd: Is used by CSS to reboot a node based on requests from other nodes in the cluster
ocssd: Monitors internode health status

Note: While the hangcheck-timer module is still required for Oracle Database 11g Release 1 RAC databases, it is no longer needed for Oracle Database 11g Release 2 RAC.

Determining Which Process Caused Reboot:
Most of the time, the process writes error messages to its log file when a reboot is required.

- ocssd
  - /var/log/messages
  - <Grid_Home>/log/<hostname>/cssd/ocssd.log

- oclskd
  - <Grid_Home>/log/<hostname>/client/oclskd.log

- hangcheck-timer
  - /var/log/messages
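
The process-to-log mapping above can be captured in a small lookup, sketched here so that the right files are at hand after an unexpected reboot. The function name reboot_logs is hypothetical. The Grid home and hostname are passed in as arguments rather than guessed.

```shell
#!/bin/sh
# reboot_logs: hypothetical helper that prints the log files listed above
# for one process. $1 = process name, $2 = Grid home, $3 = hostname.
reboot_logs() {
    case "$1" in
        ocssd)
            printf '%s\n' /var/log/messages "$2/log/$3/cssd/ocssd.log" ;;
        oclskd)
            printf '%s\n' "$2/log/$3/client/oclskd.log" ;;
        hangcheck-timer)
            printf '%s\n' /var/log/messages ;;
        *)
            echo "unknown process: $1" >&2
            return 1 ;;
    esac
}

# Usage: show the tail of each candidate log.
# for f in $(reboot_logs ocssd /u01/app/11.2.0/grid host01); do tail -n 50 "$f"; done
```
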

Using diagwait for Eviction Troubleshooting:
When a node is evicted on a busy system, the OS may not have had time to flush logs and trace files before reboot.
- Use the diagwait CSS attribute to allow more time.
- It does not guarantee that logs will be written.
- The recommended value is 13 seconds.
- Changing it requires a clusterwide outage.
- It is not enabled by default.
- To enable:

# crsctl set css diagwait 13 -force
- To disable:

# crsctl unset css diagwait

Using ocrdump to View Logical Contents of the OCR:
- The ocrdump utility lets you view the OCR content for troubleshooting. It writes the logical contents to a file or displays them on stdout in a readable format.

- To dump the OCR contents into a text file for reading:
[grid]$ ocrdump filename_with_limited_results.txt
[root]# ocrdump filename_with_full_results.txt

- To dump the OCR contents for a specific key:
# ocrdump -keyname SYSTEM.language

- To dump the OCR contents to stdout in XML format:
# ocrdump -stdout -xml

- To dump the contents of an OCR backup file:
# ocrdump -backupfile week.ocr

- If the ocrdump command is issued without any options, the output is written to a file named OCRDUMPFILE in the current directory, provided that the directory is writable.

- To determine all the changes that have occurred in the OCR over the previous week, locate the automatic backup from the previous week and compare it with a dump of the current OCR as follows:

# ocrdump
# ocrdump -stdout -backupfile week.ocr | diff - OCRDUMPFILE
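
A full diff of two dumps can be long. As a first pass, it may be enough to list only the key names in a dump. The sketch below assumes that each key in the text dump sits on its own line in the form [KEY.NAME]. That layout is an assumption about the dump format, and the function name ocr_keys is hypothetical.

```shell
#!/bin/sh
# ocr_keys: hypothetical helper, not part of the OCR tools.
# Prints the key names found in a text dump ($1), assuming each key
# appears alone on a line as [KEY.NAME].
ocr_keys() {
    sed -n 's/^\[\(.*\)]$/\1/p' "$1"
}

# Usage:
# ocr_keys OCRDUMPFILE
```
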

Checking the Integrity of the OCR:

- Use the ocrcheck command to check OCR integrity.
$ ocrcheck
Status of Oracle Cluster Registry is as follows :
    Version                  :          2
    Total space (kbytes)     :     275980
    Used space (kbytes)      :       2824
    Available space (kbytes) :     273156
    ID                       : 1274772838
    Device/File Name         : +DATA1
                        Device/File integrity check succeeded
    Device/File Name         : +DATA2
                        Device/File integrity check succeeded
    Cluster registry integrity check succeeded
    Logical corruption check succeeded
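
The space figures in this output can be turned into a usage percentage for monitoring. This is a minimal sketch, assuming the "Total space (kbytes)" and "Used space (kbytes)" labels appear as shown above. The function name ocr_used_pct is hypothetical.

```shell
#!/bin/sh
# ocr_used_pct: hypothetical helper, not part of the OCR tools.
# Reads ocrcheck output on stdin and prints used space as a percentage
# of total space, to one decimal place. Fails if no total is found.
ocr_used_pct() {
    awk -F: '
        /Total space \(kbytes\)/ { total = $2 + 0 }
        /Used space \(kbytes\)/  { used  = $2 + 0 }
        END {
            if (total > 0) printf "%.1f\n", used * 100 / total
            else exit 1
        }'
}

# Usage (prints 1.0 for the sample output above: 2824 of 275980 KB):
# ocrcheck | ocr_used_pct
```
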

OCR-Related Tools for Debugging:

OCR tools:
- ocrdump
- ocrconfig
- ocrcheck
- srvctl
Logs are generated in the following directory:
<Grid_Home>/log/<hostname>/client/
Debugging is controlled through the following file:
<Grid_Home>/srvm/admin/ocrlog.ini

- These utilities create log files in <Grid_Home>/log/<hostname>/client/. To change the amount of logging, edit the <Grid_Home>/srvm/admin/ocrlog.ini file.

- The default value of mesg_logging_level is 0, which means minimum logging: only error conditions are logged. You can change this setting to 3 or 5 for more detailed logging information.
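
Changing the level by hand is a one-line edit, but a scripted version makes it easy to switch back afterwards. The sketch below assumes that ocrlog.ini holds an uncommented line of the form "mesg_logging_level = <n>". That layout is an assumption, and the function name set_mesg_logging_level is hypothetical. If the line is missing or commented out, the file is left unchanged.

```shell
#!/bin/sh
# set_mesg_logging_level: hypothetical helper, not part of the OCR tools.
# Rewrites the mesg_logging_level line in file $1 to value $2, assuming
# the file uses "name = value" lines. Keeps the original as $1.bak.
set_mesg_logging_level() {
    cp "$1" "$1.bak" || return 1
    sed "s/^[[:space:]]*mesg_logging_level[[:space:]]*=.*/mesg_logging_level = $2/" \
        "$1.bak" > "$1"
}

# Usage: raise the level for a debugging session, then restore it.
# set_mesg_logging_level <Grid_Home>/srvm/admin/ocrlog.ini 5
# set_mesg_logging_level <Grid_Home>/srvm/admin/ocrlog.ini 0
```
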