Oracle Clusterware

Oracle Clusterware is the cross-platform cluster software required to run the Real Application Clusters (RAC) option for Oracle Database. It provides the basic clustering services at the operating-system level that enable Oracle Database software to run in clustering mode. In earlier versions of Oracle (release 9i and earlier), RAC required vendor-supplied clusterware such as Sun Cluster or Veritas Cluster Server (except when running on Linux or on Microsoft Windows).


Oracle Clusterware Components

Oracle Clusterware is the software that enables the nodes to communicate with each other, allowing them to form a cluster of nodes that behaves as a single logical server. Oracle Clusterware is run by Cluster Ready Services (CRS), which relies on two key components: the Oracle Cluster Registry (OCR), which records and maintains the cluster and node membership information; and the voting disk, which polls for consistent heartbeat information from all the nodes when the cluster is running and acts as a tiebreaker during communication failures. The CRS service has four components, each handling a variety of functions: Cluster Ready Services daemon (CRSd), Oracle Cluster Synchronization Service Daemon (OCSSd), Event Volume Manager Daemon (EVMd), and Oracle Process Clusterware Daemon (OPROCd). Failure or death of the CRS daemon can cause node failure, which triggers an automatic reboot of the affected node to avoid data corruption (due to the possible failure of communication between the nodes); this is also known as fencing. The CRS daemon runs as "root" (superuser) on UNIX platforms and as a service on Windows platforms.
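
The tiebreaker role described above can be illustrated with a small sketch. It is not Oracle's implementation: the function names and the decision rule (the largest group of mutually reachable nodes survives, and ties go to the group containing the lowest node name) are assumptions chosen only to show how a split cluster might settle on one surviving side and fence the rest.

```python
def surviving_subcluster(subclusters):
    """Pick the group of nodes that stays up after a communication failure.

    Assumed rule (illustrative only): the largest group wins; on a tie,
    the group that contains the lowest node name wins.
    """
    return min(subclusters, key=lambda group: (-len(group), min(group)))


def nodes_to_fence(subclusters):
    """Return the nodes outside the surviving group, i.e. those to reboot."""
    survivor = surviving_subcluster(subclusters)
    fenced = set()
    for group in subclusters:
        if group is not survivor:
            fenced |= set(group)
    return sorted(fenced)


if __name__ == "__main__":
    # Hypothetical three-node cluster where node3 lost the interconnect.
    split = [{"node1", "node2"}, {"node3"}]
    print(surviving_subcluster(split))
    print(nodes_to_fence(split))
```

Whatever rule is used, the point is that every node reaches the same verdict from shared information, so only one side of the split keeps writing to the database.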


CRSd

The following functions are provided by the Oracle Cluster Ready Services daemon (CRSd):
* CRS is installed and run from a separate home directory known as ORA_CRS_HOME, which is independent of ORACLE_HOME.
* CRSd manages resources: it starts and stops services and fails over application resources. It spawns separate processes to manage application resources.
* The CRS daemon has two modes of running: ‘reboot’ mode, used during a planned clusterware start, and ‘restart’ mode, used after an unplanned shutdown.
* In reboot mode it auto-starts all the resources under its management. In restart mode it preserves the previous state and returns the resources to the state they were in before the shutdown.
* It manages the Oracle Cluster Registry and stores the current known state in it.
* It runs as ‘root’ on Unix and ‘LocalSystem’ on Windows, and restarts automatically in case of failure.
* CRS requires the public interface, the private interface and the virtual IP (VIP) for its operation. All these interfaces should be up and running, and they should be able to ping each other, before the CRS installation starts. Without this network infrastructure CRS cannot be installed.
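
The difference between the two start modes can be sketched as follows. This is a simplified model, not CRSd code: `StartMode`, `resources_to_start`, the resource names and the "ONLINE"/"OFFLINE" labels are all hypothetical, and the recorded state stands in for what the list above says is kept in the Oracle Cluster Registry.

```python
from enum import Enum


class StartMode(Enum):
    REBOOT = "reboot"    # planned clusterware start
    RESTART = "restart"  # start after an unplanned shutdown


def resources_to_start(mode, resources, previous_state):
    """Decide which managed resources to bring online.

    resources: iterable of resource names under management.
    previous_state: dict mapping resource name to "ONLINE" or "OFFLINE",
        as last recorded before the shutdown (hypothetical format).
    """
    if mode is StartMode.REBOOT:
        # Reboot mode: auto-start everything under management.
        return sorted(resources)
    # Restart mode: bring back only what was running before.
    return sorted(r for r in resources if previous_state.get(r) == "ONLINE")


if __name__ == "__main__":
    managed = ["db", "listener", "vip"]
    recorded = {"db": "ONLINE", "listener": "OFFLINE", "vip": "ONLINE"}
    print(resources_to_start(StartMode.REBOOT, managed, recorded))
    print(resources_to_start(StartMode.RESTART, managed, recorded))
```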


OCSSd

Oracle Cluster Synchronization Services daemon (OCSSd) provides basic ‘group services’ support. Group Services is a distributed group membership system that allows applications to coordinate activities to achieve a common result. As such, it provides synchronization services between nodes and access to the node membership information, and enables basic cluster services, including cluster group services and cluster locking. It can also run without integration with vendor clusterware. Failure of OCSSd causes the machine to reboot to avoid a split-brain situation. OCSSd is also required in a single-instance configuration if Automatic Storage Management (ASM) is used. ASM was a new feature in Oracle 10g. OCSSd runs as the "oracle" user.

The following functions are provided by the Oracle Cluster Synchronization Services daemon (OCSSd):
* ‘Group Services’ uses vendor-provided clusterware group services when they are available, but is also capable of working independently when they are not.
* ‘Lock Services’ provides the basic cluster-wide serialization locking functions, and uses a FIFO mechanism to manage locking.
* ‘Node Services’ uses OCR to store state data, and updates the information during reconfiguration. It also manages the OCR data, which is otherwise static.
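
The FIFO locking mentioned under Lock Services can be pictured with a minimal single-process sketch. The `FifoLock` class is hypothetical and is not an OCSSd interface; it only shows the ordering property, namely that waiters are granted the lock in the order in which they asked for it.

```python
from collections import deque


class FifoLock:
    """A serialization lock that grants waiters in arrival order."""

    def __init__(self):
        self.holder = None
        self.waiters = deque()

    def acquire(self, requester):
        """Grant the lock if it is free; otherwise queue the requester.

        Returns True if the requester now holds the lock.
        """
        if self.holder is None:
            self.holder = requester
            return True
        self.waiters.append(requester)
        return False

    def release(self, requester):
        """Release the lock and hand it to the oldest waiter, if any.

        Returns the new holder, or None if nobody was waiting.
        """
        if self.holder != requester:
            raise ValueError(f"{requester} does not hold the lock")
        self.holder = self.waiters.popleft() if self.waiters else None
        return self.holder


if __name__ == "__main__":
    lock = FifoLock()
    lock.acquire("node1")
    lock.acquire("node3")
    lock.acquire("node2")
    print(lock.release("node1"))  # node3 asked first, so it goes next
```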


EVMd

The third component in CRS is the Event Volume Management Logger daemon (EVMd). EVMd spawns a permanent child process called "evmlogger" and generates events. The evmlogger child process spawns new child processes on demand and scans the callout directory to invoke callouts. EVMd restarts automatically on failure, and the death of the EVMd process does not halt the instance. EVMd runs as the "oracle" user.
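
The scan-and-invoke behaviour of the callout mechanism can be sketched roughly as below. This is an illustration under stated assumptions, not evmlogger's actual logic: the function names are hypothetical, the directory path is supplied by the caller, and the idea that each callout is an executable file that receives the event details as command-line arguments is assumed for the example.

```python
import os
import subprocess


def find_callouts(callout_dir):
    """Return the executable regular files in callout_dir, in name order."""
    callouts = []
    for name in sorted(os.listdir(callout_dir)):
        path = os.path.join(callout_dir, name)
        if os.path.isfile(path) and os.access(path, os.X_OK):
            callouts.append(path)
    return callouts


def invoke_callouts(callout_dir, event_args):
    """Run every callout as a child process, passing the event as arguments.

    Returns a dict mapping each callout path to its exit code.
    """
    results = {}
    for path in find_callouts(callout_dir):
        proc = subprocess.run([path, *event_args], capture_output=True, text=True)
        results[path] = proc.returncode
    return results
```

Running each callout in its own child process keeps a slow or crashing script from taking the logger down with it, which matches the description of children being spawned on demand.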


OPROCd

OPROCd provides the server fencing solution for Oracle Clusterware. It is the process monitor for Oracle Clusterware and uses the hang check timer or watchdog timer (depending on the implementation) to protect cluster integrity. OPROCd is locked in memory and runs as a real-time process. It sleeps for a fixed time and runs as the "root" user. Failure of the OPROCd process causes the node to restart. OPROCd is important enough that it is itself monitored by a process called OCLSOMON, which causes a cluster node to reboot if OPROCd hangs.
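
The sleep-and-check idea behind a hang check timer can be sketched as follows. This is a generic illustration, not OPROCd's code: the names `overslept` and `watchdog_loop`, the `interval` and `margin` parameters, and the `on_hang` callback (which stands in for the node reboot) are all assumptions. The monitor sleeps for a fixed interval and, on waking, compares the actual wake-up time with the expected one; waking too late suggests the node was stalled.

```python
import time


def overslept(expected_wake, actual_wake, margin):
    """True if the wake-up came later than the allowed margin."""
    return (actual_wake - expected_wake) > margin


def watchdog_loop(interval, margin, on_hang, iterations,
                  clock=time.monotonic, sleep=time.sleep):
    """Sleep for a fixed interval and check for late wake-ups.

    Calls on_hang() and returns True on the first late wake-up;
    returns False if all iterations complete on time.
    clock and sleep are injectable so the loop can be exercised
    without real delays.
    """
    for _ in range(iterations):
        expected = clock() + interval
        sleep(interval)
        if overslept(expected, clock(), margin):
            on_hang()  # stand-in for fencing the node
            return True
    return False


if __name__ == "__main__":
    hung = watchdog_loop(interval=0.1, margin=0.5,
                         on_hang=lambda: print("node would be fenced"),
                         iterations=3)
    print("hang detected:", hung)
```

A monotonic clock is used so that adjustments to the wall-clock time are not mistaken for a stall.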


References


Clusterware Administration and Deployment Guide


External links




Oracle Database 10g Real Application Clusters Handbook - Oracle Press

Using srvctl to Manage your 10g RAC Database - includes description of Oracle Clusterware components.