CRSD in Oracle Clusterware – Your RAC's Silent Orchestrator


In Oracle RAC (Real Application Clusters), high availability is a requirement, not just a goal. Behind the scenes, CRSD (Cluster Ready Services Daemon) is the component that starts, monitors, and recovers the cluster's databases, listeners, VIPs, and services.

In this post, we will explore:

  • What is CRSD?
  • Its core responsibilities and architecture
  • Interaction with other clusterware components
  • Failover behavior and how it handles resource recovery
  • Diagnostic commands and logs

📘 What is CRSD?

The Cluster Ready Services Daemon (CRSD) is a core component of Oracle Clusterware. It is responsible for:

  • Managing high availability (HA) resources
  • Starting and stopping databases, listeners, ASM, VIPs, and services
  • Orchestrating failover and restarts when a node or resource fails

CRSD runs as root and ensures that critical Oracle and non-Oracle applications are managed according to the policies defined in the OCR.

⚙️ CRSD Responsibilities

Function | Description
🚀 Resource Management | Starts/stops Oracle databases, listeners, services, and ASM
🔁 Failover Handling | Relocates resources during node or service failure
🧠 Dependency Tracking | Maintains relationships (e.g., a database needs its ASM disk groups online first; a listener needs its VIP)
📡 VIP & SCAN Management | Controls node Virtual IPs, SCAN VIPs, and SCAN listeners
📒 OCR Interaction | Reads and writes to the Oracle Cluster Registry for resource state/config
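
Dependency tracking boils down to ordering: a resource starts only after everything it relies on is online. The sketch below illustrates that idea with a topological sort from Python's standard library (Python 3.9+). It is not how CRSD is implemented, and the resource names and the dependency map are hypothetical examples rather than output from a real cluster.

```python
from graphlib import TopologicalSorter

# Hypothetical map: resource -> resources that must be online first.
# Real clusters define this through each resource's dependency attributes.
HARD_DEPS = {
    "ora.net1.network": set(),
    "ora.node1.vip": {"ora.net1.network"},
    "ora.LISTENER.lsnr": {"ora.node1.vip"},
    "ora.asm": set(),
    "ora.DATA.dg": {"ora.asm"},
    "ora.orcl.db": {"ora.DATA.dg"},
}


def start_order(deps):
    """Return one valid start order in which every prerequisite comes first."""
    return list(TopologicalSorter(deps).static_order())


order = start_order(HARD_DEPS)
for name in order:
    print(name)
```

Stopping resources would walk the same order in reverse, so that nothing is shut down while something else still relies on it.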

🧱 CRSD Architecture

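CRSD anchors the Cluster Ready Services stack, the upper of the two Oracle Clusterware stacks. It is started by the Oracle High Availability Services daemon (OHASD), which belongs to the lower stack. CRSD spawns and manages several agent processes:

  • crsd.bin – Main daemon running as root
  • oraagent.bin – Agent running as the Grid or Oracle software owner; handles database, ASM, listener, and service operations
  • orarootagent.bin – Agent running as root; manages root-owned resources such as node VIPs, SCAN VIPs, and the network resource

How it works:

crsd.bin
   ├── oraagent.bin       ➝ manages Oracle resources (databases, ASM, listeners, services)
   ├── orarootagent.bin   ➝ manages root-owned infra like node VIPs and SCAN VIPs
   └── OCR                ➝ registry (not a process) where configuration is read and stored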

🔄 What Happens During Resource Failures?

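  • If a service or DB crashes, CRSD first attempts a local restart, up to the limit set by the resource's RESTART_ATTEMPTS attribute.
  • If the restart does not succeed and the resource is allowed to run elsewhere, it is relocated to another node.
  • Failover logic is determined by the resource's target state and placement settings (for example, the PLACEMENT, HOSTING_MEMBERS, and SERVER_POOLS attributes).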
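
The restart-then-relocate sequence above can be sketched as a small decision function. This is a simplified model written for illustration, not Oracle code: the `Resource` class, the `handle_failure` function, and the service and node names are all hypothetical, and real placement decisions involve more inputs than a preference list.

```python
from dataclasses import dataclass
from typing import List, Optional, Set, Tuple


@dataclass
class Resource:
    """Illustrative stand-in for a Clusterware resource (not an Oracle API)."""
    name: str
    restart_attempts: int      # plays the role of the RESTART_ATTEMPTS attribute
    candidates: List[str]      # nodes the placement policy allows, in preference order
    node: Optional[str] = None # node the resource currently runs on
    restarts_used: int = 0     # local restarts consumed on the current node


def handle_failure(res: Resource, live_nodes: Set[str]) -> Tuple[str, Optional[str]]:
    """Decide what to do after `res` fails on its current node."""
    # 1. Prefer a local restart while the node is alive and attempts remain.
    if res.node in live_nodes and res.restarts_used < res.restart_attempts:
        res.restarts_used += 1
        return ("restart", res.node)
    # 2. Otherwise relocate to the first other live node that placement allows.
    for node in res.candidates:
        if node in live_nodes and node != res.node:
            res.node = node
            res.restarts_used = 0
            return ("relocate", node)
    # 3. Nowhere left to run: the resource stays offline.
    res.node = None
    return ("offline", None)


svc = Resource("ora.orcl.sales.svc", restart_attempts=1,
               candidates=["node1", "node2"], node="node1")
first = handle_failure(svc, {"node1", "node2"})   # local restart on node1
second = handle_failure(svc, {"node1", "node2"})  # attempts exhausted, so relocate
print(first, second)
```

Resetting the restart counter after a relocation is a design choice of this sketch; it simply gives the resource a fresh budget of local restarts on its new node.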

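If CRSD itself fails:

  • OHASD detects the failure and restarts CRSD automatically.
  • A CRSD failure does not by itself evict the node. Eviction is decided by CSSD, which removes a node when it fails network or voting-disk heartbeat checks.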

🔍 What Resources Does CRSD Manage?

  • 📡 SCAN Listeners and SCAN VIPs
  • 🧱 Oracle Databases (RAC, RAC One Node)
  • 🔧 ASM Instances
  • 💻 Node VIPs
  • 🖥️ Application resources registered by srvctl
  • 🧩 Custom resources via action scripts
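
For custom resources, the agent invokes the action script with the name of an entry point (start, stop, check, or clean) as its first argument and reads the exit code, where 0 signals success or, for check, that the resource is online. Action scripts are usually shell scripts, but any executable works. The following is a minimal Python sketch of that contract; the marker file and the "application" it represents are hypothetical placeholders for real start and stop logic.

```python
import os
import tempfile

# Hypothetical marker file standing in for a real application's PID file.
PID_FILE = os.path.join(tempfile.gettempdir(), "demo_app.pid")


def start():
    """Bring the application up; returning 0 reports a successful start."""
    with open(PID_FILE, "w") as handle:
        handle.write(str(os.getpid()))
    return 0


def stop():
    """Shut the application down gracefully."""
    if os.path.exists(PID_FILE):
        os.remove(PID_FILE)
    return 0


def check():
    """Report state: 0 means online, non-zero means not online."""
    return 0 if os.path.exists(PID_FILE) else 1


def clean():
    """Remove leftovers when a normal stop is not possible."""
    return stop()


ACTIONS = {"start": start, "stop": stop, "check": check, "clean": clean}


def main(argv):
    """Dispatch on the entry point name passed as the first argument."""
    if len(argv) != 2 or argv[1] not in ACTIONS:
        return 2  # usage error
    return ACTIONS[argv[1]]()

# A deployed script would finish with:
#   import sys; sys.exit(main(sys.argv))
```

Keeping each entry point as a plain function that returns its exit code makes the script easy to exercise outside the cluster before it is registered.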

🧪 Diagnostic Commands

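Use these to check and control CRSD and the resources it manages:

crsctl check crs                      # reports whether OHAS, CRS, CSS, and EVM are online on the local node
crsctl status resource -t             # tabular view of every registered resource and its state
crsctl start resource ora.dbname.db   # starts a single resource (for ora.* resources, srvctl is the recommended tool)
crsctl stop crs                       # stops the entire Clusterware stack on this node; run as root, not a health check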
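
Without the -t flag, crsctl status resource prints one NAME/TYPE/TARGET/STATE block per resource, which is easier to process in a script than the tabular view. The sketch below parses that layout and lists resources whose target is ONLINE but whose state is not. The sample text is a hypothetical illustration, and the exact spacing of real output may vary by release.

```python
# Hypothetical sample in the key=value layout of `crsctl status resource`.
SAMPLE = """\
NAME=ora.LISTENER.lsnr
TYPE=ora.listener.type
TARGET=ONLINE
STATE=ONLINE on node1

NAME=ora.orcl.db
TYPE=ora.database.type
TARGET=ONLINE , ONLINE
STATE=ONLINE on node1, OFFLINE
"""


def parse_status(text):
    """Split the output into one dict per resource block."""
    resources, current = [], {}
    for line in text.splitlines():
        line = line.strip()
        if not line:                      # a blank line ends the current block
            if current:
                resources.append(current)
                current = {}
            continue
        key, sep, value = line.partition("=")
        if sep:
            current[key] = value
    if current:
        resources.append(current)
    return resources


def needs_attention(resources):
    """Names of resources with an ONLINE target that is not met somewhere."""
    flagged = []
    for res in resources:
        targets = [t.strip() for t in res.get("TARGET", "").split(",")]
        states = [s.strip() for s in res.get("STATE", "").split(",")]
        if any(t == "ONLINE" and not s.startswith("ONLINE")
               for t, s in zip(targets, states)):
            flagged.append(res["NAME"])
    return flagged


parsed = parse_status(SAMPLE)
print(needs_attention(parsed))
```

On a cluster node, the text could come from subprocess.run(["crsctl", "status", "resource"], capture_output=True, text=True).stdout instead of the sample.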

📂 Important Log Files

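Component | Log Path
CRSD Logs | $GRID_HOME/log/<hostname>/crsd/crsd.log
Agent Logs | $GRID_HOME/log/<hostname>/agent/
OCR Interaction | $GRID_HOME/log/<hostname>/crsd/ocrdump.log

These paths follow the 11g Release 2 layout. In 12.1.0.2 and later, Clusterware writes its traces to the ADR under $ORACLE_BASE/diag/crs/<hostname>/crs/trace/ (for example, crsd.trc).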

🛡 Best Practices

  • ✅ Monitor crsd.log regularly for failover events
  • 🔐 Avoid manual resource manipulation; always go through crsctl or srvctl
  • 🔧 Register resources with proper dependencies and policies
  • 📒 Keep OCR and voting disks healthy to prevent node eviction

🧭 Conclusion

CRSD is the brain of high availability in Oracle RAC. It automates resource startup, monitors health, and performs complex orchestration with little human interaction. Understanding its architecture and log files can make troubleshooting more predictable and help ensure your cluster runs smoothly.

When RAC behaves like magic—CRSD is the magician behind the curtain.
