Exadata Monitoring Configuration







A Setup Guide for Our Customers




Preface


When you need technical assistance with your Exadata appliance, primary support is obtained by opening a ticket through the Natrinsic Helpdesk Ticketing System. (Please refer to our guide, How to Create a Ticket, for more details.) Natrinsic also provides an event-driven, 24x7 monitoring service that augments primary support. By following the instructions in this document you can create or modify events on your Exadata appliance(s). Once monitoring is enabled, and provided your support contract includes system monitoring, a ticket is opened automatically whenever one of the monitored events is detected, and the appropriate Natrinsic Support team investigates it.


Comments and Questions


Registered users who would like to comment on or ask questions about this guide may leave a comment below or email support@natrinsic.com with the subject line “General Support Inquiry”. In the body of your email, provide your company name, your position, and the system identifier of at least one system for which you have a current support contract.




Configuring your Exadata for Monitoring


Overview


Prerequisites


Certain prerequisites must be met for the supplied scripts to work. Perform each of the prerequisite checks below; otherwise the monitoring system will not work as designed.


Two group files must be present for each Exadata: one listing all compute and cell nodes, and one listing the cell nodes alone. In this document they are referred to as follows:


all_group - the file containing the “short” hostname of each compute and cell node

cell_group - the file containing the “short” hostname of each cell node


These files may already exist in /root or /opt/oracle.SupportTools. If not, the customer must create them.
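If the group files have to be created by hand, a minimal sketch along the following lines may help. The hostnames (compute1, compute2, cell1, cell2) and the GROUP_DIR variable are placeholders chosen for illustration, not values taken from your system; note that the script overwrites any existing files of the same name in that directory.

```shell
#!/bin/sh
# Sketch: create the two dcli group files, one short hostname per line.
# All hostnames below are placeholders (assumptions); replace them with your own.
GROUP_DIR="${GROUP_DIR:-.}"   # hypothetical variable: directory to write the files to

# Cell (storage) nodes only.
cat > "$GROUP_DIR/cell_group" <<'EOF'
cell1
cell2
EOF

# Compute nodes (placeholder names) followed by the cell nodes.
{
  printf '%s\n' compute1 compute2
  cat "$GROUP_DIR/cell_group"
} > "$GROUP_DIR/all_group"
```

Run it once from the directory where the files should live, then review both files before using them with dcli.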


The compute node that runs the supplied dcli script must be able to send email (SMTP).
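One rough way to check the SMTP prerequisite is to see whether the compute node can open a TCP connection to the mail server. The sketch below assumes bash (for its /dev/tcp redirection) and the coreutils timeout command are available; the function name smtp_reachable and the host mail.example.com are illustrative only. It confirms network reachability, not that the server will actually relay mail.

```shell
#!/bin/bash
# Sketch: test whether a TCP connection to the SMTP server can be opened.
# Relies on bash's /dev/tcp redirection and on `timeout` from coreutils.
smtp_reachable() {
  # $1 = mail server host, $2 = port (defaults to 25)
  timeout 5 bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$1" "${2:-25}" 2>/dev/null
}

# Example call; mail.example.com is a placeholder, not a real relay:
# smtp_reachable mail.example.com && echo "SMTP port reachable"
```

The function returns zero only if the connection opens within five seconds, so it can be used directly in an if statement or with &&.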


The root user must also have user equivalence (that is, passwordless ssh communication) configured for every compute and cell node.


This simple test, which must be performed for every compute and cell node, shows whether user equivalence is currently set up. This example tests two cell nodes:


[root@compute1 ~]# ssh cell1 date

Tue Mar 22 03:21:48 PDT 2011

[root@compute1 ~]# ssh cell2 date

Tue Mar 22 03:21:53 PDT 2011


Or, using dcli:

[root@compute1 ~]# dcli -k -g cell_group

Error: Neither RSA nor DSA keys have been generated for current user.

Run 'ssh-keygen -t dsa' to generate an ssh key pair.



If the ssh command prompts for a password, or if the dcli command returns the error shown above, user equivalence is not set up and must be configured as follows:


[root@compute1 ~]# ssh-keygen -t dsa

Generating public/private dsa key pair.

Enter file in which to save the key (/root/.ssh/id_dsa):

Enter passphrase (empty for no passphrase):

Enter same passphrase again:

Your identification has been saved in /root/.ssh/id_dsa.

Your public key has been saved in /root/.ssh/id_dsa.pub.

The key fingerprint is:

e6:25:1f:2f:22:a9:5c:ec:e4:98:64:67:91:60:ce:9d root@compute1.example.com


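This creates the key pair on the local node; the public key must now be exchanged with the other nodes. Before running the following command, remove the current node from the all_group file, and add it back afterwards. Note that dcli connects as celladmin unless another user is given with -l (for example, -l root).


[root@compute1 ~]# dcli -k -g all_group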

The authenticity of host 'cell1 (127.0.0.1)' can't be established.

RSA key fingerprint is 99:86:a5:3f:f1:98:75:53:e8:92:fc:7d:fd:4d:aa:45.

Are you sure you want to continue connecting (yes/no)? yes

Warning: Permanently added 'cell1' (RSA) to the list of known hosts.

root@cell1's password:

The authenticity of host 'cell2 (192.168.56.103)' can't be established.

RSA key fingerprint is 99:86:a5:3f:f1:98:75:53:e8:92:fc:7d:fd:4d:aa:45.

Are you sure you want to continue connecting (yes/no)? yes

Warning: Permanently added 'cell2,192.168.56.103' (RSA) to the list of known hosts.

celladmin@cell2's password:

cell1: ssh key added

cell2: ssh key added


Now repeat the test above to confirm that user equivalence works:


[celladmin@cell1 ~]$ dcli -g cell_group "cellcli -e list cell"

cell1: cell1  online

cell2: cell2  online
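Because the ssh test has to be repeated for every node, it can be convenient to loop over a group file. The following is a minimal sketch rather than a supplied script; the function name check_equivalence is hypothetical. It uses the OpenSSH BatchMode option so that a node that would prompt for a password is reported as a failure instead of waiting for input. A node whose host key has not yet been accepted is also reported as failed.

```shell
#!/bin/sh
# Sketch: report which hosts in a group file accept passwordless ssh.
check_equivalence() {
  # $1 = group file with one short hostname per line
  failed=0
  while read -r host; do
    [ -n "$host" ] || continue            # skip blank lines
    # BatchMode=yes makes ssh fail instead of prompting for a password;
    # -n keeps ssh from consuming the rest of the group file on stdin.
    if ssh -n -o BatchMode=yes -o ConnectTimeout=5 "$host" date >/dev/null 2>&1; then
      echo "$host: ok"
    else
      echo "$host: FAILED"
      failed=1
    fi
  done < "$1"
  return "$failed"
}

# Example: check_equivalence ~/all_group
```

The function returns non-zero if any node fails, so it can gate later setup steps.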



Prepare each system for monitoring as follows:


  1. Move the all_group and cell_group files into the user home directory
  2. Save the following as a script



Setting up Monitoring


# On every compute node


$ dbmcli


DBMCLI> alter dbserver smtpFrom='Exadata - <db server name>'

DBMCLI> alter dbserver smtpFromAddr='<your email address>'

DBMCLI> alter dbserver smtpToAddr='ess_monitor@natrinsic.com'

DBMCLI> alter dbserver smtpServer='<your local mail server>'

DBMCLI> alter dbserver notificationPolicy='critical,warning,clear'

DBMCLI> alter dbserver notificationMethod='mail,snmp'

DBMCLI> alter dbserver validate mail


# On every storage cell


$ cellcli


CELLCLI> alter cell smtpFrom='Exadata - <cell server name>'

CELLCLI> alter cell smtpFromAddr='<your email address>'

CELLCLI> alter cell smtpToAddr='ess_monitor@natrinsic.com'

CELLCLI> alter cell smtpServer='<your local mail server>'

CELLCLI> alter cell notificationPolicy='critical,warning,clear'

CELLCLI> alter cell notificationMethod='mail,snmp'

CELLCLI> alter cell validate mail
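The compute-node and storage-cell command lists above differ only in the object being altered (dbserver or cell) and in the site-specific values. As a convenience, they can be generated for review before being pasted into DBMCLI or CellCLI, which also guards against typographic quotes slipping into the commands. This is a sketch only: the function name print_notification_commands and the example values (cell1, dba@example.com, mail.example.com) are assumptions for illustration.

```shell
#!/bin/sh
# Sketch: print the notification commands for one node with plain ASCII quotes.
# $1 = object to alter: "dbserver" (for DBMCLI) or "cell" (for CellCLI)
# $2 = node name used in smtpFrom, $3 = sender address, $4 = local mail server
print_notification_commands() {
  printf "alter %s smtpFrom='Exadata - %s'\n" "$1" "$2"
  printf "alter %s smtpFromAddr='%s'\n" "$1" "$3"
  printf "alter %s smtpToAddr='ess_monitor@natrinsic.com'\n" "$1"
  printf "alter %s smtpServer='%s'\n" "$1" "$4"
  printf "alter %s notificationPolicy='critical,warning,clear'\n" "$1"
  printf "alter %s notificationMethod='mail,snmp'\n" "$1"
  printf "alter %s validate mail\n" "$1"
}

# Example with placeholder values (assumptions, not real addresses):
print_notification_commands cell cell1 dba@example.com mail.example.com
```

The script only prints text; nothing is changed on the appliance until the reviewed commands are entered at the DBMCLI or CellCLI prompt.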







