Thursday, May 24, 2012

Oracle WebCenter Content Integration With Nagios



CheckSCSHealth is an Oracle WebCenter Content component that provides services for checking the health of several aspects of a content server instance. These services can be called from third-party monitoring tools such as Nagios, which is the use case for this post.
The services provided are the following:
  • CHECK_SEARCH_HEALTH
  • CHECK_FS_HEALTH
  • CHECK_PROVIDER_HEALTH 
  • CHECK_ALL_HEALTH
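Each of these services is called over HTTP through the content server's idcplg endpoint, with the service name passed in the IdcService parameter; this is how the Perl script shown later in this post calls them. As a minimal sketch, the helper below builds those URLs. The function name health_url is hypothetical, and the optional provider name corresponds to the pName parameter that the script sends for CHECK_PROVIDER_HEALTH.

```shell
# Build the URL for one CheckSCSHealth service.
# Usage: health_url CGI_ROOT SERVICE [PROVIDER_NAME]
health_url() {
  if [ -n "${3:-}" ]; then
    # Provider checks also need the provider name (pName).
    printf '%s\n' "$1?IdcService=$2&pName=$3&IsJava=1"
  else
    printf '%s\n' "$1?IdcService=$2&IsJava=1"
  fi
}

# Example (assumes curl is installed and the service needs no login):
#   curl -s "$(health_url http://yourserver:16200/cs/idcplg CHECK_ALL_HEALTH)"
```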
I’ll show you how to use these services and integrate them into your Nagios web console to monitor your WebCenter Content instance.
I assume that you have already installed Nagios and Oracle WebCenter Content. In my case I have Nagios 3.3.1 and Oracle WebCenter Content 11g PS5 (11.1.1.6.0).
To check the state of the CheckSCSHealth component of your WebCenter Content instance, log in to the content server administration console at http://yourserver:16200/cs (if you didn’t change the default listening port when creating the WebLogic domain). Then expand the Administration area, as shown in the following picture.



To manage the components of the content server, click Admin Server, which takes you to the following interface.


  

In the Component Manager interface, click ‘advanced component manager’.




The system components of the content server are not shown by default; you must check ‘Show System Components’. In the ‘Enabled Components’ list, find the CheckSCSHealth component and click it; its details appear in the area on the right. As shown, the component is enabled.
Now that we have made sure that CheckSCSHealth is enabled, we can move on to the next step.
The integration between Nagios and WebCenter Content is done through a Perl script. This script ships with your WebCenter Content installation and can be found at:

WebCenter_Content_home/ucm/idc/components/CheckSCSHealth/perl/nagios_check_scs.pl


The content of the script is:

#!/usr/bin/perl
#
# This Perl script will ping the content server, and run services on the
# back end to verify that it is up and running and healthy.
#

use Getopt::Long;
use warnings;
require LWP;
require HTTP::Request;

# obtain the CGI root from the command line
GetOptions( "cgiroot=s" => \$cgi_root );

# Predefined exit codes for Nagios
%EXIT_CODES = ();
$EXIT_CODES{"UNKNOWN"} = -1;
$EXIT_CODES{"OK"} = 0;
$EXIT_CODES{"WARNING"} = 1;
$EXIT_CODES{"CRITICAL"} = 2;

# global variables
$state = "UNKNOWN";
$hda_response = "";

# run specific checks, or all checks. Run only one of the below functions
checkAll();
#checkSearch();
#checkFileSystem();
#checkDatabaseProvider();
#checkSocketProvider();

# parse the hda_response, and look for an error
determineError();


# check all content server services. Fail if any one of them is offline
sub checkAll
{
        $request = HTTP::Request->new(GET => ($cgi_root . '?IdcService=CHECK_ALL_HEALTH&IsJava=1'));
        $ua = LWP::UserAgent->new;
        $response = $ua->request($request);
        $hda_response = $response->content;
}

sub checkSearch
{
        $request = HTTP::Request->new(GET => ($cgi_root . '?IdcService=CHECK_SEARCH_HEALTH&IsJava=1'));
        $ua = LWP::UserAgent->new;
        $response = $ua->request($request);
        $hda_response = $response->content;
}

sub checkFileSystem
{
        $request = HTTP::Request->new(GET => ($cgi_root . '?IdcService=CHECK_FS_HEALTH&IsJava=1'));
        $ua = LWP::UserAgent->new;
        $response = $ua->request($request);
        $hda_response = $response->content;
}

sub checkDatabaseProvider
{
        $request = HTTP::Request->new(GET => ($cgi_root . '?IdcService=CHECK_PROVIDER_HEALTH&pName=SystemDatabase&IsJava=1'));
        $ua = LWP::UserAgent->new;
        $response = $ua->request($request);
        $hda_response = $response->content;
}

sub checkSocketProvider
{
        $request = HTTP::Request->new(GET => ($cgi_root . '?IdcService=CHECK_PROVIDER_HEALTH&pName=SystemServerSocket&IsJava=1'));
        $ua = LWP::UserAgent->new;
        $response = $ua->request($request);
        $hda_response = $response->content;
}

sub determineError
{
        $codeInd = index($hda_response, "StatusCode=");
        $msgInd = index($hda_response, "StatusMessage=");
        if ($codeInd < 0)
        {
                $state = "CRITICAL";
                print "Unknown fatal error!\n";
        }
        else
        {
                $nlInd = index($hda_response, "\n", $codeInd);
                $codeStr = substr($hda_response, $codeInd + 11, ($nlInd - $codeInd - 11));
               
                $nlInd =  index($hda_response, "\n", $msgInd);
                # length is measured from $msgInd (the shipped script uses $codeInd here)
                $msgStr = substr($hda_response, $msgInd+14, ($nlInd - $msgInd - 14));
                $dotInd = index($msgStr, ".");
                if ($dotInd > 0)
                {
                        $msgStr = substr($msgStr, 0, $dotInd+1);
                }

                if ($codeStr < 0)
                {
                        $state = "CRITICAL";
                }
                else
                {
                        $state = "OK";
                }
                print $msgStr . "\n";
        }
        exit $EXIT_CODES{$state};
}

As you can see, the script provides five functions that we’re going to use to test WebCenter Content:
  • checkAll();
  • checkSearch();
  • checkFileSystem();
  • checkDatabaseProvider();
  • checkSocketProvider();
All of them except checkAll are commented out by default.
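After calling one of these functions, the script runs determineError(), which looks for the StatusCode and StatusMessage fields in the HDA response and exits with the matching Nagios code (0 for OK, 2 for CRITICAL). The same decision can be sketched in shell, for example to inspect a saved response by hand. This is only an approximation of the Perl logic: it assumes each field starts at the beginning of a line, it does not cut the message at the first period, and the function name parse_hda_status is hypothetical.

```shell
# Read an HDA response on stdin, print its status message, and return
# the Nagios exit code: 0 (OK) or 2 (CRITICAL).
parse_hda_status() {
  # Drop carriage returns in case the response uses CRLF line endings.
  hda=$(tr -d '\r')
  # Value of the first StatusCode= and StatusMessage= lines.
  code=$(printf '%s\n' "$hda" | sed -n 's/^StatusCode=//p' | head -n 1)
  msg=$(printf '%s\n' "$hda" | sed -n 's/^StatusMessage=//p' | head -n 1)

  # No StatusCode at all: treated as CRITICAL, as in the Perl script.
  if [ -z "$code" ]; then
    echo "Unknown fatal error!"
    return 2
  fi

  echo "$msg"
  # A negative status code means the health check failed.
  case $code in
    -*) return 2 ;;
  esac
  return 0
}
```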
To test the integration, run the nagios_check_scs.pl script like this:

./nagios_check_scs.pl --cgiroot=http://yourWebCenterContentServer:16200/cs/idcplg


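If you get the error message ‘Unknown fatal error!’, which the script prints whenever the response contains no StatusCode= field, it can be related to the request line inside the Perl script. In this case, change this line from:

$request = HTTP::Request->new(GET => ($cgi_root . '?IdcService=CHECK_ALL_HEALTH&IsJava=1'));

To, for example:

$request = HTTP::Request->new(GET => 'http://yourWebCenterContentServer:16200/cs/idcplg?IdcService=CHECK_ALL_HEALTH&IsJava=1');
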
Once the edit is made, run the script again without passing the content server URL as a parameter, since it is now set inside the script. You should get the message ‘All tests passed’, which means that the script executed successfully.


As we saw earlier, CheckSCSHealth provides four services. To call each of them independently, duplicate the nagios_check_scs.pl script, one copy per check, which gives the following:


In each file, uncomment the corresponding function call and comment out the others, then change the request line to use the full absolute URL of the content server rather than the concatenation of the $cgi_root variable and the rest of the request.
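The duplication can also be scripted. As a rough sketch, the helper below copies the original script and leaves exactly one check call active. It assumes the call lines look exactly as in the listing above (one call per line, starting in the first column, disabled with a leading #). The function name make_check_script and the target file names in the example are hypothetical, apart from nagios_check_all.pl, which appears in the reader comment at the end of this post. The helper does not touch the request lines, so the copies still expect the URL through the cgiroot option.

```shell
# Copy SRC to DEST with only the FUNC() call enabled.
# Usage: make_check_script SRC DEST FUNC
make_check_script() {
  # 1st expression: comment out every top-level check call.
  # 2nd expression: re-enable the one requested call.
  sed -E \
    -e 's/^#?(check[A-Za-z]+\(\);)[[:space:]]*$/#\1/' \
    -e "s/^#($3\(\);)\$/\1/" \
    "$1" > "$2" && chmod +x "$2"
}

# Example (hypothetical file names):
#   make_check_script nagios_check_scs.pl nagios_check_all.pl    checkAll
#   make_check_script nagios_check_scs.pl nagios_check_search.pl checkSearch
```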




Now that the script files are ready, you must configure Nagios by creating the commands and services. The Nagios configuration files can be found in /usr/local/nagios/etc/objects.



We’re interested in two files:
  • localhost.cfg 
  • commands.cfg
Edit the commands.cfg file and add the following content.
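A minimal sketch of such a command definition, written as a small shell helper that prints it so that the output can be appended to commands.cfg. The command name check_wcc_all, the script location under /usr/local/nagios/libexec and the helper name emit_command are all assumptions made for illustration; the URL is the placeholder used earlier in this post. If the URL has been hardcoded in the script, the cgiroot option is simply ignored.

```shell
# Print a Nagios command definition that runs one check script.
# Usage: emit_command COMMAND_NAME SCRIPT_PATH CGI_ROOT
emit_command() {
  cat <<EOF
define command{
        command_name    $1
        command_line    $2 --cgiroot=$3
        }
EOF
}

# Example (hypothetical names and paths):
#   emit_command check_wcc_all /usr/local/nagios/libexec/nagios_check_all.pl \
#     http://yourWebCenterContentServer:16200/cs/idcplg >> commands.cfg
```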


Edit the localhost.cfg file and add the following content.
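A service definition ties a check command to a host. The sketch below prints one such definition. It assumes the local-service template and the localhost host from the sample configuration that ships with Nagios; the service description, the command name and the helper name emit_service are hypothetical. The check_command value must match a command_name defined in commands.cfg.

```shell
# Print a Nagios service definition bound to one check command.
# Usage: emit_service DESCRIPTION CHECK_COMMAND
emit_service() {
  cat <<EOF
define service{
        use                     local-service
        host_name               localhost
        service_description     $1
        check_command           $2
        }
EOF
}

# Example (hypothetical names):
#   emit_service "WCC All Health" check_wcc_all >> localhost.cfg
```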

 

Before starting the Apache server and Nagios, we must make sure that the changes made to the Nagios configuration files are valid. To do this, we execute the following command:
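With an installation under /usr/local/nagios, which the configuration path above suggests, this check is normally run with the -v option of the nagios binary against the main configuration file. A small sketch follows; the binary and nagios.cfg locations are assumptions based on that prefix, and the function name verify_nagios_config is hypothetical.

```shell
# Assumed default locations; override them if your layout differs.
NAGIOS_BIN=${NAGIOS_BIN:-/usr/local/nagios/bin/nagios}
NAGIOS_CFG=${NAGIOS_CFG:-/usr/local/nagios/etc/nagios.cfg}

# Run the pre-flight check and report whether it is safe to (re)start.
verify_nagios_config() {
  if "$NAGIOS_BIN" -v "$NAGIOS_CFG"; then
    echo "Configuration OK"
  else
    echo "Configuration has errors, do not start Nagios yet" >&2
    return 1
  fi
}
```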


As you can see, the configuration is valid; now we can start the Apache server and Nagios.

 

 
Once in the Nagios administration console, click the Services link to get the following interface.
Now we can see the services that we created, and all of them have the status OK, which means that our content server is running normally.



1 comment:

  1. To execute the script, you have to type : ./nagios_check_all.pl -cgiroot=http...

    No need to edit the script ;)
