CheckSCSHealth is an Oracle WebCenter Content component that provides services
for checking the health of several aspects of a content server instance. These
services can be called from third-party monitoring tools such as Nagios, which
is the use case for this post.
The services provided are the following:
- CHECK_SEARCH_HEALTH
- CHECK_FS_HEALTH
- CHECK_PROVIDER_HEALTH
- CHECK_ALL_HEALTH
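Each of these services is called over HTTP by passing its name in the IdcService parameter, which is what the Perl script shown later in this post does. As a minimal sketch, the helper below builds such a request URL. The function name health_check_url is hypothetical; the IdcService, pName and IsJava parameters and the /cs/idcplg path are taken from the script and the command line used in this post.

```shell
# Hypothetical helper: print the request URL for one health service.
#   $1 = CGI root, e.g. http://yourserver:16200/cs/idcplg
#   $2 = service name, e.g. CHECK_ALL_HEALTH
#   $3 = optional provider name (only used by CHECK_PROVIDER_HEALTH)
health_check_url() {
    cgi_root=$1
    service=$2
    provider=${3:-}
    url="${cgi_root}?IdcService=${service}"
    if [ -n "$provider" ]; then
        url="${url}&pName=${provider}"
    fi
    # IsJava=1 is appended last, as in the shipped script
    printf '%s\n' "${url}&IsJava=1"
}
```

For example, `health_check_url http://yourserver:16200/cs/idcplg CHECK_PROVIDER_HEALTH SystemDatabase` prints the URL that the database provider check would request.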
I’ll show you how to use these services and how to integrate them into your
Nagios web console to monitor your WebCenter Content instance.
I assume that you have already installed Nagios and Oracle WebCenter Content.
In my case, I have installed Nagios 3.3.1 and Oracle WebCenter Content 11g PS5
(11.1.1.6.0).
To check the state of the CheckSCSHealth component on your WebCenter Content
instance, log in to the content server administration console at
http://yourserver:16200/cs (if you didn’t change the default listening port
when creating the WebLogic domain). Then expand the Administration area, as
shown in the following picture.
To manage the components of the content server, click Admin Server, which
takes you to the following interface.
In the Component Manager interface, click ‘advanced component manager’.
The system components of the content server are not shown by default; you must
check ‘Show System Components’. Find the CheckSCSHealth component in the
‘Enabled Components’ list. Once you click on it, its details appear in the area
on the right. As shown, the component is enabled.
Now that we have made sure that CheckSCSHealth is enabled, we can move on to
the next step.
Nagios and WebCenter Content are integrated via a Perl script. This script
ships with your WebCenter Content installation and can be found at:
WebCenter_Content_home/ucm/idc/components/CheckSCSHealth/perl/nagios_check_scs.pl
The content of the script is:
#!/usr/bin/perl
#
# This Perl script will ping the content server, and run services on the
# back end to verify that it is up and running and healthy.
#
use Getopt::Long;
use warnings;
require LWP;
require HTTP::Request;

# obtain the CGI root from the command line
GetOptions("cgiroot=s" => \$cgi_root);

# Predefined exit codes for Nagios
%EXIT_CODES = ();
$EXIT_CODES{"UNKNOWN"} = -1;
$EXIT_CODES{"OK"} = 0;
$EXIT_CODES{"WARNING"} = 1;
$EXIT_CODES{"CRITICAL"} = 2;

# global variables
$state = "UNKNOWN";
$hda_response = "";

# run specific checks, or all checks. Run only one of the below functions
checkAll();
#checkSearch();
#checkFileSystem();
#checkDatabaseProvider();
#checkSocketProvider();

# parse the hda_response, and look for an error
determineError();

# check all content server services. Fail if any one of them is offline
sub checkAll
{
    $request = HTTP::Request->new(GET => ($cgi_root . '?IdcService=CHECK_ALL_HEALTH&IsJava=1'));
    $ua = LWP::UserAgent->new;
    $response = $ua->request($request);
    $hda_response = $response->content;
}

sub checkSearch
{
    $request = HTTP::Request->new(GET => ($cgi_root . '?IdcService=CHECK_SEARCH_HEALTH&IsJava=1'));
    $ua = LWP::UserAgent->new;
    $response = $ua->request($request);
    $hda_response = $response->content;
}

sub checkFileSystem
{
    $request = HTTP::Request->new(GET => ($cgi_root . '?IdcService=CHECK_FS_HEALTH&IsJava=1'));
    $ua = LWP::UserAgent->new;
    $response = $ua->request($request);
    $hda_response = $response->content;
}

sub checkDatabaseProvider
{
    $request = HTTP::Request->new(GET => ($cgi_root . '?IdcService=CHECK_PROVIDER_HEALTH&pName=SystemDatabase&IsJava=1'));
    $ua = LWP::UserAgent->new;
    $response = $ua->request($request);
    $hda_response = $response->content;
}

sub checkSocketProvider
{
    $request = HTTP::Request->new(GET => ($cgi_root . '?IdcService=CHECK_PROVIDER_HEALTH&pName=SystemServerSocket&IsJava=1'));
    $ua = LWP::UserAgent->new;
    $response = $ua->request($request);
    $hda_response = $response->content;
}

sub determineError
{
    $codeInd = index($hda_response, "StatusCode=");
    $msgInd = index($hda_response, "StatusMessage=");
    if ($codeInd < 0)
    {
        $state = "CRITICAL";
        print "Unknown fatal error!\n";
    }
    else
    {
        $nlInd = index($hda_response, "\n", $codeInd);
        $codeStr = substr($hda_response, $codeInd + 11, ($nlInd - $codeInd - 11));
        $nlInd = index($hda_response, "\n", $msgInd);
        $msgStr = substr($hda_response, $msgInd+14, ($nlInd - $codeInd - 14));
        $dotInd = index($msgStr, ".");
        if ($dotInd > 0)
        {
            $msgStr = substr($msgStr, 0, $dotInd+1);
        }
        if ($codeStr < 0)
        {
            $state = "CRITICAL";
        }
        else
        {
            $state = "OK";
        }
        print $msgStr . "\n";
    }
    exit $EXIT_CODES{$state};
}
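The decision made by determineError can be summarised as follows: no StatusCode field means CRITICAL, a negative StatusCode means CRITICAL, and anything else is OK. The shell sketch below restates that rule outside of Perl so it can be tried on its own. The function name nagios_state and the sample responses are hypothetical; only the StatusCode and StatusMessage field names come from the script.

```shell
# Hypothetical helper: print the state that determineError would choose
# for a given response body passed as $1.
nagios_state() {
    # Take the (possibly negative) integer that follows "StatusCode="
    code=$(printf '%s\n' "$1" \
        | sed -n 's/.*StatusCode=\(-\{0,1\}[0-9]\{1,\}\).*/\1/p' \
        | head -n 1)
    if [ -z "$code" ]; then
        echo "CRITICAL"   # no StatusCode: the script prints "Unknown fatal error!"
    elif [ "$code" -lt 0 ]; then
        echo "CRITICAL"   # a negative status code signals a failed check
    else
        echo "OK"
    fi
}

# Hypothetical sample response, for illustration only
sample=$(printf 'StatusCode=0\nStatusMessage=All tests passed.')
nagios_state "$sample"
```

A response without a StatusCode field, such as an HTML error page, falls into the first branch, which is the ‘Unknown fatal error’ case of the script.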
As you can see, the script provides five functions that we’re going to use to
test WebCenter Content:
- checkAll();
- checkSearch();
- checkFileSystem();
- checkDatabaseProvider();
- checkSocketProvider();
Except for checkAll, all of these calls are commented out by default.
To perform the integration test, execute the nagios_check_scs.pl script,
passing the content server URL through the cgiroot option that the script reads:
./nagios_check_scs.pl --cgiroot=http://yourWebCenterContentServer:16200/cs/idcplg
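If you get the error message ‘Unknown fatal error’, the response contained no
StatusCode field, which usually means that the request URL was wrong (for
example, $cgi_root stays empty when the cgiroot option is missing or
misspelled). It can also be related to the request line inside the Perl script.
In that case, change this line so that it contains the full absolute URL of the
content server instead of the concatenation with the $cgi_root variable.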
Once the edit is made, run the script again, this time without providing the
URL of the content server as a parameter, because it is now set inside the
script. You should get the message ‘All tests passed’, which means that the
script executed successfully.
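Besides the printed message, Nagios looks at the exit status of the check, which is why the script ends with an exit call. As a minimal sketch, the hypothetical helper run_check below (not part of the shipped script) runs any command and prints the state name that corresponds to its exit status, assuming the usual Nagios plugin convention of 0, 1, 2 and 3 for OK, WARNING, CRITICAL and UNKNOWN.

```shell
# Hypothetical helper: run a check command and name its exit status.
# The command's own output is discarded so that only the state is printed.
run_check() {
    "$@" >/dev/null 2>&1
    status=$?
    case $status in
        0) echo "OK" ;;
        1) echo "WARNING" ;;
        2) echo "CRITICAL" ;;
        3) echo "UNKNOWN" ;;
        *) echo "OTHER ($status)" ;;   # outside the 0-3 plugin convention
    esac
}
```

It could be used as `run_check ./nagios_check_scs.pl --cgiroot=http://yourWebCenterContentServer:16200/cs/idcplg` to see which state a given run would report.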
As we saw earlier, CheckSCSHealth provides four services. To call each service
independently, duplicate the nagios_check_scs.pl script so that there is one
copy per check, which gives the following:
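As a sketch of this duplication step, the helper below copies the original script once per check. The function name and the file names are hypothetical (only nagios_check_all.pl is mentioned elsewhere on this page), so adapt them to your own naming.

```shell
# Hypothetical helper: create one copy of the script per health check.
#   $1 = path to the original nagios_check_scs.pl
#   $2 = directory that receives the copies
copy_check_scripts() {
    src=$1
    dest_dir=$2
    for check in all search fs provider; do
        # e.g. nagios_check_all.pl, nagios_check_search.pl, ...
        cp "$src" "$dest_dir/nagios_check_${check}.pl" || return 1
    done
}
```

Each copy still has to be edited by hand afterwards so that it calls only its own check.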
For each file, uncomment the corresponding function call and comment out the
others, then change the request line to include the full absolute URL of the
content server rather than the concatenation of the $cgi_root variable and the
rest of the request.
Now that the script files are ready, you must configure Nagios (create the
commands and the services). The Nagios configuration files can be found in
/usr/local/nagios/etc/objects.
We’re interested in two files:
- localhost.cfg
- commands.cfg
Edit the commands.cfg file and add the following content.
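A minimal sketch of what such command definitions might look like is given below. The command names and the script file names are hypothetical, and the sketch assumes that the scripts were copied to the Nagios plugin directory, which the standard $USER1$ macro points to.

```cfg
# Sketch only: command names and script names are assumptions
define command{
        command_name    check_ucm_all
        command_line    $USER1$/nagios_check_all.pl
        }

define command{
        command_name    check_ucm_search
        command_line    $USER1$/nagios_check_search.pl
        }
```

One such block would be needed for each duplicated script.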
Edit the localhost.cfg file and add the following content.
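A minimal sketch of a matching service definition is given below. It assumes that a command named check_ucm_all (a hypothetical name) has been defined in commands.cfg, and it relies on the local-service template and the localhost host from the sample Nagios configuration. The service description is also an assumption.

```cfg
# Sketch only: check_ucm_all and the description are assumptions
define service{
        use                     local-service
        host_name               localhost
        service_description     WebCenter Content - All Health
        check_command           check_ucm_all
        }
```

One such block per check makes each health service appear as its own line in the Nagios console.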
Before we start the Apache server and Nagios, we must make sure that the
configuration changes we made in the Nagios files are valid. To do this, we
execute the following command:
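Nagios validates its configuration when started with the -v option followed by the main configuration file. The sketch below wraps that call in a hypothetical helper. The default paths are assumptions that match an installation under /usr/local/nagios, the same prefix as the objects directory used above.

```shell
# Hypothetical helper: run the Nagios pre-flight check.
#   $1 = nagios binary (assumed default: /usr/local/nagios/bin/nagios)
#   $2 = main config file (assumed default: /usr/local/nagios/etc/nagios.cfg)
verify_nagios_config() {
    nagios_bin=${1:-/usr/local/nagios/bin/nagios}
    nagios_cfg=${2:-/usr/local/nagios/etc/nagios.cfg}
    # -v makes Nagios verify the configuration and then exit
    "$nagios_bin" -v "$nagios_cfg"
}
```

Calling `verify_nagios_config` without arguments is equivalent to running `/usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg`.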
As you can see, the configuration is valid; now we can start the Apache server
and Nagios.
Once in the Nagios administration console, click the Services link to get the
following interface.
Now we can see the services that we created. All of them have the status OK,
which means that our content server is running normally without issues.
To execute the script, you have to type : ./nagios_check_all.pl -cgiroot=http...
No need to edit the script ;)