Document toolboxDocument toolbox

14.1 Reading Log Files

Finding the failure point

ServiceOptimizer log files may be cyclic.  The size of the log file is determined by a database parameter.  Once this size is reached, then the logging will start overwriting from near the top of the file (just under the initialization information).  Therefore to find the latest logging may not be as simple as looking at the bottom of the file.  Instead you will need to search for the timestamp that the log file was last written to.  In this case you may see something like:

28-Mar-13 21:35:06 INFORMATION 4356 ---- User session id=Administrator 4 127.0.0.1 0  optid=  state=1  onhold=2
28-Mar-13 21:35:06 INFORMATION 4356 ---- User session id=Administrator 4 127.0.0.1 0  optid=  state=1  onhold=2
28-Mar-13 21:35:06 INFORMATION 4356 ---- User session id=Administrator 4 127.0.0.1 0  optid=  state=1  onhold=2
28-Mar-13 21:35:06 INFORMATION 4356 ---- User session id=Administrator 4 127.0.0.1 0  optid=  state=1  onhold=2
28-Mar-13 21:35:06 INFORMATION 4356 ---- Session count now: 0
28-Mar-13 21:35:06 DEBUG1 4356 ---- This session: 4/4356/Administrator
28-Mar-13 21:35:06 DEBUG3 4356 ---- Current sessions:
28-Mar-13 21:35:06 INFORMATION 1504 ---- Process shutdown is complete - all threads terminated.
28-Mar-13 16:11:04 INFORMATION 4840 RetryController JeopardyConditionsObserver - writeback time 0.000000
28-Mar-13 16:11:04 INFORMATION 4840 RetryController UpdateCommittedRJsObserver - writeback time 0.000000
28-Mar-13 16:11:04 INFORMATION 4840 RetryController LaidOutPatternObserver - writeback time 0.000000

As you can see in the above extract the time appears to jump from 16:11 to 21:35.  The timezone of timestamps in the log files are controlled by the tz_Default parameter in sp083_system_parameters.  The value in this field corresponds to a timezone in sp056_timezones table.

Another important fact is that the error may not be the last line written to the log.  In fact it may be quite a few lines up.  So find the last line and search up.

The log file will show the connection made, the data supplied, the return code SP responds with and the disconnection. Other levels of logging can show further information.

What is actually seen in this logfile will depend on the trace level set (see 14.1 Formats and Sizes) - the below explanation assumes SOAP trace is not switched on.

 The following sections describe each of the commonly used interfaces and the details that are seen in the log file for each connection to the interface. At the correct level of tracing each interface will display the information that is passed through to SERVICEOptimizer and the response that SERVICEOptimizer gives back. In all cases it is important to get as much information as possible from the user about what action they were trying to perform, what time the problem occurred and what data they were sending (eg postcode/zipcode of job, job reference, engineer details, date).

Identifying the entry

Each API call can be identified in the APIAgent log file by a key word.  This keyword can then be used in a search for all occurrences of that interface call.  Because of the volumes of each call that can occur in a day it may be worth redirecting the search results to a file for further diagnosis.  The following is the list of key words for the main interfaces

 

Each API call will appear at least twice in the log file. 

Occurence
Entry
Meaning
1 APIAgent<: The call incoming from the interface
>=2eg FRUNorth>:The redirections to 1 or more FRUs

 

The fields that follow the key word match the fields for the API in the same order as defined in Integration Guide

The other useful information about the API call is the date time stamp which is at the beginning of the line. 

If you want to know the results of the API call then you will need to look in the API Agent log file at the lines following and you should be able to see +<Number>/n - the number will match the ServicePower API return code as described in Return Codes.

Database Issues

Find the last entry in the log file and search up for the following:

ORACLE

SQL SERVER

ORA

[SQL]

SERVICEPower will record the type of error and any specifics that may be relevant (such as the query that it was trying to undertake at the time.) 

There are some errors that are easily identified, as listed below.  All other errors should be reported to your DBA for verification as to whether it is an application error or a database server error.

Audit Error - Oracle

The job tables and the system tables are audited using a database trigger.  This trigger requires access (grant select) to the v_$session table in order to work.  If this access is not granted then every time any of these tables are written to, an error will occur, which will refer to a problem with the spdb_audit package.

Primary Key (PK) Error - Oracle

Most of the key tables in the SERVICEScheduling database have sequences as the primary key.  If a database has been loaded incorrectly or copied incorrectly then the sequence numbers may not be in line with the values currently in the table.  Therefore when a new record tries to be inserted it will fail with a primary key (PK) validation.  Sequence numbers are defined in the SERVICEScheduling database specification.  In order to check whether this is the problem then compare the current sequence number value with the maximum value in the table.  For example:

Select max(job_no) from sp111_jobs;

Select job_no.nextval from dual;

The job_no.nextval should be higher than max(job_no).  If it is the same or lower then contact your DBA to let him know there is a problem with the sequence numbers.

Insufficient Database Connections

SERVICEOptimizer makes all its database connections on startup.  Each process makes multiple connections.  Each process has a default number of connections which is defined in the SERVICEOptimizer System Admin guide.  These values can be overridden in the database and can be found by:

Select component_id, parameters from sp400_coomponents;

The parameter field will contain something like –c <number> where number is the number of database connections that process is going to make.  These connections can be tuned for system performance purposes.  In order to effectively tune the connection parameter, SERVICEPower recommends a discussion with a technical consultant prior to change.   

If the database cannot allow the number of connections that the system requires then the processes will fail to start up.  Contact your DBA to ensure that the database is set to allow (at least) the minimum number of connections that SERVICEOptimizer requires to run.

Insufficient database space

Although in a long-running system the amount of space that the database takes up generally does not change much, any small change to business practices or temporary increase in job volumes can cause the database size to grow.  If there is no space left on the database then a database error will occur and SERVICEOptimizer will shut down.

Loss of Database connection

The problem could have been caused by a temporary drop in the database connection.  SERVICEOptimizer makes its database connections at startup and keeps them throughout its life.  If they are lost at any time while the system is running and it is not able to reconnect then it will require a restart.  Each process will only notice that the connection has been lost when it tries to use it.  This may be several minutes or even hours if it is during a quiet period of operation.  The following in the log file will indicate that a loss of database connection has occurred

ORACLE

SQL SERVER

ORA-03113: end-of-file on communication channel

[DBNETLIB][ConnectionWrite (send()).]General network error. Check your network documentation.